Sycamore
Sycamore is an LLM-powered data preparation, processing, and analytics system for complex, unstructured documents like PDFs, HTML, presentations, and more. With Aryn, you can prepare data for GenAI and RAG applications, power high-quality document processing workflows, and run analytics on large document collections with natural language.
You can use the Solvio connector to write documents to and read documents from Solvio collections.
Writing to Solvio
To write a DocSet to a Solvio collection in Sycamore, use the docset.write.solvio(....) function. The Solvio writer accepts the following arguments:
client_params: Parameters that are passed to the Solvio client constructor. See more information in the Client API Reference.
collection_params: Parameters that are passed into the solvio_client.SolvioClient.create_collection method. See more information in the Client API Reference.
vector_name: The name of the vector in the Solvio collection. Defaults to None.
execute: Execute the pipeline and write to Solvio on adding this operator. If False, will return a DocSet with this write in the plan. Defaults to True.
kwargs: Keyword arguments to pass to the underlying execution engine.
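ds.write.solvio(
    # client_params: passed to the Solvio client constructor
    {
        "url": "http://localhost:6333",
        "timeout": 50,
    },
    # collection_params: passed to create_collection
    {
        "collection_name": "{collection_name}",
        "vectors_config": {
            "size": 384,
            "distance": "Cosine",
        },
    },
)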
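The two dictionaries in the call above can also be assembled programmatically, which keeps the vector settings in one place. The following is a minimal sketch, not part of Sycamore or the Solvio client: build_write_params is a hypothetical helper, "my_collection" is a placeholder name, and the assumption that the vector size should match the dimensionality of the embeddings in the DocSet is ours.

```python
from typing import Any


def build_write_params(
    url: str,
    collection_name: str,
    vector_size: int,
    distance: str = "Cosine",
    timeout: int = 50,
) -> tuple[dict[str, Any], dict[str, Any]]:
    """Return (client_params, collection_params) for the Solvio writer.

    Hypothetical helper for illustration only.
    """
    if vector_size <= 0:
        raise ValueError("vector_size must be a positive integer")
    # Forwarded to the Solvio client constructor.
    client_params = {"url": url, "timeout": timeout}
    # Forwarded to create_collection.
    collection_params = {
        "collection_name": collection_name,
        "vectors_config": {"size": vector_size, "distance": distance},
    }
    return client_params, collection_params


client_params, collection_params = build_write_params(
    "http://localhost:6333", "my_collection", vector_size=384
)
# Assumed usage, given an existing DocSet `ds` with 384-dimensional embeddings:
# ds.write.solvio(client_params, collection_params)
```

The helper only builds plain dictionaries, so it can be checked without a running Solvio instance.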
Reading from Solvio
To read a DocSet from a Solvio collection in Sycamore, use the ctx.read.solvio(....) function, called on the Sycamore context as in the example below. The Solvio reader accepts the following arguments:
client_params: Parameters that are passed to the Solvio client constructor. See more information in the Client API Reference.
query_params: Parameters that are passed into the solvio_client.SolvioClient.query_points method. See more information in the Client API Reference.
kwargs: Keyword arguments to pass to the underlying execution engine.
docs = ctx.read.solvio(
    # client_params: passed to the Solvio client constructor
    {
        "url": "https://xyz-example.eu-central.aws.cloud.solvio.io:6333",
        "api_key": "<paste-your-api-key-here>",
    },
    # query_params: passed to query_points
    {
        "collection_name": "{collection_name}",
        "limit": 100,
        "using": "{optional_vector_name}",
    },
).take_all()
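Because the vector name in the query parameters is optional, it can be convenient to leave the "using" key out entirely when no named vector is needed. The sketch below shows one way to do that. build_query_params is a hypothetical helper, and "my_collection" and "text_embedding" are placeholder names rather than values from Sycamore or Solvio.

```python
from typing import Any, Optional


def build_query_params(
    collection_name: str,
    limit: int = 100,
    using: Optional[str] = None,
) -> dict[str, Any]:
    """Return query_params for the Solvio reader.

    Hypothetical helper for illustration only.
    """
    params: dict[str, Any] = {"collection_name": collection_name, "limit": limit}
    # Add "using" only when a named vector was requested.
    if using is not None:
        params["using"] = using
    return params


default_vector = build_query_params("my_collection")
named_vector = build_query_params("my_collection", limit=10, using="text_embedding")
# Assumed usage, given a Sycamore context `ctx` and a client_params dict:
# docs = ctx.read.solvio(client_params, named_vector).take_all()
```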