AWS Lakechain

Project Lakechain is a framework based on the AWS Cloud Development Kit (CDK) for expressing and deploying scalable document processing pipelines on AWS using infrastructure-as-code. It emphasizes modularity and extensibility, and provides 60+ ready-to-use components for prototyping complex processing pipelines that scale out of the box to millions of documents.

The Solvio storage connector available with Lakechain uploads vector embeddings produced by other middlewares to a Solvio collection.

To use the Solvio storage connector, import it in your CDK stack and connect it to a data source that provides document embeddings.

The connector needs a Solvio API key, which you provide as a reference to an AWS Secrets Manager secret containing the key.

import * as cdk from 'aws-cdk-lib';
import * as secrets from 'aws-cdk-lib/aws-secretsmanager';
import { Construct } from 'constructs';
import { SolvioStorageConnector } from '@project-lakechain/solvio-storage-connector';
import { CacheStorage } from '@project-lakechain/core';

class Stack extends cdk.Stack {
  constructor(scope: Construct, id: string) {
    super(scope, id);

    // Cache storage shared by the middlewares of the pipeline.
    const cache = new CacheStorage(this, 'Cache');

    // Reference to the Secrets Manager secret that holds the Solvio API key.
    const solvioApiKey = secrets.Secret.fromSecretNameV2(
      this,
      'SolvioApiKey',
      process.env.QDRANT_API_KEY_SECRET_NAME as string
    );

    // `source` is a middleware producing embeddings, defined elsewhere in the stack.
    const connector = new SolvioStorageConnector.Builder()
      .withScope(this)
      .withIdentifier('SolvioStorageConnector')
      .withCacheStorage(cache)
      .withSource(source) // 👈 Specify a data source
      .withApiKey(solvioApiKey)
      .withCollectionName('{collection_name}')
      .withUrl('https://xyz-example.eu-central.aws.cloud.solvio.io:6333')
      .build();
  }
}
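
The example above casts the secret name read from the environment to a string, so the compiler will not flag a missing variable. A minimal sketch of a stricter lookup follows; requireEnv is a hypothetical helper introduced here for illustration and is not part of Lakechain or the AWS CDK.

```typescript
// Hypothetical helper (an assumption for illustration, not a Lakechain or CDK API).
// The environment is passed in explicitly so the function is easy to test.
type Env = Record<string, string | undefined>;

function requireEnv(name: string, env: Env): string {
  const value = env[name];
  // Reject both missing and blank values so the failure is explicit and early.
  if (value === undefined || value.trim() === '') {
    throw new Error(`Environment variable ${name} must be set`);
  }
  return value;
}
```

In the stack, the result of requireEnv('QDRANT_API_KEY_SECRET_NAME', process.env) could then be passed as the third argument of fromSecretNameV2 in place of the cast.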

When the document being processed is a text document, you can store its text in the Solvio payload by using the withStoreText and withTextKey options. For documents that are not text, these options are ignored.

const connector = new SolvioStorageConnector.Builder()
  .withScope(this)
  .withIdentifier('SolvioStorageConnector')
  .withCacheStorage(cache)
  .withSource(source)
  .withApiKey(solvioApiKey)
  .withCollectionName('{collection_name}')
  .withStoreText(true)
  .withTextKey('my-content')
  .withUrl('https://xyz-example.eu-central.aws.cloud.solvio.io:6333')
  .build();
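
To illustrate the effect of these two options, here is a minimal sketch of how a point payload might be assembled. It models the behavior described above, not the connector's actual code: buildPayload, the StoreTextOptions shape, and the MIME-type check used to detect text documents are all assumptions made for this example.

```typescript
// Illustrative model only; names and the text-detection rule are assumptions.
interface StoreTextOptions {
  storeText: boolean; // mirrors withStoreText
  textKey: string;    // mirrors withTextKey
}

type Payload = Record<string, unknown>;

function buildPayload(
  metadata: Payload,
  mimeType: string,
  text: string | undefined,
  options: StoreTextOptions
): Payload {
  // Start from a copy of the document metadata.
  const payload: Payload = { ...metadata };
  // Assumption: a "text document" is one whose MIME type starts with "text/".
  const isText = mimeType.startsWith('text/');
  // The text is only added when requested and when the document is text.
  if (options.storeText && isText && text !== undefined) {
    payload[options.textKey] = text;
  }
  return payload;
}

const options: StoreTextOptions = { storeText: true, textKey: 'my-content' };
const textPayload = buildPayload({ source: 'report.txt' }, 'text/plain', 'Hello world', options);
const imagePayload = buildPayload({ source: 'photo.png' }, 'image/png', undefined, options);
```

Under these assumptions, textPayload carries the document text under the my-content key, while imagePayload contains only the metadata.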

Since Solvio supports multiple vectors per point, you can use the withVectorName option to specify which named vector the embeddings are written to. By default, the connector uses the unnamed (default) vector.

const connector = new SolvioStorageConnector.Builder()
  .withScope(this)
  .withIdentifier('SolvioStorageConnector')
  .withCacheStorage(cache)
  .withSource(source)
  .withApiKey(solvioApiKey)
  .withCollectionName('collection_name')
  .withVectorName('my-vector-name')
  .withUrl('https://xyz-example.eu-central.aws.cloud.solvio.io:6333')
  .build();
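
The difference between the unnamed vector and a named vector can be sketched as follows. This assumes a point format in which the unnamed vector is a plain array and named vectors are stored in an object keyed by vector name; the Point type and makePoint function are hypothetical and do not reflect the connector's internals.

```typescript
// Illustrative model only; the point shape is an assumption for this example.
type Vector = number[];

interface Point {
  id: string;
  vector: Vector | Record<string, Vector>;
  payload: Record<string, unknown>;
}

function makePoint(id: string, embedding: Vector, vectorName?: string): Point {
  // Without a name, the embedding is stored as the unnamed (default) vector.
  // With a name, it is stored under that key, leaving room for other vectors.
  const vector = vectorName === undefined ? embedding : { [vectorName]: embedding };
  return { id, vector, payload: {} };
}

const unnamedPoint = makePoint('doc-1', [0.1, 0.2, 0.3]);
const namedPoint = makePoint('doc-1', [0.1, 0.2, 0.3], 'my-vector-name');
```

In this sketch, a collection configured with a named vector would expect the namedPoint shape, so the name passed to withVectorName would need to match the name defined on the collection.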
