
Preparing vectors

In OpenSearch, you can either bring your own vectors or let OpenSearch generate them automatically from your data. Letting OpenSearch automatically generate your embeddings reduces data preprocessing effort at ingestion and search time.

Option 1: Bring your own raw vectors or generated embeddings

Use this option if you already have precomputed embeddings or raw vectors produced by external tools or services.

  • Ingestion: Ingest pregenerated embeddings directly into OpenSearch.

    Pre-generated embeddings ingestion

  • Search: Perform vector search to find the vectors that are closest to a query vector.

    Pre-generated embeddings search

Steps

Working with embeddings generated outside of OpenSearch involves the following steps:

1. Generate embeddings outside of OpenSearch using your favorite embedding utility.

2. Create an OpenSearch index to store your embeddings.

3. Ingest your embeddings into the index.

4. Search your embeddings using vector search.
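The steps above can be sketched as a set of request bodies. The following minimal Python example only builds the JSON that you would send to the create index API, the `_bulk` API, and the `_search` API; it does not contact a cluster. The field name `my_vector`, the index name `my-vector-index`, and the helper function names are hypothetical and chosen for illustration, and the three-dimensional vectors stand in for embeddings produced by an external tool.

```python
import json

# Hypothetical field name used for illustration only.
VECTOR_FIELD = "my_vector"


def build_index_body(dimension):
    """Body for creating an index (PUT /<index>) with a knn_vector field."""
    return {
        "settings": {"index": {"knn": True}},
        "mappings": {
            "properties": {
                VECTOR_FIELD: {"type": "knn_vector", "dimension": dimension}
            }
        },
    }


def build_bulk_body(index_name, vectors_by_id):
    """NDJSON payload for POST /_bulk: one action line plus one document line per vector."""
    lines = []
    for doc_id, vector in vectors_by_id.items():
        lines.append(json.dumps({"index": {"_index": index_name, "_id": doc_id}}))
        lines.append(json.dumps({VECTOR_FIELD: vector}))
    # The bulk API requires the payload to end with a newline.
    return "\n".join(lines) + "\n"


def build_knn_query(query_vector, k):
    """Body for a k-NN query (GET /<index>/_search) returning the k closest vectors."""
    return {
        "size": k,
        "query": {"knn": {VECTOR_FIELD: {"vector": query_vector, "k": k}}},
    }


# Example embeddings; in practice these come from your external embedding tool.
embeddings = {"1": [0.1, 0.2, 0.3], "2": [0.9, 0.8, 0.7]}

index_body = build_index_body(dimension=3)
bulk_body = build_bulk_body("my-vector-index", embeddings)
search_body = build_knn_query([0.1, 0.2, 0.25], k=1)
```

The `dimension` value in the mapping must match the length of the vectors you ingest and of the query vector. To run these requests, pass the bodies to the HTTP client of your choice.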

Getting started with vector search

Use raw vectors or embeddings generated outside of OpenSearch

Option 2: Generate embeddings within OpenSearch

Use this option to let OpenSearch automatically generate vector embeddings from your data using a machine learning (ML) model.

  • Ingestion: You ingest plain data, and OpenSearch uses an ML model to generate embeddings dynamically.

    Auto-generated embeddings ingestion

  • Search: At query time, OpenSearch uses the same ML model to convert your input data to embeddings, and these embeddings are used for vector search.

    Auto-generated embeddings search

Steps

Working with text that is automatically converted to embeddings within OpenSearch involves the following steps:

1. Configure a machine learning model that will automatically generate embeddings from your text at ingestion time and query time.

2. Create an OpenSearch index to store your text.

3. Ingest your text into the index.

4. Search your text using vector search. Query text is automatically converted to vector embeddings and compared to document embeddings.
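As a rough illustration of these steps, the following Python sketch builds the request bodies for an ingest pipeline with a `text_embedding` processor, an index that uses that pipeline by default, and a `neural` query. It only constructs JSON and does not contact a cluster. The model ID placeholder, the field names `text` and `passage_embedding`, the pipeline name `my-embedding-pipeline`, the helper function names, and the dimension of 768 are assumptions for illustration; use the ID of the model you configured in step 1 and the dimension that model actually produces.

```python
# Placeholder: replace with the ID of the ML model registered in your cluster.
MODEL_ID = "<your-model-id>"

# Hypothetical field names used for illustration only.
TEXT_FIELD = "text"
EMBEDDING_FIELD = "passage_embedding"


def build_pipeline_body(model_id):
    """Body for PUT /_ingest/pipeline/<name>: embeds TEXT_FIELD into EMBEDDING_FIELD at ingestion."""
    return {
        "description": "Generates embeddings from text at ingestion time",
        "processors": [
            {
                "text_embedding": {
                    "model_id": model_id,
                    "field_map": {TEXT_FIELD: EMBEDDING_FIELD},
                }
            }
        ],
    }


def build_index_body(pipeline_name, dimension):
    """Body for PUT /<index>: stores the text and its embedding, using the pipeline by default."""
    return {
        "settings": {"index.knn": True, "default_pipeline": pipeline_name},
        "mappings": {
            "properties": {
                TEXT_FIELD: {"type": "text"},
                EMBEDDING_FIELD: {"type": "knn_vector", "dimension": dimension},
            }
        },
    }


def build_neural_query(query_text, model_id, k):
    """Body for GET /<index>/_search: the query text is embedded by the same model at query time."""
    return {
        "query": {
            "neural": {
                EMBEDDING_FIELD: {
                    "query_text": query_text,
                    "model_id": model_id,
                    "k": k,
                }
            }
        }
    }


pipeline_body = build_pipeline_body(MODEL_ID)
# Assumed dimension; it must equal the output size of the configured model.
index_body = build_index_body("my-embedding-pipeline", dimension=768)
search_body = build_neural_query("wild west", MODEL_ID, k=5)
```

Because the index sets `default_pipeline`, documents ingested as plain text (for example, `{"text": "..."}`) pass through the pipeline, which adds the embedding field before the document is indexed.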

Generating embeddings automatically

Automatically convert data to embeddings within OpenSearch

Getting started with semantic and hybrid search

Learn how to implement semantic and hybrid search
