An ingest pipeline is a sequence of processors that are applied to documents as they are ingested into an index. Each processor in a pipeline performs a specific task, such as filtering, transforming, or enriching data.
Processors are customizable tasks that run sequentially in the order in which they appear in the request body. This order is important because each processor receives the output of the previous processor. The modified documents appear in your index after the processors are applied.
Ingest pipelines can only be managed using ingest API operations.
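For example, pipelines are created, retrieved, and deleted by sending requests to the `_ingest/pipeline` endpoint. The following sketch shows the three basic operations; the pipeline name `my-pipeline` is a placeholder:

```json
PUT _ingest/pipeline/my-pipeline
{
  "description": "A placeholder pipeline",
  "processors": []
}

GET _ingest/pipeline/my-pipeline

DELETE _ingest/pipeline/my-pipeline
```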
The following are prerequisites for using OpenSearch ingest pipelines:
- When using ingestion in a production environment, your cluster should contain at least one node with the node role set to `ingest`. For information about setting up node roles within a cluster, see Cluster Formation.
- If the OpenSearch Security plugin is enabled, you must have the `cluster_manage_pipelines` permission to manage ingest pipelines.
Define a pipeline
A pipeline definition describes the sequence of processors in an ingest pipeline and is written in JSON format. An ingest pipeline consists of the following fields:
```json
{
  "description" : "...",
  "processors" : [...]
}
```
Request body fields

| Field | Type | Description |
| :--- | :--- | :--- |
| `processors` | Array of processor objects | A component that performs a specific data processing task as the data is being ingested into OpenSearch. |
| `description` | String | A description of the ingest pipeline. |
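Putting these fields together, the following sketch creates a pipeline with a single `set` processor, which assigns a value to a field on each incoming document. The pipeline name, field name, and value are illustrative:

```json
PUT _ingest/pipeline/set-env-pipeline
{
  "description": "Adds an env field to each incoming document",
  "processors": [
    {
      "set": {
        "field": "env",
        "value": "production"
      }
    }
  ]
}
```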
Learn how to:
- Create a pipeline.
- Test a pipeline.
- Retrieve information about a pipeline.
- Delete a pipeline.
- Use ingest processors in OpenSearch.
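As a preview of testing, a pipeline can be tried against sample documents with the `_simulate` endpoint before any data is indexed. The following sketch simulates an inline pipeline definition; the processor and document fields are illustrative:

```json
POST _ingest/pipeline/_simulate
{
  "pipeline": {
    "description": "Adds an env field",
    "processors": [
      { "set": { "field": "env", "value": "production" } }
    ]
  },
  "docs": [
    { "_source": { "message": "hello" } }
  ]
}
```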