Pipeline processor
The pipeline processor allows a pipeline to reference and include another predefined pipeline. This is useful when a set of common processors needs to be shared across multiple pipelines. Instead of redefining those processors in each pipeline, you can create a separate base pipeline containing the shared processors and then reference that base pipeline from other pipelines using the pipeline processor.
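To make the sharing pattern concrete, the following Python sketch models pipeline definitions as plain dictionaries and reports which pipelines each one includes through the pipeline processor. The pipeline names (base-pipeline, web-logs-pipeline, app-logs-pipeline), the field names, and the helper referenced_pipelines are hypothetical and chosen only for illustration. The sketch does not call OpenSearch.

```python
import json

# Hypothetical pipeline definitions, keyed by pipeline name.
# Each value mirrors the body of a PUT _ingest/pipeline/<name> request.
PIPELINES = {
    "base-pipeline": {
        "description": "shared processors",
        "processors": [
            {"uppercase": {"field": "protocol"}},
        ],
    },
    "web-logs-pipeline": {
        "description": "web logs",
        "processors": [
            {"pipeline": {"name": "base-pipeline"}},
            {"lowercase": {"field": "method"}},
        ],
    },
    "app-logs-pipeline": {
        "description": "application logs",
        "processors": [
            {"pipeline": {"name": "base-pipeline"}},
            {"remove": {"field": "debug"}},
        ],
    },
}


def referenced_pipelines(definition):
    """Return the names of pipelines included via the pipeline processor."""
    names = []
    for processor in definition.get("processors", []):
        for processor_type, config in processor.items():
            if processor_type == "pipeline":
                names.append(config["name"])
    return names


# Map every pipeline to the pipelines it references.
references = {name: referenced_pipelines(body) for name, body in PIPELINES.items()}
print(json.dumps(references, indent=2))
```

In this sketch, both log pipelines reuse base-pipeline, so a change to the shared processors only has to be made in one definition.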
The following is the syntax for the pipeline processor:
{
  "pipeline": {
    "name": "general-pipeline"
  }
}
Configuration parameters
The following table lists the required and optional parameters for the pipeline processor.
Parameter | Required/Optional | Description |
---|---|---|
name | Required | The name of the pipeline to execute. |
description | Optional | A description of the processor’s purpose or configuration. |
if | Optional | A condition that determines whether the processor runs. |
ignore_failure | Optional | When set to true, failures of this processor are ignored. See Handling pipeline failures. |
on_failure | Optional | A list of processors to run if this processor fails. See Handling pipeline failures. |
tag | Optional | An identifier for the processor. Useful for debugging and metrics. |
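The parameters in the table can be combined in a single processor definition. The following Python sketch builds such a definition as a dictionary and prints it as JSON. The helper pipeline_processor, the condition ctx.protocol != null, the set processor used in on_failure, and the tag value are illustrative assumptions, not values prescribed by this page.

```python
import json


def pipeline_processor(name, *, description=None, condition=None,
                       ignore_failure=None, on_failure=None, tag=None):
    """Build a pipeline processor definition, omitting parameters that are not set."""
    if not name:
        raise ValueError("'name' is required")
    config = {"name": name}
    optional = {
        "description": description,
        # 'if' is a reserved word in Python, so the argument is called 'condition'.
        "if": condition,
        "ignore_failure": ignore_failure,
        "on_failure": on_failure,
        "tag": tag,
    }
    config.update({key: value for key, value in optional.items() if value is not None})
    return {"pipeline": config}


# Illustrative values only: run general-pipeline when the protocol field exists,
# and record a message in an 'error' field if the referenced pipeline fails.
processor = pipeline_processor(
    "general-pipeline",
    description="Run the shared processors",
    condition="ctx.protocol != null",
    on_failure=[{"set": {"field": "error", "value": "general-pipeline failed"}}],
    tag="shared-steps",
)
print(json.dumps(processor, indent=2))
```

The printed object can be placed in the processors array of a pipeline definition. Only name is mandatory, so pipeline_processor("general-pipeline") produces the minimal form shown in the syntax example.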
Using the processor
Follow these steps to use the processor in a pipeline.
Step 1: Create a pipeline
The following requests create a general pipeline named general-pipeline and then create a new pipeline named outer-pipeline, which references general-pipeline:
PUT _ingest/pipeline/general-pipeline
{
  "description": "a general pipeline",
  "processors": [
    {
      "uppercase": {
        "field": "protocol"
      }
    },
    {
      "remove": {
        "field": "name"
      }
    }
  ]
}
PUT _ingest/pipeline/outer-pipeline
{
  "description": "an outer pipeline referencing the general pipeline",
  "processors": [
    {
      "pipeline": {
        "name": "general-pipeline"
      }
    }
  ]
}
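As a rough mental model of what happens when outer-pipeline runs, the following Python sketch applies the two pipelines to a document locally. It is a simplified illustration, not OpenSearch's implementation: the function run_pipeline is hypothetical, it supports only the uppercase, remove, and pipeline processors, and it ignores parameters such as if and on_failure.

```python
def run_pipeline(pipeline_name, pipelines, document):
    """Apply a pipeline's processors to a copy of the document and return the result."""
    doc = dict(document)
    for processor in pipelines[pipeline_name]["processors"]:
        for processor_type, config in processor.items():
            if processor_type == "uppercase":
                doc[config["field"]] = doc[config["field"]].upper()
            elif processor_type == "remove":
                del doc[config["field"]]
            elif processor_type == "pipeline":
                # Delegate to the referenced pipeline, then continue with its output.
                doc = run_pipeline(config["name"], pipelines, doc)
            else:
                raise ValueError(f"unsupported processor in this sketch: {processor_type}")
    return doc


# The two pipelines from this step, modeled as dictionaries.
PIPELINES = {
    "general-pipeline": {
        "processors": [
            {"uppercase": {"field": "protocol"}},
            {"remove": {"field": "name"}},
        ]
    },
    "outer-pipeline": {
        "processors": [
            {"pipeline": {"name": "general-pipeline"}},
        ]
    },
}

result = run_pipeline("outer-pipeline", PIPELINES, {"protocol": "https", "name": "test"})
print(result)  # {'protocol': 'HTTPS'}
```

The outer pipeline contains no processors of its own in this example, so its output is exactly the output of general-pipeline.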
Step 2 (Optional): Test the pipeline
It is recommended that you test your pipeline before you ingest documents.
To test the pipeline, run the following query:
POST _ingest/pipeline/outer-pipeline/_simulate
{
  "docs": [
    {
      "_source": {
        "protocol": "https",
        "name": "test"
      }
    }
  ]
}
Response
The following example response confirms that the pipeline is working as expected:
{
  "docs": [
    {
      "doc": {
        "_index": "_index",
        "_id": "_id",
        "_source": {
          "protocol": "HTTPS"
        },
        "_ingest": {
          "timestamp": "2024-05-24T02:43:43.700735801Z"
        }
      }
    }
  ]
}
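If you want to run the simulation from a script rather than a console, the following Python sketch builds the same request and extracts the transformed documents from a response. It uses only the standard library. The helper names are hypothetical, send_simulate_request assumes an unsecured cluster reachable at the base URL you pass in, and simulated_sources assumes every entry in the response contains a doc object, as in the example response.

```python
import json
import urllib.request


def build_simulate_request(pipeline_name, sources):
    """Return the path and body for a simulate request on a stored pipeline."""
    path = f"/_ingest/pipeline/{pipeline_name}/_simulate"
    body = {"docs": [{"_source": source} for source in sources]}
    return path, body


def simulated_sources(response):
    """Extract the transformed _source of each document from a simulate response."""
    return [entry["doc"]["_source"] for entry in response["docs"]]


def send_simulate_request(base_url, pipeline_name, sources):
    """Send the request; assumes no authentication or TLS setup is required."""
    path, body = build_simulate_request(pipeline_name, sources)
    request = urllib.request.Request(
        base_url + path,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as reply:
        return json.load(reply)


# Offline check using the example response from this step.
EXAMPLE_RESPONSE = {
    "docs": [
        {
            "doc": {
                "_index": "_index",
                "_id": "_id",
                "_source": {"protocol": "HTTPS"},
                "_ingest": {"timestamp": "2024-05-24T02:43:43.700735801Z"},
            }
        }
    ]
}

path, body = build_simulate_request("outer-pipeline", [{"protocol": "https", "name": "test"}])
print(path)
print(simulated_sources(EXAMPLE_RESPONSE))
```

Against a running cluster, a call such as send_simulate_request("http://localhost:9200", "outer-pipeline", [{"protocol": "https", "name": "test"}]) would return the parsed response. The URL here is an assumption about your environment.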
Step 3: Ingest a document
The following query ingests a document into an index named testindex1:
POST testindex1/_doc/1?pipeline=outer-pipeline
{
  "protocol": "https",
  "name": "test"
}
Response
The request indexes the document into the index testindex1. Before indexing, the pipeline converts the protocol field to uppercase and removes the name field. The following response confirms that the document was indexed:
{
  "_index": "testindex1",
  "_id": "1",
  "_version": 2,
  "result": "created",
  "_shards": {
    "total": 2,
    "successful": 2,
    "failed": 0
  },
  "_seq_no": 1,
  "_primary_term": 1
}
Step 4 (Optional): Retrieve the document
To retrieve the document, run the following query:
GET testindex1/_doc/1
Response
The response shows the document with the protocol field converted to uppercase and the name field removed:
{
  "_index": "testindex1",
  "_id": "1",
  "_version": 2,
  "_seq_no": 1,
  "_primary_term": 1,
  "found": true,
  "_source": {
    "protocol": "HTTPS"
  }
}