
You're viewing version 2.16 of the OpenSearch documentation. This version is no longer maintained. For the latest version, see the current documentation. For information about OpenSearch version maintenance, see Release Schedule and Maintenance Policy.

Pipeline processor

The pipeline processor allows a pipeline to reference and include another predefined pipeline. This can be useful when you have a set of common processors that need to be shared across multiple pipelines. Instead of redefining those common processors in each pipeline, you can create a separate base pipeline containing the shared processors and then reference that base pipeline from other pipelines using the pipeline processor.

The following is the syntax for the pipeline processor:

{
  "pipeline": {
    "name": "general-pipeline"
  }
}
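To make the include behavior concrete, the following Python sketch models it outside of OpenSearch. It is a toy interpreter, not the OpenSearch implementation: the function `run_pipeline` and the in-memory `pipelines` dictionary are hypothetical, and only the `uppercase`, `remove`, and `pipeline` processor types are modeled.

```python
# Toy model of nested pipelines; this is NOT how OpenSearch is implemented.
# run_pipeline and the pipelines dictionary are hypothetical names.

def run_pipeline(name, doc, pipelines):
    """Apply the processors of the named pipeline to doc and return doc."""
    for processor in pipelines[name]["processors"]:
        # Each processor object maps a processor type to its configuration.
        for kind, config in processor.items():
            if kind == "uppercase":
                field = config["field"]
                doc[field] = doc[field].upper()
            elif kind == "remove":
                del doc[config["field"]]
            elif kind == "pipeline":
                # The pipeline processor delegates to another pipeline by name.
                run_pipeline(config["name"], doc, pipelines)
            else:
                raise ValueError(f"unsupported processor in this sketch: {kind}")
    return doc


pipelines = {
    "general-pipeline": {
        "processors": [
            {"uppercase": {"field": "protocol"}},
            {"remove": {"field": "name"}},
        ]
    },
    "outer-pipeline": {
        "processors": [{"pipeline": {"name": "general-pipeline"}}]
    },
}

result = run_pipeline("outer-pipeline", {"protocol": "https", "name": "test"}, pipelines)
print(result)  # {'protocol': 'HTTPS'}
```

Running the outer pipeline produces the same document as running the shared pipeline directly, which is the point of the processor: the shared steps are defined once and reused by reference.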

Configuration parameters

The following table lists the required and optional parameters for the pipeline processor.

| Parameter | Required/Optional | Description |
| --- | --- | --- |
| name | Required | The name of the pipeline to execute. |
| description | Optional | A description of the processor’s purpose or configuration. |
| if | Optional | A condition that must be met for the processor to run. |
| ignore_failure | Optional | Specifies whether to ignore processor failures. See Handling pipeline failures. |
| on_failure | Optional | A list of processors to run if the processor fails. See Handling pipeline failures. |
| tag | Optional | An identifier for the processor. Useful for debugging and metrics. |
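As an illustration of how the optional parameters sit alongside `name`, the following Python sketch builds a pipeline request body as a dictionary and prints it as JSON. The condition string, tag, and description values are illustrative assumptions, not values taken from this page; adjust them to your own data.

```python
import json

# Hypothetical processor definition combining parameters from the table.
# All values other than the pipeline name are illustrative assumptions.
processor = {
    "pipeline": {
        "name": "general-pipeline",
        "description": "Run the shared processors when a protocol is present",
        "if": "ctx.protocol != null",  # illustrative condition script
        "ignore_failure": True,
        "tag": "shared-steps",
    }
}

body = {
    "description": "outer pipeline using optional parameters",
    "processors": [processor],
}

# The printed JSON can be used as the body of a PUT _ingest/pipeline/<name> request.
print(json.dumps(body, indent=2))
```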

Using the processor

Follow these steps to use the processor in a pipeline.

Step 1: Create a pipeline

The following requests create a general pipeline named general-pipeline and then an outer pipeline named outer-pipeline that references general-pipeline:

PUT _ingest/pipeline/general-pipeline
{
  "description": "a general pipeline",
  "processors": [
    {
      "uppercase": {
        "field": "protocol"
      }
    },
    {
      "remove": {
        "field": "name"
      }
    }
  ]
}

PUT _ingest/pipeline/outer-pipeline  
{  
  "description": "an outer pipeline referencing the general pipeline",  
  "processors": [  
    {  
      "pipeline": {  
        "name": "general-pipeline"  
      }  
    }  
  ]  
}
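The pipeline processor refers to its target by name, so the referenced name must match the name under which the target pipeline was created, character for character (a hyphen and an underscore are different names). The following Python sketch checks a set of pipeline definitions for references that do not resolve before you send them to a cluster. The function `missing_pipeline_references` is a hypothetical helper, and it inspects only top-level processors, not processors nested in `on_failure`.

```python
def missing_pipeline_references(pipelines):
    """Return (referencing pipeline, missing name) pairs for pipeline processors."""
    missing = []
    for owner, definition in pipelines.items():
        for processor in definition.get("processors", []):
            target = processor.get("pipeline", {}).get("name")
            if target is not None and target not in pipelines:
                missing.append((owner, target))
    return missing


pipelines = {
    # Underscore in the key: does not match the hyphenated reference below.
    "general_pipeline": {
        "processors": [
            {"uppercase": {"field": "protocol"}},
            {"remove": {"field": "name"}},
        ]
    },
    "outer-pipeline": {
        "processors": [{"pipeline": {"name": "general-pipeline"}}]
    },
}

problems = missing_pipeline_references(pipelines)
print(problems)  # [('outer-pipeline', 'general-pipeline')]
```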

Step 2 (Optional): Test the pipeline

It is recommended that you test your pipeline before you ingest documents.

To test the pipeline, run the following query:

POST _ingest/pipeline/outer-pipeline/_simulate
{  
  "docs": [  
    {  
      "_source": {  
        "protocol": "https",  
        "name":"test"  
      }  
    }  
  ]  
}  

Response

The following example response confirms that the pipeline is working as expected:

{  
  "docs": [  
    {  
      "doc": {  
        "_index": "_index",  
        "_id": "_id",  
        "_source": {  
          "protocol": "HTTPS"  
        },  
        "_ingest": {  
          "timestamp": "2024-05-24T02:43:43.700735801Z"  
        }  
      }  
    }  
  ]  
}
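If you run the simulation from a script, you can check the result programmatically instead of reading it by eye. The following Python sketch parses the example response above and extracts each simulated document's `_source`. The helper name `simulated_sources` is hypothetical, and the sketch assumes every entry in `docs` contains a `doc` object, as in this example.

```python
import json

# The example simulate response from this step, pasted as a string.
simulate_response = json.loads("""
{
  "docs": [
    {
      "doc": {
        "_index": "_index",
        "_id": "_id",
        "_source": {"protocol": "HTTPS"},
        "_ingest": {"timestamp": "2024-05-24T02:43:43.700735801Z"}
      }
    }
  ]
}
""")


def simulated_sources(response):
    """Return the _source of every simulated document in the response."""
    return [entry["doc"]["_source"] for entry in response["docs"]]


sources = simulated_sources(simulate_response)

# The uppercase processor ran and the remove processor dropped "name".
assert sources[0]["protocol"] == "HTTPS"
assert "name" not in sources[0]
print(sources)  # [{'protocol': 'HTTPS'}]
```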

Step 3: Ingest a document

The following query ingests a document into an index named testindex1:

POST testindex1/_doc/1?pipeline=outer-pipeline  
{  
  "protocol": "https",  
  "name": "test"  
}  

Response

The request indexes the document into the index testindex1 after the pipeline converts the protocol field to uppercase and removes the name field. The following response confirms that the document was indexed:

{  
  "_index": "testindex1",  
  "_id": "1",  
  "_version": 2,  
  "result": "created",  
  "_shards": {  
    "total": 2,  
    "successful": 2,  
    "failed": 0  
  },  
  "_seq_no": 1,  
  "_primary_term": 1  
}  
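The same ingest request can be prepared from a script. The following Python sketch builds the request with the standard library only and does not send it. The base URL `http://localhost:9200` and the absence of authentication are assumptions about a local test cluster; a secured cluster typically needs HTTPS and credentials.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:9200"  # assumption: local, unsecured test cluster

document = {"protocol": "https", "name": "test"}

# The pipeline is selected with the "pipeline" query parameter.
query = urllib.parse.urlencode({"pipeline": "outer-pipeline"})
url = f"{BASE_URL}/testindex1/_doc/1?{query}"

request = urllib.request.Request(
    url,
    data=json.dumps(document).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)

print(request.get_method(), request.full_url)
# To send it against a running cluster, call:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```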

Step 4 (Optional): Retrieve the document

To retrieve the document, run the following query:

GET testindex1/_doc/1

Response

The response shows the stored document, with the protocol field converted to uppercase and the name field removed:

{  
  "_index": "testindex1",  
  "_id": "1",  
  "_version": 2,  
  "_seq_no": 1,  
  "_primary_term": 1,  
  "found": true,  
  "_source": {  
    "protocol": "HTTPS"  
  }  
}