Link Search Menu Expand Document Documentation Menu

This documentation describes using the drop processor in OpenSearch ingest pipelines. Consider using the Data Prepper drop_events processor, which runs on the OpenSearch cluster, if your use case involves large or complex datasets.

Drop processor

The drop processor is used to discard documents without indexing them. This can be useful for preventing documents from being indexed based on certain conditions. For example, you might use a drop processor to prevent documents that are missing important fields or contain sensitive information from being indexed.

The drop processor does not raise any errors when it discards documents, making it useful for preventing indexing problems without cluttering your OpenSearch logs with error messages.

Syntax example

The following is the syntax for the drop processor:

{
  "drop": {
    "if": "ctx.foo == 'bar'"
  }
}

Configuration parameters

The following table lists the required and optional parameters for the drop processor.

Parameter Required Description
description Optional A brief description of the processor.
if Optional A condition for running the processor.
ignore_failure Optional If set to true, failures are ignored. Default is false. See Handling pipeline failures for more information.
on_failure Optional A list of processors to run if the processor fails. See Handling pipeline failures for more information.
tag Optional An identifier tag for the processor. Useful for distinguishing between processors of the same type when debugging.

Using the processor

Follow these steps to use the processor in a pipeline.

Step 1: Create a pipeline

The following query creates a pipeline, named drop-pii, that uses the drop processor to prevent a document containing personally identifiable information (PII) from being indexed:

PUT /_ingest/pipeline/drop-pii
{
  "description": "Pipeline that prevents PII from being indexed",
  "processors": [
    {
      "drop": {
        "if" : "ctx.user_info.contains('password') || ctx.user_info.contains('credit card')"
      }
    }
  ]
}

Step 2 (Optional): Test the pipeline

It is recommended that you test your pipeline before ingesting documents.

To test the pipeline, run the following query:

POST _ingest/pipeline/drop-pii/_simulate
{
  "docs": [
    {
      "_index": "testindex1",
      "_id": "1",
      "_source": {
        "user_info": "Sensitive information including credit card"
      }
    }
  ]
}

Response

The following example response confirms that the pipeline is working as expected (the document has been dropped):

{
  "docs": [
    null
  ]
}

Step 3: Ingest a document

The following query ingests a document into an index named testindex1:

PUT testindex1/_doc/1?pipeline=drop-pii
{
  "user_info": "Sensitive information including credit card"
}

The following response confirms that the document with the ID of 1 was not indexed:

{ “_index”: “testindex1”, “_id”: “1”, “_version”: -3, “result”: “noop”, “_shards”: { “total”: 0, “successful”: 0, “failed”: 0 } }

350 characters left

Have a question? .

Want to contribute? or .