You're viewing version 2.17 of the OpenSearch documentation. This version is no longer maintained. For the latest version, see the current documentation. For information about OpenSearch version maintenance, see Release Schedule and Maintenance Policy.

Split processor

Introduced 2.17

The split processor splits a string field into an array of substrings based on a specified delimiter.

Request body fields

The following table lists all available request fields.

Field	Data type	Description
`field`	String	The field containing the string to be split. Required.
`separator`	String	The delimiter used to split the string. Specify either a single separator character or a regular expression pattern. Required.
`preserve_trailing`	Boolean	If set to `true`, preserves empty trailing fields (for example, `''`) in the resulting array. If set to `false`, then empty trailing fields are removed from the resulting array. Optional. Default is `false`.
`target_field`	String	The field in which the array of substrings is stored. If not specified, then the field is updated in place. Optional.
`tag`	String	The processor’s identifier. Optional.
`description`	String	A description of the processor. Optional.
`ignore_failure`	Boolean	If `true`, then OpenSearch ignores any failure of this processor and continues to run the remaining processors in the search pipeline. Optional. Default is `false`.
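
The effect of `separator` and `preserve_trailing` can be illustrated with a minimal Python sketch. This is not the processor's implementation: the `split_field` helper is a hypothetical name, and the sketch assumes that the separator is applied as a regular expression and that trailing empty strings are dropped unless `preserve_trailing` is `true`. Python and OpenSearch regular expression syntax may differ, and edge cases such as an empty input string may behave differently in OpenSearch.

```python
import re
from typing import List


def split_field(value: str, separator: str, preserve_trailing: bool = False) -> List[str]:
    """Hypothetical helper: split `value` on a regex `separator`.

    Trailing empty strings are removed unless `preserve_trailing` is True,
    mirroring the behavior described in the table above.
    """
    parts = re.split(separator, value)
    if not preserve_trailing:
        # Drop only the empty strings at the end of the list.
        while parts and parts[-1] == "":
            parts.pop()
    return parts


print(split_field("ingest, search, visualize, and analyze data", ", "))
print(split_field("a,b,,", ","))
print(split_field("a,b,,", ",", preserve_trailing=True))
```

With the default `preserve_trailing` value, the input `a,b,,` yields two elements; with `preserve_trailing` set to `true`, the two empty trailing fields are kept.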

Example

The following example demonstrates using a search pipeline with a split processor.

Setup

Create an index named `my_index` and index a document containing the field `message`:

POST /my_index/_doc/1
{
  "message": "ingest, search, visualize, and analyze data",
  "visibility": "public"
}

Creating a search pipeline

The following request creates a search pipeline with a `split` response processor that splits the `message` field and stores the results in the `split_message` field:

PUT /_search/pipeline/my_pipeline
{
  "response_processors": [
    {
      "split": {
        "field": "message",
        "separator": ", ",
        "target_field": "split_message"
      }
    }
  ]
}
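
If you create pipelines from a script rather than from a console, the same request can be assembled programmatically. The following is a minimal sketch using only the Python standard library. The `build_split_pipeline_request` helper is a hypothetical name, and the base URL `http://localhost:9200` is an assumption about a local cluster with no authentication; adjust both for your environment.

```python
import json
import urllib.request


def build_split_pipeline_request(base_url, pipeline_name, field, separator, target_field=None):
    """Hypothetical helper: build a PUT request that creates a search pipeline
    containing a single split response processor."""
    processor = {"field": field, "separator": separator}
    if target_field is not None:
        # Without target_field, the processor updates the field in place.
        processor["target_field"] = target_field
    body = {"response_processors": [{"split": processor}]}
    return urllib.request.Request(
        url=f"{base_url}/_search/pipeline/{pipeline_name}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Assumed local cluster address; not part of the original example.
request = build_split_pipeline_request(
    "http://localhost:9200", "my_pipeline", "message", ", ", "split_message"
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
# To send the request to a running cluster, call urllib.request.urlopen(request).
```

The sketch only builds and prints the request, so it runs without a cluster; sending it is left as a separate, explicit step.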

Using a search pipeline

Search for documents in `my_index` without a search pipeline:

GET /my_index/_search

The response contains the unmodified `message` field:

Response

{
  "took": 3,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 1,
      "relation": "eq"
    },
    "max_score": 1,
    "hits": [
      {
        "_index": "my_index",
        "_id": "1",
        "_score": 1,
        "_source": {
          "message": "ingest, search, visualize, and analyze data",
          "visibility": "public"
        }
      }
    ]
  }
}

To search with a pipeline, specify the pipeline name in the `search_pipeline` query parameter:

GET /my_index/_search?search_pipeline=my_pipeline

The `message` field is split, and the results are stored in the `split_message` field:

Response

{
  "took": 6,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 1,
      "relation": "eq"
    },
    "max_score": 1,
    "hits": [
      {
        "_index": "my_index",
        "_id": "1",
        "_score": 1,
        "_source": {
          "visibility": "public",
          "message": "ingest, search, visualize, and analyze data",
          "split_message": [
            "ingest",
            "search",
            "visualize",
            "and analyze data"
          ]
        }
      }
    ]
  }
}
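
The difference between specifying `target_field` and updating the field in place can be seen by applying the same transformation to a hit on the client side. The following is a minimal sketch, not the processor's implementation: the `apply_split` helper is a hypothetical name, and it assumes the default behavior of dropping empty trailing fields.

```python
import copy
import re


def apply_split(hit, field, separator, target_field=None):
    """Hypothetical helper: return a copy of a search hit in which
    `_source[field]` is split into a list of substrings."""
    result = copy.deepcopy(hit)  # leave the original hit untouched
    source = result["_source"]
    parts = re.split(separator, source[field])
    # Assumed default (preserve_trailing=false): drop empty trailing fields.
    while parts and parts[-1] == "":
        parts.pop()
    # Without a target field, the original field is overwritten.
    source[target_field or field] = parts
    return result


hit = {
    "_index": "my_index",
    "_id": "1",
    "_source": {
        "message": "ingest, search, visualize, and analyze data",
        "visibility": "public",
    },
}

with_target = apply_split(hit, "message", ", ", target_field="split_message")
in_place = apply_split(hit, "message", ", ")
print(with_target["_source"])
print(in_place["_source"])
```

With `target_field`, the original `message` string is kept alongside the new array, as in the preceding response. Without it, `message` itself becomes the array.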

You can also use the `fields` option to retrieve specific fields from a document:

POST /my_index/_search?pretty&search_pipeline=my_pipeline
{
    "fields": ["visibility", "message"]
}

In the response, the `message` field is split, and the results are stored in the `split_message` field, which appears in both `_source` and `fields`:

Response

{
  "took": 7,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 1,
      "relation": "eq"
    },
    "max_score": 1,
    "hits": [
      {
        "_index": "my_index",
        "_id": "1",
        "_score": 1,
        "_source": {
          "visibility": "public",
          "message": "ingest, search, visualize, and analyze data",
          "split_message": [
            "ingest",
            "search",
            "visualize",
            "and analyze data"
          ]
        },
        "fields": {
          "visibility": [
            "public"
          ],
          "message": [
            "ingest, search, visualize, and analyze data"
          ],
          "split_message": [
            "ingest",
            "search",
            "visualize",
            "and analyze data"
          ]
        }
      }
    ]
  }
}
