You're viewing version 2.18 of the OpenSearch documentation. This version is no longer maintained. For the latest version, see the current documentation. For information about OpenSearch version maintenance, see Release Schedule and Maintenance Policy.

Unique token filter

The `unique` token filter removes duplicate tokens from the token stream during analysis. Within a single field or text block, only the first occurrence of each token is kept.

Parameters

The unique token filter can be configured with the following parameter.

Parameter	Required/Optional	Data type	Description
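`only_on_same_position`	Optional	Boolean	If `true`, the token filter behaves like the `remove_duplicates` token filter and removes a duplicate token only when it occupies the same position as an earlier identical token. If `false`, duplicates are removed regardless of position. Default is `false`.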
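
The difference between the two settings can be modeled outside OpenSearch. The following Python sketch is not OpenSearch code. It assumes a simplified token model in which each token is a `(term, position_increment)` pair and an increment of `0` means the token shares its position with the previous token, as a synonym might. The function name `unique_tokens` and the sample stream are hypothetical and exist only for illustration:

```python
from typing import Iterable, List, Tuple

# Hypothetical token model: (term, position increment).
# An increment of 0 means "same position as the previous token".
Token = Tuple[str, int]


def unique_tokens(tokens: Iterable[Token],
                  only_on_same_position: bool = False) -> List[Token]:
    """Simplified model of duplicate removal (an assumption, not the real filter)."""
    seen = set()
    kept = []
    for term, pos_inc in tokens:
        if only_on_same_position:
            if pos_inc > 0:
                # A new position starts, so forget terms seen at the old one.
                seen.clear()
            duplicate = pos_inc == 0 and term in seen
        else:
            # Any earlier occurrence anywhere in the stream counts.
            duplicate = term in seen
        seen.add(term)
        if not duplicate:
            kept.append((term, pos_inc))
    return kept


# Hypothetical stream: "quick" and "fast" share a position, "quick" is
# stacked there a second time, and "quick" appears again at the next position.
stream = [("quick", 1), ("fast", 0), ("quick", 0), ("quick", 1)]

print(unique_tokens(stream, only_on_same_position=False))
print(unique_tokens(stream, only_on_same_position=True))
```

In this model, the default setting keeps one `quick` and one `fast` for the whole stream, while `only_on_same_position` set to `true` drops only the stacked copy of `quick` and keeps the one at the following position.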

Example

The following example request creates a new index named `unique_example` and configures an analyzer with a `unique` filter:

PUT /unique_example
{
  "settings": {
    "analysis": {
      "filter": {
        "unique_filter": {
          "type": "unique",
          "only_on_same_position": false
        }
      },
      "analyzer": {
        "unique_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": [
            "lowercase",
            "unique_filter"
          ]
        }
      }
    }
  }
}

Generated tokens

Use the following request to examine the tokens generated using the analyzer:

GET /unique_example/_analyze
{
  "analyzer": "unique_analyzer",
  "text": "OpenSearch OpenSearch is powerful powerful and scalable"
}

The response contains the generated tokens:

{
  "tokens": [
    {
      "token": "opensearch",
      "start_offset": 0,
      "end_offset": 10,
      "type": "<ALPHANUM>",
      "position": 0
    },
    {
      "token": "is",
      "start_offset": 22,
      "end_offset": 24,
      "type": "<ALPHANUM>",
      "position": 1
    },
    {
      "token": "powerful",
      "start_offset": 25,
      "end_offset": 33,
      "type": "<ALPHANUM>",
      "position": 2
    },
    {
      "token": "and",
      "start_offset": 43,
      "end_offset": 46,
      "type": "<ALPHANUM>",
      "position": 3
    },
    {
      "token": "scalable",
      "start_offset": 47,
      "end_offset": 55,
      "type": "<ALPHANUM>",
      "position": 4
    }
  ]
}
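
Each surviving token carries the offsets of its first occurrence, which is why `start_offset` jumps from 0 to 22 and from 25 to 43 in the response. As a rough cross-check, the following Python sketch reproduces the same token list without a cluster. It is only an approximation under stated assumptions: the regular expression `\w+` stands in for the `standard` tokenizer, positions are numbered over the surviving tokens to match the response above, and the function name `analyze_unique` is hypothetical:

```python
import re


def analyze_unique(text):
    """Approximate lowercase + unique analysis of plain ASCII text."""
    seen = set()
    tokens = []
    for match in re.finditer(r"\w+", text):
        term = match.group().lower()  # stands in for the lowercase filter
        if term in seen:
            continue  # later occurrences are dropped
        seen.add(term)
        tokens.append({
            "token": term,
            "start_offset": match.start(),
            "end_offset": match.end(),
            # Numbered over kept tokens, as in the response shown above.
            "position": len(tokens),
        })
    return tokens


text = "OpenSearch OpenSearch is powerful powerful and scalable"
for token in analyze_unique(text):
    print(token)
```

For text that the `standard` tokenizer splits differently than `\w+` does (for example, text containing apostrophes or non-Latin scripts), the sketch and the actual analyzer can disagree, so the `_analyze` API remains the authoritative check.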
