
Limit token filter

The limit token filter limits the number of tokens that pass through the analysis chain.

Parameters

The limit token filter can be configured with the following parameters.

| Parameter | Required/Optional | Data type | Description |
| :--- | :--- | :--- | :--- |
| `max_token_count` | Optional | Integer | The maximum number of tokens to be generated. Default is `1`. |
| `consume_all_tokens` | Optional | Boolean | (Expert-level setting) Uses all tokens from the tokenizer, even if the result exceeds `max_token_count`. When this parameter is set to `true`, the output still contains only the number of tokens specified by `max_token_count`; however, all tokens generated by the tokenizer are processed. Default is `false`. |
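For illustration, the following sketch shows how `consume_all_tokens` might be enabled in a filter definition (the index, analyzer, and filter names here are hypothetical, not part of the example that follows): the tokenizer processes the entire input, but only the first token is emitted.

PUT my_index_consume_all
{
  "settings": {
    "analysis": {
      "analyzer": {
        "one_token_limit": {
          "tokenizer": "standard",
          "filter": [ "consume_all_limit" ]
        }
      },
      "filter": {
        "consume_all_limit": {
          "type": "limit",
          "max_token_count": 1,
          "consume_all_tokens": true
        }
      }
    }
  }
}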

Example

The following example request creates a new index named my_index and configures an analyzer with a limit filter:

PUT my_index
{
  "settings": {
    "analysis": {
      "analyzer": {
        "three_token_limit": {
          "tokenizer": "standard",
          "filter": [ "custom_token_limit" ]
        }
      },
      "filter": {
        "custom_token_limit": {
          "type": "limit",
          "max_token_count": 3
        }
      }
    }
  }
}

Generated tokens

Use the following request to examine the tokens generated using the analyzer:

GET /my_index/_analyze
{
  "analyzer": "three_token_limit",
  "text": "OpenSearch is a powerful and flexible search engine."
}

The response contains the generated tokens:

{
  "tokens": [
    {
      "token": "OpenSearch",
      "start_offset": 0,
      "end_offset": 10,
      "type": "<ALPHANUM>",
      "position": 0
    },
    {
      "token": "is",
      "start_offset": 11,
      "end_offset": 13,
      "type": "<ALPHANUM>",
      "position": 1
    },
    {
      "token": "a",
      "start_offset": 14,
      "end_offset": 15,
      "type": "<ALPHANUM>",
      "position": 2
    }
  ]
}
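Because `max_token_count` is `3`, only the first three tokens are returned; the remaining tokens in the sentence are discarded. As a quicker sketch, you can also try the filter without creating an index by defining it inline in the `_analyze` request (the values shown are illustrative):

GET /_analyze
{
  "tokenizer": "standard",
  "filter": [
    {
      "type": "limit",
      "max_token_count": 3
    }
  ],
  "text": "OpenSearch is a powerful and flexible search engine."
}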