You're viewing version 2.16 of the OpenSearch documentation. This version is no longer maintained. For the latest version, see the current documentation. For information about OpenSearch version maintenance, see Release Schedule and Maintenance Policy.
Match phrase query
Use the match_phrase query to match documents that contain an exact phrase in a specified order. You can add flexibility to phrase matching by providing the slop parameter.
The match_phrase query creates a phrase query that matches a sequence of terms.
The following example shows a basic match_phrase query:
GET _search
{
  "query": {
    "match_phrase": {
      "title": "the wind"
    }
  }
}
To pass additional parameters, you can use the expanded syntax:
GET _search
{
  "query": {
    "match_phrase": {
      "title": {
        "query": "the wind",
        "analyzer": "stop"
      }
    }
  }
}
Example
Consider an index with the following documents:
PUT testindex/_doc/1
{
  "title": "The wind rises"
}
PUT testindex/_doc/2
{
  "title": "Gone with the wind"
}
The following match_phrase query searches for the phrase wind rises, where the word wind is followed by the word rises:
GET testindex/_search
{
  "query": {
    "match_phrase": {
      "title": "wind rises"
    }
  }
}
The response contains the matching document:
{
  "took": 30,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 1,
      "relation": "eq"
    },
    "max_score": 0.92980814,
    "hits": [
      {
        "_index": "testindex",
        "_id": "1",
        "_score": 0.92980814,
        "_source": {
          "title": "The wind rises"
        }
      }
    ]
  }
}
Analyzer
By default, when you run a query on a text field, the search text is analyzed using the index analyzer associated with the field. You can specify a different search analyzer in the analyzer parameter. For example, the following query uses the english analyzer:
GET testindex/_search
{
  "query": {
    "match_phrase": {
      "title": {
        "query": "the winds",
        "analyzer": "english"
      }
    }
  }
}
The english analyzer removes the stopword the and stems winds, producing the single token wind. Both documents contain this token and are returned in the results:
{
  "took": 2,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 2,
      "relation": "eq"
    },
    "max_score": 0.19363807,
    "hits": [
      {
        "_index": "testindex",
        "_id": "1",
        "_score": 0.19363807,
        "_source": {
          "title": "The wind rises"
        }
      },
      {
        "_index": "testindex",
        "_id": "2",
        "_score": 0.17225474,
        "_source": {
          "title": "Gone with the wind"
        }
      }
    ]
  }
}
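To see why both documents match, the analysis step can be imitated in a few lines of Python. This is only a toy sketch: toy_english_analyze, standard_like_analyze, and the one-word stopword list are hypothetical stand-ins, not the real english or standard analyzers, and the title field is assumed to be indexed with default settings.

```python
# Toy stand-ins for analyzers; names and rules are assumptions for illustration.
STOPWORDS = {"the"}  # hypothetical one-word stopword list


def toy_english_analyze(text):
    """Lowercase, drop stopwords, and crudely strip a plural 's'."""
    tokens = []
    for word in text.lower().split():
        if word in STOPWORDS:
            continue
        # Crude plural stripping; the real analyzer uses a proper stemmer.
        if word.endswith("s") and len(word) > 3:
            word = word[:-1]
        tokens.append(word)
    return tokens


def standard_like_analyze(text):
    """Lowercase and split on whitespace, keeping every word."""
    return text.lower().split()


docs = {"1": "The wind rises", "2": "Gone with the wind"}

# Only the query goes through the toy english analyzer.
query_tokens = toy_english_analyze("the winds")

# With a single query token, a term-presence check is sufficient.
matching_ids = [
    doc_id
    for doc_id, title in docs.items()
    if all(tok in standard_like_analyze(title) for tok in query_tokens)
]

print(query_tokens)
print(matching_ids)
```

Because the analyzed query contains a single token, checking for term presence is enough here. A query that analyzes to several tokens would also need a position check.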
Slop
If you provide a slop parameter, the query tolerates reorderings of the search terms. Slop specifies the number of other words permitted between words in a query phrase. For example, in the following query, the search text is reordered compared to the document text:
GET _search
{
  "query": {
    "match_phrase": {
      "title": {
        "query": "wind rises the",
        "slop": 3
      }
    }
  }
}
The query still returns the matching document:
{
  "took": 2,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 1,
      "relation": "eq"
    },
    "max_score": 0.44026947,
    "hits": [
      {
        "_index": "testindex",
        "_id": "1",
        "_score": 0.44026947,
        "_source": {
          "title": "The wind rises"
        }
      }
    ]
  }
}
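The amount of slop a reordered phrase needs can be estimated with a small Python sketch. This is a simplified model, not Lucene's actual implementation: it assumes tokens are already analyzed (lowercased, no stopword removal), and the function names min_slop and phrase_matches are hypothetical. For each query term, the model computes how far the term is shifted in the document relative to its position in the query; the required slop is the spread between the largest and smallest shift.

```python
from itertools import product


def min_slop(doc_tokens, query_tokens):
    """Return the smallest slop at which the phrase matches, or None.

    Simplified model: the cost of an alignment is the spread between the
    largest and smallest (document position - query position) shift.
    """
    if not query_tokens:
        return None
    # All document positions at which each query term occurs.
    positions = [
        [i for i, tok in enumerate(doc_tokens) if tok == q]
        for q in query_tokens
    ]
    if not all(positions):
        return None  # at least one query term is missing from the document
    best = None
    for combo in product(*positions):
        if len(set(combo)) < len(combo):
            continue  # a document position may be used only once
        shifts = [d - q for q, d in enumerate(combo)]
        cost = max(shifts) - min(shifts)
        if best is None or cost < best:
            best = cost
    return best


def phrase_matches(doc_tokens, query_tokens, slop=0):
    """True if the phrase matches within the given slop under this model."""
    needed = min_slop(doc_tokens, query_tokens)
    return needed is not None and needed <= slop


doc = ["the", "wind", "rises"]
print(min_slop(doc, ["wind", "rises"]))         # exact phrase
print(min_slop(doc, ["wind", "rises", "the"]))  # reordered, as in the query above
print(min_slop(doc, ["rises", "wind"]))         # two words swapped
```

Under this model, the exact phrase needs a slop of 0, the reordered query wind rises the needs 3, and swapping two adjacent words needs 2, which agrees with the Lucene note that reordering requires a slop of at least two.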
Empty query
For information about a possible empty query, see the corresponding match query section.
Parameters
The query accepts the name of the field (<field>) as a top-level parameter:
GET _search
{
  "query": {
    "match_phrase": {
      "<field>": {
        "query": "text to search for",
        ...
      }
    }
  }
}
The <field> accepts the following parameters. All parameters except query are optional.
Parameter | Data type | Description |
---|---|---|
query | String | The query string to use for search. Required. |
analyzer | String | The analyzer used to tokenize the query string text. Default is the index-time analyzer specified for the <field>. If no analyzer is specified for the <field>, the analyzer is the default analyzer for the index. |
slop | 0 (default) or a positive integer | Controls the degree to which words in a query can be misordered and still be considered a match. From the Lucene documentation: “The number of other words permitted between words in query phrase. For example, to switch the order of two words requires two moves (the first move places the words atop one another), so to permit reorderings of phrases, the slop must be at least two. A value of zero requires an exact match.” |
zero_terms_query | String | In some cases, the analyzer removes all terms from a query string. For example, the stop analyzer removes all terms from the string an but this. In those cases, zero_terms_query specifies whether to match no documents (none) or all documents (all). Valid values are none and all. Default is none. |
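The parameters in the table can be combined into a request body programmatically. The following Python sketch uses a hypothetical helper, build_match_phrase, that is not part of any OpenSearch client. It only validates the values described above and assembles the JSON body, which you could then send to the _search endpoint with a client of your choice.

```python
import json

VALID_ZERO_TERMS = {"none", "all"}


def build_match_phrase(field, query, analyzer=None, slop=0,
                       zero_terms_query="none"):
    """Assemble a match_phrase request body (hypothetical helper)."""
    # slop must be 0 or a positive integer; bool is rejected explicitly
    # because it is a subclass of int in Python.
    if isinstance(slop, bool) or not isinstance(slop, int) or slop < 0:
        raise ValueError("slop must be 0 or a positive integer")
    if zero_terms_query not in VALID_ZERO_TERMS:
        raise ValueError("zero_terms_query must be 'none' or 'all'")
    params = {
        "query": query,
        "slop": slop,
        "zero_terms_query": zero_terms_query,
    }
    # analyzer is optional; omit it to use the field's default analyzer.
    if analyzer is not None:
        params["analyzer"] = analyzer
    return {"query": {"match_phrase": {field: params}}}


# The stop analyzer removes every term from "an but this", so
# zero_terms_query decides whether no documents or all documents match.
body = build_match_phrase(
    "title", "an but this", analyzer="stop", zero_terms_query="all"
)
print(json.dumps(body, indent=2))
```

Writing out slop and zero_terms_query with their default values is a design choice in this sketch that makes the request self-describing; omitting them would produce an equivalent query.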