Search processors
Search processors can be of the following types: search request processors, search response processors, and search phase results processors.
Search request processors
A search request processor intercepts a search request (the query and the metadata passed in the request), performs an operation with or on the search request, and submits the search request to the index.
The following table lists all supported search request processors.
Processor | Description | Earliest available version |
---|---|---|
filter_query | Adds a filter query to the search request in order to filter the results. | 2.8 |
ml_inference | Invokes registered machine learning (ML) models in order to rewrite queries. | 2.16 |
neural_query_enricher | Sets a default model for neural search and neural sparse search at the index or field level. | 2.11 (neural), 2.13 (neural sparse) |
neural_sparse_two_phase_processor | Accelerates the neural sparse query. | 2.15 |
oversample | Increases the search request size parameter, storing the original value in the pipeline state. | 2.12 |
script | Adds a script that is run on incoming search requests. | 2.8 |
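To make the interception step concrete, the following Python sketch mimics what a `filter_query`-style request processor might do to a request body before it reaches the index. It is a simplified model, not the OpenSearch implementation: the helper name `apply_filter_query` and the sample fields `title` and `visibility` are hypothetical, and wrapping the original query in a `bool` query is an assumption about how the filter is combined with it.

```python
from copy import deepcopy
from typing import Any


def apply_filter_query(request: dict[str, Any], filter_clause: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of the search request with an added filter (hypothetical helper)."""
    rewritten = deepcopy(request)  # leave the caller's request untouched
    original_query = rewritten.get("query")
    bool_query: dict[str, Any] = {"filter": [filter_clause]}
    if original_query is not None:
        # Keep the user's query for scoring; the filter only narrows the results.
        bool_query["must"] = [original_query]
    rewritten["query"] = {"bool": bool_query}
    return rewritten


# Hypothetical request and filter, used only for illustration.
request = {"query": {"match": {"title": "wind"}}, "size": 5}
rewritten = apply_filter_query(request, {"term": {"visibility": "public"}})
print(rewritten)
```

The rest of the request (here, `size`) passes through unchanged, which matches the idea that a request processor operates on the request and then submits it to the index.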
Search response processors
A search response processor intercepts a search response and search request (the query, results, and metadata passed in the request), performs an operation with or on the search response, and returns the search response.
The following table lists all supported search response processors.
Processor | Description | Earliest available version |
---|---|---|
collapse | Deduplicates search hits based on a field value, similarly to collapse in a search request. | 2.12 |
ml_inference | Invokes registered machine learning (ML) models in order to incorporate model output as additional search response fields. | 2.16 |
personalize_search_ranking | Uses Amazon Personalize to rerank search results (requires setting up the Amazon Personalize service). | 2.9 |
rename_field | Renames an existing field. | 2.8 |
rerank | Reranks search results using a cross-encoder model. | 2.12 |
retrieval_augmented_generation | Used for retrieval-augmented generation (RAG) in conversational search. | 2.10 (generally available in 2.12) |
sort | Sorts an array of items in either ascending or descending order. | 2.16 |
truncate_hits | Discards search hits after a specified target count is reached. Can undo the effect of the oversample request processor. | 2.12 |
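The interplay between `oversample` and `truncate_hits` described in the tables above can be modeled in a few lines of Python. This is a conceptual sketch only: the function names, the `state` dictionary, and the `original_size` key are hypothetical stand-ins for the pipeline state, and the `sample_factor` multiplier and the fallback size of 10 are assumptions made for the illustration.

```python
import math
from typing import Any


def oversample(request: dict[str, Any], state: dict[str, Any], sample_factor: float) -> dict[str, Any]:
    """Request side: enlarge `size` and remember the original value in the state."""
    original_size = request.get("size", 10)  # assumption: a size of 10 when unset
    state["original_size"] = original_size
    return {**request, "size": math.ceil(original_size * sample_factor)}


def truncate_hits(hits: list[Any], state: dict[str, Any]) -> list[Any]:
    """Response side: discard hits beyond the size stored by `oversample`."""
    return hits[: state["original_size"]]


state: dict[str, Any] = {}
request = oversample({"query": {"match_all": {}}, "size": 3}, state, sample_factor=2.0)
hits = [f"doc-{i}" for i in range(request["size"])]  # stand-in for the larger result set
final_hits = truncate_hits(hits, state)
print(request["size"], final_hits)
```

In a real pipeline, other response processors (for example, a reranker) would typically run between the two steps, so that the extra hits are considered before the list is cut back to the requested size.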
Search phase results processors
A search phase results processor runs between search phases at the coordinating node level. It intercepts the results retrieved from one search phase and transforms them before passing them to the next search phase.
The following table lists all supported search phase results processors.
Processor | Description | Earliest available version |
---|---|---|
normalization-processor | Intercepts the query phase results and normalizes and combines the document scores before passing the documents to the fetch phase. | 2.10 |
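As a rough illustration of what "normalizes and combines the document scores" can mean, the sketch below applies min-max normalization to each subquery's scores and then averages them per document. Both techniques are chosen here purely for illustration; the function names, the sample scores, and the treatment of a missing score as 0 are assumptions and may not match how the processor is configured in a given pipeline.

```python
def min_max_normalize(scores: dict[str, float]) -> dict[str, float]:
    """Rescale one subquery's scores to the range [0, 1]."""
    if not scores:
        return {}
    low, high = min(scores.values()), max(scores.values())
    if high == low:
        # All scores are equal; treat every document as a top match.
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (score - low) / (high - low) for doc_id, score in scores.items()}


def combine_mean(per_query_scores: list[dict[str, float]]) -> dict[str, float]:
    """Average the normalized scores per document; a missing score counts as 0."""
    if not per_query_scores:
        return {}
    normalized = [min_max_normalize(scores) for scores in per_query_scores]
    doc_ids = set().union(*normalized)
    return {
        doc_id: sum(scores.get(doc_id, 0.0) for scores in normalized) / len(normalized)
        for doc_id in doc_ids
    }


# Hypothetical query phase results from two subqueries with different score scales.
keyword_scores = {"a": 12.0, "b": 4.0, "c": 8.0}
vector_scores = {"a": 0.2, "b": 0.6}
combined = combine_mean([keyword_scores, vector_scores])
print(sorted(combined.items()))
```

Normalizing first matters because the raw scores of the two subqueries are on different scales; without it, the keyword scores would dominate the combined ranking.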
Viewing available processor types
You can use the Nodes Search Pipelines API to view the available processor types:
GET /_nodes/search_pipelines
The response contains the search_pipelines object that lists the available request and response processors:
Response
{
  "_nodes" : {
    "total" : 1,
    "successful" : 1,
    "failed" : 0
  },
  "cluster_name" : "runTask",
  "nodes" : {
    "36FHvCwHT6Srbm2ZniEPhA" : {
      "name" : "runTask-0",
      "transport_address" : "127.0.0.1:9300",
      "host" : "127.0.0.1",
      "ip" : "127.0.0.1",
      "version" : "3.0.0",
      "build_type" : "tar",
      "build_hash" : "unknown",
      "roles" : [
        "cluster_manager",
        "data",
        "ingest",
        "remote_cluster_client"
      ],
      "attributes" : {
        "testattr" : "test",
        "shard_indexing_pressure_enabled" : "true"
      },
      "search_pipelines" : {
        "request_processors" : [
          {
            "type" : "filter_query"
          },
          {
            "type" : "script"
          }
        ],
        "response_processors" : [
          {
            "type" : "rename_field"
          }
        ]
      }
    }
  }
}
In addition to the processors provided by OpenSearch, plugins may provide their own processors.
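A response of this shape can be summarized with a short script. The following Python sketch collects the processor types reported under `search_pipelines` across all nodes; the helper name `available_processor_types` is hypothetical, and the sample dictionary is a trimmed copy of the response shown above rather than output fetched from a live cluster.

```python
from typing import Any


def available_processor_types(response: dict[str, Any]) -> dict[str, set[str]]:
    """Group the reported processor types by category, merged across all nodes."""
    available: dict[str, set[str]] = {}
    for node in response.get("nodes", {}).values():
        pipelines = node.get("search_pipelines", {})
        for category, processors in pipelines.items():
            available.setdefault(category, set()).update(p["type"] for p in processors)
    return available


# Trimmed copy of the example response; only the fields the helper reads are kept.
sample_response = {
    "nodes": {
        "36FHvCwHT6Srbm2ZniEPhA": {
            "name": "runTask-0",
            "search_pipelines": {
                "request_processors": [{"type": "filter_query"}, {"type": "script"}],
                "response_processors": [{"type": "rename_field"}],
            },
        }
    }
}

types = available_processor_types(sample_response)
print({category: sorted(names) for category, names in types.items()})
```

The helper takes the union across nodes. If plugins are installed on only some nodes, comparing the per-node sets instead would show where a processor type is missing.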
Selectively enabling processors
Processors defined by the search-pipeline-common module are selectively enabled through the following cluster settings: search.pipeline.common.request.processors.allowed, search.pipeline.common.response.processors.allowed, or search.pipeline.common.search.phase.results.processors.allowed. If unspecified, then all processors are enabled. An empty list disables all processors. Removing enabled processors causes pipelines using them to fail after a node restart.
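The three cases for an allowlist setting (unspecified, empty, and populated) can be summarized in a small Python model. This sketch only illustrates the semantics stated above; the function name `enabled_processors` and the `installed` set are hypothetical, and no OpenSearch API is called.

```python
from typing import Optional


def enabled_processors(installed: set[str], allowed: Optional[list[str]]) -> set[str]:
    """Model one `...processors.allowed` setting for a single processor category."""
    if allowed is None:
        # Setting not specified: every installed processor stays enabled.
        return set(installed)
    # Setting specified: only listed processors remain (an empty list disables all).
    return installed & set(allowed)


# Hypothetical set of request processors from the search-pipeline-common module.
installed = {"filter_query", "script", "oversample"}

print(sorted(enabled_processors(installed, None)))
print(sorted(enabled_processors(installed, [])))
print(sorted(enabled_processors(installed, ["filter_query"])))
```

A pipeline that references a processor missing from the resulting set corresponds to the failure case described above, so it is worth checking existing pipelines before narrowing an allowlist.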