Performance Analyzer provides a number of metrics to help you evaluate performance. The following tables describe the available metrics, grouped by the dimensions that are most relevant for that metric. All metrics support the avg, sum, min, and max aggregations, although for certain metrics, the measured value is the same regardless of aggregation type.
For information about each of the dimensions, see dimensions reference later in this topic.
This list is extensive. We recommend using Ctrl/Cmd + F to find what you’re looking for.
CPU usage ratio. CPU time (in milliseconds) used by the associated thread(s) in the past five seconds, divided by 5000 milliseconds.
Paging_MajfltRate
The number of major faults per second in the past five seconds. A major fault requires the process to load a memory page from disk.
Paging_MinfltRate
The number of minor faults per second in the past five seconds. A minor fault does not requires the process to load a memory page from disk.
Paging_RSS
The number of pages the process has in real memory---the pages that count towards text, data, or stack space. This number does not include pages that have not been demand-loaded in or swapped out.
Sched_Runtime
Time (seconds) spent executing on the CPU per context switch.
Sched_Waittime
Time (seconds) spent waiting on a run queue per context switch.
Sched_CtxRate
Number of times run on the CPU per second in the past five seconds.
Heap_AllocRate
An approximation, in bytes, of the heap memory allocated per second in the last 5 seconds.
IO_ReadThroughput
Number of bytes read per second in the last five seconds.
IO_WriteThroughput
Number of bytes written per second in the last five seconds.
IO_TotThroughput
Number of bytes read or written per second in the last five seconds.
IO_ReadSyscallRate
Read system calls per second in the last five seconds.
IO_WriteSyscallRate
Write system calls per second in the last five seconds.
IO_TotalSyscallRate
Read and write system calls per second in the last five seconds.
Thread_Blocked_Time
The average amount of time, in seconds, that the associated thread has been blocked from entering or reentering a monitor.
Thread_Blocked_Event
The total number of times that the associated thread has been blocked from entering or reentering a monitor (that is, the number of times a thread has been in the `blocked` state).
Thread_Waited_Time
The average amount of time, in seconds, that the associated thread has waited to enter or reenter a monitor (that is, the amount of time a thread has been in the `WAITING` or `TIMED_WAITING` state)".
Thread_Waited_Event
The total number of times that the associated thread has waited to enter or reenter a monitor (that is, the number of times a thread has been in the WAITING or TIMED_WAITING state).
ShardEvents
The total number of events executed on a shard in the past five seconds.
ShardBulkDocs
The total number of documents indexed in the past five seconds.
Relevant dimensions: ShardID, IndexName
Metric
Description
Indexing_ThrottleTime
Time (milliseconds) that the index has been under merge throttling control in the past five seconds.
Cache_Query_Hit
The number of successful lookups in the query cache in the past five seconds.
Cache_Query_Miss
The number of lookups in the query cache that failed to retrieve a `DocIdSet` in the past five seconds. `DocIdSet` is a set of document IDs in Lucene.
Cache_Query_Size
Query cache memory size in bytes.
Cache_FieldData_Eviction
The number of times OpenSearch has evicted data from the fielddata heap space (occurs when the heap space is full) in the past five seconds.
Cache_FieldData_Size
Fielddata memory size in bytes.
Cache_Request_Hit
The number of successful lookups in the shard request cache in the past five seconds.
Cache_Request_Miss
The number of lookups in the request cache that failed to retrieve the results of search requests in the past five seconds.
Cache_Request_Eviction
The number of times OpenSearch evicts data from shard request cache (occurs when the request cache is full) in the past five seconds.
The total heap size, in bytes, that is available for use by an index shard in a particular indexing stage (Coordinating, Primary, or Replica).
Indexing_Pressure_Current_Bytes
The total heap size, in bytes, occupied by an index shard in a particular indexing stage (Coordinating, Primary, or Replica).
Indexing_Pressure_Last_Successful_Timestamp
The timestamp of a successful request for an index shard in a particular indexing stage (Coordinating, Primary, or Replica).
Indexing_Pressure_Rejection_Count
The total number of rejections performed by OpenSearch for an index shard in a particular indexing stage (Coordinating, Primary, or Replica).
Indexing_Pressure_Average_Window_Throughput
The average throughput of the last n requests (The value of n is determined by the `shard_indexing_pressure.secondary_parameter.throughput.request_size_window` setting) for an index shard in a particular indexing stage (Coordinating, Primary, or Replica).
The number of garbage collections that have occurred in the past five seconds.
GC_Collection_Time
The approximate accumulated time (milliseconds) of all garbage collections that have occurred in the past five seconds.
Heap_Committed
The amount of memory (bytes) that is committed for the JVM to use.
Heap_Init
The amount of memory (bytes) that the JVM initially requests from the operating system for memory management.
Heap_Max
The maximum amount of memory (bytes) that can be used for memory management.
Heap_Used
The amount of used memory in bytes.
Relevant dimension: DiskName
Metric
Description
Disk_Utilization
Disk utilization rate: percentage of disk time spent reading and writing by the OpenSearch process in the past five seconds.
Disk_WaitTime
Average duration (milliseconds) of read and write operations in the past five seconds.
Disk_ServiceRate
Service rate: MB read or written per second in the past five seconds. This metric assumes that each disk sector stores 512 bytes.
Relevant dimension: DestAddr
Metric
Description
Net_TCP_NumFlows
The number of samples collected. Performance Analyzer collects 1 sample every 5 seconds.
Net_TCP_TxQ
The average number of TCP packets in the send buffer.
Net_TCP_RxQ
The average number of TCP packets in the receive buffer.
Net_TCP_Lost
The average number of unrecovered recurring timeouts. This number is reset when the recovery finishes or `SND.UNA` is advanced. `SND.UNA` is the sequence number of the first byte of data that has been sent but not yet acknowledged.
Net_TCP_SendCWND
The average size, in bytes, of the sending congestion window.
Net_TCP_SSThresh
The average size, in bytes, of the slow start size threshold.
Relevant dimension: Direction
Metric
Description
Net_PacketRate4
The total number of IPv4 datagrams transmitted/received from/by interfaces per second, including those transmitted or received in error.
Net_PacketDropRate4
The total number of IPv4 datagrams transmitted or received in error per second.
Net_PacketRate6
The total number of IPv6 datagrams transmitted or received from or by interfaces per second, including those transmitted or received in error.
Net_PacketDropRate6
The total number of IPv6 datagrams transmitted or received in error per second.
Net_Throughput
The number of bits transmitted or received per second by all network interfaces.
Relevant dimension: ThreadPoolType
Metric
Description
ThreadPool_QueueSize
The size of the task queue.
ThreadPool_RejectedReqs
The number of rejected executions.
ThreadPool_TotalThreads
The current number of threads in the pool.
ThreadPool_ActiveThreads
The approximate number of threads that are actively executing tasks.
The current number of pending tasks in the cluster state update thread. Each node has a cluster state update thread that submits cluster state update tasks, such as create index, update mapping, allocate shard, and fail shard.
The amount of time, in milliseconds, that a cluster manager task spent in the queue.
ClusterManager_Task_Run_Time
The amount of time, in milliseconds, that a cluster manager task has been running.
Relevant dimension: CacheType
Metric
Description
Cache_MaxSize
The maximum size of the cache, in bytes.
Relevant dimension: ControllerName
Metric
Description
AdmissionControl_RejectionCount
The total number of rejections performed by a Controller of Admission Control.
AdmissionControl_CurrentValue
The current value for Controller of Admission Control.
AdmissionControl_ThresholdValue
The threshold value for Controller of Admission Control.
Relevant dimension: NodeID
Metric
Description
Data_RetryingPendingTasksCount
The number of throttled pending tasks on which the data node is actively performing retries. It is an absolute metric at that point in time.
ClusterManager_ThrottledPendingTasksCount
The sum of the total pending tasks that were throttled by the cluster manager node. This is a cumulative metric, so make sure to check the max aggregation.
Relevant dimensions: N/A
The following metrics are relevant to the cluster as a whole and do not require specific dimensions.
Metric
Description
Election_Term
A number that increases monotonically with every cluster manager election.
PublishClusterState_Latency
The amount of time taken by the quorum of nodes to publish the new cluster state. This metric is available for the current cluster manager.
PublishClusterState_Failure
The number of times the new cluster state failed to publish on the cluster manager node.
ClusterApplierService_Latency
The amount of time taken by each node for the apply cluster state sent by the cluster manager.
ClusterApplierService_Failure
The number of times that the apply cluster state action failed on each node.
The average time elapsed for all tasks currently running on the node at the search task level.
Dimensions reference
Dimension
Return values
ShardID
The ID of the shard, for example, 1.
IndexName
The name of the index, for example, my-index.
Operation
The type of operation, for example, shardbulk.
ShardRole
The shard role, for example, primary or replica.
Exception
OpenSearch exceptions, for example, org.opensearch.index_not_found_exception.
Indices
The list of indexes in the request URL.
HTTPRespCode
The OpenSearch response code, for example, 200.
MemType
The memory type, for example, totYoungGC, totFullGC, Survivor, PermGen, OldGen, Eden, NonHeap, or Heap.
DiskName
The name of the disk, for example, sda1.
DestAddr
The destination address, for example, 010015AC.
Direction
The direction, for example, in or out.
ThreadPoolType
The OpenSearch thread pools, for example, index, search, or snapshot.
CBType
The circuit breaker type, for example, accounting, fielddata, in_flight_requests, parent, or request.
ClusterManagerTaskInsertOrder
The order in which the task was inserted, for example, 3691.
ClusterManagerTaskPriority
The priority of the task, for example, URGENT. OpenSearch executes higher-priority tasks before lower-priority ones, regardless of insert_order.
ClusterManagerTaskType
The task type, for example, shard-started, create-index, delete-index, refresh-mapping, put-mapping, CleanupSnapshotRestoreState, or Update snapshot state.
ClusterManagerTaskMetadata
The metadata for the task (if any).
CacheType
The cache type, for example, Field_Data_Cache, Shard_Request_Cache, or Node_Query_Cache.
NodeID
The ID of the node.
Searchbp_mode
The search backpressure mode, for example, monitor_only (default), enforced, or disabled.