
Error log: “cluster state update task […] timed out”

November 21, 2025

Error Log: You’ll see this warning in your cluster manager node’s log (opensearch.log). It’s often followed by other nodes reporting they can’t connect to the cluster manager.

[WARN ][o.o.c.s.ClusterApplierService] [your-cm-node]
  cluster state update task [create-index [my-new-index], ...]
  timed out after [30s]

You might also see:

[WARN ][o.o.c.s.MasterService] [your-cm-node]
  slow task submission [...] took [12s]...

Why is this happening? This is a serious warning: it means your cluster manager node is overloaded.

The cluster manager is the “brain” of the cluster. It manages all cluster state changes (creating indices, mapping updates, nodes joining/leaving, shard movements). It processes these in a single-threaded queue.

This error means a task (like “create-index”) sat in the queue for more than 30 seconds without being processed. This “clogs” the brain, and all other cluster-wide operations (like nodes sending their heartbeats) get stuck behind it. This can lead to nodes dropping out of the cluster, a “split brain” scenario, or a cluster-wide freeze.
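You can see what is currently sitting in that queue with the pending cluster tasks APIs. A minimal sketch, assuming the cluster is reachable on localhost:9200 (add credentials, e.g. curl -u, if the security plugin is enabled):

  # Human-readable list of queued cluster state update tasks,
  # including how long each one has been waiting and its priority
  curl -s "http://localhost:9200/_cat/pending_tasks?v"

  # The same information as JSON
  curl -s "http://localhost:9200/_cluster/pending_tasks?pretty"

If tasks regularly wait in this queue for tens of seconds, the cluster manager is not keeping up.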

Common causes:

  1. Overworked cluster manager: You are running your cluster manager on a small, under-powered node, or you’re running it on a node that also handles heavy ingest or search traffic.
  2. Long garbage collection: The cluster manager node had a long “stop-the-world” GC pause, freezing the process (a quick way to check heap and GC pressure is sketched after this list).
  3. Rapid mapping updates: You have an application that is creating thousands of new fields per second (“mapping explosion”), which creates a massive cluster state that is slow to process and publish.
  4. Network issues: The cluster manager is struggling to communicate its state changes to all the other nodes.
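To narrow down causes 1 and 2, look at how busy the cluster manager node actually is. A minimal sketch using the standard cat and node stats APIs (same assumptions as above about the endpoint and credentials):

  # Roles, heap usage, CPU and load for every node; look at the row
  # for your cluster manager node
  curl -s "http://localhost:9200/_cat/nodes?v&h=name,node.role,heap.percent,cpu,load_1m"

  # JVM details per node, including GC collection counts and times
  curl -s "http://localhost:9200/_nodes/stats/jvm?pretty"

A heap that stays persistently high, or GC times that climb sharply between calls, points at an overloaded or GC-bound cluster manager.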

Best Practice:

  1. Dedicated cluster manager nodes: This is the #1 fix. In any production cluster, you should have 3 dedicated cluster manager nodes. These nodes should only have the cluster_manager role (node.roles: [ cluster_manager ] in opensearch.yml) and should not be data nodes or ingest nodes. This isolates them from search/indexing load (a config sketch follows this list).
  2. Monitor your cluster manager: Watch the CPU, memory, and especially JVM heap and GC counts on your cluster manager nodes (the _cat/nodes and _nodes/stats checks shown above work well for this).
  3. Avoid mapping explosions: Do not use dynamic values (like user IDs or timestamps) as field names in your documents. Use a fixed, predefined mapping (an example follows below).
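For point 1, the role change is a one-line setting per node. A minimal opensearch.yml sketch (the data-node role list is illustrative; adjust it to your topology):

  # opensearch.yml on each of the three dedicated cluster manager nodes
  node.roles: [ cluster_manager ]

  # opensearch.yml on the data/ingest nodes, so they never act as cluster manager
  node.roles: [ data, ingest ]

Note that node.roles is a static setting, so each node needs a restart after the change, and you should keep an odd number (typically three) of cluster-manager-eligible nodes for quorum.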
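For point 3, you can make the mapping explicit and tell OpenSearch to reject documents that try to add new fields. A minimal sketch (the index name matches the log example above; the field names are placeholders for your own schema):

  # Create the index with a fixed mapping; documents with unknown fields are rejected
  curl -s -X PUT "http://localhost:9200/my-new-index" \
    -H 'Content-Type: application/json' -d '
  {
    "mappings": {
      "dynamic": "strict",
      "properties": {
        "user_id":   { "type": "keyword" },
        "timestamp": { "type": "date" },
        "message":   { "type": "text" }
      }
    }
  }'

If you cannot switch to a strict mapping, the index setting index.mapping.total_fields.limit at least caps how large a single index's mapping can grow.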

What else can I do? Is your cluster constantly unstable? An overloaded cluster manager is a likely cause. Ask the OpenSearch community for advice on cluster topology and node roles, or contact us in the #general channel of the OpenSearch Slack.
