Error Log: This is a fatal startup error. Your OpenSearch node fails to start, and you see this in your opensearch.log file.
[ERROR][o.o.n.Node] [your-node-name]
failed to obtain node lock, is another node running on the same data path?
java.lang.IllegalStateException:
failed to obtain node lock on [/var/lib/opensearch/nodes/0]...
Or:
[ERROR][o.o.b.Bootstrap] [your-node-name]
Exception
java.lang.IllegalStateException: Node is already locked...
Why is this happening? This is an OS-level file lock conflict. To prevent two OpenSearch processes from writing to the same data files at once (which would cause catastrophic data corruption), the first process to start takes an exclusive “node lock” (a write.lock file) in its data.path directory.
This error means your node tried to start, but it found that the write.lock file was already held by another process.
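To make the "first process wins" behavior concrete, here is a minimal Python sketch of an exclusive, non-blocking file lock. It is only an analogy: OpenSearch is a Java application and does not run this code, and the function name try_node_lock is hypothetical. The sketch assumes a Unix-like system, because it relies on Python's fcntl.flock.

```python
import fcntl


def try_node_lock(lock_path):
    """Try to take an exclusive, non-blocking lock on lock_path.

    Returns the open file object on success (keep it open to keep the lock),
    or None if another open handle already holds the lock.
    Hypothetical helper for illustration; not part of OpenSearch.
    """
    # "a" creates the file if it is missing and never truncates it.
    handle = open(lock_path, "a")
    try:
        # LOCK_EX = exclusive lock, LOCK_NB = fail immediately instead of waiting.
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        # Someone else holds the lock: this is the "failed to obtain lock" case.
        handle.close()
        return None
    return handle
```

The first caller gets a handle back, a second caller on the same file gets None, and closing the first handle frees the lock for the next caller.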
Common causes:
- Another node is running: The most obvious reason. You are accidentally trying to start a second OpenSearch process on the same server, pointing to the same data.path. Use ps aux | grep opensearch to check for other running OpenSearch processes.
- Stale lock file: The OpenSearch process crashed uncleanly (e.g., kill -9) and didn’t have time to release the lock. The new process starts, sees the old lock file, and (incorrectly) thinks another node is still running.
- Permissions issue: The opensearch user doesn’t have the correct read/write/execute permissions on the data.path directory, so it can’t create or check the lock file properly.
- Network file system (NFS): You are (incorrectly) using an NFS mount for your data.path. File locking over NFS is notoriously unreliable and can lead to this error. Do not run OpenSearch data paths on NFS.
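Two of these causes (a second running process, an NFS-backed data path) can be checked mechanically. The following is a rough Python sketch, assuming a Linux host. The function names are hypothetical, the process match is a simple heuristic on the command line, and the mount lookup ignores the escaping that /proc/mounts applies to mount points containing spaces.

```python
import os


def find_opensearch_pids(ps_output):
    """Return PIDs from `ps -eo pid,args` output that look like OpenSearch JVMs.

    Heuristic (an assumption): the command line mentions both "java" and "opensearch".
    """
    pids = []
    for line in ps_output.splitlines():
        parts = line.strip().split(None, 1)
        # Skips the header row, because its first column ("PID") is not a number.
        if len(parts) == 2 and parts[0].isdigit():
            command = parts[1].lower()
            if "java" in command and "opensearch" in command:
                pids.append(int(parts[0]))
    return pids


def filesystem_type(path, mounts_text):
    """Return the filesystem type of the mount that contains path.

    mounts_text is text in /proc/mounts format: device, mount point, type, ...
    The longest matching mount point wins.
    """
    path = os.path.abspath(path)
    best_mount, best_type = "", None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mount_point, fs_type = fields[1], fields[2]
        inside = path == mount_point or path.startswith(mount_point.rstrip("/") + "/")
        if inside and len(mount_point) > len(best_mount):
            best_mount, best_type = mount_point, fs_type
    return best_type


# Possible usage on a live host (not run here):
#   import subprocess
#   ps_text = subprocess.run(["ps", "-eo", "pid,args"],
#                            capture_output=True, text=True).stdout
#   print(find_opensearch_pids(ps_text))
#   with open("/proc/mounts") as f:
#       print(filesystem_type("/var/lib/opensearch", f.read()))
```

A filesystem type beginning with nfs for the data path, or more than one PID in the list, points at the corresponding cause above.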
Best Practice:
- Check for other processes: First, run ps -ef | grep java | grep opensearch. If you see two OpenSearch processes, stop the one that should not be running.
- Check file permissions: Ensure the opensearch user owns the entire data directory: sudo chown -R opensearch:opensearch /var/lib/opensearch.
- Clear the stale lock (if you are sure): If you have 100% confirmed that no other OpenSearch process is running, you can manually remove the stale lock.
  - Go to your data path (e.g., /var/lib/opensearch/nodes/0).
  - Delete the write.lock file.
  - Try to start the node again.
- Use local disks: Always use fast, local (e.g., SSD, NVMe) disks for your data.path.
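The ownership check and the "only if you are sure" lock removal can be sketched as two small Python helpers. This is an illustration under assumptions, not an official OpenSearch tool: the function names are hypothetical, the lock file name is taken from this article, and the caller is expected to supply the list of running OpenSearch PIDs (for example from the ps check above).

```python
import os
from pathlib import Path


def paths_not_owned_by(root, uid):
    """Return every path under root (root included) whose owner is not uid."""
    mismatched = []
    if os.lstat(root).st_uid != uid:
        mismatched.append(str(root))
    for current_dir, dir_names, file_names in os.walk(root):
        for name in dir_names + file_names:
            full_path = os.path.join(current_dir, name)
            if os.lstat(full_path).st_uid != uid:
                mismatched.append(full_path)
    return mismatched


def remove_stale_lock(node_dir, running_pids, lock_name="write.lock"):
    """Delete the lock file in node_dir, but only if no OpenSearch process runs.

    running_pids: PIDs of OpenSearch processes found by the caller.
    Returns True if a lock file was deleted, False if none existed.
    Raises RuntimeError instead of deleting when any process is still running.
    """
    if running_pids:
        raise RuntimeError(
            f"refusing to remove lock: OpenSearch still running (PIDs {running_pids})"
        )
    lock_path = Path(node_dir) / lock_name
    if not lock_path.exists():
        return False
    lock_path.unlink()
    return True


# Possible usage on a live host (not run here):
#   import pwd
#   uid = pwd.getpwnam("opensearch").pw_uid
#   print(paths_not_owned_by("/var/lib/opensearch", uid))
#   remove_stale_lock("/var/lib/opensearch/nodes/0", running_pids=[])
```

Refusing to act when any PID is passed in keeps the destructive step behind the same precondition the list above states: confirm that nothing is running first.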
What else can I do? If you’ve cleared the lock and the node still won’t start, there might be a deeper file system or permissions issue. The OpenSearch community forums can help you troubleshoot. For direct support, contact us in The OpenSearch Slack Channel in #General.