[elasticsearch] All shards failed

I was working on elastic search and it was working perfectly. Today I just restarted my remote server (Ubuntu). Now I am searching in my indexes, it is giving me this error.

{"error":"SearchPhaseExecutionException[Failed to execute phase [query_fetch], all shards failed]","status":503}

I also checked the health. The status is red. Can anyone tell me what's the issue.

This question is related to elasticsearch

The answer is


It is possible on your restart some shards were not recovered, causing the cluster to stay red.
If you hit:
http://<yourhost>:9200/_cluster/health/?level=shards you can look for red shards.

I have had issues on restart where shards end up in a non recoverable state. My solution was to simply delete that index completely. That is not an ideal solution for everyone.

It is also nice to visualize issues like this with a plugin like:
Elasticsearch Head


first thing first, all shards failed exception is not as dramatic as it sounds, it means shards were failed while serving a request(query or index), and there could be multiple reasons for it like

  1. Shards are actually in non-recoverable state, if your cluster and index state are in Yellow and RED, then it is one of the reason.
  2. Due to some shard recovery happening in background, shards didn't respond.
  3. Due to bad syntax of your query, ES responds in all shards failed.

In order to fix the issue, you need to filter it in one of the above category and based on that appropriate fix is required.

The one mentioned in the question, is clearly in the first bucket as cluster health is RED, means one or more primary shards are missing, and my this SO answer will help you fix RED cluster issue, which will fix the all shards exception in this case.


If you encounter this apparent index corruption in a running system, you can work around it by deleting all files called segments.gen. It is advisory only, and Lucene can recover correctly without it.

From ElasticSearch Blog


If you're running a single node cluster for some reason, you might simply need to do avoid replicas, like this:

curl -XPUT -H 'Content-Type: application/json' 'localhost:9200/_settings' -d '
{
    "index" : {
        "number_of_replicas" : 0
    }
}'

Doing this you'll force to use es without replicas