Elasticsearch query to return all records

Question

I have a small database in Elasticsearch and for testing purposes would like to pull all records back. I am attempting to use a URL of the form:

    http://localhost:9200/foo/_search?pretty=true&q={'matchAll':{''}}

Can someone give me the URL you would use to accomplish this, please?

User · Answer

Simple! You can use the size and from parameters:

    http://localhost:9200/[your_index_name]/_search?size=1000&from=0

Then increase from gradually until you have all of the data.
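The paging idea above can be sketched as a small helper that generates the successive URLs. This is only a sketch under stated assumptions: `page_urls` is a hypothetical name, the host and the `foo` index are taken from the question, and the total number of hits is assumed to be known already (for example from `hits.total` in a first response).

```python
from urllib.parse import urlencode


def page_urls(base_url, total, size=1000):
    """Yield one _search URL per page, advancing `from` by `size` each time."""
    for start in range(0, total, size):
        yield f"{base_url}/_search?{urlencode({'size': size, 'from': start})}"


# Hypothetical example: 2,500 documents in the `foo` index, 1,000 per page.
for url in page_urls("http://localhost:9200/foo", total=2500):
    print(url)
```

Each URL can then be fetched in turn; the last page simply comes back with fewer than `size` hits.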

User · Answer

If you want to pull many thousands of records then... a few people gave the right answer of using 'scroll'. (Note: Some people also suggested using "search_type=scan". This was deprecated, and in v5.0 removed. You don't need it.)

Start with a 'search' query, but specifying a 'scroll' parameter (here I'm using a 1 minute timeout):

    curl -XGET 'http://ip1:9200/myindex/_search?scroll=1m' -d '
    {
        "query": {
            "match_all": {}
        }
    }
    '

That includes your first 'batch' of hits. But we are not done here. The output of the above curl command would be something like this:

    {"_scroll_id":"c2Nhbjs1OzUyNjE6NU4tU3BrWi1UWkNIWVNBZW43bXV3Zzs1Mzc3OkhUQ0g3VGllU2FhemJVNlM5d2t0alE7NTI2Mjo1Ti1TcGtaLVRaQ0hZU0FlbjdtdXdnOzUzNzg6SFRDSDdUaWVTYWF6YlU2Uzl3a3RqUTs1MjYzOjVOLVNwa1otVFpDSFlTQWVuN211d2c7MTt0b3RhbF9oaXRzOjIyNjAxMzU3Ow","took":109,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":22601357,"max_score":0.0,"hits":[]}}

It's important to have _scroll_id handy, as next you should run the following command:

    curl -XGET 'localhost:9200/_search/scroll' -d '
    {
        "scroll": "1m",
        "scroll_id": "c2Nhbjs2OzM0NDg1ODpzRlBLc0FXNlNyNm5JWUc1"
    }
    '

However, passing the scroll_id around is not something designed to be done manually. Your best bet is to write code to do it, e.g. in Java:

    private TransportClient client = null;
    private Settings settings = ImmutableSettings.settingsBuilder()
            .put(CLUSTER_NAME, "cluster-test").build();
    private SearchResponse scrollResp = null;

    this.client = new TransportClient(settings);
    this.client.addTransportAddress(new InetSocketTransportAddress("ip", port));

    QueryBuilder queryBuilder = QueryBuilders.matchAllQuery();
    scrollResp = client.prepareSearch(index).setSearchType(SearchType.SCAN)
            .setScroll(new TimeValue(60000))
            .setQuery(queryBuilder)
            .setSize(100).execute().actionGet();

    scrollResp = client.prepareSearchScroll(scrollResp.getScrollId())
            .setScroll(new TimeValue(timeVal))
            .execute()
            .actionGet();

Now LOOP on the last command and use SearchResponse to extract the data.
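For readers who would rather not use the Java client, the same loop can be sketched in Python with only the standard library. This is an illustration, not the answerer's code: `http_post_json` and `scroll_all` are hypothetical helper names, the requests are sent as POST with a JSON content type (an assumption that suits recent versions), and nothing here has been run against a real cluster.

```python
import json
from urllib.request import Request, urlopen


def http_post_json(url, body):
    """POST a JSON body and decode the JSON reply (standard library only)."""
    req = Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urlopen(req) as resp:
        return json.load(resp)


def scroll_all(base_url, index, post=http_post_json, size=100, keep_alive="1m"):
    """Yield every hit of a match_all search, page by page, via the scroll API."""
    page = post(
        f"{base_url}/{index}/_search?scroll={keep_alive}",
        {"size": size, "query": {"match_all": {}}},
    )
    while page["hits"]["hits"]:
        yield from page["hits"]["hits"]
        # Each reply carries the scroll ID to use for the next request.
        page = post(
            f"{base_url}/_search/scroll",
            {"scroll": keep_alive, "scroll_id": page["_scroll_id"]},
        )
```

Usage would look like `for hit in scroll_all("http://ip1:9200", "myindex"): ...`. The `post` parameter exists only so the HTTP call can be swapped out, for example in tests.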

User · Answer

A simple solution using the Python package elasticsearch-dsl:

    from elasticsearch_dsl import Search
    from elasticsearch_dsl import connections

    connections.create_connection(hosts=['localhost'])

    s = Search(index="foo")
    response = s.scan()

    count = 0
    for hit in response:
        # print(hit.to_dict())  # be careful, it will print out every hit in your index
        count += 1

    print(count)

See also https://elasticsearch-dsl.readthedocs.io/en/latest/api.html#elasticsearch_dsl.Search.scan

User · Answer

You can use size=0; this will return all the documents. Example:

    curl -XGET 'localhost:9200/index/type/_search' -d '
    {
        "size": 0,
        "query": {
            "match_all": {}
        }
    }'

User · Answer

    curl -XGET '{{IP/localhost}}:9200/{{Index name}}/{{type}}/_search?scroll=10m&pretty' -d '{
        "query": {
            "filtered": {
                "query": {
                    "match_all": {}
                }
            }
        }
    }'

User · Answer

    curl -X GET 'localhost:9200/foo/_search?q=*&pretty'

User · Answer

This is the query to accomplish what you want (I suggest using Kibana, as it helps to understand queries better):

    GET my_index_name/my_type_name/_search
    {
        "query": {
            "match_all": {}
        },
        "size": 20,
        "from": 3
    }

To get all records you have to use the "match_all" query. size is the number of records you want to fetch (a kind of limit); by default, ES returns only 10 records. from works like skip: here, it skips the first 3 records.

If you want to fetch exactly all the records, run this query once from Kibana, take the value of the "total" field from the result, and use it as "size".

User · Answer

By default Elasticsearch returns 10 records, so size should be provided explicitly. Add size to the request to get the desired number of records:

    http://{host}:9200/{index_name}/_search?pretty=true&size=(number of records)

Note: the page size cannot be larger than the index.max_result_window index setting, which defaults to 10,000.
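The limit in the note applies to from + size together, not to size alone, which is easy to check before sending a request. A minimal sketch, assuming the default window of 10,000; `fits_in_window` is a hypothetical helper name, not part of any Elasticsearch client.

```python
MAX_RESULT_WINDOW = 10_000  # default value of index.max_result_window


def fits_in_window(start, size, window=MAX_RESULT_WINDOW):
    """Return True if a from/size page stays within the result window."""
    return start + size <= window


print(fits_in_window(0, 10_000))     # True: exactly at the limit
print(fits_in_window(9_500, 1_000))  # False: from + size would be 10,500
```

When the check fails, the options are to raise the index setting or to switch to a scroll.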

User · Answer

If it's a small dataset (e.g. 1K records), you can simply specify size:

    curl localhost:9200/foo_index/_search?size=1000

The match_all query isn't needed, as it's implicit.

If you have a medium-sized dataset, like 1M records, you may not have enough memory to load it, so you need a scroll.

A scroll is like a cursor in a DB. In Elasticsearch, it remembers where you left off and keeps the same view of the index (i.e. prevents the searcher from going away with a refresh, prevents segments from merging).

API-wise, you have to add a scroll parameter to the first request:

    curl 'localhost:9200/foo_index/_search?size=100&scroll=1m&pretty'

You get back the first page and a scroll ID:

    {
      "_scroll_id" : "DXF1ZXJ5QW5kRmV0Y2gBAAAAAAAAADEWbmJlSmxjb2hSU0tMZk12aEx2c0EzUQ",
      "took" : 0,
    ...

Remember that both the scroll ID you get back and the timeout are valid for the next page. A common mistake here is to specify a very large timeout (value of scroll) that would cover processing the whole dataset (e.g. 1M records) instead of one page (e.g. 100 records).

To get the next page, fill in the last scroll ID and a timeout that should last until fetching the following page:

    curl -XPOST -H 'Content-Type: application/json' 'localhost:9200/_search/scroll' -d '{
      "scroll": "1m",
      "scroll_id": "DXF1ZXJ5QW5kRmV0Y2gBAAAAAAAAADAWbmJlSmxjb2hSU0tMZk12aEx2c0EzUQ"
    }'

If you have a lot to export (e.g. 1B documents), you'll want to parallelise. This can be done via sliced scroll. Say you want to export on 10 threads. The first thread would issue a request like this:

    curl -XPOST -H 'Content-Type: application/json' 'localhost:9200/test/_search?scroll=1m&size=100' -d '{
      "slice": {
        "id": 0,
        "max": 10
      }
    }'

You get back the first page and a scroll ID, exactly like a normal scroll request. You'd consume it exactly like a regular scroll, except that you get 1/10th of the data.

Other threads would do the same, except that id would be 1, 2, 3...
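To make the sliced-scroll setup concrete, here is a sketch that builds the request body each worker would send. `slice_bodies` is a hypothetical helper; it only produces the JSON bodies and leaves sending them, and the threading itself, to the caller.

```python
import json


def slice_bodies(max_slices):
    """Build one request body per worker for a sliced scroll."""
    return [{"slice": {"id": i, "max": max_slices}} for i in range(max_slices)]


# Three workers here to keep the output short; the answer above uses 10.
for body in slice_bodies(3):
    print(json.dumps(body))
```

Every worker uses the same `max` and its own `id`, and then scrolls through its slice exactly as in the single-threaded case.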

User · Answer

    http://127.0.0.1:9200/foo/_search/?size=1000&pretty=1

Note the size param, which increases the hits displayed from the default (10) to 1000 per shard.

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-request-from-size.html

User · Answer

I think Lucene syntax is supported, so:

    http://localhost:9200/foo/_search?pretty=true&q=*:*

size defaults to 10, so you may also need &size=BIGNUMBER to get more than 10 items (where BIGNUMBER is a number you believe is bigger than your dataset).

BUT, the Elasticsearch documentation suggests using the scan search type for large result sets. E.g.:

    curl -XGET 'localhost:9200/foo/_search?search_type=scan&scroll=10m&size=50' -d '
    {
        "query": {
            "match_all": {}
        }
    }'

and then keep requesting as the documentation suggests.

EDIT: scan was deprecated in 2.1.0. scan does not provide any benefits over a regular scroll request sorted by _doc (link to elastic docs, spotted by @christophe-roussy).

User · Answer

The size param increases the hits displayed from the default (10) to 500:

    http://localhost:9200/[indexName]/_search?pretty=true&size=500&q=*:*

Change from step by step to get all the data:

    http://localhost:9200/[indexName]/_search?size=500&from=0

User · Answer

From Kibana DevTools it's:

    GET my_index_name/_search
    {
      "query": {
        "match_all": {}
      }
    }

User · Answer

You can use the _count API to get the value for the size parameter:

    http://localhost:9200/foo/_count?q=<your query>

Returns {count:X, ...}. Extract the value 'X' and then do the actual query:

    http://localhost:9200/foo/_search?q=<your query>&size=X
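The two-step approach above (count first, then search with that size) could be scripted roughly as follows. This is a sketch: `search_url_from_count` is a hypothetical helper, and the `_count` reply is passed in as a string so the example does not depend on a running cluster.

```python
import json
from urllib.parse import urlencode


def search_url_from_count(base_url, query, count_response):
    """Build the _search URL whose size equals the count reported by _count."""
    count = json.loads(count_response)["count"]
    return f"{base_url}/_search?{urlencode({'q': query, 'size': count})}"


# Hypothetical reply from http://localhost:9200/foo/_count?q=user:kimchy
reply = '{"count": 42}'
print(search_url_from_count("http://localhost:9200/foo", "user:kimchy", reply))
```

Note that `urlencode` percent-encodes characters such as `:` in the query string, which Elasticsearch decodes on its side.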

User · Answer

    http://localhost:9200/foo/_search/?size=1000&pretty=1

You will need to specify the size query parameter, as the default is 10.

User · Answer

To return all records from all indices you can do:

    curl -XGET 'http://35.195.120.21:9200/_all/_search?size=50&pretty'

Output:

    {
      "took" : 866,
      "timed_out" : false,
      "_shards" : {
        "total" : 25,
        "successful" : 25,
        "failed" : 0
      },
      "hits" : {
        "total" : 512034694,
        "max_score" : 1.0,
        "hits" : [ {
          "_index" : "grafana-dash",
          "_type" : "dashboard",
          "_id" : "test",
          "_score" : 1.0,
          ...

User · Answer

Elasticsearch will get significantly slower if you just add some big number as size. One method to get all documents is using scan and scroll IDs:

https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-scroll.html

In Elasticsearch v7.2, you do it like this:

    POST /foo/_search?scroll=1m
    {
        "size": 100,
        "query": {
            "match_all": {}
        }
    }

The results from this contain a _scroll_id, which you have to query to get the next chunk of 100:

    POST /_search/scroll
    {
        "scroll" : "1m",
        "scroll_id" : "<YOUR SCROLL ID>"
    }

User · Answer

Note: This answer relates to an older version of Elasticsearch, 0.90. Versions released since then have an updated syntax. Please refer to other answers that may be more accurate for the version you are using.

The query below returns the NO_OF_RESULTS you would like to be returned:

    curl -XGET 'localhost:9200/foo/_search?size=NO_OF_RESULTS' -d '
    {
        "query": {
            "match_all": {}
        }
    }'

Now, the question here is that you want all the records to be returned. So naturally, before writing a query, you won't know the value of NO_OF_RESULTS.

How do we know how many records exist in your index? Simply type the query below:

    curl -XGET 'localhost:9200/foo/_search'

This would give you a result that looks like the one below:

    {
        "hits": {
            "total": 2357,
            "hits": [
                {
                    ...

The total in the result tells you how many records are available in your index. So, that's a nice way to know the value of NO_OF_RESULTS.

Search all types in all indices:

    curl -XGET 'localhost:9200/_search'

Search all types in the foo index:

    curl -XGET 'localhost:9200/foo/_search'

Search all types in the foo1 and foo2 indices:

    curl -XGET 'localhost:9200/foo1,foo2/_search'

Search all types in any indices beginning with f:

    curl -XGET 'localhost:9200/f*/_search'

Search types type1 and type2 in all indices:

    curl -XGET 'localhost:9200/_all/type1,type2/_search'

User · Answer

The official documentation provides the answer to this question:

    {
      "query": { "match_all": {} },
      "size": 1
    }

You simply replace size (1) with the number of results you want to see.

User · Answer

For Elasticsearch 6.x

Request: GET /foo/_search?pretty=true

Response: in hits, total gives the count of the docs:

    {
      "took": 1,
      "timed_out": false,
      "_shards": {
        "total": 5,
        "successful": 5,
        "skipped": 0,
        "failed": 0
      },
      "hits": {
        "total": 1001,
        "max_score": 1,
        "hits": [
          ...
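When reading that total from code, it may help to handle both response shapes: in 6.x, as shown above, hits.total is a plain number, while from 7.0 onwards it is an object with value and relation fields. The sketch below covers both; `total_hits` is a hypothetical helper name and the sample responses are made up for illustration.

```python
def total_hits(response):
    """Return hits.total as an int, whether it is a number or an object."""
    total = response["hits"]["total"]
    if isinstance(total, dict):  # 7.x style: {"value": 1001, "relation": "eq"}
        return total["value"]
    return total


es6_style = {"hits": {"total": 1001, "hits": []}}
es7_style = {"hits": {"total": {"value": 1001, "relation": "eq"}, "hits": []}}
print(total_hits(es6_style), total_hits(es7_style))
```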

User · Answer

This is the best solution I found using the Python client:

    from elasticsearch import Elasticsearch

    es = Elasticsearch()

    # Initialize the scroll
    page = es.search(
        index='yourIndex',
        doc_type='yourType',
        scroll='2m',
        search_type='scan',
        size=1000,
        body={
            # Your query's body
        })
    sid = page['_scroll_id']
    scroll_size = page['hits']['total']

    # Start scrolling
    while scroll_size > 0:
        print("Scrolling...")
        page = es.scroll(scroll_id=sid, scroll='2m')
        # Update the scroll ID
        sid = page['_scroll_id']
        # Get the number of results that we returned in the last scroll
        scroll_size = len(page['hits']['hits'])
        print("scroll size: " + str(scroll_size))
        # Do something with the obtained page

https://gist.github.com/drorata/146ce50807d16fd4a6aa

Using the Java client:

    import static org.elasticsearch.index.query.QueryBuilders.*;

    QueryBuilder qb = termQuery("multi", "test");

    SearchResponse scrollResp = client.prepareSearch(test)
            .addSort(FieldSortBuilder.DOC_FIELD_NAME, SortOrder.ASC)
            .setScroll(new TimeValue(60000))
            .setQuery(qb)
            .setSize(100).execute().actionGet(); // 100 hits per shard will be returned for each scroll
    // Scroll until no hits are returned
    do {
        for (SearchHit hit : scrollResp.getHits().getHits()) {
            // Handle the hit...
        }

        scrollResp = client.prepareSearchScroll(scrollResp.getScrollId())
                .setScroll(new TimeValue(60000)).execute().actionGet();
    } while (scrollResp.getHits().getHits().length != 0); // Zero hits mark the end of the scroll and the while loop.

https://www.elastic.co/guide/en/elasticsearch/client/java-api/current/java-search-scrolling.html

User · Answer

None except @Akira Sendoh has answered how to actually get ALL docs. But even that solution crashes my ES 6.3 service without logs. The only thing that worked for me using the low-level elasticsearch-py library was the scan helper, which uses the scroll() API:

    from elasticsearch.helpers import scan

    doc_generator = scan(
        es_obj,
        query={"query": {"match_all": {}}},
        index="my-index",
    )

    # use the generator to iterate; don't try to make a list or you will run out of RAM
    for doc in doc_generator:
        # use it somehow
        pass

However, the cleaner way nowadays seems to be the elasticsearch-dsl library, which offers more abstract, cleaner calls, e.g. http://elasticsearch-dsl.readthedocs.io/en/latest/search_dsl.html#hits

User · Answer

Elasticsearch (ES) supports both GET and POST requests for getting data from an ES cluster index.

With a GET:

    http://localhost:9200/[your_index_name]/_search?size=[no of records you want]&q=*:*

With a POST:

    http://localhost:9200/[your_index_name]/_search
    {
      "size": [your value],         // default 10
      "from": [your start index],   // default 0
      "query": {
        "match_all": {}
      }
    }

I would suggest using a UI plugin with Elasticsearch: http://mobz.github.io/elasticsearch-head/ This will help you get a better feeling for the indices you create and also test your indices.

User · Answer

The best way to adjust the size is to pass size=number in the URL's query string:

    curl -XGET "http://localhost:9200/logstash-*/_search?size=50&pretty"

Note: the maximum value that can be given for size is 10000. For any value above ten thousand, Elasticsearch expects you to use the scroll function, which minimises the performance impact.

User · Answer

If someone is still looking for a way to retrieve all the data from Elasticsearch for some use case, like me, here is what I did. By "all the data" I mean all the indexes and all the document types. I'm using Elasticsearch 6.3.

    curl -X GET "localhost:9200/_search?pretty=true" -H 'Content-Type: application/json' -d'
    {
        "query": {
            "match_all": {}
        }
    }
    '

User · Answer

The maximum number of results Elasticsearch will return in one request is 10000, set via size:

    curl -XGET 'localhost:9200/index/type/_search?scroll=1m' -d '
    {
       "size": 10000,
       "query": {
          "match_all": {}
       }
    }'

After that, you have to use the Scroll API to get the rest: take the _scroll_id value from the response and put it in scroll_id:

    curl -XGET 'localhost:9200/_search/scroll' -d '
    {
       "scroll": "1m",
       "scroll_id": ""
    }'

User · Answer

Using the Kibana console, the following asks the index to return only 4 fields of each document. You can also add size to indicate how many documents you want returned. As of ES 7.6 you should use _source rather than filter; it will respond faster.

    GET /address/_search
    {
      "_source": ["streetaddress", "city", "state", "postcode"],
      "size": 100,
      "query": {
        "match_all": {}
      }
    }

User · Answer

Also use server:9200/_stats to get statistics about all your aliases, like size and number of elements per alias. That's very useful and provides helpful information.

User · Answer

Using Elasticsearch 7.5.1:

    http://${HOST}:9200/${INDEX}/_search?pretty=true&q=*:*&scroll=10m&size=5000

You can also specify the size of your array with &size=${number}.

In case you don't know your index:

    http://${HOST}:9200/_cat/indices?v

User · Answer

You actually don't need to pass a body to match_all; it can be done with a GET request to the following URL. This is the simplest form:

    http://localhost:9200/foo/_search
