Solving the problem by grouping the data - DrTech's answer used facets in managing this but, will be deprecated according to Elasticsearch 1.0 reference.
Warning
Facets are deprecated and will be removed in a future release. You are encouraged to
migrate to aggregations instead.
Facets are replaced by aggregates - Introduced in an accessible manner in the Elasticsearch Guide - which loads an example into sense..
The solution is the same except aggregations require aggs
instead of facets
and with a count of 0 which sets limit to max integer - the example code requires the Marvel Plugin
# Basic aggregation
GET /houses/occupier/_search?search_type=count
{
"aggs" : {
"indexed_occupier_names" : { <= Whatever you want this to be
"terms" : {
"field" : "first_name", <= Name of the field you want to aggregate
"size" : 0
}
}
}
}
Here is the Sense code to test it out - example of a houses index, with an occupier type, and a field first_name:
DELETE /houses
# Index example docs
POST /houses/occupier/_bulk
{ "index": {}}
{ "first_name": "john" }
{ "index": {}}
{ "first_name": "john" }
{ "index": {}}
{ "first_name": "mark" }
# Basic aggregation
GET /houses/occupier/_search?search_type=count
{
"aggs" : {
"indexed_occupier_names" : {
"terms" : {
"field" : "first_name",
"size" : 0
}
}
}
}
Response showing the relevant aggregation code. With two keys in the index, John and Mark.
....
"aggregations": {
"indexed_occupier_names": {
"buckets": [
{
"key": "john",
"doc_count": 2 <= 2 documents matching
},
{
"key": "mark",
"doc_count": 1 <= 1 document matching
}
]
}
}
....