[search] Solr vs. ElasticSearch

What are the core architectural differences between these technologies?

Also, what use cases are generally more appropriate for each?

This question is related to search solr lucene elasticsearch

The answer is


I have use Elasticsearch for 3 years and Solr for about a month, I feel elasticsearch cluster is quite easy to install as compared to Solr installation. Elasticsearch has a pool of help documents with great explanation. One of the use case I was stuck up with Histogram Aggregation which was available in ES however not found in Solr.


Since the long history of Apache Solr, I think one strength of the Solr is its ecosystem. There are many Solr plugins for different types of data and purposes.

solr stack

Search platform in the following layers from bottom to top:

  • Data
    • Purpose: Represent various data types and sources
  • Document building
    • Purpose: Build document information for indexing
  • Indexing and searching
    • Purpose: Build and query a document index
  • Logic enhancement
    • Purpose: Additional logic for processing search queries and results
  • Search platform service
    • Purpose: Add additional functionalities of search engine core to provide a service platform.
  • UI application
    • Purpose: End-user search interface or applications

Reference article : Enterprise search


Add an nested document in solr very complex and nested data search also very complex. but Elastic Search easy to add nested document and search


While all of the above links have merit, and have benefited me greatly in the past, as a linguist "exposed" to various Lucene search engines for the last 15 years, I have to say that elastic-search development is very fast in Python. That being said, some of the code felt non-intuitive to me. So, I reached out to one component of the ELK stack, Kibana, from an open source perspective, and found that I could generate the somewhat cryptic code of elasticsearch very easily in Kibana. Also, I could pull Chrome Sense es queries into Kibana as well. If you use Kibana to evaluate es, it will further speed up your evaluation. What took hours to run on other platforms was up and running in JSON in Sense on top of elasticsearch (RESTful interface) in a few minutes at worst (largest data sets); in seconds at best. The documentation for elasticsearch, while 700+ pages, didn't answer questions I had that normally would be resolved in SOLR or other Lucene documentation, which obviously took more time to analyze. Also, you may want to take a look at Aggregates in elastic-search, which have taken Faceting to a new level.

Bigger picture: if you're doing data science, text analytics, or computational linguistics, elasticsearch has some ranking algorithms that seem to innovate well in the information retrieval area. If you're using any TF/IDF algorithms, Text Frequency/Inverse Document Frequency, elasticsearch extends this 1960's algorithm to a new level, even using BM25, Best Match 25, and other Relevancy Ranking algorithms. So, if you are scoring or ranking words, phrases or sentences, elasticsearch does this scoring on the fly, without the large overhead of other data analytics approaches that take hours--another elasticsearch time savings. With es, combining some of the strengths of bucketing from aggregations with the real-time JSON data relevancy scoring and ranking, you could find a winning combination, depending on either your agile (stories) or architectural(use cases) approach.

Note: did see a similar discussion on aggregations above, but not on aggregations and relevancy scoring--my apology for any overlap. Disclosure: I don't work for elastic and won't be able to benefit in the near future from their excellent work due to a different architecural path, unless I do some charity work with elasticsearch, which wouldn't be a bad idea


I have created a table of major differences between elasticsearch and Solr and splunk, you can use it as 2016 update: enter image description here


I see some of the above answers are now a bit out of date. From my perspective, and I work with both Solr(Cloud and non-Cloud) and ElasticSearch on a daily basis, here are some interesting differences:

  • Community: Solr has a bigger, more mature user, dev, and contributor community. ES has a smaller, but active community of users and a growing community of contributors
  • Maturity: Solr is more mature, but ES has grown rapidly and I consider it stable
  • Performance: hard to judge. I/we have not done direct performance benchmarks. A person at LinkedIn did compare Solr vs. ES vs. Sensei once, but the initial results should be ignored because they used non-expert setup for both Solr and ES.
  • Design: People love Solr. The Java API is somewhat verbose, but people like how it's put together. Solr code is unfortunately not always very pretty. Also, ES has sharding, real-time replication, document and routing built-in. While some of this exists in Solr, too, it feels a bit like an after-thought.
  • Support: there are companies providing tech and consulting support for both Solr and ElasticSearch. I think the only company that provides support for both is Sematext (disclosure: I'm Sematext founder)
  • Scalability: both can be scaled to very large clusters. ES is easier to scale than pre-Solr 4.0 version of Solr, but with Solr 4.0 that's no longer the case.

For more thorough coverage of Solr vs. ElasticSearch topic have a look at https://sematext.com/blog/solr-vs-elasticsearch-part-1-overview/ . This is the first post in the series of posts from Sematext doing direct and neutral Solr vs. ElasticSearch comparison. Disclosure: I work at Sematext.


I only use Elastic-search. Since I found solr is very hard to start. Elastic-search's features:

  1. Easy to start, very few setting. Even a newbie can setup a cluster step by step.
  2. Simple Restful API which using NoSQL query. And many language libraries for easy accessing.
  3. Good document, you can read the book: . There is a web version on official website.

Imagine the use case:

  1. A lot(100+) of small(10Mb-100Mb, 1000-100000 documents) search indexes.
  2. They are using by a lot of applications (microservices)
  3. Each application can use more than one index
  4. Small by size index, yes. But huge load(hundreds search-requests per second) and requests are complex (multiple aggregations, conditions and so on)
  5. Downtimes are not allowed
  6. All of that is working years long, and constantly growing.

Idea to have individual ES instance per each index - is huge overhead in this case.

Based on my experience, this kind of use case is very complex to support with Elasticsearch.

Why?

FIRST.

The major problem is fundamental back compatibility disregard.

Breaking changes are so cool! (Note: imagine SQL-server which require you to do small change in all your SQL-statements, when upgraded... can't imagine it. But for ES it's normal)

Deprecations which will dropped in next major release are so sexy! (Note: you know, Java contain some deprecations, which 20+ years old, but still working in actual Java version...)

And not only that, sometimes you even have something which nowhere documented (personally came across only once but... )

So. If you want to upgrade ES (because you need new features for some app or you want to get bug fixes) - you are in hell. Especially if it is about major version upgrade.

Client API will not back compatible. Index settings will not back compatible. And upgrade all app/services same moment with ES upgrade is not realistic.

But you must do it time to time. No other way.

Existing indexes is automatically upgraded? - Yes. But it not help you when you will need to change some old-index settings.

To live with that, you need constantly invest a lot of power in ... forward compatibility of you apps/services with future releases of ES. Or you need to build(and anyway constantly support) some kind of middleware between you app/services and ES, which provide you back compatible client API. (And, you can't use Transport Client (because it required jar upgrade for every minor version ES upgrade), and this fact do not make your life easier)

Is it looks simple & cheap? No, it's not. Far from it. Continuous maintenance of complex infrastructure which based on ES, is way to expensive in all possible senses.

SECOND. Simple API ? Well... no really. When you is really using complex conditions and aggregations.... JSON-request with 5 nested levels is whatever, but not simple.


Unfortunately, I have no experience with SOLR, can't say anything about it.

But Sphinxsearch is much better it this scenario, becasue of totally back compatible SphinxQL.

Note: Sphinxsearch/Manticore are indeed interesting. It's not Lucine based, and as result seriously different. Contain several unique features from the box which ES do not have and crazy fast with small/middle size indexes.


I see that a lot of folks here have answered this ElasticSearch vs Solr question in terms of features and functionality but I don't see much discussion here (or elsewhere) regarding how they compare in terms of performance.

That is why I decided to conduct my own investigation. I took an already coded heterogenous data source micro-service that already used Solr for term search. I switched out Solr for ElasticSearch then I ran both versions on AWS with an already coded load test application and captured the performance metrics for subsequent analysis.

Here is what I found. ElasticSearch had 13% higher throughput when it came to indexing documents but Solr was ten times faster. When it came to querying for documents, Solr had five times more throughput and was five times faster than ElasticSearch.


I have been working on both solr and elastic search for .Net applications. The major difference what i have faced is

Elastic search :

  • More code and less configuration, however there are api's to change but still is a code change
  • for complex types, type within types i.e nested types(wasn't able to achieve in solr)

Solr :

  • less code and more configuration and hence less maintenance
  • for grouping results during querying(lots of work to achieve in elastic search in short no straight way)

If you are already using SOLR, remain stick to it. If you are starting up, go for Elastic search.

Maximum major issues have been fixed in SOLR and it is quite mature.


Find a file by name in Visual Studio Code Search all the occurrences of a string in the entire project in Android Studio Java List.contains(Object with field value equal to x) Trigger an action after selection select2 How can I search for a commit message on GitHub? SQL search multiple values in same field Find a string by searching all tables in SQL Server Management Studio 2008 Search File And Find Exact Match And Print Line? Java - Search for files in a directory How to put a delay on AngularJS instant search?

Examples related to solr

Solr vs. ElasticSearch How to delete all data from solr and hbase How to query SOLR for empty fields? ElasticSearch, Sphinx, Lucene, Solr, Xapian. Which fits for which usage? using OR and NOT in solr query

Examples related to lucene

Solr vs. ElasticSearch How to query SOLR for empty fields? ElasticSearch, Sphinx, Lucene, Solr, Xapian. Which fits for which usage? Comparison of full text search engine - Lucene, Sphinx, Postgresql, MySQL? using OR and NOT in solr query

Examples related to elasticsearch

Elasticsearch error: cluster_block_exception [FORBIDDEN/12/index read-only / allow delete (api)], flood stage disk watermark exceeded Elasticsearch : Root mapping definition has unsupported parameters index : not_analyzed How to know elastic search installed version from kibana? Export to csv/excel from kibana Elasticsearch: Failed to connect to localhost port 9200 - Connection refused Elasticsearch difference between MUST and SHOULD bool query how to rename an index in a cluster? elasticsearch bool query combine must with OR Filter items which array contains any of given values How to check Elasticsearch cluster health?