[solr] ElasticSearch, Sphinx, Lucene, Solr, Xapian. Which fits for which usage?

We use Lucene regularly to index and search tens of millions of documents. Searches are quick enough, and we use incremental updates that do not take a long time. It did take us some time to get here. The strong points of Lucene are its scalability, a large range of features and an active community of developers. Using bare Lucene requires programming in Java.

If you are starting afresh, the tool for you in the Lucene family is Solr, which is much easier to set up than bare Lucene, and has almost all of Lucene's power. It can import database documents easily. Solr is written in Java, so any modification of Solr requires Java knowledge, but you can do a lot just by tweaking configuration files.
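
As a rough illustration of getting database rows into Solr without writing any Java, the sketch below builds (but does not send) an HTTP request to Solr's JSON update handler. The host, the core name `articles`, the field names and the helper `build_update_request` are all assumptions made up for the example; only the `/update` path and the `commit` parameter come from Solr itself.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request


def build_update_request(rows, base_url="http://localhost:8983/solr", core="articles"):
    """Build a POST request that would add `rows` to a Solr core.

    base_url and core are hypothetical; adjust them to your installation.
    """
    # commit=true asks Solr to make the new documents searchable right away.
    url = f"{base_url}/{core}/update?{urlencode({'commit': 'true'})}"
    body = json.dumps(rows).encode("utf-8")  # a JSON array of documents
    return Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Rows as they might come out of a database cursor (made-up data).
rows = [
    {"id": "1", "title": "Bonjour le monde"},
    {"id": "2", "title": "Hello world"},
]
request = build_update_request(rows)
# To actually send it against a running Solr: urllib.request.urlopen(request)
```

Whether the fields are accepted as-is depends on the schema configured for the core, which is one of the configuration files mentioned above.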

I have also heard good things about Sphinx, especially in conjunction with a MySQL database. Have not used it, though.

IMO, you should choose according to:

  • The required functionality - e.g. do you need a French stemmer? Lucene and Solr have one, I do not know about the others.
  • Proficiency in the implementation language - Do not touch Java Lucene if you do not know Java. You may need C++ to do stuff with Sphinx. Lucene has also been ported to other languages. This matters mostly if you want to extend the search engine.
  • Ease of experimentation - I believe Solr is best in this aspect.
  • Interfacing with other software - Sphinx has a good interface with MySQL. Solr runs as a RESTful server with Ruby, XML and JSON interfaces. Lucene only gives you programmatic access through Java. Compass and Hibernate Search are wrappers of Lucene that integrate it into larger frameworks.
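
To make the interfacing point concrete, here is a minimal sketch of talking to Solr over HTTP from a non-Java language. It only builds the query URL and parses a hand-written sample response, so it runs without a Solr server. The host, the core name `articles`, the field `title` and both helper functions are assumptions for illustration; the `/select` path, the `q`, `rows` and `wt` parameters and the `response.docs` layout follow Solr's standard JSON output.

```python
import json
from urllib.parse import urlencode


def build_select_url(query, base_url="http://localhost:8983/solr", core="articles", rows=10):
    """Return the URL of a Solr search request (host and core are hypothetical)."""
    params = {"q": query, "rows": rows, "wt": "json"}
    return f"{base_url}/{core}/select?{urlencode(params)}"


def extract_ids(response_text):
    """Pull the document ids out of a Solr JSON search response."""
    payload = json.loads(response_text)
    return [doc["id"] for doc in payload["response"]["docs"]]


url = build_select_url("title:monde")

# Hand-written stand-in for what a Solr server might return.
sample_response = (
    '{"response": {"numFound": 1, "start": 0, '
    '"docs": [{"id": "1", "title": "Bonjour le monde"}]}}'
)
ids = extract_ids(sample_response)
```

With bare Lucene the equivalent would be Java code against the Lucene API, which is the trade-off described in the last bullet.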
