By Gord Schnurr

What is Elasticsearch?

Elasticsearch is a search engine that stores flexible, schema-less JSON documents in an index, much like a NoSQL document database. As the name implies, its primary use is searching. On very large data sets, it is typically much faster than SQL-based relational databases at running complex text queries.

Benefits of Elasticsearch

It provides many advanced search features, such as:

‘Fuzzy’ Searches

Elasticsearch allows you to specify a level of ‘fuzziness’ on text searches. Fuzziness is a number that sets the maximum edit distance allowed between your search term and the terms stored in the queried data. This edit distance (a Levenshtein distance that also counts swaps of adjacent characters) is the number of single-character changes required to transform one term into another: insertions, deletions, substitutions, and transpositions. For example, changing ‘Word Ptess’ to ‘WordPress’ requires removing a space and substituting an R for a T, resulting in an edit distance of 2. This allows search to return correct results even when the user enters their search term with typos or misspellings.
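To make the edit distance concrete, here is a minimal Python sketch of the calculation described above. It is an illustration of the idea, not Elasticsearch's internal implementation, and the function name edit_distance is our own. The query body at the end uses the documented fuzziness option of a match query; the field name title is an assumption for illustration.

```python
def edit_distance(a: str, b: str) -> int:
    """Count the single-character edits needed to turn a into b.

    Insertions, deletions, substitutions, and swaps of two adjacent
    characters each cost 1 (the "optimal string alignment" variant).
    """
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete every character of a[:i]
    for j in range(cols):
        d[0][j] = j  # insert every character of b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution (or match)
            )
            # Swap of two adjacent characters counts as one edit.
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


print(edit_distance("Word Ptess", "WordPress"))  # 2
print(edit_distance("WrodPress", "WordPress"))   # 1

# A fuzzy match query body. "title" is a hypothetical field name.
fuzzy_query = {
    "query": {"match": {"title": {"query": "WordPtess", "fuzziness": 2}}}
}
```

Elasticsearch accepts a fuzziness of 0, 1, 2, or "AUTO", so a distance of 2 is the most lenient setting available.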

‘Boosted’ Searches

Elasticsearch calculates a ‘relevancy score’ for each returned document and allows you to sort by this score. Boosted searches allow you to combine multiple queries with different relevance weightings. For example, you could use this feature to make posts with the search term in their title score higher than those with the search term in the name of their category. Boosts are provided to the query as multipliers, so it is also possible to use decimal numbers less than 1 to reduce the relevance of posts with terms in specific fields.
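As a rough sketch of the title-versus-category example, the Python below builds a multi_match query body in which each field carries its own boost, using Elasticsearch's field^boost notation. The helper name boosted_query and the field names title and category_name are assumptions for illustration; a real index would use whatever field names its mapping defines.

```python
def boosted_query(term: str, boosts: dict) -> dict:
    """Build a multi_match query body with a per-field boost.

    boosts maps a field name to its multiplier, e.g. {"title": 3}.
    """
    # Elasticsearch reads "field^N" as "multiply this field's score by N".
    fields = [f"{field}^{weight}" for field, weight in boosts.items()]
    return {"query": {"multi_match": {"query": term, "fields": fields}}}


# Hypothetical field names: title matches count triple,
# category-name matches count half.
body = boosted_query("elasticsearch", {"title": 3, "category_name": 0.5})
print(body["query"]["multi_match"]["fields"])
# ['title^3', 'category_name^0.5']
```

The resulting dictionary can be sent as the JSON body of a search request. Results come back sorted by relevancy score by default.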

Search Highlighting

Elasticsearch queries can also return ‘highlights’: fragments of a document’s fields, with the matching words marked, that show why the document was a hit on the search. This can be used to emphasize matched terms in result snippets or in the suggestions of an autocomplete feature.
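A minimal sketch of requesting and reading highlights might look like the following. The highlight section with fields, pre_tags, and post_tags follows Elasticsearch's documented request format. The helper names, the content field, and the sample response are assumptions written by hand to show the response shape, not output captured from a real cluster.

```python
def highlight_request(term: str, field: str) -> dict:
    """Build a search body that asks for highlighted fragments of field."""
    return {
        "query": {"match": {field: term}},
        "highlight": {
            "fields": {field: {}},
            # Wrap each matched word in <mark> tags instead of the default <em>.
            "pre_tags": ["<mark>"],
            "post_tags": ["</mark>"],
        },
    }


def collect_highlights(response: dict) -> dict:
    """Map each hit's document ID to its highlighted fragments per field."""
    hits = response.get("hits", {}).get("hits", [])
    return {hit["_id"]: hit.get("highlight", {}) for hit in hits}


# Hand-written sample in the shape of a search response (illustrative only).
sample_response = {
    "hits": {
        "hits": [
            {
                "_id": "42",
                "highlight": {"content": ["Faster <mark>search</mark> for WordPress"]},
            }
        ]
    }
}

print(collect_highlights(sample_response))
# {'42': {'content': ['Faster <mark>search</mark> for WordPress']}}
```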

Leveraging Elasticsearch on a WordPress Site

Integration requires a separate server running an HTTP-accessible instance of Elasticsearch, and will likely also require some additional coding to customize what types of data are indexed or what types of queries are executed. Integrating a WordPress site with Elasticsearch is overkill if you don’t have a sufficiently large data set. However, the difference in speed between Elasticsearch and MySQL grows significantly as the number of documents (posts) grows. We can help you determine whether or not Elasticsearch is something your site can benefit from.

The ElasticPress plugin allows you to connect your site to another server running Elasticsearch and automatically keep your WP post data synchronized to an Elasticsearch index. The plugin then uses WordPress’s hook system (actions and filters) to intercept database queries, parse the query arguments into an Elasticsearch query, search the index, and ultimately update the MySQL query to retrieve only specific posts by ID. A post’s ID is an indexed primary key field, making the resulting database query very fast. When the data set is large enough and/or the query is complex enough, this additional step increases search speeds drastically by taking the complex work away from MySQL and giving it to Elasticsearch.
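The last step of that flow, turning the IDs returned by Elasticsearch into a simple primary-key lookup, can be sketched in Python as follows. This is an illustration of the idea only, not the plugin's actual code (the plugin itself is written in PHP). The function names are hypothetical, and wp_posts assumes the default WordPress table prefix.

```python
def ids_from_response(response: dict) -> list:
    """Pull post IDs, in relevance order, out of a search response."""
    return [int(hit["_id"]) for hit in response["hits"]["hits"]]


def posts_by_id_sql(post_ids: list, table: str = "wp_posts"):
    """Build a parameterized MySQL query that fetches posts by primary key.

    Returns (sql, params). ORDER BY FIELD(...) keeps the rows in the
    relevance order that Elasticsearch produced.
    """
    if not post_ids:
        # No hits: return a query that matches nothing.
        return f"SELECT * FROM {table} WHERE 1 = 0", []
    placeholders = ", ".join(["%s"] * len(post_ids))
    sql = (
        f"SELECT * FROM {table} WHERE ID IN ({placeholders}) "
        f"ORDER BY FIELD(ID, {placeholders})"
    )
    # The IDs are bound twice: once for IN (...), once for FIELD(...).
    return sql, list(post_ids) + list(post_ids)


# Hand-written sample response (illustrative only).
sample = {"hits": {"hits": [{"_id": "42"}, {"_id": "7"}]}}
sql, params = posts_by_id_sql(ids_from_response(sample))
print(sql)
# SELECT * FROM wp_posts WHERE ID IN (%s, %s) ORDER BY FIELD(ID, %s, %s)
print(params)
# [42, 7, 42, 7]
```

Because the lookup only touches the indexed ID column, MySQL does no text matching at all. The expensive search work has already been done by Elasticsearch.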

 
