Elasticsearch is an open-source, distributed search and analytics engine for all types of data, including textual, numeric, geospatial, structured, and unstructured.
Elasticsearch is based on Apache Lucene and was first released in 2010 by Elasticsearch N.V. (now Elastic).
Known for its simple REST API, distributed nature, speed, and scalability, Elasticsearch is the core component of the Elastic Stack, an open-source toolset for data acquisition, enrichment, storage, analysis, and visualization.
Commonly referred to as the ELK Stack (after Elasticsearch, Logstash, and Kibana), the Elastic Stack now also includes a rich collection of lightweight data shippers known as Beats for sending data to Elasticsearch.
What is Elasticsearch used for
The speed and scalability of Elasticsearch and its ability to index many types of content mean that it can be used for a variety of use cases:
- Search in mobile apps.
- Website search.
- Business document search.
- Logging and log analysis.
- Infrastructure metrics and container monitoring.
- Application performance monitoring.
- Geospatial data analysis and visualization.
- Security analysis.
- Business analysis.
For a practical example, see this article on applying Elasticsearch with WordPress.
How does Elasticsearch work
Raw data flows into Elasticsearch from various sources, including logs, system metrics, and web applications.
Data ingestion is the process by which this raw data is parsed, normalized, and enriched before being indexed in Elasticsearch.
Once the data is indexed, users can run complex queries against it and use aggregations to retrieve complex summaries of it.
From Kibana, users can create powerful visualizations of their data, share dashboards, and manage the stack.
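As a rough illustration of the queries and aggregations mentioned above, the sketch below builds (but does not send) a search request that combines a full-text match with a terms aggregation. The host `http://localhost:9200`, the index name `app-logs`, and the fields `message` and `host.keyword` are assumptions made for this example, not names from a real deployment.

```python
import json
from urllib.request import Request


def build_search_request(base_url: str, index: str, text: str) -> Request:
    """Build a POST request for the _search endpoint without sending it."""
    body = {
        # size 0: return only the aggregation summary, not the matching documents
        "size": 0,
        # Full-text match on a hypothetical "message" field
        "query": {"match": {"message": text}},
        # Count matching documents per value of a hypothetical "host.keyword" field
        "aggs": {"hits_per_host": {"terms": {"field": "host.keyword"}}},
    }
    return Request(
        url=f"{base_url}/{index}/_search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_search_request("http://localhost:9200", "app-logs", "timeout")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Passing the request to `urllib.request.urlopen` would actually execute it, which requires a reachable cluster that holds such an index.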
What is an index in Elasticsearch
An index in Elasticsearch is a collection of related documents.
Elasticsearch stores the data as JSON documents.
Each document maps a set of keys (field names or properties) to their corresponding values (strings, numbers, Boolean values, dates, arrays of values, geolocations, or other data types).
Elasticsearch uses a data structure called an inverted index, designed to enable fast full-text searches.
An inverted index lists every unique word in any document and identifies all documents in which each word is found.
Elasticsearch stores the documents during the indexing process and creates an inverted index to make the document data searchable in near real-time.
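The idea behind an inverted index can be sketched in a few lines of Python. This is a toy model of the data structure only: the tokenizer (lowercase, split on non-word characters) and the sample documents are assumptions for illustration, and Lucene's actual implementation is far more elaborate.

```python
import re
from collections import defaultdict


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def build_inverted_index(docs: dict[int, str]) -> dict[str, set[int]]:
    """Map every unique term to the set of document IDs that contain it."""
    index: dict[str, set[int]] = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return dict(index)


def search(index: dict[str, set[int]], query: str) -> set[int]:
    """Return the IDs of documents that contain every term in the query."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


docs = {
    1: "Elasticsearch is fast",
    2: "Lucene powers Elasticsearch",
    3: "Kibana draws fast charts",
}
index = build_inverted_index(docs)
print(search(index, "fast"))                # documents 1 and 3
print(search(index, "Elasticsearch fast"))  # document 1 only
```

Because the lookup goes from term to documents, a query never has to scan the documents themselves, which is what makes full-text search fast.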
Indexing is initiated with the index API, through which you can add or update a JSON document in a specific index.
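A call to the index API might look like the following sketch, which builds (but does not send) a `PUT` request to the `/<index>/_doc/<id>` endpoint. The host, the index name `articles`, the document ID, and every field in the sample document are assumptions chosen for illustration.

```python
import json
from urllib.request import Request


def build_index_request(base_url: str, index: str, doc_id: str, document: dict) -> Request:
    """Build a PUT request that adds or replaces one JSON document in an index."""
    return Request(
        url=f"{base_url}/{index}/_doc/{doc_id}",
        data=json.dumps(document).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# A hypothetical document mixing several value types: string, number,
# Boolean, date, array, and geolocation.
document = {
    "title": "Getting started with search",
    "views": 42,
    "published": True,
    "created": "2024-01-15",
    "tags": ["search", "tutorial"],
    "location": {"lat": 40.4168, "lon": -3.7038},
}

request = build_index_request("http://localhost:9200", "articles", "1", document)
print(request.get_method(), request.full_url)
```

Sending the same request again with the same ID would replace the existing document, which is how the index API covers both adding and updating.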
To understand Elasticsearch more deeply, see this article on six fundamental concepts.
What is Logstash for
Logstash, one of the core products of the Elastic Stack, is used to aggregate and process data and send it to Elasticsearch.
Logstash is a server-side data processing pipeline that lets you capture data from multiple sources at once, then enrich and transform it before it is indexed in Elasticsearch.
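The capture, transform, and send stages described above can be modeled conceptually in Python. This is not how Logstash is configured (it has its own pipeline configuration format); it is only a sketch of the data flow. The log line layout, the field names, and the index name `app-logs` are assumptions, and the output stage formats events as newline-delimited action and source pairs in the style of the Elasticsearch bulk API.

```python
import json
from typing import Iterable, Iterator


def read_events(lines: Iterable[str]) -> Iterator[dict]:
    """Input stage: turn raw lines into events, skipping blank lines."""
    for line in lines:
        line = line.strip()
        if line:
            yield {"message": line}


def enrich(events: Iterable[dict]) -> Iterator[dict]:
    """Filter stage: split '<LEVEL> <text>' and add derived fields."""
    for event in events:
        level, _, text = event["message"].partition(" ")
        yield {**event, "level": level.lower(), "text": text, "length": len(text)}


def to_bulk_lines(events: Iterable[dict], index: str) -> Iterator[str]:
    """Output stage: emit an action line followed by the document source."""
    for event in events:
        yield json.dumps({"index": {"_index": index}})
        yield json.dumps(event)


raw_lines = ["ERROR disk full", "", "INFO service started"]
bulk_lines = list(to_bulk_lines(enrich(read_events(raw_lines)), "app-logs"))
print("\n".join(bulk_lines))
```

Each stage is a generator, so events stream through the pipeline one at a time instead of being collected in memory first.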
What is Kibana for
Kibana is a data visualization and management tool for Elasticsearch that provides real-time histograms, line graphs, pie charts, and maps.
Kibana also includes advanced applications such as Canvas, which lets users create custom dynamic infographics based on their data, and Elastic Maps, which visualizes geospatial data.
Why use Elasticsearch
These are the main reasons to consider using Elasticsearch as the search engine for your site:
- It is fast.
- It is based on Lucene, so it excels at full-text search.
- It is a near real-time search platform, which means that the latency from the time a document is indexed until it becomes searchable is very short, typically one second. As a result, Elasticsearch is well suited for time-sensitive use cases such as security analysis and infrastructure monitoring.
- It is distributed by its very nature. Documents stored in Elasticsearch are distributed across multiple containers known as ‘shards’, which are replicated to provide redundant copies of the data in the event of hardware failure. The distributed nature of Elasticsearch allows it to scale to hundreds (or even thousands) of servers and handle petabytes of data.
- It comes with a wide range of features. Elasticsearch has several powerful built-in features that make data storage and search even more efficient, such as data rollups and index lifecycle management.
- The Elastic Stack simplifies data capture, visualization, and reporting. Integration with Beats and Logstash simplifies data processing before indexing in Elasticsearch. Kibana, in turn, provides real-time visualization of Elasticsearch data and user interfaces for quick access to application performance monitoring (APM), logs, and infrastructure metrics data.
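To make the idea of shards and their redundant copies more concrete, here is a simplified sketch of how a document might be routed to a shard and how the copies of that shard might be spread across nodes. Everything in it is an assumption for illustration: `zlib.crc32` stands in for the hash function Elasticsearch actually uses, and the round-robin `place_copies` helper is hypothetical, since the real cluster decides shard allocation itself.

```python
import zlib


def pick_shard(routing_key: str, num_primary_shards: int) -> int:
    """Route a document to a primary shard by hashing its routing key.

    Simplified model: hash(key) % number_of_primary_shards.
    crc32 is a stand-in, not the hash Elasticsearch uses.
    """
    return zlib.crc32(routing_key.encode("utf-8")) % num_primary_shards


def place_copies(shard: int, num_nodes: int, num_replicas: int) -> list[int]:
    """Hypothetical placement: the primary and each replica on different nodes."""
    if num_replicas >= num_nodes:
        raise ValueError("need more nodes than replicas to separate all copies")
    return [(shard + offset) % num_nodes for offset in range(num_replicas + 1)]


num_primary_shards, num_nodes, num_replicas = 3, 3, 1
for doc_id in ["user-1", "user-2", "user-3"]:
    shard = pick_shard(doc_id, num_primary_shards)
    nodes = place_copies(shard, num_nodes, num_replicas)
    print(f"{doc_id}: shard {shard}, primary on node {nodes[0]}, replicas on {nodes[1:]}")
```

Because the shard is derived from a hash of the key, the same document always lands on the same shard, and keeping each replica on a different node than its primary is what lets the data survive the loss of a single machine.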