
The Elasticsearch Stack works well alongside Kubernetes as a monitoring stack for collecting, storing, and analyzing Kubernetes telemetry data. The easiest way to deploy the Elasticsearch stack on Kubernetes is to use Helm charts.
This post covers the core concepts of Elasticsearch and gives a step-by-step guide for installing it on Kubernetes using Helm charts. You’ll also learn how to install Kibana, the data visualization dashboard for Elasticsearch.
What is Elasticsearch?
Elasticsearch is a Java-based, distributed, open-source search and analytics platform built on Apache Lucene. It can be viewed as a server that accepts JSON requests and returns JSON data. It began as a scalable version of Lucene and was designed so that indexes can expand horizontally across machines. With Elasticsearch, you can store, search, and analyze large amounts of data in near real time, with results typically arriving in milliseconds.
Elasticsearch returns search results quickly because it searches an index rather than scanning the text directly. It stores data as documents rather than in tables with a fixed schema, and it exposes rich REST APIs for storing and searching data.
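To make the "JSON in, JSON out" idea concrete, here is a minimal sketch that builds a full-text search request without sending it. The `/<index>/_search` endpoint and the `match` query are part of the Elasticsearch REST API; the `products` index, the `name` field, and the `build_search_request` helper are hypothetical names used only for illustration.

```python
import json
import urllib.request


def build_search_request(base_url, index, field, text):
    """Build (but do not send) a full-text search request for one index."""
    body = {"query": {"match": {field: text}}}
    return urllib.request.Request(
        url=f"{base_url}/{index}/_search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical index and field names, for illustration only.
request = build_search_request("http://localhost:9200", "products", "name", "laptop")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Passing the request to `urllib.request.urlopen` would send it once a cluster is reachable at that address; the response body would also be JSON.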
Backend Components
1. Cluster
A cluster is a collection of one or more nodes that are linked together. An Elasticsearch cluster gets its strength from distributing tasks, searching, and indexing across all of its nodes.
2. Node
A node is a single server that is part of a cluster. It stores data and participates in the cluster’s indexing and search functions. Elasticsearch nodes can be configured in several roles:
- Master Node: The master node is responsible for managing the Elasticsearch cluster, creating and deleting indexes, and adding and removing nodes.
- Data Node: Stores information and performs data-related tasks, including searching and aggregating.
- Client Node: Also called a coordinating node. It forwards cluster requests to the master node and data requests to the data nodes.
3. Shards
Elasticsearch lets you split an index into smaller parts called shards. Each shard is a fully functional, self-contained “index” that can be stored on any node in the cluster. By distributing the documents in an index across multiple shards, and those shards across multiple nodes, Elasticsearch provides redundancy, which protects against hardware failures and increases search capacity as nodes are added to the cluster.
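The way documents are spread over shards can be sketched as hashing a routing key (by default the document ID) and taking the result modulo the number of primary shards. This is a simplified illustration: `zlib.crc32` stands in for the hash function Elasticsearch actually uses, and the document IDs and the `pick_shard` helper are made up.

```python
import zlib


def pick_shard(routing_key, num_primary_shards):
    """Map a routing key (by default the document ID) to a primary shard number."""
    # zlib.crc32 is a stand-in here; Elasticsearch uses its own hash function.
    return zlib.crc32(routing_key.encode("utf-8")) % num_primary_shards


doc_ids = ["order-1", "order-2", "order-3", "order-4", "order-5", "order-6"]
placement = {doc_id: pick_shard(doc_id, 3) for doc_id in doc_ids}
for doc_id, shard in placement.items():
    print(f"{doc_id} -> shard {shard}")
```

Because the shard number depends on the shard count, the same document would land elsewhere if the count changed, which is one reason the number of primary shards is chosen when an index is created.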
4. Replicas
Elasticsearch lets you make one or more copies of an index’s shards, known as “replica shards” or simply “replicas.” A replica shard is a copy of a primary shard. Each document in an index belongs to a single primary shard. Replicas provide redundant copies of your data, which protects against hardware failure and increases the capacity to serve read requests such as document searches and retrievals.
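The protection that replicas give can be illustrated with a toy placement routine. It is only a sketch: the round-robin strategy, the node names, and both helper functions are assumptions for illustration, not the allocator Elasticsearch uses. The one rule it borrows from Elasticsearch is that a replica is never placed on the same node as its primary.

```python
def place_shards(nodes, num_primaries, num_replicas):
    """Toy round-robin placement that keeps copies of a shard on different nodes."""
    if num_replicas >= len(nodes):
        raise ValueError("need more nodes than replicas to keep copies apart")
    layout = {node: [] for node in nodes}
    for shard in range(num_primaries):
        for copy in range(num_replicas + 1):
            node = nodes[(shard + copy) % len(nodes)]
            role = "primary" if copy == 0 else "replica"
            layout[node].append((shard, role))
    return layout


def surviving_shards(layout, failed_node):
    """Return the shard numbers that still have a copy after one node fails."""
    return {
        shard
        for node, copies in layout.items()
        if node != failed_node
        for shard, _role in copies
    }


layout = place_shards(["node-a", "node-b", "node-c"], num_primaries=3, num_replicas=1)
for node, copies in layout.items():
    print(node, copies)
print("shards available without node-a:", sorted(surviving_shards(layout, "node-a")))
```

With three primaries and one replica each, losing any single node in this layout still leaves a copy of every shard on the remaining nodes.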
Basic concepts of Elasticsearch
1. Documents
Documents are the basic unit of information that can be indexed in Elasticsearch. They are expressed in JSON, the widely used internet data interchange format. A document can hold any structured data that can be encoded in JSON, not just text: strings, numbers, and dates, for example. Each document has a unique ID, and the data types of its fields describe the kind of entity it represents.
2. Index
An index is a collection of documents with similar characteristics. It can be compared to a database in a relational system. In most cases, the documents in an index are logically related. An e-commerce website, for example, might have one index for Customers, one for Products, one for Orders, and so on. An index is identified by a name, which is used to refer to it when indexing, searching, updating, and deleting the documents it contains.
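The relationship between documents and indexes can be sketched with a small in-memory stand-in. `ToyStore`, its methods, and the sample documents are hypothetical and exist only to show that a document is JSON-encodable data addressed by an index name plus an ID; this is not the Elasticsearch client API.

```python
import json


class ToyStore:
    """In-memory stand-in for named indexes holding JSON documents by ID."""

    def __init__(self):
        self._indices = {}

    def index(self, index_name, doc_id, document):
        # Round-trip through JSON to show that a document is JSON-encodable data.
        stored = json.loads(json.dumps(document))
        self._indices.setdefault(index_name, {})[doc_id] = stored

    def get(self, index_name, doc_id):
        return self._indices.get(index_name, {}).get(doc_id)

    def delete(self, index_name, doc_id):
        return self._indices.get(index_name, {}).pop(doc_id, None) is not None


store = ToyStore()
store.index("customers", "c-1", {"name": "Jane Doe", "signup_date": "2022-03-15"})
store.index("products", "p-1", {"name": "Laptop", "price": 999.0})
store.index("orders", "o-1", {"customer_id": "c-1", "product_id": "p-1", "quantity": 2})

print(store.get("orders", "o-1"))
print(store.delete("products", "p-1"), store.get("products", "p-1"))
```

Every operation names the index first and the document second, mirroring how the index name is used for indexing, retrieving, and deleting.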
3. Inverted Index
An inverted index is a data structure that stores a mapping from content (such as words or numbers) to its locations in a document or set of documents. It is essentially a hashmap-like structure that leads from a word to the documents containing it. Instead of storing strings as-is, an inverted index splits each document into individual search terms (i.e. each word) and then maps each term to the documents in which it appears. By using distributed inverted indices, Elasticsearch quickly finds the best matches for full-text searches, even in very large data sets.
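A toy version of this structure fits in a few lines. The sketch below assumes a very simple analyzer (lowercase, split on non-alphanumeric characters) and made-up sample documents; Lucene's real implementation adds positions, scoring, and compact on-disk encoding.

```python
import re
from collections import defaultdict


def build_inverted_index(documents):
    """Map each lowercase term to the set of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in re.findall(r"[a-z0-9]+", text.lower()):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return IDs of documents containing every term in the query."""
    terms = re.findall(r"[a-z0-9]+", query.lower())
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


docs = {
    1: "Elasticsearch stores JSON documents",
    2: "Kibana visualizes Elasticsearch data",
    3: "Helm installs charts on Kubernetes",
}
inverted = build_inverted_index(docs)
print(sorted(search(inverted, "elasticsearch")))       # [1, 2]
print(sorted(search(inverted, "elasticsearch data")))  # [2]
```

A lookup touches only the entries for the query terms, so its cost depends on how many documents contain those terms rather than on the total amount of text stored.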
Benefits of using Elasticsearch
1. Speed
Elasticsearch excels at full-text search because it is built on top of Lucene. It is also a near-real-time search platform, which means that the delay between indexing a document and being able to search it is very short, typically about a second. As a result, Elasticsearch is well suited to time-critical applications such as security analytics and infrastructure monitoring.
2. Distributed structure
Elasticsearch documents are distributed across containers known as shards, which are replicated to provide redundant copies of the data in case of hardware failure. This distributed structure lets Elasticsearch scale out to hundreds (or even thousands) of servers and handle petabytes of data.
3. Wide features spectrum
Elasticsearch has a number of powerful built-in features, such as data roll-ups and index lifecycle management, that make storing and searching data more efficient. These features sit alongside its speed, scalability, and resiliency.
4. Easy data ingestion, visualization, and reporting
Because Elasticsearch integrates with data shippers such as Beats and Logstash, it is simple to process data before indexing it. Kibana, meanwhile, provides real-time visualization of Elasticsearch data as well as user interfaces for quickly accessing APM, logs, and infrastructure metrics.
5. Analyzing metrics
Many organizations use the Elasticsearch stack to analyze metrics. This typically involves collecting data on several performance indicators, which vary with the use case.
6. Security analytics
Security analytics is another popular use of Elasticsearch. The ELK stack can analyze access logs and other security-related logs, giving you a more complete, real-time view of what is happening across your systems.
Steps to install Elasticsearch on Kubernetes Using Helm Chart
The four steps below install Elasticsearch on Kubernetes. Before proceeding, make sure you have a Kubernetes cluster; for simplicity, this guide creates one with Minikube. You also need the kubectl command-line tool and the Helm package manager installed.
1. Setting up clusters
First, set up a Kubernetes cluster, which you can do with Minikube. A cluster running multiple Elasticsearch nodes needs sufficient CPU and memory, so allocate enough of both with the --cpus and --memory options:
minikube start --cpus 4 --memory 8192
Run the following command to check whether the cluster is running. The output shows the status of the Kubernetes control plane and KubeDNS.
kubectl cluster-info
2. Deploying Elasticsearch using Helm
To begin installing Elasticsearch, first add the elastic repository to Helm:
helm repo add elastic https://helm.elastic.co
Next, use curl to download the example values.yaml file, which contains configuration intended for a Minikube cluster:
curl -O https://raw.githubusercontent.com/elastic/helm-charts/master/elasticsearch/examples/minikube/values.yaml
Now run the helm install command with the values.yaml file to install the Elasticsearch Helm chart:
helm install elasticsearch elastic/elasticsearch -f ./values.yaml
The -f option specifies a YAML values file that overrides the chart’s defaults. To install Elasticsearch in a specific namespace, add the -n option followed by the namespace name:
helm install elasticsearch elastic/elasticsearch -n [namespace] -f ./values.yaml
The output reports the status of the release as deployed and lists additional commands for testing the installation.
Now run the get pods command and watch the cluster members come up:
kubectl get pods --namespace=default -l app=elasticsearch-master -w
Wait until the READY column shows 1/1 for every pod. At that point, all the cluster members are up.
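If you would rather script the READY check than watch the terminal, the column can be parsed from the command output. The sketch below is only an illustration: `all_pods_ready` is a hypothetical helper, and the sample text is hand-written to resemble `kubectl get pods` output rather than captured from a real cluster.

```python
def all_pods_ready(kubectl_output):
    """Return True when every pod row shows matching READY counts, e.g. 1/1."""
    rows = kubectl_output.strip().splitlines()[1:]  # skip the header row
    if not rows:
        return False
    for row in rows:
        ready_count, total_count = row.split()[1].split("/")
        if ready_count != total_count:
            return False
    return True


# Illustrative output, not captured from a real cluster.
sample = """
NAME                     READY   STATUS    RESTARTS   AGE
elasticsearch-master-0   1/1     Running   0          3m
elasticsearch-master-1   1/1     Running   0          3m
elasticsearch-master-2   0/1     Running   0          3m
"""
print(all_pods_ready(sample))  # False
```

In a real script, the text would come from running the kubectl command without the -w flag, for example through `subprocess.run(..., capture_output=True, text=True)`.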
Run the following command to check the health of the cluster:
helm test elasticsearch
After Elasticsearch is installed, forward local port 9200 to the elasticsearch-master service:
kubectl port-forward svc/elasticsearch-master 9200
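With the port-forward active, the cluster can be queried over its REST API. The sketch below reads the `_cluster/health` endpoint and summarizes three fields from its response. It assumes the service answers plain HTTP without authentication on localhost:9200; newer chart versions may require HTTPS and credentials instead. The helper names and the sample payload are illustrative.

```python
import json
import urllib.request


def summarize_health(payload):
    """Turn a _cluster/health response body into a one-line summary."""
    return (
        f"cluster={payload['cluster_name']} "
        f"status={payload['status']} "
        f"nodes={payload['number_of_nodes']}"
    )


def fetch_health(base_url="http://localhost:9200"):
    """Call the cluster health API; requires the port-forward to be active."""
    with urllib.request.urlopen(f"{base_url}/_cluster/health", timeout=5) as response:
        return json.load(response)


# Illustrative payload so the sketch runs without a cluster.
sample = {"cluster_name": "elasticsearch", "status": "green", "number_of_nodes": 3}
print(summarize_health(sample))
# With the port-forward running: print(summarize_health(fetch_health()))
```

A status of green means all primary and replica shards are allocated, yellow means some replicas are not, and red means at least one primary shard is unavailable.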
3. Installing Kibana
Kibana is a free and open-source frontend that runs on top of the Elasticsearch Stack and lets users search and visualize the data stored in Elasticsearch. Run the following command to install Kibana:
helm install kibana elastic/kibana
Verify that the pods are ready. The Kibana pod is listed alongside the Elasticsearch pods.
kubectl get pods
Now use kubectl to forward local port 5601 to the Kibana deployment:
kubectl port-forward deployment/kibana-kibana 5601
Once the port is forwarded, you can access the Kibana GUI (graphical user interface), and through it your Elasticsearch data, by entering http://localhost:5601 in a web browser.
4. Installing Metricbeat
Metricbeat is a lightweight shipper that you install on your servers to periodically collect metrics from the operating system and from the services running on it. Installing Metricbeat is similar to installing Kibana.
Run the following helm command to install it:
helm install metricbeat elastic/metricbeat
Ensure that all the pods related to Metricbeat are up and running:
kubectl get pods
After completing these steps, you can create index patterns. In Kibana, navigate to Stack Management -> Index patterns and create a pattern from there.
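An index pattern selects indexes by name using a wildcard, for example metricbeat-* for all Metricbeat indexes. The sketch below imitates that matching with Python's `fnmatchcase`, which is a stand-in rather than Kibana's own matcher (it also treats `?` and `[...]` as special characters). The index names are hypothetical.

```python
from fnmatch import fnmatchcase


def matching_indices(pattern, index_names):
    """Return the index names selected by a wildcard pattern such as metricbeat-*."""
    return [name for name in index_names if fnmatchcase(name, pattern)]


# Hypothetical index names for illustration.
indices = [
    "metricbeat-7.17.3-2022.05.01",
    "metricbeat-7.17.3-2022.05.02",
    "products",
]
print(matching_indices("metricbeat-*", indices))
```

Here the pattern picks up both Metricbeat indexes and leaves the unrelated one out, which is the behavior you want when new dated indexes keep being created.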
Conclusion
Elasticsearch is a Java-based, distributed, open-source search and analytics platform built on Apache Lucene. It lets users store, search, and analyze massive quantities of data in near real time, with results arriving in milliseconds. Its backend is made up of clusters, nodes, shards, and replicas.
The advantages of running Elasticsearch on Kubernetes include easy data ingestion, visualization, and reporting, as well as fast search and metrics analysis. This post walked through getting Elasticsearch, Kibana, and Metricbeat up and running on Kubernetes using Helm charts.