Elasticsearch is a search engine that makes information searchable across different kinds of platforms. For example, if you are building an online eCommerce store, setting up an Elasticsearch cluster will help your customers search for the products they want. Apart from serving as a search engine, it also works as an analytics engine: it can analyze application logs and transaction records.
Thanks to its open-source nature, Elasticsearch is widely used across platforms such as online stores and business intelligence applications. However, if you don’t know how to set up the tool properly, you will miss out on most of its benefits. That’s why we have created this tutorial on how to set up an Elasticsearch cluster.
What is Elasticsearch Engine?
Elasticsearch (ES) is an extremely powerful tool that makes searching for products on data discovery applications possible. Because the tool is open-source, developers from all over the world take advantage of features such as clustering, security, data management, thread pooling, and monitoring APIs to analyze data easily. The distributed analytics engine is built on the Apache Lucene indexing library. Lucene is written in Java, and the search engine built on top of it caters to all your advanced search needs.
In short, the search engine provides storage for large clusters of application data and helps you analyze linguistic content. It makes trend detection and product recommendation easy on eCommerce stores and recommendation engines. Beyond these two kinds of platforms, ES is also used for indexing instance metrics.
Elasticsearch usually comes with default settings that you don’t need to change much but are also easily changeable when needed.
What is Elasticsearch Cluster?
An Elasticsearch cluster is a group of nodes that share the same cluster name attribute. Many nodes can join a single cluster, and when the cluster recognizes the new members, it automatically redistributes data among them. Once a platform runs the Elasticsearch engine, the cluster nodes begin their automatic data management processes. Each node in the cluster can take on different tasks, and each node is assigned a name by default so that it is easily recognizable.
What are the Different Parts of Elasticsearch?
An Elasticsearch cluster contains multiple node instances, all of which are linked together to distribute tasks, searching, and indexing across the cluster. We will talk about the variety of Elasticsearch cluster nodes shortly, but before that, let’s explore the main concepts in Elasticsearch:
Time latency: The near real-time nature of ES refers to the short delay between indexing a document and it becoming available for searching.
Cluster: A cluster in ES can have one or many nodes or servers that are combined together to store data in the cluster. You can use those nodes to index the data they have within or search that particular cluster for data.
Node: A node is an essential part of your cluster where data is stored. A node can take on roles such as master, data, ingest, coordinating, or client node. Besides storing data, nodes also participate in cluster management and in indexing and searching data.
Index: An index is a collection of documents with related characteristics. You can run index, search, delete, and update operations against the documents it contains.
Documents: A document contains information that you can index using Elasticsearch.
Shards and Replicas: A single index can contain many thousands of documents, and the space required to store them can exceed what one node can hold. Shards solve this storage problem by subdividing an index into multiple units. In Elasticsearch 6.x, an index has 5 shards by default.
Replicas, on the other hand, are copies of shards. If a shard fails, its replicas keep your data available, and replicas also let you scale out search traffic. Elasticsearch creates 1 replica per shard by default.
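To make the defaults concrete, here is a hedged sketch of overriding them when an index is created. The index name products and the counts are illustrative values, not part of this tutorial’s setup:

```shell
# Example settings body: 3 primary shards, 1 replica of each
# ("products" and the counts are illustrative values).
SETTINGS='{"settings":{"number_of_shards":3,"number_of_replicas":1}}'

# Against a running cluster, you would apply it like this:
# curl -XPUT 'http://localhost:9200/products?pretty' \
#      -H 'Content-Type: application/json' -d "$SETTINGS"
echo "$SETTINGS"
```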
What are the Different Types of Nodes in Elasticsearch Cluster?
The nodes in the Elasticsearch cluster take different responsibilities, and they have different roles to play. Have a look at the different types of nodes:
1. Data nodes
Data nodes store data within themselves and execute operations when needed, including searches and aggregation.
2. Master nodes
A master node helps to manage other nodes. Also, it assists a cluster to add or remove nodes as well as configure them.
3. Client nodes
Client nodes forward cluster-level requests to the master node and data-related requests (such as searches) to the data nodes.
4. Ingest nodes
These nodes pre-process the documents before preparing them for indexing.
After installation, a single node forms a one-node cluster named "elasticsearch" by default. You can make a node join a different cluster by configuring it with that cluster’s name.
How to Install Elasticsearch?
There are multiple ways to install Elasticsearch. In this tutorial, we use Ubuntu 16.04 systems running as AWS EC2 instances in the same VPC. We will get to configuring Elasticsearch shortly, but first we need to install Java. So, let’s get going.
Installing Java 8
You need Java 8 or later running on every node in the cluster, and you should install the same Java version on all of them. First update your system with the sudo apt-get update command, and then install Java with the following command:
sudo apt-get install default-jre
Once installed, you can check the Java version with the ‘java -version’ command; you will receive output similar to the following:
openjdk version "1.8.0_151"
OpenJDK Runtime Environment (build 1.8.0_151-8u151-b12-0ubuntu0.16.04.2-b12)
OpenJDK 64-Bit Server VM (build 25.151-b12, mixed mode)
Alternatively, you can install a specific JDK instead of the default JRE:
#Installing OpenJDK 8
sudo apt-get install openjdk-8-jdk
#Installing Oracle JDK 8
sudo add-apt-repository -y ppa:webupd8team/java
sudo apt-get update
sudo apt-get -y install oracle-java8-installer
Installing with a Tar File
These commands will download and unpack Elasticsearch from a tar file:
curl -L -O https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-6.2.4.tar.gz
tar -xvf elasticsearch-6.2.4.tar.gz
Installing with a package manager
#Import the Elasticsearch public GPG key into apt:
wget -qO - https://packages.elastic.co/GPG-KEY-elasticsearch | sudo apt-key add -
#Create the Elasticsearch source list:
echo "deb http://packages.elastic.co/elasticsearch/6.x/debian stable main" | sudo tee -a /etc/apt/sources.list.d/elasticsearch-6.x.list
sudo apt-get update
sudo apt-get -y install elasticsearch
Configuring Elasticsearch
After installing Elasticsearch, you should configure the nodes so that they can interact with each other. Open the Elasticsearch configuration file by running the following command:
sudo vim /etc/elasticsearch/elasticsearch.yml
You will have to open and edit this file on each node. It contains settings for different sections; enter the following configuration, replacing the example names and IP addresses with your own.
#give your cluster a name.
cluster.name: my-cluster
#give your nodes a name (change the node number from node to node).
node.name: "es-node-1"
#define node 1 as master-eligible:
node.master: true
#define nodes 2 and 3 as data nodes:
node.data: true
#enter the private IP and port of your node:
network.host: 172.11.61.27
http.port: 9200
#detail the private IPs of your nodes:
discovery.zen.ping.unicast.hosts: ["172.11.61.27", "172.31.22.131", "172.31.32.221"]
After entering this configuration, save the file and close the window.
If you have downloaded the tar file, you can find the configuration file location here:
vi /[YOUR_TAR_LOCATION]/config/elasticsearch.yml
But if you have installed Elasticsearch from the package manager, the configuration file location is given below:
vi /etc/elasticsearch/elasticsearch.yml
Use a descriptive name for the Elasticsearch cluster so that you can easily recognize it. Remember that whatever name you use, every node must be configured with the same cluster.name attribute to join the cluster.
cluster.name: lineofcode-prod
To set a node’s name in the Elasticsearch cluster, use the following setting:
node.name: ${HOSTNAME}
You can tag nodes that live in the same rack or data center with the following setting:
node.attr.rack: us-east-1
Elasticsearch ships with no authentication or authorization, so you should never bind the network host property to a public IP. Bind the node to its private or VPN address instead:
network.host: [_VPN_HOST_, _local_]
List the hostnames or IP addresses of the other cluster nodes in the discovery.zen.ping.unicast.hosts property. If needed, you can also configure the port on which Elasticsearch is accessible over HTTP with the http.port property.
How to Configure the JVM Options?
This section is optional; you can skip it if you don’t need it. Depending on your hardware configuration, you can tweak the JVM options Elasticsearch runs with.
#For example, if your server has 8 GB of RAM, set the heap properties as:
-Xms4g
-Xmx4g
If Elasticsearch’s memory gets swapped out to disk, performance suffers badly. To let Elasticsearch lock its memory, set this property:
bootstrap.memory_lock: true
The tool uses the concurrent mark-sweep (CMS) garbage collector by default, but if you wish to switch to G1, you can do it with the configuration given below:
-XX:-UseParNewGC
-XX:-UseConcMarkSweepGC
-XX:+UseCondCardMark
-XX:MaxGCPauseMillis=200
-XX:+UseG1GC
-XX:GCPauseIntervalMillis=1000
-XX:InitiatingHeapOccupancyPercent=35
Configuring Elasticsearch Cluster for Production
If you want to set up an Elasticsearch cluster that is running in a production environment, you will have to go through some different configurations than the ones we have mentioned earlier. Have a look below to learn additional Elasticsearch cluster settings:
1. Avoid the “Split Brain” Situation
The “split brain” situation occurs when the connection between nodes in the Elasticsearch cluster fails, due to a network problem or the failure of one or more nodes. If it happens, a disconnected node can elect itself master while the original master is still running, so the cluster ends up with two masters and data can diverge. You can prevent the situation rather than recover from it.
To do that, change the discovery.zen.minimum_master_nodes setting in the Elasticsearch configuration file. It determines how many master-eligible nodes must be visible before a master can be elected. The usual formula is (N / 2) + 1, rounded down, where N is the number of master-eligible nodes in the cluster. If the cluster has only three nodes, use the following setting:
discovery.zen.minimum_master_nodes: 2
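The quorum arithmetic is easy to sanity-check with a throwaway shell loop (this is only an illustration, not part of the Elasticsearch configuration):

```shell
# floor(N/2) + 1 for a few master-eligible node counts
for N in 3 5 7; do
  echo "N=$N -> minimum_master_nodes=$(( N / 2 + 1 ))"
done
# prints 2, 3, and 4 respectively
```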
2. Adjust the JVM heap size
The JVM heap should take no more than 50% of your RAM, and you should not make it larger than about 32GB (beyond that, the JVM loses compressed object pointers). Configure the maximum and minimum heap sizes to the same value with the Xmx and Xms settings in the jvm.options file.
On DEB:
sudo vim /etc/elasticsearch/jvm.options
-Xms2g
-Xmx2g
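The sizing rule can be sketched as a small calculation; RAM_MB is an assumed input, and 31744 MB stands in for the roughly 31 GB compressed-oops ceiling:

```shell
# Heap = half of RAM, capped below the ~32 GB compressed-oops limit.
RAM_MB=8192              # example machine with 8 GB of RAM
HEAP_MB=$(( RAM_MB / 2 ))
CAP_MB=31744
if [ "$HEAP_MB" -gt "$CAP_MB" ]; then HEAP_MB=$CAP_MB; fi
echo "-Xms${HEAP_MB}m"   # prints -Xms4096m for this example
echo "-Xmx${HEAP_MB}m"
```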
How to Run Elasticsearch?
Since you have installed and configured Elasticsearch, you can now run it. Run the command given below on each of the instances:
sudo service elasticsearch start
If you have configured the cluster correctly as described in the “Configuring Elasticsearch” section, your Elasticsearch cluster should be running by now. To make sure everything is working, query Elasticsearch from any of the nodes:
curl -XGET 'http://localhost:9200/_cluster/state?pretty'
The response will be similar to this:
{
  "cluster_name" : "my-cluster",
  "compressed_size_in_bytes" : 351,
  "version" : 4,
  "state_uuid" : "3LSnpinFQbCDHnsFv-Z8nw",
  "master_node" : "IwEK2o1-Ss6mtx50MripkA",
  "blocks" : { },
  "nodes" : {
    "IwEK2o1-Ss6mtx50MripkA" : {
      "name" : "es-node-2",
      "ephemeral_id" : "x9kUrr0yRh--3G0ckESsEA",
      "transport_address" : "172.31.50.123:9300",
      "attributes" : { }
    },
    "txM57a42Q0Ggayo4g7-pSg" : {
      "name" : "es-node-1",
      "ephemeral_id" : "Q370o4FLQ4yKPX4_rOIlYQ",
      "transport_address" : "172.31.62.172:9300",
      "attributes" : { }
    },
    "6YNZvQW6QYO-DX31uIvaBg" : {
      "name" : "es-node-3",
      "ephemeral_id" : "mH034-P0Sku6Vr1DXBOQ5A",
      "transport_address" : "172.31.52.220:9300",
      "attributes" : { }
    }
  },
…
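Beyond the full cluster state, the _cluster/health endpoint gives a one-line green/yellow/red summary. The snippet below simulates a response of the usual shape so the extraction step can be shown; against a live cluster you would pipe the commented curl instead:

```shell
# Live cluster: curl -s 'http://localhost:9200/_cluster/health?pretty'
# Simulated response (assumed shape) for demonstration:
RESPONSE='{"cluster_name":"my-cluster","status":"green","number_of_nodes":3}'
echo "$RESPONSE" | grep -o '"status":"[a-z]*"'   # prints "status":"green"
```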
How to Set Up an Elasticsearch, Fluentd, and Kibana (EFK) Logging Stack on Kubernetes
You can set up a logging stack on your Kubernetes cluster to analyze the log data generated by pods. This is especially useful when a heavy volume of services runs on the cluster. To manage your Kubernetes logs, you can use the EFK stack, which is easy to deploy. The stack consists of Elasticsearch, Fluentd, and the Kibana UI; together they are lightweight and easy to operate.
Elasticsearch lets you analyze and full-text search the log data. Kibana is a dashboard for Elasticsearch that lets you explore the log data stored in Elasticsearch via a web interface. If you are looking for an easy and simple way to understand your Kubernetes applications, their data, and their settings, EFK will be helpful.
So, now let’s talk about how to do an EFK stack setup quickly on Kubernetes to manage your web applications effectively.
Why Do You Need the EFK Stack?
Generally, the simple kubectl logs command lets you check the logs of a Kubernetes pod. But when the number of pods is high, you cannot inspect all of them one at a time. The Kibana dashboard on top of the EFK stack lets you search and filter the logs of every pod in one place, which is especially helpful if you have little experience with Linux commands.
Also, large apps can produce logs from more than 100 pods running on Kubernetes clusters, many of them coming from Docker containers or Kubernetes system components. You need a centralized log aggregation and management system to analyze all these logs, and with Elasticsearch, Fluentd, and Kibana it becomes much easier.
With Fluentd, you can collect logs from different sources in the cluster. If you want, you can filter the logs, enrich them, or drop the ones you don’t need. After doing so, Fluentd stores the resulting records in Elasticsearch.
How Does the EFK Stack Work?
Suppose you have three nodes, and those nodes run pods for your services or applications. The EFK stack runs alongside them, with Fluentd deployed as a DaemonSet: every node in the Kubernetes cluster gets its own Fluentd pod. Each Fluentd pod reads the logs of the pods on its node from the /var/log/containers directory, which contains the log files for all containers on that node.
Both Elasticsearch and Kibana run in their own pods. The pods can land on the same node if the available resources allow it, but since Elasticsearch and Kibana need significant CPU and memory, they usually end up on different nodes.
The cluster contains pods that run your web applications, and the Fluentd pods read the log data they produce. Fluentd then sends that data as JSON documents to Elasticsearch. Once the data is in Elasticsearch, Kibana can access it and display the information in its UI.
How Do Elasticsearch and Kibana Solve Your Problem?
When you deploy applications on Kubernetes, you spend a lot of time managing and troubleshooting logs. If you could spend that time creating and deploying applications instead, it would boost your productivity. With Elasticsearch, Fluentd, and Kibana, managing your application logs becomes much easier. Here are one-line definitions of the three open-source tools:
- Elasticsearch is an open-source engine that stores, searches, and analyzes the log data of your containerized applications.
- Fluentd collects data to build and unite all the logging layers together.
- Kibana lets you manage and operate the data from the Elasticsearch dashboard itself.
Prerequisites for EFK Setup
We will help you set up EFK on your Kubernetes cluster, but first, you need to satisfy the following requirements:
- Make sure role-based access control (RBAC) is enabled on your Kubernetes 1.10+ cluster.
- Your cluster should have adequate resources to let the EFK stack run. You can add worker nodes and scale the cluster if needed.
- Install the kubectl command-line tool on your localhost. Also, make sure to configure the tool’s setting according to your needs so that you can link it to your cluster easily.
Steps to Set Up EFK on Kubernetes Cluster
That’s all you need for setting up EFK on your Kubernetes cluster, and once done, you can go on with the following steps:
1. Create a Namespace
You should install all the logging components in a dedicated namespace, which helps to separate them from the other services running in the cluster. Creating this namespace is the first step towards setting up the EFK stack. You can use the kubectl get namespaces command to list the existing namespaces in your cluster.
kubectl get namespaces
The output will be like the following:
NAME STATUS AGE
default Active 5m
kube-system Active 5m
kube-public Active 5m
These are the preinstalled namespaces on your cluster. We will create a kube-logging namespace by writing a kube-logging.yaml file. Use your Nano text editor or Vim:
nano kube-logging.yaml
Paste the code below into the editor. After pasting, save the file and exit.
kind: Namespace
apiVersion: v1
metadata:
  name: kube-logging
After creating the Namespace object file, you can create the Namespace with the kubectl create command and -f filename flag.
kubectl create -f kube-logging.yaml
The output will be:
namespace/kube-logging created
You can verify that the kube-logging namespace was created with kubectl get namespaces, which gives the following output:
NAME STATUS AGE
default Active 23m
kube-logging Active 1m
kube-public Active 23m
kube-system Active 23m
This logging namespace is now ready for the Elasticsearch deployment.
2. Set Up Elasticsearch
Now we can deploy a 3-node Elasticsearch cluster in the newly created namespace. Since we are using 3 pods, a quorum of master-eligible nodes survives the loss of any one pod, which helps avoid the “split-brain” situation. We will first create a headless Service and then the Elasticsearch StatefulSet.
The Headless Service
The headless service, named elasticsearch, defines a DNS domain for the 3 pods. Open a file called elasticsearch_svc.yaml in your text editor, then paste the following code and save it.
kind: Service
apiVersion: v1
metadata:
  name: elasticsearch
  namespace: kube-logging
  labels:
    app: elasticsearch
spec:
  selector:
    app: elasticsearch
  clusterIP: None
  ports:
    - port: 9200
      name: rest
    - port: 9300
      name: inter-node
You first define the elasticsearch service in the kube-logging namespace, and then set .spec.selector to app: elasticsearch so that the headless service selects the app: elasticsearch pods. Because clusterIP is set to None, DNS resolves directly to the individual pods, which use the ports defined here for REST access and inter-node communication. Create the service with the kubectl create -f elasticsearch_svc.yaml command. The output will be similar to this:
service/elasticsearch created
To check if the service was created successfully, use the kubectl get services --namespace=kube-logging command. The following output will appear:
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
elasticsearch ClusterIP None <none> 9200/TCP,9300/TCP 26s
Now it is time to create the StatefulSet.
The Elasticsearch StatefulSet
A Kubernetes StatefulSet gives pods stable identities and consistent storage. Since Elasticsearch requires stable storage so that data survives pod rescheduling, we use a StatefulSet here. Open a file named elasticsearch_statefulset.yaml in your editor, then paste the following lines:
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: es-cluster
  namespace: kube-logging
spec:
  serviceName: elasticsearch
  replicas: 3
  selector:
    matchLabels:
      app: elasticsearch
  template:
    metadata:
      labels:
        app: elasticsearch
Paste the following block immediately after the previous one:
. . .
    spec:
      containers:
      - name: elasticsearch
        image: docker.elastic.co/elasticsearch/elasticsearch:7.2.0
        resources:
          limits:
            cpu: 1000m
          requests:
            cpu: 100m
        ports:
        - containerPort: 9200
          name: rest
          protocol: TCP
        - containerPort: 9300
          name: inter-node
          protocol: TCP
        volumeMounts:
        - name: data
          mountPath: /usr/share/elasticsearch/data
        env:
        - name: cluster.name
          value: k8s-logs
        - name: node.name
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: discovery.seed_hosts
          value: "es-cluster-0.elasticsearch,es-cluster-1.elasticsearch,es-cluster-2.elasticsearch"
        - name: cluster.initial_master_nodes
          value: "es-cluster-0,es-cluster-1,es-cluster-2"
        - name: ES_JAVA_OPTS
          value: "-Xms512m -Xmx512m"
Paste this next block after the previous ones:
. . .
      initContainers:
      - name: fix-permissions
        image: busybox
        command: ["sh", "-c", "chown -R 1000:1000 /usr/share/elasticsearch/data"]
        securityContext:
          privileged: true
        volumeMounts:
        - name: data
          mountPath: /usr/share/elasticsearch/data
      - name: increase-vm-max-map
        image: busybox
        command: ["sysctl", "-w", "vm.max_map_count=262144"]
        securityContext:
          privileged: true
      - name: increase-fd-ulimit
        image: busybox
        command: ["sh", "-c", "ulimit -n 65536"]
        securityContext:
          privileged: true
In this block, you define the init containers that run before the main Elasticsearch container to fix data-directory permissions and kernel settings. After that, add the volumeClaimTemplates block to complete the StatefulSet object:
. . .
  volumeClaimTemplates:
  - metadata:
      name: data
      labels:
        app: elasticsearch
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: do-block-storage
      resources:
        requests:
          storage: 100Gi
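Note that do-block-storage is DigitalOcean’s StorageClass. On another provider you would substitute the class your cluster actually offers; the name standard below is only an example:

```yaml
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: standard   # substitute your provider's StorageClass
      resources:
        requests:
          storage: 100Gi
```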
When the StatefulSet object is complete, it will look like this:
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: es-cluster
  namespace: kube-logging
spec:
  serviceName: elasticsearch
  replicas: 3
  selector:
    matchLabels:
      app: elasticsearch
  template:
    metadata:
      labels:
        app: elasticsearch
    spec:
      containers:
      - name: elasticsearch
        image: docker.elastic.co/elasticsearch/elasticsearch:7.2.0
        resources:
          limits:
            cpu: 1000m
          requests:
            cpu: 100m
        ports:
        - containerPort: 9200
          name: rest
          protocol: TCP
        - containerPort: 9300
          name: inter-node
          protocol: TCP
        volumeMounts:
        - name: data
          mountPath: /usr/share/elasticsearch/data
        env:
        - name: cluster.name
          value: k8s-logs
        - name: node.name
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: discovery.seed_hosts
          value: "es-cluster-0.elasticsearch,es-cluster-1.elasticsearch,es-cluster-2.elasticsearch"
        - name: cluster.initial_master_nodes
          value: "es-cluster-0,es-cluster-1,es-cluster-2"
        - name: ES_JAVA_OPTS
          value: "-Xms512m -Xmx512m"
      initContainers:
      - name: fix-permissions
        image: busybox
        command: ["sh", "-c", "chown -R 1000:1000 /usr/share/elasticsearch/data"]
        securityContext:
          privileged: true
        volumeMounts:
        - name: data
          mountPath: /usr/share/elasticsearch/data
      - name: increase-vm-max-map
        image: busybox
        command: ["sysctl", "-w", "vm.max_map_count=262144"]
        securityContext:
          privileged: true
      - name: increase-fd-ulimit
        image: busybox
        command: ["sh", "-c", "ulimit -n 65536"]
        securityContext:
          privileged: true
  volumeClaimTemplates:
  - metadata:
      name: data
      labels:
        app: elasticsearch
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: do-block-storage
      resources:
        requests:
          storage: 100Gi
Now save the file and close it. Deploy the StatefulSet using the kubectl create -f elasticsearch_statefulset.yaml command.
After deploying, you should be easily able to monitor the object using the following command:
kubectl rollout status sts/es-cluster --namespace=kube-logging
This should deploy all the pods in the Elasticsearch cluster. To verify that the cluster formed correctly, forward local port 9200 to port 9200 on one of the pods with kubectl port-forward es-cluster-0 9200:9200 --namespace=kube-logging, and then run curl http://localhost:9200/_cluster/state?pretty in a different terminal window.
The following output should appear:
{
  "cluster_name" : "k8s-logs",
  "compressed_size_in_bytes" : 348,
  "cluster_uuid" : "QD06dK7CQgids-GQZooNVw",
  "version" : 3,
  "state_uuid" : "mjNIWXAzQVuxNNOQ7xR-qg",
  "master_node" : "IdM5B7cUQWqFgIHXBp0JDg",
  "blocks" : { },
  "nodes" : {
    "u7DoTpMmSCixOoictzHItA" : {
      "name" : "es-cluster-1",
      "ephemeral_id" : "ZlBflnXKRMC4RvEACHIVdg",
      "transport_address" : "10.244.8.2:9300",
      "attributes" : { }
    },
    "IdM5B7cUQWqFgIHXBp0JDg" : {
      "name" : "es-cluster-0",
      "ephemeral_id" : "JTk1FDdFQuWbSFAtBxdxAQ",
      "transport_address" : "10.244.44.3:9300",
      "attributes" : { }
    },
    "R8E7xcSUSbGbgrhAdyAKmQ" : {
      "name" : "es-cluster-2",
      "ephemeral_id" : "9wv6ke71Qqy9vk2LgJTqaA",
      "transport_address" : "10.244.40.4:9300",
      "attributes" : { }
    }
  },
…
With these steps, you have created an Elasticsearch cluster with 3 nodes. The next task is to set up Kibana on the cluster.
3. Create the Kibana Deployment
We will now define the Kibana Deployment and the Kibana Service. The Deployment uses one pod replica, and the Service is named kibana. If you need to, you can scale the replica count and put a LoadBalancer in front of the Kibana Service to balance requests across the Deployment’s pods. Here, we will create the Service and Deployment together in one kibana.yaml file.
Open the kibana.yaml file in your text editor and paste the code stated below:
apiVersion: v1
kind: Service
metadata:
  name: kibana
  namespace: kube-logging
  labels:
    app: kibana
spec:
  ports:
  - port: 5601
  selector:
    app: kibana
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: kibana
  namespace: kube-logging
  labels:
    app: kibana
spec:
  replicas: 1
  selector:
    matchLabels:
      app: kibana
  template:
    metadata:
      labels:
        app: kibana
    spec:
      containers:
      - name: kibana
        image: docker.elastic.co/kibana/kibana:7.2.0
        resources:
          limits:
            cpu: 1000m
          requests:
            cpu: 100m
        env:
        - name: ELASTICSEARCH_URL
          value: http://elasticsearch:9200
        ports:
        - containerPort: 5601
Now, save the file and close the window. After this, you can roll out the Kibana deployment as well as service using the kubectl create -f kibana.yaml command and the output will be:
service/kibana created
deployment.apps/kibana created
Use this command to find out if the rollout went well:
kubectl rollout status deployment/kibana --namespace=kube-logging
And the following message will appear:
deployment "kibana" successfully rolled out
To access Kibana, forward a local port to the pod running Kibana. Get the Kibana pod details with the kubectl get pods --namespace=kube-logging command.
Here is what the output will be:
NAME READY STATUS RESTARTS AGE
es-cluster-0 1/1 Running 0 55m
es-cluster-1 1/1 Running 0 54m
es-cluster-2 1/1 Running 0 54m
kibana-6c9fb4b5b7-plbg2 1/1 Running 0 4m27s
For the kibana-6c9fb4b5b7-plbg2 pod, forward local port 5601 to port 5601 on the pod by using the following command:
kubectl port-forward kibana-6c9fb4b5b7-plbg2 5601:5601 --namespace=kube-logging
Output:
Forwarding from 127.0.0.1:5601 -> 5601
Forwarding from [::1]:5601 -> 5601
Go to http://localhost:5601 from your internet browser, and you will see the Kibana welcome page. It indicates that the deployment of Kibana on your Kubernetes cluster was successful.
4. Create the Fluentd DaemonSet
Now, you will create several Kubernetes object definitions for the Fluentd service in one fluentd.yaml file and then apply the file with the kubectl command. We will configure Fluentd as a DaemonSet so that a Fluentd logging-agent pod is rolled out on each of the cluster nodes. So, let’s open the fluentd.yaml file in your text editor. Then paste the following Kubernetes object definition blocks one by one:
apiVersion: v1
kind: ServiceAccount
metadata:
  name: fluentd
  namespace: kube-logging
  labels:
    app: fluentd
And then paste the ClusterRole block:
. . .
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: fluentd
  labels:
    app: fluentd
rules:
- apiGroups:
  - ""
  resources:
  - pods
  - namespaces
  verbs:
  - get
  - list
  - watch
ClusterRoleBinding block:
. . .
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: fluentd
roleRef:
  kind: ClusterRole
  name: fluentd
  apiGroup: rbac.authorization.k8s.io
subjects:
- kind: ServiceAccount
  name: fluentd
  namespace: kube-logging
DaemonSet spec:
. . .
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd
  namespace: kube-logging
  labels:
    app: fluentd
Now paste the below code:
. . .
spec:
  selector:
    matchLabels:
      app: fluentd
  template:
    metadata:
      labels:
        app: fluentd
    spec:
      serviceAccount: fluentd
      serviceAccountName: fluentd
      tolerations:
      - key: node-role.kubernetes.io/master
        effect: NoSchedule
      containers:
      - name: fluentd
        image: fluent/fluentd-kubernetes-daemonset:v1.4.2-debian-elasticsearch-1.1
        env:
        - name: FLUENT_ELASTICSEARCH_HOST
          value: "elasticsearch.kube-logging.svc.cluster.local"
        - name: FLUENT_ELASTICSEARCH_PORT
          value: "9200"
        - name: FLUENT_ELASTICSEARCH_SCHEME
          value: "http"
        - name: FLUENTD_SYSTEMD_CONF
          value: disable
And this is the final block:
. . .
        resources:
          limits:
            memory: 512Mi
          requests:
            cpu: 100m
            memory: 200Mi
        volumeMounts:
        - name: varlog
          mountPath: /var/log
        - name: varlibdockercontainers
          mountPath: /var/lib/docker/containers
          readOnly: true
      terminationGracePeriodSeconds: 30
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
      - name: varlibdockercontainers
        hostPath:
          path: /var/lib/docker/containers
When you put all the blocks in place, the whole section will look like this:
apiVersion: v1
kind: ServiceAccount
metadata:
  name: fluentd
  namespace: kube-logging
  labels:
    app: fluentd
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: fluentd
  labels:
    app: fluentd
rules:
- apiGroups:
  - ""
  resources:
  - pods
  - namespaces
  verbs:
  - get
  - list
  - watch
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: fluentd
roleRef:
  kind: ClusterRole
  name: fluentd
  apiGroup: rbac.authorization.k8s.io
subjects:
- kind: ServiceAccount
  name: fluentd
  namespace: kube-logging
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd
  namespace: kube-logging
  labels:
    app: fluentd
spec:
  selector:
    matchLabels:
      app: fluentd
  template:
    metadata:
      labels:
        app: fluentd
    spec:
      serviceAccount: fluentd
      serviceAccountName: fluentd
      tolerations:
      - key: node-role.kubernetes.io/master
        effect: NoSchedule
      containers:
      - name: fluentd
        image: fluent/fluentd-kubernetes-daemonset:v1.4.2-debian-elasticsearch-1.1
        env:
        - name: FLUENT_ELASTICSEARCH_HOST
          value: "elasticsearch.kube-logging.svc.cluster.local"
        - name: FLUENT_ELASTICSEARCH_PORT
          value: "9200"
        - name: FLUENT_ELASTICSEARCH_SCHEME
          value: "http"
        - name: FLUENTD_SYSTEMD_CONF
          value: disable
        resources:
          limits:
            memory: 512Mi
          requests:
            cpu: 100m
            memory: 200Mi
        volumeMounts:
        - name: varlog
          mountPath: /var/log
        - name: varlibdockercontainers
          mountPath: /var/lib/docker/containers
          readOnly: true
      terminationGracePeriodSeconds: 30
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
      - name: varlibdockercontainers
        hostPath:
          path: /var/lib/docker/containers
Now, save and close the file. To roll out the DaemonSet, use the kubectl create -f fluentd.yaml command.
Now, go to http://localhost:5601 and open Discover from the left-side menu. Kibana will ask you to configure an index pattern; once you define one that matches the indices Fluentd writes, you can browse the recent log entries from this page and explore the log data on the Kibana dashboard anytime.
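To double-check that Fluentd is actually shipping logs, you can also list the indices in Elasticsearch. This Fluentd image writes daily logstash-* indices by default; the snippet simulates typical index names (the dates are made up) so the counting step is demonstrable, while the commented curl is what you would run against the live cluster:

```shell
# Live cluster (with the Elasticsearch port-forward from earlier active):
# curl -s 'http://localhost:9200/_cat/indices?v'
# Simulated index names for demonstration (dates are made up):
INDICES='logstash-2019.07.01
logstash-2019.07.02
.kibana_1'
echo "$INDICES" | grep -c '^logstash-'   # prints 2
```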
Conclusion
Now that you know how to set up an Elasticsearch and EFK stack on your Kubernetes cluster, you can easily manage your web applications. At the same time, you will be able to analyze the application logs and transaction records. If you have any questions, feel free to comment down below.