What is etcd?

By admin

Kubernetes is a distributed container orchestration platform in which a set of worker nodes is managed and controlled from a master node (the control plane). A cluster can contain many worker nodes, across which pods are distributed. The control plane handles the following responsibilities:

  • Receiving and handling API calls from kubectl (a command-line utility for Kubernetes).
  • Scheduling pods onto worker nodes.
  • Ensuring the functioning of pods on worker nodes.
  • Real-time monitoring, running regular check-ups, and self-healing.

To keep track of changes and updates associated with these nodes and to store all cluster data, Kubernetes uses etcd. Let’s look at what etcd is, how it works, and how Kubernetes uses it to run clusters.

What is etcd?

etcd is an open-source, distributed key-value store that holds and manages the critical information that distributed systems or clusters of machines need in order to function properly.

Typically, it manages configuration data and Kubernetes metadata, supports automatic updates, and assists in setting up overlay networking for containers. Because all applications read from and write to etcd, it helps cloud-based applications maintain uptime and stay coordinated.

It also distributes configuration data, which makes node configuration more resilient. In addition, etcd supports service discovery, which lets deployed applications announce their availability, and it stores the desired state of the system.

etcd differs from traditional databases that store data as rows in tables. It stores each record as a key-value pair, and an update to one record creates a new revision of that key without interfering with other records that are being updated in parallel. This lets etcd manage Kubernetes records efficiently.
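
One way to picture how a key-value store can update one record without disturbing the others is to keep a list of revisions per key. The sketch below is a toy in-memory model under that assumption, not etcd's actual storage engine; the class name `VersionedStore` and the keys are hypothetical.

```python
class VersionedStore:
    """Toy key-value store that keeps every revision of every key."""

    def __init__(self):
        self.revision = 0   # store-wide counter, bumped on every write
        self.history = {}   # key -> list of (revision, value) pairs

    def put(self, key, value):
        """Record a new revision of key and return the revision number."""
        self.revision += 1
        self.history.setdefault(key, []).append((self.revision, value))
        return self.revision

    def get(self, key, at_revision=None):
        """Return the value of key as of at_revision (default: latest)."""
        limit = self.revision if at_revision is None else at_revision
        value = None
        for rev, val in self.history.get(key, []):
            if rev <= limit:
                value = val
        return value


store = VersionedStore()
store.put("/config/replicas", "3")   # revision 1
store.put("/config/image", "v1")     # revision 2
store.put("/config/replicas", "5")   # revision 3

print(store.get("/config/replicas"))                 # latest value
print(store.get("/config/replicas", at_revision=2))  # value before the update
```

Updating `/config/replicas` appends a new revision and leaves `/config/image` untouched, which is the property the paragraph above describes.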

Features

  • Fully replicated: every node in an etcd cluster holds the complete data store.
  • Highly available: it tolerates hardware failures and network issues without a single point of failure, so they do not bring down the whole system.
  • Consistent: every read returns the most recent write across all hosts.
  • Comes with a simple and well-defined user-facing API.
  • Supports automatic Transport Layer Security (TLS) with optional client certificate authentication.
  • Fast: benchmarked at approximately 10,000 writes per second.
  • Uses the Raft algorithm to keep the distributed copies of the data in agreement.

How Does etcd Work?

To understand how etcd works, you need to know the following three terms:

1. Leaders

The leader handles all client requests that require cluster consensus. Requests that do not require consensus (for example, serializable reads) can be processed by any cluster member.

The leader accepts new changes, replicates them to the other members, and commits them once a majority of the members have confirmed them. A cluster can have only one leader at any given time.

2. Election

If the leader dies or stops responding, the remaining members start an election after a timeout in order to choose a new leader. Each member keeps its own randomized election timer, which determines how long it waits before calling an election and putting itself forward as a candidate.

3. Term

If a node does not hear from the leader within its election timeout, it begins an election by starting a new term and marking itself as a candidate. It then asks the other nodes for their votes, and if it receives votes from a majority of the cluster, it becomes the new leader. In most cases, the first candidate wins.

However, if the votes are split so that no candidate obtains a majority, the term ends without a leader, and after a short wait a new term begins with fresh randomized election timers.

Every write request made to etcd therefore passes through the leader, and changes are not committed immediately. Instead, etcd uses the Raft algorithm to get agreement from a majority of nodes: the leader sends the proposed change to the other nodes, and once a majority have confirmed it, the leader commits the change.
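
The election, timeout, and term rules described above can be sketched as a small simulation. This is a simplified model, not etcd's implementation: it assumes a single candidate per round, timeouts drawn between 150 and 300 ms, and a 90% chance that each peer is reachable and grants its vote. The function names and node names are hypothetical.

```python
import random


def run_election(node_ids, term, rng):
    """Simulate one election round; return (leader or None, new term)."""
    # Each node waits a randomized timeout; the first to expire becomes candidate.
    timeouts = {node: rng.uniform(150, 300) for node in node_ids}
    candidate = min(timeouts, key=timeouts.get)
    term += 1   # the candidate starts a new term
    votes = 1   # and votes for itself
    for node in node_ids:
        # Assumption: each peer is reachable and grants its vote 90% of the time.
        if node != candidate and rng.random() < 0.9:
            votes += 1
    majority = len(node_ids) // 2 + 1
    return (candidate if votes >= majority else None), term


def elect_leader(node_ids, seed=0):
    """Repeat election rounds, each in a new term, until a candidate wins."""
    rng = random.Random(seed)
    term, leader = 0, None
    while leader is None:
        leader, term = run_election(node_ids, term, rng)
    return leader, term


nodes = ["etcd-1", "etcd-2", "etcd-3", "etcd-4", "etcd-5"]
leader, term = elect_leader(nodes)
print(f"{leader} became leader in term {term}")
```

A round that fails to reach a majority simply ends, and the next round starts a new term with new random timeouts.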
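
The majority rule for committing a change can be sketched as follows. This is a minimal model under stated assumptions, not etcd's Raft code: the `Leader` class is hypothetical, followers are represented only by whether they are reachable, and a reachable follower always confirms.

```python
class Leader:
    """Toy leader that commits an entry only after a majority confirms it."""

    def __init__(self, follower_names):
        self.followers = {name: True for name in follower_names}  # name -> reachable?
        self.log = []        # every proposed entry, in order
        self.committed = []  # entries confirmed by a majority

    def propose(self, entry):
        """Append entry to the log and commit it if a majority confirms."""
        self.log.append(entry)
        # The leader's own copy counts as one confirmation.
        confirmations = 1 + sum(self.followers.values())
        quorum = (len(self.followers) + 1) // 2 + 1
        if confirmations >= quorum:
            self.committed.append(entry)
            return True
        return False


leader = Leader(["etcd-2", "etcd-3", "etcd-4", "etcd-5"])
print(leader.propose(("/config/replicas", "3")))  # all five members confirm

leader.followers["etcd-4"] = False
leader.followers["etcd-5"] = False
print(leader.propose(("/config/replicas", "4")))  # three of five still confirm

leader.followers["etcd-3"] = False
print(leader.propose(("/config/replicas", "5")))  # only two of five confirm
```

The last proposal stays in the log but is not committed, because two confirmations out of five members do not form a majority.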

Running etcd clusters

Before proceeding, make sure the kubectl command-line tool is configured to communicate with your cluster, and review the following prerequisites:

Prerequisites

  • Because etcd is a leader-based system, make sure that the leader sends heartbeats to all the other nodes on time so that the cluster stays stable.
  • An unstable etcd cluster is one in which no leader is elected. In that state the cluster cannot make any changes to its current state, so no new pods can be scheduled.
  • Run etcd clusters on dedicated machines or in isolated environments so that they are not starved of resources.
  • Use etcd 3.2.10 or above.

Running multi-node etcd cluster

Run a multi-node etcd cluster for better durability and availability. A five-member cluster is recommended in production. Configure the cluster with either static member information or dynamic discovery. For example, with five members reachable at $IP1 through $IP5, run the following command:

etcd --listen-client-urls=http://$IP1:2379,http://$IP2:2379,http://$IP3:2379,http://$IP4:2379,http://$IP5:2379 \
  --advertise-client-urls=http://$IP1:2379,http://$IP2:2379,http://$IP3:2379,http://$IP4:2379,http://$IP5:2379

If the etcd cluster runs behind a load balancer whose address is $LB, start the Kubernetes API servers with the following flag:

--etcd-servers=$LB:2379
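
The preference for a five-member cluster follows from majority arithmetic: a cluster stays writable only while a majority of its members is available. The short sketch below computes this for a few cluster sizes; the function names are illustrative and not part of any etcd tool.

```python
def quorum(members):
    """Smallest number of members that forms a majority."""
    return members // 2 + 1


def fault_tolerance(members):
    """Number of members that can fail while a majority remains."""
    return members - quorum(members)


for size in (1, 3, 4, 5, 7):
    print(f"members={size} quorum={quorum(size)} tolerated_failures={fault_tolerance(size)}")
```

A four-member cluster tolerates no more failures than a three-member one, which is why odd sizes are the usual choice.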

etcd and Kubernetes

As the primary datastore of Kubernetes, etcd stores and replicates all Kubernetes cluster data: configuration, state, and metadata. Because Kubernetes runs on clusters of machines, a distributed datastore like etcd is an integral part of it, which makes a reliable approach to configuring and managing etcd crucial.

etcd is used as the primary key-value store for building Kubernetes clusters that tolerate single-point hardware and software faults. All of the cluster’s state data is stored in etcd through the Kubernetes API, and Kubernetes uses etcd’s “watch” function to monitor this data and reconfigure itself when changes occur.

etcd stores values that represent both the actual and the ideal state of the cluster, and the “watch” function lets Kubernetes respond when the two diverge. Any member of the etcd cluster can accept read and write requests.

All “read” requests made with kubectl commands are served from the data stored in etcd, and any change requested through the kubectl apply command is created or updated in etcd.
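
The “watch” idea can be illustrated with a toy store that notifies subscribers whenever a key under a given prefix changes. This is an in-memory sketch, not the etcd client API; the class `WatchableStore` and the keys are hypothetical examples.

```python
class WatchableStore:
    """Toy key-value store that calls back watchers on matching writes."""

    def __init__(self):
        self.data = {}
        self.watchers = []  # list of (prefix, callback) pairs

    def watch(self, prefix, callback):
        """Register callback for every future write under prefix."""
        self.watchers.append((prefix, callback))

    def put(self, key, value):
        """Store the value, then notify every watcher whose prefix matches."""
        self.data[key] = value
        for prefix, callback in self.watchers:
            if key.startswith(prefix):
                callback(key, value)


events = []
store = WatchableStore()
store.watch("/registry/pods/", lambda key, value: events.append((key, value)))

store.put("/registry/pods/default/web", "Running")
store.put("/registry/services/default/web", "ClusterIP")  # no watcher on this prefix

print(events)
```

Only the write under the watched prefix produces an event, which is how a component can react to the changes it cares about without polling the whole store.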

etcd was adopted by Kubernetes in 2014, and its popularity has grown considerably since then. It is now used in production environments by industry leaders such as Amazon, Google, and Microsoft Azure.

The etcd Operators

Operators encode human operational knowledge in software, which makes running etcd on Kubernetes much simpler. The etcd Operator works with an operator framework and simplifies etcd cluster configuration and management.

The etcd Operator can be installed with a single command, after which etcd clusters are created, configured, and managed through declarative configuration.

Benefits of etcd operator

1. Creating and destroying clusters

To create or destroy a cluster and its associated data, users only specify the size of the cluster; there is no need to describe the tedious configuration settings for every etcd member.

2. Resizing

To resize a cluster, users only modify the declared size. The Operator takes care of the rest, such as deploying members and reconfiguring the cluster.
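
The declarative resizing described above can be sketched as a reconcile step that compares the current members with the desired size and derives the actions to take. This is a hypothetical model of the idea, not the etcd Operator's code; the member naming scheme `etcd-N` is an assumption.

```python
def reconcile(current_members, desired_size):
    """Return the ordered (action, member) steps needed to reach desired_size."""
    members = list(current_members)
    actions = []
    index = 0
    # Scale up: add members under the first unused names.
    while len(members) < desired_size:
        name = f"etcd-{index}"
        if name not in members:
            members.append(name)
            actions.append(("add", name))
        index += 1
    # Scale down: remove the most recently listed members first.
    while len(members) > desired_size:
        actions.append(("remove", members.pop()))
    return actions


current = ["etcd-0", "etcd-1", "etcd-2"]
print(reconcile(current, 5))  # grow from three to five members
print(reconcile(current, 1))  # shrink from three to one member
print(reconcile(current, 3))  # already at the desired size
```

The user states only the target size; the steps are derived from the difference between the desired and the observed state.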

3. Real-time backup

The etcd Operator runs the backup cycle automatically and transparently to users. Users only specify the backup policy once and can modify it later if they want. For instance, a policy can run a backup every 30 minutes and keep the last 3 backups.
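
The example policy (a backup every 30 minutes, keeping the last 3) can be modelled with a bounded queue. This sketch only simulates timestamps and assumes the first backup is taken at the start time; it does not take real backups, and the function name is hypothetical.

```python
from collections import deque
from datetime import datetime, timedelta


def simulate_backups(start, interval_minutes, max_backups, duration_minutes):
    """Return the backup timestamps still retained after duration_minutes."""
    retained = deque(maxlen=max_backups)  # the oldest backup is dropped automatically
    moment = start
    end = start + timedelta(minutes=duration_minutes)
    while moment <= end:
        retained.append(moment)
        moment += timedelta(minutes=interval_minutes)
    return list(retained)


start = datetime(2024, 1, 1, 0, 0)
kept = simulate_backups(start, interval_minutes=30, max_backups=3, duration_minutes=120)
for backup_time in kept:
    print(backup_time.isoformat())
```

After two hours, five backups have been taken, but only the three most recent remain.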

4. Upgrading

Upgrading etcd is time-consuming and is considered a difficult task. Since upgrades cannot be neglected, users can rely on the etcd Operator, which simplifies the operation and reduces the risk of upgrade errors.

Properties of etcd

1. Replication

etcd is fully replicated: every node within the etcd cluster holds a copy of the complete data store, so no data is missed if a single node is lost.

2. Availability

etcd keeps working without major interruption because it is designed to have no single point of failure and to handle hardware failures and network partitions gracefully.

3. Security

etcd supports automatic Transport Layer Security (TLS) with optional Secure Sockets Layer (SSL) client certificate authentication. Because etcd holds crucial and confidential information about an organization, administrators should grant access only to the team members who need it, at the least-privileged level and for as short a time as necessary.

4. Simple

It is simple to use and works with most applications. Whether it is a simple web application or a large container orchestration platform such as Kubernetes, an application can read and write data in etcd with standard HTTP/JSON tools.
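
As an illustration of the HTTP/JSON interface, the sketch below builds request bodies for etcd's v3 JSON gateway, which expects keys and values to be base64-encoded. It only constructs the JSON and does not contact a server. The helper names are hypothetical, and the `/v3/kv/put` and `/v3/kv/range` paths mentioned in the comments are an assumption that holds for recent etcd releases; older releases used different path prefixes.

```python
import base64
import json


def encode(text):
    """Base64-encode a string the way the JSON gateway expects."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def put_body(key, value):
    """JSON body for a POST to /v3/kv/put (path assumed, see lead-in)."""
    return json.dumps({"key": encode(key), "value": encode(value)})


def range_body(key):
    """JSON body for a POST to /v3/kv/range that reads a single key."""
    return json.dumps({"key": encode(key)})


print(put_body("foo", "bar"))
print(range_body("foo"))
```

Any tool that can send an HTTP POST with a JSON body, such as curl, could submit these payloads to a running etcd endpoint.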

Conclusion

etcd is a robust and reliable distributed key-value store, used mainly with Kubernetes clusters. Its consistent workflow ensures that all data is stored in a proper sequence throughout the cluster.

This blog walked you through the important details of etcd: what it is, how it is used, how it works internally, its benefits, and how to run it on your system.

These are some of the most important things to know in order to work with etcd and Kubernetes. Let us know in the comments section if you have any queries or suggestions. We will respond as early as possible.
