Taking Control of Your Database With Kubernetes Operators

Managing stateful applications like databases in Kubernetes can be challenging. Operators package operational expertise into software that automates routine management tasks.

Using a MySQL operator simplifies deploying and scaling a production-ready MySQL cluster. It also enables high availability by replicating the Pods that host the database, and it supports backups and recovery.

InnoDB Cluster

Haroon Khan is a data professional who builds and maintains databases supporting high-performance applications. He has over a decade of experience with various technologies, including MySQL, Microsoft SQL Server, DB2, and open-source tools. He is passionate about helping companies adopt a modern data architecture that can handle extensive data requirements.

InnoDB Cluster is a critical component in this effort, offering high availability built on MySQL Group Replication. However, implementing it can be challenging if you are unfamiliar with its components (MySQL Server instances, MySQL Shell, and MySQL Router), and managing the topology can require several tools. It may not suit your application if you do not want to invest development, integration, and testing time in each technology. This is where Kubernetes helps. Haroon believes that scaling MySQL with InnoDB Cluster should not mean scaling your complexity, which is why he champions pairing it with the MySQL Operator for Kubernetes. The combination brings InnoDB Cluster under Kubernetes orchestration, so deployment and management are handled in one place.

If an instance is restarted, it leaves the group and must rejoin before it is added back into the cluster's default replica set. To rejoin an instance, you can use the cluster.rejoinInstance() command in MySQL Shell, which takes the instance's URI as its parameter. The command brings the instance back ONLINE so that it can take part in the group again. It also checks that the instance has not diverged from the group (a split-brain symptom) by comparing its executed GTIDs with the executed and purged GTIDs of the other ONLINE members. The command can fail if the instance is unreachable, but you can try again later.
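The GTID comparison mentioned above can be illustrated with a small sketch. This is not the AdminAPI's implementation; it is a simplified model that assumes untagged GTID sets of the form uuid:1-5:7 and only looks for transactions that the rejoining instance has executed but the group has not. The function names parse_gtid_set and errant_transactions are hypothetical.

```python
def parse_gtid_set(text):
    """Parse an untagged GTID set such as 'uuid:1-5:7,uuid2:1-3'
    into a dict mapping each source UUID to a set of transaction numbers."""
    result = {}
    for part in text.replace("\n", "").split(","):
        part = part.strip()
        if not part:
            continue
        uuid, *intervals = part.split(":")
        txns = result.setdefault(uuid, set())
        for interval in intervals:
            # An interval is either 'N' or 'START-END' (inclusive).
            start, _, end = interval.partition("-")
            txns.update(range(int(start), int(end or start) + 1))
    return result


def errant_transactions(instance_executed, group_executed):
    """Return transactions present on the instance but missing from the group.
    A non-empty result suggests the instance diverged and should not rejoin as-is."""
    inst = parse_gtid_set(instance_executed)
    group = parse_gtid_set(group_executed)
    errant = {}
    for uuid, txns in inst.items():
        extra = txns - group.get(uuid, set())
        if extra:
            errant[uuid] = extra
    return errant
```

An empty result means every transaction on the instance is already known to the group, which is the situation in which a rejoin would be expected to proceed.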

PodDisruptionBudget

A Pod Disruption Budget (PDB) specifies the minimum number or percentage of pods in a collection that must always be up. It is a valuable tool when dealing with applications that require high availability, such as quorum-based apps or web front ends.

Pod disruption budgets help ensure that node operations, such as draining a node, do not bring your service down by evicting too many pod instances at once. They complement the Horizontal Pod Autoscaler in protecting against unnecessary downtime. To use a PDB, you create a YAML file defining the desired availability for your application. Kubernetes records the current state of the budget in the object's status (PodDisruptionBudgetStatus), which you inspect rather than define yourself. Keep in mind that in changing situations the reported status may trail behind the actual state.
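As a minimal sketch of such a definition, the snippet below builds a policy/v1 PodDisruptionBudget as a Python dictionary and prints it as JSON, which kubectl accepts in place of YAML. The name mysql-pdb and the label app=mysql are placeholder assumptions, and the disruptions_allowed helper is a simplified model that handles only an integer minAvailable.

```python
import json


def build_pdb(name, match_labels, min_available):
    """Return a policy/v1 PodDisruptionBudget manifest as a plain dict."""
    return {
        "apiVersion": "policy/v1",
        "kind": "PodDisruptionBudget",
        "metadata": {"name": name},
        "spec": {
            # Minimum number of selected pods that must stay up
            # during voluntary disruptions such as a node drain.
            "minAvailable": min_available,
            "selector": {"matchLabels": dict(match_labels)},
        },
    }


def disruptions_allowed(current_healthy, min_available):
    """Simplified model: how many pods could be evicted right now
    without dropping below an integer minAvailable."""
    return max(0, current_healthy - min_available)


if __name__ == "__main__":
    # Placeholder name and labels; adjust to match your own pods.
    print(json.dumps(build_pdb("mysql-pdb", {"app": "mysql"}, 2), indent=2))
```

With three healthy pods and minAvailable set to 2, one pod may be evicted at a time, which is the usual arrangement for a three-member quorum-based group.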

Pod Disruption Budgets are helpful for applications that need to maintain their availability while cluster upgrades take place. For example, a telecom might want to ensure their VoIP services remain available during system maintenance or patching. This allows the telecom to roll out features that can improve the end-user experience without sacrificing availability. The PodDisruptionBudget is also helpful for software as a service (SaaS) providers who must balance availability with system upgrades and maintenance. This is especially true when a SaaS provider must maintain availability while rolling out a new feature or deploying updates to their service.

ReplicaSets

ReplicaSets are Kubernetes control objects that maintain a specified number of pod replicas. They are well suited to straightforward scaling needs and can help ensure high availability and resilience. ReplicaSets also react when the actual state drifts from the desired state, such as when a pod fails or is deleted manually, by creating new pods to replace the missing ones or terminating excess replicas.
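The reconciliation behavior described above can be sketched in a few lines. This is an illustration of the idea only, not the Kubernetes controller's code; the reconcile function is hypothetical, and the choice of which surplus pods to terminate is arbitrary here, whereas the real controller applies its own ordering.

```python
def reconcile(desired_replicas, running_pods):
    """Compare desired and actual state and return
    (number_of_pods_to_create, list_of_pods_to_delete)."""
    diff = desired_replicas - len(running_pods)
    if diff > 0:
        # Too few pods: create replacements from the pod template.
        return diff, []
    # Enough or too many pods: anything beyond the desired count is surplus.
    surplus = running_pods[desired_replicas:]
    return 0, surplus
```

For example, with a desired count of 3 and two running pods, the sketch asks for one new pod; with four running pods, it marks one for termination.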

ReplicaSets are configured in a YAML manifest file that specifies the number of replicas and the pod template. They use label selectors to identify the pods they manage. For example, a ReplicaSet could acquire all pods with the labels tier in (frontend) and environment in (prod). A Kubernetes ReplicaSet should not be confused with a database replica set, which propagates changes from a primary to its secondaries through an operation log (oplog).
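The label selection in that example can be sketched as follows. The snippet builds an apps/v1 ReplicaSet manifest as a Python dictionary and includes a small helper that evaluates the selector against a pod's labels. The container name web and the image argument are placeholder assumptions, and the selects helper handles only the In operator.

```python
def build_replicaset(name, replicas, image):
    """Return an apps/v1 ReplicaSet manifest as a plain dict."""
    labels = {"tier": "frontend", "environment": "prod"}
    return {
        "apiVersion": "apps/v1",
        "kind": "ReplicaSet",
        "metadata": {"name": name},
        "spec": {
            "replicas": replicas,
            "selector": {
                "matchExpressions": [
                    {"key": "tier", "operator": "In", "values": ["frontend"]},
                    {"key": "environment", "operator": "In", "values": ["prod"]},
                ]
            },
            "template": {
                # Template labels must satisfy the selector above.
                "metadata": {"labels": labels},
                "spec": {"containers": [{"name": "web", "image": image}]},
            },
        },
    }


def selects(selector, pod_labels):
    """Return True if pod_labels satisfy every 'In' expression in the selector."""
    return all(
        expr["operator"] == "In" and pod_labels.get(expr["key"]) in expr["values"]
        for expr in selector.get("matchExpressions", [])
    )
```

A pod labeled tier=frontend and environment=prod is selected, while one labeled environment=dev is not.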

When a primary fails in a database replica set (the MongoDB-style construct, not the Kubernetes object), the secondary that is elected as the new primary catches up by applying the operations recorded in the log to its data set. In addition, such a replica set can pre-warm the caches of electable secondaries by mirroring read queries to them. This reduces the impact of primary elections following a primary outage or during planned maintenance. Replica set members can also be deployed across multiple nodes and availability zones to enhance redundancy.

Backups

Many solutions exist for running stateful databases in Kubernetes, but most lack day-2 capabilities like backups and upgrades. Kubernetes Operators bridge this gap by automating and streamlining complex, domain-specific operations, including deploying, scaling, and upgrading databases. The best MySQL operators are designed to integrate fully with existing infrastructure-as-code tools and CI/CD pipelines to enable database deployments that are scalable, secure, and production-ready.

Oracle’s MySQL Operator is a self-healing solution that supports MySQL InnoDB Clusters in Kubernetes. It uses a Kubernetes StatefulSet to manage MySQL Server instances and assigns each of them a PersistentVolumeClaim for storage. The operator also deploys MySQL Router Pods to route queries to the cluster. The operator is generally available and distributed under the Universal Permissive License (UPL), which is close to the MIT license.
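As a rough sketch of what a cluster request to this operator can look like, the snippet below builds an InnoDBCluster resource as a Python dictionary and prints it as JSON for kubectl. The apiVersion and field names follow the operator's published examples as best recalled, so verify them against the CRD version installed in your cluster; the names mycluster and mypwds are placeholders, and the referenced Secret is assumed to exist already.

```python
import json


def build_innodb_cluster(name, secret_name, instances=3, routers=1):
    """Return an InnoDBCluster custom resource as a plain dict.
    Field names are an assumption to check against your installed CRD."""
    return {
        "apiVersion": "mysql.oracle.com/v2",
        "kind": "InnoDBCluster",
        "metadata": {"name": name},
        "spec": {
            # Name of an existing Secret holding the root credentials.
            "secretName": secret_name,
            "tlsUseSelfSigned": True,
            # Number of MySQL Server instances managed via the StatefulSet.
            "instances": instances,
            # Number of MySQL Router Pods placed in front of the cluster.
            "router": {"instances": routers},
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_innodb_cluster("mycluster", "mypwds"), indent=2))
```

Scaling is then a matter of changing the instances value and reapplying the resource, leaving the operator to adjust the underlying StatefulSet.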

The Bitpoke MySQL Operator is an easy-to-use, open-source MySQL operator for Kubernetes that requires only Helm and kubectl to install. It is designed to be a simple, stable, and production-ready solution that provides monitoring, replication, backup, upgrade, and other features. The operator uses a declarative YAML file to deploy and scale MySQL clusters on Kubernetes.

Unlike a traditional stored procedure, OpenStack's nova-conductor abstraction prevents compute nodes from accessing the database directly. This reduces the risk of attacks on data. However, the database can still be attacked through buffer overflows, memory leaks, and malicious software.
