Diving Deep into Database Migration on Kubernetes: A Comparative Study

Database migration with Init Containers, Continuous Deployment Pipelines, Separate Helm Chart with Kubernetes Job and Custom-Developed SQL Script Executor.

Published in

Cloud Native Daily

5 min readJun 19, 2023

Introduction

Database migration is a crucial aspect of deploying applications on a Kubernetes cluster. It ensures that the database schema and data remain synchronised with the application’s evolving requirements.

In this blog, we will explore various methods for running database migration in a Kubernetes environment. We will discuss four different approaches: using init containers, running migrations through continuous deployment pipelines, creating a separate helm chart for running database migrations through a Kubernetes job and utilising a custom-developed SQL script executor.

We have containerised the migration scripts and have used helm charts for the deployments. Each approach has its own advantages and disadvantages, enabling you to choose the most suitable option for your specific deployment needs. Let’s discuss each of the approaches in detail.

Init Containers

Init containers are containers that run before the main application containers start. In the context of database migration, init containers can be utilised to perform migration tasks before the application containers are deployed. This is the simplest approach as it requires minimal changes in the deployment yaml file.

Advantages

Isolated migration process: Using init containers ensures a clean and isolated migration process, separate from the application containers.
Simplified deployment manifests: Migration tasks can be included in the same deployment manifest, simplifying the deployment configuration.

Disadvantages

Limited flexibility: Init containers are primarily designed for one-time initialisation tasks and may not be ideal for complex migration scenarios.
Increased resource consumption: Running additional containers, even for migration purposes, consumes additional resources.
Late feedback: Due to the way helm works, the deployment will always succeed irrespective of the status of the init containers. You need to implement additional monitoring to verify if the deployment has succeeded.

Continuous Deployment Pipelines

Continuous deployment pipelines integrate the database migration process into the application’s CI/CD pipeline. The pipeline triggers the necessary steps to execute the migrations. Executing the migrations scripts on the database requires connections parameters which are set by the pipeline as environment variables.

Advantages

Automated and streamlined process: Integration with the CI/CD pipeline ensures automated and consistent migration execution.
Version control: Migration scripts can be version-controlled alongside application code, enabling consistent and reproducible deployments.

Disadvantages

Complexity: Incorporating database migration into the CI/CD pipeline requires additional configuration and management effort.
Tight coupling: If the application and database migrations are tightly coupled in terms of deployment, it may limit flexibility in scaling and managing them independently.
Database exposure: Database is required to be made accessible to the build/release tool so that it can connect to the database. It is highly unlikely in case of production database due to security concerns.
Credential exposure: Database connection parameters will be set in plaintext as environment variables. This is a security concern.

Separate Helm Chart with Kubernetes Job

Using a separate Helm chart for database migrations leverages the package management capabilities of Helm. The chart includes a Kubernetes job that runs an image containing the migration script. The helm chart is deployed on the Kubernetes cluster from where the database is directly accessible. You don’t need to expose your database to any outside dependencies.

Advantages

Modularity and reusability: The separate Helm chart allows for modular deployment and reusability across different environments or projects.
Configuration flexibility: Helm charts offer flexible configuration options for customising the migration process for each deployment.
Isolation: The database migration is isolated within its own Helm release, ensuring separation from other application components.
No database exposure: The database is not required to be made accessible beyond the cluster network where the application is hosted.

Disadvantages

Learning curve: Utilising Helm and creating separate charts may require a learning curve, especially for teams new to Helm.
Management overhead: Managing separate Helm charts adds some management overhead compared to other approaches.
Credential exposure: Database connection parameters will be set in plaintext as environment variables. This is a security concern.

Custom-Developed SQL Script Executor

In this approach, a custom-developed SQL script executor is packaged as a container image and deployed as a Kubernetes job. The executor can connect to a secret store to retrieve the database connection details securely. This approach is an extension of the separate helm chart approach but replacing the database command line utility with custom developed one. It eliminates the requirement of setting database connection parameters to be set as environment variables.

Advantages

Flexibility and extensibility: The custom executor allows for flexibility and customisation to meet specific migration requirements.
Secure connection handling: The executor can securely retrieve database connection details from a secret store, reducing the risk of credential exposure.
Version control: Including migration scripts within the executor image enables versioning and ensures consistent deployments.

Disadvantages

Development effort: Developing and maintaining a custom executor requires dedicated development effort.
Image management: Managing the executor image, including updates and dependencies, can become challenging over time.
Scalability: Resource-intensive migration processes may impact the scalability of the Kubernetes cluster or result in longer deployment times.

Conclusion

When it comes to running database migration on a Kubernetes cluster, various approaches offer advantages and trade-offs. Remember, there is no one-size-fits-all solution. It’s crucial to evaluate your project’s needs, resources, and constraints to determine the most suitable approach.

Kubernetes Monitoring with OpenTelemetry

Learn how to monitor Kubernetes using OpenTelemetry with real-time visibility and granular error data - Reduce MTTR by…

gethelios.dev

OpenTelemetry: A full guide

Learn all about OpenTelemetry OpenSource and how it transforms microservices observability and troubleshooting

gethelios.dev

Trace-based Testing for the Modern Infrastructure

A high-level look at trace-based testing: what it is, why it's important, and how engineering teams might implement it.

gethelios.dev

Distributed Tracing: A Guide for 2023

Explore the basics of distributed tracing, how it works, the major components, key benefits, challenges, and best…

medium.com

9 Best Distributed Tracing Tools for Developers

Decide Among The Best Tools For Distributing Tracing in your Backend Microservices Architecture

javascript.plainenglish.io

Diving Deep into Database Migration on Kubernetes: A Comparative Study

Database migration with Init Containers, Continuous Deployment Pipelines, Separate Helm Chart with Kubernetes Job and Custom-Developed SQL Script Executor.

Introduction

Init Containers

Continuous Deployment Pipelines

Separate Helm Chart with Kubernetes Job

Custom-Developed SQL Script Executor

Conclusion

Further Reading:

Kubernetes Monitoring with OpenTelemetry

Learn how to monitor Kubernetes using OpenTelemetry with real-time visibility and granular error data - Reduce MTTR by…

OpenTelemetry: A full guide

Learn all about OpenTelemetry OpenSource and how it transforms microservices observability and troubleshooting

Trace-based Testing for the Modern Infrastructure

A high-level look at trace-based testing: what it is, why it's important, and how engineering teams might implement it.

Distributed Tracing: A Guide for 2023

Explore the basics of distributed tracing, how it works, the major components, key benefits, challenges, and best…

9 Best Distributed Tracing Tools for Developers

Decide Among The Best Tools For Distributing Tracing in your Backend Microservices Architecture

Written by Nikhil Samant