Spring Database Migration with Flyway

Database Migration

Database Migration simply means managing your schema changes over multiple copies of your database. When you have a single instance of your database, schema changes are not very problematic.  But if you have multiple instances of your schema it becomes very cumbersome to update every single instance manually when the schema is changed (add a new table, update column).

For example, if we create a copy of our schema on different deployment environments (dev, stage, prod), or if we need a new clone of our schema for each customer in a multitenant application it is a good idea to use database migration tools as Flyway or Liquibase. It is error-prone to conduct schema changes manually and running some manual scripts on each instance of a database.

Flyway Overview

Flyway is an open-source database migration tool that helps us to keep track of the changes in the schema and reflect all data sources. Automatic database migrations allow us:

  • Create a database from scratch.
  • Have a single source of truth for the version of the database state.
  • Have a reproducible state of the database in local and remote environments.
  • Automate database changes deployment, which helps to minimize human errors.

Basically, we provide SQL scripts which are called migration to the flyway and it runs the scripts sequentially. The migrations are then sorted based on their version number and applied in order.

Flyway initially creates flyway_schema_history table in the database to keep track of the migrations. Each migration is recorded in this table with its status. Once migration is run, its status is set as success = true to imply that it is already applied, so it won't be run on the next turns.

Implementation

We implement it on top of multitenant application from the post. So we assume that there are multiple datasources.

We need to start with adding the dependency to the spring boot app:

compile("org.flywaydb:flyway-core:6.4.1")
build.gradle

Initially, we should disable it to not run at startup as there is no data source selected otherwise, the app will crash.

spring:
  flyway:
    enabled: false
application.yml

Note that if you want to use it in a normal single-tenant application, then no need to disable flyway at startup, it will look for migrations in classpath and run the migrate method at startup automatically. Here in this example, it fails at startup as there is no datasource specified yet.

And then we can configure it to run after data sources are loaded to DataSourceProperties

@Configuration
public class DataSourceConfig {

    private final DataSourceProperties dataSourceProperties;

    public DataSourceConfig(DataSourceProperties dataSourceProperties) {
        this.dataSourceProperties = dataSourceProperties;
    }

    @Bean
    public DataSource getDataSource() {
        // ...
    }

    @PostConstruct
    public void migrate() {
        for (Object dataSource : dataSourceProperties
                .getDatasources()
                .values()) {
            DataSource source = (DataSource) dataSource;
            Flyway flyway = Flyway.configure().dataSource(source).load();
            flyway.migrate();
        }
    }
}

The main operation done here is that specifying the data source to flyway and calling the migrate method. We do this in the loop, so it is done for each data source. The migrate method runs the SQL scripts given under the resources/db/migration folder.

Let's assume, we have 2 SQL scripts.

  • to initialize a new DB: V1__init_schema.sql
  • to insert some necessary data: V2__insert_admin.sql

Flyway runs the scripts sequentially according to the version number.  Version numbers can be in the format of (V1, V1.0.1, V2020.01.10). Only an important point is that versions should be unique and note that they will work in order.

Example V1__init_schema.sql (Creates the tables)

create table if not exists user_group
(
	id bigserial not null
		constraint user_group_pkey
			primary key,
	value varchar(255),
	permission jsonb
);

alter table user_group owner to postgres;

create table if not exists application_user
(
	id bigserial not null
		constraint application_user_pkey
			primary key,
	password varchar(255),
	type varchar(255),
	username varchar(255),
	address varchar(255),
	avatar_url varchar(255),
	email varchar(255),
	name varchar(255),
	phone_number varchar(255),
	user_group_id bigint
		constraint fk14qhu4jb4b86vjiw8ax2u2p7d
			references user_group
);

alter table application_user owner to postgres;

# ...
V1__init_schema.sql

Example V2__insert_admin.sql (inserts initial necessary user data)

INSERT INTO public.user_group
(id, value, permission)
VALUES
(1, 'Super User', '');

INSERT INTO public.application_user
(id, password, type, username, address, avatar_url, email, name, phone_number, user_group_id)
VALUES
(1, '$2a$10$Sn9M0FNJTLxAnTn5Am5Ax.g/Esh6j1sGaGuEE0WYFYOFQe7YOnlJ2', null, 'admin', '', '', 'cemalturkogluiu@gmail.com', 'Cemal Turkoglu', '+35679409866', 1);
V2__insert_admin.sql

When we start the application these scripts will be run for each datasource. Flyway creates a table in each datasource named flyway_schema_history to keep track of changes. For example in the first run it fills the table as follows:

flyway_schema_history table

In the next run, the scripts won't run as they have already implemented status. So it works as a Version Control System for the databases.

If we need the schema to be changed we can create new scripts with a bumped up version and it will be reflected in each data source.