Aiven PostgreSQL migration will attempt to migrate a PostgreSQL cluster from an external source to a Aiven PostgreSQL service. That goal is achieved in one of 2 ways by using this tool https://github.com/aiven/aiven-db-migrate:

  • Using logical replication
  • Using a dump / restore process

The preferred first method chosen will always be logical replication, since once successful, it will be able to keep 2 databases in sync until interrupted. Should the preconditions for it not be met for a certain database, then the logic will fall back to pg_dump.

Regardless of migration method used, the initial step performed by the migration tool is a schema dump and migrate, which ensures schema compatibility

Prerequisites for the migration are:

  • the source server needs to be publicly available or there must be a VPC peering connection between the 2 private networks
  • the provided user can access the destination cluster from an external IP, as configured in the source cluster pg_hba.conf
  • for logical replication to work, the credentials provided must have superuser access to the source cluster or the https://github.com/aiven/aiven-extras extension needs to be installed on it
  • for logical replication to work, a replication slot needs to be created on the destination cluster for each migrated database from the source cluster, so the number of maximum available replication slots needs to be considered

To enable the aiven_extras extension on an Aiven Postgres you can run the following command:

$ CREATE EXTENSION aiven_extras CASCADE;

Checking the migration configuration options available:

$ avn service types -v
----
Service type 'pg' options:

Remove migration
=> --remove-option migration
Database name for bootstrapping the initial connection
=> -c migration.dbname=<string>
Hostname or IP address of the server where to migrate data from
=> -c migration.host=<string>
Password for authentication with the server where to migrate data from
=> -c migration.password=<string>
Port number of the server where to migrate data from
=> -c migration.port=<integer>
The server where to migrate data from is secured with SSL
=> -c migration.ssl=<boolean> (default=True)
User name for authentication with the server where to migrate data from
=> -c migration.username=<string>
----

The combination of dbname, username, password, host, port and SSL will be used to perform the initial connection to the PG cluster. Example migration configuration follows:

First create a couple of PG services:

$ avn service create -t pg -p startup-4 --project test pg-example-src
$ avn service create -t pg -p startup-4 --project test pg-example-dst

Then configure the migration details:

$ avn service update -c migration.host=<host> -c migration.port=<port> -c migration.ssl=true -c migration.username=<user> -c migration.password=<pass> --project test pg-example-dst

Note that there are 2 databases on the source cluster: one with the aiven-extras extension installed, and one without. Let's check the migration status:

$ avn --show-http service migration-status pg-example-dst --project test
------
-----Response Begin-----
{
"migration": {
"error": null,
"method": "",
"status": "done"
},
"migration_detail": [
{
"dbname": "has_aiven_extras",
"error": null,
"method": "replication",
"status": "syncing"
},
{
"dbname": "defaultdb",
"error": null,
"method": "pg_dump",
"status": "done"
}
]
}
-----Response End-----
STATUS METHOD ERROR
====== ====== =====
done null

Note that the text status method is left empty, due to the mixed methods used to migrate each database.

Now to remove the migration configuration from our service:

$ avn service update --project test --remove-option migration pg-example-dst

Removing the migration will remove all logical replication related objects from both the source and the target cluster, so it will effectively stop the logical replication. It will have no effect for the pg_dump method, since it is a one time operation.

Note that removing a migration configuration can leave the destination cluster in an inconsistent state, depending on where the migration procedure was when the removal was triggered. The states that are considered "safe" are "done" for the pg_dump method and "syncing" for logical replication.

While in a "running" state , both migration methods are still copying data around from the source cluster to the target one, so stopping the process will probably leave some tables moved while some partially there or altogether missing.

Also please note that running a logical replication migration twice on the same cluster will create duplicate data and logical replication also has some limitations on what it will copy.

Did this answer your question?