MySQL replication overview
MySQL replication is always based on replicating logical changes. That is, the replication protocol may contain an actual statement that the target server should apply or it may have an entry saying "update row with these old attributes to have these new attributes". These are called statement and row formats.
The statement format is more compact but cannot represent all changes because some statements would yield different results if executed as is on different servers. The row format can represent all changes and it allows using tools like Debezium since the binary log contains full details of all changes in itself.
For these reasons Aiven uses row format by default.
The row based replication works very well as long as the tables being replicated have a primary key. MySQL primary key look ups are very fast and the target server can find the rows to update and delete very quickly. However, when the table being replicated is lacking a primary key the, the target server needs to make a sequential table scan for each individual update or delete statement and the replication can become extremely slow if the table is large.
DELETE FROM nopk WHERE modified_time > '2020-01-13'
If a statement like above matched 500 rows and the table had a million rows
altogether, the row based replication format would contain 500 individual delete operations and the target server needed to do sequential scan over the one million rows for each of the individual deletions, which could take tens of minutes. If the table had a primary key the same statement would likely be replicated in under a second.
How Aiven uses replication
The considerations presented above are not only valid for services that actually have standby nodes or read replicas. Whenever the Aiven management platform needs to create a new node for a service, the node is first initialized from backup to most recent backed up state. This includes applying the full replication stream that has been created after the most recent full base backup. Once the latest backed
up state has been restored the node will connect to the current master (if available) and replicate latest state from that, which is also affected by possible replication slowness.
When new nodes are created, it needs to perform replication and having large tables without primary keys may make operations such as replacing failed nodes, upgrading service plan, migrating service to a different cloud provider or region, starting up new read replica service, forking a service, and some others to take extremely long time or depending on the situation practically not complete at all without manual operator intervention (e.g. new read replica might never be able to catch up with existing
master because replication is too slow).
To work around these issues Aiven operations people may need to resort to operations such as temporarily making master read only or promoting a replacement
server before it has fully applied the replication stream, resulting in data loss. To make the service operate correctly and avoid such drastic measures you should ensure the primary keys exist for any tables that are not trivially small.
Creating missing primary keys
To determine which tables are missing primary keys you can use the following:
tab.table_schema AS database_name,
tab.table_name AS table_name,
tab.table_rows AS table_rows
FROM information_schema.tables tab
LEFT JOIN information_schema.table_constraints tco
ON (tab.table_schema = tco.table_schema
AND tab.table_name = tco.table_name
AND tco.constraint_type = 'PRIMARY KEY')
tab.table_schema NOT IN ('mysql', 'information_schema', 'performance_schema', 'sys')
AND tco.constraint_type IS NULL
AND tab.table_type = 'BASE TABLE';
To see the exact table definition for the problematic tables you can do:
SHOW CREATE TABLE database_name.table_name;
If your table already contains a column or set of columns that can be used as primary key or composite key then using such column(s) is recommended.
E.g. if your table definition looks like
CREATE TABLE person (
social_security_number VARCHAR(30) NOT NULL,
CREATE TABLE team_membership (
user_id BIGINT NOT NULL,
team_id BIGINT NOT NULL
then you can create the primary key using statements like
ALTER TABLE person ADD PRIMARY KEY (social_security_number);
ALTER TABLE team_membership ADD PRIMARY KEY (user_id, team_id);
If none of the existing columns or combination of the existing columns can be used as primary key then you should add new separate id column.
ALTER TABLE mytable ADD id BIGINT PRIMARY KEY AUTO_INCREMENT;
Creating new tables without primary keys
Depending on when your MySQL service has been created, it may by default not allow creating new tables without primary keys. When creating tables without primary keys is not allowed, you will get the following error message:
Unable to create or change a table without a primary key, when the system variable 'sql_require_primary_key' is set. Add a primary key to the table or unset this variable to avoid this message. Note that tables without a primary key can cause performance problems in row-based replication, so please consult your DBA before changing this setting.
If creating tables without primary keys is prevented and the table that you're trying to create is known to be small you may override this setting and create the table anyway. There are two possible options:
- Run the following query:
SET SESSION sql_require_primary_key = 1;and then execute the
ALTER TABLEstatement again in the same session.
- Change the
mysql.sql_require_primary_keyadvanced configuration option from Aiven web UI to true. The setting can be found on the overview page of the affected service. It is only recommended to use this approach when the table is created by an external application and using the session variable is not an option. To prevent more problematic tables from being unexpected created in the future you should change the setting back to false once you have finished creating the tables without primary keys.