Validating Database Migration from MySQL to PostgreSQL
When a company realizes need of more powerful yet still free database management system, it may consider migration from MySQL to PostgreSQL.After database migration is completed, it is very important to verify that all database objects have been processed properly. This article explores main steps of the validation procedure (validation means to compare the corresponding entries in source and destination databases).
First step is to recognize what kind of objects must be validatedafter migration is completed:
- Table definitions
- Data
- Indexes
- Foreign keys
- Views
Table Definitions
In MySQL table definition can be displayed by running SQL query:DESC table_name
PostgreSQL exposes table definition by running the statement:\d table_name
MySQL table definition is migrated properly once every column hasproper type, size, NULL check and default value in the target PostgreSQL table. The tablesbelow illustratecorrect type mapping for MySQL to PostgreSQL migration.
MySQL | PostgreSQL |
BIGINT | BIGINT |
BINARY(n) | BYTEA |
BIT | BOOLEAN |
CHAR(n), CHARACTER(n) | CHAR(n), CHARACTER(n) |
DATE | DATE |
DATETIME | TIMESTAMP [WITHOUT TIME ZONE] |
DECIMAL(p,s), DEC(p,s) | DECIMAL(p,s), DEC(p,s) |
DOUBLE | DOUBLE PRECISION |
FLOAT | REAL |
INT, INTEGER | INT, INTEGER |
MEDIUMINT | INTEGER |
NUMERIC(p,s) | NUMERIC(p,s) |
SMALLINT | SMALLINT |
TINYBLOB, BLOB, MEDIUMBLOB, LONGBLOB | BYTEA |
TINYINT | SMALLINT |
TINYTEXT, TEXT, MEDIUMTEXT, LONGTEXT | TEXT |
TIME | TIME [WITHOUT TIME ZONE] |
TIMESTAMP | TIMESTAMP [WITHOUT TIME ZONE] |
VARBINARY(n) | BYTEA |
VARCHAR(n) | VARCHAR(n) |
MySQL provides attribute ‘auto_increment’ for integer-like columns, that increases the value of the field automatically each time when new row is inserted. PostgreSQL uses SERIAL for the same purpose:
MySQL | PostgreSQL |
BIGINT AUTO_INCREMENT | BIGSERIAL |
INTEGER AUTO_INCREMENT | SERIAL |
SMALLINT AUTO_INCREMENT | SMALLSERIAL |
TINYINT AUTO_INCREMENT | SMALLSERIAL |
InMySQLall integer types (tinyint, smallint, int, bigint) can have UNSIGNED attribute. Unsigned specification increases upper range of acceptable values by forcing to take positive numbers only. Since PostgreSQL does not support unsigned numeric types, those types must be converted as follows:
MySQL | PostgreSQL |
BIGINT UNSIGNED | NUMERIC(20) |
INT UNSIGNED | BIGINT |
MEDIUMINT UNSIGNED | INTEGER |
SMALLINT UNSIGNED | INTEGER |
TINYINT UNSIGNED | INTEGER |
Let us illustrate migration of CREATE TABLE statement from MySQL to PostgreSQL by following example. Assume, there is MySQL table defined as:
CREATE TABLE tbl1(
id int NOT NULL auto_increment,
added DATETIME,
data BLOB,
length INT UNSIGNED,
PRIMARY KEY(id)
);
PostgreSQL equivalent of this table definition is:
CREATE TABLE tbl1(
id SERIAL NOT NULL,
added TIMESTAMP,
data BYTEA,
length BIGINT,
PRIMARY KEY(id)
);
Data
Validation of migrated data begins with verifying MySQL and PostgreSQL tables have the same number of records. Both database management systems allow to obtain number of rows in a table through the following query:
SELECT COUNT(*) FROM table_name
Next verification that must be done is visual comparison of arandomdata fragmentsin MySQL and Postgres tables. MySQL allows to explore fragment of data as follows:
SELECT * FROM table_name LIMIT start_record, number_of_records
PostgreSQLsupports similar syntax of SELECT-query:
SELECT * FROM table_name LIMIT number_of_records OFFSET start_record
Indexes
Validation of migrated indexes includes comparison count of indexes in MySQL and PostgreSQL tables and checking that source and target indexes are the same.
MySQL allows to list indexes as follows: SHOW INDEXES FROM table_name;
PostgreSQL displays information about indexes at the end of table definition that is output of the statement: \d table_name
Foreign Keys
Verifying migrated foreign keys is very similar to indexes.
MySQL exposes information about foreign keys at the end of table definition that is output of the statement:SHOW CREATE TABLE `table name`
PostgreSQL can fetch foreign keys from service table “information_schema” through the following query:
SELECT
t_c.constraint_name, t_c.table_name, k_c_u.column_name,
c_c_u.table_name AS foreign_table_name,
c_c_u.column_name AS foreign_column_name
FROM
information_schema.table_constraints AS t_c
JOIN information_schema.key_column_usage AS k_c_u
ON t_c.constraint_name = k_c_u.constraint_name
JOIN information_schema.constraint_column_usage AS c_c_u
ON c_c_u.constraint_name = t_c.constraint_name
WHERE constraint_type = ‘FOREIGN KEY’ AND t_c.table_name=’table_name’;
Views
Verifying migrated views is the most complicated part of overall validation,since it requires comparing CREATE VIEW statement of every view in MySQL and PostgreSQL databases with respect todifferences between SQL dialects of these two DBMS.
MySQL exposes list of all views in the database using the query:
SELECT table_name, view_definition FROM information_schema.views WHERE table_schema=’database_name’;
PostgreSQL can do the same via the query:
select viewname, definition from pg_catalog.pg_views where schemaname NOT IN (‘pg_catalog’, ‘information_schema’)
Syntax of queries in MySQL and PostgreSQL is distinguished due to different sets of built-in functions.Therefore, while migrating views every MySQL embedded function must be replaced by PostgreSQL equivalent as it is specified in table below.
MySQL | PostgreSQL |
ADDATE($date, $interval) | ($date + $interval)::date |
CURTIME() | CURRENT_TIME |
DAY($a) or DAYOFMONTH($a) | EXTRACT(day from date($a))::integer |
DATEDIFF($a, $b) | $a – $b |
DATE_FORMAT | TO_CHAR with proper format mask |
HOUR($a) | EXTRACT(hour FROM $a)::int |
IFNULL($a,$b) | COALESCE($a,$b) |
INSTR($a, $b) or LOCATE ($a,$b) | POSITION($b in $a) |
ISNULL($a) | $a IS NULL |
LCASE ($a) | LOWER($a) |
LEFT($str, $len) | SUBSTRING($1 FROM 1 FOR $2) |
MID($str, $pos, $len) | SUBSTRING($1 FROM $2 FOR $3) |
MINUTE($a) | EXTRACT(minute FROM $1)::int |
MONTH($a) | EXTRACT(month FROM $1)::int |
ORD($a) | ASCII($a) |
RAND() | RANDOM() |
RIGHT($str, $len) | SUBSTRING($1 FROM length($1) – $2 FOR $2) |
SYSDATE() | CURRENT_DATE |
UCASE($a) | UPPER($a) |
UNIX_TIMESTAMP() | EXTRACT(epoch FROM current_timestamp) |
UNIX_TIMESTAMP($ts) | EXTRACT(epoch FROM $ts) |
UTC_DATE() | CAST(NOW() at time zone ‘utc’ AS date) |
UTC_TIME() | TIMEZONE(‘UTC’, NOW()) |
WEEK($a) | EXTRACT(week FROM ($a))::int |
YEAR($a) | EXTRACT(year from $1) |
Some of MySQL built-in functions that cannot be converted into PostgreSQL by simple text replace. The workaround for such cases is to create those missing functions in the destination database:
CREATE OR REPLACE FUNCTION _group_concat(text, text)
RETURNS text AS $$
BEGIN
SELECT CASE
WHEN $2 IS NULL THEN $1
WHEN $1 IS NULL THEN $2
ELSE $1 operator(pg_catalog.||) ‘,’ operator(pg_catalog.||) $2
END
END;
$$ LANGUAGE plpgsql;
DROP AGGREGATE IF EXISTS group_concat(text);
CREATE AGGREGATE group_concat (
BASETYPE = text,
SFUNC = _group_concat,
STYPE = text
);
CREATE OR REPLACE FUNCTION makedate(year int, dayofyear int)
RETURNS date AS $$
BEGIN
SELECT (date ‘0001-01-01’ + ($1 – 1) * interval ‘1 year’ + ($2 – 1) * interval ‘1 day’):: date
END;
$$ LANGUAGE plpgsql;
CREATE OR REPLACE FUNCTION maketime(int, int, double precision)
RETURNS time AS $$
SELECT time ’00:00:00′ + $1 * interval ‘1 hour’ + $2 * interval ‘1 min’
+ $3 * interval ‘1 sec’
END;
$$ LANGUAGE plpgsql;
CREATE OR REPLACE FUNCTION strcmp(text, text)
RETURNS int AS $$
BEGIN
SELECT CASE WHEN $1 < $2 THEN -1
WHEN $1 > $2 THEN 1
ELSE 0 END;
END;
$$ LANGUAGE plpgsql;
More articles about MySQL to PostgreSQL migration can be found at: https://www.convert-in.com/docs/sql2pgs/contents.htm