14

I have a requirement of merging two databases (Database1 & Database2) to a common database in Postgresql.

Database1

Table1

Id - Value (Id is Primary key)

1 - a
2 - aa
3 - aaa
4 - aaaa

Database2

Table1

Id Value (Id is Primary key)

2 - bb
5 - bbbbb

I want my output as

OutPutDatabase

Table1

Id Value (Id is Primary key)

1 - a
2 - bb
3 - aaa
4 - aaaa
5 - bbbbb

How can I achieve this?

1
  • How are you deciding what the id should be in the merged database - does it matter? Are there foreign keys referencing these tables, and if so what do you want to do then? Commented Feb 29, 2012 at 13:12

4 Answers 4

8

First, load the tables into two separate schemas in the same database.

CREATE SCHEMA db1;
CREATE TABLE db1.table1 (id INT PRIMARY KEY, value VARCHAR(10));
-- load the contents of Table1 @ Database1 into db1.table1

CREATE SCHEMA db2;
CREATE TABLE db2.table1 (id INT PRIMARY KEY, value VARCHAR(10));
-- load the contents of Table1 @ Database2 into db2.table1

Then you can merge the two, by prioritizing db2 over db1.

SELECT
  DISTINCT ON (id)
  id,
  value
FROM (
  SELECT
    *,
    1 AS db
  FROM
    db1.table1

  UNION

  SELECT
    *,
    2 AS db
  FROM
    db2.table1) AS mix
ORDER BY
  id,
  db DESC;
Sign up to request clarification or add additional context in comments.

Comments

5

Not my idea, but one I have read in the past.

The source: Move table to new database

pg_dump -d db1 -t table1 |psql db2

then psql and do

insert into table2 (select * from table1);

Comments

2

I used the migrate_pkey_sequence() function defined here to migrate the primary keys of multiple tables (with the same schema but different values from different databases) to distinct ranges so that I could merge the tables.

This is the approximate sequence I followed to merge two databases, d1 and d2, each of which has a table example:

  1. Dump d1 and d2 to the respective files d1.sql.gz and d2.sql.gz. This is the command I used to dump:

    $ pg_dump --no-owner | gzip > d1.sql.gz
    
  2. Create a new, empty database on a local PostgreSQL server.

  3. Load d1.sql.gz.
  4. Migrate d1.example using:

    # SELECT migrate_pkey_sequence('example', 'id', 'example_id_seq', 1);
    INFO:  00000: UPDATE example SET id = nextval('example_id_seq') + 0
    INFO:  00000: ALTER SEQUENCE example_id_seq RESTART WITH 1
    INFO:  00000: UPDATE example SET id = DEFAULT
     migrate_pkey_sequence
    -----------------------
                         4
    (1 row)
    
  5. Record the value printed (4 in this example). That will be the start of the next sequence.

  6. Dump to the file d1-new.sql.gz in the same way as above.
  7. Repeat steps 2 through 4 with d2.sql.gz but use the value from step 5 as the argument to migrate_pkey_sequence():

    # SELECT migrate_pkey_sequence('example', 'id', 'example_id_seq', 4);
    
  8. Dump to the file d2-new-without-schema.sql.gz without saving the schema and disable triggers. This is the command I used:

    $ pg_dump --no-owner --data-only --disable-triggers | \
      gzip > d2-new-without-schema.sql.gz
    
  9. Create a new, empty database.

  10. Load d1-new.sql.gz and d2-new-without-schema.sql.gz in order. If everything worked as expected, the primary keys should be disjoint and not conflict.

  11. Dump the database if you want to load it into a remote server.

1 Comment

As an added note, for migrate_pkey_sequence to work properly with foreign keys also make sure you have the keys set as ON UPDATE CASCADE
0

I have been able to easily merge 2 postgres database using pg_dump & pg_restore.

In my case, it was easy as I use UUID identifiers, so there are no duplicated keys.

Export (data-only, excluding some schemas and tables):

pg_dump --dbname=database_1 --username=${POSTGRES_USER} --no-owner --format=tar --data-only --exclude-schema=hdb_views --exclude-schema=hdb_catalog --exclude-table=app_configuration --exclude-table=migrations --file=/database_1.data-only.dump

Import (changing role if necessary):

pg_restore --username=$POSTGRES_USER --role=$POSTGRES_USER --no-owner --exit-on-error --verbose --dbname=database_2 /database_1.data-only.dump

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.