Another quick tip about databases, this time about throwing them around between development, staging and production.

This is easiest to explain by example.

At gomore.dk we facilitate ride sharing between our members. Say I get a support email from a member, explaining that a ride she just created isn’t showing up in the search results.

I first impersonate her on the production site, take a look at the ride she created and verify that it is indeed not present in search results. Noone else seems to be having this problem and I can’t see anything unusual about the ride.

Clearly I want to reproduce this locally.

script/pull_prod_db

My development database is now an exact clone of production, so unless something really nasty is going on, I’ll be able to reproduce and debug the issue on my laptop.

If for some reason I want to try out a fix on our staging server, getting the data there is easy as well.

script/push_staging_db

The scripts

Our database is not huge (we probably have about 500k rows total) so these scripts take only a few seconds to run. Here they are:

script/pull_prod_db

ssh root@gomore.dk "sudo -u gomore pg_dump -xO -F t gomore_production | gzip" | \
  gzip -d | pg_restore -v -c --no-acl -O -U gomore -d gomore_development

script/push_staging_db

pg_dump -xO -F t gomore_development | gzip | \
  ssh root@<staging_server> "gunzip | sudo -u gomore pg_restore -v -c --no-acl -O -U gomore -d gomore_staging"

As you can see, these scripts work by dumping the DB on one server, gzipping it on the fly, transferring the bits to another server via ssh, gunzipping them there and finally restoring the DB.

I’m not going to go into detail about the basic operation, but be sure to understand it or read up on the constructs used here, because the combination of ssh and pipes is extremely versatile and useful.

A few notes about the options given to the Postgres tools:

  • Common
    • -v Verbose mode
  • pg_dump
    • -x Skips privileges (GRANT commands)
    • -O Skips ownership of the dumped objects. This means that the DB owner becomes the owner of the objects on the receiving side.
    • -F t Sets the output format to “tar”. This allows me to use option -c when invoking pg_restore.
  • pg_restore
    • -c Delete database objects before recreating them
    • --no-acl Equivalent to -x, not strictly necessary here
    • -U gomore Connect to the Postgres server as the gomore user
    • -d db_name Specify the destination database name

Note that pg_restore reads from stdin because I didn’t specify a filename argument.

Thanks to Jørgen Erichsen for showing me very similar scripts a few years ago.

Skipping tables

If you have huge tables in your DB, it can be useful to skip them with -T:

pg_dump ... -T huge_table_name