PostgreSQL

PostgreSQL Hot-Standby Database Replication Tutorial

This article is part of our Academy Course titled PostgreSQL Database Tutorial.

In this course, we provide a compilation of PostgreSQL tutorials that will help you set up and run your own database management system. We cover a wide range of topics, from installation and configuration, to custom commands and datatypes. With our straightforward tutorials, you will be able to get your own projects up and running in minimum time. Check it out here!

In the previous articles of this series, we have learned several PostgreSQL database management skills. So far, those skills have only included working with one database on one machine only. Today we will explain how to set up database replication, with a master machine and a slave one, in order to provide redundancy.

In particular, we will use an approach known as hot standby, that allows to run read-only queries on a replicated database (residing on a separate machine) while the main one is under maintenance. Hot Standby is available in PostgreSQL starting with version 9.0.

Given the nature of the topic at hand, we will need an extra Ubuntu server. We will call it “slave” and will change its hostname to ubuntu-slave and set its IP address to 192.168.0.55. The master server (192.168.0.54 – the one we have been using until now) will be renamed to ubuntu-master. We will begin this article by outlining step by step the prerequisites for the setup, so you should not run into any significant issues.

If you’re using a VirtualBox-based VM to follow along with this series, you can easily clone it as we explained in Cloning, exporting, importing, and removing virtual machines in VirtualBox (don’t forget to check the Reinitialize the MAC address of all network cards box). Otherwise, you may need to install an Ubuntu 16.04 server instance from scratch.

1. Step 0 – Change hostnames and IP addresses as needed

In the master machine, do:

hostnamectl set-hostname ubuntu-master

Next, edit /etc/hosts as follows:

127.0.0.1    ubuntu-master
192.168.0.55    ubuntu-slave

In the slave machine:

hostnamectl set-hostname ubuntu-slave

Then edit /etc/network/interfaces and make sure the configuration for enp0s3 (the main NIC) looks as follows:

iface enp0s3 inet static
address 192.168.0.55
netmask 255.255.255.0
gateway 192.168.0.1
dns-nameservers 8.8.8.8 8.8.4.4

Next, edit /etc/hosts as follows:

127.0.0.1        ubuntu-slave
192.168.0.54    ubuntu-master

Finally, restart the network service on both machines

systemctl restart networking

and logout, then log back in to apply changes.

Now we’re ready to talk.

2. Step 1 – Configuring the master

To begin, we will create a dedicated user (repuser in this case) and we will limit the number of simultaneous connections to 1. Enter the psql command prompt and do:

CREATE USER repuser REPLICATION LOGIN CONNECTION LIMIT 1 ENCRYPTED PASSWORD 'rep4scg';

In /etc/postgresql/9.5/main/postgresql.conf, make sure the following settings and values are included:

listen_addresses = 'localhost,192.168.0.54'
wal_level = 'hot_standby'
max_wal_senders = 1
hot_standby = on

and in /etc/postgresql/9.5/main/pg_hba.conf:

hostssl    replication repuser  192.168.0.55  md5

Next, switch to user postgres, generate a public key and copy it to the slave. This will allow the master to replicate automatically to the slave:

ssh-keygen -t rsa
ssh-copy-id 192.168.0.55

When prompted to enter the password for user postgres in the slave machine, do so before proceeding.

Now restart the database service:

sudo systemctl restart postgresql

3. Step 2 – Configuring the slave

Make sure the database service is stopped before proceeding. Otherwise, you’re in for a nasty database corruption in a few moments.

systemctl stop postgresql

Then edit /etc/postgresql/9.5/main/postgresql.conf and make sure the following settings / values are included:

listen_addresses = 'localhost,192.168.0.55'
wal_level = 'hot_standby'
max_wal_senders = 1
hot_standby = on

Finally, add the following line to /etc/postgresql/9.5/main/pg_hba.conf:

hostssl    replication repuser  192.168.0.54  md5

4. Step 3 – Performing the replication

This step consists in two sub-steps:

3a- In the master, run the following commands to start an initial backup of all databases currently residing in our server, excluding the logs (you can choose a different backup identification instead of Initial backup). If the destination files exist, they will be updated in place; additionally, the data will be compressed during the transfer – this may come in handy if you have several large databases (otherwise, feel free to ignore the -z option of rsync).

psql -c "select pg_start_backup('Initial backup');"
rsync -cvaz --inplace --exclude=*pg_xlog* /var/lib/postgresql/9.5/main/ 192.168.0.55:/var/lib/postgresql/9.5/main/
psql -c "select pg_stop_backup();"

3b- In the slave, create a .conf file with the connection info to the master server. We will name it recovery.conf and save it in /var/lib/postgresql/9.5/main:

standby_mode = 'on'
primary_conninfo = 'host=192.168.0.54 port=5432 user=repuser password=rep4scg'

where user and password need to match the credentials created at the beginning of Step 1.

Now we can proceed to start the database server in the slave:

sudo systemctl start postgresql

and test the replication in the next step.

5. Step 4 – Testing the replication

In the master, we will switch to user postgres and execute a simple query to SELECT and then update a record from the city table in the World_db database. At the same time, we will query that same record in the slave before and after performing the UPDATE in the master. Refer to Fig. 1 for more details:

sudo -i -u postgres
psql
\c World_db;

then

SELECT name, countrycode, population FROM city WHERE name='Brisbane';
UPDATE city SET population=1402568 WHERE name='Brisbane';
Database replication in action
Figure 1 – Database replication in action

As you can see, the slave was updated automatically the population for Brisbane was changed in the master. If you attempt to perform an UPDATE from the slave, it will fail with the following error: “ERROR:  cannot execute UPDATE in a read-only transaction.”

6. Troubleshooting

If the database service refuses to start successfully, you will not be able to run psql in the Linux command line. In that case, you will have to troubleshoot using the following resources:

systemctl -l status postgresql@9.5-main
journalctl -xe
tail -f /var/log/postgresql/postgresql-9.5-main.log

That’s all folks! You should have a PostgreSQL hot standby replication in place.

Gabriel Canepa

Gabriel Canepa is a Linux Foundation Certified System Administrator (LFCS-1500-0576-0100) and web developer from Villa Mercedes, San Luis, Argentina. He works for a worldwide leading consumer product company and takes great pleasure in using FOSS tools to increase productivity in all areas of his daily work. When he's not typing commands or writing code or articles, he enjoys telling bedtime stories with his wife to his two little daughters and playing with them, the great pleasure of his life.
Subscribe
Notify of
guest

This site uses Akismet to reduce spam. Learn how your comment data is processed.

1 Comment
Oldest
Newest Most Voted
Inline Feedbacks
View all comments
Alexey
Alexey
7 years ago

Topic is poorly disclosed.
What else types of sync exists, what its advantages and disadvantages?

What happen if master node is down?
What happen if slave node is down?

How to update failed node to master state after startup?

I hope these themes will be discussed in comments or future topics.

Thanks.

Back to top button