How We Upgraded PostgreSQL Database Version with Near Zero Downtime

Hüseyin Demir · Published in Trendyol Tech · Jan 6, 2022 · 5 min read


Hi all, together with my colleague Ezgi, we completed a PostgreSQL database upgrade with almost no downtime. That's why we wanted to explain the PostgreSQL version upgrade process in detail. In the rest of the article, we will also describe the gains we achieved along the way.

Before we start, some information about the environment we work in: we upgraded a database system from PostgreSQL version 11 to PostgreSQL version 13, and our database size was around 1 TB.

Upgrade Plan and Scenario

Upgrade operations on a PostgreSQL database system always require awareness of many details. Therefore, the possible upgrade scenarios need to be identified and analyzed beforehand. As part of this process, we worked on 3 different plans, described below.

  1. Export/Import (Dump/Restore) Solution: In this approach, the PostgreSQL database is backed up with pg_dump and restored into a new environment; a minimal sketch follows this list. Given our database size, this did not look like a practical plan: the database cannot accept updates during the dump/restore process, which means the system is effectively unavailable for the whole duration. However, this method is one of the most suitable plans when the database system can be kept offline.
  2. Standby Solution: In this approach, we would first upgrade a PostgreSQL server that we call a replica, and then start a restore to the new environment from that server. Its biggest disadvantage was that it was the most difficult and complex scenario to carry out, so we did not choose it. In addition, it would have forced us to give up the other improvements we wanted to make alongside the database upgrade.
  3. Logical Replication Solution: This was the scenario that best suited our plans and that we could implement without any service interruption. Because logical replication keeps the source database fully online while it streams changes, it let us complete the migration while the business kept running. For these reasons, we chose to work with logical replication.
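For reference, the dump/restore plan would look roughly like the commands below. The host names and the parallel job count are illustrative placeholders, not values from our environment:

# Take a compressed, custom-format dump from the old (v11) server
pg_dump -Fc -d database_name -h old_db_host -f database_name.dump

# Restore into the new (v13) server; -j runs the restore with parallel workers
pg_restore -d database_name -h new_db_host -j 4 database_name.dump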

With this upgrade, we also achieved the following gains.

  • We started using connection pooling, which we did not have in the old database system, thanks to pgbouncer; a minimal configuration sketch follows this list.
  • We revised the configurations carried over from the old database system and optimized all of them.
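Our actual pgbouncer setup is outside the scope of this article, but as an illustration, a minimal pgbouncer.ini for transaction-level pooling might look like the following. All names and values here are placeholders, not our production settings:

; pgbouncer.ini - illustrative example only
[databases]
database_name = host=127.0.0.1 port=5432 dbname=database_name

[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
; release server connections back to the pool between transactions
pool_mode = transaction
; server-side connections per database/user pair
default_pool_size = 25
; maximum client connections pgbouncer will accept
max_client_conn = 200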
(Figure: Complete upgrade and migration plan)

Upgrade Operation

In this article, we'll assume that PostgreSQL installation and configuration are already done, so we'll focus on the important migration steps. First, we have to dump our database users and roles to a file on the operating system. After that, we need an SQL file that contains the schema information (DDL). To collect the required files, you can use the pg_dumpall and pg_dump CLI commands as follows:

# Dump cluster-wide roles and users from the old server
pg_dumpall --roles-only -f roles.sql
# Dump only the schema (DDL), without data, for the database being migrated
pg_dump --schema-only -f schema.sql -d database_name

After this step, you need to run these SQL files against the new database server, in the following order:

psql -U dba -d database_name -h ip_addr -p port -f roles.sql
psql -U dba -d database_name -h ip_addr -p port -f schema.sql

After these operations, we will use the logical replication feature provided by PostgreSQL. Using it requires steps on both the source and target database systems. First of all, a publication should be created on the source database. Afterwards, a subscription should be created on the target database using the source database's connection information. The example below shows the commands that create the publication and the subscription.

-- Run on source system
CREATE PUBLICATION publication_name_created_on_source FOR ALL TABLES;

-- Run on destination system
CREATE SUBSCRIPTION gib_new_pgs
CONNECTION 'host=source_db_ip port=source_db_port user=source_db_user dbname=source_db_name'
PUBLICATION publication_name_created_on_source;
(Figure: Destination database/subscription)
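One prerequisite worth noting: for the publication and subscription to work, the source server must be configured for logical decoding. These are the standard PostgreSQL parameters involved; all three require a server restart to take effect, and the values shown are illustrative:

-- Run on the source system before creating the publication
ALTER SYSTEM SET wal_level = 'logical';
-- Each subscription consumes a replication slot and a WAL sender
ALTER SYSTEM SET max_replication_slots = 10;
ALTER SYSTEM SET max_wal_senders = 10;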

After the subscription object is created on the target system, we get output like the one shown above. On the other hand, if we look at the source system, we will see that a new replication slot has been created there.

The following SQL command can be used to check the replication slots on the source system.

SELECT slot_name,
       slot_type,
       active
FROM pg_replication_slots;
(Figure: Source database/replication slots)

At this point, we had to wait for a while, depending on the data size, for the data to be synchronized between the two databases. These operations continued completely online, with replication running between the two databases the whole time.
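On the subscriber side, PostgreSQL tracks the initial synchronization state of every table in the pg_subscription_rel catalog, so a query like the one below can show how far the copy has progressed ('i' = initializing, 'd' = copying data, 's' = synchronized, 'r' = ready/streaming):

-- Run on the destination system
SELECT srrelid::regclass AS table_name,
       srsubstate
FROM pg_subscription_rel;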

(Figure: inserts/sec graph on the new database)

The graph above shows the rate of records written to the new database through logical replication. Logical replication copies the tables in the system to the target system in a certain order; when that initial copy is finished, it applies changes made on the source system to the target system as they happen. In this way, near real-time replication is achieved.
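Before cutting over, it is useful to confirm that the target has caught up. On the source system, the lag of a logical slot can be estimated in bytes of WAL using the standard pg_replication_slots view and pg_wal_lsn_diff function:

-- Run on the source system
SELECT slot_name,
       pg_wal_lsn_diff(pg_current_wal_lsn(), confirmed_flush_lsn) AS lag_bytes
FROM pg_replication_slots
WHERE slot_type = 'logical';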

When the database replication was completed and the new database went into use, the following changes took effect in the system. A summary of these changes and gains is as follows.

  • With pgbouncer, the number of server connections in the system is 25, down from around 200 in the previous system. We handle the same workload with roughly 85% fewer connections.
  • Memory consumption on the PostgreSQL database servers used to average around 25 GB. With the configurations we applied in the new system, it has come down to around 10 GB, a decrease of about 60%.
  • CPU consumption decreased by 50%.
  • We migrated from PostgreSQL 11 to PostgreSQL 13.
(Figure: Memory consumption before/after)
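The exact parameters we tuned are not listed here, but memory behaviour in PostgreSQL is driven mostly by a small set of settings, shown below with purely illustrative values (these are not our production numbers):

# postgresql.conf - illustrative values only
shared_buffers = 4GB            # shared cache for data pages
work_mem = 16MB                 # memory per sort/hash operation
maintenance_work_mem = 512MB    # VACUUM, CREATE INDEX, etc.
effective_cache_size = 12GB     # planner's estimate of total cache available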
