Monday , April 19 2021

How To Backup and Restore PostgreSQL Database using pg_dump and psql

In this article, we will backup and restore the PostgreSQL database using pg_dump.

What is pg_dump?

pg_dump is a utility used to back up the Postgresql database.

The dump files backed up by pg_dump are completely consistent because the dump is a snapshot of the data when pg_dump started running.
pg_dump does not prevent other users from accessing the database (read or write).
A normal user can run pg_dump if he or she has been granted authority over a database.

pg_dump only backs up one database. Pg_dumpall is used to back up the entire cluster or to back up common objects (such as roles and tablespace) that are common to all databases in the cluster.

pg_dump file types

Dump files can be exported in script or archive file formats.

Scripted dump files are plain text files that contain SQL commands required to recovery the database to the state it was in when it was saved. Using the script created in this way; it is sufficient to run the generated sql file with psql or another tool to restore the database to its current state . The relevant database must have been created before this command is run, otherwise the objects are created in the postgres database by default. Due to its ease of use, it can be used to restore the database in architecture and machine changes.

If we do not want to create the database manually, we must create backups in the archive file format (custom, directory, tar) with pg_dump. However, in this case, we need to do the restore with the pg_restore tool instead of psql.

pg_restore is selective about what is being restored. It even allows you to rearrange items before they are restored, so we have a more flexible method.

pg_dump syntax

The syntax of the pg_dump command is as follows;

We specify the host of the database we want to backup with the -h parameter and the port with the -p parameter. The default server is the local server or whatever is specified in the PGHOST variable. Likewise, the value of the default port is the value specified in the PGPORT variable or takes the default value compiled.

Like other PostgreSQL client applications, by default pg_dump will connect to the database username that is equal to the current OS user’s name. To do this with a different user, we can use either the -U parameter or the PGUSER variable.

PostgreSQL pg_dump Examples

Backup Postgres Database psql

We can generate the SQL dump file of the database named mydb with several different syntaxes.

Restore Postgres Database From sql file

We can restore the create dump file as another database named newdb.

PostgreSQL pg_dump custom format

The dump of mydb database is created in custom file format with the following command. The backup file produced in this way will be much smaller than the sql file because in the background the file is compressed with zlib.

Restore PostgreSQL Database From custom file format dump

PostgreSQL pg_dump directory format

The dump of mydb database is created in directory format with the following command.

Parallel pg_dump backups

Using the command below, we can create a dump file with directory format in parallel.

NOTE: You can only use this option with a directory format. Because it is the only format that multiple processes can write their data at the same time.

Restore PostgreSQL Database From directory file

The command that can be used to restore the directory file.

PostgreSQL pg_dump tar format

The command below creates the dump file of the mydb database in tar file format.

Restore Postgres Database From tar file

We can restore the backup taken as tar with the following command.

PostgreSQL Backup Specific Tables

If we want to back up only the specified table, we can use the following command.

PostgreSQL Backup Tables Starting With x

If we want to back up the tables whose names starting with test_table, we can use the following command.

If we want to back up the tables whose names starting with test_table but not test_table_mustafa, we can use the following command.

To backup all schemas whose names start with east or west and end with gsm, and to exclude all schemas whose names contain the word test, we use the following;
NOTE: This command contains the schema itself as well as all the objects it contains.

Normally, no matter how we write the table name uppercase or lowercase, pg_dump assumes it as small. But if we want the value we write to be taken into account exactly, we need to write it in double quotes (“). Of course, double quotes are a special character for the shell, so we have to act like an escape charecter as follows.

The following command is used to restore directly from one database to another. Of course, in this case, the database named newdb must be in the specified cluster.

If we want to get data only, not metadata, the command is as follows.

PostgreSQL pg_dump compression

The following command is used to take a compressed backup with pg_dump.

PostgreSQL pg_dump split

If you want to split the dump output into multiple files of certain sizes, the command is as follows.

PostgreSQL restore split dump file

PostgreSQL pg_dump ON_ERROR_STOP

By default; If an error occurs while restoring, psql will ignore the error and continue to run. We can customize this case by setting the ON_ERROR_STOP variable. After this command, the codes will run until the line where the error occurred, but subsequent lines will not be processed.

PostgreSQL pg_dump single-transaction

Alternatively, we can specify that the restore must be completed in a single operation, that is, it will either be completely completed or completely rolled back. If we want it to take no action in the event of an error, we can do it this way.

PostgreSQL pg_dumpall

pg_dump can back up only one database at a time, not roles or tablespaces. Because these are information covering more than one database rather than a database. We use pg_dumpall to properly backup all the contents of a cluster. pg_dumpall creates a backup of each database in a given cluster and also a backup of data such as cluster-wide role and tablespace definitions.

When restoring with pg_dumpall, superuser access is always needed to restore role and tablespace information. While doing these operations, if we are using tablespace, we need to make sure that the paths of these tablespaces also exist.
pg_dumpall sends commands to create roles, tables, and empty databases, then runs pg_dump for each database. That is, each database will be consistent within itself, but other databases will not be synchronized instantly.

Mustafa Bektaş Tepe

About Mustafa Bektaş Tepe

Leave a Reply

Your email address will not be published. Required fields are marked *