- PostgreSQL, like any database software, requires that certain tasks be performed regularly to achieve optimum performance. The tasks discussed here are required, but they are repetitive in nature and can easily be automated using standard tools such as cron scripts or Windows’ Task Scheduler. It is the database administrator’s responsibility to set up appropriate scripts, and to check that they execute successfully.
- PostgreSQL databases require periodic maintenance known as vacuuming. You might need to adjust the autovacuuming parameters described there to obtain best results for your situation. Some database administrators will want to supplement or replace the daemon’s activities with manually-managed VACUUM commands, which typically are executed according to a schedule by cron or Task Scheduler scripts. To set up manually-managed vacuuming
properly, it is essential to understand the issues discussed in the next few subsections. Administrators who rely on autovacuuming may still wish to skim this material to help them understand and adjust autovacuuming.
Backup and Restore
- Backup File Formats Three different backup file formats can be created by pgAdmin:
- Plain-Text Format. A plain-text script file containing SQL statements and commands that can be executed by the psql command line terminal program to recreate the database objects and load the table data. Use the psql program to restore from a plain-text backup file.
- Custom Archive Format. A binary file that allows for restoration of all or only selected database objects from the backup file. Use pgAdmin to restore from a custom archive backup file.
- Tar Archive Format. A tar archive file that allows for restoration of all or only selected database objects from the backup file. Use pgAdmin to restore from a tar archive backup file.
Re-Indexing
- REINDEX is similar to a drop and recreate of the index in that the index contents are rebuilt from scratch. However, the locking considerations are rather different. REINDEX locks out writes but not reads of the index’s parent table. It also takes an exclusive lock on the specific index being processed, which will block reads that attempt to use that index. In contrast, DROP INDEX momentarily takes an exclusive lock on the parent table, blocking both writes and reads. The subsequent CREATE INDEX locks out writes but not reads; since the index is not there, no read will attempt to use it, meaning that there will be no blocking but reads might be forced into expensive sequential scans.
- One way to do this is to shut down the server and start a single-user PostgreSQL server with the -P option included on its command line. Then, REINDEX DATABASE, REINDEX SYSTEM, REINDEX TABLE, or REINDEX INDEX can be issued, depending on how much you want to reconstruct. If in doubt, use REINDEX SYSTEM to select reconstruction of all system indexes in the database. Then quit the single-user server session and restart the regular server.
- Alternatively, a regular server session can be started with -P included in its command line options. The method for doing this varies across clients, but in all libpq-based clients, it is possible to set the PGOPTIONS environment variable to -Pbefore starting the client. Note that while this method does not require locking out other clients, it might still be wise to prevent other users from connecting to the damaged database until repairs have been completed.
Rebuild a single index:
REINDEX INDEX my_index;
Rebuild all the indexes on the table my_table:
REINDEX TABLE my_table;
Vacuum
- VACUUM reclaims storage occupied by dead tuples. In normal PostgreSQL operation, tuples that are deleted or obsoleted by an update are not physically removed from their table; they remain present until a VACUUM is done. Therefore it’s necessary to do VACUUM periodically, especially on frequently-updated tables.
- With no parameter, VACUUM processes every table in the current database that the current user has permission to vacuum. With a parameter, VACUUM processes only that table.
- VACUUM ANALYZE performs a VACUUM and then an ANALYZE for each selected table. This is a handy combination form for routine maintenance scripts.
- Plain VACUUM (without FULL) simply reclaims space and makes it available for re-use. This form of the command can operate in parallel
with normal reading and writing of the table, as an exclusive lock is not obtained. However, extra space is not returned to the operating
system (in most cases); it’s just kept available for re-use within the same table. VACUUM FULL rewrites the entire contents of the table
into a new disk file with no extra space, allowing unused space to be returned to the operating system. This form is much slower and
requires an exclusive lock on each table while it is being processed. - To recover or reuse disk space occupied by updated or deleted rows.
- To update data statistics used by the PostgreSQL query planner.
- To protect against loss of very old data due to transaction ID wraparound.
- Each of these reasons dictates performing VACUUM operations of varying frequency and scope, as explained in the following subsections.
- There are two variants of VACUUM:Standard VACUUM and VACUUM FULL.
- VACUUM FULL can reclaim more disk space but runs much more slowly. Also, the standard form of VACUUM can run in parallel with production
database operations. (Commands such as SELECT, INSERT, UPDATE, and DELETE will continue to function normally, though you will not be able to modify the definition of a table with commands such as ALTER TABLE while it is being vacuumed.) - VACUUM FULLrequires exclusive lock on the table it
is working on, and therefore cannot be done in parallel with
other use of the table. Generally, therefore, administrators
should strive to use standard VACUUM and avoid VACUUM FULL.