Comment on page
Preventing WAL growth on Postgres running on AWS RDS
In this document, we will discuss how to prevent WAL growth for a Postgres database running on AWS RDS.
Last updated: 11/08/2023
WAL stands for Write-Ahead Logging, which is a method for Postgres to handle change data capture (CDC). Database record changes will be logged and stored within WAL to ensure data integrity.
The WAL also makes it accessible to downstream applications to subscribe to a replication slot and consume database CDC changes reliably.
WAL growth is a problem and can result in replication slot overflow. Replication slot overflow happens when the WAL accumulates and grows, consumes all your database’s memory, and causes your database to go down.
As Gunnar Morling covered extensively in his blog post here, AWS RDS periodically writes a heartbeat to a table within
rdsadminevery 5 minutes. AWS RDS has a default setting of 64MB for each WAL segment size, so each heartbeat takes up 64MB of memory. The purpose for AWS writing heartbeats is for various reasons, including to monitor the health of their databases.
An important note is that the heartbeats are written to a table within an internal database that is not observed by most CDC applications. For an active database, heartbeats don’t cause an issue because the WAL is almost constantly being drained as new CDC logs are processed. However, if you have a test database that has low traffic or an idle database, you will see your WAL accumulate by 64MB every 5 minutes, or 18.4GB per day! If left unchecked, this can cause replication slot overflow.
You only need to enable this feature if your database is low traffic or idle for long periods of time, which are primarily test databases. This feature is not necessary for active databases because the WAL growth will reset as soon as there are data changes from the table(s) you are observing.
- 1.To turn this on with Artie, first create a heartbeat_table within Postgres:
CREATE TABLE test_heartbeat_table (id text PRIMARY KEY, ts timestamp);
-- Then insert one row into this table.
-- Artie's periodic pings will be this:
-- UPDATE test_heartbeat_table set ts = now() where id = '1';
-- Such that we never end up adding additional rows.
INSERT INTO test_heartbeat_table (id, ts) VALUES (1, NOW());
Enable Heartbeatsunder Deployment Advanced Settings
- Does the heartbeats table (
test_heartbeat_table) actually exist?
- Is the heartbeats table included in your Postgres PUBLICATIONS?
- Does the service account have permissions to write to the table?
Are there any long-running queries that may prevent your replication slot from being advanced? Check if there are any long running queries by running this:
now() - pg_stat_activity.query_start AS duration,
WHERE (now() - pg_stat_activity.query_start) > interval '1 minute';
In addition to enabling heartbeats, it is best practice to set up the following:
- Monitoring your Amazon RDS instance for
-- See all replication slots
SELECT * FROM pg_replication_slots;
-- Drop replication slot
-- See the size of replication slot
pg_current_wal_lsn(), restart_lsn)) AS retained_wal,
restart_lsn FROM pg_replication_slots;
Last modified 18d ago