PostgreSQL
In this section, we will go over how to gather your credentials and prepare your PostgreSQL to start providing CDC logs.
Finding your PostgreSQL settings
This is necessary so that we are able to run a Debezium deployment to read from your PostgreSQL server's replication slot and publish them onto Kafka. To see additional configuration properties, please click here to see Debezium's documentation.
We will need the following:
Name | Description | Default value |
---|---|---|
Host Name | IP address or hostname of your database server. | No default |
Port | Port for where your server is running. |
|
Username Password | Username and Password for authentication into your database. See below if you'd like to create a service account specific to Artie. | No default |
Database | The name of the database that you want to capture changes for. | No default. |
Creating a new user
Using Amazon RDS? RDS has its own internal permissioning model. Run this command instead of ALTER USER REPLICATION
!
GRANT rds_replication to username;
Granting access
Additional features
PostgreSQL 16
With PostgreSQL 16, comes the ability for consumers to use logical replication on a standby / replica server. Instead of providing us with your primary database, you can provide us with your standby instead.
One thing to note here is that the publications dbz_publication
must be created on the primary database.
Heartbeats
Once you selected Enable heartbeats
under Advanced Settings, you will then need to run the following command:
Customize PostgreSQL publications
By default, Artie will create a publication that includes all table changes. You can override this behavior by selecting Filtered
under the Deployment advanced settings.
If you change this to be filtered, this means that we will update the publications whenever tables get added or removed. Additionally, the service account must be the owner of the tables as this is a PostgreSQL requirement.
PostgreSQL Watcher
To set up your PostgreSQL database for CDC-based replication, you will need to enable replication slots. When this is done incorrectly, it could potentially cause a replication slot overflow and bring your production database down.
PostgreSQL Watcher provides additional guardrails around your database replication, and will do the following:
Regularly check and monitor your replication slot size in 15-minute intervals and notify if the slot exceeds a certain threshold.
Heartbeats verification. For folks that are leveraging Heartbeats, PostgreSQL Watcher will also check to make sure table permissions are updated and our service account has access to run Heartbeats. Watcher will notify you if the verification fails.
Terminate any idle queries that are lingering for more than a day. You can avoid having this problem by setting
idle_in_transaction_session_timeout
.
PostgreSQL Watcher is available to all Artie Cloud customers using PostgreSQL as a data source.
Supported types
BOOLEAN
/BOOL
BIT(1)
SMALLINT
,SMALLSERIAL
INTEGER
,SERIAL
BIGINT
,BIGSERIAL
REAL
DOUBLE PRECISION
NUMERIC[(M, [,D])]
DECIMAL[(M, [,D])]
MONEY
CHAR[(M)]
CHARACTER[(M)]
CHARACTER VARYING[(M)]
DATE
TIMESTAMP(M)
TIMESTAMP WITH TIME ZONE
TIME(M)
TIME WITH TIME ZONE
INTERVAL [P]
, we store this in microseconds.JSON
,JSONB
HSTORE
XML
UUID
CITEXT
INET
CIDR
MACADDR
MACADDR8
INT4RANGE
INT8RANGE
NUMRANGE
TSRANGE
TSTZRANGE
DATERANGE
ENUM
LTREE
PostGIS data type
(following GeoJSON feature spec)Latitude
Longitude
Point
Geometry
Geography
More coming soon!
Running it yourself
Self-hosted notes:
Debezium will automatically create a replication slot for you.
value.converter
must be set toorg.apache.kafka.connect.json.JsonConverter
value.converter.schemas.enable
must be set totrue
Last updated