Artie Transfer

Setting up Pub/Sub

Overview

In this tutorial, we will learn how to run Debezium Server with the Pub/Sub sink and Artie Transfer locally using Docker.

Prerequisites

  • Terraform
  • Docker
  • gcloud CLI
  • GCP Project

Set-up

gcloud CLI

Download the gcloud CLI from https://cloud.google.com/sdk/docs/install. Once it is installed, run this command:
gcloud auth application-default login

Pub/Sub API

To use Pub/Sub in your GCP project, you also need to enable the Pub/Sub API. You can do so at https://console.cloud.google.com/marketplace/product/google/pubsub.googleapis.com.
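
As an alternative to the console, the API can also be enabled from the command line. The sketch below wraps `gcloud services enable` in a small helper; the function name `enable_pubsub_api` is hypothetical, and it assumes you have already authenticated with the gcloud CLI and have permission to enable services on the project.

```shell
# Hypothetical helper: enable the Pub/Sub API for one GCP project.
# Assumes the gcloud CLI is installed and authenticated.
enable_pubsub_api() {
  if [ -z "$1" ]; then
    echo "usage: enable_pubsub_api PROJECT_ID" >&2
    return 2
  fi
  gcloud services enable pubsub.googleapis.com --project "$1"
}
```

Call it as `enable_pubsub_api PROJECT_ID`, substituting your own project ID.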

Creating a service account

The Terraform configuration below creates a service account and grants it the Pub/Sub Editor role (roles/pubsub.editor) on your project. Replace PROJECT_ID with your GCP project ID.
locals {
  project = "PROJECT_ID"
  role    = "roles/pubsub.editor"
}

provider "google" {
  project = local.project
  // Authenticate via gcloud auth application-default login
  // This requires the gcloud CLI downloaded: https://cloud.google.com/sdk/docs/install
  // Need to enable PubSub API: https://console.cloud.google.com/marketplace/product/google/pubsub.googleapis.com
}

resource "google_service_account" "artie-svc-account" {
  account_id   = "artie-service-account"
  display_name = "Service Account for Artie Transfer and Debezium"
}

resource "google_project_iam_member" "transfer" {
  project = local.project
  role    = local.role
  member  = "serviceAccount:${google_service_account.artie-svc-account.email}"
}
$ terraform init # Download the required provider plugins.
$ terraform plan
$ terraform apply

Download the service account credentials

Once your service account has been created, head to the GCP console and create a key for it. Save the key file somewhere safe, since later steps reference its path.
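
If you prefer the command line to the console, a key can also be created with gcloud. This is a minimal sketch: the helper names `sa_email` and `create_sa_key` are hypothetical, the account name `artie-service-account` is the `account_id` from the Terraform above, and the email is assumed to follow the standard `ACCOUNT_ID@PROJECT_ID.iam.gserviceaccount.com` pattern.

```shell
# Build the service account email from an account ID and a project ID.
sa_email() {
  printf '%s@%s.iam.gserviceaccount.com\n' "$1" "$2"
}

# Hypothetical helper: create a JSON key for the tutorial's service account
# and write it to the given path. Assumes gcloud is installed and authenticated.
create_sa_key() {
  project_id="$1"
  key_path="$2"
  gcloud iam service-accounts keys create "$key_path" \
    --iam-account "$(sa_email artie-service-account "$project_id")" \
    --project "$project_id"
}
```

For example, `create_sa_key PROJECT_ID ./service-account.json` would write the key to the current directory.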

Create the Pub/Sub topic and subscriptions

Debezium will not automatically create topics or subscriptions for you, so create them with Terraform:
resource "google_pubsub_topic" "customer_topic" {
  name    = "dbserver1.inventory.customers"
  project = local.project

  timeouts {}
}

resource "google_pubsub_subscription" "customer_subscription" {
  ack_deadline_seconds         = 300
  enable_exactly_once_delivery = false
  enable_message_ordering      = true
  message_retention_duration   = "604800s"
  name                         = "transfer_${google_pubsub_topic.customer_topic.name}"
  project                      = local.project
  retain_acked_messages        = false
  topic                        = google_pubsub_topic.customer_topic.id

  timeouts {}
}
$ terraform plan
$ terraform apply

Running Debezium

Within the pubsub examples folder, edit application.properties and set debezium.sink.pubsub.project.id to your GCP project ID. If you need help locating your project ID, see "Getting your project identifier".
# Offset storage
debezium.source.offset.storage.file.filename=/tmp/foo
debezium.source.offset.flush.interval.ms=0
# Pubsub setup: https://debezium.io/documentation/reference/stable/operations/debezium-server.html#_google_cloud_pubsub
debezium.sink.type=pubsub
debezium.sink.pubsub.project.id=PROJECT_ID
debezium.sink.pubsub.ordering.enabled=true
# Postgres
debezium.source.connector.class=io.debezium.connector.postgresql.PostgresConnector
debezium.source.database.hostname=postgres
debezium.source.database.port=5432
debezium.source.database.user=postgres
debezium.source.database.password=postgres
debezium.source.database.dbname=postgres
debezium.source.topic.prefix=dbserver1
debezium.source.table.include.list=inventory.customers
debezium.source.plugin.name=pgoutput
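
The PROJECT_ID placeholder in the properties above can also be filled in with a short script instead of by hand. The sketch below is one way to do it; the function name `set_project_id` is hypothetical, and it assumes the file already contains a `debezium.sink.pubsub.project.id=` line.

```shell
# Hypothetical helper: rewrite the Pub/Sub project ID in a properties file.
# Writes back through cat so the file keeps its original permissions.
set_project_id() {
  props_file="$1"
  project_id="$2"
  tmp_file="$(mktemp)"
  sed "s|^debezium\.sink\.pubsub\.project\.id=.*|debezium.sink.pubsub.project.id=${project_id}|" \
    "$props_file" > "$tmp_file" &&
    cat "$tmp_file" > "$props_file"
  status=$?
  rm -f "$tmp_file"
  return "$status"
}
```

Run it as `set_project_id application.properties PROJECT_ID` with your own project ID. All other lines in the file are left unchanged.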

Running Transfer

Below is the config.yaml. With outputSource set to test, Transfer prints the queries it would run to the terminal instead of writing to a real destination. Make sure to set projectID to your own GCP project ID.
See the Options page for every available configuration setting, and the Examples page for more sample configurations.
outputSource: test
queue: pubsub

pubsub:
  projectID: artie-labs
  pathToCredentials: /tmp/credentials/service-account.json
  topicConfigs:
    - db: customers
      tableName: customers
      schema: public
      topic: "dbserver1.inventory.customers"
      cdcFormat: debezium.postgres.wal2json
      cdcKeyFormat: org.apache.kafka.connect.json.JsonConverter

telemetry:
  metrics:
    provider: datadog
    settings:
      tags:
        - env:production
      namespace: "transfer."
      addr: "127.0.0.1:8125"

Docker Compose File

Now, within the docker-compose.yaml file, replace each REPLACE_ME with the path to the credentials file you downloaded in the earlier step, "Download the service account credentials".
version: '3.9'

services:
  postgres:
    image: quay.io/debezium/example-postgres:2.0
    ports:
      - 5432:5432
    environment:
      - POSTGRES_USER=postgres
      - POSTGRES_PASSWORD=postgres

  debezium-server:
    image: quay.io/debezium/server:2.0
    container_name: debezium-server
    # Sleep to give the PostgreSQL service time to spin up.
    command: sh -c "sleep 15 && /debezium/run.sh"
    environment:
      GOOGLE_APPLICATION_CREDENTIALS: /tmp/credentials/service-account.json
    links:
      - postgres
    ports:
      - 8080:8080
    volumes:
      - ./application.properties:/debezium/conf/application.properties
      - REPLACE_ME:/tmp/credentials/service-account.json
    depends_on:
      - postgres

  transfer:
    build:
      context: .
      dockerfile: Dockerfile
    volumes:
      - REPLACE_ME:/tmp/credentials/service-account.json
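
Because the compose file will not work until every REPLACE_ME has been substituted, a quick check before starting the containers can save a confusing failure. The helper below is a sketch; `check_placeholders` is a hypothetical name, not part of Artie Transfer or Docker.

```shell
# Hypothetical helper: fail if a compose file still contains REPLACE_ME.
# Matching lines are printed to stderr with their line numbers.
check_placeholders() {
  compose_file="$1"
  if grep -n 'REPLACE_ME' "$compose_file" >&2; then
    echo "error: replace REPLACE_ME in $compose_file with the path to your key" >&2
    return 1
  fi
  echo "ok: no placeholders left in $compose_file"
}
```

One possible use is `check_placeholders docker-compose.yaml && docker compose up --build`, although the exact Docker invocation depends on your setup and Docker version.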

Putting everything together

When you start the containers, the PostgreSQL database already has some seeded data. As a result, you will see a merge statement being issued to add the seeded data.
Now that PostgreSQL is running locally on 0.0.0.0:5432, you can open a SQL editor to interact with the data model. For example, updating the first_name of a customer streams the change directly to Artie.
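
One way to make such an update without a SQL editor is the psql client. The sketch below assumes psql is installed on your machine, that the seeded inventory.customers table has a row with id 1001 (an assumption about the example image's seed data), and it uses the credentials from the compose file. The helper names `update_sql` and `run_update` are hypothetical.

```shell
# Build the UPDATE statement for a given first name and customer id.
# Values are interpolated directly, so only pass trusted input.
update_sql() {
  printf "UPDATE inventory.customers SET first_name = '%s' WHERE id = %s;\n" "$1" "$2"
}

# Hypothetical helper: run the update against the local example database.
run_update() {
  PGPASSWORD=postgres psql -h localhost -p 5432 -U postgres -d postgres \
    -c "$(update_sql "$1" "$2")"
}
```

Calling `run_update Artie 1001` should produce a new change event on the dbserver1.inventory.customers topic, which Transfer then picks up.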

Closing remarks

We hope you found this tutorial helpful.
  • The code for this tutorial can be found here.
  • To understand how Artie Transfer works with Google Pub/Sub under the hood, please click on this link.
  • If you run into any other issues, please file a bug report on our GitHub page or get in touch at [email protected].