Options

This page describes the configuration settings available to Artie Transfer.

Below are the various options that can be specified within a configuration file. Once the file has been created, you can run Artie Transfer like this:

/transfer -c /path/to/config.yaml

Note: Keys here are formatted in dot notation for readability purposes. Please ensure that the proper nesting is used when writing these into your configuration file. To see sample configuration files, visit the Examples page.
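For example, a key written on this page as kafka.bootstrapServer corresponds to the following nesting in the configuration file:

```yaml
kafka:
  bootstrapServer: localhost:9092
```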

Kafka

kafka:
  bootstrapServer: localhost:9092,localhost:9093
  groupID: transfer
  username: artie
  password: transfer
  enableAWSMSKIAM: false

bootstrapServer

Pass in the Kafka bootstrap server. As a best practice, pass in a comma-separated list of bootstrap servers to maintain high availability. This follows the same format as Kafka's bootstrap.servers setting. Type: String Optional: No

groupID

This is the name of the Kafka consumer group. You can set this to whatever you'd like; just remember that offsets are associated with a particular consumer group. Type: String Optional: No

username + password

If you'd like to use SASL authentication, pass in the username and password. Type: String Optional: Yes

enableAWSMSKIAM

Turn this on if you would like to use IAM authentication to communicate with Amazon MSK. If you enable this, make sure to pass in AWS_REGION, AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY. Type: Boolean Optional: Yes
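A minimal sketch of an MSK IAM setup; the broker address is illustrative:

```yaml
kafka:
  # Illustrative MSK broker endpoint; replace with your cluster's bootstrap servers.
  bootstrapServer: b-1.example-cluster.kafka.us-east-1.amazonaws.com:9098
  groupID: transfer
  enableAWSMSKIAM: true
# AWS_REGION, AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY must be set
# in the environment where Artie Transfer runs.
```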

Topic Configs

TopicConfigs are used at the table level and store configurations like:

  • Destination's database, schema and table name.

  • What does the data format look like? Is there an idempotent key?

  • Whether it should do row-based soft deletion or not.

  • Whether it should drop deleted columns or not.

Topic configs are nested under either kafka or pubsub, as shown below. See Examples for more details.

kafka:
  topicConfigs:
  - { }
  - { }
# OR as
pubsub:
  topicConfigs:
  - { }
  - { }
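A filled-in topic config might look like the sketch below. The field names and values here are illustrative assumptions based on the bullets above; refer to the Examples page for the exact schema.

```yaml
kafka:
  topicConfigs:
    # Hypothetical example: field names are illustrative, not authoritative.
  - db: analytics            # destination database
    schema: public           # destination schema
    tableName: customers     # destination table name
    idempotentKey: updated_at
    softDelete: true         # row-based soft deletion
    dropDeletedColumns: false
```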

BigQuery Partition Settings

This object is stored under a Topic Config.

bigQueryPartitionSettings:
  partitionType: time
  partitionField: ts
  partitionBy: daily

partitionType

Type of partitioning. We currently support only time-based partitioning, so the only valid value is time. Type: String Optional: Yes

partitionField

Which field or column is being partitioned on. Type: String Optional: Yes

partitionBy

This is used for time partitioning to specify the time granularity. The only valid value right now is daily. Type: String Optional: Yes

Google Pub/Sub

pubsub:
  projectID: 123
  pathToCredentials: /path/to/pubsub.json
  topicConfigs:
  - { }

projectID

The Google Cloud project ID for your Pub/Sub topics. Type: String Optional: No
pathToCredentials

This is the path to the credentials for the service account to use. You can re-use the same credentials as BigQuery, or you can use a different service account to support use cases of cross-account transfers. Type: String Optional: No

topicConfigs

Follow the same convention as kafka.topicConfigs above.

BigQuery

Shared Transfer config

sharedTransferConfig:
  additionalDateFormats:
    - 02/01/06 # DD/MM/YY
    - 02/01/2006 # DD/MM/YYYY
  createAllColumnsIfAvailable: true

additionalDateFormats

By default, Artie Transfer supports a wide array of date formats. If your layout is not supported, you can specify additional ones here. If you're unsure, please refer to this guide. Type: List of layouts Optional: Yes

createAllColumnsIfAvailable

By default, Artie Transfer will only create a column in the destination if the column contains a non-null value. You can override this behavior by setting this value to true. Type: Boolean Optional: Yes

Snowflake

Please see: Snowflake on how to gather these values.

Redshift

S3

s3:
  optionalPrefix: foo # Files will be saved under s3://artie-transfer/foo/...
  bucket: artie-transfer
  awsAccessKeyID: AWS_ACCESS_KEY_ID
  awsSecretAccessKey: AWS_SECRET_ACCESS_KEY

optionalPrefix

Prefix after the bucket name. If this is specified, Artie Transfer will save the files under s3://artie-transfer/optionalPrefix/... Type: String Optional: Yes

bucket

S3 bucket name. Example: foo. Type: String Optional: No

awsAccessKeyID

The AWS_ACCESS_KEY_ID for the service account. Type: String Optional: No

awsSecretAccessKey

The AWS_SECRET_ACCESS_KEY for the service account. Type: String Optional: No

Telemetry

Overview of Telemetry can be found here: Overview.
