MOLT Replicator

On this page

MOLT Replicator continuously replicates changes from a source database to CockroachDB as part of a database migration. It supports migrations from a source database to CockroachDB with minimal downtime, and enables backfill from CockroachDB to your source database for failback scenarios to preserve a rollback option during a migration window.

MOLT Replicator consumes change data from PostgreSQL logical replication streams, MySQL GTID-based replication, Oracle LogMiner, and CockroachDB changefeeds (for failback). For details, refer to How it works.

Terminology

Checkpoint: The position in the source database's transaction log from which replication begins or resumes: LSN (PostgreSQL), GTID (MySQL), or SCN (Oracle).
Staging database: A CockroachDB database used by Replicator to store replication metadata, checkpoints, and buffered mutations. Specified with --stagingSchema and automatically created with --stagingCreateSchema. For details, refer to Staging database.
Forward replication: Replicate changes from a source database (PostgreSQL, MySQL, or Oracle) to CockroachDB during a migration. For usage details, refer to Forward replication with initial load.
Failback: Replicate changes from CockroachDB back to the source database. Used for migration rollback or to maintain data consistency on the source during migration. For usage details, refer to Failback to source database.

Prerequisites

Supported databases

MOLT Replicator supports the following source and target databases:

PostgreSQL 11-16
MySQL 5.7, 8.0 and later
Oracle Database 19c (Enterprise Edition) and 21c (Express Edition)
CockroachDB (all currently supported versions)

Database configuration

The source database must be configured for replication:

Database	Configuration Requirements	Details
PostgreSQL source	Enable logical replication by setting `wal_level = logical`.	Configure PostgreSQL for replication
MySQL source	Enable global transaction identifiers (GTID) and configure binary logging. Set `binlog-row-metadata` or `binlog-row-image` to `full`. Configure sufficient binlog retention for migration duration.	Configure MySQL for replication
Oracle source	Install Oracle Instant Client. Enable `ARCHIVELOG` mode, supplemental logging for primary keys, and `FORCE LOGGING`. Create sentinel table (`_replicator_sentinel`) in source schema. Grant and verify LogMiner privileges.	Configure Oracle for replication
CockroachDB source (failback)	Enable rangefeeds (`kv.rangefeed.enabled = true`) (CockroachDB self-hosted clusters only).	Configure CockroachDB for replication

User permissions

The SQL user running MOLT Replicator requires specific privileges on both the source and target databases:

Database	Required Privileges	Details
PostgreSQL source	`SUPERUSER` role (recommended), or the following granular permissions: `CREATE` and `SELECT` on database and tables to replicate. Table ownership for adding tables to publications. `LOGIN` and `REPLICATION` privileges to create replication slots and access replication data.	Create PostgreSQL migration user
MySQL source	`SELECT` on tables to replicate. `REPLICATION SLAVE` and `REPLICATION CLIENT` privileges for binlog access. For `--fetchMetadata`, either `SELECT` on the source database or `PROCESS` globally.	Create MySQL migration user
Oracle source	`SELECT`, `INSERT`, `UPDATE` on `_replicator_sentinel` table. `SELECT` on `V$` views (`V$LOG`, `V$LOGFILE`, `V$LOGMNR_CONTENTS`, `V$ARCHIVED_LOG`, `V$LOG_HISTORY`). `SELECT` on `SYS.V$LOGMNR_*` views (`SYS.V$LOGMNR_DICTIONARY`, `SYS.V$LOGMNR_LOGS`, `SYS.V$LOGMNR_PARAMETERS`, `SYS.V$LOGMNR_SESSION`). `LOGMINING` privilege. `EXECUTE` on `DBMS_LOGMNR`. For Oracle Multitenant, the user must be a common user (prefixed with `C##`) with privileges granted on both CDB and PDB.	Create Oracle migration user Create sentinel table Grant LogMiner privileges
CockroachDB target (forward replication)	`ALL` on target database. `CREATE` on schema. `SELECT`, `INSERT`, `UPDATE`, `DELETE` on target tables. `CREATEDB` privilege for creating staging schema.	Create CockroachDB user
PostgreSQL, MySQL, or Oracle target (failback)	`SELECT`, `INSERT`, `UPDATE` on tables to fail back to. For Oracle, `FLASHBACK` is also required.	Grant PostgreSQL user permissions Grant MySQL user permissions Grant Oracle user permissions

Installation

To install MOLT, download the binary that matches your architecture and source database:

Operating System	Architecture	PostgreSQL/MySQL	Oracle (Preview)
Windows	AMD 64-bit	Download	N/A
	ARM 64-bit	Download	N/A
Linux	AMD 64-bit	Download	Download
	ARM 64-bit	Download	N/A
Mac	AMD 64-bit	Download	N/A
	ARM 64-bit	Download	N/A

The following binaries are included:

molt
replicator

Tip:

For ease of use, keep both molt and replicator in your current working directory.

To display the current version of each binary, run molt --version and replicator --version.

For previous binaries, refer to the MOLT version manifest. For release details, refer to the MOLT changelog.

Docker images

MOLT Replicator

Docker images for MOLT Replicator are also available as a standalone binary:

docker pull cockroachdb/replicator

To pull a specific version (for example, v1.1.1):

docker pull cockroachdb/replicator:v1.1.1

Docker usage

Performance

MOLT Fetch, Verify, and Replicator are likely to run more slowly in a Docker container than on a local machine. To improve performance, increase the memory or compute resources, or both, on your Docker container.

Local connection strings

When testing locally, specify the host as follows:

For macOS, use host.docker.internal. For example:

--sourceConn 'postgres://postgres:postgres@host.docker.internal:5432/molt?sslmode=disable'
--targetConn "postgres://root@host.docker.internal:26257/molt?sslmode=disable"

For Linux and Windows, use 172.17.0.1. For example:

--sourceConn 'postgres://postgres:postgres@172.17.0.1:5432/molt?sslmode=disable'
--targetConn "postgres://root@172.17.0.1:26257/molt?sslmode=disable"

How it works

MOLT Replicator supports forward replication from PostgreSQL, MySQL, and Oracle, and failback from CockroachDB:

PostgreSQL source (pglogical): MOLT Replicator uses PostgreSQL logical replication, which is based on publications and replication slots. You create a publication for the target tables, and a slot marks consistent replication points. MOLT Replicator consumes this logical feed directly and applies the data in sorted batches to the target.
MySQL source (mylogical): MOLT Replicator relies on MySQL GTID-based replication to read change data from MySQL binlogs. It works with MySQL versions that support GTID-based replication and applies transactionally consistent feeds to the target. Binlog features that do not use GTIDs are not supported.
Oracle source (oraclelogminer): MOLT Replicator uses Oracle LogMiner to capture change data from Oracle redo logs. Both Oracle Multitenant (CDB/PDB) and single-tenant Oracle architectures are supported. Replicator periodically queries LogMiner-populated views and processes transactional data in ascending SCN windows for reliable throughput while maintaining consistency.
Failback from CockroachDB (start): MOLT Replicator acts as an HTTP webhook sink for a single CockroachDB changefeed. Replicator receives mutations from source cluster nodes, can optionally buffer them in a CockroachDB staging cluster, and then applies time-ordered transactional batches to the target database. Mutations are applied as UPSERT or DELETE statements while respecting foreign-key and table dependencies.

Consistency modes

MOLT Replicator supports three consistency modes for balancing throughput and transactional guarantees:

Consistent (failback mode only, default for CockroachDB sources): Preserves per-row order and source transaction atomicity. Concurrent transactions are controlled by --parallelism.
BestEffort (failback mode only): Relaxes atomicity across tables that do not have foreign key constraints between them (maintains coherence within FK-connected groups). Enable with --bestEffortOnly or allow auto-entry via --bestEffortWindow set to a positive duration (such as 1s).

Note:

For independent tables (with no foreign key constraints), BestEffort mode applies changes immediately as they arrive, without waiting for the resolved timestamp. This provides higher throughput for tables that have no relationships with other tables.
Immediate (default for PostgreSQL, MySQL, and Oracle sources): Applies updates as they arrive to Replicator with no buffering or waiting for resolved timestamps. For CockroachDB sources, provides highest throughput but requires no foreign keys on the target schema.

Commands

MOLT Replicator provides the following commands:

Command	Description
`pglogical`	Replicate from PostgreSQL source to CockroachDB target using logical replication.
`mylogical`	Replicate from MySQL source to CockroachDB target using GTID-based replication.
`oraclelogminer`	Replicate from Oracle source to CockroachDB target using Oracle LogMiner.
`start`	Replicate from CockroachDB source to PostgreSQL, MySQL, or Oracle target (failback mode). Requires a CockroachDB changefeed with rangefeeds enabled.
`make-jwt`	Generate JWT tokens for authorizing changefeed connections in failback scenarios. Supports signing tokens with RSA or EC keys, or generating claims for external JWT providers. For details, refer to JWT authentication.
`version`	Display version information and Go module dependencies with checksums. For details, refer to Supply chain security.

For command-specific flags and examples, refer to Usage and Common workflows.

Flags

Global flags

Flag	Type	Description
`--applyTimeout`	`DURATION`	The maximum amount of time to wait for an update to be applied. Default: `30s`
`--dlqTableName`	`IDENT`	The name of a table in the target schema for storing dead-letter entries. Default: `replicator_dlq`
`--enableParallelApplies`	`BOOL`	Enable parallel application of independent table groups during replication. By default, applies are synchronous. When enabled, this increases throughput at the cost of higher target pool usage and memory usage. Default: `false`
`--flushPeriod`	`DURATION`	Flush queued mutations after this duration. Default: `1s`
`--flushSize`	`INT`	Ideal batch size to determine when to flush mutations. Default: `1000`
`--gracePeriod`	`DURATION`	Allow background processes to exit. Default: `30s`
`--logDestination`	`STRING`	Write logs to a file. If not specified, write logs to `stdout`.
`--logFormat`	`STRING`	Choose log output format: `"fluent"`, `"text"`. Default: `"text"`
`--maxRetries`	`INT`	Maximum number of times to retry a failed mutation on the target (for example, due to contention or a temporary unique constraint violation) before treating it as a hard failure. Default: `10`
`--metricsAddr`	`STRING`	A `host:port` on which to serve metrics and diagnostics. The metrics endpoint is `http://{host}:{port}/_/varz`.
`--parallelism`	`INT`	The number of concurrent database transactions to use. Default: `16`
`--quiescentPeriod`	`DURATION`	How often to retry deferred mutations. Default: `10s`
`--retireOffset`	`DURATION`	How long to delay removal of applied mutations. Default: `24h0m0s`
`--retryInitialBackoff`	`DURATION`	Initial delay before the first retry attempt when applying a mutation to the target database fails due to a retryable error, such as contention or a temporary unique constraint violation. Default: `25ms`
`--retryMaxBackoff`	`DURATION`	Maximum delay between retry attempts when applying mutations to the target database fails due to retryable errors. Default: `2s`
`--retryMultiplier`	`INT`	Multiplier that controls how quickly the backoff interval increases between successive retries of failed applies to the target database. Default: `2`
`--scanSize`	`INT`	The number of rows to retrieve from the staging database used to store metadata for replication. Default: `10000`
`--schemaRefresh`	`DURATION`	How often a watcher will refresh its schema. If this value is zero or negative, refresh behavior will be disabled. Default: `1m0s`
`--sourceConn`	`STRING`	The source database's connection string. When replicating from Oracle, this is the connection string of the Oracle container database (CDB). Refer to Oracle replication flags.
`--stageDisableCreateTableReaderIndex`	`BOOL`	Disable the creation of partial covering indexes to improve read performance on staging tables. Set to `true` if creating indexes on existing tables would cause a significant operational impact. Default: `false`
`--stageMarkAppliedLimit`	`INT`	Limit the number of mutations to be marked applied in a single statement. Default: `100000`
`--stageSanityCheckPeriod`	`DURATION`	How often to validate staging table apply order (`-1` to disable). Default: `10m0s`
`--stageSanityCheckWindow`	`DURATION`	How far back to look when validating staging table apply order. Default: `1h0m0s`
`--stageUnappliedPeriod`	`DURATION`	How often to report the number of unapplied mutations in staging tables (`-1` to disable). Default: `1m0s`
`--stagingConn`	`STRING`	The staging database's connection string.
`--stagingCreateSchema`		Automatically create the staging schema if it does not exist.
`--stagingIdleTime`	`DURATION`	Maximum lifetime of an idle connection. Default: `1m0s`
`--stagingJitterTime`	`DURATION`	The time over which to jitter database pool disconnections. Default: `15s`
`--stagingMaxLifetime`	`DURATION`	The maximum lifetime of a database connection. Default: `5m0s`
`--stagingMaxPoolSize`	`INT`	The maximum number of staging database connections. Default: `128`
`--stagingSchema`	`STRING`	Name of the CockroachDB schema that stores replication metadata. Required each time `replicator` is rerun after being interrupted, as the schema contains a checkpoint table that enables replication to resume from the correct transaction. Default: `_replicator.public`
`--targetApplyQueueSize`	`INT`	Size of the apply queue that buffers mutations before they are written to the target database. Larger values can improve throughput, but increase memory usage. This flag applies only to CockroachDB and PostgreSQL (`pglogical`) sources, and replaces the deprecated `--copierChannel` and `--stageCopierChannelSize` flags.
`--targetConn`	`STRING`	The target database's connection string.
`--targetIdleTime`	`DURATION`	Maximum lifetime of an idle connection. Default: `1m0s`
`--targetJitterTime`	`DURATION`	The time over which to jitter database pool disconnections. Default: `15s`
`--targetMaxLifetime`	`DURATION`	The maximum lifetime of a database connection. Default: `5m0s`
`--targetMaxPoolSize`	`INT`	The maximum number of target database connections. Default: `128`
`--targetSchema`	`STRING`	The SQL database schema in the target cluster to update.
`--targetStatementCacheSize`	`INT`	The maximum number of prepared statements to retain. Default: `128`
`--taskGracePeriod`	`DURATION`	How long to allow for task cleanup when recovering from errors. Default: `1m0s`
`--timestampLimit`	`INT`	The maximum number of source timestamps to coalesce into a target transaction. Default: `1000`
`--userscript`	`STRING`	The path to a TypeScript configuration script. For example, `--userscript 'script.ts'`.
`-v`, `--verbose`	`COUNT`	Increase logging verbosity. Use `-v` for `debug` logging or `-vv` for `trace` logging.

`pglogical` replication flags

The following flags are used when replicating from a PostgreSQL source database.

Flag	Type	Description
`--publicationName`	`STRING`	The publication within the source database to replicate.
`--slotName`	`STRING`	The replication slot in the source database. Default: `"replicator"`
`--standbyTimeout`	`DURATION`	How often to report WAL progress to the source server. Default: `5s`

`mylogical` replication flags

The following flags are used when replicating from a MySQL source database.

Flag	Type	Description
`--defaultGTIDSet`	`STRING`	Default GTID set, in the format `source_uuid:min(interval_start)-max(interval_end)`. Required the first time `replicator` is run, as the GTID set provides a replication marker for streaming changes.
`--fetchMetadata`		Fetch column metadata explicitly, for older versions of MySQL that do not support `binlog_row_metadata`.
`--replicationProcessID`	`UINT32`	The replication process ID to report to the source database. Default: `10`

`oraclelogminer` replication flags

The following flags are used when replicating from an Oracle source database.

Flag	Type	Description
`--sourceSchema`	`STRING`	Required. Source schema name on Oracle where tables will be replicated from.
`--scn`	`INT`	The snapshot System Change Number (SCN) from the initial data load. Required the first time `replicator` is run, as the SCN provides a replication marker for streaming changes.
`--backfillFromSCN`	`INT`	The SCN of the earliest active transaction at the time of the initial snapshot. Ensures no transactions are skipped when starting replication from Oracle.
`--sourcePDBConn`	`STRING`	Connection string for the Oracle pluggable database (PDB). Only required when using an Oracle multitenant configuration. `--sourceConn` must be included.
`--oracle-application-users`	`STRING`	List of Oracle usernames responsible for DML transactions in the PDB schema. Enables replication from the latest-possible starting point. Usernames are case-sensitive and must match the internal Oracle usernames (e.g., `PDB_USER`).

`start` failback flags

The following flags are used for failback from CockroachDB.

Flag	Type	Description
`--assumeIdempotent`		Disable the extra staging table queries that debounce non-idempotent redelivery in changefeeds.
`--bestEffortOnly`		Eventually-consistent mode; useful for high-throughput, skew-tolerant schemas with foreign keys.
`--bestEffortWindow`	`DURATION`	Use an eventually-consistent mode for initial backfill or when replication is behind; `0` to disable. Default: `1h0m0s`
`--bindAddr`	`STRING`	The network address to bind to. Default: `":26258"`
`--disableAuthentication`		Disable authentication of incoming Replicator requests; not recommended for production.
`--enableCheckpointStream`		Enable checkpoint streaming (use an internal changefeed from the staging table for real-time updates), rather than checkpoint polling (query the staging table for periodic updates), for failback replication. Default: `false` (use checkpoint polling)
`--discard`		Dangerous: Discard all incoming HTTP requests; useful for changefeed throughput testing. Not intended for production.
`--discardDelay`	`DURATION`	Adds additional delay in discard mode; useful for gauging the impact of changefeed round-trip time (RTT).
`--healthCheckTimeout`	`DURATION`	The timeout for the health check endpoint. Default: `5s`
`--httpResponseTimeout`	`DURATION`	The maximum amount of time to allow an HTTP handler to execute. Default: `2m0s`
`--immediate`		Bypass staging tables and write directly to target; recommended only for KV-style workloads with no foreign keys.
`--limitLookahead`	`INT`	Limit number of checkpoints to be considered when computing the resolving range; may cause replication to stall completely if older mutations cannot be applied.
`--ndjsonBufferSize`	`INT`	The maximum amount of data to buffer while reading a single line of `ndjson` input; increase when source cluster has large blob values. Default: `65536`
`--tlsCertificate`	`STRING`	A path to a PEM-encoded TLS certificate chain.
`--tlsPrivateKey`	`STRING`	A path to a PEM-encoded TLS private key.
`--tlsSelfSigned`		If true, generate a self-signed TLS certificate valid for `localhost`.

`make-jwt` flags

The following flags are used with the make-jwt command to generate JWT tokens for changefeed authentication.

Flag	Type	Description
`-a`, `--allow`	`STRING`	One or more `database.schema` identifiers. Can be repeated for multiple schemas.
`--claim`		If `true`, print a minimal JWT claim instead of signing.
`-k`, `--key`	`STRING`	The path to a PEM-encoded private key to sign the token with.
`-o`, `--out`	`STRING`	A file to write the token to.

Usage

Replicator commands

MOLT Replicator provides four commands for different replication scenarios. For detailed workflows, refer to Common workflows.

Use pglogical to replicate from PostgreSQL to CockroachDB:

replicator pglogical

Use mylogical to replicate from MySQL to CockroachDB:

replicator mylogical

Use oraclelogminer to replicate from Oracle to CockroachDB:

replicator oraclelogminer

Use start to replicate from CockroachDB to PostgreSQL, MySQL, or Oracle (failback):

replicator start

Source connection strings

Tip:

Follow the security recommendations in Connection security and credentials.

--sourceConn specifies the connection string of the source database for forward replication.

Note:

The source connection string must point to the primary instance of the source database. Replicas cannot provide the necessary replication checkpoints and transaction metadata required for ongoing replication.

PostgreSQL connection string:

--sourceConn 'postgresql://{username}:{password}@{host}:{port}/{database}'

MySQL connection string:

--sourceConn 'mysql://{username}:{password}@{protocol}({host}:{port})/{database}'

Oracle connection string:

--sourceConn 'oracle://{username}:{password}@{host}:{port}/{service_name}'

For Oracle Multitenant databases, --sourcePDBConn specifies the pluggable database (PDB) connection. --sourceConn specifies the container database (CDB):

--sourceConn 'oracle://{username}:{password}@{host}:{port}/{cdb_service_name}'
--sourcePDBConn 'oracle://{username}:{password}@{host}:{port}/{pdb_service_name}'

For failback, --stagingConn specifies the CockroachDB connection string:

--stagingConn 'postgresql://{username}:{password}@{host}:{port}/{database}'

Target connection strings

--targetConn specifies the connection string of the target CockroachDB database for forward replication:

--targetConn 'postgresql://{username}:{password}@{host}:{port}/{database}'

Note:

For failback, --targetConn specifies the original source database (PostgreSQL, MySQL, or Oracle). For details, refer to Failback to source database.

Replication checkpoints

MOLT Replicator requires a checkpoint value to start replication from the correct position in the source database's transaction log.

For PostgreSQL, use --slotName to specify the replication slot created during the data load. The slot automatically tracks the LSN (Log Sequence Number):

--slotName molt_slot

For MySQL, use --defaultGTIDSet with the GTID set from the MOLT Fetch output:

--defaultGTIDSet '4c658ae6-e8ad-11ef-8449-0242ac140006:1-29'

For Oracle, use --scn and --backfillFromSCN with the SCN values from the MOLT Fetch output:

--scn 26685786
--backfillFromSCN 26685444

Staging database

The staging database stores replication metadata, checkpoints, and buffered mutations. Specify the staging database with --stagingSchema and create it automatically with --stagingCreateSchema:

--stagingSchema _replicator
--stagingCreateSchema

The staging database is used to:

Store checkpoints that enable resuming from the correct point after interruptions.
Buffer mutations before applying them to the target in transaction order.
Maintain consistency for time-ordered transactional batches while respecting table dependencies.
Provide restart capabilities after failures.

Security

Cockroach Labs strongly recommends the following:

Connection security and credentials

To keep your database credentials out of shell history and logs, follow these best practices when specifying your source and target connection strings:
- Avoid plaintext connection strings.
- Provide your connection strings as environment variables. For example:
```
export SOURCE="postgres://migration_user:a%2452%26@localhost:5432/molt?sslmode=verify-full"
export TARGET="postgres://root@localhost:26257/molt?sslmode=verify-full"
```
  Afterward, reference the environment variables in MOLT commands:
```
--sourceConn $SOURCE
--targetConn $TARGET
```
- If possible, use an external secrets manager to load the environment variables from stored secrets.
Use TLS-enabled connection strings to encrypt data in transit from MOLT to the database. When using TLS certificates, ensure certificate files are accessible to the MOLT binary on the same machine.

For example, a PostgreSQL connection string with TLS certificates:
```
postgresql://migration_user@db.example.com:5432/appdb?sslmode=verify-full&sslrootcert=/etc/molt/certs/ca.pem&sslcert=/etc/molt/certs/client.crt&sslkey=/etc/molt/certs/client.key
```
URL-encode connection strings for the source database and CockroachDB so special characters in passwords are handled correctly.
- Given a password a$52&, pass it to the molt escape-password command with single quotes:
```
molt escape-password --password 'a$52&'
```
  Use the encoded password in your connection string. For example:
```
postgres://migration_user:a%2452%26@localhost:5432/replicationload
```
Remove sslmode=disable from production connection strings.

CockroachDB changefeed security

For failback scenarios, secure the connection from CockroachDB to MOLT Replicator using TLS certificates. Generate TLS certificates using self-signed certificates, certificate authorities like Let's Encrypt, or your organization's certificate management system.

TLS from CockroachDB to Replicator

Configure MOLT Replicator with server certificates using the --tlsCertificate and --tlsPrivateKey flags to specify the certificate and private key file paths. For example:

replicator start \
--tlsCertificate ./certs/server.crt \
--tlsPrivateKey ./certs/server.key \
...

These server certificates must correspond to the client certificates specified in the changefeed webhook URL to ensure proper TLS handshake.

Encode client certificates for changefeed webhook URLs:

Webhook URLs: Use both URL encoding and base64 encoding: base64 -i ./client.crt | jq -R -r '@uri'
Non-webhook contexts: Use base64 encoding only: base64 -w 0 ca.cert

JWT authentication

You can use JSON Web Tokens (JWT) to authorize incoming changefeed connections and restrict writes to a subset of SQL databases or user-defined schemas in the target cluster.

Replicator supports JWT claims that allow writes to specific databases, schemas, or all of them. JWT tokens must be signed using RSA or EC keys. HMAC and None signatures are automatically rejected.

To configure JWT authentication:

Add PEM-formatted public signing keys to the _replicator.jwt_public_keys table in the staging database.
To revoke a specific token, add its jti value to the _replicator.jwt_revoked_ids table in the staging database.

The Replicator process re-reads these tables every minute to pick up changes.

To pass the JWT token from the changefeed to the Replicator webhook sink, use the webhook_auth_header option:

CREATE CHANGEFEED ... WITH webhook_auth_header='Bearer <encoded_token>';

Token quickstart

The following example uses OpenSSL to generate keys, but any PEM-encoded RSA or EC keys will work.

# Generate an EC private key using OpenSSL.
openssl ecparam -out ec.key -genkey -name prime256v1

# Write the public key components to a separate file.
openssl ec -in ec.key -pubout -out ec.pub

# Upload the public key for all instances of Replicator to find it.
cockroach sql -e "INSERT INTO _replicator.jwt_public_keys (public_key) VALUES ('$(cat ec.pub)')"

# Reload configuration, or wait one minute.
killall -HUP replicator

# Generate a token which can write to the ycsb.public schema.
# The key can be decoded using the debugger at https://jwt.io.
# Add the contents of out.jwt to the CREATE CHANGEFEED command:
# WITH webhook_auth_header='Bearer {out.jwt}'
replicator make-jwt -k ec.key -a ycsb.public -o out.jwt

External JWT providers

The make-jwt command also supports a --claim flag, which prints a JWT claim that can be signed by your existing JWT provider. The PEM-formatted public key or keys for that provider must be inserted into the _replicator.jwt_public_keys table. The iss (issuers) and jti (token id) fields will likely be specific to your auth provider, but the custom claim must be retained in its entirety.

You can repeat the -a flag to create a claim for multiple schemas:

replicator make-jwt -a 'database.schema' --claim

{
  "iss": "replicator",
  "jti": "d5ffa211-8d54-424b-819a-bc19af9202a5",
  "https://github.com/cockroachdb/replicator": {
    "schemas": [
      [
        "database",
        "schema"
      ]
    ]
  }
}

Note:

For details on the make-jwt command flags, refer to make-jwt flags.

Production considerations

Avoid --disableAuthentication and --tlsSelfSigned flags in production environments. These flags should only be used for testing or development purposes.

Supply chain security

Use the version command to verify the integrity of your MOLT Replicator build and identify potential upstream vulnerabilities.

replicator version

The output includes:

Module name
go.mod checksum
Version

Use this information to determine if your build may be subject to vulnerabilities from upstream packages. Cockroach Labs uses Dependabot to automatically upgrade Go modules, and the team regularly merges Dependabot updates to address security issues.

Common workflows

Forward replication with initial load

To start replication after an initial data load with MOLT Fetch, use the pglogical command:

replicator pglogical

To start replication after an initial data load with MOLT Fetch, use the mylogical command:

replicator mylogical

To start replication after an initial data load with MOLT Fetch, use the oraclelogminer command:

replicator oraclelogminer

Specify the source and target database connections. For connection string formats, refer to Source connection strings and Target connection strings:

--sourceConn $SOURCE
--targetConn $TARGET

For Oracle Multitenant databases, also specify the PDB connection:

--sourcePDBConn $SOURCE_PDB

Specify the source Oracle schema to replicate from:

--sourceSchema migration_schema

To replicate from the correct position, specify the appropriate checkpoint value.

Use --slotName to specify the slot created during the data load, which automatically tracks the LSN (Log Sequence Number) checkpoint:

--slotName molt_slot

Use --defaultGTIDSet from the cdc_cursor field in the MOLT Fetch output:

--defaultGTIDSet '4c658ae6-e8ad-11ef-8449-0242ac140006:1-29'

Use the --scn and --backfillFromSCN values from the MOLT Fetch output:

--scn 26685786
--backfillFromSCN 26685444

Use --stagingSchema to specify the staging database. Use --stagingCreateSchema to create it automatically on first run:

--stagingSchema _replicator
--stagingCreateSchema

At minimum, the replicator command should include the following flags:

replicator pglogical \
--sourceConn $SOURCE \
--targetConn $TARGET \
--slotName molt_slot \
--stagingSchema _replicator \
--stagingCreateSchema

For detailed steps, refer to Load and replicate.

replicator mylogical \
--sourceConn $SOURCE \
--targetConn $TARGET \
--defaultGTIDSet '4c658ae6-e8ad-11ef-8449-0242ac140006:1-29' \
--stagingSchema _replicator \
--stagingCreateSchema

For detailed steps, refer to Load and replicate.

replicator oraclelogminer \
--sourceConn $SOURCE \
--sourcePDBConn $SOURCE_PDB \
--sourceSchema migration_schema \
--targetConn $TARGET \
--scn 26685786 \
--backfillFromSCN 26685444 \
--stagingSchema _replicator \
--stagingCreateSchema

For detailed steps, refer to Load and replicate.

Resume after interruption

When resuming replication after an interruption, MOLT Replicator automatically uses the stored checkpoint to resume from the correct position.

Rerun the same replicator command used during forward replication, specifying the same --stagingSchema value as before. Omit --stagingCreateSchema and any checkpoint flags. For example:

replicator pglogical \
--sourceConn $SOURCE \
--targetConn $TARGET \
--slotName molt_slot \
--stagingSchema _replicator

For detailed steps, refer to Resume replication.

replicator mylogical \
--sourceConn $SOURCE \
--targetConn $TARGET \
--stagingSchema _replicator

For detailed steps, refer to Resume replication.

replicator oraclelogminer \
--sourceConn $SOURCE \
--sourcePDBConn $SOURCE_PDB \
--sourceSchema migration_schema \
--targetConn $TARGET \
--stagingSchema _replicator

For detailed steps, refer to Resume replication.

Failback to source database

When replicating from CockroachDB back to the source database, MOLT Replicator acts as a webhook sink for a CockroachDB changefeed.

Use the start command for failback:

replicator start

Specify the target database connection (the database you originally migrated from). For connection string formats, refer to Target connection strings:

--targetConn $TARGET

Specify the CockroachDB connection string. For details, refer to Connect using a URL.

--stagingConn $STAGING

Specify the staging database name. This should be the same staging database created during Forward replication with initial load:

--stagingSchema _replicator

Specify a webhook endpoint address for the changefeed to send changes to. For example:

--bindAddr :30004

Specify TLS certificate and private key file paths for secure webhook connections:

--tlsCertificate ./certs/server.crt
--tlsPrivateKey ./certs/server.key

At minimum, the replicator command should include the following flags:

replicator start \
--targetConn $TARGET \
--stagingConn $STAGING \
--stagingSchema _replicator \
--bindAddr :30004 \
--tlsCertificate ./certs/server.crt \
--tlsPrivateKey ./certs/server.key

For detailed steps, refer to Migration failback.

Monitoring

Metrics

MOLT Replicator can export Prometheus metrics by setting the --metricsAddr flag to a port (for example, --metricsAddr :30005). Metrics are not enabled by default. When enabled, metrics are available at the path /_/varz. For example: http://localhost:30005/_/varz.

For a list of recommended metrics to monitor during replication, refer to:

Forward replication metrics (PostgreSQL, MySQL, and Oracle sources)
Failback replication metrics (CockroachDB source)

You can use the Replicator Grafana dashboard to visualize the metrics. For Oracle-specific metrics, import the Oracle Grafana dashboard.

To check MOLT Replicator health when metrics are enabled, run curl http://localhost:30005/_/healthz (replacing the port with your --metricsAddr value). This returns a status code of 200 if Replicator is running.

Logging

By default, MOLT Replicator writes two streams of logs: operational logs to stdout (including warning, info, trace, and some errors) and final errors to stderr.

Redirect both streams to ensure all logs are captured for troubleshooting:

# Merge both streams to console
./replicator ... 2>&1

# Redirect both streams to a file
./replicator ... > output.log 2>&1

# Merge streams to console while saving to file
./replicator > >(tee replicator.log) 2>&1

# Use logDestination flag to write all logs to a file
./replicator --logDestination replicator.log ...

Enable debug logging with -v. For more granularity and system insights, enable trace logging with -vv. Pay close attention to warning- and error-level logs, as these indicate when Replicator is misbehaving.

Best practices

Test and validate

To verify that your connections and configuration work properly, run MOLT Replicator in a staging environment before replicating any data in production. Use a test or development environment that closely resembles production.

Optimize performance

Configure the following replicator flags to optimize replication throughput and resource usage. Test different combinations in a pre-production environment to find the optimal balance of stability and performance for your workload.

Note:

The following parameters apply to PostgreSQL, Oracle, and CockroachDB (failback) sources.

Flag	Description
`--parallelism`	Control the maximum number of concurrent target transactions. Higher values increase throughput but require more target connections. Start with a conservative value and increase based on target database capacity.
`--flushSize`	Balance throughput and latency. Controls how many mutations are batched into each query to the target. Increase for higher throughput at the cost of higher latency.
`--targetApplyQueueSize`	Control memory usage during operation. Increase to allow higher throughput at the expense of memory; decrease to apply backpressure and limit memory consumption.
`--targetMaxPoolSize`	Set larger than `--parallelism` by a safety factor to avoid exhausting target pool connections. Replicator enforces setting parallelism to 80% of this value.
`--collapseMutations`	Reduce the number of queries to the target by combining multiple mutations on the same primary key within each batch. Disable only if exact mutation order matters more than end state.
`--enableParallelApplies`	Improve apply throughput for independent tables and table groups that share foreign key dependencies. Increases memory and target connection usage, so ensure you increase `--targetMaxPoolSize` or reduce `--parallelism`.
`--flushPeriod`	Set to the maximum allowable time between flushes (for example, `10s` if data must be applied within 10 seconds). Works with `--flushSize` to control when buffered mutations are committed to the target.
`--quiescentPeriod`	Lower this value if constraint violations resolve quickly on your workload to make retries more frequent and reduce latency. Do not lower if constraint violations take time to resolve.
`--scanSize`	Applies to failback (`replicator start`) scenarios only. Balance memory usage and throughput. Increase to read more rows at once from the CockroachDB staging cluster for higher throughput, at the cost of memory pressure. Decrease to reduce memory pressure and increase stability.

Troubleshooting

Forward replication issues

Performance troubleshooting

If MOLT Replicator appears hung or performs poorly:

Enable trace logging with -vv to get more visibility into the replicator's state and behavior.

If MOLT Replicator is in an unknown, hung, or erroneous state, collect performance profiles to include with support tickets. Replace {host} and {metrics-port} with your Replicator host and the port specified by --metricsAddr:

curl '{host}:{metrics-port}/debug/pprof/trace?seconds=15' > trace.out
curl '{host}:{metrics-port}/debug/pprof/profile?seconds=15' > profile.out
curl '{host}:{metrics-port}/debug/pprof/goroutine?seconds=15' > gr.out
curl '{host}:{metrics-port}/debug/pprof/heap?seconds=15' > heap.out

Monitor lag metrics and adjust performance parameters as needed.

Unable to create publication or slot

This error occurs when logical replication is not supported.

Resolution: If you are connected to a replica, connect to the primary instance instead. Replicas cannot create or manage logical replication slots or publications.

Verify that the source database supports logical replication by checking the wal_level parameter on PostgreSQL:

SHOW wal_level;

If wal_level is not set to logical, update it and restart PostgreSQL:

ALTER SYSTEM SET wal_level = 'logical';

Replication slot already exists

ERROR: replication slot "molt_slot" already exists

Resolution: Either create a new slot with a different name, or drop the existing slot to start fresh:

SELECT pg_drop_replication_slot('molt_slot');

Warning:

Dropping a replication slot can be destructive and delete data that is not yet replicated. Only use this if you want to restart replication from the current position.

Publication does not exist

run CREATE PUBLICATION molt_fetch FOR ALL TABLES;

Resolution: Create the publication on the source database. Ensure you also create the replication slot:

CREATE PUBLICATION molt_publication FOR ALL TABLES;
SELECT pg_create_logical_replication_slot('molt_slot', 'pgoutput');

Could not connect to PostgreSQL

could not connect to source database: failed to connect to `user=migration_user database=source_database`

Resolution: Verify the connection details including user, host, port, and database name. Ensure the database name in your --sourceConn connection string matches exactly where you created the publication and slot. Verify you're connecting to the same host and port where you ran the CREATE PUBLICATION and SELECT pg_create_logical_replication_slot() commands. Check if TLS certificates need to be included in the connection URI.

Wrong replication slot name

run SELECT pg_create_logical_replication_slot('molt_slot', 'pgoutput'); in source database

Resolution: Create the replication slot or verify the correct slot name:

SELECT pg_create_logical_replication_slot('molt_slot', 'pgoutput');

Repeated binlog syncing restarts

If Replicator repeatedly restarts binlog syncing or starts replication from an unexpectedly old location, this indicates an invalid or purged GTID. When an invalid GTID is provided, the binlog syncer will fall back to the first valid GTID.

Resolution: Verify the GTID set is valid and not purged:

-- Check if GTID is in executed set
SELECT GTID_SUBSET('your-gtid-set', @@GLOBAL.gtid_executed) AS in_executed;

-- Check if GTID is purged
SELECT GTID_SUBSET('your-gtid-set', @@GLOBAL.gtid_purged) AS in_purged;

Interpret the results as follows:

If in_executed returns 1 and in_purged returns 0, the GTID is valid for replication.
If in_purged returns 1, the GTID has been purged and you must find a newer consistent point.
If both return 0, the GTID doesn't exist in the records and is invalid.

If the GTID is purged or invalid, follow these steps:

Increase binlog retention by configuring binlog_expire_logs_seconds in MySQL:
```
-- Increase binlog retention (example: 7 days = 604800 seconds)
SET GLOBAL binlog_expire_logs_seconds = 604800;
```
Note:

For managed MySQL services (such as Amazon RDS, Google Cloud SQL, or Azure Database for MySQL), binlog retention is typically configured through the provider's console or CLI. Consult your provider's documentation for how to adjust binlog retention settings.

Get a current GTID set to restart replication:

-- For MySQL < 8.0:
SHOW MASTER STATUS;
-- For MySQL 8.0+:
SHOW BINARY LOG STATUS;

+---------------+----------+--------------+------------------+-------------------------------------------+
| File          | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set                         |
+---------------+----------+--------------+------------------+-------------------------------------------+
| binlog.000005 |      197 |              |                  | 77263736-7899-11f0-81a5-0242ac120002:1-38 |
+---------------+----------+--------------+------------------+-------------------------------------------+

Use the Executed_Gtid_Set value for the --defaultGTIDSet flag.

Invalid GTID format

Invalid GTIDs can occur when GTIDs are purged due to insufficient binlog retention, when connecting to a replica instead of the primary host, or when passing a GTID that has valid format but doesn't exist in the binlog history.

Resolution: Use a valid GTID from SHOW MASTER STATUS (MySQL < 8.0) or SHOW BINARY LOG STATUS (MySQL 8.0+) and ensure you're connecting to the primary host. If GTIDs are being purged, increase binlog retention.

Table/column names exceed 30 characters

Oracle LogMiner excludes tables and columns with names longer than 30 characters from redo logs.

Resolution: Rename tables and columns to 30 characters or fewer before migration.

Unsupported data types

LogMiner and replication do not support:

Long BLOB/CLOBs (4000+ characters)
User-defined types (UDTs)
Nested tables
Varrays
GEOGRAPHY and GEOMETRY

Resolution: Convert unsupported data types or exclude affected tables from replication.

LOB column UPDATE statements

UPDATE statements that only modify LOB columns are not supported by Oracle LogMiner.

Resolution: Avoid LOB-only updates during replication, or use Binary Reader for Oracle 12c.

JSONB null handling

SQL NULL and JSON null values are not distinguishable in JSON payloads during replication.

Resolution: Avoid using nullable JSONB columns where the distinction between SQL NULL and JSON null is important.

Missing redo logs or unavailable SCN

If the Oracle redo log files are too small or do not retain enough history, you may get errors indicating that required log files are missing for a given SCN range, or that a specific SCN is unavailable.

Resolution: Increase the number and size of online redo log files, and verify that archived log files are being generated and retained correctly in your Oracle environment.

Replicator lag

If the replicator process is lagging significantly behind the current Oracle SCN, you may see log messages like: replicator is catching up to the current SCN at 5000 from 1000…. This indicates that replication is progressing but is still behind the most recent changes on the source database.

Schema drift errors

Indicates source and target schemas are mismatched:

WARNING: schema drift detected in "database"."table" at payload object offset 0: unexpected columns: column_name

Resolution: Align schemas or use userscripts to transform data.

Apply flow failures

Apply flow failures occur when the target database encounters error conditions such as unique constraint violations, target database being unavailable, or incorrect data (missing or extraneous columns) during apply operations:

WARNING: warning during tryCommit: ERROR: duplicate key value violates unique constraint
ERROR: maximum number of retries (10) exceeded

Resolution: Check target database constraints and connection stability. MOLT Replicator will log warnings for each retry attempt. If you see warnings but no final error, the apply succeeded after retrying. If all retry attempts are exhausted, Replicator will surface a final error and restart the apply loop to continue processing.

Failback issues

If the changefeed shows connection errors in SHOW CHANGEFEED JOB:

Connection refused

transient error: Post "https://replicator-host:30004/molt/public": dial tcp [::1]:30004: connect: connection refused

This indicates that Replicator is down, the webhook URL is incorrect, or the port is misconfigured.

Resolution: Verify that MOLT Replicator is running on the port specified in the changefeed INTO configuration. Confirm the host and port are correct.

Incorrect schema path errors

This error occurs when the CockroachDB changefeed webhook URL path does not match the target database schema naming convention:

transient error: 400 Bad Request: unknown schema:

The webhook URL path is specified in the INTO clause when you create the changefeed. For example: webhook-https://replicator-host:30004/database/schema.

Resolution: Verify the webhook path format matches your target database type:

PostgreSQL or CockroachDB targets: Use /database/schema format. For example, webhook-https://replicator-host:30004/migration_schema/public.
MySQL targets: Use /database format (schema is implicit). For example, webhook-https://replicator-host:30004/migration_schema.
Oracle targets: Use /DATABASE format in uppercase. For example, webhook-https://replicator-host:30004/MIGRATION_SCHEMA.

For details on configuring the webhook sink URI, refer to Webhook sink.

GC threshold error

batch timestamp * must be after replica GC threshold

This indicates starting from an invalid cursor that has been garbage collected.

Resolution: Double-check the cursor to ensure it represents a valid range that has not been garbage collected, or extend the GC TTL on the source CockroachDB cluster:

ALTER DATABASE defaultdb CONFIGURE ZONE USING gc.ttlseconds = {gc_ttl_in_seconds};

Duplicated data re-application

This occurs when resuming a changefeed from a cursor causes excessive data duplication.

Resolution: Clear the staging database to prevent duplication. This deletes all checkpoints and buffered data, so use with caution:

DROP DATABASE _replicator;

For more targeted cleanup, delete mutations from specific staging tables:

DELETE FROM _replicator.employees WHERE true;

Examples

For detailed examples of using MOLT Replicator usage, refer to the migration workflow tutorials:

Load and Replicate: Load data with MOLT Fetch and set up ongoing replication with MOLT Replicator.
Resume Replication: Resume replication after an interruption.
Migration failback: Replicate changes from CockroachDB back to the initial source database.

MOLT Replicator

Terminology

Prerequisites

Supported databases

Database configuration

User permissions

Installation

Docker images

MOLT Replicator

Docker usage

Performance

Local connection strings

How it works

Consistency modes

Commands

Flags

Global flags

pglogical replication flags

mylogical replication flags

oraclelogminer replication flags

start failback flags

make-jwt flags

Usage

Replicator commands

Source connection strings

Target connection strings

Replication checkpoints

Staging database

Security

Connection security and credentials

CockroachDB changefeed security

TLS from CockroachDB to Replicator

JWT authentication

Token quickstart

External JWT providers

Production considerations

Supply chain security

Common workflows

Forward replication with initial load

Resume after interruption

Failback to source database

Monitoring

Metrics

Logging

Best practices

Test and validate

Optimize performance

Troubleshooting

Forward replication issues

Performance troubleshooting

Unable to create publication or slot

Replication slot already exists

Publication does not exist

Could not connect to PostgreSQL

Wrong replication slot name

Repeated binlog syncing restarts

Invalid GTID format

Table/column names exceed 30 characters

Unsupported data types

LOB column UPDATE statements

JSONB null handling

Missing redo logs or unavailable SCN

Replicator lag

Schema drift errors

Apply flow failures

Failback issues

Connection refused

Incorrect schema path errors

GC threshold error

Duplicated data re-application

Examples

See also

Tell us about your experience

Select the problem area

Thank you for your feedback!

Explore More Documentation:

`pglogical` replication flags

`mylogical` replication flags

`oraclelogminer` replication flags

`start` failback flags

`make-jwt` flags