MOLT Replicator

On this page Carat arrow pointing down

MOLT Replicator continuously replicates changes from a source database to CockroachDB as part of a database migration. It supports migrations from a source database to CockroachDB with minimal downtime, and enables backfill from CockroachDB to your source database for failback scenarios to preserve a rollback option during a migration window.

MOLT Replicator consumes change data from PostgreSQL logical replication streams, MySQL GTID-based replication, Oracle LogMiner, and CockroachDB changefeeds (for failback). For details, refer to How it works.

Terminology

  • Checkpoint: The position in the source database's transaction log from which replication begins or resumes: LSN (PostgreSQL), GTID (MySQL), or SCN (Oracle).
  • Staging database: A CockroachDB database used by Replicator to store replication metadata, checkpoints, and buffered mutations. Specified with --stagingSchema and automatically created with --stagingCreateSchema. For details, refer to Staging database.
  • Forward replication: Replicate changes from a source database (PostgreSQL, MySQL, or Oracle) to CockroachDB during a migration. For usage details, refer to Forward replication with initial load.
  • Failback: Replicate changes from CockroachDB back to the source database. Used for migration rollback or to maintain data consistency on the source during migration. For usage details, refer to Failback to source database.

Prerequisites

Supported databases

MOLT Replicator supports the following source and target databases:

  • PostgreSQL 11-16
  • MySQL 5.7, 8.0 and later
  • Oracle Database 19c (Enterprise Edition) and 21c (Express Edition)
  • CockroachDB (all currently supported versions)

Database configuration

The source database must be configured for replication:

Database Configuration Requirements Details
PostgreSQL source
  • Enable logical replication by setting wal_level = logical.
Configure PostgreSQL for replication
MySQL source
  • Enable global transaction identifiers (GTID) and configure binary logging. Set binlog-row-metadata or binlog-row-image to full.
  • Configure sufficient binlog retention for migration duration.
Configure MySQL for replication
Oracle source Configure Oracle for replication
CockroachDB source (failback)
  • Enable rangefeeds (kv.rangefeed.enabled = true) (CockroachDB self-hosted clusters only).
Configure CockroachDB for replication

User permissions

The SQL user running MOLT Replicator requires specific privileges on both the source and target databases:

Database Required Privileges Details
PostgreSQL source
  • SUPERUSER role (recommended), or the following granular permissions:
  • CREATE and SELECT on database and tables to replicate.
  • Table ownership for adding tables to publications.
  • LOGIN and REPLICATION privileges to create replication slots and access replication data.
Create PostgreSQL migration user
MySQL source
  • SELECT on tables to replicate.
  • REPLICATION SLAVE and REPLICATION CLIENT privileges for binlog access.
  • For --fetchMetadata, either SELECT on the source database or PROCESS globally.
Create MySQL migration user
Oracle source
  • SELECT, INSERT, UPDATE on _replicator_sentinel table.
  • SELECT on V$ views (V$LOG, V$LOGFILE, V$LOGMNR_CONTENTS, V$ARCHIVED_LOG, V$LOG_HISTORY).
  • SELECT on SYS.V$LOGMNR_* views (SYS.V$LOGMNR_DICTIONARY, SYS.V$LOGMNR_LOGS, SYS.V$LOGMNR_PARAMETERS, SYS.V$LOGMNR_SESSION).
  • LOGMINING privilege.
  • EXECUTE on DBMS_LOGMNR.
  • For Oracle Multitenant, the user must be a common user (prefixed with C##) with privileges granted on both CDB and PDB.
Create Oracle migration user

Create sentinel table

Grant LogMiner privileges
CockroachDB target (forward replication)
  • ALL on target database.
  • CREATE on schema.
  • SELECT, INSERT, UPDATE, DELETE on target tables.
  • CREATEDB privilege for creating staging schema.
Create CockroachDB user
PostgreSQL, MySQL, or Oracle target (failback)
  • SELECT, INSERT, UPDATE on tables to fail back to.
  • For Oracle, FLASHBACK is also required.
Grant PostgreSQL user permissions

Grant MySQL user permissions

Grant Oracle user permissions

Installation

To install MOLT, download the binary that matches your architecture and source database:

Operating System Architecture PostgreSQL/MySQL Oracle (Preview)
Windows AMD 64-bit Download N/A
ARM 64-bit Download N/A
Linux AMD 64-bit Download Download
ARM 64-bit Download N/A
Mac AMD 64-bit Download N/A
ARM 64-bit Download N/A

The following binaries are included:

  • molt
  • replicator
Tip:

For ease of use, keep both molt and replicator in your current working directory.

To display the current version of each binary, run molt --version and replicator --version.

For previous binaries, refer to the MOLT version manifest. For release details, refer to the MOLT changelog.

Docker images

MOLT Replicator

Docker images for MOLT Replicator are also available as a standalone binary:

icon/buttons/copy
docker pull cockroachdb/replicator

To pull a specific version (for example, v1.1.1):

icon/buttons/copy
docker pull cockroachdb/replicator:v1.1.1

Docker usage

Performance

MOLT Fetch, Verify, and Replicator are likely to run more slowly in a Docker container than on a local machine. To improve performance, increase the memory or compute resources, or both, on your Docker container.

Local connection strings

When testing locally, specify the host as follows:

  • For macOS, use host.docker.internal. For example:

    --sourceConn 'postgres://postgres:postgres@host.docker.internal:5432/molt?sslmode=disable'
    --targetConn "postgres://root@host.docker.internal:26257/molt?sslmode=disable"
    
  • For Linux and Windows, use 172.17.0.1. For example:

    --sourceConn 'postgres://postgres:postgres@172.17.0.1:5432/molt?sslmode=disable'
    --targetConn "postgres://root@172.17.0.1:26257/molt?sslmode=disable"
    

How it works

MOLT Replicator supports forward replication from PostgreSQL, MySQL, and Oracle, and failback from CockroachDB:

  • PostgreSQL source (pglogical): MOLT Replicator uses PostgreSQL logical replication, which is based on publications and replication slots. You create a publication for the target tables, and a slot marks consistent replication points. MOLT Replicator consumes this logical feed directly and applies the data in sorted batches to the target.

  • MySQL source (mylogical): MOLT Replicator relies on MySQL GTID-based replication to read change data from MySQL binlogs. It works with MySQL versions that support GTID-based replication and applies transactionally consistent feeds to the target. Binlog features that do not use GTIDs are not supported.

  • Oracle source (oraclelogminer): MOLT Replicator uses Oracle LogMiner to capture change data from Oracle redo logs. Both Oracle Multitenant (CDB/PDB) and single-tenant Oracle architectures are supported. Replicator periodically queries LogMiner-populated views and processes transactional data in ascending SCN windows for reliable throughput while maintaining consistency.

  • Failback from CockroachDB (start): MOLT Replicator acts as an HTTP webhook sink for a single CockroachDB changefeed. Replicator receives mutations from source cluster nodes, can optionally buffer them in a CockroachDB staging cluster, and then applies time-ordered transactional batches to the target database. Mutations are applied as UPSERT or DELETE statements while respecting foreign-key and table dependencies.

Consistency modes

MOLT Replicator supports three consistency modes for balancing throughput and transactional guarantees:

  1. Consistent (failback mode only, default for CockroachDB sources): Preserves per-row order and source transaction atomicity. Concurrent transactions are controlled by --parallelism.

  2. BestEffort (failback mode only): Relaxes atomicity across tables that do not have foreign key constraints between them (maintains coherence within FK-connected groups). Enable with --bestEffortOnly or allow auto-entry via --bestEffortWindow set to a positive duration (such as 1s).

    Note:

    For independent tables (with no foreign key constraints), BestEffort mode applies changes immediately as they arrive, without waiting for the resolved timestamp. This provides higher throughput for tables that have no relationships with other tables.

  3. Immediate (default for PostgreSQL, MySQL, and Oracle sources): Applies updates as they arrive to Replicator with no buffering or waiting for resolved timestamps. For CockroachDB sources, provides highest throughput but requires no foreign keys on the target schema.

Commands

MOLT Replicator provides the following commands:

Command Description
pglogical Replicate from PostgreSQL source to CockroachDB target using logical replication.
mylogical Replicate from MySQL source to CockroachDB target using GTID-based replication.
oraclelogminer Replicate from Oracle source to CockroachDB target using Oracle LogMiner.
start Replicate from CockroachDB source to PostgreSQL, MySQL, or Oracle target (failback mode). Requires a CockroachDB changefeed with rangefeeds enabled.
make-jwt Generate JWT tokens for authorizing changefeed connections in failback scenarios. Supports signing tokens with RSA or EC keys, or generating claims for external JWT providers. For details, refer to JWT authentication.
version Display version information and Go module dependencies with checksums. For details, refer to Supply chain security.

For command-specific flags and examples, refer to Usage and Common workflows.

Flags

Global flags

Flag Type Description
--applyTimeout DURATION The maximum amount of time to wait for an update to be applied.

Default: 30s
--dlqTableName IDENT The name of a table in the target schema for storing dead-letter entries.

Default: replicator_dlq
--enableParallelApplies BOOL Enable parallel application of independent table groups during replication. By default, applies are synchronous. When enabled, this increases throughput at the cost of higher target pool usage and memory usage.

Default: false
--flushPeriod DURATION Flush queued mutations after this duration.

Default: 1s
--flushSize INT Ideal batch size to determine when to flush mutations.

Default: 1000
--gracePeriod DURATION Allow background processes to exit.

Default: 30s
--logDestination STRING Write logs to a file. If not specified, write logs to stdout.
--logFormat STRING Choose log output format: "fluent", "text".

Default: "text"
--maxRetries INT Maximum number of times to retry a failed mutation on the target (for example, due to contention or a temporary unique constraint violation) before treating it as a hard failure.

Default: 10
--metricsAddr STRING A host:port on which to serve metrics and diagnostics. The metrics endpoint is http://{host}:{port}/_/varz.
--parallelism INT The number of concurrent database transactions to use.

Default: 16
--quiescentPeriod DURATION How often to retry deferred mutations.

Default: 10s
--retireOffset DURATION How long to delay removal of applied mutations.

Default: 24h0m0s
--retryInitialBackoff DURATION Initial delay before the first retry attempt when applying a mutation to the target database fails due to a retryable error, such as contention or a temporary unique constraint violation.

Default: 25ms
--retryMaxBackoff DURATION Maximum delay between retry attempts when applying mutations to the target database fails due to retryable errors.

Default: 2s
--retryMultiplier INT Multiplier that controls how quickly the backoff interval increases between successive retries of failed applies to the target database.

Default: 2
--scanSize INT The number of rows to retrieve from the staging database used to store metadata for replication.

Default: 10000
--schemaRefresh DURATION How often a watcher will refresh its schema. If this value is zero or negative, refresh behavior will be disabled.

Default: 1m0s
--sourceConn STRING The source database's connection string. When replicating from Oracle, this is the connection string of the Oracle container database (CDB). Refer to Oracle replication flags.
--stageDisableCreateTableReaderIndex BOOL Disable the creation of partial covering indexes to improve read performance on staging tables. Set to true if creating indexes on existing tables would cause a significant operational impact.

Default: false
--stageMarkAppliedLimit INT Limit the number of mutations to be marked applied in a single statement.

Default: 100000
--stageSanityCheckPeriod DURATION How often to validate staging table apply order (-1 to disable).

Default: 10m0s
--stageSanityCheckWindow DURATION How far back to look when validating staging table apply order.

Default: 1h0m0s
--stageUnappliedPeriod DURATION How often to report the number of unapplied mutations in staging tables (-1 to disable).

Default: 1m0s
--stagingConn STRING The staging database's connection string.
--stagingCreateSchema Automatically create the staging schema if it does not exist.
--stagingIdleTime DURATION Maximum lifetime of an idle connection.

Default: 1m0s
--stagingJitterTime DURATION The time over which to jitter database pool disconnections.

Default: 15s
--stagingMaxLifetime DURATION The maximum lifetime of a database connection.

Default: 5m0s
--stagingMaxPoolSize INT The maximum number of staging database connections.

Default: 128
--stagingSchema STRING Name of the CockroachDB schema that stores replication metadata. Required each time replicator is rerun after being interrupted, as the schema contains a checkpoint table that enables replication to resume from the correct transaction.

Default: _replicator.public
--targetApplyQueueSize INT Size of the apply queue that buffers mutations before they are written to the target database. Larger values can improve throughput, but increase memory usage. This flag applies only to CockroachDB and PostgreSQL (pglogical) sources, and replaces the deprecated --copierChannel and --stageCopierChannelSize flags.
--targetConn STRING The target database's connection string.
--targetIdleTime DURATION Maximum lifetime of an idle connection.

Default: 1m0s
--targetJitterTime DURATION The time over which to jitter database pool disconnections.

Default: 15s
--targetMaxLifetime DURATION The maximum lifetime of a database connection.

Default: 5m0s
--targetMaxPoolSize INT The maximum number of target database connections.

Default: 128
--targetSchema STRING The SQL database schema in the target cluster to update.
--targetStatementCacheSize INT The maximum number of prepared statements to retain.

Default: 128
--taskGracePeriod DURATION How long to allow for task cleanup when recovering from errors.

Default: 1m0s
--timestampLimit INT The maximum number of source timestamps to coalesce into a target transaction.

Default: 1000
--userscript STRING The path to a TypeScript configuration script. For example, --userscript 'script.ts'.
-v, --verbose COUNT Increase logging verbosity. Use -v for debug logging or -vv for trace logging.

pglogical replication flags

The following flags are used when replicating from a PostgreSQL source database.

Flag Type Description
--publicationName STRING The publication within the source database to replicate.
--slotName STRING The replication slot in the source database.

Default: "replicator"
--standbyTimeout DURATION How often to report WAL progress to the source server.

Default: 5s

mylogical replication flags

The following flags are used when replicating from a MySQL source database.

Flag Type Description
--defaultGTIDSet STRING Default GTID set, in the format source_uuid:min(interval_start)-max(interval_end). Required the first time replicator is run, as the GTID set provides a replication marker for streaming changes.
--fetchMetadata Fetch column metadata explicitly, for older versions of MySQL that do not support binlog_row_metadata.
--replicationProcessID UINT32 The replication process ID to report to the source database.

Default: 10

oraclelogminer replication flags

The following flags are used when replicating from an Oracle source database.

Flag Type Description
--sourceSchema STRING Required. Source schema name on Oracle where tables will be replicated from.
--scn INT The snapshot System Change Number (SCN) from the initial data load. Required the first time replicator is run, as the SCN provides a replication marker for streaming changes.
--backfillFromSCN INT The SCN of the earliest active transaction at the time of the initial snapshot. Ensures no transactions are skipped when starting replication from Oracle.
--sourcePDBConn STRING Connection string for the Oracle pluggable database (PDB). Only required when using an Oracle multitenant configuration. --sourceConn must be included.
--oracle-application-users STRING List of Oracle usernames responsible for DML transactions in the PDB schema. Enables replication from the latest-possible starting point. Usernames are case-sensitive and must match the internal Oracle usernames (e.g., PDB_USER).

start failback flags

The following flags are used for failback from CockroachDB.

Flag Type Description
--assumeIdempotent Disable the extra staging table queries that debounce non-idempotent redelivery in changefeeds.
--bestEffortOnly Eventually-consistent mode; useful for high-throughput, skew-tolerant schemas with foreign keys.
--bestEffortWindow DURATION Use an eventually-consistent mode for initial backfill or when replication is behind; 0 to disable.

Default: 1h0m0s
--bindAddr STRING The network address to bind to.

Default: ":26258"
--disableAuthentication Disable authentication of incoming Replicator requests; not recommended for production.
--enableCheckpointStream Enable checkpoint streaming (use an internal changefeed from the staging table for real-time updates), rather than checkpoint polling (query the staging table for periodic updates), for failback replication.

Default: false (use checkpoint polling)
--discard Dangerous: Discard all incoming HTTP requests; useful for changefeed throughput testing. Not intended for production.
--discardDelay DURATION Adds additional delay in discard mode; useful for gauging the impact of changefeed round-trip time (RTT).
--healthCheckTimeout DURATION The timeout for the health check endpoint.

Default: 5s
--httpResponseTimeout DURATION The maximum amount of time to allow an HTTP handler to execute.

Default: 2m0s
--immediate Bypass staging tables and write directly to target; recommended only for KV-style workloads with no foreign keys.
--limitLookahead INT Limit number of checkpoints to be considered when computing the resolving range; may cause replication to stall completely if older mutations cannot be applied.
--ndjsonBufferSize INT The maximum amount of data to buffer while reading a single line of ndjson input; increase when source cluster has large blob values.

Default: 65536
--tlsCertificate STRING A path to a PEM-encoded TLS certificate chain.
--tlsPrivateKey STRING A path to a PEM-encoded TLS private key.
--tlsSelfSigned If true, generate a self-signed TLS certificate valid for localhost.

make-jwt flags

The following flags are used with the make-jwt command to generate JWT tokens for changefeed authentication.

Flag Type Description
-a, --allow STRING One or more database.schema identifiers. Can be repeated for multiple schemas.
--claim If true, print a minimal JWT claim instead of signing.
-k, --key STRING The path to a PEM-encoded private key to sign the token with.
-o, --out STRING A file to write the token to.

Usage

Replicator commands

MOLT Replicator provides four commands for different replication scenarios. For detailed workflows, refer to Common workflows.

Use pglogical to replicate from PostgreSQL to CockroachDB:

icon/buttons/copy
replicator pglogical

Use mylogical to replicate from MySQL to CockroachDB:

icon/buttons/copy
replicator mylogical

Use oraclelogminer to replicate from Oracle to CockroachDB:

icon/buttons/copy
replicator oraclelogminer

Use start to replicate from CockroachDB to PostgreSQL, MySQL, or Oracle (failback):

icon/buttons/copy
replicator start

Source connection strings

Tip:

Follow the security recommendations in Connection security and credentials.

--sourceConn specifies the connection string of the source database for forward replication.

Note:

The source connection string must point to the primary instance of the source database. Replicas cannot provide the necessary replication checkpoints and transaction metadata required for ongoing replication.

PostgreSQL connection string:

icon/buttons/copy
--sourceConn 'postgresql://{username}:{password}@{host}:{port}/{database}'

MySQL connection string:

icon/buttons/copy
--sourceConn 'mysql://{username}:{password}@{protocol}({host}:{port})/{database}'

Oracle connection string:

icon/buttons/copy
--sourceConn 'oracle://{username}:{password}@{host}:{port}/{service_name}'

For Oracle Multitenant databases, --sourcePDBConn specifies the pluggable database (PDB) connection. --sourceConn specifies the container database (CDB):

icon/buttons/copy
--sourceConn 'oracle://{username}:{password}@{host}:{port}/{cdb_service_name}'
--sourcePDBConn 'oracle://{username}:{password}@{host}:{port}/{pdb_service_name}'

For failback, --stagingConn specifies the CockroachDB connection string:

icon/buttons/copy
--stagingConn 'postgresql://{username}:{password}@{host}:{port}/{database}'

Target connection strings

--targetConn specifies the connection string of the target CockroachDB database for forward replication:

icon/buttons/copy
--targetConn 'postgresql://{username}:{password}@{host}:{port}/{database}'
Note:

For failback, --targetConn specifies the original source database (PostgreSQL, MySQL, or Oracle). For details, refer to Failback to source database.

Replication checkpoints

MOLT Replicator requires a checkpoint value to start replication from the correct position in the source database's transaction log.

For PostgreSQL, use --slotName to specify the replication slot created during the data load. The slot automatically tracks the LSN (Log Sequence Number):

icon/buttons/copy
--slotName molt_slot

For MySQL, use --defaultGTIDSet with the GTID set from the MOLT Fetch output:

icon/buttons/copy
--defaultGTIDSet '4c658ae6-e8ad-11ef-8449-0242ac140006:1-29'

For Oracle, use --scn and --backfillFromSCN with the SCN values from the MOLT Fetch output:

icon/buttons/copy
--scn 26685786
--backfillFromSCN 26685444

Staging database

The staging database stores replication metadata, checkpoints, and buffered mutations. Specify the staging database with --stagingSchema and create it automatically with --stagingCreateSchema:

icon/buttons/copy
--stagingSchema _replicator
--stagingCreateSchema

The staging database is used to:

  • Store checkpoints that enable resuming from the correct point after interruptions.
  • Buffer mutations before applying them to the target in transaction order.
  • Maintain consistency for time-ordered transactional batches while respecting table dependencies.
  • Provide restart capabilities after failures.

Security

Cockroach Labs strongly recommends the following:

Connection security and credentials

  • To keep your database credentials out of shell history and logs, follow these best practices when specifying your source and target connection strings:

    • Avoid plaintext connection strings.
    • Provide your connection strings as environment variables. For example:

      export SOURCE="postgres://migration_user:a%2452%26@localhost:5432/molt?sslmode=verify-full"
      export TARGET="postgres://root@localhost:26257/molt?sslmode=verify-full"
      

      Afterward, reference the environment variables in MOLT commands:

      --sourceConn $SOURCE
      --targetConn $TARGET
      
    • If possible, use an external secrets manager to load the environment variables from stored secrets.

  • Use TLS-enabled connection strings to encrypt data in transit from MOLT to the database. When using TLS certificates, ensure certificate files are accessible to the MOLT binary on the same machine.

    For example, a PostgreSQL connection string with TLS certificates:

    icon/buttons/copy
    postgresql://migration_user@db.example.com:5432/appdb?sslmode=verify-full&sslrootcert=/etc/molt/certs/ca.pem&sslcert=/etc/molt/certs/client.crt&sslkey=/etc/molt/certs/client.key
    
  • URL-encode connection strings for the source database and CockroachDB so special characters in passwords are handled correctly.

    • Given a password a$52&, pass it to the molt escape-password command with single quotes:

      icon/buttons/copy
      molt escape-password --password 'a$52&'
      

      Use the encoded password in your connection string. For example:

      postgres://migration_user:a%2452%26@localhost:5432/replicationload
      
  • Remove sslmode=disable from production connection strings.

CockroachDB changefeed security

For failback scenarios, secure the connection from CockroachDB to MOLT Replicator using TLS certificates. Generate TLS certificates using self-signed certificates, certificate authorities like Let's Encrypt, or your organization's certificate management system.

TLS from CockroachDB to Replicator

Configure MOLT Replicator with server certificates using the --tlsCertificate and --tlsPrivateKey flags to specify the certificate and private key file paths. For example:

icon/buttons/copy
replicator start \
--tlsCertificate ./certs/server.crt \
--tlsPrivateKey ./certs/server.key \
...

These server certificates must correspond to the client certificates specified in the changefeed webhook URL to ensure proper TLS handshake.

Encode client certificates for changefeed webhook URLs:

  • Webhook URLs: Use both URL encoding and base64 encoding: base64 -i ./client.crt | jq -R -r '@uri'
  • Non-webhook contexts: Use base64 encoding only: base64 -w 0 ca.cert

JWT authentication

You can use JSON Web Tokens (JWT) to authorize incoming changefeed connections and restrict writes to a subset of SQL databases or user-defined schemas in the target cluster.

Replicator supports JWT claims that allow writes to specific databases, schemas, or all of them. JWT tokens must be signed using RSA or EC keys. HMAC and None signatures are automatically rejected.

To configure JWT authentication:

  1. Add PEM-formatted public signing keys to the _replicator.jwt_public_keys table in the staging database.

  2. To revoke a specific token, add its jti value to the _replicator.jwt_revoked_ids table in the staging database.

The Replicator process re-reads these tables every minute to pick up changes.

To pass the JWT token from the changefeed to the Replicator webhook sink, use the webhook_auth_header option:

icon/buttons/copy
CREATE CHANGEFEED ... WITH webhook_auth_header='Bearer <encoded_token>';
Token quickstart

The following example uses OpenSSL to generate keys, but any PEM-encoded RSA or EC keys will work.

icon/buttons/copy
# Generate an EC private key using OpenSSL.
openssl ecparam -out ec.key -genkey -name prime256v1

# Write the public key components to a separate file.
openssl ec -in ec.key -pubout -out ec.pub

# Upload the public key for all instances of Replicator to find it.
cockroach sql -e "INSERT INTO _replicator.jwt_public_keys (public_key) VALUES ('$(cat ec.pub)')"

# Reload configuration, or wait one minute.
killall -HUP replicator

# Generate a token which can write to the ycsb.public schema.
# The key can be decoded using the debugger at https://jwt.io.
# Add the contents of out.jwt to the CREATE CHANGEFEED command:
# WITH webhook_auth_header='Bearer {out.jwt}'
replicator make-jwt -k ec.key -a ycsb.public -o out.jwt
External JWT providers

The make-jwt command also supports a --claim flag, which prints a JWT claim that can be signed by your existing JWT provider. The PEM-formatted public key or keys for that provider must be inserted into the _replicator.jwt_public_keys table. The iss (issuers) and jti (token id) fields will likely be specific to your auth provider, but the custom claim must be retained in its entirety.

You can repeat the -a flag to create a claim for multiple schemas:

icon/buttons/copy
replicator make-jwt -a 'database.schema' --claim
{
  "iss": "replicator",
  "jti": "d5ffa211-8d54-424b-819a-bc19af9202a5",
  "https://github.com/cockroachdb/replicator": {
    "schemas": [
      [
        "database",
        "schema"
      ]
    ]
  }
}
Note:

For details on the make-jwt command flags, refer to make-jwt flags.

Production considerations

  • Avoid --disableAuthentication and --tlsSelfSigned flags in production environments. These flags should only be used for testing or development purposes.

Supply chain security

Use the version command to verify the integrity of your MOLT Replicator build and identify potential upstream vulnerabilities.

icon/buttons/copy
replicator version

The output includes:

  • Module name
  • go.mod checksum
  • Version

Use this information to determine if your build may be subject to vulnerabilities from upstream packages. Cockroach Labs uses Dependabot to automatically upgrade Go modules, and the team regularly merges Dependabot updates to address security issues.

Common workflows

Forward replication with initial load

To start replication after an initial data load with MOLT Fetch, use the pglogical command:

icon/buttons/copy
replicator pglogical

To start replication after an initial data load with MOLT Fetch, use the mylogical command:

icon/buttons/copy
replicator mylogical

To start replication after an initial data load with MOLT Fetch, use the oraclelogminer command:

icon/buttons/copy
replicator oraclelogminer

Specify the source and target database connections. For connection string formats, refer to Source connection strings and Target connection strings:

icon/buttons/copy
--sourceConn $SOURCE
--targetConn $TARGET

For Oracle Multitenant databases, also specify the PDB connection:

icon/buttons/copy
--sourcePDBConn $SOURCE_PDB

Specify the source Oracle schema to replicate from:

icon/buttons/copy
--sourceSchema migration_schema

To replicate from the correct position, specify the appropriate checkpoint value.

Use --slotName to specify the slot created during the data load, which automatically tracks the LSN (Log Sequence Number) checkpoint:

icon/buttons/copy
--slotName molt_slot

Use --defaultGTIDSet from the cdc_cursor field in the MOLT Fetch output:

icon/buttons/copy
--defaultGTIDSet '4c658ae6-e8ad-11ef-8449-0242ac140006:1-29'

Use the --scn and --backfillFromSCN values from the MOLT Fetch output:

icon/buttons/copy
--scn 26685786
--backfillFromSCN 26685444

Use --stagingSchema to specify the staging database. Use --stagingCreateSchema to create it automatically on first run:

icon/buttons/copy
--stagingSchema _replicator
--stagingCreateSchema

At minimum, the replicator command should include the following flags:

icon/buttons/copy
replicator pglogical \
--sourceConn $SOURCE \
--targetConn $TARGET \
--slotName molt_slot \
--stagingSchema _replicator \
--stagingCreateSchema

For detailed steps, refer to Load and replicate.

icon/buttons/copy
replicator mylogical \
--sourceConn $SOURCE \
--targetConn $TARGET \
--defaultGTIDSet '4c658ae6-e8ad-11ef-8449-0242ac140006:1-29' \
--stagingSchema _replicator \
--stagingCreateSchema

For detailed steps, refer to Load and replicate.

icon/buttons/copy
replicator oraclelogminer \
--sourceConn $SOURCE \
--sourcePDBConn $SOURCE_PDB \
--sourceSchema migration_schema \
--targetConn $TARGET \
--scn 26685786 \
--backfillFromSCN 26685444 \
--stagingSchema _replicator \
--stagingCreateSchema

For detailed steps, refer to Load and replicate.

Resume after interruption

When resuming replication after an interruption, MOLT Replicator automatically uses the stored checkpoint to resume from the correct position.

Rerun the same replicator command used during forward replication, specifying the same --stagingSchema value as before. Omit --stagingCreateSchema and any checkpoint flags. For example:

icon/buttons/copy
replicator pglogical \
--sourceConn $SOURCE \
--targetConn $TARGET \
--slotName molt_slot \
--stagingSchema _replicator

For detailed steps, refer to Resume replication.

icon/buttons/copy
replicator mylogical \
--sourceConn $SOURCE \
--targetConn $TARGET \
--stagingSchema _replicator

For detailed steps, refer to Resume replication.

icon/buttons/copy
replicator oraclelogminer \
--sourceConn $SOURCE \
--sourcePDBConn $SOURCE_PDB \
--sourceSchema migration_schema \
--targetConn $TARGET \
--stagingSchema _replicator

For detailed steps, refer to Resume replication.

Failback to source database

When replicating from CockroachDB back to the source database, MOLT Replicator acts as a webhook sink for a CockroachDB changefeed.

Use the start command for failback:

icon/buttons/copy
replicator start

Specify the target database connection (the database you originally migrated from). For connection string formats, refer to Target connection strings:

icon/buttons/copy
--targetConn $TARGET

Specify the CockroachDB connection string. For details, refer to Connect using a URL.

icon/buttons/copy
--stagingConn $STAGING

Specify the staging database name. This should be the same staging database created during Forward replication with initial load:

icon/buttons/copy
--stagingSchema _replicator

Specify a webhook endpoint address for the changefeed to send changes to. For example:

icon/buttons/copy
--bindAddr :30004

Specify TLS certificate and private key file paths for secure webhook connections:

icon/buttons/copy
--tlsCertificate ./certs/server.crt
--tlsPrivateKey ./certs/server.key

At minimum, the replicator command should include the following flags:

icon/buttons/copy
replicator start \
--targetConn $TARGET \
--stagingConn $STAGING \
--stagingSchema _replicator \
--bindAddr :30004 \
--tlsCertificate ./certs/server.crt \
--tlsPrivateKey ./certs/server.key

For detailed steps, refer to Migration failback.

Monitoring

Metrics

MOLT Replicator can export Prometheus metrics by setting the --metricsAddr flag to a port (for example, --metricsAddr :30005). Metrics are not enabled by default. When enabled, metrics are available at the path /_/varz. For example: http://localhost:30005/_/varz.

For a list of recommended metrics to monitor during replication, refer to:

You can use the Replicator Grafana dashboard to visualize the metrics. For Oracle-specific metrics, import the Oracle Grafana dashboard.

To check MOLT Replicator health when metrics are enabled, run curl http://localhost:30005/_/healthz (replacing the port with your --metricsAddr value). This returns a status code of 200 if Replicator is running.

Logging

By default, MOLT Replicator writes two streams of logs: operational logs to stdout (including warning, info, trace, and some errors) and final errors to stderr.

Redirect both streams to ensure all logs are captured for troubleshooting:

icon/buttons/copy
# Merge both streams to console
./replicator ... 2>&1

# Redirect both streams to a file
./replicator ... > output.log 2>&1

# Merge streams to console while saving to file
./replicator > >(tee replicator.log) 2>&1

# Use logDestination flag to write all logs to a file
./replicator --logDestination replicator.log ...

Enable debug logging with -v. For more granularity and system insights, enable trace logging with -vv. Pay close attention to warning- and error-level logs, as these indicate when Replicator is misbehaving.

Best practices

Test and validate

To verify that your connections and configuration work properly, run MOLT Replicator in a staging environment before replicating any data in production. Use a test or development environment that closely resembles production.

Optimize performance

Configure the following replicator flags to optimize replication throughput and resource usage. Test different combinations in a pre-production environment to find the optimal balance of stability and performance for your workload.

Note:

The following parameters apply to PostgreSQL, Oracle, and CockroachDB (failback) sources.

Flag Description
--parallelism Control the maximum number of concurrent target transactions. Higher values increase throughput but require more target connections. Start with a conservative value and increase based on target database capacity.
--flushSize Balance throughput and latency. Controls how many mutations are batched into each query to the target. Increase for higher throughput at the cost of higher latency.
--targetApplyQueueSize Control memory usage during operation. Increase to allow higher throughput at the expense of memory; decrease to apply backpressure and limit memory consumption.
--targetMaxPoolSize Set larger than --parallelism by a safety factor to avoid exhausting target pool connections. Replicator enforces setting parallelism to 80% of this value.
--collapseMutations Reduce the number of queries to the target by combining multiple mutations on the same primary key within each batch. Disable only if exact mutation order matters more than end state.
--enableParallelApplies Improve apply throughput for independent tables and table groups that share foreign key dependencies. Increases memory and target connection usage, so ensure you increase --targetMaxPoolSize or reduce --parallelism.
--flushPeriod Set to the maximum allowable time between flushes (for example, 10s if data must be applied within 10 seconds). Works with --flushSize to control when buffered mutations are committed to the target.
--quiescentPeriod Lower this value if constraint violations resolve quickly on your workload to make retries more frequent and reduce latency. Do not lower if constraint violations take time to resolve.
--scanSize Applies to failback (replicator start) scenarios only. Balance memory usage and throughput. Increase to read more rows at once from the CockroachDB staging cluster for higher throughput, at the cost of memory pressure. Decrease to reduce memory pressure and increase stability.

Troubleshooting

Forward replication issues

Performance troubleshooting

If MOLT Replicator appears hung or performs poorly:

  1. Enable trace logging with -vv to get more visibility into the replicator's state and behavior.

  2. If MOLT Replicator is in an unknown, hung, or erroneous state, collect performance profiles to include with support tickets. Replace {host} and {metrics-port} with your Replicator host and the port specified by --metricsAddr:

    icon/buttons/copy
    curl '{host}:{metrics-port}/debug/pprof/trace?seconds=15' > trace.out
    curl '{host}:{metrics-port}/debug/pprof/profile?seconds=15' > profile.out
    curl '{host}:{metrics-port}/debug/pprof/goroutine?seconds=15' > gr.out
    curl '{host}:{metrics-port}/debug/pprof/heap?seconds=15' > heap.out
    
  3. Monitor lag metrics and adjust performance parameters as needed.

Unable to create publication or slot

This error occurs when logical replication is not supported.

Resolution: If you are connected to a replica, connect to the primary instance instead. Replicas cannot create or manage logical replication slots or publications.

Verify that the source database supports logical replication by checking the wal_level parameter on PostgreSQL:

icon/buttons/copy
SHOW wal_level;

If wal_level is not set to logical, update it and restart PostgreSQL:

icon/buttons/copy
ALTER SYSTEM SET wal_level = 'logical';
Replication slot already exists
ERROR: replication slot "molt_slot" already exists

Resolution: Either create a new slot with a different name, or drop the existing slot to start fresh:

icon/buttons/copy
SELECT pg_drop_replication_slot('molt_slot');
Warning:

Dropping a replication slot can be destructive and delete data that is not yet replicated. Only use this if you want to restart replication from the current position.

Publication does not exist
run CREATE PUBLICATION molt_fetch FOR ALL TABLES;

Resolution: Create the publication on the source database. Ensure you also create the replication slot:

icon/buttons/copy
CREATE PUBLICATION molt_publication FOR ALL TABLES;
SELECT pg_create_logical_replication_slot('molt_slot', 'pgoutput');
Could not connect to PostgreSQL
could not connect to source database: failed to connect to `user=migration_user database=source_database`

Resolution: Verify the connection details including user, host, port, and database name. Ensure the database name in your --sourceConn connection string matches exactly where you created the publication and slot. Verify you're connecting to the same host and port where you ran the CREATE PUBLICATION and SELECT pg_create_logical_replication_slot() commands. Check if TLS certificates need to be included in the connection URI.

Wrong replication slot name
run SELECT pg_create_logical_replication_slot('molt_slot', 'pgoutput'); in source database

Resolution: Create the replication slot or verify the correct slot name:

icon/buttons/copy
SELECT pg_create_logical_replication_slot('molt_slot', 'pgoutput');
Repeated binlog syncing restarts

If Replicator repeatedly restarts binlog syncing or starts replication from an unexpectedly old location, this indicates an invalid or purged GTID. When an invalid GTID is provided, the binlog syncer will fall back to the first valid GTID.

Resolution: Verify the GTID set is valid and not purged:

icon/buttons/copy
-- Check if GTID is in executed set
SELECT GTID_SUBSET('your-gtid-set', @@GLOBAL.gtid_executed) AS in_executed;

-- Check if GTID is purged
SELECT GTID_SUBSET('your-gtid-set', @@GLOBAL.gtid_purged) AS in_purged;

Interpret the results as follows:

  • If in_executed returns 1 and in_purged returns 0, the GTID is valid for replication.
  • If in_purged returns 1, the GTID has been purged and you must find a newer consistent point.
  • If both return 0, the GTID doesn't exist in the records and is invalid.

If the GTID is purged or invalid, follow these steps:

  1. Increase binlog retention by configuring binlog_expire_logs_seconds in MySQL:

    icon/buttons/copy
    -- Increase binlog retention (example: 7 days = 604800 seconds)
    SET GLOBAL binlog_expire_logs_seconds = 604800;
    
    Note:

    For managed MySQL services (such as Amazon RDS, Google Cloud SQL, or Azure Database for MySQL), binlog retention is typically configured through the provider's console or CLI. Consult your provider's documentation for how to adjust binlog retention settings.

  2. Get a current GTID set to restart replication:

    icon/buttons/copy
    -- For MySQL < 8.0:
    SHOW MASTER STATUS;
    -- For MySQL 8.0+:
    SHOW BINARY LOG STATUS;
    
    +---------------+----------+--------------+------------------+-------------------------------------------+
    | File          | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set                         |
    +---------------+----------+--------------+------------------+-------------------------------------------+
    | binlog.000005 |      197 |              |                  | 77263736-7899-11f0-81a5-0242ac120002:1-38 |
    +---------------+----------+--------------+------------------+-------------------------------------------+
    

    Use the Executed_Gtid_Set value for the --defaultGTIDSet flag.

Invalid GTID format

Invalid GTIDs can occur when GTIDs are purged due to insufficient binlog retention, when connecting to a replica instead of the primary host, or when passing a GTID that has valid format but doesn't exist in the binlog history.

Resolution: Use a valid GTID from SHOW MASTER STATUS (MySQL < 8.0) or SHOW BINARY LOG STATUS (MySQL 8.0+) and ensure you're connecting to the primary host. If GTIDs are being purged, increase binlog retention.

Table/column names exceed 30 characters

Oracle LogMiner excludes tables and columns with names longer than 30 characters from redo logs.

Resolution: Rename tables and columns to 30 characters or fewer before migration.

Unsupported data types

LogMiner and replication do not support:

  • Long BLOB/CLOBs (4000+ characters)
  • User-defined types (UDTs)
  • Nested tables
  • Varrays
  • GEOGRAPHY and GEOMETRY

Resolution: Convert unsupported data types or exclude affected tables from replication.

LOB column UPDATE statements

UPDATE statements that only modify LOB columns are not supported by Oracle LogMiner.

Resolution: Avoid LOB-only updates during replication, or use Binary Reader for Oracle 12c.

JSONB null handling

SQL NULL and JSON null values are not distinguishable in JSON payloads during replication.

Resolution: Avoid using nullable JSONB columns where the distinction between SQL NULL and JSON null is important.

Missing redo logs or unavailable SCN

If the Oracle redo log files are too small or do not retain enough history, you may get errors indicating that required log files are missing for a given SCN range, or that a specific SCN is unavailable.

Resolution: Increase the number and size of online redo log files, and verify that archived log files are being generated and retained correctly in your Oracle environment.

Replicator lag

If the replicator process is lagging significantly behind the current Oracle SCN, you may see log messages like: replicator is catching up to the current SCN at 5000 from 1000…. This indicates that replication is progressing but is still behind the most recent changes on the source database.

Schema drift errors

Indicates source and target schemas are mismatched:

WARNING: schema drift detected in "database"."table" at payload object offset 0: unexpected columns: column_name

Resolution: Align schemas or use userscripts to transform data.

Apply flow failures

Apply flow failures occur when the target database encounters error conditions such as unique constraint violations, target database being unavailable, or incorrect data (missing or extraneous columns) during apply operations:

WARNING: warning during tryCommit: ERROR: duplicate key value violates unique constraint
ERROR: maximum number of retries (10) exceeded

Resolution: Check target database constraints and connection stability. MOLT Replicator will log warnings for each retry attempt. If you see warnings but no final error, the apply succeeded after retrying. If all retry attempts are exhausted, Replicator will surface a final error and restart the apply loop to continue processing.

Failback issues

If the changefeed shows connection errors in SHOW CHANGEFEED JOB:

Connection refused
transient error: Post "https://replicator-host:30004/molt/public": dial tcp [::1]:30004: connect: connection refused

This indicates that Replicator is down, the webhook URL is incorrect, or the port is misconfigured.

Resolution: Verify that MOLT Replicator is running on the port specified in the changefeed INTO configuration. Confirm the host and port are correct.

Incorrect schema path errors

This error occurs when the CockroachDB changefeed webhook URL path does not match the target database schema naming convention:

transient error: 400 Bad Request: unknown schema:

The webhook URL path is specified in the INTO clause when you create the changefeed. For example: webhook-https://replicator-host:30004/database/schema.

Resolution: Verify the webhook path format matches your target database type:

  • PostgreSQL or CockroachDB targets: Use /database/schema format. For example, webhook-https://replicator-host:30004/migration_schema/public.
  • MySQL targets: Use /database format (schema is implicit). For example, webhook-https://replicator-host:30004/migration_schema.
  • Oracle targets: Use /DATABASE format in uppercase. For example, webhook-https://replicator-host:30004/MIGRATION_SCHEMA.

For details on configuring the webhook sink URI, refer to Webhook sink.

GC threshold error
batch timestamp * must be after replica GC threshold

This indicates starting from an invalid cursor that has been garbage collected.

Resolution: Double-check the cursor to ensure it represents a valid range that has not been garbage collected, or extend the GC TTL on the source CockroachDB cluster:

icon/buttons/copy
ALTER DATABASE defaultdb CONFIGURE ZONE USING gc.ttlseconds = {gc_ttl_in_seconds};
Duplicated data re-application

This occurs when resuming a changefeed from a cursor causes excessive data duplication.

Resolution: Clear the staging database to prevent duplication. This deletes all checkpoints and buffered data, so use with caution:

icon/buttons/copy
DROP DATABASE _replicator;

For more targeted cleanup, delete mutations from specific staging tables:

icon/buttons/copy
DELETE FROM _replicator.employees WHERE true;

Examples

For detailed examples of using MOLT Replicator usage, refer to the migration workflow tutorials:

See also

×