Introduction
Tungsten Replicator is a powerful tool with many great features, and today we will look at a new one - the ability to Pause and Resume a replication service, included with Tungsten versions 6.1.18+ and 7.0.2+.
Until now, the only way to stop a replication stream was to take the service offline, and put it back online when it is needed again. The online process involves a number of internal steps and overhead, and if there are many THL files on disk, it could take some time to index them all before the service can come fully online.
To remove the overhead of the startup time, the new pause/resume feature was developed for the Replicator, and is accessed via the `trepctl` command.
Use-Cases
The original need for this came from a customer who was using a variety of tools and home-grown scripts to sync cross-site services by time in a Multi-Site/Active-Active cluster. The customer wanted to cut down on the offline/online overhead, and requested the low-overhead pause feature instead.
You may have or think of other needs. As always, YMMV!
Command Summary
To begin with, here is the basic syntax for the commands:
shell> trepctl -service {SERVICE} pause -stage {STAGE} [-time {SECONDS}]
shell> trepctl -service {SERVICE} resume -stage {STAGE}
The Fine Print
Please note this paused state will not survive a replicator restart or a service offline/online.
You must specify one of the five stage names (`binlog-to-q`, `q-to-thl`, `remote-to-thl`, `thl-to-q`, `q-to-dbms`).
Add `-y` to avoid being prompted with the “Are you sure?” message when no time is specified to `trepctl pause`.
Running a `trepctl pause` command again will override the previous `trepctl pause` command.
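As a rough illustration of these rules, here is a minimal sketch of a helper that only prints the command it would run. The function name `build_pause_cmd` is hypothetical, and placing `-y` after the other options is an assumption, since the position of that flag is not spelled out here.

```shell
# Hypothetical helper: prints (does not run) a trepctl pause command line.
# Usage: build_pause_cmd SERVICE STAGE [SECONDS]
build_pause_cmd() (
  service="$1"; stage="$2"; seconds="${3:-}"

  # Only the five stage names from the fine print are accepted.
  case "$stage" in
    binlog-to-q|q-to-thl|remote-to-thl|thl-to-q|q-to-dbms) ;;
    *) echo "unknown stage: $stage" >&2; exit 1 ;;
  esac

  if [ -n "$seconds" ]; then
    # Timed pause: the prompt only appears when no time is given.
    case "$seconds" in
      *[!0-9]*) echo "seconds must be a whole number: $seconds" >&2; exit 1 ;;
    esac
    echo "trepctl -service $service pause -stage $stage -time $seconds"
  else
    # Indefinite pause: add -y to skip the prompt.
    # Assumption: -y is accepted after the other options.
    echo "trepctl -service $service pause -stage $stage -y"
  fi
)
```

Printing the command instead of running it makes it easy to review what a script is about to do before pointing it at a live replicator.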
Getting Status
To begin, let’s examine the commands needed to see the status of the pause.
We will use the `trepctl status -name stages` command to get at the information we want, and then pipe the output through `egrep` to grab just the desired lines.
For example, here is the status command run first on a primary, then on a secondary:
shell> trepctl -service north status -name stages | egrep 'Pause|pause|^name'
name : binlog-to-q
paused : false
name : q-to-thl
paused : false
shell> trepctl -service north status -name stages | egrep 'Pause|pause|^name'
name : remote-to-thl
paused : false
name : thl-to-q
paused : false
name : q-to-dbms
paused : false
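If you would rather have one line per stage, the same output can be folded with `awk`. This is a minimal sketch that assumes the `name`/`paused` layout shown above; the function name `stage_pause_summary` is hypothetical.

```shell
# Reads `trepctl status -name stages` output on stdin and prints
# one "STAGE PAUSED" pair per line.
stage_pause_summary() {
  awk -F'[ \t]*:[ \t]*' '
    { sub(/^[ \t]+/, "", $1) }          # drop any leading indentation
    $1 == "name"   { stage = $2 }       # remember the current stage
    $1 == "paused" { print stage, $2 }  # emit stage and its paused flag
  '
}

# Example (needs a running replicator):
#   trepctl -service north status -name stages | stage_pause_summary
```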
The Pause
The first step is to issue a `trepctl pause` command with or without a time to pause.
If no time is specified, the pause will last indefinitely.
For example, to pause the flow of events into the database server indefinitely:
shell> trepctl pause -stage q-to-dbms
Do you really want to suspend stage q-to-dbms of service north indefinitely ? Remember you have to resume once done... [yes/NO] yes
Pausing stage q-to-dbms of service north indefinitely - PLEASE REMEMBER TO RESUME WHEN DONE !
Remember, the paused state will not survive a replicator restart or a service offline/online.
To see the current state of the pause, use the `trepctl status -name stages` command.
For example, here is the status command run on the secondary after the above command:
shell> trepctl -service north status -name stages | egrep 'Pause|pause|^name'
name : q-to-dbms
paused : true
remainingPauseTime : indefinitely
...
Here is a different example. This one pauses the flow of events into the database server for 100 seconds; because a new pause command overrides the previous one, it replaces the indefinite pause set above:
shell> trepctl pause -stage q-to-dbms -time 100
Pausing stage q-to-dbms of service north for 100 seconds
shell> trepctl -service north status -name stages | egrep 'Pause|pause|^name'
name : q-to-dbms
paused : true
remainingPauseTime : 94
...
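For scripting, it can be handy to pull out just the `remainingPauseTime` value for one stage. Here is a minimal sketch, again assuming the output layout shown above; the function name `remaining_pause_time` is hypothetical.

```shell
# Reads `trepctl status -name stages` output on stdin and prints the
# remainingPauseTime value for the given stage.
# Prints nothing if the stage has no remainingPauseTime line.
# Usage: ... | remaining_pause_time STAGE
remaining_pause_time() {
  awk -F'[ \t]*:[ \t]*' -v want="$1" '
    { sub(/^[ \t]+/, "", $1) }    # drop any leading indentation
    $1 == "name" { stage = $2 }   # remember the current stage
    $1 == "remainingPauseTime" && stage == want { print $2 }
  '
}

# Example (needs a running replicator):
#   trepctl -service north status -name stages | remaining_pause_time q-to-dbms
```

The value is either a number of seconds or the word `indefinitely`, so a calling script should check for both.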
The Resume
To resume a stage before the timer expires, or after an indefinite pause, use the `trepctl resume` command:
shell> trepctl resume -stage q-to-dbms
Resuming stage q-to-dbms of service north
shell> trepctl -service north status -name stages | egrep 'Pause|pause|^name'
name : q-to-dbms
paused : false
...
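Putting pause and resume together, here is a minimal sketch of a wrapper that pauses a stage, runs a command, and then resumes. The function name `with_stage_paused` and the `TREPCTL` override variable are hypothetical. It uses a timed pause on purpose: if the script dies before the resume, the pause still ends when the timer runs out.

```shell
# Assumption: TREPCTL may be overridden (for example for a dry run);
# it defaults to trepctl found on the PATH.
TREPCTL="${TREPCTL:-trepctl}"

# Usage: with_stage_paused SERVICE STAGE SECONDS COMMAND [ARGS...]
with_stage_paused() (
  service="$1"; stage="$2"; seconds="$3"
  shift 3

  # Timed pause, so no "Are you sure?" prompt and no need for -y.
  "$TREPCTL" -service "$service" pause -stage "$stage" -time "$seconds" || exit 1

  # Run the caller's command while the stage is paused.
  "$@"
  rc=$?

  # Resume early instead of waiting for the timer to expire.
  "$TREPCTL" -service "$service" resume -stage "$stage" || exit 1

  # Report the exit status of the caller's command.
  exit "$rc"
)

# Example (needs a running replicator; my_sync_script.sh is a placeholder):
#   with_stage_paused north q-to-dbms 300 ./my_sync_script.sh
```

Pick a time comfortably longer than the wrapped command should take; if the timer expires first, the stage resumes while the command is still running.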
Q & A
Q. What if stage q-to-dbms is paused while a long transaction is in the middle of being applied to the replica’s database server? Will that transaction finish?
A. Yes. The pause does not apply to in-flight events, only to new events (threads), so a transaction that is already being applied will finish.
Wrap-Up
In this post we explored the new Replicator service trepctl pause/resume feature included with Tungsten version 6.1.18+/7.0.2+.
For more information, please see our Docs at https://docs.continuent.com/tungsten-clustering-6.1/cmdline-tools-trepctl-command-pause.html
Smooth sailing!