Overview
The Skinny
In this blog post we explore various options for tuning the MySQL server to improve Replicator applier performance on the replica nodes.
A Tungsten Cluster relies upon the Tungsten Replicator to move events from the primary node to the replicas. Once the event has been transferred to the replica as THL on disk, the applier on the replica will then attempt to write it to the database. Tungsten Replicator can only apply events as fast as MySQL allows. If the MySQL server is somehow slow or blocking, then the Replicator will be as well.
A properly-tuned database server, combined with infrastructure and SysAdmin best practices, will go a long way towards high-performance apply operations on the replica nodes.
The Question
Recently, a customer asked us:
During one of our load tests, we had a peak of 60k writes/min, averaging around 40k w/m. During this period, we saw an applied latency of 100-160 seconds. Is it possible to improve the replication latency on the replica nodes?
The Summary
Where To Look and What For?
When performance tuning, we normally examine the configuration and logs for the following layers:
- Tungsten Replicator
- MySQL server
- Storage
- Network/Firewall
- Hardware & OS
The Rules
What are Performance Tuning Best Practices?
Before we dive into the various layers, let's review the performance tuning basics that I have learned over the years:
- Performance tuning is a slow, iterative process.
- Change one thing at a time and test carefully.
- Document your changes and the associated test results.
- Go for the low-hanging fruit early on. Sometimes the smallest change has the largest result, e.g. adding a proper index to a table.
- As soon as you remove one bottleneck, the next one will appear. Iterate!
- Divide and Conquer - pick a spot and ensure everything is working properly on both sides. This approach often helps to isolate issues.
- Start at the top and work your way down the layers. Test, test, test!
- Start at the bottom and work your way up the layers. Test, test, test!
- When in doubt, ask, then document.
- Read the documentation. Then read it again.
- Ensure consistency amongst all nodes at every layer.
- Most of all, do not assume.
Tungsten Replicator
The Short Version
We always start off by checking the replicator applier stages, of which there are three.
On a replica node, if the appliedLatency of the remote-to-thl stage is low, but the appliedLatency of both the thl-to-q and q-to-dbms stages is high, then the issue is almost always with the MySQL server somehow.
For example, on a healthy cluster:
On a Primary Node:
shell> trepctl status -name tasks | egrep 'applied|stage'
appliedLastEventId : mysql-bin.000046:0000000065129638;-1
appliedLastSeqno : 2656221
appliedLatency : 0.279
stage : binlog-to-q
appliedLastEventId : mysql-bin.000046:0000000065129638;-1
appliedLastSeqno : 2656221
appliedLatency : 0.279
stage : q-to-thl
On a Replica Node:
shell> trepctl status -name tasks | egrep 'applied|stage'
appliedLastEventId : mysql-bin.000046:0000000065077978;-1
appliedLastSeqno : 2656191
appliedLatency : 0.345
stage : remote-to-thl
appliedLastEventId : mysql-bin.000046:0000000065077978;-1
appliedLastSeqno : 2656191
appliedLatency : 0.371
stage : thl-to-q
appliedLastEventId : mysql-bin.000046:0000000065077978;-1
appliedLastSeqno : 2656191
appliedLatency : 0.374
stage : q-to-dbms
For more information about stages, please visit the following blog post: Mastering Tungsten Replicator Series: Understanding Pipelines and Stages.
Quick Tech "Tool Tip"
Here is a handy command to display a summary of the progress of each stage on that node, refreshing every two seconds:
trepctl perf -r 2
MySQL Server
What a Lovely Tune
Each one of these suggestions can bear performance fruit. As always, your mileage may vary!
Slow Query Log To The Rescue
The first thing to do is enable the slow query log to validate that nothing is blocking faster queries behind it. Since the applier is single-threaded, just one repeated slow query can create a huge bottleneck.
To control the slow query log, use the following statements:
Set the global system variable slow_query_log to 0 to disable the log, or to 1 to enable it:
mysql> SELECT @@global.slow_query_log;
mysql> SET GLOBAL slow_query_log = 0;
mysql> SET GLOBAL slow_query_log = 1;
Set the global system variable slow_query_log_file to specify the name of the log file:
mysql> SELECT @@global.slow_query_log_file;
+---------------------------------------+
| @@global.slow_query_log_file          |
+---------------------------------------+
| /volumes/data/mysql/db1-demo-slow.log |
+---------------------------------------+
1 row in set (0.00 sec)
mysql> SET GLOBAL slow_query_log_file = "/volumes/data/mysql/db1-slow-query.log";
Set the global system variable long_query_time to define the threshold, in seconds, above which statements are written to the slow query log (the default is 10):
mysql> SELECT @@global.long_query_time;
mysql> SET GLOBAL long_query_time = 1;
- Next, running SHOW FULL PROCESSLIST on the replicas during the latency spike would be helpful to identify what may be locking or blocking.
- Use the MySQL EXPLAIN command to better understand why queries are slow.
- Does every table have a Primary Key? (A query to find tables without one appears after this list.)
- What type of replication is enabled in MySQL - STATEMENT, MIXED or ROW? For active/passive clusters, we recommend MIXED, and for active/active clusters, use ROW to ensure data integrity across masters. ROW will increase latency due to the sheer volume of data transmitted as compared to STATEMENT or MIXED.
- Make sure there are no redundant indexes.
- Also, check for the use of unique secondary indexes, which disable the InnoDB change buffering performance boost. InnoDB's change buffer (previously called the insert buffer) delays maintenance of non-unique secondary indexes and merges the writes; it can boost insert performance considerably and is enabled by default.
- Are the tables InnoDB? If so, do you have lots of memory? Can you load the tables into RAM and then sync to disk? This could help avoid the physical, serial disk I/O that can create bottlenecks, even if the I/O channel itself is not full.
- Finally, how large are the affected tables and schemas in terms of row size and byte size? Can the tables be split? Can the databases be split?
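For example, two quick checks related to the items above (generic MySQL queries, not Tungsten-specific): confirm the replication format in use, and list any base tables that lack a primary key:
mysql> SELECT @@global.binlog_format;
mysql> SELECT t.table_schema, t.table_name
    ->   FROM information_schema.tables t
    ->   LEFT JOIN information_schema.table_constraints c
    ->     ON c.table_schema = t.table_schema
    ->    AND c.table_name = t.table_name
    ->    AND c.constraint_type = 'PRIMARY KEY'
    ->  WHERE t.table_type = 'BASE TABLE'
    ->    AND c.constraint_name IS NULL
    ->    AND t.table_schema NOT IN ('mysql','information_schema','performance_schema','sys');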
Key MySQL my.cnf Configuration Parameters to Check:
- innodb_log_file_size
- innodb_flush_method
- innodb_flush_neighbors
- innodb_flush_log_at_trx_commit
- innodb_io_capacity
- innodb_io_capacity_max
- innodb_lru_scan_depth
- innodb_adaptive_hash_index
- performance_schema
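To make those parameters concrete, here is an illustrative my.cnf sketch. The values are examples only (a few are taken from the customer case in "The Answer" below) and must be validated against your own workload, hardware and MySQL version:
[mysqld]
# Example sizing only -- larger redo logs absorb write bursts
innodb_log_file_size           = 1G
# O_DIRECT avoids double-buffering data through the OS page cache
innodb_flush_method            = O_DIRECT
# 0 suits SSD/EBS storage; neighbor flushing mainly helps spinning disks
innodb_flush_neighbors         = 0
# 1 is the safe default; see the bonus tip below before changing it on replicas
innodb_flush_log_at_trx_commit = 1
# Match I/O capacity to what the storage can actually sustain
innodb_io_capacity             = 1000
innodb_io_capacity_max         = 2000
innodb_lru_scan_depth          = 2048
innodb_adaptive_hash_index     = 0
performance_schema             = 0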
Storage
Disk Inside and Out
- Disk I/O is normally the slowest part of any compute workload, because even fast disks are far slower than physical memory. No matter how fast the underlying SSD is, there are physical limitations to overcome, especially with SAN and Cloud-based storage.
- For example, when using AWS EC2, consider using enhanced EBS I/O with provisioned IOPS if you are not already doing so, which would provide faster performance at an increased cost.
- Not having separate volumes for different purposes will always degrade performance because of disk contention. For example, ensure separate volumes for (at least) root, the MySQL data directory, the MySQL binary logs and the THL files.
- Then stripe each MySQL filesystem over multiple volumes using RAID of some sort (e.g. RAID-0), so that there are multiple I/O channels in use at the same time for the same filesystem.
- This layered disk architecture provides for a large number of parallel disk I/O channels, giving a much higher throughput at a much lower latency.
- What is the filesystem type (e.g. xfs, ext4)? Journaling filesystems can be very slow, so consider using a non-journaling filesystem or disabling journaling. We suggest using xfs.
- Mount the filesystems using the noatime option.
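As an illustration, separate, noatime-mounted filesystems might be declared in /etc/fstab along these lines; the device names and mount points below are placeholders:
# Placeholders -- substitute your actual devices, mount points and filesystem
/dev/nvme1n1  /volumes/data     xfs  defaults,noatime  0 0
/dev/nvme2n1  /volumes/binlogs  xfs  defaults,noatime  0 0
/dev/nvme3n1  /volumes/thl      xfs  defaults,noatime  0 0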
Network/Router/Firewall
The Network Really IS the Computer
- Ensure sufficient bandwidth and a very low error rate (a quick check is shown after this list).
- Confirm that the various hops have sufficient CPU, RAM and bandwidth.
- Is the firewall able to keep up?
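For a quick look at per-interface errors and drops on each node, the standard iproute2 tools are usually enough (the interface name below is a placeholder); for raw bandwidth testing between nodes, a tool such as iperf3 works well:
shell> ip -s link show eth0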
Operating System
SysAdmins to the Rescue, As Usual
- First, verify the documented prerequisites for Tungsten hosts.
- Check the number of open files (nofile) and the max user processes (nproc):
tungsten@db1-demo:/home/tungsten # cat /etc/security/limits.conf
* - nofile 65535
* - nproc 8096
tungsten@db1-demo:/home/tungsten # ulimit -n
65535
tungsten@db1-demo:/home/tungsten # ulimit -u
8096
tungsten@db1-demo:/home/tungsten # ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 31374
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 65535
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 8096
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited
- Is there enough CPU?
- Is there enough RAM?
- Check the various system logs for any errors or warnings.
- If using systemd, check the systemctl status of each installed component for errors or warnings:
systemctl status tconnector
systemctl status tmanager
systemctl status treplicator
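To dig into a specific unit's recent output, journalctl can also help; the time window below is just an example:
shell> journalctl -u treplicator --since "1 hour ago"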
Advanced Tungsten Replicator
The Parallel (Apply) Universe
After making sure that all reasonable efforts have been made to properly evaluate the above solutions, there is an advanced feature available to try: Parallel Apply.
Tungsten Replicator uses a single-threaded applier by default, so it tops out at roughly 10,000 updates per second, depending on round-trip time and other factors. Since the MySQL server is multi-threaded, the primary is able to write faster than the replicas can apply.
To increase that limit, Tungsten Replicator offers Parallel Apply, which employs multiple replication apply threads, one per shard.
By default, we shard by database, with one shard created per database schema. If there is a cross-database query, all other threads block until that one completes, slowing performance down dramatically.
This means Parallel Apply is best suited to environments where writes are spread evenly across all databases. Having many databases, but only one or two of them hot, defeats the design and purpose of Parallel Apply.
Again, any cross-shard query will force the Replicator back into single-threaded mode to ensure data integrity, resulting in no performance gain, or even degradation, since the Replicator now has to keep switching modes.
Tungsten Replicator can also shard by table, with the same caveats, but this time cross-TABLE queries will block other threads, making this somewhat less useful than desired based on most query designs.
Important questions before implementing Parallel Apply:
- Do you have just one database schema or multiple?
- If single, do the bulk of your queries cross tables or use single tables?
- If multiple databases, are they de-coupled or do you do a bunch of cross-database queries?
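For reference, here is a hedged sketch of how Parallel Apply might be enabled through tpm. The option names (svc-parallelization-type, channels) are recalled from the Continuent documentation and should be verified for your release; the service name "alpha" is a placeholder and the channel count is an example only:
# tungsten.ini fragment -- placeholder service name and example channel count
[alpha]
svc-parallelization-type=disk
channels=10
Then run tpm update to apply the change.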
The Answer
What WAS the Solution After All?
For this customer, the following changes improved performance to acceptable levels:
- The THL was being written to the same location as the MySQL binary logs, creating contention. Moving the THL directory to a different location (/logs) improved performance.
- Because the nodes are AWS EC2 instances, converting volumes from GP2 to Provisioned IOPS allowed the replica to keep up with replication much better.
- Several MySQL configuration values were modified, which also decreased the applied latency on the replica nodes:
innodb_io_capacity=1000 (interestingly, lowering this from 10k to 1k led to a significant improvement in latency, despite the EBS volume being provisioned for 10k IOPS)
innodb_io_capacity_max=2000
innodb_adaptive_hash_index=0
innodb_lru_scan_depth=2048
performance_schema=0
Bonus Performance Tuning Tip
Tungsten versions >= 7.0.0 have a new tpm option, replica-tx-commit-level, which allows you to adjust the innodb_flush_log_at_trx_commit value on the replica nodes, potentially resulting in gains of up to 75%. With innodb_flush_log_at_trx_commit=2, MySQL does not need to flush to disk upon each commit, instead passing that responsibility to the operating system sync operation. There is a small window for data loss in the event of a host crash, but if you deem that risk acceptable on a Replica, you can get a huge boost in performance. When enabled, this feature makes the change for you automatically and ensures that the value remains 1 on the Primary node.
Valid values are either 1 (the default) or 2. Setting this option to 2 will change the underlying database variable innodb_flush_log_at_trx_commit to 2 when the node becomes a Replica, resetting it back to 1 if the node is promoted to Primary.
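As a sketch, assuming the standard INI-style tpm configuration, enabling this might look like the following; the file path and section are placeholders to adapt to your installation:
# /etc/tungsten/tungsten.ini -- placeholder location and section
[defaults]
replica-tx-commit-level=2
Then run tpm update so the Replicator manages innodb_flush_log_at_trx_commit per role, as described above.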
The Library
Please read the docs!
For more information about how to use the Tungsten Replicator, please see Mastering Tungsten Replicator Series: Command Line Superpowers.
For more in-depth discussion about MySQL database tuning, here are some excellent blog posts by Morgan Tocker:
- http://www.tocker.ca/2013/09/17/what-to-tune-in-mysql-56-after-installation.html
- http://www.tocker.ca/2013/05/06/when-does-mysql-perform-io.html
For more technical information about Tungsten clusters, please visit our documentation.
Summary
The Wrap-Up
Clearly, there are many, many things to think about when it comes to MySQL performance tuning - this blog post barely scratches the surface of the subject. Remember, performance tuning is all about iterative effort over time!
Tungsten Clustering is the most flexible, performant global database layer available today – use it underlying your SaaS offering as a strong base upon which to grow your worldwide business!
For more information, please visit our products page.
Want to learn more or run a POC? Contact us.