Geo-Distributed MySQL Clusters for the Enterprise
Oracle MySQL InnoDB Cluster provides high availability and scalability for MySQL. An InnoDB Cluster consists of 3 or more MySQL instances in a local network with group replication enabled, MySQL Shell which is used for management of the cluster, and optionally MySQL Router to provide basic routing from applications to the cluster. While this provides high availability in a local region or site, it does not provide any provisions for disaster recovery (DR) or any multi-site deployment in general, so let’s explore how we could extend the functionality of InnoDB Cluster to deploy at geo-scale.
Linking Two (2) or More InnoDB Clusters
To create a multi-site InnoDB Cluster, it’s tempting to simply add more cluster nodes in a remote region. However, due to group replication and high latencies over the WAN, cluster and application performance would become unacceptable and thus this topology is not recommended. Instead, we turn to native MySQL replication, which is asynchronous and does not come with a significant performance penalty.
The steps to establish replication from Site 1 to Site 2 (DR) are:
- Create a cluster in Site 1.
- Create a cluster in Site 2 from Site 1. At this point, the 2 clusters are identical.
- Configure necessary firewall rules between sites to allow replication from Site 1 to Site 2.
- Also, configure the firewall to allow replication in the reverse direction (see below).
- Configure replication on Site 2 to read from Site 1.
- Must ignore schema “mysql_innodb_cluster_metadata” as this is cluster specific metadata that should not be replicated.
- For better fault tolerance of the replication stream, configure MySQL router as a proxy for Site 1 and use it as the replication source. This way, changes in the cluster in Site 1 won’t cause replication to stop.
- At this point, be sure to set up monitoring on both clusters so you are notified of any issues!
With the template above and mixing various technologies, you can achieve a DR site for your InnoDB Cluster. Note that using asynchronous replication (as above) to replicate between 2 InnoDB clusters is NOT SUPPORTED. One reason for this is the schema “mysql_innodb_cluster_metadata” that the local cluster maintains. If this schema gets corrupted, the entire cluster will become corrupted (which is why we ignore this database).
A “supported” version of this topology is simply to have a single node in the DR site.When configured this way there is no need for the replication filter and the maintenance of a second independent cluster. However, failing over to a single MySQL node is quite risky and unacceptable for business-critical continuous operations.
What Else to Consider with Geo-Distributed InnoDB Cluster?
Actually using native replication to join InnoDB Clusters (or 1 cluster + 1 standalone DR instance) requires us to plan a few additional items:
- How do we actually failover?
- Stop application traffic to the primary site.
- Promote the DR site/DR database as active (making it read/write).
- Repoint all application traffic to Site 2.
- Note that the above steps will take your application offline for however long it takes to perform the above operations.
- How do we fail back?
- We will have to reprovision Site 1 since the data is stale.
- Schedule downtime, do backup and restore, all nodes now in sync.
- Take backup, restore onto Site 1, then establish replication from Site 2 to Site 1.
- Stop application traffic to Site 2.
- Be sure replication traffic has caught up if using replication.
- Promote Site 1 as active.
- Reestablish replication from Site 1 to Site 2.
- Repoint application traffic to Site 2.
- Again, the above steps will incur downtime for your application.
- We will have to reprovision Site 1 since the data is stale.
- Monitoring and Management.
- Need to monitor:
- InnoDB Cluster in both sites.
- MySQL native replication.
- MySQL Router.
- Various tools required for each item above.
- Each technology needs to be managed separately, most likely using Do It Yourself (DIY) scripts.
- Requires development to view and manage the entire topology from a high level.
- Need to monitor:
Is This the Right Solution for Me?
Creating a Geo-deployed InnoDB Cluster has quite a few moving parts, and requires planning and testing for how the components will work together when they are deployed.
If you enjoy managing various technologies, have development/integration/testing time for the various elements, can afford downtime, and do not need support, then the solution could work for you.
If, however, you need a proven, complete, bulletproof solution with 24x7 support that takes care of all of the steps above, feel free to see how we do it here: MySQL Disaster Recovery Done Right! (Part I) | Blog.
Comments
Add new comment