How to exclude some tables from RDS Mysql replication
Master System Design with Codemia
Enhance your system design skills with over 120 practice problems, detailed solutions, and hands-on exercises.
Understanding MySQL Replication in RDS
Amazon Relational Database Service (RDS) supports a variety of databases, including MySQL. One of the powerful features of MySQL is replication, which allows you to create copies of a database that can be used for scaling out reads, backups, and disaster recovery. Typically, replication involves replicating all databases and tables, but there are scenarios where you might want to exclude specific tables from replication.
Why Exclude Tables from Replication?
Exclusion of specific tables from replication can be beneficial in several scenarios:
- Reducing Replication Load: Some tables, like logs or audit trails, may generate a significant amount of data, which might not be necessary on replica servers.
- Privacy and Security: Sensitive information which is only relevant to certain environments (e.g., development) should not replicate to all servers.
- Optimizing Resource Usage: By excluding unnecessary tables, you can conserve disk space and processing power on your replica database instances.
Technical Approach to Excluding Tables
When you're working with MySQL on RDS, traditionally you would set replication filters to include or exclude tables. However, Amazon RDS places some limitations on direct access to the MySQL configurations. Instead, you will need to implement these solutions using other techniques:
Method 1: Using Blackhole Storage Engine on Source
- Identify Tables to Exclude: Determine which tables should not be included in the replication.
- Convert Tables on Master to Blackhole:
- On your master RDS instance, convert those tables to the
BLACKHOLEstorage engine. This engine will discard data written to it.
- The blackhole engine will still replicate structure changes but not any data changes.
Method 2: External Slave for Filtering
- Set Up an External Slave:
- Create a non-RDS MySQL instance as an intermediary slave. This instance has full configuration access.
- Apply Replication Filters:
- Use MySQL options like
replicate-do-db,replicate-ignore-db,replicate-do-table, andreplicate-ignore-tableto control which databases or tables are replicated.
- Replicate to RDS:
- Use the filtered external MySQL as a master to the RDS instances. The filters will ensure only selected tables get replicated.
Method 3: Application-Level Filtering
If you can modify the application:
- Selective Data Handling:
- Change the application logic to exclude writes to certain tables on replicas. This strategy requires significant application development and is best suited for specialized use cases.
Considerations and Limitations
- Blackhole Engine: This doesn't truly stop replication but ensures data isn't physically stored. It can still lead to increased binlog size.
- External Slave Complexity: Managing an external instance adds operational overhead and potential security concerns.
- RDS Limitations: RDS's managed nature can restrict direct access to certain configurations, necessitating workarounds like these.
Summary Table
| Strategy | Description | Pros | Cons | ||||
| Blackhole Storage Engine | Converts specific tables to blackhole | Simple to implement, effective on large data | Does not reduce binlog size, structure still replicates | External Slave Filtering | Uses intermediary MySQL instance for filtering | Full control over replication filtering | Adds complexity, extra instance to manage |
| Application-Level Filtering | Data handling at application level | Highly customizable | Requires significant application changes |
Conclusion
Excluding tables from MySQL replication in Amazon RDS requires careful planning and sometimes creative solutions due to the managed nature of RDS instances. Whether you decide to use the blackhole engine approach, an intermediary MySQL instance, or modify your application, each method has its advantages and limitations. Carefully assess your requirements and constraints before deciding on the appropriate strategy for your deployment.

