AWS Database Migration Service (DMS) is a service, accessible from the AWS console, that migrates data to and from widely used databases such as Amazon Aurora, Oracle, MySQL, and PostgreSQL, among others. It supports both homogeneous migrations (the same engine on both sides) and heterogeneous migrations (different engines), and it formats the data for the target either through automatic conversion or with a manual conversion tool. In every case the flow is the same: DMS connects to the source database, reads the source data, formats it for consumption by the target database, and loads it into the target database.
How it Works
DMS creates only the target schema objects that the migration minimally requires, such as tables, primary keys, and unique indexes. It does not, however, create secondary indexes, non-primary key constraints, or data defaults. For those, AWS provides the Schema Conversion Tool, which can create or convert the complete schema in preparation for migration.
Following Amazon's steps, the first thing to do is create a replication instance, a managed instance that DMS runs on EC2, with enough storage and processing power to perform the migration tasks. The next step is to specify the database endpoints. Both the source and the target can be a database on an EC2 instance, an RDS DB instance, or an on-premises database. After specifying the endpoints, the next step is to create a task, which can specify which data to migrate, map data using the target schema, or even create new data on the target database. The final step is to monitor the running tasks. AWS provides different services for monitoring, but at a minimum it provides a table of statistics for each running task. This is a high-level overview of the basic steps. For more details and tips see here: http://docs.aws.amazon.com/dms/latest/sbs/DMS-SBS-Welcome.html
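As a rough illustration of the "create a task" step, the sketch below builds a table-mapping document with selection rules and passes it to a replication task. It assumes `dms` is a boto3 DMS client (for example `boto3.client("dms")`) and that the three ARNs come from a replication instance and endpoints created earlier. The schema name `sales`, the `audit_%` pattern, and the task name are hypothetical values chosen only for the example.

```python
import json


def selection_rule(rule_id, schema, table="%", action="include"):
    """Build one table-mapping selection rule ("%" acts as a wildcard)."""
    return {
        "rule-type": "selection",
        "rule-id": str(rule_id),
        "rule-name": str(rule_id),
        "object-locator": {"schema-name": schema, "table-name": table},
        "rule-action": action,
    }


def build_table_mappings(rules):
    """Serialize the rules into the JSON string a replication task expects."""
    return json.dumps({"rules": list(rules)})


def create_migration_task(dms, instance_arn, source_arn, target_arn, mappings):
    """Create a full-load task on an existing replication instance.

    `dms` is assumed to be a boto3 DMS client; the ARNs are assumed to come
    from resources that were created in the earlier steps.
    """
    return dms.create_replication_task(
        ReplicationTaskIdentifier="sales-full-load",  # hypothetical name
        SourceEndpointArn=source_arn,
        TargetEndpointArn=target_arn,
        ReplicationInstanceArn=instance_arn,
        MigrationType="full-load",
        TableMappings=mappings,
    )


# Hypothetical selection: every table in "sales" except the audit tables.
mappings = build_table_mappings([
    selection_rule(1, "sales"),
    selection_rule(2, "sales", "audit_%", "exclude"),
])
```

Taking the client as a parameter keeps the sketch independent of any particular account or region, and makes it easy to try the mapping logic without touching AWS.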
Things to Consider
Amazon Web Services and other sources list some basic requirements to consider when performing a database migration. The main point is that the process takes time. An AWS test of DMS concluded that the overall process can take anywhere from two weeks to several months. In that test, moving a terabyte of data took about 12 to 13 hours under ideal conditions, with source databases in EC2 and RDS migrating to target databases in RDS.
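To put the 12 to 13 hours per terabyte figure in perspective, here is a back-of-the-envelope estimate. It assumes transfer time scales linearly with data size and that the ideal-condition rate holds throughout, which real migrations rarely achieve, so treat the result as a lower bound for the data transfer alone rather than a forecast for the whole project.

```python
def estimate_full_load_hours(size_tb, hours_per_tb=(12.0, 13.0)):
    """Return a (low, high) estimate in hours for copying `size_tb` terabytes.

    Assumes linear scaling from the quoted 12-13 hours per terabyte.
    """
    low, high = hours_per_tb
    return size_tb * low, size_tb * high


def implied_throughput_mb_per_s(hours_per_tb):
    """Sustained rate implied by an hours-per-terabyte figure (1 TB = 10**6 MB)."""
    return 1_000_000 / (hours_per_tb * 3600)


low, high = estimate_full_load_hours(5)  # (60.0, 65.0) for a 5 TB database
print(f"5 TB: roughly {low:.0f} to {high:.0f} hours of transfer time")
print(f"Implied rate: about {implied_throughput_mb_per_s(13):.0f} "
      f"to {implied_throughput_mb_per_s(12):.0f} MB/s")
```

Even at this ideal rate, a 5 TB database needs about two and a half days of uninterrupted transfer, which is one reason the planning horizon stretches to weeks.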
Arguably, the most important part of the migration is the planning. In some cases, mapping out and planning the details of the migration can take longer than the migration itself. The team conducting the migration also needs a certain level of expertise: a solid understanding of both the AWS services involved and database migration in general is essential to a successful migration.
Finally, there are several ways within AWS itself to improve migration performance:
- Remove Bottlenecks on Targets: Avoid running processes on the target that compete with the migration for resources. Contention can make the migration take longer than needed.
- Use Multiple Tasks: Dividing the migration into multiple tasks sometimes helps, but make sure the data split across tasks doesn't participate in common transactions.
- Determine Optimum Size for Replication Instance: Know the size of the data and of the migration as a whole, so that the instance has the right amount of storage and processing power to migrate the data in the time available.
- Use the Task Log to Troubleshoot Issues: AWS provides a log of activity for each task, and following it can help find errors or problems with the task.
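The "Use Multiple Tasks" tip can be sketched as a small planning helper. The code below is not part of DMS. It is a hypothetical illustration that keeps tables sharing transactions in the same task, then spreads the resulting groups across tasks by size. The table names, sizes, and the `shared_transactions` pairs are made-up inputs, and every table named in a pair is assumed to appear in the size map.

```python
from collections import defaultdict


def group_tables(tables, shared_transactions):
    """Merge tables that appear together in a transaction (union-find)."""
    parent = {t: t for t in tables}

    def find(t):
        while parent[t] != t:
            parent[t] = parent[parent[t]]  # path halving
            t = parent[t]
        return t

    for a, b in shared_transactions:
        parent[find(a)] = find(b)

    groups = defaultdict(list)
    for t in tables:
        groups[find(t)].append(t)
    return list(groups.values())


def assign_to_tasks(table_sizes_gb, shared_transactions, task_count):
    """Greedily place each group, largest first, on the lightest task."""
    groups = group_tables(list(table_sizes_gb), shared_transactions)

    def group_size(group):
        return sum(table_sizes_gb[t] for t in group)

    groups.sort(key=group_size, reverse=True)
    tasks = [{"tables": [], "size_gb": 0} for _ in range(task_count)]
    for group in groups:
        lightest = min(tasks, key=lambda task: task["size_gb"])
        lightest["tables"].extend(group)
        lightest["size_gb"] += group_size(group)
    return tasks


# Hypothetical inventory: orders and order_items are written together.
sizes = {"orders": 400, "order_items": 300, "customers": 200, "logs": 500}
plan = assign_to_tasks(sizes, [("orders", "order_items")], task_count=2)
for number, task in enumerate(plan, start=1):
    print(number, task["tables"], task["size_gb"], "GB")
```

The greedy placement is only a heuristic for balancing load. The constraint that matters is the grouping step, since it is what keeps transactionally related tables out of separate tasks.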
Netflix is the most commonly cited example of a major corporation carrying out a total database migration. That's not to say that other large companies haven't migrated to the cloud, but Netflix seems to be the most public about it. Netflix cited the talent they have as the main reason for their migration success. In terms of the actual migration, Netflix stated that they used multiple replication instances simultaneously and prepared multiple redundancies in case there were any issues in the cloud. The biggest takeaway from Netflix is to always be prepared for failure. They store data across multiple availability zones, so that if one zone fails they can work through another. Netflix also advises buying more capacity than needed, in case there are large spikes or growth in data. The final thing to note about Netflix's migration is that they always backed one action with another. Whether the component was a simple instance or a complex architecture, they always had a backup plan in case something didn't work quite the right way.
For more information see here: