Monday, November 18, 2013

Techniques to backup and restore mysql databases - A comparison

When I started working at Digital Mediat Solutions, one of the biggest problems that we had was the runtime of daily mysql backup scripts. They were taking 8 hours to complete. The first task I had was to fix the performance of the backups. The existing scripts were based on mysqldump. 

I compiled a list of options to look into by referring the web. I'm sharing this, so that someone else might find it useful. 

Out of the following options, the most feasible (economically, disk space wise and performance wise) option for us, was to use mysqlhotcopy. Now the backup script I have written based on  mysqlhotcopy, completes in 1 hours time, which is a significant imporvement. 


1) MySQL enterprise backup
  • incremental backup
  • compressed backup
  • Backing up the physical database files makes restore much faster than logical techniques such as the mysqldump command.
  • InnoDB tables are copied using a hot backup mechanism. (Ideally, the InnoDB tables should represent a substantial majority of the data.)
  • Tables from other storage engines are copied using a warm backup mechanism.

  • enterprise licensed
  • Minimalistic license per 1 mysql server instance per 1 year is XXXXUSD.
  • The cost will grow as the instances grow and the running time exceeds year by year.
  • It even exceeds our current budget for underlying hardware.
  • Therefore, I don’t think it’s feasible for our budget and IMV it’s not needed for the current deployment.

2) MySQL dumps with diffs
  • Saves space in the backed up device as we are backing up the diff.   
  • It won't result in network traffic as the only time the full dump is transferred, is when it’s run for the first time.

  • We are looking into stopping the overhead generated by the mysql dump command.
  • Since, we are using the same command over and over again, in this deployment, this wont satisfuy our need. Therefore, IMV this is not the option for us.

3) Make incremental backups by enabling the binary logs.
  • Used to set up replication
  • Used for restore operations
  • The above two pros outweigh the con.
  • Can do pinpoint restorations
  • Can tarball the existing logs and ship to the backup location. Just a matter of file zipping operation.

  • Makes performance slightly slower.

4) Suggestions to improve current backup scripts
  • Improve the cron job (which gets triggered on low usage period) to perform the slaving task to avoid high CPU utilization.

5) Should investigate and run some performance matrices on mysql-parallel-dump and mk-parallel-dump ( utilities.
  • mk-parallel-dump is deprecated now. So, it not feasible.
  • TODO: Talk to Matt regarding:
    • Setting up a dev env, so that I could test these stuff. I don’t need any new fancy hardware for these. Just give me a PC and I’ll configure it with linux and try these out. We don’t have to buy new hardware.
    • Need to test these in a dev env. So, that we could roll these out to a production environment.
    • Need to have a mock server (which have kind of the same traffic as the real servers), so that we could test and verify things properly

6) Remove the current scripts and use a tool specifically written for this and see how it performs.

7) Update the current scripts with --quick option which avoids buffering of large tables and does row by row backups. Resulting in faster restores. This will be a minor tweak to the current scripts.

8) Use of mysqlhotcopy instead of mysquldump
  • It uses FLUSH TABLES, LOCK TABLES, and cp or scp to make a database backup.
  • It is a fast way to make a backup of the database or single tables
  • but it can be run only on the same machine where the database directories are located.
  • mysqlhotcopy works only for backing up MyISAM and ARCHIVE tables.
  • Since we are using MyISAM, this wont be a problem for us.

9) Use SVN to backup databases.
  • This will help to have revisions.
  • I don’t think we need such a mechanism for us.

10) If replication to a slave isn't an option, we could leverage the filesystem, depending on the OS we are using,

No comments: