Monday, February 24, 2014

Fixing mysql replication errors in a master/slave setup

If you have a mysql master/slave replication setup and have run into replication errors, you can follow the below instructions to fix up the replication break and sync up the data.
1) Stop mysql on the slave.
service mysql stop
2) Login to the master.

3) Lock the master by running the following command
mysql> flush tables with read lock;
NOTE: No data can be written at this time to the master, so be quick in the following steps if your site is in production. It is important that you do not close mysql or this running ssh terminal. Login to a separate ssh terminal for the next part.

4) Now, in a separate ssh screen, type the following command from the master mysql server.
rsync -varPe ssh /var/lib/mysql root@IP_ADDRESS:/var/lib/ —delete-after
5) After this is done, now you must get the binlog and position from the master before you unlock the tables. Run the following command on master
mysql> show master status\G;
Then you’ll get some information on master status. You need the file and the position because that’s what you’re going to use on the slave in a moment. See step 10 on how this information is used, but please do not skip to step 10.

6) Unlock the master. It does not need to be locked now that you have the copy of the data and the log position.
mysql> unlock tables;
7) Login to the slave now via ssh. First remove the two files : /var/lib/mysql/ and /var/lib/mysql/

8) Start mysql on the slave.
service mysqld start
9) Immediately login to mysql and stop slave
mysql> slave stop; 
10) Now, you have to run the following query filling in the information from the show master status above (Step 5.)
mysql> MASTER_USER=‘replicate’, MASTER_PASSWORD=‘replicate’,
mysql> MASTER_LOG_FILE=‘mysql-bin.000001’, MASTER_LOG_POS=1234512351;
11) Now start the slave.
mysql > slave start
12) Check slave status.
mysql> show slave status\G;


Rich Hewett said...

Your rsync command in step 3/4 is obviously for Linux. I'm trying to do this on a Window 2008 server. You wouldn't happen to know the correct command for that, would you?

Heshan Suriyaarachchi said...

I'm sorry. I don't.

Rich Hewett said...

Is this attempting to sync the databases between the master and slave, or is this just logging into the slave?

Heshan Suriyaarachchi said...

This is trying to sync the data between master and slave. (In this setup my master is working as read/write and my slave is working as read only. This post address how to fix a replication break between master and slave). I hope that information was helpful to you.

Rich Hewett said...

That tells me what I need to know. Instead of using SSH to do this, I can do a MYSQLDUMP and transfer the data "by hand". It's a slower process, but without being fluent in SSH/rsync, it's a workaround.

Heshan Suriyaarachchi said...

Yes, you can do like that as well but the downside is, it will take a longer time to do that. In my experience with backing up mysql data, I had huge performance gains when backing up with mysqlhotcopy (for MyISAM) compared to mysqldump. Hope this helps.

Good luck in fixing the replication :). I know how much of a pain in the ass that is, in a production environment.