(mysql.info.gz) Replication FAQ
Info Catalog
(mysql.info.gz) Replication Options
(mysql.info.gz) Replication
(mysql.info.gz) Replication Problems
6.9 Replication FAQ
===================
*Q*: How do I configure a slave if the master is running and I do not
want to stop it?
*A*: There are several options. If you have taken a backup of the
master at some point and recorded the binary log name and offset (from
the output of `SHOW MASTER STATUS' ) corresponding to the snapshot, use
the following procedure:
1. Make sure that the slave is assigned a unique server ID.
2. Execute the following statement on the slave, filling in
appropriate values for each option:
mysql> CHANGE MASTER TO
-> MASTER_HOST='master_host_name',
-> MASTER_USER='master_user_name',
-> MASTER_PASSWORD='master_pass',
-> MASTER_LOG_FILE='recorded_log_file_name',
-> MASTER_LOG_POS=recorded_log_position;
3. Execute `START SLAVE' on the slave.
If you do not have a backup of the master server, here is a quick
procedure for creating one. All steps should be performed on the master
host.
1. Issue this statement:
mysql> FLUSH TABLES WITH READ LOCK;
2. With the lock still in place, execute this command (or a variation
of it):
shell> tar zcf /tmp/backup.tar.gz /var/lib/mysql
3. Issue this statement and make sure to record the output, which you
will need later:
mysql> SHOW MASTER STATUS;
4. Release the lock:
mysql> UNLOCK TABLES;
An alternative is to make an SQL dump of the master instead of a binary
copy as in the preceding procedure. To do this, you can use `mysqldump
--master-data' on your master and later load the SQL dump into your
slave. However, this is slower than making a binary copy.
No matter which of the two methods you use, afterward follow the
instructions for the case when you have a snapshot and have recorded
the log name and offset. You can use the same snapshot to set up
several slaves. Once you have the snapshot of the master, you can wait
to set up a slave as long as the binary logs of the master are left
intact. The two practical limitations on the length of time you can
wait are the amount of disk space available to retain binary logs on
the master and the length of time it will take the slave to catch up.
You can also use `LOAD DATA FROM MASTER'. This is a convenient
statement that transfers a snapshot to the slave and adjusts the log
name and offset all at once. In the future, `LOAD DATA FROM MASTER'
will be the recommended way to set up a slave. Be warned, however,
that it works only for `MyISAM' tables and it may hold a read lock for
a long time. It is not yet implemented as efficiently as we would
like. If you have large tables, the preferred method at this time is
still to make a binary snapshot on the master server after executing
`FLUSH TABLES WITH READ LOCK'.
*Q*: Does the slave need to be connected to the master all the time?
*A*: No, it does not. The slave can go down or stay disconnected for
hours or even days, then reconnect and catch up on the updates. For
example, you can set up a master/slave relationship over a dial-up link
where the link is up only sporadically and for short periods of time.
The implication of this is that, at any given time, the slave is not
guaranteed to be in sync with the master unless you take some special
measures. In the future, we will have the option to block the master
until at least one slave is in sync.
*Q*: How do I know how late a slave is compared to the master? In other
words, how do I know the date of the last query replicated by the slave?
*A*: If the slave is 4.1.1 or newer, read the `Seconds_Behind_Master'
column in `SHOW SLAVE STATUS'. For older versions, the following
applies. This is possible only if `SHOW SLAVE STATUS' on the slave
shows that the SQL thread is running (or for MySQL 3.23, that the slave
thread is running), and that the thread has executed at least one event
from the master. Replication Implementation Details.
When the slave SQL thread executes an event read from the master, it
modifies its own time to the event timestamp (this is why `TIMESTAMP'
is well replicated). In the `Time' column in the output of `SHOW
PROCESSLIST', the number of seconds displayed for the slave SQL thread
is the number of seconds between the timestamp of the last replicated
event and the real time of the slave machine. You can use this to
determine the date of the last replicated event. Note that if your
slave has been disconnected from the master for one hour, and then
reconnects, you may immediately see `Time' values like 3600 for the
slave SQL thread in `SHOW PROCESSLIST'. This would be because the slave
is executing statements that are one hour old.
*Q*: How do I force the master to block updates until the slave catches
up?
*A*: Use the following procedure:
1. On the master, execute these statements:
mysql> FLUSH TABLES WITH READ LOCK;
mysql> SHOW MASTER STATUS;
Record the log name and the offset from the output of the `SHOW'
statement. These are the replication coordinates.
2. On the slave, issue the following statement, where the arguments
to the `MASTER_POS_WAIT()' function are the replication coordinate
values obtained in the previous step:
mysql> SELECT MASTER_POS_WAIT('log_name', log_offset);
The `SELECT' statement will block until the slave reaches the
specified log file and offset. At that point, the slave will be in
sync with the master and the statement will return.
3. On the master, issue the following statement to allow the master
to begin processing updates again:
mysql> UNLOCK TABLES;
*Q*: What issues should I be aware of when setting up two-way
replication?
*A*: MySQL replication currently does not support any locking protocol
between master and slave to guarantee the atomicity of a distributed
(cross-server) update. In other words, it is possible for client A to
make an update to co-master 1, and in the meantime, before it
propagates to co-master 2, client B could make an update to co-master 2
that will make the update of client A work differently than it did on
co-master 1. Thus, when the update of client A makes it to co-master 2,
it will produce tables that are different than what you have on
co-master 1, even after all the updates from co-master 2 have also
propagated. This means that you should not co-chain two servers in a
two-way replication relationship unless you are sure that your updates
can safely happen in any order, or unless you take care of mis-ordered
updates somehow in the client code.
You must also realize that two-way replication actually does not improve
performance very much (if at all), as far as updates are concerned. Both
servers need to do the same number of updates each, as you would have
one server do. The only difference is that there will be a little less
lock contention, because the updates originating on another server will
be serialized in one slave thread. Even this benefit might be offset by
network delays.
*Q*: How can I use replication to improve performance of my system?
*A*: You should set up one server as the master and direct all writes
to it. Then configure as many slaves as you have the budget and
rackspace for, and distribute the reads among the master and the slaves.
You can also start the slaves with the `--skip-innodb', `--skip-bdb',
`--low-priority-updates', and `--delay-key-write=ALL' options to get
speed improvements on the slave end. In this case, the slave will use
non-transactional `MyISAM' tables instead of `InnoDB' and `BDB' tables
to get more speed.
*Q*: What should I do to prepare client code in my own applications to
use performance-enhancing replication?
*A*: If the part of your code that is responsible for database access
has been properly abstracted/modularized, converting it to run with a
replicated setup should be very smooth and easy. Just change the
implementation of your database access to send all writes to the master,
and to send reads to either the master or a slave. If your code does
not have this level of abstraction, setting up a replicated system will
give you the opportunity and motivation to it clean up. You should
start by creating a wrapper library or module with the following
functions:
* `safe_writer_connect()'
* `safe_reader_connect()'
* `safe_reader_statement()'
* `safe_writer_statement()'
`safe_' in each function name means that the function will take care of
handling all the error conditions. You can use different names for the
functions. The important thing is to have a unified interface for
connecting for reads, connecting for writes, doing a read, and doing a
write.
You should then convert your client code to use the wrapper library.
This may be a painful and scary process at first, but it will pay off in
the long run. All applications that use the approach just described
will be able to take advantage of a master/slave configuration, even
one involving multiple slaves. The code will be a lot easier to
maintain, and adding troubleshooting options will be trivial. You will
just need to modify one or two functions; for example, to log how long
each statement took, or which statement among your many thousands gave
you an error.
If you have written a lot of code, you may want to automate the
conversion task by using the `replace' utility that comes with standard
MySQL distributions, or just write your own conversion script.
Ideally, your code uses consistent programming style conventions. If
not, then you are probably better off rewriting it anyway, or at least
going through and manually regularizing it to use a consistent style.
*Q*: When and how much can MySQL replication improve the performance of
my system?
*A*: MySQL replication is most beneficial for a system with frequent
reads and infrequent writes. In theory, by using a
single-master/multiple-slave setup, you can scale the system by adding
more slaves until you either run out of network bandwidth, or your
update load grows to the point that the master cannot handle it.
In order to determine how many slaves you can get before the added
benefits begin to level out, and how much you can improve performance
of your site, you need to know your query patterns, and to determine
empirically by benchmarking the relationship between the throughput for
reads (reads per second, or `max_reads') and for writes (`max_writes')
on a typical master and a typical slave. The example here shows a
rather simplified calculation of what you can get with replication for
a hypothetical system.
Let's say that system load consists of 10% writes and 90% reads, and we
have determined by benchmarking that `max_reads' is 1200 - 2 *
`max_writes'. In other words, the system can do 1,200 reads per second
with no writes, the average write is twice as slow as the average read,
and the relationship is linear. Let us suppose that the master and each
slave have the same capacity, and that we have one master and N slaves.
Then we have for each server (master or slave):
`reads = 1200 - 2 * writes'
`reads = 9 * writes / (N + 1) ' (reads are split, but writes go to all
servers)
`9 * writes / (N + 1) + 2 * writes = 1200'
`writes = 1200 / (2 + 9/(N+1)) '
The last equation indicates that the maximum number of writes for N
slaves, given a maximum possible read rate of 1,200 per minute and a
ratio of nine reads per write.
This analysis yields the following conclusions:
* If N = 0 (which means we have no replication), our system can
handle about 1200/11 = 109 writes per second.
* If N = 1, we get up to 184 writes per second.
* If N = 8, we get up to 400 writes per second.
* If N = 17, we get up to 480 writes per second.
* Eventually, as N approaches infinity (and our budget negative
infinity), we can get very close to 600 writes per second,
increasing system throughput about 5.5 times. However, with only
eight servers, we increased it almost four times.
Note that these computations assume infinite network bandwidth and
neglect several other factors that could turn out to be significant on
your system. In many cases, you may not be able to perform a computation
similar to the just shown that will accurately predict what will happen
on your system if you add N replication slaves. However, answering the
following questions should help you decide whether and how much
replication will improve the performance of your system:
* What is the read/write ratio on your system?
* How much more write load can one server handle if you reduce the
reads?
* For how many slaves do you have bandwidth available on your
network?
*Q*: How can I use replication to provide redundancy/high availability?
*A*: With the currently available features, you would have to set up a
master and a slave (or several slaves), and write a script that will
monitor the master to see whether it is up. Then instruct your
applications and the slaves to change master in case of failure. Some
suggestions:
* To tell a slave to change its master, use the `CHANGE MASTER TO'
statement.
* A good way to keep your applications informed as to the location
of the master is by having a dynamic DNS entry for the master.
With `bind' you can use `nsupdate' to dynamically update your DNS.
* You should run your slaves with the `--log-bin' option and without
`--log-slave-updates'. This way the slave will be ready to become a
master as soon as you issue `STOP SLAVE'; `RESET MASTER', and
`CHANGE MASTER TO' on the other slaves. For example, assume that
you have the following setup:
WC
\
v
WC----> M
/ | \
/ | \
v v v
S1 S2 S3
M means the master, S the slaves, WC the clients that issue
database writes and reads; clients that issue only database reads
are not represented, because they need not switch. S1, S2, and S3
are slaves running with `--log-bin' and without
`--log-slave-updates'. Because updates received by a slave from
the master are not logged in the binary log unless
`--log-slave-updates' is specified, the binary log on each slave
is empty. If for some reason M becomes unavailable, you can pick
one slave to become the new master. For example, if you pick S1,
all WC should be redirected to S1, and S2 and S3 should replicate
from S1.
Make sure that all slaves have processed any statements in their
relay log. On each slave, issue `STOP SLAVE IO_THREAD', then
check the output of `SHOW PROCESSLIST' until you see `Has read all
relay log'. When this is true for all slaves, they can be
reconfigured to the new setup. On the slave S1 being promoted to
become the master, issue `STOP SLAVE' and `RESET MASTER'.
On the other slaves S2 and S3, use `STOP SLAVE' and `CHANGE MASTER
TO MASTER_HOST='S1'' (where `'S1'' represents the real hostname of
S1). To `CHANGE MASTER', add all information about how to connect
to S1 from S2 or S3 (user, password, port). In `CHANGE MASTER',
there is no need to specify the name of S1's binary log or binary
log position to read from: We know it is the first binary log and
position 4, which are the defaults for `CHANGE MASTER'. Finally,
use `START SLAVE' on S2 and S3.
Then instruct all WC to direct their statements to S1. From that
point on, all updates statements sent by WC to S1 are written to
the binary log of S1, which will contain exactly every update
statement sent to S1 since M died.
The result is this configuration:
WC
/
|
WC | M(unavailable)
\ |
\ |
v v
S1<--S2 S3
^ |
+-------+
When M is up again, you just have to issue on it the same `CHANGE
MASTER' as the one issued on S2 and S3, so that M becomes a slave
of S1 and picks all the WC writes it has missed while it was down.
To make M a master again (because it is the most powerful machine,
for example), use the preceding procedure as if S1 was unavailable
and M was to be the new master. During the procedure, don't forget
to run `RESET MASTER' on M before making S1, S2, and S3 slaves of
M. Otherwise, they may pick up old WC writes from before the point
at which M became unavailable.
We are currently working on integrating an automatic master election
system into MySQL, but until it is ready, you will have to create your
own monitoring tools.
Info Catalog
(mysql.info.gz) Replication Options
(mysql.info.gz) Replication
(mysql.info.gz) Replication Problems
automatically generated byinfo2html