thr3ads.net - rsync - Data corruption [Aug 2005]

Linus Hicks

2005-Aug-29 18:25 UTC

Data corruption

We used rsync 2.6.3 on a couple of Solaris 8 machines to update an Oracle 
database from one machine to another. Here is the procedure I used:

The source database was up and running so this operation was similar to doing a 
hot backup. I queried the source database for a list of tablespace names, and 
for each tablespace, I queried the list of datafiles. I put the tablespace in 
hot backup mode, which means that no updates are written to the datafiles; they 
will all go to the redo logs. Then I rsync'ed each datafile in that tablespace,
then took the tablespace out of hot backup mode. I repeated this for each
tablespace.
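
The per-tablespace sequence described above can be sketched as a dry run that only prints the steps. This is an illustration, not the script that was actually used: plan_tablespace is a hypothetical helper, the tablespace name ARD is an assumption, and the rsync options are left as a placeholder.

```shell
#!/bin/sh
# Dry-run sketch: print the steps for one tablespace instead of executing them.
# plan_tablespace is a hypothetical helper, not part of rsync or Oracle.
plan_tablespace() {
    ts=$1; shift
    # Put the tablespace into hot backup mode before copying its datafiles.
    echo "SQL> ALTER TABLESPACE ${ts} BEGIN BACKUP;"
    for df in "$@"; do
        # Pull each datafile from the source host to the same local path.
        echo "rsync <options> rmthost:${df} ${df}"
    done
    # Take the tablespace out of hot backup mode once all files are copied.
    echo "SQL> ALTER TABLESPACE ${ts} END BACKUP;"
}

# ARD is an assumed tablespace name; the paths are the two shown in the logs.
plan_tablespace ARD /c03/oradata/can/ard04.dbf /c03/oradata/can/ard05.dbf
```

Printing the plan first makes it easy to confirm that every BEGIN BACKUP is paired with an END BACKUP before anything touches the database.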

Early on in this process, I discovered I had a big performance problem and after
some experimentation I learned some important things.

Mainly, it was apparently defaulting to using whole-file mode, which is 
different from my past experience. Previously I had always supplied directories 
as the path to rsync, whereas this time I was doing individual files. I'm 
guessing that caused a different default behavior. After I started using 
--no-whole-file and --inplace, the situation improved. For files that had few 
differences, it was quite fast. However, for files that had lots of modified 
datablocks, it was still taking much longer than an rcp would. An rcp of a 4 GB 
datafile took about seven minutes whereas rsync with about 10% modified data 
took about half an hour as shown:

-- > Syncing Datafile: /c03/oradata/can/ard04.dbf @ Fri Aug 26 11:46:08 EDT 2005

Number of files: 1
Number of files transferred: 1
Total file size: 4294975488 bytes
Total transferred file size: 4294975488 bytes
Literal data: 403292160 bytes
Matched data: 3891683328 bytes
File list size: 72
Total bytes sent: 4194348
Total bytes received: 405243604

sent 4194348 bytes  received 405243604 bytes  239507.43 bytes/sec
total size is 4294975488  speedup is 10.49

-- > Syncing Datafile: /c03/oradata/can/ard05.dbf @ Fri Aug 26 12:14:37 EDT 2005
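
As a cross-check, the figures in the stats output for ard04.dbf can be recomputed with a little arithmetic. This is only a sketch: the variable names are made up for illustration, and rcp_seconds=420 is an assumption standing in for "about seven minutes".

```shell
#!/bin/sh
LC_ALL=C; export LC_ALL   # keep awk's decimal point as "."

# Figures copied from the --stats output for ard04.dbf.
total=4294975488      # Total file size
literal=403292160     # Literal data
matched=3891683328    # Matched data
sent=4194348          # Total bytes sent
received=405243604    # Total bytes received
rate=239507.43        # bytes/sec as reported
rcp_seconds=420       # assumption: "about seven minutes" for rcp

# Bytes that actually crossed the network in both directions.
wire=$(awk -v s="$sent" -v r="$received" 'BEGIN { printf "%d", s + r }')
# Speedup is the file size divided by the bytes on the wire.
speedup=$(awk -v t="$total" -v w="$wire" 'BEGIN { printf "%.2f", t / w }')
# Elapsed seconds implied by the reported transfer rate.
elapsed=$(awk -v w="$wire" -v r="$rate" 'BEGIN { printf "%.1f", w / r }')
# Share of the file that had to be sent as literal data.
literal_pct=$(awk -v l="$literal" -v t="$total" 'BEGIN { printf "%.1f", 100 * l / t }')
# Literal plus matched data should account for the whole file.
accounted=$(awk -v l="$literal" -v m="$matched" -v t="$total" \
    'BEGIN { print (l + m == t) ? "yes" : "no" }')
# How much longer rsync took than the assumed rcp time.
slowdown=$(awk -v e="$elapsed" -v c="$rcp_seconds" 'BEGIN { printf "%.1f", e / c }')

echo "speedup=$speedup elapsed=${elapsed}s literal=${literal_pct}%"
echo "literal+matched==total: $accounted, rsync/rcp time ratio: $slowdown"
```

The implied elapsed time is consistent with the roughly 28.5 minutes between the two "Syncing Datafile" timestamps, and the literal share matches the "about 10% modified data" estimate.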


Then when we started recovery on the destination database, Oracle complained 
about block zero being corrupted on six (out of more than 330) of the datafiles 
(one at a time). All of those were small, so I just used rcp to copy them (in 
hot backup mode). I started having misgivings then, but continued the process of
recovering the database and finally got to applying the next to last redo log 
and Oracle barfed on block corruption in one of our big datafiles.

All of the small datafiles that had block zero corrupted had a single block 
transferred via rsync. The process of opening a database and shutting it down 
will cause an update to block zero, and these datafiles are not really used 
during day-to-day operation, so it fits that rsync copied one block. In fact, 
there are a bunch of small datafiles similarly unused that had a single block 
transferred that Oracle did not complain about.

Here is the command line I used:

rsync -ptgoHS --stats --rsh=/usr/bin/rsh -B 8192 --no-whole-file --inplace \
rmthost:${df} ${df}

I probably shouldn't have used -H, and I saw a bug report about it, but can't
believe it is related to my corruption problem. Is it possible -S is involved 
somehow?

The data corruption of course makes rsync useless to me for copying databases, 
and I'm wondering now if other things I use it for are susceptible to the same
problem.

However, even if the corruption problem is fixed, the performance of rsync on 
large datafiles with more than a few percent of modified blocks may make it not 
worth using.

Any help is appreciated.

Linus

Wayne Davison

2005-Aug-30 02:12 UTC

Data corruption

On Mon, Aug 29, 2005 at 02:24:08PM -0400, Linus Hicks wrote:
> Mainly, it was apparently defaulting to using whole-file mode

If you're doing a local copy, --whole-file mode is *much* faster.  Using
--no-whole-file doubles your disk I/O, which is only a good thing if
your transfer is limited by network I/O.

> Is it possible -S is involved somehow?

Yes, using -S is incompatible with --inplace.  Unfortunately, rsync
doesn't (yet) complain and reject that combination of options.  I'll
check in some code for that now.  I'd recommend using --inplace and
--whole-file for the fastest local copy (that rsync is capable of).

..wayne..
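
Since the rsync version discussed in this thread does not reject -S together with --inplace, a wrapper script could check for the combination before invoking rsync. The following is a minimal sketch under stated assumptions: has_sparse_inplace_conflict is a hypothetical helper, not an rsync feature, and its option parsing is deliberately naive (it treats any short-option cluster containing "S" as sparse and does not understand short options with attached arguments).

```shell
#!/bin/sh
# has_sparse_inplace_conflict is a hypothetical helper, not part of rsync.
# It returns 0 (true) when the arguments request both sparse handling
# (-S or --sparse, including inside a cluster such as -ptgoHS) and --inplace.
has_sparse_inplace_conflict() {
    sparse=no; inplace=no
    for arg in "$@"; do
        case $arg in
            --sparse)  sparse=yes ;;
            --inplace) inplace=yes ;;
            --*)       ;;                # any other long option: ignore
            -*S*)      sparse=yes ;;     # short-option cluster containing S
        esac
    done
    [ "$sparse" = yes ] && [ "$inplace" = yes ]
}

# The options from the original command line trigger the check.
if has_sparse_inplace_conflict -ptgoHS --stats --rsh=/usr/bin/rsh -B 8192 \
        --no-whole-file --inplace; then
    echo "conflict: drop -S or --inplace"
fi

# The same options with -S removed from the cluster pass the check.
if ! has_sparse_inplace_conflict -ptgoH --stats --rsh=/usr/bin/rsh -B 8192 \
        --no-whole-file --inplace; then
    echo "ok: no -S with --inplace"
fi
```

The second invocation keeps --no-whole-file because the original copy was from a remote host; the --whole-file recommendation above applies to local copies.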
