Michael Peek
2013-Aug-16 15:18 UTC
[Gluster-users] Problems with data integrity between client, volume, and replicated bricks
Hi gurus,
I've been banging my head against a test volume for about a month and a
half now, and I'm having some serious problems figuring out what's going
on.
I'm running on Ubuntu 12.04 amd64
I'm running Gluster 3.4.0final-ubuntu1~precise1
My cluster is made up of four machines, each machine has two 4TB HDDs
(ext4), with replication
My test client has an HDD with 913GB of test data in 156,544 files
Forgive the weird path names, but I wanted a setup akin to the real data
I'll be using, and in production there will be weird path names aplenty.
I include the path names here just in case someone sees something
obvious, like "You compared the wrong files" or "You can't use path
names like that with gluster!" But for your reading pleasure, I also
list the output below with the path names removed, so that you can
clearly see the similarities and differences from client to volume to
brick.
Disclaimer: I have done some outage tests with this volume in the past
by unplugging a drive, plugging it back in, and then doing a full heal.
The volume currently shows 1023 failed heals on bkupc1-b:/export/b/
(brick #2). But that was before I started this particular test. For
this test all the old files and directories had been deleted from the
volume beforehand so that I could start with an empty volume. And no
outages -- simulated or otherwise -- have taken place for this test. (I
have confirmed that every file listed by gluster as heal-failed no
longer exists. And yet, even though I have deleted the volume's
contents, the failed heals count remains.) I thought this might be
important to disclose. If so desired I can repeat the test after
deleting the volume and recreating it from scratch. However, once in
production, doing this would be highly unfeasible as a solution to a
problem. So if this is the cause of my angst, then I'd rather know how
to fix things as they sit now as opposed to scrapping the volume and
starting anew.
Here's a detailed description of my latest test:
1) The client mounts the volume with fuse.glusterfs
(rw,default_permissions,allow_other,max_read=131072) as /data/bkupc1
2) I perform an rsync of the data to the volume. I have the whole test
scripted and I'll list the juicy bits:
cd /export/d/eraseme/
if [ -d /data/bkupc1/BACKUPS/ ]; then
mv /data/bkupc1/BACKUPS /data/bkupc1/BACKUPS.old
( /bin/rm -fr /data/bkupc1/BACKUPS.old & )
fi
mkdir /data/bkupc1/BACKUPS
rsync \
-a \
-v \
--delete \
--delete-excluded \
--force \
--ignore-errors \
--one-file-system \
--progress \
--stats \
--exclude '/tmp' \
--exclude '/var/tmp' \
--exclude '**core' \
--partial \
--inplace \
./ \
/data/bkupc1/BACKUPS/
NOTE: If the directory /data/bkupc1/BACKUPS/ exists from a previous run
of this test then I move it, and then delete it in the background while
rsync is running.
Output:
...
Number of files: 156554
Number of files transferred: 147980
Total file size: 886124490325 bytes
Total transferred file size: 886124487184 bytes
Literal data: 886124487184 bytes
Matched data: 0 bytes
File list size: 20189800
File list generation time: 0.001 seconds
File list transfer time: 0.000 seconds
Total bytes sent: 886258975318
Total bytes received: 2845881
sent 886258975318 bytes received 2845881 bytes 45981053.79 bytes/sec
total size is 886124490325 speedup is 1.00
3) My client has md5 checksums for its files, so next my script checks
the files on the volume:
cd /data/bkupc1/BACKUPS/
md5sum -c --quiet md5sums
data/884b9a38-0443-11e3-b8fb-f46d04e15793/884a7040-0443-11e3-b8fb-f46d04e15793/87fdc790-0443-11e3-b8fb-f46d04e15793/87fb6cfc-0443-11e3-b8fb-f46d04e15793/gF-Eqm1GHw7NPNQOQoeJLNNlfL5ydR0FzVZDHdK9OShRHknwgkqCG0M1yWnryQ,cdfk6Ysdk99eoncEHxnDrEQZF:
FAILED
md5sum: WARNING: 1 computed checksum did NOT match
a) Taking a closer look at this file:
On the client:
root at client:/export/d/eraseme# ls -ald
data/884b9a38-0443-11e3-b8fb-f46d04e15793/884a7040-0443-11e3-b8fb-f46d04e15793/87fdc790-0443-11e3-b8fb-f46d04e15793/87fb6cfc-0443-11e3-b8fb-f46d04e15793/gF-Eqm1GHw7NPNQOQoeJLNNlfL5ydR0FzVZDHdK9OShRHknwgkqCG0M1yWnryQ,cdfk6Ysdk99eoncEHxnDrEQZF
-rw-r--r-- 1 peek peek 646041328 Nov 13 2009
data/884b9a38-0443-11e3-b8fb-f46d04e15793/884a7040-0443-11e3-b8fb-f46d04e15793/87fdc790-0443-11e3-b8fb-f46d04e15793/87fb6cfc-0443-11e3-b8fb-f46d04e15793/gF-Eqm1GHw7NPNQOQoeJLNNlfL5ydR0FzVZDHdK9OShRHknwgkqCG0M1yWnryQ,cdfk6Ysdk99eoncEHxnDrEQZF
On the volume:
root at bkupc1-a:/data/bkupc1/BACKUPS# ls -ald
data/884b9a38-0443-11e3-b8fb-f46d04e15793/884a7040-0443-11e3-b8fb-f46d04e15793/87fdc790-0443-11e3-b8fb-f46d04e15793/87fb6cfc-0443-11e3-b8fb-f46d04e15793/gF-Eqm1GHw7NPNQOQoeJLNNlfL5ydR0FzVZDHdK9OShRHknwgkqCG0M1yWnryQ,cdfk6Ysdk99eoncEHxnDrEQZF
-rw-r--r-- 1 peek peek 646041328 Nov 13 2009
data/884b9a38-0443-11e3-b8fb-f46d04e15793/884a7040-0443-11e3-b8fb-f46d04e15793/87fdc790-0443-11e3-b8fb-f46d04e15793/87fb6cfc-0443-11e3-b8fb-f46d04e15793/gF-Eqm1GHw7NPNQOQoeJLNNlfL5ydR0FzVZDHdK9OShRHknwgkqCG0M1yWnryQ,cdfk6Ysdk99eoncEHxnDrEQZF
On the raw bricks:
root at bkupc1-a:/export# ls -ald
./*/glusterfs/BACKUPS/data/884b9a38-0443-11e3-b8fb-f46d04e15793/884a7040-0443-11e3-b8fb-f46d04e15793/87fdc790-0443-11e3-b8fb-f46d04e15793/87fb6cfc-0443-11e3-b8fb-f46d04e15793/gF-Eqm1GHw7NPNQOQoeJLNNlfL5ydR0FzVZDHdK9OShRHknwgkqCG0M1yWnryQ,cdfk6Ysdk99eoncEHxnDrEQZF
-rw-r--r-- 2 peek peek 646041328 Nov 13 2009
./a/glusterfs/BACKUPS/data/884b9a38-0443-11e3-b8fb-f46d04e15793/884a7040-0443-11e3-b8fb-f46d04e15793/87fdc790-0443-11e3-b8fb-f46d04e15793/87fb6cfc-0443-11e3-b8fb-f46d04e15793/gF-Eqm1GHw7NPNQOQoeJLNNlfL5ydR0FzVZDHdK9OShRHknwgkqCG0M1yWnryQ,cdfk6Ysdk99eoncEHxnDrEQZF
root at bkupc1-b:/export# ls -ald
./*/glusterfs/BACKUPS/data/884b9a38-0443-11e3-b8fb-f46d04e15793/884a7040-0443-11e3-b8fb-f46d04e15793/87fdc790-0443-11e3-b8fb-f46d04e15793/87fb6cfc-0443-11e3-b8fb-f46d04e15793/gF-Eqm1GHw7NPNQOQoeJLNNlfL5ydR0FzVZDHdK9OShRHknwgkqCG0M1yWnryQ,cdfk6Ysdk99eoncEHxnDrEQZF
-rw-r--r-- 2 peek peek 646041328 Nov 13 2009
./a/glusterfs/BACKUPS/data/884b9a38-0443-11e3-b8fb-f46d04e15793/884a7040-0443-11e3-b8fb-f46d04e15793/87fdc790-0443-11e3-b8fb-f46d04e15793/87fb6cfc-0443-11e3-b8fb-f46d04e15793/gF-Eqm1GHw7NPNQOQoeJLNNlfL5ydR0FzVZDHdK9OShRHknwgkqCG0M1yWnryQ,cdfk6Ysdk99eoncEHxnDrEQZF
To make this more readable, here's the output with the path stripped
off, listed in the order given above:
-rw-r--r-- 1 peek peek 646041328 Nov 13 2009 <-- client
-rw-r--r-- 1 peek peek 646041328 Nov 13 2009 <-- volume
-rw-r--r-- 2 peek peek 646041328 Nov 13 2009 <-- brick #1
-rw-r--r-- 2 peek peek 646041328 Nov 13 2009 <-- brick #2
Good: Size, permissions, ownership, and time all match.
b) MD5 checksums:
On the client:
root at catus:/export/d/eraseme# md5sum
data/884b9a38-0443-11e3-b8fb-f46d04e15793/884a7040-0443-11e3-b8fb-f46d04e15793/87fdc790-0443-11e3-b8fb-f46d04e15793/87fb6cfc-0443-11e3-b8fb-f46d04e15793/gF-Eqm1GHw7NPNQOQoeJLNNlfL5ydR0FzVZDHdK9OShRHknwgkqCG0M1yWnryQ,cdfk6Ysdk99eoncEHxnDrEQZF
52b8f8166ef4303bd7b897e8cc6a86c0
data/884b9a38-0443-11e3-b8fb-f46d04e15793/884a7040-0443-11e3-b8fb-f46d04e15793/87fdc790-0443-11e3-b8fb-f46d04e15793/87fb6cfc-0443-11e3-b8fb-f46d04e15793/gF-Eqm1GHw7NPNQOQoeJLNNlfL5ydR0FzVZDHdK9OShRHknwgkqCG0M1yWnryQ,cdfk6Ysdk99eoncEHxnDrEQZF
On the volume:
root at bkupc1-a:/data/bkupc1/BACKUPS# md5sum
data/884b9a38-0443-11e3-b8fb-f46d04e15793/884a7040-0443-11e3-b8fb-f46d04e15793/87fdc790-0443-11e3-b8fb-f46d04e15793/87fb6cfc-0443-11e3-b8fb-f46d04e15793/gF-Eqm1GHw7NPNQOQoeJLNNlfL5ydR0FzVZDHdK9OShRHknwgkqCG0M1yWnryQ,cdfk6Ysdk99eoncEHxnDrEQZF
90a767df080af25adbc3db4da8406072
data/884b9a38-0443-11e3-b8fb-f46d04e15793/884a7040-0443-11e3-b8fb-f46d04e15793/87fdc790-0443-11e3-b8fb-f46d04e15793/87fb6cfc-0443-11e3-b8fb-f46d04e15793/gF-Eqm1GHw7NPNQOQoeJLNNlfL5ydR0FzVZDHdK9OShRHknwgkqCG0M1yWnryQ,cdfk6Ysdk99eoncEHxnDrEQZF
On the bricks:
root at bkupc1-a:/export# md5sum
./*/glusterfs/BACKUPS/data/884b9a38-0443-11e3-b8fb-f46d04e15793/884a7040-0443-11e3-b8fb-f46d04e15793/87fdc790-0443-11e3-b8fb-f46d04e15793/87fb6cfc-0443-11e3-b8fb-f46d04e15793/gF-Eqm1GHw7NPNQOQoeJLNNlfL5ydR0FzVZDHdK9OShRHknwgkqCG0M1yWnryQ,cdfk6Ysdk99eoncEHxnDrEQZF
90a767df080af25adbc3db4da8406072
./a/glusterfs/BACKUPS/data/884b9a38-0443-11e3-b8fb-f46d04e15793/884a7040-0443-11e3-b8fb-f46d04e15793/87fdc790-0443-11e3-b8fb-f46d04e15793/87fb6cfc-0443-11e3-b8fb-f46d04e15793/gF-Eqm1GHw7NPNQOQoeJLNNlfL5ydR0FzVZDHdK9OShRHknwgkqCG0M1yWnryQ,cdfk6Ysdk99eoncEHxnDrEQZF
root at bkupc1-b:/export# md5sum
./*/glusterfs/BACKUPS/data/884b9a38-0443-11e3-b8fb-f46d04e15793/884a7040-0443-11e3-b8fb-f46d04e15793/87fdc790-0443-11e3-b8fb-f46d04e15793/87fb6cfc-0443-11e3-b8fb-f46d04e15793/gF-Eqm1GHw7NPNQOQoeJLNNlfL5ydR0FzVZDHdK9OShRHknwgkqCG0M1yWnryQ,cdfk6Ysdk99eoncEHxnDrEQZF
52b8f8166ef4303bd7b897e8cc6a86c0
./a/glusterfs/BACKUPS/data/884b9a38-0443-11e3-b8fb-f46d04e15793/884a7040-0443-11e3-b8fb-f46d04e15793/87fdc790-0443-11e3-b8fb-f46d04e15793/87fb6cfc-0443-11e3-b8fb-f46d04e15793/gF-Eqm1GHw7NPNQOQoeJLNNlfL5ydR0FzVZDHdK9OShRHknwgkqCG0M1yWnryQ,cdfk6Ysdk99eoncEHxnDrEQZF
To make this more readable, here's the output with the path stripped
off, listed in the order given above:
52b8f8166ef4303bd7b897e8cc6a86c0 <-- client
90a767df080af25adbc3db4da8406072 <-- volume
90a767df080af25adbc3db4da8406072 <-- brick #1
52b8f8166ef4303bd7b897e8cc6a86c0 <-- brick #2
AHA!!! The MD5 checksum is different on one of the bricks!
c) I have SHA1 checksums of these files as well, and checking those I
see the same thing:
a12cbec32cc8b02dd4dc5e53d017238756f2b182 <-- client
4fbeacdac48f5a292bd5f0c9dfe1d073fd75354e <-- volume
4fbeacdac48f5a292bd5f0c9dfe1d073fd75354e <-- brick #1
a12cbec32cc8b02dd4dc5e53d017238756f2b182 <-- brick #2
4) Last but not least, just to make sure that the horse is good and
dead, my script does a byte-by-byte comparison of every file with
/usr/bin/diff -r -q ./ /data/bkupc1/BACKUPS/. Diff reports a difference
-- BUT -- it's with a *different* file, in a *different* directory:
Files
./data/884b9a38-0443-11e3-b8fb-f46d04e15793/884a7040-0443-11e3-b8fb-f46d04e15793/8825c6c8-0443-11e3-b8fb-f46d04e15793/880f8f0c-0443-11e3-b8fb-f46d04e15793/iMmV,UqdiqZRie5QUu341iRS7s,-OK7PzXSuPgr0o30yNDXNG6uvqA0Wyr7RRR3MBE4
and
/data/bkupc1/BACKUPS/data/884b9a38-0443-11e3-b8fb-f46d04e15793/884a7040-0443-11e3-b8fb-f46d04e15793/8825c6c8-0443-11e3-b8fb-f46d04e15793/880f8f0c-0443-11e3-b8fb-f46d04e15793/iMmV,UqdiqZRie5QUu341iRS7s,-OK7PzXSuPgr0o30yNDXNG6uvqA0Wyr7RRR3MBE4
differ
NOTE: Diff doesn't even notice that the first file -- the one listed in
(3) above -- shows any difference at all. I suppose this could be
explained away depending on gluster's internal workings. If gluster
serves replicated data round-robin, then I could see how md5sum and
sha1sum might wind up reading the file from brick #1 while diff reads it
from brick #2 -- but that's an "if", not a "how". I don't actually know
anything about how gluster works under the hood; that's just the first
possible explanation that came to mind.
On closer look:
ls -ald:
-rw-r--r-- 1 peek peek 527435808 Aug 5 2009 <-- client
-rw-r--r-- 1 peek peek 527435808 Aug 5 2009 <-- volume
-rw-r--r-- 2 peek peek 527435808 Aug 5 2009 <-- brick #1
-rw-r--r-- 2 peek peek 527435808 Aug 5 2009 <-- brick #2
MD5:
01eca86b5b48beb8f76204112dc69ac3 <-- client
01eca86b5b48beb8f76204112dc69ac3 <-- volume
01eca86b5b48beb8f76204112dc69ac3 <-- brick #1
3c1c9eadc44a1e144a576d4b388a1c42 <-- brick #2
SHA1:
a570dac34f820bc4973ea485b059429786068993 <-- client
a570dac34f820bc4973ea485b059429786068993 <-- volume
a570dac34f820bc4973ea485b059429786068993 <-- brick #1
0b755903ac3ba1314fbba7a73ef0c5c6d6716ff1 <-- brick #2
Why is this happening?
Did I do something wrong?
Or is this a legitimate bug?
I have preserved the log files from each client and I'll be poring over
those next, but I'll be honest: I don't know what I'm looking for.
Any help is greatly appreciated.
Michael Peek
Hi gurus,
This is a follow-up to a previous report about data integrity problems
with Gluster 3.4.0. I will be as thorough as I can, but this is already
a pretty long post. So feel free to see my previous post for more
information specific to my previous run of tests.
1. I am running a fully up-to-date version of Ubuntu 12.04, with
Gluster 3.4.0final-ubuntu1~precise1.
2. My cluster consists of four nodes. Each node consists of:
1. Hostnames: bkupc1-a -to- bkupc1-d
2. Bricks: Each host has /export/a/glusterfs/ and
/export/b/glusterfs/, which are 4TB ext4 drives
3. Clients: I have a client that mounts the volume as /data/bkupc1/
using the fuse driver.
3. My volume was created with:
/usr/sbin/gluster peer probe bkupc1-a
/usr/sbin/gluster peer probe bkupc1-b
/usr/sbin/gluster peer probe bkupc1-c
/usr/sbin/gluster peer probe bkupc1-d
/usr/sbin/gluster volume create bkupc1 replica 2 transport tcp \
bkupc1-a:/export/a/glusterfs bkupc1-b:/export/a/glusterfs \
bkupc1-c:/export/a/glusterfs bkupc1-d:/export/a/glusterfs \
bkupc1-a:/export/b/glusterfs bkupc1-b:/export/b/glusterfs \
bkupc1-c:/export/b/glusterfs bkupc1-d:/export/b/glusterfs
/usr/sbin/gluster volume set bkupc1 auth.allow {list of IP addresses}
4. On the client I have a 1TB drive filled with 900+GB of data in
156,554 test files. These files are encrypted backups that are
dispersed throughout many subdirectories. They are ugly to look
at. Here's an example:
data/
884b9a38-0443-11e3-b8fb-f46d04e15793/
884a7040-0443-11e3-b8fb-f46d04e15793/
8825c6c8-0443-11e3-b8fb-f46d04e15793/
880f8f0c-0443-11e3-b8fb-f46d04e15793/
iMmV,UqdiqZRie5QUu341iRS7s,-OK7PzXSuPgr0o30yNDXNG6uvqA0Wyr7RRR3MBE4
<Line breaks for readability>
I have pre-calculated MD5 and SHA1 checksums for all of these files,
and I have verified that the checksums are correct on the client drive.
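The post doesn't show how the md5sums/sha1sums manifests were built; a
minimal sketch of one common way to generate and verify such a manifest
(run from the tree root; the manifest name "md5sums" matches the one
used here, everything else is illustrative):

```shell
#!/bin/bash
# Sketch: build a checksum manifest for every file under the current
# directory, then verify it. -print0 / -0 keeps the weird path names
# (commas, dashes, etc.) intact.
find . -type f ! -name md5sums -print0 | sort -z | xargs -0 md5sum > md5sums
md5sum -c --quiet md5sums   # prints nothing when every file matches
```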
5. My first set of runs involved using rsync. Nothing fancy here:
1. The volume is empty when I begin
2. I create /data/bkupc1/BACKUPS-rsync.${timestamp}/
3. Use rsync to copy files from the client to the volume
4. Here's my script:
#!/bin/bash -x
timestamp="${1}"
/bin/date
mkdir /data/bkupc1/BACKUPS-rsync.${timestamp}
rsync \
-a \
-v \
--delete \
--delete-excluded \
--force \
--ignore-errors \
--one-file-system \
--stats \
--inplace \
./ \
/data/bkupc1/BACKUPS-rsync.${timestamp}/ \
#
/bin/date
(\
cd /data/bkupc1/BACKUPS-rsync.${timestamp}/ \
&& md5sum -c --quiet md5sums \
)
/bin/date
(\
cd /data/bkupc1/BACKUPS-rsync.${timestamp}/ \
&& sha1sum -c --quiet sha1sums \
)
/bin/date
/usr/bin/diff -r -q ./ /data/bkupc1/BACKUPS-rsync.${timestamp}/
/bin/date
5. As you can see from the script, after rsyncing, I check the
files on the volume
1. Against their MD5 checksums
2. Then against their SHA1 checksums
3. Then, just to beat a dead horse, I use diff to do a
byte-for-byte check between the files on the client and the
files on the volume. (Note to self: I should replace diff
with cmp, as I have run into "out of memory" errors with
diff on files that cmp can handle just fine.)
6. What I have found is that about 50% of the time, there will be
one or two files out of those 156,554 that differ. I documented
my findings in more detail in my previous email.
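The diff-to-cmp swap mentioned in the note-to-self above could look
something like this; a hypothetical sketch, not the script actually
used:

```shell
#!/bin/bash
# Sketch: byte-for-byte tree comparison using cmp instead of diff -r.
# cmp streams both files, so it avoids diff's out-of-memory failures
# on large files. Usage: compare_trees SRC_DIR DST_DIR
compare_trees() {
    local src=$1 dst=$2 f
    ( cd "$src" && find . -type f -print0 ) |
    while IFS= read -r -d '' f; do
        cmp -s "$src/$f" "$dst/$f" || printf 'DIFFERS: %s\n' "$f"
    done
}
```

Calling, say, compare_trees ./ /data/bkupc1/BACKUPS-rsync.${timestamp}/
would then print one line per mismatched file.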
6. One thought that occurred to me is that this could be the fault of
rsync. So I have repeated the tests using plain old /bin/cp.
Here's my (very similar) script:
#!/bin/bash -x
timestamp="${1}"
/bin/date
mkdir /data/bkupc1/BACKUPS-cp.${timestamp}
/bin/cp -ar ./ /data/bkupc1/BACKUPS-cp.${timestamp}/
/bin/date
(\
cd /data/bkupc1/BACKUPS-cp.${timestamp}/ \
&& md5sum -c --quiet md5sums \
)
/bin/date
(\
cd /data/bkupc1/BACKUPS-cp.${timestamp}/ \
&& sha1sum -c --quiet sha1sums \
)
/bin/date
/usr/bin/diff -r -q ./ /data/bkupc1/BACKUPS-cp.${timestamp}/
/bin/date
7. Results:
1. Output from the script:
+ timestamp=20130821-081918
+ /bin/date
Wed Aug 21 08:19:18 EDT 2013
+ mkdir /data/bkupc1/BACKUPS-cp.20130821-081918
+ /bin/cp -ar ./ /data/bkupc1/BACKUPS-cp.20130821-081918/
+ /bin/date
Wed Aug 21 13:51:53 EDT 2013
+ cd /data/bkupc1/BACKUPS-cp.20130821-081918/
+ md5sum -c --quiet md5sums
data/884b9a38-0443-11e3-b8fb-f46d04e15793/884a7040-0443-11e3-b8fb-f46d04e15793/87fdc790-0443-11e3-b8fb-f46d04e15793/87f54d22-0443-11e3-b8fb-f46d04e15793/KAfe4MUAlmO-Lt4N0KqVQTtf0im3mcoTuAyJvSP,t0o2Lc,FGce49pe9wEPDiIIt201oEks-taGDbc5-Nph6AacR:
FAILED
data/a34bc588-0443-11e3-b8fb-f46d04e15793/a34a8bf0-0443-11e3-b8fb-f46d04e15793/a3494b3c-0443-11e3-b8fb-f46d04e15793/a34808b2-0443-11e3-b8fb-f46d04e15793/a346cd08-0443-11e3-b8fb-f46d04e15793/a3456e2c-0443-11e3-b8fb-f46d04e15793/a344366a-0443-11e3-b8fb-f46d04e15793/8c9e94a0-0443-11e3-b8fb-f46d04e15793/NLrXi5u80FoUV6Gi2ouEybAebOgnF7p1PtEYmPbd0huh,1:
FAILED
md5sum: WARNING: 2 computed checksums did NOT match
+ /bin/date
Wed Aug 21 16:54:13 EDT 2013
+ cd /data/bkupc1/BACKUPS-cp.20130821-081918/
+ sha1sum -c --quiet sha1sums
data/884b9a38-0443-11e3-b8fb-f46d04e15793/884a7040-0443-11e3-b8fb-f46d04e15793/8825c6c8-0443-11e3-b8fb-f46d04e15793/8810e21c-0443-11e3-b8fb-f46d04e15793/7LOu,NZ5eMXxxrqjZHv5a9-4aHd641hN2tGaneMa1D2Kl9wLXf1f71nX6g-8ps2BpABovO7w68Wy63pH0gU3yLnyLEfFfT25Zk5jNvpDU6eQ,1:
FAILED
sha1sum: WARNING: 1 computed checksum did NOT match
+ /bin/date
Wed Aug 21 19:54:29 EDT 2013
+ /usr/bin/diff -r -q ./ /data/bkupc1/BACKUPS-cp.20130821-081918/
+ /bin/date
Thu Aug 22 00:16:53 EDT 2013
2. A listing of files that were reported as different (line breaks
for readability):
1. MD5 failure:
data/
884b9a38-0443-11e3-b8fb-f46d04e15793/
884a7040-0443-11e3-b8fb-f46d04e15793/
87fdc790-0443-11e3-b8fb-f46d04e15793/
87f54d22-0443-11e3-b8fb-f46d04e15793/
KAfe4MUAlmO-Lt4N0KqVQTtf0im3mcoTuAyJvSP,t0o2Lc,FGce49pe9wEPDiIIt201oEks-taGDbc5-Nph6AacR
2. MD5 failure:
data/
a34bc588-0443-11e3-b8fb-f46d04e15793/
a34a8bf0-0443-11e3-b8fb-f46d04e15793/
a3494b3c-0443-11e3-b8fb-f46d04e15793/
a34808b2-0443-11e3-b8fb-f46d04e15793/
a346cd08-0443-11e3-b8fb-f46d04e15793/
a3456e2c-0443-11e3-b8fb-f46d04e15793/
a344366a-0443-11e3-b8fb-f46d04e15793/
8c9e94a0-0443-11e3-b8fb-f46d04e15793/
NLrXi5u80FoUV6Gi2ouEybAebOgnF7p1PtEYmPbd0huh,1
3. SHA1 failure:
data/
884b9a38-0443-11e3-b8fb-f46d04e15793/
884a7040-0443-11e3-b8fb-f46d04e15793/
8825c6c8-0443-11e3-b8fb-f46d04e15793/
8810e21c-0443-11e3-b8fb-f46d04e15793/
7LOu,NZ5eMXxxrqjZHv5a9-4aHd641hN2tGaneMa1D2Kl9wLXf1f71nX6g-8ps2BpABovO7w68Wy63pH0gU3yLnyLEfFfT25Zk5jNvpDU6eQ,1
3. A byte-for-byte comparison:
1. File from 7.2.1 above:
(KAfe4MUAlmO-Lt4N0KqVQTtf0im3mcoTuAyJvSP,t0o2Lc,FGce49pe9wEPDiIIt201oEks-taGDbc5-Nph6AacR)
1. After the test, this file exists in three locations:
client:/export/d/eraseme/ <-- the original
bkupc1-a:/export/a/glusterfs/ <-- replicated volume
copy 1 of 2
bkupc1-b:/export/a/glusterfs/ <-- replicated volume
copy 2 of 2
2. MD5sums:
68ce7073e462fda42d4b551a843bd71f <-- bkupc1-a (directly
from the brick)
68ce7073e462fda42d4b551a843bd71f <-- bkupc1-b (directly
from the brick)
68ce7073e462fda42d4b551a843bd71f <-- client
68ce7073e462fda42d4b551a843bd71f <-- volume (from the
mount via the fuse driver)
NOTE: There is no difference between the MD5 checksums
3. SHA1sums:
c5c59c18f5cc0c1b6e4dd80b2d41fc3bc7148509 <-- bkupc1-a
c5c59c18f5cc0c1b6e4dd80b2d41fc3bc7148509 <-- bkupc1-b
c5c59c18f5cc0c1b6e4dd80b2d41fc3bc7148509 <-- client
c5c59c18f5cc0c1b6e4dd80b2d41fc3bc7148509 <-- volume
NOTE: There is no difference between the SHA1 checksums
4. Both /usr/bin/diff and /usr/bin/cmp report no difference
between these files.
2. File 7.2.2 from above:
(NLrXi5u80FoUV6Gi2ouEybAebOgnF7p1PtEYmPbd0huh,1)
1. After the test, this file exists in three locations:
client:/export/d/eraseme/ <-- the original
bkupc1-a:/export/a/glusterfs/ <-- replicated volume
copy 1 of 2
bkupc1-b:/export/a/glusterfs/ <-- replicated volume
copy 2 of 2
2. MD5sums:
78696407263ef75ae2795ed7cb4eb24a <-- bkupc1-a
77fdce4ebe9e94f611848d174de01357 <-- bkupc1-b
78696407263ef75ae2795ed7cb4eb24a <-- client
78696407263ef75ae2795ed7cb4eb24a <-- volume
3. SHA1sums:
de93bcc7b64458926505dfc5ac4c597f3fefe6db <-- bkupc1-a
0254c117b92ca95987aa7389980fb0bcc850e9c5 <-- bkupc1-b
de93bcc7b64458926505dfc5ac4c597f3fefe6db <-- client
de93bcc7b64458926505dfc5ac4c597f3fefe6db <-- volume
4. Byte differences: The output of /usr/bin/cmp -l, when
comparing the version of the file on the client with the
version of the file on bkupc1-b:
3262724555 274 234
If I'm reading this right, cmp -l prints the 1-based byte offset
followed by the two differing byte values in octal, which means these
files differ by only one byte (octal 274 vs. 234).
3. File 7.2.3 from above:
(7LOu,NZ5eMXxxrqjZHv5a9-4aHd641hN2tGaneMa1D2Kl9wLXf1f71nX6g-8ps2BpABovO7w68Wy63pH0gU3yLnyLEfFfT25Zk5jNvpDU6eQ,1)
1. After the test, this file exists in three locations:
client:/export/d/eraseme/ <-- the original
bkupc1-c:/export/a/glusterfs/ <-- replicated volume
copy 1 of 2
bkupc1-d:/export/a/glusterfs/ <-- replicated volume
copy 2 of 2
2. MD5sums:
05ea9e04df984cc7ed514f93dc79067e <-- bkupc1-c
868c0eafa2bde7386d808b722166a283 <-- bkupc1-d
05ea9e04df984cc7ed514f93dc79067e <-- client
868c0eafa2bde7386d808b722166a283 <-- volume
3. SHA1sums:
a6cf53c106b1856826db8de8b947273b05eb6391 <-- bkupc1-c
f56e01f982028eed9c115f0696346861bf3b7169 <-- bkupc1-d
a6cf53c106b1856826db8de8b947273b05eb6391 <-- client
f56e01f982028eed9c115f0696346861bf3b7169 <-- volume
4. Byte differences: The output of /usr/bin/cmp -l, when
comparing the version of the file on the client with the
version of the file on bkupc1-d:
181361479 226 206
Again, a single differing byte (octal 226 vs. 206).
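As a sanity check on reading cmp -l output, here's a tiny
self-contained demo with throwaway files (nothing to do with the actual
test data):

```shell
#!/bin/bash
# Demo: cmp -l prints the 1-based byte offset and the two differing
# byte values in octal, one line per differing byte.
printf 'hello world' > f1
printf 'hello uorld' > f2    # byte 7 differs: 'w' (octal 167) vs 'u' (octal 165)
cmp -l f1 f2
# prints (whitespace-padded): 7 167 165
```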
For 7.3.1, md5sum reported a difference between the client and the
volume, even though re-comparing it with md5sum, sha1sum, cmp, and diff
on both the mounted volume and the individual bricks showed no
difference at all. This would imply that there may be a data read error
somewhere in gluster.
For 7.3.2 and 7.3.3, each file differed between the client, the volume,
and the replica bricks by exactly one byte (if I've read the output
from cmp correctly). This would imply that there may be a data write
error in gluster.
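One thing worth noting when comparing the digest tables above: a single
flipped byte changes essentially every hex digit of an MD5 or SHA1
digest, so totally different-looking checksums are still consistent
with one-byte corruption. A quick illustration with throwaway files:

```shell
#!/bin/bash
# Demo: one differing byte yields entirely unrelated-looking digests.
printf 'hello world' > good
printf 'hello uorld' > bad     # differs from "good" by a single byte
md5sum good bad                # the two digests share no obvious relation
sha1sum good bad
```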
(And please, for the love of Pete, tell me if I have done anything
stupid, b/c I really wanted gluster to be the silver bullet solution to
my data storage problems.)
Michael Peek