i have been playing with 3.3beta2 for a few weeks now and i can easily create inconsistencies between replicas.

my first test was to start an rsync from a client using the gluster fuse code and then reboot one of the servers. after the rsync finishes, i can see that the replicas are out of sync (which is to be expected), so i start a self-heal via find. after that has run to completion, i don't see any xattrs that indicate there are any files out of sync. but a simple du/df shows that the two replicas aren't using the same amount of space, so i dig further via find -ls on each replica and then compare those outputs, and i'll find one or more files that are truncated on the server that i rebooted.

i have been using the 3.3 beta code because i wanted to play with the s3 layer, but this simple test makes me wonder what's going on.

i guess i could try 3.2 and see if i can create the problem there, but i was kind of hoping that this basic layer of the code wouldn't have been touched that much from 3.2 to 3.3.

the files in question so far have been large files, so that could be a clue. maybe centos 6.2 is to blame?

anyone else trying these simple kinds of tests? i don't see anything obvious in the logs.

any suggestions from the list?
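For anyone wanting to repeat the last step of this test (comparing `find -ls` output from each replica), here is a minimal Python sketch of the same idea. It only compares file sizes, which is enough to spot truncated files. The function names `snapshot` and `diff_snapshots` are made up for illustration, and the sketch assumes you can read each replica's backend directory directly on its server.

```python
import os


def snapshot(root):
    """Map each regular file's path (relative to root) to its size in bytes."""
    sizes = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            # Skip symlinks and special files; only regular files are compared.
            if os.path.islink(full) or not os.path.isfile(full):
                continue
            sizes[os.path.relpath(full, root)] = os.path.getsize(full)
    return sizes


def diff_snapshots(a, b):
    """Return (path, size_a, size_b) for files that differ in size.

    A size of None means the file is missing on that side.
    """
    mismatches = []
    for path in sorted(set(a) | set(b)):
        size_a, size_b = a.get(path), b.get(path)
        if size_a != size_b:
            mismatches.append((path, size_a, size_b))
    return mismatches
```

Since the replicas live on different servers, one way to use this is to run `snapshot` on each server, dump the result with `json.dump`, copy one file across, and feed both dictionaries to `diff_snapshots`. Matching sizes do not prove the contents match; a checksum pass would be needed for that.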
Do you have any symlinks in what you are rsyncing? A little while back I mentioned a bug that is easy for me to reproduce: extracting a tar archive that contains a symlink. (I'm not sure whether the replicas were out of sync or whether the symlink just wasn't created correctly.)

> -----Original Message-----
> From: gluster-users-bounces at gluster.org [mailto:gluster-users-bounces at gluster.org] On Behalf Of Joe Pruett
> Sent: Monday, January 16, 2012 4:45 PM
> To: Gluster General Discussion List
> Subject: [Gluster-users] 3.3beta2 on centos 6.2 seems very broken
>
> i have been playing with 3.3beta2 for a few weeks now and i can easily
> create inconsistencies between replicas.
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
i have been repeating my tests with the 3.2.5 release and i'm not seeing any problems. so i think there is something wrong with the healing code in 3.3beta2. i'll try again with xfs under 3.3beta2 just to see if this is somehow connected to ext4.

On 01/16/2012 01:44 PM, Joe Pruett wrote:
> i have been playing with 3.3beta2 for a few weeks now and i can easily
> create inconsistencies between replicas.
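For the other check from the first message (looking for xattrs that mark files as out of sync after the self-heal), a rough Python sketch follows. It rests on assumptions that are not stated in this thread: that the replicate changelog xattrs are named with a `trusted.afr.` prefix, and that each value is 12 bytes holding three big-endian 32-bit counters (data, metadata, entry). The names `pending_counts` and `files_with_pending` are hypothetical. Reading `trusted.*` xattrs normally requires root, and `os.listxattr`/`os.getxattr` are Linux-only.

```python
import os
import struct

# Assumption: replicate changelog xattrs use this name prefix.
AFR_PREFIX = "trusted.afr."


def pending_counts(value):
    """Decode a changelog value, assumed to be three big-endian 32-bit counters."""
    if len(value) != 12:
        raise ValueError("expected 12 bytes, got %d" % len(value))
    data, metadata, entry = struct.unpack(">III", value)
    return {"data": data, "metadata": metadata, "entry": entry}


def files_with_pending(brick_root):
    """Yield (path, xattr name, counts) for files with a non-zero changelog."""
    for dirpath, _dirnames, filenames in os.walk(brick_root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            try:
                attrs = os.listxattr(full, follow_symlinks=False)
            except OSError:
                continue  # unreadable file or no xattr support; skip it
            for attr in attrs:
                if not attr.startswith(AFR_PREFIX):
                    continue
                try:
                    raw = os.getxattr(full, attr, follow_symlinks=False)
                    counts = pending_counts(raw)
                except (OSError, ValueError):
                    continue  # unexpected layout or read error; skip it
                if any(counts.values()):
                    yield full, attr, counts
```

Run as root against each replica's backend directory, `list(files_with_pending(path))` coming back empty on both servers while the size comparison still shows truncated files would match the behaviour described above.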