Dear All,

I have found a large number of files like the one below in a DHT volume.

---------T 1 nobody nobody 0 2010-05-01 18:34 d80bao.daj0710

For those I have checked, a real file (i.e. non-zero size and normal
permissions and ownership) exists on one brick and its zero byte counterpart
is on one of the other bricks. I originally found these files by looking in
the individual server bricks, but I discovered that I can also find them
using the "find" command in the glusterfs DHT volume itself. However,
listing a directory containing these files with "ls" does not list the zero
byte versions, but instead two copies of the normal versions are shown.
Does anybody have an idea what is going on? This strange behaviour is
clearly going to confuse some users, and there are sometimes also long
delays when listing the affected directories. I am using GlusterFS version
3.0.3 now but I also noticed this behaviour in 2.0.8. My client and server
volume files are shown below.

Regards,
Dan Bretherton.

#
##
### One of the server vol files ###
##
#
volume posix1
  type storage/posix
  option directory /local
end-volume

volume posix2
  type storage/posix
  option directory /local2/glusterfs
end-volume

volume posix3
  type storage/posix
  option directory /local3/glusterfs
end-volume

volume locks1
  type features/locks
  subvolumes posix1
end-volume

volume locks2
  type features/locks
  subvolumes posix2
end-volume

volume locks3
  type features/locks
  subvolumes posix3
end-volume

volume io-cache1
  type performance/io-cache
  subvolumes locks1
end-volume

volume io-cache2
  type performance/io-cache
  subvolumes locks2
end-volume

volume io-cache3
  type performance/io-cache
  subvolumes locks3
end-volume

volume writebehind1
  type performance/write-behind
  subvolumes io-cache1
end-volume

volume writebehind2
  type performance/write-behind
  subvolumes io-cache2
end-volume

volume writebehind3
  type performance/write-behind
  subvolumes io-cache3
end-volume

volume brick1
  type performance/io-threads
  subvolumes writebehind1
end-volume

volume brick2
  type performance/io-threads
  subvolumes writebehind2
end-volume

volume brick3
  type performance/io-threads
  subvolumes writebehind3
end-volume

volume server
  type protocol/server
  option transport-type tcp
  option auth.addr.brick1.allow *
  option auth.addr.brick2.allow *
  option auth.addr.brick3.allow *
  option listen-port 6996
  subvolumes brick1 brick2 brick3
end-volume

#
##
### Client vol file ###
##
#
volume remus
  type protocol/client
  option transport-type tcp
  option remote-host remus
  option remote-port 6996
  option remote-subvolume brick3
end-volume

volume perseus
  type protocol/client
  option transport-type tcp
  option remote-host perseus
  option remote-port 6996
  option remote-subvolume brick1
end-volume

volume romulus
  type protocol/client
  option transport-type tcp
  option remote-host romulus
  option remote-port 6996
  option remote-subvolume brick1
end-volume

volume distribute
  type cluster/distribute
  option min-free-disk 20%
  #option lookup-unhashed yes
  subvolumes remus perseus romulus
end-volume

volume writebehind
  type performance/write-behind
  subvolumes distribute
end-volume

volume io-threads
  type performance/io-threads
  subvolumes writebehind
end-volume

volume io-cache
  type performance/io-cache
  option cache-size 512MB
  subvolumes io-threads
end-volume

volume main
  type performance/stat-prefetch
  subvolumes io-cache
end-volume

--
Mr. D.A. Bretherton
Reading e-Science Centre
Environmental Systems Science Centre
Harry Pitt Building
3 Earley Gate
University of Reading
Reading, RG6 6AL
UK

Tel. +44 118 378 7722
Fax: +44 118 378 6413
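Dan does not show the exact find invocation he used; a rough sketch of the
sort of command that turns up files with that distinctive mode through the
client mount point follows (the /mnt/glusterfs path is only an assumed
example, not taken from the vol files above):

# Zero-byte regular files whose mode is exactly ---------T
# (no rwx bits, sticky bit set). /mnt/glusterfs is an assumed mount path.
find /mnt/glusterfs -type f -size 0 -perm 1000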
On Jun 8, 2010, at 4:40 AM, Dan Bretherton wrote:

> Dear All,
>
> I have found a large number of files like the one below in a DHT volume.
>
> ---------T 1 nobody nobody 0 2010-05-01 18:34 d80bao.daj0710
>
> For those I have checked, a real file (i.e. non-zero size and normal
> permissions and ownership) exists on one brick and its zero byte counterpart
> is on one of the other bricks. I originally found these files by looking in
> the individual server bricks, but I discovered that I can also find them
> using the "find" command in the glusterfs DHT volume itself. However,
> listing a directory containing these files with "ls" does not list the zero
> byte versions, but instead two copies of the normal versions are shown.
> Does anybody have an idea what is going on? This strange behaviour is
> clearly going to confuse some users, and there are sometimes also long
> delays when listing the affected directories. I am using GlusterFS version
> 3.0.3 now but I also noticed this behaviour in 2.0.8. My client and server
> volume files are shown below.

These are "link files" used by the distribute translator. It needs to create
these files in certain situations involving rename, etc. The link file
contains a pointer to the actual file in its extended attribute. It is
completely normal to have these files.

------------------------------
Vikas Gorur
Engineer - Gluster, Inc.
------------------------------
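The pointer Vikas describes can be read directly on a brick with getfattr. A
minimal sketch, assuming the /local export path from the server vol file and
run as root on that server; the exact attribute name (commonly
trusted.glusterfs.dht.linkto) may differ between GlusterFS releases:

# Inspect the zero-byte file in the brick's backend directory. DHT link
# files carry a trusted.* xattr naming the subvolume that holds the real
# file; "-m ." lists all attributes, not just the user.* namespace.
getfattr -m . -d -e text /local/d80bao.daj0710

# The link file itself is recognisable by its mode: zero bytes, no rwx
# bits, sticky bit set (the ---------T shown in the listing above).
ls -l /local/d80bao.daj0710

If the attribute points at another subvolume, the zero-byte file is just
distribute's marker and the real data lives on the brick that it names.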
Dan,

volume perseus
  type protocol/client
  option transport-type tcp
  option remote-host perseus
  option remote-port 6996
  option remote-subvolume brick1
end-volume

volume romulus
  type protocol/client
  option transport-type tcp
  option remote-host romulus
  option remote-port 6996
  option remote-subvolume brick1
end-volume

Both volumes are pointing to brick1, hence the distribute translator sees the
file twice.
Dan, my apologies for not fully parsing your e-mail. The existence of link
files is normal, but seeing them on the mountpoint or seeing duplicate files
is not. This can usually happen due to mentioning the same subvolume twice,
as Krishna suggested in the other e-mail.

------------------------------
Vikas Gorur
Engineer - Gluster, Inc.
------------------------------
Dear Vikas and Krishna,

Thank you for your replies. I have a couple more questions if you don't mind.

> These are "link files" used by the distribute translator. It needs to create
> these files in certain situations involving rename, etc. The link file
> contains a pointer to the actual file in its extended attribute. It is
> completely normal to have these files.

Are the link files temporary? How long should they exist for?

> both volumes are pointing to brick1, hence the distribute sees the file
> twice.

But they also point to different remote hosts, perseus and romulus. These two
servers just happen to have GlusterFS server brick volumes called "brick1".
Is this not allowed?

-Dan.
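On the question of whether two servers may each export a subvolume called
"brick1": a quick way to confirm that the perseus and romulus client volumes
really reach different backends is to resolve the two hostnames and drop a
throwaway marker file straight into one brick's export directory. A rough
sketch using the /local path from the server vol file; the marker filename is
purely illustrative:

# Confirm the two remote-host names resolve to different machines.
getent hosts perseus
getent hosts romulus

# Create a marker directly in perseus's brick1 export directory (/local).
# If the same file then appears under /local on romulus, both client
# volumes are in fact talking to a single backend.
ssh perseus 'touch /local/.dht-duplicate-check'
ssh romulus 'ls -l /local/.dht-duplicate-check'   # expect "No such file or directory"
ssh perseus 'rm /local/.dht-duplicate-check'      # tidy up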
Dear Krishna and Vikas,

> I see that perseus and romulus are remote-hosts in the subvolume, do
> they resolve to different servers or the same servers?

They resolve to different servers.

I have an idea what may have happened. Some time ago I was experimenting with
a replicated DHT volume and ran into a "split brain" problem. I didn't have
time to fix that problem, so I decided to rescue the data by copying it to its
current location (a DHT volume) from the individual bricks of the broken
volume. I used rsync with "--update" to ensure I was getting the most recent
version of each file. It is possible that some of the files I transferred were
link files from the broken volume, and that these still exist and are causing
problems in the current DHT volume.

Can I fix this by deleting all link files in the current volume, brick by
brick? I am hoping it will be possible to heal the volume in some way after
that, e.g. by doing "ls -lR" on all the clients. I realise that some of the
rescued data might be missing because I ended up with link files instead of
real files. That does not matter too much as the data is not crucial. However,
I want to try to repair the current volume in order to gain experience before
deploying GlusterFS more widely with more important data.

-Dan.
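If the stale link files left over from the rsync rescue do need clearing out,
they can usually be identified on the bricks by their combination of zero size
and sticky-bit mode, then double-checked via their DHT extended attribute
before anything is deleted. A hedged sketch using the brick paths from the
server vol file above (attribute names and behaviour may differ between
GlusterFS releases, so test on a small directory first):

# On each server, list candidate link files in every brick export
# directory: zero-byte regular files with the sticky bit set.
find /local /local2/glusterfs /local3/glusterfs \
    -type f -perm -1000 -size 0 > /tmp/linkfile-candidates.txt

# Confirm each candidate really carries a DHT pointer attribute before
# deciding to remove it (run as root; trusted.* xattrs need privilege).
while read -r f; do
    getfattr -m 'trusted.glusterfs.dht' -d "$f"
done < /tmp/linkfile-candidates.txt

After removing stale link files, an "ls -lR" (or a find) from a client should
trigger fresh lookups and let the distribute translator recreate any link
files it still genuinely needs.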