Arnulf Heimsbakk
2008-Jul-03 13:20 UTC
[Gluster-users] Strange behaviour in AFR - directory read only from first brick?
Hi,

I find GlusterFS a refreshing alternative to traditional cluster file
systems. I have so far experimented with unify and AFR. When experimenting
with AFR over two bricks I found some strange behaviour: it seems that the
directory is only read from the first brick.

I created two bricks and ran client-side AFR. In the client config the
bricks are represented as follows:

  subvolumes brick1 brick2

Then I simulated a crash with loss of files:

1. AFR is up
2. touch file f1
3. brick1 crashes
4. ls on the mount point: f1 exists and everything is normal (ls read from brick2)
5. file system repair removes f1 from brick1
6. glusterfsd starts on brick1
7. ls on the mount point no longer shows f1 (ls read only from brick1?)
8. cat f1 on the mount point replicates the file and it becomes visible again

I have reproduced this error numerous times. Every time, I have removed
user_xattr from the exported directories.

GlusterFS version 1.3.8
XFS backend filesystem
Debian Etch i686

Is there a fix for this behaviour, or a configuration which eliminates the
problem? Or is this considered a split-brain case? Does this affect AFR'd
namespaces (I did not test that)?

Arnulf Heimsbakk
Sysadmin @ Norwegian Meteorological Institute

--
"When all else fails, read the instructions." ~ L. Iasellio
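For context, a minimal client-side AFR spec of the kind described above might
look roughly like the following. This is only a sketch assembled from the spec
syntax that appears later in this thread; the host addresses and the remote
subvolume name "brick" are illustrative, not Arnulf's actual configuration.

  # the two exported bricks, reached over TCP (addresses are examples only)
  volume brick1
    type protocol/client
    option transport-type tcp/client
    option remote-host 192.168.0.1
    option remote-subvolume brick
  end-volume

  volume brick2
    type protocol/client
    option transport-type tcp/client
    option remote-host 192.168.0.2
    option remote-subvolume brick
  end-volume

  # client-side replication; brick1 is listed first, which is the brick the
  # directory listing appears to be read from in the report above
  volume afr1
    type cluster/afr
    subvolumes brick1 brick2
  end-volume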
Anand Avati
2008-Jul-03 17:52 UTC
[Gluster-users] Strange behaviour in AFR - directory read only from first brick?
This case is considered equivalent to 'altering the backend without going
through the mount point'; currently that is an unsupported action.

Avati

On 03/07/2008, Arnulf Heimsbakk <arnulf.heimsbakk at gmail.com> wrote:
> [...]

--
If I traveled to the end of the rainbow
As Dame Fortune did intend,
Murphy would be there to tell me
The pot's at the other end.
baggio liu
2008-Jul-04 07:46 UTC
[Gluster-users] Strange behaviour in AFR - directory read only from first brick?
Hi,

In my tests, AFR can migrate a file (from Server2 to Server1) that was
created while Server1 was down. But if a file is created while Server1 is
alive, Server1 then goes down (and all of its data is lost), and Server1
(and its glusterfs process) is restarted, the data can *not* be replicated
by AFR automatically.

This case is common when we use cheap storage as glusterfs servers. We have
not modified data at the backend, but in an enterprise environment failures
cannot be avoided. How can I deal with these cases?

Thanks & Regards
Baggio

2008/7/4 Krishna Srinivas <krishna at zresearch.com>:
> AFR will take care of things as long as you dont modify the
> backend yourself.
>
> Regards :)
> Krishna
>
> On Fri, Jul 4, 2008 at 12:06 PM, baggio liu <baggioss at gmail.com> wrote:
> > Thanks for your response. :)
> >
> > You mean, when some nodes fail over, we should guarantee namespace/data
> > correctness manually? AFR can not work well in this case?
> >
> > Thanks
> >
> > Baggio
> >
> > 2008/7/4 Krishna Srinivas <krishna at zresearch.com>:
> >>
> >> Baggio,
> >>
> >> If the xattr versions are the same on both the directories of AFR, then
> >> it assumes that the contents are also the same. So when you wipe the
> >> data on the first brick, you also need to make sure that when you
> >> bring it back up, the version on brick2 is higher than the version
> >> on brick1. Use "getfattr" to get the version and createtime attrs
> >> on the directories. The xattrs are trusted.glusterfs.version and
> >> trusted.glusterfs.createtime.
> >>
> >> Krishna
> >>
> >> On Fri, Jul 4, 2008 at 11:10 AM, baggio liu <baggioss at gmail.com> wrote:
> >> > Hi,
> >> > I have made some tests.
> >> >
> >> > 1. AFR is up (namespace and data bricks are AFR'd between Server1 and Server2)
> >> > 2. touch file f1
> >> > 3. Server1 crashes (data and namespace on Server1 are removed)
> >> > 4. ls on mount point, f1 exists and everything is normal (ls read from Server2)
> >> > 5. glusterfsd starts on Server1
> >> > 6. ls on mount point does not show f1 anymore (ls read only from brick1?)
> >> > 7. cat f1 on the client, and its content can be seen, but ls does not work well.
> >> >
> >> > GlusterFS version 1.3.9 release
> >> >
> >> > Server1 spec vol:
> >> >
> >> > volume brick
> >> >   type storage/posix
> >> >   option directory /mnt/glusterfs/brick00
> >> > end-volume
> >> >
> >> > volume ns
> >> >   type storage/posix
> >> >   option directory /mnt/glusterfs/ns
> >> > end-volume
> >> >
> >> > volume server
> >> >   type protocol/server
> >> >   option transport-type tcp/server
> >> >   option ib-verbs-work-request-send-size 131072
> >> >   option ib-verbs-work-request-send-count 64
> >> >   option ib-verbs-work-request-recv-size 131072
> >> >   option ib-verbs-work-request-recv-count 64
> >> >   option auth.ip.brick.allow *
> >> >   option auth.ip.ns.allow *
> >> >   subvolumes brick ns
> >> > end-volume
> >> >
> >> > Server2 spec vol:
> >> >
> >> > volume remote-ns
> >> >   type protocol/client
> >> >   option transport-type tcp/client
> >> >   option remote-host [server1 ip]
> >> >   option remote-subvolume ns
> >> > end-volume
> >> >
> >> > volume local-ns
> >> >   type storage/posix
> >> >   option directory /mnt/glusterfs/ns
> >> > end-volume
> >> >
> >> > volume ns
> >> >   type cluster/afr
> >> >   subvolumes remote-ns local-ns
> >> > end-volume
> >> >
> >> > volume remote-brick00
> >> >   type protocol/client
> >> >   option transport-type tcp/client
> >> >   option remote-host 172.16.208.20
> >> >   option remote-port 6996
> >> >   option remote-subvolume brick
> >> > end-volume
> >> >
> >> > volume local-brick00
> >> >   type storage/posix
> >> >   option directory /mnt/glusterfs/brick00
> >> > end-volume
> >> >
> >> > volume brick00
> >> >   type cluster/afr
> >> >   subvolumes remote-brick00 local-brick00
> >> > end-volume
> >> >
> >> > volume unify
> >> >   type cluster/unify
> >> >   option namespace ns
> >> >   option scheduler rr
> >> >   subvolumes brick00
> >> > end-volume
> >> >
> >> > BTW, I'm not entirely clear about what Arnulf said, but in my setup
> >> > this problem can be seen.
> >> >
> >> > Baggio
> >> >
> >> > 2008/7/4 Krishna Srinivas <krishna at zresearch.com>:
> >> >>
> >> >> >>> 1. AFR is up
> >> >> >>> 2. touch file f1
> >> >> >>> 3. brick1 crashes
> >> >> >>> 4. ls on mount point, f1 exists and everything is normal (ls read
> >> >> >>> from brick2)
> >> >> >>> 5. file system repair removes f1 from brick1
> >> >>
> >> >> Glusterfs removes f1 from brick1? Or do you manually remove it?
> >> >> Could you also check with a later release, as a related bug was
> >> >> fixed.
> >> >>
> >> >> Thanks
> >> >>
> >> >> >>> 6. glusterfsd starts on brick1
> >> >> >>> 7. ls on mount point does not show f1 anymore (ls read only from
> >> >> >>> brick1?)
> >> >> >>> 8. cat f1 on mount point replicates file and it becomes visible
> >> >>
> >> >> On Fri, Jul 4, 2008 at 7:03 AM, baggio liu <baggioss at gmail.com> wrote:
> >> >> > Hi,
> >> >> > A file can't be "ls"-ed, but can be "less"-ed.
> >> >> > I think this behaviour is a little weird. If this action can not be
> >> >> > supp
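To verify the condition Krishna describes, the two xattrs he names can be
read (as root) on the exported directories of each server. This is only a
sketch: the paths are the export directories from the spec files above, and
getfattr is the standard attr-package tool; nothing here is GlusterFS-specific
beyond the two attribute names quoted in Krishna's mail.

  # run as root on each server, against the backend export directories
  getfattr -n trusted.glusterfs.version    /mnt/glusterfs/brick00
  getfattr -n trusted.glusterfs.createtime /mnt/glusterfs/brick00
  getfattr -n trusted.glusterfs.version    /mnt/glusterfs/ns
  getfattr -n trusted.glusterfs.createtime /mnt/glusterfs/ns

If the version on the wiped brick comes back equal to (or higher than) the
version on the surviving brick, that would explain why AFR assumes the two
directories already have the same contents.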
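As for Baggio's question of how to deal with such cases: the only recovery
mechanism actually demonstrated in this thread is that an explicit lookup by
name (step 8 in Arnulf's mail, step 7 in Baggio's test) re-replicates a file
even when it is missing from the directory listing. The sketch below is
merely an extrapolation of that observation, not a procedure recommended
anywhere in the thread: it enumerates names from the surviving brick's
backend (read-only, so the backend itself is not modified) and opens each
one through the client mount. The mount point /mnt/glusterfs-client is
hypothetical.

  # Illustrative stopgap only. List file names from the surviving brick's
  # backend export and read one byte of each through the client mount so
  # that AFR's lookup path recreates the lost copies.
  ( cd /mnt/glusterfs/brick00 && find . -type f -print ) | \
  while IFS= read -r f; do
      head -c1 "/mnt/glusterfs-client/$f" > /dev/null 2>&1
  done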