thr3ads.net - Gluster users - [Gluster-users] Replication destroying file content on Node crash [Apr 2010]


Kelvin Westlake

2010-Apr-11 23:09 UTC

[Gluster-users] Replication destroying file content on Node crash

Hi guys,

I've got two servers with a volume replicated between them, and each
runs a client that mounts the volume. Everything works fine while both
servers remain up: I can copy files between them and it works
flawlessly. The same is true if I take one of the servers offline (i.e.
simulating a crash). But the moment I bring that server back up, any new
files (i.e. those that exist only on the server that stayed live) become
inaccessible. If I edit one, I get "Input/output error" and the
following appears in the Gluster client logs:

	[2010-04-12 00:05:27] E [afr-self-heal-common.c:1237:sh_missing_entries_create] mirror-0: unknown file type: 01
	[2010-04-12 00:05:27] W [fuse-bridge.c:858:fuse_fd_cbk] glusterfs-fuse: 168144: OPEN() /test2 => -1 (Input/output error)

I've also tried firing off a resync with "ls -lR", but this seems to
have no effect.

Here are the vol files I'm using. I even tried disabling stat-prefetch
in the client, as suggested here
(http://gluster.org/pipermail/gluster-users/2009-December/003636.html),
but still no joy.

Server 1 - 192.168.100.29

## file auto generated by /usr/local/bin/glusterfs-volgen (export.vol)
# Cmd line:
# $ /usr/local/bin/glusterfs-volgen --name cluster3 --raid 1 192.168.100.31:/glusterfs/home 192.168.100.29:/glusterfs/home

volume posix1
  type storage/posix
  option directory /glusterfs/home
end-volume

volume locks1
    type features/locks
    subvolumes posix1
end-volume

volume brick1
    type performance/io-threads
    option thread-count 8
    subvolumes locks1
end-volume

volume server-tcp
    type protocol/server
    option transport-type tcp
    option auth.addr.brick1.allow *
    option transport.socket.listen-port 6996
    option transport.socket.nodelay on
    subvolumes brick1
end-volume

Server 2 - 192.168.100.31

## file auto generated by /usr/local/bin/glusterfs-volgen (export.vol)
# Cmd line:
# $ /usr/local/bin/glusterfs-volgen --name cluster3 --raid 1 192.168.100.31:/glusterfs/home 192.168.100.29:/glusterfs/home

volume posix1
  type storage/posix
  option directory /glusterfs/home
end-volume

volume locks1
    type features/locks
    subvolumes posix1
end-volume

volume brick1
    type performance/io-threads
    option thread-count 8
    subvolumes locks1
end-volume

volume server-tcp
    type protocol/server
    option transport-type tcp
    option auth.addr.brick1.allow *
    option transport.socket.listen-port 6996
    option transport.socket.nodelay on
    subvolumes brick1
end-volume

Client vol file, mount on both servers to /glusterfs/home-mnt

## file auto generated by /usr/local/bin/glusterfs-volgen (mount.vol)
# Cmd line:
# $ /usr/local/bin/glusterfs-volgen --name cluster3 --raid 1 192.168.100.31:/glusterfs/home 192.168.100.29:/glusterfs/home

# RAID 1
# TRANSPORT-TYPE tcp
volume 192.168.100.31-1
    type protocol/client
    option transport-type tcp
    option remote-host 192.168.100.31
    option transport.socket.nodelay on
    option transport.remote-port 6996
    option remote-subvolume brick1
end-volume

volume 192.168.100.29-1
    type protocol/client
    option transport-type tcp
    option remote-host 192.168.100.29
    option transport.socket.nodelay on
    option transport.remote-port 6996
    option remote-subvolume brick1
end-volume

volume mirror-0
    type cluster/replicate
    subvolumes 192.168.100.31-1 192.168.100.29-1
end-volume

volume writebehind
    type performance/write-behind
    option cache-size 4MB
    subvolumes mirror-0
end-volume

volume readahead
    type performance/read-ahead
    option page-count 4
    subvolumes writebehind
end-volume

volume iocache
    type performance/io-cache
    option cache-size `echo $[ $(grep 'MemTotal' /proc/meminfo | sed 's/[^0-9]//g') / 5120 ]`MB
    option cache-timeout 1
    subvolumes readahead
end-volume

volume quickread
    type performance/quick-read
    option cache-timeout 1
    option max-file-size 64kB
    subvolumes iocache
end-volume

#volume statprefetch
#    type performance/stat-prefetch
#    subvolumes quickread
#end-volume

Any help or advice would be greatly appreciated.


Thanks
Kelvin

 
 

Vijay Bellur

2010-Apr-12 17:26 UTC


[Gluster-users] Replication destroying file content on Node crash

Kelvin Westlake wrote:
> Client vol file, mount on both servers to /glusterfs/home-mnt
>
>
> volume mirror-0
>     type cluster/replicate
>     subvolumes 192.168.100.31-1 192.168.100.29-1
> end-volume
>
> volume writebehind
>     type performance/write-behind
>     option cache-size 4MB
>     subvolumes mirror-0
> end-volume
>
> volume readahead
>     type performance/read-ahead
>     option page-count 4
>     subvolumes writebehind
> end-volume
>
> volume iocache
>     type performance/io-cache
>     option cache-size `echo $[ $(grep 'MemTotal' /proc/meminfo | sed 's/[^0-9]//g') / 5120 ]`MB
>     option cache-timeout 1
>     subvolumes readahead
> end-volume
>
> volume quickread
>     type performance/quick-read
>     option cache-timeout 1
>     option max-file-size 64kB
>     subvolumes iocache
> end-volume
>
> #volume statprefetch
> #    type performance/stat-prefetch
> #    subvolumes quickread
> #end-volume
>   

Hi Kelvin,

Can you please check whether the same behavior persists with the
quick-read translator section commented out in the client vol file?
If it doesn't, this could be related to bug 815.

Thanks,
Vijay
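
For reference, the test suggested above could look roughly like the following tail of the client vol file. This is only a sketch based on the mount.vol posted earlier in the thread: the quickread section is commented out in the same way the statprefetch section already was, and everything above it stays unchanged. It assumes that the client then uses the last volume still defined (iocache) as the top of the translator stack, and that the volume has to be unmounted and remounted for the change to take effect.

```
# ... sections up to and including "volume iocache" stay as posted ...

# quick-read disabled for this test
#volume quickread
#    type performance/quick-read
#    option cache-timeout 1
#    option max-file-size 64kB
#    subvolumes iocache
#end-volume

#volume statprefetch
#    type performance/stat-prefetch
#    subvolumes quickread
#end-volume
```

If the "Input/output error" no longer appears on files created while one server was down, that would point towards the quick-read translator, as the reply suggests.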
