thr3ads.net - Gluster users - [Gluster-users] Fwd: files not syncing up with glusterfs 3.1.2 [Feb 2011]


paul simpson

2011-Feb-19 01:24 UTC

[Gluster-users] Fwd: files not syncing up with glusterfs 3.1.2

hello all,

i have been testing gluster as a central file server for a small animation
studio/post production company.  my initial experiments were using the fuse
glusterfs protocol - but that ran extremely slowly for home dirs and general
file sharing.  we have since switched to using NFS over glusterfs.  NFS
has certainly seemed more responsive re. stat and dir traversal.  however,
i'm now being plagued with three different types of errors:

1/ Stale NFS file handle
2/ input/output errors
3/ and a new one:
$ l -l /n/auto/gv1/production/conan/hda/published/OLD/
ls: cannot access /n/auto/gv1/production/conan/hda/published/OLD/shot:
Remote I/O error
total 0
d????????? ? ? ? ?                ? shot

...so it's a bit all over the place.  i've tried rebooting both servers and
clients.  these issues are very erratic - they come and go.

some information on my setup: glusterfs 3.1.2

g1:~ # gluster volume info

Volume Name: glustervol1
Type: Distributed-Replicate
Status: Started
Number of Bricks: 4 x 2 = 8
Transport-type: tcp
Bricks:
Brick1: g1:/mnt/glus1
Brick2: g2:/mnt/glus1
Brick3: g3:/mnt/glus1
Brick4: g4:/mnt/glus1
Brick5: g1:/mnt/glus2
Brick6: g2:/mnt/glus2
Brick7: g3:/mnt/glus2
Brick8: g4:/mnt/glus2
Options Reconfigured:
performance.write-behind-window-size: 1mb
performance.cache-size: 1gb
performance.stat-prefetch: 1
network.ping-timeout: 20
diagnostics.latency-measurement: off
diagnostics.dump-fd-stats: on
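
A minimal sketch of how the "4 x 2 = 8" layout above pairs up, assuming this volume follows the usual Gluster rule that each group of consecutive bricks in the listed order forms one replica set. The function name `replica_sets` is made up for illustration; it is not a gluster command or API.

```python
from typing import List

def replica_sets(bricks: List[str], replica: int) -> List[List[str]]:
    """Group bricks into replica sets of `replica` consecutive entries."""
    if replica < 1 or len(bricks) % replica != 0:
        raise ValueError("brick count must be a multiple of the replica count")
    return [bricks[i:i + replica] for i in range(0, len(bricks), replica)]

# Brick list copied from the `gluster volume info` output above.
BRICKS = [
    "g1:/mnt/glus1", "g2:/mnt/glus1", "g3:/mnt/glus1", "g4:/mnt/glus1",
    "g1:/mnt/glus2", "g2:/mnt/glus2", "g3:/mnt/glus2", "g4:/mnt/glus2",
]

if __name__ == "__main__":
    # Assumption: replica count 2, taken from "4 x 2 = 8".
    for n, pair in enumerate(replica_sets(BRICKS, 2), start=1):
        print(f"replica set {n}: {' <-> '.join(pair)}")
```

Under that assumption every replica pair spans two different servers (g1/g2 or g3/g4), so a file that misbehaves should have its two copies compared on the matching pair of bricks.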


that is 4 servers - serving ~30 clients - 95% linux, 5% mac.  all NFS.
 other points:
- i'm automounting using NFS via autofs (with ldap).  ie:
  gus:/glustervol1 on /n/auto/gv1 type nfs (rw,vers=3,rsize=32768,wsize=32768,intr,sloppy,addr=10.0.0.13)
gus is pointing to rr dns machines (g1,g2,g3,g4).  that all seems to be
working.

- backend file system on g[1-4] is xfs.  ie,

g1:/var/log/glusterfs # xfs_info /mnt/glus1
meta-data=/dev/sdb1              isize=256    agcount=7, agsize=268435200 blks
         =                       sectsz=512   attr=2
data     =                       bsize=4096   blocks=1627196928, imaxpct=5
         =                       sunit=256    swidth=2560 blks
naming   =version 2              bsize=4096   ascii-ci=0
log      =internal               bsize=4096   blocks=32768, version=2
         =                       sectsz=512   sunit=8 blks, lazy-count=0
realtime =none                   extsz=4096   blocks=0, rtextents=0


- sometimes root can stat/read the file in question while the user cannot!
 i can remount the same NFS share to another mount point - and i can then
see that with the same user.

- sample output of g1 nfs.log file:

[2011-02-18 15:27:07.201433] I [io-stats.c:338:io_stats_dump_fd] glustervol1:       Filename : /production/conan/hda/published/shot/backup/.svn/tmp/entries
[2011-02-18 15:27:07.201445] I [io-stats.c:353:io_stats_dump_fd] glustervol1:   BytesWritten : 1414 bytes
[2011-02-18 15:27:07.201455] I [io-stats.c:365:io_stats_dump_fd] glustervol1: Write 001024b+ : 1
[2011-02-18 15:27:07.205999] I [io-stats.c:333:io_stats_dump_fd] glustervol1: --- fd stats ---
[2011-02-18 15:27:07.206032] I [io-stats.c:338:io_stats_dump_fd] glustervol1:       Filename : /production/conan/hda/published/shot/backup/.svn/props/tempfile.tmp
[2011-02-18 15:27:07.210799] I [io-stats.c:333:io_stats_dump_fd] glustervol1: --- fd stats ---
[2011-02-18 15:27:07.210824] I [io-stats.c:338:io_stats_dump_fd] glustervol1:       Filename : /production/conan/hda/published/shot/backup/.svn/tmp/log
[2011-02-18 15:27:07.211904] I [io-stats.c:333:io_stats_dump_fd] glustervol1: --- fd stats ---
[2011-02-18 15:27:07.211928] I [io-stats.c:338:io_stats_dump_fd] glustervol1:       Filename : /prod_data/xmas/lgl/pic/mr_all_PBR_HIGHNO_DF/035/1920x1080/mr_all_PBR_HIGHNO_DF.6084.exr
[2011-02-18 15:27:07.211940] I [io-stats.c:343:io_stats_dump_fd] glustervol1:       Lifetime : 8731secs, 610796usecs
[2011-02-18 15:27:07.211951] I [io-stats.c:353:io_stats_dump_fd] glustervol1:   BytesWritten : 2321370 bytes
[2011-02-18 15:27:07.211962] I [io-stats.c:365:io_stats_dump_fd] glustervol1: Write 000512b+ : 1
[2011-02-18 15:27:07.211972] I [io-stats.c:365:io_stats_dump_fd] glustervol1: Write 002048b+ : 1
[2011-02-18 15:27:07.211983] I [io-stats.c:365:io_stats_dump_fd] glustervol1: Write 004096b+ : 4
[2011-02-18 15:27:07.212009] I [io-stats.c:365:io_stats_dump_fd] glustervol1: Write 008192b+ : 4
[2011-02-18 15:27:07.212019] I [io-stats.c:365:io_stats_dump_fd] glustervol1: Write 016384b+ : 20
[2011-02-18 15:27:07.212030] I [io-stats.c:365:io_stats_dump_fd] glustervol1: Write 032768b+ : 54
[2011-02-18 15:27:07.228051] I [io-stats.c:333:io_stats_dump_fd] glustervol1: --- fd stats ---
[2011-02-18 15:27:07.228078] I [io-stats.c:338:io_stats_dump_fd] glustervol1:       Filename :  /production/conan/hda/published/shot/backup/.svn/tmp/entries

...so, the files not working don't have lifetime, read/written lines after
their log entry.
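
One way to check that observation across a whole nfs.log is sketched below. It assumes each io-stats entry sits on a single line in the real file (the wrapping comes from mail) and that a "--- fd stats ---" line starts a new per-file block, as the sample suggests. The names `parse_fd_stats` and `incomplete_blocks` are hypothetical helpers, not part of gluster.

```python
import re
from typing import Dict, Iterable, List

# Matches one io_stats_dump_fd entry: timestamp, level, source tag, volume, body.
ENTRY = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\s+(?P<level>[A-Z])\s+"
    r"\[io-stats\.c:\d+:io_stats_dump_fd\]\s+"
    r"(?P<vol>[^:]+):\s*(?P<body>.*)$"
)

def parse_fd_stats(lines: Iterable[str]) -> List[Dict[str, str]]:
    """Collect 'key : value' entries into one dict per fd stats block."""
    blocks: List[Dict[str, str]] = []
    current = None
    for line in lines:
        m = ENTRY.match(line.strip())
        if not m:
            continue  # not an io-stats fd entry
        body = m.group("body").strip()
        if body == "--- fd stats ---" or current is None:
            # A marker (or entries seen before any marker) opens a new block.
            current = {"timestamp": m.group("ts")}
            blocks.append(current)
            if body == "--- fd stats ---":
                continue
        key, sep, value = body.partition(" : ")
        if sep:
            current[key.strip()] = value.strip()
    return blocks

def incomplete_blocks(blocks: List[Dict[str, str]]) -> List[str]:
    """Filenames whose block has neither a Lifetime nor any Bytes* line."""
    return [
        b["Filename"] for b in blocks
        if "Filename" in b
        and "Lifetime" not in b
        and not any(k.startswith("Bytes") for k in b)
    ]

# Two blocks taken from the log sample above, one entry per line.
SAMPLE = """
[2011-02-18 15:27:07.205999] I [io-stats.c:333:io_stats_dump_fd] glustervol1: --- fd stats ---
[2011-02-18 15:27:07.206032] I [io-stats.c:338:io_stats_dump_fd] glustervol1:       Filename : /production/conan/hda/published/shot/backup/.svn/props/tempfile.tmp
[2011-02-18 15:27:07.211904] I [io-stats.c:333:io_stats_dump_fd] glustervol1: --- fd stats ---
[2011-02-18 15:27:07.211928] I [io-stats.c:338:io_stats_dump_fd] glustervol1:       Filename : /prod_data/xmas/lgl/pic/mr_all_PBR_HIGHNO_DF/035/1920x1080/mr_all_PBR_HIGHNO_DF.6084.exr
[2011-02-18 15:27:07.211940] I [io-stats.c:343:io_stats_dump_fd] glustervol1:       Lifetime : 8731secs, 610796usecs
[2011-02-18 15:27:07.211951] I [io-stats.c:353:io_stats_dump_fd] glustervol1:   BytesWritten : 2321370 bytes
"""

if __name__ == "__main__":
    for name in incomplete_blocks(parse_fd_stats(SAMPLE.splitlines())):
        print("no lifetime/bytes lines:", name)
```

A block without those lines may only mean the fd was opened and closed with no I/O, so the list is a starting point for comparing against the failing paths, not proof of a fault.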

all very perplexing - and scary.  one thing that reliably fails is using svn
working dirs on the gluster filesystem.  nfs locks keep being dropped.  this
is temporarily fixed when i view the file as root (on a client) - but then
re-appears very quickly.  i assume that gluster is up to something as simple
as having svn working dirs?

i'm hoping i've done something stupid which is easily fixed.  we seem so
close - but right now, i'm at a loss and losing confidence.  i would
greatly appreciate any help/pointers out there.

regards,

paul

Fabricio Cannini

2011-Feb-21 14:42 UTC


On Friday 18 February 2011, at 23:24:10, paul simpson wrote:
> hello all,
Hi Paul.

I've been using gluster for ~6 months now, so I'm by no means an expert, but I
can see that you're doing two things that are discouraged by the devels:

- Using xfs as a backend filesystem
- Serving small files ( < 1MB of size ) 
( I'm assuming that because of log messages like this )
> [2011-02-18 15:27:07.201433] I [io-stats.c:338:io_stats_dump_fd]
> glustervol1:       Filename :
> /production/conan/hda/published/shot/backup/.svn/tmp/entries
> [2011-02-18 15:27:07.201445] I [io-stats.c:353:io_stats_dump_fd]
> glustervol1:   BytesWritten : 1414 bytes
Can anyone please confirm or correct my assumptions?

TIA

Shehjar Tikoo

2011-Feb-22 07:14 UTC


paul simpson wrote:
> hello all,
> 
> i have been testing gluster as a central file server for a small animation
> studio/post production company.  my initial experiments were using the fuse
> glusterfs protocol - but that ran extremely slowly for home dirs and general
> file sharing.  we have since switched to using NFS over glusterfs.  NFS
> has certainly seemed more responsive re. stat and dir traversal.  however,
> i'm now being plagued with three different types of errors:
> 
> 1/ Stale NFS file handle
> 2/ input/output errors
> 3/ and a new one:
> $ l -l /n/auto/gv1/production/conan/hda/published/OLD/
> ls: cannot access /n/auto/gv1/production/conan/hda/published/OLD/shot:
> Remote I/O error
> total 0
> d????????? ? ? ? ?                ? shot
> 
> ...so it's a bit all over the place.  i've tried rebooting both servers and
> clients.  these issues are very erratic - they come and go.
> 
> some information on my setup: glusterfs 3.1.2
> 
> g1:~ # gluster volume info
> 
> Volume Name: glustervol1
> Type: Distributed-Replicate
> Status: Started
> Number of Bricks: 4 x 2 = 8
> Transport-type: tcp
> Bricks:
> Brick1: g1:/mnt/glus1
> Brick2: g2:/mnt/glus1
> Brick3: g3:/mnt/glus1
> Brick4: g4:/mnt/glus1
> Brick5: g1:/mnt/glus2
> Brick6: g2:/mnt/glus2
> Brick7: g3:/mnt/glus2
> Brick8: g4:/mnt/glus2
> Options Reconfigured:
> performance.write-behind-window-size: 1mb
> performance.cache-size: 1gb
> performance.stat-prefetch: 1
> network.ping-timeout: 20
> diagnostics.latency-measurement: off
> diagnostics.dump-fd-stats: on
>
> that is 4 servers - serving ~30 clients - 95% linux, 5% mac.  all NFS.
Mac OS as an NFS client remains untested against Gluster NFS. Do you see
these errors on the Mac clients or the Linux clients?

>  other points:
> - i'm automounting using NFS via autofs (with ldap).  ie:
>   gus:/glustervol1 on /n/auto/gv1 type nfs
> (rw,vers=3,rsize=32768,wsize=32768,intr,sloppy,addr=10.0.0.13)
> gus is pointing to rr dns machines (g1,g2,g3,g4).  that all seems to be
> working.
> 
> - backend files system on g[1-4] is xfs.  ie,
> 
> g1:/var/log/glusterfs # xfs_info /mnt/glus1
> meta-data=/dev/sdb1              isize=256    agcount=7, agsize=268435200
> blks
>          =                       sectsz=512   attr=2
> data     =                       bsize=4096   blocks=1627196928, imaxpct=5
>          =                       sunit=256    swidth=2560 blks
> naming   =version 2              bsize=4096   ascii-ci=0
> log      =internal               bsize=4096   blocks=32768, version=2
>          =                       sectsz=512   sunit=8 blks, lazy-count=0
> realtime =none                   extsz=4096   blocks=0, rtextents=0
> 
> 
> - sometimes root can stat/read the file in question while the user cannot!
>  i can remount the same NFS share to another mount point - and i can then
> see that with the same user.
I think that may be occurring because NFS+LDAP requires a slightly 
different authentication scheme as compared to a NFS only setup. Please try 
the same test without LDAP in the middle.
> 
> - sample output of g1 nfs.log file:
> 
> [2011-02-18 15:27:07.201433] I [io-stats.c:338:io_stats_dump_fd]
> glustervol1:       Filename :
> /production/conan/hda/published/shot/backup/.svn/tmp/entries
> [2011-02-18 15:27:07.201445] I [io-stats.c:353:io_stats_dump_fd]
> glustervol1:   BytesWritten : 1414 bytes
> [2011-02-18 15:27:07.201455] I [io-stats.c:365:io_stats_dump_fd]
> glustervol1: Write 001024b+ : 1
> [2011-02-18 15:27:07.205999] I [io-stats.c:333:io_stats_dump_fd]
> glustervol1: --- fd stats ---
> [2011-02-18 15:27:07.206032] I [io-stats.c:338:io_stats_dump_fd]
> glustervol1:       Filename :
> /production/conan/hda/published/shot/backup/.svn/props/tempfile.tmp
> [2011-02-18 15:27:07.210799] I [io-stats.c:333:io_stats_dump_fd]
> glustervol1: --- fd stats ---
> [2011-02-18 15:27:07.210824] I [io-stats.c:338:io_stats_dump_fd]
> glustervol1:       Filename :
> /production/conan/hda/published/shot/backup/.svn/tmp/log
> [2011-02-18 15:27:07.211904] I [io-stats.c:333:io_stats_dump_fd]
> glustervol1: --- fd stats ---
> [2011-02-18 15:27:07.211928] I [io-stats.c:338:io_stats_dump_fd]
> glustervol1:       Filename :
> /prod_data/xmas/lgl/pic/mr_all_PBR_HIGHNO_DF/035/1920x1080/mr_all_PBR_HIGHNO_DF.6084.exr
> [2011-02-18 15:27:07.211940] I [io-stats.c:343:io_stats_dump_fd]
> glustervol1:       Lifetime : 8731secs, 610796usecs
> [2011-02-18 15:27:07.211951] I [io-stats.c:353:io_stats_dump_fd]
> glustervol1:   BytesWritten : 2321370 bytes
> [2011-02-18 15:27:07.211962] I [io-stats.c:365:io_stats_dump_fd]
> glustervol1: Write 000512b+ : 1
> [2011-02-18 15:27:07.211972] I [io-stats.c:365:io_stats_dump_fd]
> glustervol1: Write 002048b+ : 1
> [2011-02-18 15:27:07.211983] I [io-stats.c:365:io_stats_dump_fd]
> glustervol1: Write 004096b+ : 4
> [2011-02-18 15:27:07.212009] I [io-stats.c:365:io_stats_dump_fd]
> glustervol1: Write 008192b+ : 4
> [2011-02-18 15:27:07.212019] I [io-stats.c:365:io_stats_dump_fd]
> glustervol1: Write 016384b+ : 20
> [2011-02-18 15:27:07.212030] I [io-stats.c:365:io_stats_dump_fd]
> glustervol1: Write 032768b+ : 54
> [2011-02-18 15:27:07.228051] I [io-stats.c:333:io_stats_dump_fd]
> glustervol1: --- fd stats ---
> [2011-02-18 15:27:07.228078] I [io-stats.c:338:io_stats_dump_fd]
> glustervol1:       Filename :
> /production/conan/hda/published/shot/backup/.svn/tmp/entries
> 
> ...so, the files not working don't have lifetime, read/written lines after
> their log entry.
> 
I'll need the log for the NFS server in TRACE log level when you run a 
command that results in any of the errors above. i.e. stale file handle, 
remote IO error and input/output error.

Thanks
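
To line client-side failures up with the server's TRACE log, something like the following sketch could be run on an affected client. It is only an illustration: `scan` and `classify` are made-up names, the timestamps are taken in UTC (adjust if the server logs in local time), and `EREMOTEIO` is assumed to be the errno behind "Remote I/O error" on Linux.

```python
import errno
import os
import sys
import time
from typing import List, Optional, Tuple

# errno names for the three symptoms reported in this thread.
# hasattr() guards platforms (e.g. Mac OS) that lack EREMOTEIO.
WATCHED = {
    getattr(errno, name): name
    for name in ("ESTALE", "EIO", "EREMOTEIO")
    if hasattr(errno, name)
}

def classify(exc: OSError) -> Optional[str]:
    """Return the errno name if it is one of the watched errors, else None."""
    return WATCHED.get(exc.errno)

def scan(root: str) -> List[Tuple[str, str, str]]:
    """Walk `root`, lstat every entry, and record watched failures."""
    hits: List[Tuple[str, str, str]] = []

    def record(exc: OSError) -> None:
        name = classify(exc)
        if name is not None:
            stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.gmtime())
            hits.append((stamp, name, str(exc.filename)))

    # onerror catches failures while listing a directory.
    for dirpath, dirnames, filenames in os.walk(root, onerror=record):
        for entry in dirnames + filenames:
            try:
                os.lstat(os.path.join(dirpath, entry))
            except OSError as exc:
                record(exc)
    return hits

if __name__ == "__main__":
    # Usage (path is an example): python scan.py /n/auto/gv1/production
    for stamp, name, path in scan(sys.argv[1] if len(sys.argv) > 1 else "."):
        print(stamp, name, path)
```

Each printed line gives a time, an error name and a path that can be searched for in the NFS server log around the same moment.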


> all very perplexing - and scary.  one thing that reliably fails is using svn
> working dirs on the gluster filesystem.  nfs locks keep being dropped.  this
> is temporarily fixed when i view the file as root (on a client) - but then
> re-appears very quickly.  i assume that gluster is up to something as simple
> as having svn working dirs?
> 
> i'm hoping i've done something stupid which is easily fixed.  we seem so
> close - but right now, i'm at a loss and losing confidence.  i would
> greatly appreciate any help/pointers out there.
> 
> regards,
> 
> paul
