José Manuel Canelas
2010-Feb-26 15:55 UTC
[Gluster-users] error: "Transport endpoint is not connected" and "Stale NFS file handle"
Hello, everyone.

We're setting up GlusterFS for some testing and are having some trouble with the configuration.

We have 4 nodes acting as both clients and servers, with 4 disks each. I'm trying to set up 3 replicas across all 16 disks, configured on the client side, for high availability and good performance, in a way that makes it easy to add new disks and nodes.

The best way I could think of doing it was to group disks from different nodes into 3 distributed volumes and then use each of those as a replica of the top volume. I'd like your input on this too, so if you look at the configuration and something looks wrong or dumb, it probably is, so please let me know :)

Now the server config looks like this:

volume posix1
  type storage/posix
  option directory /srv/gdisk01
end-volume

volume locks1
  type features/locks
  subvolumes posix1
end-volume

volume brick1
  type performance/io-threads
  option thread-count 8
  subvolumes locks1
end-volume

[4 more identical bricks and...]

volume server-tcp
  type protocol/server
  option transport-type tcp
  option auth.addr.brick1.allow *
  option auth.addr.brick2.allow *
  option auth.addr.brick3.allow *
  option auth.addr.brick4.allow *
  option transport.socket.listen-port 6996
  option transport.socket.nodelay on
  subvolumes brick1 brick2 brick3 brick4
end-volume

The client config:

volume node01-1
  type protocol/client
  option transport-type tcp
  option remote-host node01
  option transport.socket.nodelay on
  option transport.remote-port 6996
  option remote-subvolume brick1
end-volume

[repeated for every brick, until node04-4]

### Our 3 replicas
volume repstore1
  type cluster/distribute
  subvolumes node01-1 node02-1 node03-1 node04-1 node04-4
end-volume

volume repstore2
  type cluster/distribute
  subvolumes node01-2 node02-2 node03-2 node04-2 node02-2
end-volume

volume repstore3
  type cluster/distribute
  subvolumes node01-3 node02-3 node03-3 node04-3 node03-3
end-volume

volume replicate
  type cluster/replicate
  subvolumes repstore1 repstore2 repstore3
end-volume

[and then the performance bits]

When starting the glusterfs server, everything looks fine. I then mount the filesystem with

node01:~# glusterfs --debug -f /etc/glusterfs/glusterfs.vol /srv/gluster-export

and it does not complain and shows up as properly mounted. But when accessing the content, it gives back an error: "Transport endpoint is not connected". The log has a "Stale NFS file handle" warning. See below:

[...]
[2010-02-26 14:56:01] D [dht-common.c:274:dht_revalidate_cbk] repstore3: mismatching layouts for /
[2010-02-26 14:56:01] W [fuse-bridge.c:722:fuse_attr_cbk] glusterfs-fuse: 9: LOOKUP() / => -1 (Stale NFS file handle)

node01:~# mount
/dev/cciss/c0d0p1 on / type ext3 (rw,errors=remount-ro)
tmpfs on /lib/init/rw type tmpfs (rw,nosuid,mode=0755)
proc on /proc type proc (rw,noexec,nosuid,nodev)
sysfs on /sys type sysfs (rw,noexec,nosuid,nodev)
procbususb on /proc/bus/usb type usbfs (rw)
udev on /dev type tmpfs (rw,mode=0755)
tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev)
devpts on /dev/pts type devpts (rw,noexec,nosuid,gid=5,mode=620)
fusectl on /sys/fs/fuse/connections type fusectl (rw)
/dev/cciss/c0d1 on /srv/gdisk01 type ext3 (rw,errors=remount-ro)
/dev/cciss/c0d2 on /srv/gdisk02 type ext3 (rw,errors=remount-ro)
/dev/cciss/c0d3 on /srv/gdisk03 type ext3 (rw,errors=remount-ro)
/dev/cciss/c0d4 on /srv/gdisk04 type ext3 (rw,errors=remount-ro)
/etc/glusterfs/glusterfs.vol on /srv/gluster-export type fuse.glusterfs (rw,allow_other,default_permissions,max_read=131072)
node01:~# ls /srv/gluster-export
ls: cannot access /srv/gluster-export: Transport endpoint is not connected
node01:~#

The complete debug log and configuration files are attached.

Thank you in advance,
José Canelas

-------------- next part --------------
Name: glusterfs-debug-output.txt
URL: <http://supercolony.gluster.org/pipermail/gluster-users/attachments/20100226/b001cc27/attachment.txt>
-------------- next part --------------
Name: glusterfs.vol
URL: <http://supercolony.gluster.org/pipermail/gluster-users/attachments/20100226/b001cc27/attachment.ksh>
-------------- next part --------------
Name: glusterfsd.vol
URL: <http://supercolony.gluster.org/pipermail/gluster-users/attachments/20100226/b001cc27/attachment-0001.ksh>
José Manuel Canelas
2010-Mar-02 16:08 UTC
[Gluster-users] error: "Transport endpoint is not connected" and "Stale NFS file handle"
Hi,

Since no one has replied to this, I'll reply to myself :)

I just realized I assumed that it is possible to replicate distributed volumes. Am I wrong?

In my setup below I was trying to build "Replicated Distributed Storage", the inverse of what is described in
http://www.gluster.com/community/documentation/index.php/Distributed_Replicated_Storage.

Trying to draw a picture:

            replicated
 -------------|------------    <----->  3 replicas presented as one volume
  replica1  replica2  replica3
 ---|----------|---------|---  <----->  4 volumes, distributed, to make up
  4vols      4vols     4vols            each of the 3 volumes to be replicated

Is this dumb or is there a better way?

thanks,
José Canelas

On 02/26/2010 03:55 PM, José Manuel Canelas wrote:
> Hello, everyone.
>
> We're setting up GlusterFS for some testing and having some trouble with
> the configuration.
>
> [...]
Tejas N. Bhise
2010-Mar-02 17:45 UTC
[Gluster-users] error: "Transport endpoint is not connected" and "Stale NFS file handle"
Jose,

I would request you to use volgen.

A suggestion on the config: why not group disks on two nodes (a simple example: group1 from server1 and server2, group2 from server3 and server4) with replicate, and then put the distribute translator on top of the two replicated volumes? That would give you a single GlusterFS volume to be mounted on each of the clients.

Wouldn't something simple like that work for you? I gave a representative example with 2 replicas; it can easily be extended to 3 replicas as well.

Regards,
Tejas.

----- Original Message -----
From: "José Manuel Canelas" <jcanelas at co.sapo.pt>
To: gluster-users at gluster.org
Sent: Tuesday, March 2, 2010 9:38:32 PM GMT +05:30 Chennai, Kolkata, Mumbai, New Delhi
Subject: Re: [Gluster-users] error: "Transport endpoint is not connected" and "Stale NFS file handle"

Hi,

Since no one replies to this, I'll reply to myself :)

I just realized I assumed that it is possible to replicate distributed volumes. Am I wrong?

[...]
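As a rough, illustrative sketch of the layout Tejas describes (pair bricks with cluster/replicate first, then put cluster/distribute on top), a client-side volfile fragment for two replica pairs might look like the following. The volume names here are made up; the sketch assumes the protocol/client volumes node01-1 through node04-1 from the original post:

volume pair1
  type cluster/replicate
  # mirror node01's brick1 onto node02's brick1
  subvolumes node01-1 node02-1
end-volume

volume pair2
  type cluster/replicate
  # mirror node03's brick1 onto node04's brick1
  subvolumes node03-1 node04-1
end-volume

volume dist
  type cluster/distribute
  # spread files across the two mirrored pairs
  subvolumes pair1 pair2
end-volume

A similar layout can also be generated automatically with glusterfs-volgen (for example, something along the lines of glusterfs-volgen --name store1 --raid 1 node01:/srv/gdisk01 node02:/srv/gdisk01 ...), though the exact invocation depends on the GlusterFS version in use.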
José Manuel Canelas
2010-Mar-05 13:04 UTC
[Gluster-users] error: "Transport endpoint is not connected" and "Stale NFS file handle"
On 03/03/2010 01:12 PM, Tejas N. Bhise wrote:
> Congrats !! .. you have been converted to glusterfs too :-) ..hehe.

Well, we're still testing and we plan to look at Lustre next. Though I must say we really liked GlusterFS and are secretly rooting for it :)

> The backend subvolumes can be just directories. They need not be full
> filesystems/partitions etc. So on one machine, you can create one big
> filesystem and make directories in it. Each directory can be a
> backend subvolume. This is thin provisioning in GlusterFS. Then you
> need not bother about the smallest disk. You can delete files from one
> directory (hence one backend volume) to create space in another
> directory (hence another backend volume). As you play with the
> system, you will see more and more possibilities :-).

Hum! Interesting. Thanks for the tip.

I'm still curious about why the previous configuration didn't work. Can you tell me if using the replication translator over distributed volumes is supported?

> So what purpose do you intend to put the system to?

Well, we have a driving project, an application that needs scalable storage, and we'd like to use the chosen system for other existing projects that are now running over NFS-exported filesystems (hence the need for POSIX) and for future projects with largish/scalable storage needs. We plan on trying it for storing our VM images too, which I've seen some people do.

So, basically, we need a scalable, redundant and flexible storage system, hopefully also easy to operate and configure when adding new nodes and disks. GlusterFS seemed perfect, with its layered approach and shared-nothing design :)

Thank you all very much for your work!

regards,
zé
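To make the thin-provisioning idea from Tejas's reply concrete: each backend subvolume can simply be a directory on one large filesystem, so several bricks share the same pool of free space. A minimal server-side sketch, with hypothetical directory names rather than anything taken from the attached configs:

volume posix1
  type storage/posix
  # a plain directory, not a dedicated partition
  option directory /srv/bigfs/brick1
end-volume

volume posix2
  type storage/posix
  # a second brick on the same underlying filesystem
  option directory /srv/bigfs/brick2
end-volume

Because both bricks draw from the free space of /srv/bigfs, freeing space under one directory effectively makes it available to the other, which is the behaviour described above.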
Raghavendra G
2010-Mar-22 04:03 UTC
[Gluster-users] error: "Transport endpoint is not connected" and "Stale NFS file handle"
Hi Jose,

There is a mistake in the volume specification file. Please see the inlined comments below.

On Fri, Feb 26, 2010 at 7:55 PM, José Manuel Canelas <jcanelas at co.sapo.pt> wrote:
> Hello, everyone.
>
> [...]
>
> ### Our 3 replicas
> volume repstore1
>   type cluster/distribute
>   subvolumes node01-1 node02-1 node03-1 node04-1 node04-4
> end-volume
>
> volume repstore2
>   type cluster/distribute
>   subvolumes node01-2 node02-2 node03-2 node04-2 node02-2

node02-2 is specified twice. Any subvolume should be specified only once here.

> end-volume
>
> volume repstore3
>   type cluster/distribute
>   subvolumes node01-3 node02-3 node03-3 node04-3 node03-3

same here. node03-3 is specified twice.

> end-volume
>
> volume replicate
>   type cluster/replicate
>   subvolumes repstore1 repstore2 repstore3
> end-volume
>
> [...]

Let us know whether this fixes your problem :).

--
Raghavendra G
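For reference, with the duplicates Raghavendra points out removed, the two distribute volumes from the original client config would presumably read:

volume repstore2
  type cluster/distribute
  subvolumes node01-2 node02-2 node03-2 node04-2
end-volume

volume repstore3
  type cluster/distribute
  subvolumes node01-3 node02-3 node03-3 node04-3
end-volume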