phil cryer
2011-Apr-08 13:43 UTC
[Gluster-users] 'Transport endpoint is not connected' occurs while running long jobs
I'm having failures with long-running processes. I'm running glusterfs 3.1.2 (glusterfs 3.1.2 built on Jan 16 2011 18:14:56 - Repository revision: v3.1.1-64-gf2a067c) on Debian 6 (squeeze), and it has been stable in normal use, serving images via HTTP. But when I start a long-running task (last night it was a Google sitemap generator; other times it was a chmod -R across a section of directories), it eventually crashes with the errors below: Transport endpoint is not connected.

To recover, I have to stop glusterfsd, killall any remaining glusterfs/glusterfsd processes, unmount the gluster share, restart glusterfsd, and then remount the share.

What can I do to fix this? I wanted to run the sitemap generator once a week, and since it can be throttled I thought it wouldn't be as heavy-handed as chown or chmod, but no, it crashes the same way.

[...]
2011/04/08 09:14:15 [crit] 24137#0: *8533 open() "/mnt/glusterfs/www/r/recordsofgeneral04lond/recordsofgeneral04lond_bw.pdf" failed (107: Transport endpoint is not connected), client: 128.128.164.174, server: cluster.biodiversitylibrary.org, request: "GET /r/recordsofgeneral04lond/recordsofgeneral04lond_bw.pdf HTTP/1.1", host: "cluster.biodiversitylibrary.org"
2011/04/08 09:14:23 [crit] 24137#0: *8534 open() "/mnt/glusterfs/www/d/dieumbelliferenu00liro/dieumbelliferenu00liro_djvu.txt" failed (107: Transport endpoint is not connected), client: 128.128.164.174, server: cluster.biodiversitylibrary.org, request: "GET /d/dieumbelliferenu00liro/dieumbelliferenu00liro_djvu.txt HTTP/1.0", host: "cluster.biodiversitylibrary.org"
2011/04/08 09:14:33 [crit] 24137#0: *8535 open() "/mnt/glusterfs/www/j/justsbotanischer4601berl/justsbotanischer4601berl_metasource.xml" failed (107: Transport endpoint is not connected), client: 128.128.164.174, server: cluster.biodiversitylibrary.org, request: "GET /j/justsbotanischer4601berl/justsbotanischer4601berl_metasource.xml HTTP/1.1", host: "cluster.biodiversitylibrary.org"
2011/04/08 09:14:39 [crit] 24137#0: *8537 open() "/mnt/glusterfs/www/r/recordsofindianm21indi/recordsofindianm21indi.gif" failed (107: Transport endpoint is not connected), client: 128.128.164.174, server: cluster.biodiversitylibrary.org, request: "GET /r/recordsofindianm21indi/recordsofindianm21indi.gif HTTP/1.1", host: "cluster.biodiversitylibrary.org"
2011/04/08 09:14:51 [crit] 24137#0: *8538 open() "/mnt/glusterfs/www/r/recherchesdemorp00ameg/recherchesdemorp00ameg_dc.xml" failed (107: Transport endpoint is not connected), client: 128.128.164.174, server: cluster.biodiversitylibrary.org, request: "GET /r/recherchesdemorp00ameg/recherchesdemorp00ameg_dc.xml HTTP/1.1", host: "cluster.biodiversitylibrary.org"
2011/04/08 09:15:06 [crit] 24137#0: *8539 open() "/mnt/glusterfs/www/j/journalfrdiega56stut/journalfrdiega56stut.pdf" failed (107: Transport endpoint is not connected), client: 128.128.164.174, server: cluster.biodiversitylibrary.org, request: "GET /j/journalfrdiega56stut/journalfrdiega56stut.pdf HTTP/1.1", host: "cluster.biodiversitylibrary.org"
2011/04/08 09:15:17 [crit] 24137#0: *8540 stat() "/mnt/glusterfs/www/r/reisenundforschu32schr" failed (107: Transport endpoint is not connected), client: 128.128.164.174, server: cluster.biodiversitylibrary.org, request: "GET /r/reisenundforschu32schr/ HTTP/1.1", host: "cluster.biodiversitylibrary.org"
2011/04/08 09:15:27 [crit] 24137#0: *8541 open() "/mnt/glusterfs/www/r/recreativescienc01lond/recreativescienc01lond_bw.pdf" failed (107: Transport endpoint is not connected), client: 128.128.164.174, server: cluster.biodiversitylibrary.org, request: "GET /r/recreativescienc01lond/recreativescienc01lond_bw.pdf HTTP/1.1", host: "cluster.biodiversitylibrary.org"
[...]

Thanks

P
--
http://philcryer.com
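
For reference, the 107 in the log entries above is ENOTCONN on Linux, the error code behind "Transport endpoint is not connected". The following is only a minimal sketch, assuming Python is available on the web host, of how a wrapper around a long job could detect that the mount has gone away. The function name mount_is_connected and its stat parameter are hypothetical, introduced here for illustration; the check only detects the failure and does nothing to address why the client crashes.

```python
import errno
import os


def mount_is_connected(path, stat=os.stat):
    """Return False if stat() on path fails with ENOTCONN (errno 107 on Linux).

    The stat parameter is a hypothetical hook so the check can be
    exercised without a real broken mount; by default it is os.stat.
    """
    try:
        stat(path)
    except OSError as exc:
        if exc.errno == errno.ENOTCONN:
            # Same condition the web server reports as
            # "(107: Transport endpoint is not connected)".
            return False
        raise  # any other error is not the failure described in this post
    return True
```

A weekly job could call mount_is_connected("/mnt/glusterfs") (the mount point shown in the log) before and after its run and raise an alert when it returns False, rather than waiting for the errors to show up in the web server log.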