reitenbach_pub@rapideye.de
2007-Jun-01 07:27 UTC
[Ocfs2-users] Transport endpoint is not connected
Hi,

I am on SLES 10/64Bit, with the following ocfs2 RPMs installed:
ocfs2-tools-1.2.2-0.2
ocfs2console-1.2.2-0.2

I tried to remove a file on an ocfs2 partition, but got the following error message:

rm: cannot remove `somefile.xml': Transport endpoint is not connected

In /var/log/messages I saw the following log statements:

Jun 1 16:00:49 ppsnfs101 kernel: (4739,1):ocfs2_broadcast_vote:725 ERROR: status = -107
Jun 1 16:00:49 ppsnfs101 kernel: (4739,1):ocfs2_do_request_vote:798 ERROR: status = -107
Jun 1 16:00:49 ppsnfs101 kernel: (4739,1):ocfs2_unlink:840 ERROR: status = -107

On the other hosts in the cluster, I was able to remove the file without a problem.

Yesterday everything was fine; overnight, one of the hosts in the cluster froze.

What does the error message above mean?

Kind regards
Sebastian
Network timeout.

Did a node die during the delete? If so, then the error is to be expected. As in, retry the rm and it should succeed.

Or, were you umounting that volume on another node during the said delete? If so, then this should not happen.
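The "retry the rm" advice above can be sketched as a small helper. This is only an illustration under stated assumptions: `remove_with_retry`, its parameters, and the retry count and delay are hypothetical names and values, not part of ocfs2-tools, and the sketch assumes the failure surfaces to user space as an `OSError` carrying `ENOTCONN`, as the `rm` message suggests.

```python
import errno
import os
import time


def remove_with_retry(path, attempts=3, delay=1.0,
                      remove=os.remove, sleep=time.sleep):
    """Remove `path`, retrying only when the call fails with ENOTCONN.

    Hypothetical helper: `remove` and `sleep` are injectable so the
    behaviour can be exercised without a real cluster filesystem.
    Returns the number of tries that were needed.
    """
    for attempt in range(1, attempts + 1):
        try:
            remove(path)
            return attempt
        except OSError as exc:
            # Any error other than ENOTCONN, or the last failed try,
            # is passed straight back to the caller.
            if exc.errno != errno.ENOTCONN or attempt == attempts:
                raise
            sleep(delay)
```

A call such as `remove_with_retry("somefile.xml")` would give up after the third ENOTCONN and re-raise it, which is the point at which the logs are worth collecting.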
Did you retry the rm operation on the node where it failed?

The error itself is ENOTCONN. We set the connection status to that during mount, umount and after we get an ETIMEDOUT. We can set aside the ETIMEDOUT case as that is typically associated with a node death. But if there were no mounts and umounts (if the error occurred during either I would consider that a bug), I don't see any reason for the error. And yet it happened.

Log a bugzilla and attach the /var/log/messages for all the nodes. Upload messages as-is. Also, indicate the time so I know where to look. Also, mention all version numbers, etc.

Sunil

Jun 1 16:00:49 ppsnfs101 kernel: (4739,1):ocfs2_broadcast_vote:725 ERROR: status = -107
Jun 1 16:00:49 ppsnfs101 kernel: (4739,1):ocfs2_do_request_vote:798 ERROR: status = -107
Jun 1 16:00:49 ppsnfs101 kernel: (4739,1):ocfs2_unlink:840 ERROR: status = -107

#define ENOTCONN 107 /* Transport endpoint is not connected */

Sebastian Reitenbach wrote:
> Hi,
> Sunil Mushran <Sunil.Mushran@oracle.com> wrote:
>
>> Network timeout.
>>
>> Did a node die during the delete? If so, then the error is
>> to be expected. As in, retry the rm and it should succeed.
>>
> One node in the cluster died overnight. This is a Gigabit LAN, all the
> servers are in the same segment... so I do not think it is a network timeout.
>
> While deleting, no server died, nor did I try to umount.
> A mounted.ocfs2 showed me all 4 running hosts in the group (on the node that
> had the problems). ssh to another node, and rm the file was not a problem.
>
>> Or, were you umounting that volume on another node during
>> the said delete? If so, then this should not happen.
>>
> No, no umount while deleting.
>
> kind regards
> Sebastian
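For readers decoding similar kernel messages, the mapping from `status = -107` to ENOTCONN can be checked with a few lines of Python. This is a minimal sketch: `decode_status`, `explain`, and the regular expression are hypothetical helpers written for the log format quoted in this thread, and errno numbers are platform-specific (107 is the Linux value given in the `#define` above).

```python
import errno
import os
import re

# Matches the tail of lines such as
# "... kernel: (4739,1):ocfs2_unlink:840 ERROR: status = -107"
STATUS_RE = re.compile(
    r"\):(?P<func>\w+):(?P<line>\d+) ERROR: status = (?P<status>-?\d+)"
)


def decode_status(status):
    """Map a negative kernel status to (errno name, message)."""
    code = abs(status)
    return errno.errorcode.get(code, "UNKNOWN"), os.strerror(code)


def explain(log_line):
    """Return (function, errno name, message), or None if no match."""
    match = STATUS_RE.search(log_line)
    if match is None:
        return None
    name, text = decode_status(int(match.group("status")))
    return match.group("func"), name, text
```

Running `explain` over each of the three log lines names the function that reported the error and the symbolic errno, which is handy when attaching logs to a bug report.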