It seems that the connection gets dropped (or cannot even be established). Is ssh
authentication set up properly for the second volume?
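Both errors in that log would fit a dying transport: pickle.load() raises EOFError
when the stream it reads from ends before a complete pickle arrives, and the later
UnpicklingError ("invalid load key") is the same kind of thing, just with stray
bytes arriving instead of a clean end-of-stream. A quick illustration of the
EOFError case (just a hypothetical snippet, not gsyncd code):

    import pickle
    from io import BytesIO

    # Minimal sketch: if the remote end of the RPC stream goes away before
    # a complete pickle arrives, pickle.load() on the now-empty stream
    # raises EOFError -- the same exception repce.recv() reports above.
    try:
        pickle.load(BytesIO(b""))
    except EOFError as exc:
        print("EOFError: %s" % exc)

So I'd first verify that ssh from the master node to the slave works
non-interactively for that volume and that the connection stays up.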
Csaba
On Thu, Jun 30, 2011 at 4:22 PM, Adrian Carpenter <tac12 at wbic.cam.ac.uk> wrote:
> Hi Csaba,
>
> I'm now seeing consistent errors with a second volume:
>
> [2011-06-30 06:08:48.299174] I [monitor(monitor):19:set_state] Monitor: new state: OK
> [2011-06-30 09:27:46.875745] E [syncdutils:131:exception] <top>: FAIL:
> Traceback (most recent call last):
>   File "/opt/glusterfs/3.2.1/local/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 152, in twrap
>     tf(*aa)
>   File "/opt/glusterfs/3.2.1/local/libexec/glusterfs/python/syncdaemon/repce.py", line 118, in listen
>     rid, exc, res = recv(self.inf)
>   File "/opt/glusterfs/3.2.1/local/libexec/glusterfs/python/syncdaemon/repce.py", line 42, in recv
>     return pickle.load(inf)
> EOFError
> [2011-06-30 09:27:58.413588] I [monitor(monitor):42:monitor] Monitor: ------------------------------------------------------------
> [2011-06-30 09:27:58.413830] I [monitor(monitor):43:monitor] Monitor: starting gsyncd worker
> [2011-06-30 09:27:58.479687] I [gsyncd:286:main_i] <top>: syncing: gluster://localhost:user-volume -> file:///geo-tank/user-volume
> [2011-06-30 09:28:03.963303] I [master:181:crawl] GMaster: new master is a747062e-1caa-4cb3-9f86-34d03486a842
> [2011-06-30 09:28:03.963587] I [master:187:crawl] GMaster: primary master with volume id a747062e-1caa-4cb3-9f86-34d03486a842 ...
> [2011-06-30 09:34:35.592005] E [syncdutils:131:exception] <top>: FAIL:
> Traceback (most recent call last):
>   File "/opt/glusterfs/3.2.1/local/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 152, in twrap
>     tf(*aa)
>   File "/opt/glusterfs/3.2.1/local/libexec/glusterfs/python/syncdaemon/repce.py", line 118, in listen
>     rid, exc, res = recv(self.inf)
>   File "/opt/glusterfs/3.2.1/local/libexec/glusterfs/python/syncdaemon/repce.py", line 42, in recv
>     return pickle.load(inf)
> EOFError
> [2011-06-30 09:34:45.595258] I [monitor(monitor):42:monitor] Monitor: ------------------------------------------------------------
> [2011-06-30 09:34:45.595668] I [monitor(monitor):43:monitor] Monitor: starting gsyncd worker
> [2011-06-30 09:34:45.661334] I [gsyncd:286:main_i] <top>: syncing: gluster://localhost:user-volume -> file:///geo-tank/user-volume
> [2011-06-30 09:34:51.145607] I [master:181:crawl] GMaster: new master is a747062e-1caa-4cb3-9f86-34d03486a842
> [2011-06-30 09:34:51.145898] I [master:187:crawl] GMaster: primary master with volume id a747062e-1caa-4cb3-9f86-34d03486a842 ...
> [2011-06-30 12:35:54.394453] E [syncdutils:131:exception] <top>: FAIL:
> Traceback (most recent call last):
>   File "/opt/glusterfs/3.2.1/local/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 152, in twrap
>     tf(*aa)
>   File "/opt/glusterfs/3.2.1/local/libexec/glusterfs/python/syncdaemon/repce.py", line 118, in listen
>     rid, exc, res = recv(self.inf)
>   File "/opt/glusterfs/3.2.1/local/libexec/glusterfs/python/syncdaemon/repce.py", line 42, in recv
>     return pickle.load(inf)
> UnpicklingError: invalid load key, '???'.
> [2011-06-30 12:36:05.839510] I [monitor(monitor):42:monitor] Monitor: ------------------------------------------------------------
> [2011-06-30 12:36:05.839916] I [monitor(monitor):43:monitor] Monitor: starting gsyncd worker
> [2011-06-30 12:36:05.905232] I [gsyncd:286:main_i] <top>: syncing: gluster://localhost:user-volume -> file:///geo-tank/user-volume
> [2011-06-30 12:36:11.413764] I [master:181:crawl] GMaster: new master is a747062e-1caa-4cb3-9f86-34d03486a842
> [2011-06-30 12:36:11.414047] I [master:187:crawl] GMaster: primary master with volume id a747062e-1caa-4cb3-9f86-34d03486a842 ...
>
>
> Adrian
> On 28 Jun 2011, at 11:16, Csaba Henk wrote:
>
>> Hi Adrian,
>>
>>
>> On Tue, Jun 28, 2011 at 12:04 PM, Adrian Carpenter <tac12 at wbic.cam.ac.uk> wrote:
>>> Thanks Csaba,
>>>
>>> So far as I am aware nothing has tampered with the xattrs, and all the bricks
>>> etc. are time-synchronised. Anyway, I did as you suggested; now for one volume
>>> (I have three being geo-rep'd) I consistently get this:
>>>
>>> OSError: [Errno 12] Cannot allocate memory
>>
>> Do you get this consistently, or randomly but recurring, or was it spotted
>> once or a few times and then gone?
>>
>>>  File "/opt/glusterfs/3.2.1/local/libexec/glusterfs/python/syncdaemon/libcxattr.py", line 26, in _query_xattr
>>>    cls.raise_oserr()
>>>  File "/opt/glusterfs/3.2.1/local/libexec/glusterfs/python/syncdaemon/libcxattr.py", line 16, in raise_oserr
>>>    raise OSError(errn, os.strerror(errn))
>>> OSError: [Errno 12] Cannot allocate memory
>>
>> If you've seen it more than once, how much does the stack trace vary? Is it
>> exactly the same, or not exactly but crashing in the same function (just on a
>> different code path), or not exactly but at least in the libcxattr module,
>> or quite different?
>>
>> What Python version do you use? If you use Python 2.4.* with an external
>> ctypes, then which source did you take ctypes from, and which version?
>>
>> Thanks,
>> Csaba
>> _______________________________________________
>> Gluster-users mailing list
>> Gluster-users at gluster.org
>> http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
>
>