thr3ads.net - Gluster users - [Gluster-users] File operation failure on simple distributed volume [Dec 2016]

yonex

2016-Dec-13 16:26 UTC

[Gluster-users] File operation failure on simple distributed volume

Hi Rafi,

Thanks for your response. OK, I think it is possible to capture debug
logs, since the error seems to be reproduced a few times per day. I
will try that. However, since I want to avoid redundant debug output if
possible, is there a way to enable debug logging only on specific client
nodes?

Regards

Yonex

2016-12-13 23:33 GMT+09:00 Mohammed Rafi K C <rkavunga at redhat.com>:
> Hi Yonex,
>
> Is this consistently reproducible? If so, can you enable debug logging [1]
> and check for any message similar to [2]? You can simply search
> for "EOF on socket".
>
> You can set your log level back to default (INFO) after capturing for
> some time.
>
>
> [1] : gluster volume set <volname> diagnostics.brick-log-level DEBUG
> and
> gluster volume set <volname> diagnostics.client-log-level DEBUG
>
> [2] : http://pastebin.com/xn8QHXWa
>
>
> Regards
>
> Rafi KC
>
> On 12/12/2016 09:35 PM, yonex wrote:
>> Hi,
>>
>> When my application moves a file from its local disk to a FUSE-mounted
>> GlusterFS volume, the client occasionally (not always) outputs many
>> warnings and errors. The volume is a simple distributed volume.
>>
>> A sample of logs pasted: http://pastebin.com/axkTCRJX
>>
>> At a glance, it seems to come from something like a network disconnection
>> ("Transport endpoint is not connected"), but other networking
>> applications on the same machine don't observe such a thing. So I guess
>> there may be a problem somewhere in the GlusterFS stack.
>>
>> It ended up failing to rename a file, logging PHP warnings like the ones below:
>>
>>     PHP Warning:  rename(/glusterfs01/db1/stack/f0/13a9a2f0): failed
>> to open stream: Input/output error in [snipped].php on line 278
>>     PHP Warning:
>> rename(/var/stack/13a9a2f0,/glusterfs01/db1/stack/f0/13a9a2f0):
>> Input/output error in [snipped].php on line 278
>>
>> Conditions:
>>
>> - GlusterFS 3.8.5 installed via yum CentOS-Gluster-3.8.repo
>> - Volume info and status pasted: http://pastebin.com/JPt2KeD8
>> - Client machines' OS: Scientific Linux 6 or CentOS 6.
>> - Server machines' OS: CentOS 6.
>> - Kernel version is 2.6.32-642.6.2.el6.x86_64 on all machines.
>> - The number of connected FUSE clients is 260.
>> - No firewall between connected machines.
>> - Neither remounting volumes nor rebooting client machines helps.
>> - It is caused not only by rename() but also by copy() and filesize()
>>   operations.
>> - No output in the brick logs when it happens.
>>
>> Any ideas? I'd appreciate any help.
>>
>> Regards.
>> _______________________________________________
>> Gluster-users mailing list
>> Gluster-users at gluster.org
>> http://www.gluster.org/mailman/listinfo/gluster-users
>

Mohammed Rafi K C

2016-Dec-14 06:20 UTC

[Gluster-users] File operation failure on simple distributed volume

On 12/13/2016 09:56 PM, yonex wrote:
> Hi Rafi,
>
> Thanks for your response. OK, I think it is possible to capture debug
> logs, since the error seems to be reproduced a few times per day. I
> will try that. However, since I want to avoid redundant debug output if
> possible, is there a way to enable debug logging only on specific client
> nodes?
If you are using a FUSE mount, there is a proc-like feature called .meta.
You can set the log level through it for a particular client [1]. But I
also want logs from the bricks, because I suspect the brick processes of
initiating the disconnects.


[1] e.g.: echo 8 > /mnt/glusterfs/.meta/logging/loglevel
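
For anyone following along, the per-client approach above could be wrapped roughly like this. It is only a sketch: the function names are made up for illustration, the mapping of 8 to DEBUG and 7 to INFO is an assumption about GlusterFS's numeric log levels, and the client log path mentioned afterwards is the usual default rather than something stated in this thread.

```shell
#!/bin/sh
# Assumed numeric levels: 7 = INFO (the default), 8 = DEBUG.

# Write a log level to the .meta control file of one FUSE mount.
# $1: mount point (for example /mnt/glusterfs), $2: numeric level.
# Hypothetical helper, not part of GlusterFS.
set_client_loglevel() {
    ctl="$1/.meta/logging/loglevel"
    if [ ! -e "$ctl" ]; then
        echo "no .meta loglevel file under $1" >&2
        return 1
    fi
    echo "$2" > "$ctl"
}

# Print how many lines of a log file mention the disconnect message.
# $1: path to the log file. Hypothetical helper.
count_eof_messages() {
    grep -c "EOF on socket" "$1"
}
```

Usage on a single affected client would be something like "set_client_loglevel /mnt/glusterfs 8", wait for the error to reproduce, then "set_client_loglevel /mnt/glusterfs 7" to go back. The client log to pass to count_eof_messages is typically under /var/log/glusterfs/ and named after the mount point (assumed here to be mnt-glusterfs.log for /mnt/glusterfs).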
>
> Regards
>
> Yonex
>
> 2016-12-13 23:33 GMT+09:00 Mohammed Rafi K C <rkavunga at redhat.com>:
>> Hi Yonex,
>>
>> Is this consistently reproducible? If so, can you enable debug logging [1]
>> and check for any message similar to [2]? You can simply search
>> for "EOF on socket".
>>
>> You can set your log level back to default (INFO) after capturing for
>> some time.
>>
>>
>> [1] : gluster volume set <volname> diagnostics.brick-log-level DEBUG
>> and
>> gluster volume set <volname> diagnostics.client-log-level DEBUG
>>
>> [2] : http://pastebin.com/xn8QHXWa
>>
>>
>> Regards
>>
>> Rafi KC
>>
>> On 12/12/2016 09:35 PM, yonex wrote:
>>> Hi,
>>>
>>> When my application moves a file from its local disk to a FUSE-mounted
>>> GlusterFS volume, the client occasionally (not always) outputs many
>>> warnings and errors. The volume is a simple distributed volume.
>>>
>>> A sample of logs pasted: http://pastebin.com/axkTCRJX
>>>
>>> At a glance, it seems to come from something like a network disconnection
>>> ("Transport endpoint is not connected"), but other networking
>>> applications on the same machine don't observe such a thing. So I guess
>>> there may be a problem somewhere in the GlusterFS stack.
>>>
>>> It ended up failing to rename a file, logging PHP warnings like the ones below:
>>>
>>>     PHP Warning:  rename(/glusterfs01/db1/stack/f0/13a9a2f0): failed
>>> to open stream: Input/output error in [snipped].php on line 278
>>>     PHP Warning:
>>> rename(/var/stack/13a9a2f0,/glusterfs01/db1/stack/f0/13a9a2f0):
>>> Input/output error in [snipped].php on line 278
>>>
>>> Conditions:
>>>
>>> - GlusterFS 3.8.5 installed via yum CentOS-Gluster-3.8.repo
>>> - Volume info and status pasted: http://pastebin.com/JPt2KeD8
>>> - Client machines' OS: Scientific Linux 6 or CentOS 6.
>>> - Server machines' OS: CentOS 6.
>>> - Kernel version is 2.6.32-642.6.2.el6.x86_64 on all machines.
>>> - The number of connected FUSE clients is 260.
>>> - No firewall between connected machines.
>>> - Neither remounting volumes nor rebooting client machines helps.
>>> - It is caused not only by rename() but also by copy() and filesize()
>>>   operations.
>>> - No output in the brick logs when it happens.
>>>
>>> Any ideas? I'd appreciate any help.
>>>
>>> Regards.
>>> _______________________________________________
>>> Gluster-users mailing list
>>> Gluster-users at gluster.org
>>> http://www.gluster.org/mailman/listinfo/gluster-users
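
As a recap of the volume-wide option discussed in the thread (DEBUG on both bricks and clients, then back to INFO after capturing for a while), a minimal sketch could look like this. The function name and the GLUSTER override variable are hypothetical and exist only so the commands can be dry-run; the two "gluster volume set" options are the ones quoted in the thread.

```shell
#!/bin/sh
# GLUSTER can be overridden (for example GLUSTER=echo) for a dry run.
GLUSTER="${GLUSTER:-gluster}"

# Set both the brick and the client log level for a volume.
# $1: volume name, $2: level name such as DEBUG or INFO.
# Hypothetical helper, not part of GlusterFS.
set_volume_loglevel() {
    $GLUSTER volume set "$1" diagnostics.brick-log-level "$2" &&
    $GLUSTER volume set "$1" diagnostics.client-log-level "$2"
}
```

For example, "set_volume_loglevel <volname> DEBUG" before the capture window and "set_volume_loglevel <volname> INFO" afterwards. With 260 connected clients this raises the log volume on every one of them, which is the reason the per-client .meta route was asked about.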
