Displaying 20 results from an estimated 500 matches similar to: "Thousands of EPOLLERR - disconnecting now"
2018 Feb 08
0
Thousands of EPOLLERR - disconnecting now
On Thu, Feb 8, 2018 at 2:04 PM, Gino Lisignoli <glisignoli at gmail.com> wrote:
> Hello
>
> I have a large cluster in which every node is logging:
>
> I [socket.c:2474:socket_event_handler] 0-transport: EPOLLERR -
> disconnecting now
>
At a rate of around 4 or 5 per second per node, which is adding up to a
> lot of messages. This seems to happen while my
2018 Jan 07
1
Clear heal statistics
Is there any way to clear the historic statistics from the command "gluster
volume heal <volume_name> statistics"?
It seems the command takes longer and longer to run each time it is used,
to the point where it times out and no longer works.
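For reference, a minimal hedged sketch of lighter-weight alternatives to the full
statistics crawl (standard gluster CLI forms; <volume_name> is a placeholder):
# per-brick count of entries still pending heal, usually much faster
gluster volume heal <volume_name> statistics heal-count
# list the pending entries themselves
gluster volume heal <volume_name> info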
2017 Nov 21
1
Brick and Subvolume Info
Hello
I have a Distributed-Replicate volume and I would like to know if it is
possible to see what sub-volume a brick belongs to, eg:
A Distributed-Replicate volume containing:
Number of Bricks: 2 x 2 = 4
Brick1: node1.localdomain:/mnt/data1/brick1
Brick2: node2.localdomain:/mnt/data1/brick1
Brick3: node1.localdomain:/mnt/data2/brick2
Brick4: node2.localdomain:/mnt/data2/brick2
Is it possible
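A hedged illustration of how the grouping works (the replicate-N names follow
GlusterFS's usual convention; the volfile path is an assumption and varies by version):
# Bricks in `gluster volume info` are listed in replica-set order, so for a
# 2 x 2 volume consecutive pairs form one subvolume:
#   Brick1 + Brick2 -> <volume>-replicate-0
#   Brick3 + Brick4 -> <volume>-replicate-1
# The generated client volfile spells the grouping out explicitly:
grep -A4 'cluster/replicate' /var/lib/glusterd/vols/<volume>/trusted-<volume>.tcp-fuse.vol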
2017 Aug 21
1
Glusterd not working with systemd in redhat 7
Hi!
Please see below. Note that web1.dasilva.network is the address of the
local machine where one of the bricks is installed and that tries to mount.
[2017-08-20 20:30:40.359236] I [MSGID: 100030] [glusterfsd.c:2476:main]
0-/usr/sbin/glusterd: Started running /usr/sbin/glusterd version 3.11.2
(args: /usr/sbin/glusterd -p /var/run/glusterd.pid)
[2017-08-20 20:30:40.973249] I [MSGID: 106478]
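If it helps, a hedged sketch of the usual systemd-side checks (unit name as shipped
on RHEL/CentOS 7):
systemctl status glusterd
systemctl is-enabled glusterd
journalctl -u glusterd -b --no-pager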
2017 Aug 21
0
Glusterd not working with systemd in redhat 7
On Mon, Aug 21, 2017 at 2:49 AM, Cesar da Silva <thunderlight1 at gmail.com>
wrote:
> Hi!
> I am having the same issue but I am running Ubuntu v16.04.
> It does not mount during boot, but works if I mount it manually. I am
> running the Gluster-server on the same machines (3 machines)
> Here is the /etc/fstab file
>
> /dev/sdb1 /data/gluster ext4 defaults 0 0
>
>
2017 Aug 20
2
Glusterd not working with systemd in redhat 7
Hi!
I am having the same issue but I am running Ubuntu v16.04.
It does not mount during boot, but works if I mount it manually. I am
running the Gluster-server on the same machines (3 machines)
Here is the /etc/fstab file
/dev/sdb1 /data/gluster ext4 defaults 0 0
web1.dasilva.network:/www /mnt/glusterfs/www glusterfs
defaults,_netdev,log-level=debug,log-file=/var/log/gluster.log 0 0
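In case it is useful, a hedged sketch of an fstab entry that defers the gluster mount
until the network and glusterd are up; the x-systemd.* options come from
systemd.mount(5), not from this thread:
web1.dasilva.network:/www /mnt/glusterfs/www glusterfs defaults,_netdev,noauto,x-systemd.automount,x-systemd.requires=glusterd.service,log-level=debug,log-file=/var/log/gluster.log 0 0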
2017 Aug 06
1
[3.11.2] Bricks disconnect from gluster with 0-transport: EPOLLERR
Hi,
I have a distributed volume which runs on Fedora 26 systems with
glusterfs 3.11.2 from gluster.org repos:
----------
[root at taupo ~]# glusterd --version
glusterfs 3.11.2
gluster> volume info gv2
Volume Name: gv2
Type: Distribute
Volume ID: 6b468f43-3857-4506-917c-7eaaaef9b6ee
Status: Started
Snapshot Count: 0
Number of Bricks: 6
Transport-type: tcp
Bricks:
Brick1:
2017 Sep 13
1
[3.11.2] Bricks disconnect from gluster with 0-transport: EPOLLERR
I ran into something like this in 3.10.4 and filed two bugs for it:
https://bugzilla.redhat.com/show_bug.cgi?id=1491059
https://bugzilla.redhat.com/show_bug.cgi?id=1491060
Please see the above bugs for full detail.
In summary, my issue was related to glusterd's pid handling of pid files
when it starts self-heal and bricks. The issues are:
a. brick pid file leaves stale pid and brick fails
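As a rough way to spot that symptom, a hedged sketch (the pid-file path and name are
assumptions; glusterd keeps brick pid files under /var/lib/glusterd/vols/<volume>/run/
or /var/run/gluster/ depending on version):
pidfile=/var/lib/glusterd/vols/<volume>/run/node1-mnt-data1-brick1.pid  # assumed name
pid=$(cat "$pidfile")
kill -0 "$pid" 2>/dev/null || echo "stale pid $pid in $pidfile - brick not running"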
2018 Mar 21
2
Brick process not starting after reinstall
Hi all,
our systems have suffered a host failure in a replica three setup.
The host needed a complete reinstall. I followed the RH guide to
'replace a host with the same hostname'
(https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3/html/administration_guide/sect-replacing_hosts).
The machine has the same OS (CentOS 7). The new machine got a minor
version number newer
2018 Mar 21
0
Brick process not starting after reinstall
Could you share the following information:
1. gluster --version
2. output of gluster volume status
3. glusterd log and all brick log files from the node where bricks didn't
come up.
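A minimal sketch of gathering those three items (log locations are the usual
/var/log/glusterfs defaults and may differ on your install):
gluster --version
gluster volume status
tar czf gluster-diagnostics.tar.gz /var/log/glusterfs/glusterd.log /var/log/glusterfs/bricks/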
On Wed, Mar 21, 2018 at 12:35 PM, Richard Neuboeck <hawk at tbi.univie.ac.at>
wrote:
> Hi all,
>
> our systems have suffered a host failure in a replica three setup.
> The host needed a
2018 Sep 07
3
Auth process sometimes stop responding after upgrade
On Friday, 7 September 2018 at 10:06:00 CEST, Sami Ketola wrote:
> > On 7 Sep 2018, at 11.00, Simone Lazzaris <s.lazzaris at interactive.eu>
> > wrote:
> >
> >
> > The only suspect thing is this:
> >
> > Sep 6 14:45:41 imap-front13 dovecot: director: doveadm: Host
> > 192.168.1.142
> > vhost count changed from 100 to 0
>
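If it is useful, a hedged sketch of checking the director's view and restoring the
vhost count from the log line above (syntax from doveadm-director(1)):
doveadm director status
doveadm director update 192.168.1.142 100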
2019 Mar 08
1
Dovecot v2.3.5 released
On 7.3.2019 23.37, A. Schulze via dovecot wrote:
>
> Am 07.03.19 um 17:33 schrieb Aki Tuomi via dovecot:
>
>>> test-http-client-errors.c:2989: Assert failed: FALSE
>>> connection timed out ................................................. : FAILED
> Hello Aki,
>
>> Are you running with valgrind or on a really slow system?
> I'm not aware my buildsystem
2017 Aug 15
2
Is transport=rdma tested with "stripe"?
On Tue, Aug 15, 2017 at 01:04:11PM +0000, Hatazaki, Takao wrote:
> Ji-Hyeon,
>
> You're saying that "stripe=2 transport=rdma" should work. Ok, that
> was the first thing I wanted to know. I'll put together logs later this week.
Note that "stripe" is not tested much and practically unmaintained. We
do not advise you to use it. If you have large files that you
2018 Sep 07
6
Auth process sometimes stop responding after upgrade
Some more information: the issue has just occurred, again on an instance without the
"service_count = 0" configuration directive on pop3-login.
I've observed that while the issue is occurring, the director process goes to 100% CPU. I've
straced the process. It is seemingly looping:
...
...
epoll_ctl(13, EPOLL_CTL_ADD, 78, {EPOLLIN|EPOLLPRI|EPOLLERR|EPOLLHUP,
{u32=149035320,
2015 Jun 21
3
dovecot auth using 100% CPU
Every few days I find that dovecot auth is using all my CPU.
This is from dovecot 2.2.13; I've just upgraded to 2.2.18.
strace -r -p 17956 output:
Process 17956 attached
0.000000 lseek(19, 0, SEEK_CUR) = -1 ESPIPE (Illegal seek)
0.000057 getsockname(19, {sa_family=AF_LOCAL, NULL}, [2]) = 0
0.000043 epoll_ctl(15, EPOLL_CTL_ADD, 19, {EPOLLIN|EPOLLPRI|EPOLLERR|EPOLLHUP,
2017 Aug 16
0
Is transport=rdma tested with "stripe"?
> Note that "stripe" is not tested much and practically unmaintained.
Ah, this was what I suspected. Understood. I'll be happy with "shard".
Having said that, "stripe" works fine with transport=tcp. The failure reproduces with just 2 RDMA servers (with InfiniBand), one of which also acts as a client.
I looked into logs. I paste lengthy logs below with
2010 Oct 10
3
pop3 TCP_CORK too late error
I was stracing a pop3 process and noticed that the TCP_CORK option
isn't set soon enough:
epoll_wait(8, {{EPOLLOUT, {u32=37481984, u64=37481984}}}, 38, 207) = 1
write(41, "iTxPBrNlaNFao+yQzLhuO4/+tQ5cuiKSe"..., 224) = 224
epoll_ctl(8, EPOLL_CTL_MOD, 41, {EPOLLIN|EPOLLPRI|EPOLLERR|EPOLLHUP,
{u32=37481984, u64=37481984}}) = 0
pread(19,
2018 Sep 07
1
Auth process sometimes stop responding after upgrade
On 7 Sep 2018, at 19.43, Timo Sirainen <tss at iki.fi> wrote:
>
> On 7 Sep 2018, at 16.50, Simone Lazzaris <s.lazzaris at interactive.eu <mailto:s.lazzaris at interactive.eu>> wrote:
>>
>> Some more information: the issue has just occurred, again on an instance without the "service_count = 0" configuration directive on pop3-login.
>>
>>
2018 Mar 27
4
[PATCH net V2] vhost: correctly remove wait queue during poll failure
We tried to remove the vq poll from the wait queue, but did not check whether
or not it was actually on a list first. This can lead to a double free. Fix
this by switching to vhost_poll_stop(), which zeros poll->wqh after removing
poll from the waitqueue to make sure it won't be freed twice.
Cc: Darren Kenny <darren.kenny at oracle.com>
Reported-by: syzbot+c0272972b01b872e604a at