Additionally, the brick log file of that same brick would be required. Please
check whether the brick process went down or crashed. Doing a volume start
force should resolve the issue.

On Wed, 13 Sep 2017 at 16:28, Gaurav Yadav <gyadav at redhat.com> wrote:

> Please send me the logs as well, i.e. glusterd.log and cmd_history.log.
>
> On Wed, Sep 13, 2017 at 1:45 PM, lejeczek <peljasz at yahoo.co.uk> wrote:
>
>> On 13/09/17 06:21, Gaurav Yadav wrote:
>>
>>> Please provide the output of gluster volume info, gluster volume status
>>> and gluster peer status.
>>>
>>> Apart from the above info, please provide glusterd logs and cmd_history.log.
>>>
>>> Thanks
>>> Gaurav
>>>
>>> On Tue, Sep 12, 2017 at 2:22 PM, lejeczek <peljasz at yahoo.co.uk> wrote:
>>>
>>>> hi everyone
>>>>
>>>> I have a 3-peer cluster with all vols in replica mode, 9 vols.
>>>> What I see, unfortunately, is one brick failing in one vol; when it
>>>> happens it is always the same vol on the same brick.
>>>> The command: gluster vol status $vol - shows the brick as not online.
>>>> Restarting glusterd with systemctl does not help; only a system reboot
>>>> seems to help, until it happens again the next time.
>>>>
>>>> How to troubleshoot this weird misbehaviour?
>>>> many thanks, L.
>>
>> hi, here:
>>
>> $ gluster vol info C-DATA
>>
>> Volume Name: C-DATA
>> Type: Replicate
>> Volume ID: 18ffba73-532e-4a4d-84da-fceea52f8c2e
>> Status: Started
>> Snapshot Count: 0
>> Number of Bricks: 1 x 3 = 3
>> Transport-type: tcp
>> Bricks:
>> Brick1: 10.5.6.49:/__.aLocalStorages/0/0-GLUSTERs/0GLUSTER-C-DATA
>> Brick2: 10.5.6.100:/__.aLocalStorages/0/0-GLUSTERs/0GLUSTER-C-DATA
>> Brick3: 10.5.6.32:/__.aLocalStorages/0/0-GLUSTERs/0GLUSTER-C-DATA
>> Options Reconfigured:
>> performance.md-cache-timeout: 600
>> performance.cache-invalidation: on
>> performance.stat-prefetch: on
>> features.cache-invalidation-timeout: 600
>> features.cache-invalidation: on
>> performance.io-thread-count: 64
>> performance.cache-size: 128MB
>> cluster.self-heal-daemon: enable
>> features.quota-deem-statfs: on
>> changelog.changelog: on
>> geo-replication.ignore-pid-check: on
>> geo-replication.indexing: on
>> features.inode-quota: on
>> features.quota: on
>> performance.readdir-ahead: on
>> nfs.disable: on
>> transport.address-family: inet
>> performance.cache-samba-metadata: on
>>
>>
>> $ gluster vol status C-DATA
>> Status of volume: C-DATA
>> Gluster process                             TCP Port  RDMA Port  Online  Pid
>> ------------------------------------------------------------------------------
>> Brick 10.5.6.49:/__.aLocalStorages/0/0-GLUS
>> TERs/0GLUSTER-C-DATA                        N/A       N/A        N       N/A
>> Brick 10.5.6.100:/__.aLocalStorages/0/0-GLU
>> STERs/0GLUSTER-C-DATA                       49152     0          Y       9376
>> Brick 10.5.6.32:/__.aLocalStorages/0/0-GLUS
>> TERs/0GLUSTER-C-DATA                        49152     0          Y       8638
>> Self-heal Daemon on localhost               N/A       N/A        Y       387879
>> Quota Daemon on localhost                   N/A       N/A        Y       387891
>> Self-heal Daemon on rider.private.ccnr.ceb.
>> private.cam.ac.uk                           N/A       N/A        Y       16439
>> Quota Daemon on rider.private.ccnr.ceb.priv
>> ate.cam.ac.uk                               N/A       N/A        Y       16451
>> Self-heal Daemon on 10.5.6.32               N/A       N/A        Y       7708
>> Quota Daemon on 10.5.6.32                   N/A       N/A        Y       8623
>> Self-heal Daemon on 10.5.6.17               N/A       N/A        Y       20549
>> Quota Daemon on 10.5.6.17                   N/A       N/A        Y       9337
>>
>> Task Status of Volume C-DATA
>> ------------------------------------------------------------------------------
>> There are no active volume tasks

--
--Atin
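For the offline brick above this boils down to something like the following on
10.5.6.49 (a sketch only: the brick log file name is derived from the brick
path, and the glusterd log may be called etc-glusterfs-glusterd.vol.log on
older releases, so adjust the paths to what is actually present on the node):

# did the brick process crash or get stopped?
$ grep -E 'signal received|shutting down' \
      /var/log/glusterfs/bricks/__.aLocalStorages-0-0-GLUSTERs-0GLUSTER-C-DATA.log | tail

# the logs requested earlier in the thread
$ less /var/log/glusterfs/glusterd.log
$ less /var/log/glusterfs/cmd_history.log

# respawn only the missing brick process; running bricks are left alone
$ gluster volume start C-DATA force
$ gluster volume status C-DATA

start force only spawns processes that are not already running, so the two
healthy bricks should stay untouched while the offline one is brought back.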
These symptoms appear to be the same as I've recorded in this post:
http://lists.gluster.org/pipermail/gluster-users/2017-September/032435.html

On Wed, Sep 13, 2017 at 7:01 AM, Atin Mukherjee <atin.mukherjee83 at gmail.com> wrote:

> Additionally, the brick log file of that same brick would be required.
> Please check whether the brick process went down or crashed. Doing a
> volume start force should resolve the issue.
On 13/09/17 20:47, Ben Werthmann wrote:
> These symptoms appear to be the same as I've recorded in this post:
> http://lists.gluster.org/pipermail/gluster-users/2017-September/032435.html
>
> On Wed, Sep 13, 2017 at 7:01 AM, Atin Mukherjee <atin.mukherjee83 at gmail.com> wrote:
>
>> Additionally, the brick log file of that same brick would be required.
>> Please check whether the brick process went down or crashed. Doing a
>> volume start force should resolve the issue.

When I do a vol start force, I see this among the log lines:

[2017-09-28 16:00:55.120726] I [MSGID: 106568]
[glusterd-proc-mgmt.c:87:glusterd_proc_stop] 0-management: Stopping
glustershd daemon running in pid: 308300
[2017-09-28 16:00:55.128867] W [socket.c:593:__socket_rwv] 0-glustershd:
readv on /var/run/gluster/0853a4555820d3442b1c3909f1cb8466.socket failed
(No data available)
[2017-09-28 16:00:56.122687] I [MSGID: 106568]
[glusterd-svc-mgmt.c:228:glusterd_svc_stop] 0-management: glustershd
service is stopped

Funnily (or not), a week later I now see:

$ gluster vol status CYTO-DATA
Status of volume: CYTO-DATA
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.5.6.49:/__.aLocalStorages/0/0-GLUS
TERs/0GLUSTER-CYTO-DATA                     49161     0          Y       1743719
Brick 10.5.6.100:/__.aLocalStorages/0/0-GLU
STERs/0GLUSTER-CYTO-DATA                    49152     0          Y       20438
Brick 10.5.6.32:/__.aLocalStorages/0/0-GLUS
TERs/0GLUSTER-CYTO-DATA                     49152     0          Y       5607
Self-heal Daemon on localhost               N/A       N/A        Y       41106
Quota Daemon on localhost                   N/A       N/A        Y       41117
Self-heal Daemon on 10.5.6.17               N/A       N/A        Y       19088
Quota Daemon on 10.5.6.17                   N/A       N/A        Y       19097
Self-heal Daemon on 10.5.6.32               N/A       N/A        Y       1832978
Quota Daemon on 10.5.6.32                   N/A       N/A        Y       1832987
Self-heal Daemon on 10.5.6.49               N/A       N/A        Y       320291
Quota Daemon on 10.5.6.49                   N/A       N/A        Y       320303

Task Status of Volume CYTO-DATA
------------------------------------------------------------------------------
There are no active volume tasks


$ gluster vol heal CYTO-DATA info
Brick 10.5.6.49:/__.aLocalStorages/0/0-GLUSTERs/0GLUSTER-CYTO-DATA
Status: Transport endpoint is not connected
Number of entries: -

Brick 10.5.6.100:/__.aLocalStorages/0/0-GLUSTERs/0GLUSTER-CYTO-DATA
....
....
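A plausible next step for the "Transport endpoint is not connected" that heal
info reports while volume status claims every brick is online is to confirm
that the first brick's port is actually reachable and that the self-heal
daemon has reconnected. A sketch, assuming the port shown in the status output
above and a default /var/log/glusterfs layout:

# detailed status for the brick heal info complains about
$ gluster volume status CYTO-DATA 10.5.6.49:/__.aLocalStorages/0/0-GLUSTERs/0GLUSTER-CYTO-DATA detail

# from another peer: is the brick port reported above reachable at all?
$ nc -vz 10.5.6.49 49161

# does the self-heal daemon log show it failing to connect to that brick?
$ tail -n 50 /var/log/glusterfs/glustershd.log

# start force respawns glustershd as well (as the stop/start messages above
# show), after which heal info can be re-checked
$ gluster volume start CYTO-DATA force
$ gluster volume heal CYTO-DATA info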