Harry Mangalam
2012-Aug-02 23:37 UTC
[Gluster-users] brick online or not? Don't trust 'gluster peer status'
Further to what I wrote before: "gluster server overload; recovers, now
'Transport endpoint is not connected' for some files" <http://goo.gl/CN6ud>

I'm getting conflicting information here. On one hand, the peer whose
glusterfsd locked up appears to be back in the gluster system, according to
the frequently referenced 'gluster peer status':

Thu Aug 02 15:48:46 [1.00 0.89 0.92]  root at pbs1:~
729 $ gluster peer status
Number of Peers: 3

Hostname: pbs4ib
Uuid: 2a593581-bf45-446c-8f7c-212c53297803
State: Peer in Cluster (Connected)

Hostname: pbs2ib
Uuid: 26de63bd-c5b7-48ba-b81d-5d77a533d077
State: Peer in Cluster (Connected)

Hostname: pbs3ib
Uuid: c79c4084-d6b9-4af9-b975-40dd6aa99b42
State: Peer in Cluster (Connected)

On the other hand, there are the errors I reported yesterday:

==================================================
[2012-08-01 18:07:26.104910] W
[dht-selfheal.c:875:dht_selfheal_directory] 0-gli-dht: 1 subvolumes
down -- not fixing
==================================================

as well as this information:

$ gluster volume status all detail

[top 2 brick stanzas trimmed; they're online]

------------------------------------------------------------------------------
Brick            : Brick pbs3ib:/bducgl
Port             : 24018
Online           : N    <<====================
Pid              : 20953
File System      : xfs
Device           : /dev/md127
Mount Options    : rw
Inode Size       : 256
Disk Space Free  : 6.1TB
Total Disk Space : 8.2TB
Inode Count      : 1758158080
Free Inodes      : 1752326373

------------------------------------------------------------------------------
Brick            : Brick pbs4ib:/bducgl
Port             : 24009
Online           : Y
Pid              : 20948
File System      : xfs
Device           : /dev/sda
Mount Options    : rw
Inode Size       : 256
Disk Space Free  : 4.6TB
Total Disk Space : 6.4TB
Inode Count      : 1367187392
Free Inodes      : 1361305613

The above implies fairly strongly that the brick (pbs3ib:/bducgl) never
re-established its connection to the volume, although 'gluster peer status'
claimed the peer had.

Strangely enough, when I RE-restarted glusterd on that node, the brick DID
come back and rejoin the gluster volume, and now the (restarted) fix-layout
job is proceeding without those "subvolumes down -- not fixing" errors, just
a steady stream of 'found anomalies / fixing the layout' messages, though at
the rate it's going it looks like it will take several days.

Still, several days to fix the data on-disk while the filesystem stays live
is better than having to tell users their data is gone and rebuilding from
zero. Luckily, it's officially a /scratch filesystem.

Harry

--
Harry Mangalam - Research Computing, OIT, Rm 225 MSTB, UC Irvine
[m/c 2225] / 92697 Google Voice Multiplexer: (949) 478-4487
415 South Circle View Dr, Irvine, CA, 92697 [shipping]
MSTB Lat/Long: (33.642025,-117.844414) (paste into Google Maps)
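
P.S. In case it helps anyone hitting the same thing, the check/remediation
sequence above boils down to roughly the following. This is only a sketch:
'gli' is the volume name I'm inferring from the '0-gli-dht' log prefix, and
the exact glusterd restart command depends on your distro.

  # peer-level view; this reported 'Connected' even while a brick was down
  gluster peer status

  # per-brick view; the 'Online' field is the one to trust
  gluster volume status gli detail

  # on the node whose brick shows 'Online : N', restart the management
  # daemon so it respawns the brick process (glusterfsd)
  service glusterd restart      # or: /etc/init.d/glusterd restart

  # confirm the brick is back, then (re)start the layout fix
  gluster volume status gli detail
  gluster volume rebalance gli fix-layout start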
Harry Mangalam
2012-Aug-06 20:13 UTC
[Gluster-users] brick online or not? Don't trust 'gluster peer status'
As a final(?) follow-up to my problem: after restarting the rebalance with

  gluster volume rebalance [vol-name] fix-layout start

it finished up last night, after plowing through the entirety of the
filesystem and fixing ~1M files (apparently ~2.2TB), all while the
filesystem remained live (though probably a bit slower than users would have
liked). That's a strong '+' in the gluster column for resiliency.

I started the rebalance without waiting for any advice to the contrary. 3.3
is supposed to have a built-in rebalance operator, but I saw no evidence of
it. Other info from gluster.org suggested that doing this wouldn't cause any
harm, so I went ahead and started it.

Do the gluster wizards have any final words on this before I write this up
in our trouble report?

best wishes
harry
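
P.S. For completeness, the restart-and-monitor steps were essentially the
following. Again a sketch: 'gli' is the assumed volume name, and the
rebalance log path shown is the stock default, which may differ on other
installs.

  # restart the layout fix across the whole volume
  gluster volume rebalance gli fix-layout start

  # check progress periodically
  gluster volume rebalance gli status

  # or watch the per-node rebalance log for the 'found anomalies' /
  # 'fixing the layout' messages
  tail -f /var/log/glusterfs/gli-rebalance.log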
On Thu, Aug 2, 2012 at 4:37 PM, Harry Mangalam <hjmangalam at gmail.com> wrote:
> [quoted message trimmed; see the Aug 2 post above]

--
Harry Mangalam - Research Computing, OIT, Rm 225 MSTB, UC Irvine
[m/c 2225] / 92697 Google Voice Multiplexer: (949) 478-4487
415 South Circle View Dr, Irvine, CA, 92697 [shipping]
MSTB Lat/Long: (33.642025,-117.844414) (paste into Google Maps)