Pranith Kumar Karampuri
2016-Jan-25 02:27 UTC
[Gluster-users] [Gluster-devel] heal hanging
It seems like there is a lot of finodelk/inodelk traffic. I wonder why
that is. I think the next step is to collect a statedump of the brick
which is taking a lot of CPU, using "gluster volume statedump <volname>".

Pranith

On 01/22/2016 08:36 AM, Glomski, Patrick wrote:
> Pranith, attached are stack traces collected every second for 20
> seconds from the high-%cpu glusterfsd process.
>
> Patrick
>
> On Thu, Jan 21, 2016 at 9:46 PM, Glomski, Patrick
> <patrick.glomski at corvidtec.com> wrote:
>
>> Last entry for get_real_filename on any of the bricks was when we
>> turned off the samba gfapi vfs plugin earlier today:
>>
>> /var/log/glusterfs/bricks/data-brick01a-homegfs.log:[2016-01-21
>> 15:13:00.008239] E [server-rpc-fops.c:768:server_getxattr_cbk]
>> 0-homegfs-server: 105: GETXATTR /wks_backup
>> (40e582d6-b0c7-4099-ba88-9168a3c32ca6)
>> (glusterfs.get_real_filename:desktop.ini) ==> (Permission denied)
>>
>> We'll get back to you with those traces when %cpu spikes again. As
>> with most sporadic problems, as soon as you want something out of
>> it, the issue becomes harder to reproduce.
>>
>> On Thu, Jan 21, 2016 at 9:21 PM, Pranith Kumar Karampuri
>> <pkarampu at redhat.com> wrote:
>>
>>> On 01/22/2016 07:25 AM, Glomski, Patrick wrote:
>>>> Unfortunately, all samba mounts to the gluster volume through
>>>> the gfapi vfs plugin have been disabled for the last 6 hours
>>>> or so and the frequency of %cpu spikes has increased. We had
>>>> switched to sharing a fuse mount through samba, but I just
>>>> disabled that as well. There are no samba shares of this
>>>> volume now. The spikes now happen every thirty minutes or so.
>>>> We've resorted to just rebooting the machine with high load
>>>> for the present.
>>>
>>> Could you see if the logs of the following type are not at all
>>> coming?
>>> [2016-01-21 15:13:00.005736] E
>>> [server-rpc-fops.c:768:server_getxattr_cbk] 0-homegfs-server:
>>> 110: GETXATTR /wks_backup (40e582d6-b0c7-4099-ba88-9168a3c32ca6)
>>> (glusterfs.get_real_filename:desktop.ini) ==> (Permission denied)
>>>
>>> These are operations that failed. Operations that succeed are
>>> the ones that will scan the directory. But I don't have a way
>>> to find them other than using tcpdumps.
>>>
>>> At the moment I have 2 theories:
>>> 1) these get_real_filename calls
>>> 2) [2016-01-21 16:10:38.017828] E
>>> [server-helpers.c:46:gid_resolve] 0-gid-cache: getpwuid_r(494)
>>> failed
>>>
>>> "
>>> Yessir they are. Normally, sssd would look to the local cache
>>> file in /var/lib/sss/db/ first, to get any group or userid
>>> information, then go out to the domain controller. I put the
>>> options that we are using on our GFS volumes below. Thanks for
>>> your help.
>>>
>>> We had been running sssd with sssd_nss and sssd_be sub-processes
>>> on these systems for a long time, under the GFS 3.5.2 code, and
>>> had not run into the problem that David described with the high
>>> cpu usage on sssd_nss.
>>> "
>>> That was Tom Young's email 1.5 years back when we debugged it.
>>> But the process which was consuming a lot of cpu then was
>>> sssd_nss, so I am not sure if it is the same issue. Let us debug
>>> to see that '1)' doesn't happen. The gstack traces I asked for
>>> should also help.
>>>
>>> Pranith
>>>
>>>> On Thu, Jan 21, 2016 at 8:49 PM, Pranith Kumar Karampuri
>>>> <pkarampu at redhat.com> wrote:
>>>>
>>>>> On 01/22/2016 07:13 AM, Glomski, Patrick wrote:
>>>>>> We use the samba glusterfs virtual filesystem (the current
>>>>>> version provided on download.gluster.org), but no windows
>>>>>> clients connecting directly.
>>>>>
>>>>> Hmm.. Is there a way to disable using this and check if the
>>>>> CPU% still increases? What getxattr of
>>>>> "glusterfs.get_real_filename <filename>" does is scan the
>>>>> entire directory looking for strcasecmp(<filename>,
>>>>> <scanned-filename>).
>>>>> If anything matches, then it will return the
>>>>> <scanned-filename>. But the problem is that the scan is
>>>>> costly. So I wonder if this is the reason for the CPU spikes.
>>>>>
>>>>> Pranith
>>>>>
>>>>>> On Thu, Jan 21, 2016 at 8:37 PM, Pranith Kumar Karampuri
>>>>>> <pkarampu at redhat.com> wrote:
>>>>>>
>>>>>>> Do you have any windows clients? I see a lot of getxattr
>>>>>>> calls for "glusterfs.get_real_filename" which lead to full
>>>>>>> readdirs of the directories on the brick.
>>>>>>>
>>>>>>> Pranith
>>>>>>>
>>>>>>> On 01/22/2016 12:51 AM, Glomski, Patrick wrote:
>>>>>>>> Pranith, could this kind of behavior be self-inflicted by
>>>>>>>> us deleting files directly from the bricks? We have done
>>>>>>>> that in the past to clean up issues where gluster wouldn't
>>>>>>>> allow us to delete from the mount.
>>>>>>>>
>>>>>>>> If so, is it feasible to clean them up by running a search
>>>>>>>> on the .glusterfs directories directly and removing files
>>>>>>>> with a reference count of 1 that are non-zero size (or
>>>>>>>> directly checking the xattrs to be sure that it's not a
>>>>>>>> DHT link)?
>>>>>>>>
>>>>>>>> find /data/brick01a/homegfs/.glusterfs -type f -not -empty
>>>>>>>> -links -2 -exec rm -f "{}" \;
>>>>>>>>
>>>>>>>> Is there anything I'm inherently missing with that
>>>>>>>> approach that will further corrupt the system?
>>>>>>>>
>>>>>>>> On Thu, Jan 21, 2016 at 1:02 PM, Glomski, Patrick
>>>>>>>> <patrick.glomski at corvidtec.com> wrote:
>>>>>>>>
>>>>>>>>> Load spiked again: ~1200%cpu on gfs02a for glusterfsd.
>>>>>>>>> Crawl has been running on one of the bricks on gfs02b for
>>>>>>>>> 25 min or so and users cannot access the volume.
>>>>>>>>>
>>>>>>>>> I re-listed the xattrop directories as well as a 'top'
>>>>>>>>> entry and heal statistics. Then I restarted the gluster
>>>>>>>>> services on gfs02a.
>>>>>>>>>
>>>>>>>>> =================== top ===================
>>>>>>>>> PID  USER PR NI VIRT  RES  SHR  S %CPU   %MEM TIME+     COMMAND
>>>>>>>>> 8969 root 20 0  2815m 204m 3588 S 1181.0 0.6  591:06.93 glusterfsd
>>>>>>>>>
>>>>>>>>> =================== xattrop ===================
>>>>>>>>> /data/brick01a/homegfs/.glusterfs/indices/xattrop:
>>>>>>>>> xattrop-41f19453-91e4-437c-afa9-3b25614de210
>>>>>>>>> xattrop-9b815879-2f4d-402b-867c-a6d65087788c
>>>>>>>>>
>>>>>>>>> /data/brick02a/homegfs/.glusterfs/indices/xattrop:
>>>>>>>>> xattrop-70131855-3cfb-49af-abce-9d23f57fb393
>>>>>>>>> xattrop-dfb77848-a39d-4417-a725-9beca75d78c6
>>>>>>>>>
>>>>>>>>> /data/brick01b/homegfs/.glusterfs/indices/xattrop:
>>>>>>>>> e6e47ed9-309b-42a7-8c44-28c29b9a20f8
>>>>>>>>> xattrop-5c797a64-bde7-4eac-b4fc-0befc632e125
>>>>>>>>> xattrop-38ec65a1-00b5-4544-8a6c-bf0f531a1934
>>>>>>>>> xattrop-ef0980ad-f074-4163-979f-16d5ef85b0a0
>>>>>>>>>
>>>>>>>>> /data/brick02b/homegfs/.glusterfs/indices/xattrop:
>>>>>>>>> xattrop-7402438d-0ee7-4fcf-b9bb-b561236f99bc
>>>>>>>>> xattrop-8ffbf5f7-ace3-497d-944e-93ac85241413
>>>>>>>>>
>>>>>>>>> /data/brick01a/homegfs/.glusterfs/indices/xattrop:
>>>>>>>>> xattrop-0115acd0-caae-4dfd-b3b4-7cc42a0ff531
>>>>>>>>>
>>>>>>>>> /data/brick02a/homegfs/.glusterfs/indices/xattrop:
>>>>>>>>> xattrop-7e20fdb1-5224-4b9a-be06-568708526d70
>>>>>>>>>
>>>>>>>>> /data/brick01b/homegfs/.glusterfs/indices/xattrop:
>>>>>>>>> 8034bc06-92cd-4fa5-8aaf-09039e79d2c8
>>>>>>>>> c9ce22ed-6d8b-471b-a111-b39e57f0b512
>>>>>>>>> 94fa1d60-45ad-4341-b69c-315936b51e8d
>>>>>>>>> xattrop-9c04623a-64ce-4f66-8b23-dbaba49119c7
>>>>>>>>>
>>>>>>>>> /data/brick02b/homegfs/.glusterfs/indices/xattrop:
>>>>>>>>> xattrop-b8c8f024-d038-49a2-9a53-c54ead09111d
>>>>>>>>>
>>>>>>>>> =================== heal stats ===================
>>>>>>>>>
>>>>>>>>> homegfs [b0-gfsib01a] : Starting time of crawl : Thu Jan 21 12:36:45 2016
>>>>>>>>> homegfs [b0-gfsib01a] : Ending time of crawl : Thu Jan 21 12:36:45 2016
>>>>>>>>> homegfs [b0-gfsib01a] : Type of crawl: INDEX
>>>>>>>>> homegfs [b0-gfsib01a] : No. of entries healed : 0
>>>>>>>>> homegfs [b0-gfsib01a] : No. of entries in split-brain: 0
>>>>>>>>> homegfs [b0-gfsib01a] : No. of heal failed entries : 0
>>>>>>>>>
>>>>>>>>> homegfs [b1-gfsib01b] : Starting time of crawl : Thu Jan 21 12:36:19 2016
>>>>>>>>> homegfs [b1-gfsib01b] : Ending time of crawl : Thu Jan 21 12:36:19 2016
>>>>>>>>> homegfs [b1-gfsib01b] : Type of crawl: INDEX
>>>>>>>>> homegfs [b1-gfsib01b] : No. of entries healed : 0
>>>>>>>>> homegfs [b1-gfsib01b] : No. of entries in split-brain: 0
>>>>>>>>> homegfs [b1-gfsib01b] : No. of heal failed entries : 1
>>>>>>>>>
>>>>>>>>> homegfs [b2-gfsib01a] : Starting time of crawl : Thu Jan 21 12:36:48 2016
>>>>>>>>> homegfs [b2-gfsib01a] : Ending time of crawl : Thu Jan 21 12:36:48 2016
>>>>>>>>> homegfs [b2-gfsib01a] : Type of crawl: INDEX
>>>>>>>>> homegfs [b2-gfsib01a] : No. of entries healed : 0
>>>>>>>>> homegfs [b2-gfsib01a] : No. of entries in split-brain: 0
>>>>>>>>> homegfs [b2-gfsib01a] : No. of heal failed entries : 0
>>>>>>>>>
>>>>>>>>> homegfs [b3-gfsib01b] : Starting time of crawl : Thu Jan 21 12:36:47 2016
>>>>>>>>> homegfs [b3-gfsib01b] : Ending time of crawl : Thu Jan 21 12:36:47 2016
>>>>>>>>> homegfs [b3-gfsib01b] : Type of crawl: INDEX
>>>>>>>>> homegfs [b3-gfsib01b] : No. of entries healed : 0
>>>>>>>>> homegfs [b3-gfsib01b] : No. of entries in split-brain: 0
>>>>>>>>> homegfs [b3-gfsib01b] : No. of heal failed entries : 0
>>>>>>>>>
>>>>>>>>> homegfs [b4-gfsib02a] : Starting time of crawl : Thu Jan 21 12:36:06 2016
>>>>>>>>> homegfs [b4-gfsib02a] : Ending time of crawl : Thu Jan 21 12:36:06 2016
>>>>>>>>> homegfs [b4-gfsib02a] : Type of crawl: INDEX
>>>>>>>>> homegfs [b4-gfsib02a] : No. of entries healed : 0
>>>>>>>>> homegfs [b4-gfsib02a] : No. of entries in split-brain: 0
>>>>>>>>> homegfs [b4-gfsib02a] : No. of heal failed entries : 0
>>>>>>>>>
>>>>>>>>> homegfs [b5-gfsib02b] : Starting time of crawl : Thu Jan 21 12:13:40 2016
>>>>>>>>> homegfs [b5-gfsib02b] : *** Crawl is in progress ***
>>>>>>>>> homegfs [b5-gfsib02b] : Type of crawl: INDEX
>>>>>>>>> homegfs [b5-gfsib02b] : No. of entries healed : 0
>>>>>>>>> homegfs [b5-gfsib02b] : No. of entries in split-brain: 0
>>>>>>>>> homegfs [b5-gfsib02b] : No. of heal failed entries : 0
>>>>>>>>>
>>>>>>>>> homegfs [b6-gfsib02a] : Starting time of crawl : Thu Jan 21 12:36:58 2016
>>>>>>>>> homegfs [b6-gfsib02a] : Ending time of crawl : Thu Jan 21 12:36:58 2016
>>>>>>>>> homegfs [b6-gfsib02a] : Type of crawl: INDEX
>>>>>>>>> homegfs [b6-gfsib02a] : No. of entries healed : 0
>>>>>>>>> homegfs [b6-gfsib02a] : No. of entries in split-brain: 0
>>>>>>>>> homegfs [b6-gfsib02a] : No. of heal failed entries : 0
>>>>>>>>>
>>>>>>>>> homegfs [b7-gfsib02b] : Starting time of crawl : Thu Jan 21 12:36:50 2016
>>>>>>>>> homegfs [b7-gfsib02b] : Ending time of crawl : Thu Jan 21 12:36:50 2016
>>>>>>>>> homegfs [b7-gfsib02b] : Type of crawl: INDEX
>>>>>>>>> homegfs [b7-gfsib02b] : No. of entries healed : 0
>>>>>>>>> homegfs [b7-gfsib02b] : No. of entries in split-brain: 0
>>>>>>>>> homegfs [b7-gfsib02b] : No. of heal failed entries : 0
>>>>>>>>>
>>>>>>>>> =======================================================================================
>>>>>>>>> I waited a few minutes for the heals to finish and ran the
>>>>>>>>> heal statistics and info again. One file is in split-brain.
>>>>>>>>> Aside from the split-brain, the load on all systems is down
>>>>>>>>> now and they are behaving normally. glustershd.log is
>>>>>>>>> attached. What is going on???
>>>>>>>>>
>>>>>>>>> Thu Jan 21 12:53:50 EST 2016
>>>>>>>>>
>>>>>>>>> =================== homegfs ===================
>>>>>>>>>
>>>>>>>>> homegfs [b0-gfsib01a] : Starting time of crawl : Thu Jan 21 12:53:02 2016
>>>>>>>>> homegfs [b0-gfsib01a] : Ending time of crawl : Thu Jan 21 12:53:02 2016
>>>>>>>>> homegfs [b0-gfsib01a] : Type of crawl: INDEX
>>>>>>>>> homegfs [b0-gfsib01a] : No. of entries healed : 0
>>>>>>>>> homegfs [b0-gfsib01a] : No. of entries in split-brain: 0
>>>>>>>>> homegfs [b0-gfsib01a] : No. of heal failed entries : 0
>>>>>>>>>
>>>>>>>>> homegfs [b1-gfsib01b] : Starting time of crawl : Thu Jan 21 12:53:38 2016
>>>>>>>>> homegfs [b1-gfsib01b] : Ending time of crawl : Thu Jan 21 12:53:38 2016
>>>>>>>>> homegfs [b1-gfsib01b] : Type of crawl: INDEX
>>>>>>>>> homegfs [b1-gfsib01b] : No. of entries healed : 0
>>>>>>>>> homegfs [b1-gfsib01b] : No. of entries in split-brain: 0
>>>>>>>>> homegfs [b1-gfsib01b] : No. of heal failed entries : 1
>>>>>>>>>
>>>>>>>>> homegfs [b2-gfsib01a] : Starting time of crawl : Thu Jan 21 12:53:04 2016
>>>>>>>>> homegfs [b2-gfsib01a] : Ending time of crawl : Thu Jan 21 12:53:04 2016
>>>>>>>>> homegfs [b2-gfsib01a] : Type of crawl: INDEX
>>>>>>>>> homegfs [b2-gfsib01a] : No. of entries healed : 0
>>>>>>>>> homegfs [b2-gfsib01a] : No. of entries in split-brain: 0
>>>>>>>>> homegfs [b2-gfsib01a] : No. of heal failed entries : 0
>>>>>>>>>
>>>>>>>>> homegfs [b3-gfsib01b] : Starting time of crawl : Thu Jan 21 12:53:04 2016
>>>>>>>>> homegfs [b3-gfsib01b] : Ending time of crawl : Thu Jan 21 12:53:04 2016
>>>>>>>>> homegfs [b3-gfsib01b] : Type of crawl: INDEX
>>>>>>>>> homegfs [b3-gfsib01b] : No. of entries healed : 0
>>>>>>>>> homegfs [b3-gfsib01b] : No. of entries in split-brain: 0
>>>>>>>>> homegfs [b3-gfsib01b] : No. of heal failed entries : 0
>>>>>>>>>
>>>>>>>>> homegfs [b4-gfsib02a] : Starting time of crawl : Thu Jan 21 12:53:33 2016
>>>>>>>>> homegfs [b4-gfsib02a] : Ending time of crawl : Thu Jan 21 12:53:33 2016
>>>>>>>>> homegfs [b4-gfsib02a] : Type of crawl: INDEX
>>>>>>>>> homegfs [b4-gfsib02a] : No. of entries healed : 0
>>>>>>>>> homegfs [b4-gfsib02a] : No. of entries in split-brain: 0
>>>>>>>>> homegfs [b4-gfsib02a] : No. of heal failed entries : 1
>>>>>>>>>
>>>>>>>>> homegfs [b5-gfsib02b] : Starting time of crawl : Thu Jan 21 12:53:14 2016
>>>>>>>>> homegfs [b5-gfsib02b] : Ending time of crawl : Thu Jan 21 12:53:15 2016
>>>>>>>>> homegfs [b5-gfsib02b] : Type of crawl: INDEX
>>>>>>>>> homegfs [b5-gfsib02b] : No. of entries healed : 0
>>>>>>>>> homegfs [b5-gfsib02b] : No. of entries in split-brain: 0
>>>>>>>>> homegfs [b5-gfsib02b] : No. of heal failed entries : 3
>>>>>>>>>
>>>>>>>>> homegfs [b6-gfsib02a] : Starting time of crawl : Thu Jan 21 12:53:04 2016
>>>>>>>>> homegfs [b6-gfsib02a] : Ending time of crawl : Thu Jan 21 12:53:04 2016
>>>>>>>>> homegfs [b6-gfsib02a] : Type of crawl: INDEX
>>>>>>>>> homegfs [b6-gfsib02a] : No. of entries healed : 0
>>>>>>>>> homegfs [b6-gfsib02a] : No. of entries in split-brain: 0
>>>>>>>>> homegfs [b6-gfsib02a] : No. of heal failed entries : 0
>>>>>>>>>
>>>>>>>>> homegfs [b7-gfsib02b] : Starting time of crawl : Thu Jan 21 12:53:09 2016
>>>>>>>>> homegfs [b7-gfsib02b] : Ending time of crawl : Thu Jan 21 12:53:09 2016
>>>>>>>>> homegfs [b7-gfsib02b] : Type of crawl: INDEX
>>>>>>>>> homegfs [b7-gfsib02b] : No. of entries healed : 0
>>>>>>>>> homegfs [b7-gfsib02b] : No. of entries in split-brain: 0
>>>>>>>>> homegfs [b7-gfsib02b] : No. of heal failed entries : 0
>>>>>>>>>
>>>>>>>>> *** gluster bug in 'gluster volume heal homegfs statistics' ***
>>>>>>>>> *** Use 'gluster volume heal homegfs info' until bug is fixed ***
>>>>>>>>>
>>>>>>>>> Brick gfs01a.corvidtec.com:/data/brick01a/homegfs/
>>>>>>>>> Number of entries: 0
>>>>>>>>>
>>>>>>>>> Brick gfs01b.corvidtec.com:/data/brick01b/homegfs/
>>>>>>>>> Number of entries: 0
>>>>>>>>>
>>>>>>>>> Brick gfs01a.corvidtec.com:/data/brick02a/homegfs/
>>>>>>>>> Number of entries: 0
>>>>>>>>>
>>>>>>>>> Brick gfs01b.corvidtec.com:/data/brick02b/homegfs/
>>>>>>>>> Number of entries: 0
>>>>>>>>>
>>>>>>>>> Brick gfs02a.corvidtec.com:/data/brick01a/homegfs/
>>>>>>>>> /users/bangell/.gconfd - Is in split-brain
>>>>>>>>> Number of entries: 1
>>>>>>>>>
>>>>>>>>> Brick gfs02b.corvidtec.com:/data/brick01b/homegfs/
>>>>>>>>> /users/bangell/.gconfd - Is in split-brain
>>>>>>>>> /users/bangell/.gconfd/saved_state
>>>>>>>>> Number of entries: 2
>>>>>>>>>
>>>>>>>>> Brick gfs02a.corvidtec.com:/data/brick02a/homegfs/
>>>>>>>>> Number of entries: 0
>>>>>>>>>
>>>>>>>>> Brick gfs02b.corvidtec.com:/data/brick02b/homegfs/
>>>>>>>>> Number of entries: 0
>>>>>>>>>
>>>>>>>>> On Thu, Jan 21, 2016 at 11:10 AM, Pranith Kumar Karampuri
>>>>>>>>> <pkarampu at redhat.com> wrote:
>>>>>>>>>
>>>>>>>>>> On 01/21/2016 09:26 PM, Glomski, Patrick wrote:
>>>>>>>>>>> I should mention that the problem is not currently
>>>>>>>>>>> occurring and there are no heals (output appended). By
>>>>>>>>>>> restarting the gluster services, we can stop the crawl,
>>>>>>>>>>> which lowers the load for a while. Subsequent crawls
>>>>>>>>>>> seem to finish properly. For what it's worth,
>>>>>>>>>>> files/folders that show up in the 'volume info' output
>>>>>>>>>>> during a hung crawl don't seem to be anything out of
>>>>>>>>>>> the ordinary.
>>>>>>>>>>>
>>>>>>>>>>> Over the past four days, the typical time before the
>>>>>>>>>>> problem recurs after suppressing it in this manner is
>>>>>>>>>>> an hour. Last night when we reached out to you was the
>>>>>>>>>>> last time it happened and the load has been low since
>>>>>>>>>>> (a relief). David believes that recursively listing the
>>>>>>>>>>> files (ls -alR or similar) from a client mount can
>>>>>>>>>>> force the issue to happen, but obviously I'd rather not
>>>>>>>>>>> unless we have some precise thing we're looking for.
>>>>>>>>>>> Let me know if you'd like me to attempt to drive the
>>>>>>>>>>> system unstable like that and what I should look for.
>>>>>>>>>>> As it's a production system, I'd rather not leave it in
>>>>>>>>>>> this state for long.
>>>>>>>>>>
>>>>>>>>>> Will it be possible to send glustershd and mount logs of
>>>>>>>>>> the past 4 days?
>>>>>>>>>> I would like to see if this is because of directory
>>>>>>>>>> self-heal going wild (Ravi is working on a throttling
>>>>>>>>>> feature for 3.8, which will allow us to put brakes on
>>>>>>>>>> self-heal traffic).
>>>>>>>>>>
>>>>>>>>>> Pranith
>>>>>>>>>>
>>>>>>>>>>> [root at gfs01a xattrop]# gluster volume heal homegfs info
>>>>>>>>>>> Brick gfs01a.corvidtec.com:/data/brick01a/homegfs/
>>>>>>>>>>> Number of entries: 0
>>>>>>>>>>>
>>>>>>>>>>> Brick gfs01b.corvidtec.com:/data/brick01b/homegfs/
>>>>>>>>>>> Number of entries: 0
>>>>>>>>>>>
>>>>>>>>>>> Brick gfs01a.corvidtec.com:/data/brick02a/homegfs/
>>>>>>>>>>> Number of entries: 0
>>>>>>>>>>>
>>>>>>>>>>> Brick gfs01b.corvidtec.com:/data/brick02b/homegfs/
>>>>>>>>>>> Number of entries: 0
>>>>>>>>>>>
>>>>>>>>>>> Brick gfs02a.corvidtec.com:/data/brick01a/homegfs/
>>>>>>>>>>> Number of entries: 0
>>>>>>>>>>>
>>>>>>>>>>> Brick gfs02b.corvidtec.com:/data/brick01b/homegfs/
>>>>>>>>>>> Number of entries: 0
>>>>>>>>>>>
>>>>>>>>>>> Brick gfs02a.corvidtec.com:/data/brick02a/homegfs/
>>>>>>>>>>> Number of entries: 0
>>>>>>>>>>>
>>>>>>>>>>> Brick gfs02b.corvidtec.com:/data/brick02b/homegfs/
>>>>>>>>>>> Number of entries: 0
>>>>>>>>>>>
>>>>>>>>>>> On Thu, Jan 21, 2016 at 10:40 AM, Pranith Kumar
>>>>>>>>>>> Karampuri <pkarampu at redhat.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> On 01/21/2016 08:25 PM, Glomski, Patrick wrote:
>>>>>>>>>>>>> Hello, Pranith. The typical behavior is that the %cpu
>>>>>>>>>>>>> on a glusterfsd process jumps to the number of
>>>>>>>>>>>>> processor cores available (800% or 1200%, depending
>>>>>>>>>>>>> on the pair of nodes involved) and the load average
>>>>>>>>>>>>> on the machine goes very high (~20). The volume's
>>>>>>>>>>>>> heal statistics output shows that it is crawling one
>>>>>>>>>>>>> of the bricks and trying to heal, but this crawl
>>>>>>>>>>>>> hangs and never seems to finish.
>>>>>>>>>>>>> The number of files in the xattrop directory varies
>>>>>>>>>>>>> over time, so I ran a wc -l as you requested
>>>>>>>>>>>>> periodically for some time and then started including
>>>>>>>>>>>>> a datestamped list of the files that were in the
>>>>>>>>>>>>> xattrop directory on each brick to see which were
>>>>>>>>>>>>> persistent. All bricks had files in the xattrop
>>>>>>>>>>>>> folder, so all results are attached.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks, this info is helpful. I don't see a lot of
>>>>>>>>>>>> files. Could you give the output of "gluster volume
>>>>>>>>>>>> heal <volname> info"? Is there any directory in there
>>>>>>>>>>>> which is LARGE?
>>>>>>>>>>>>
>>>>>>>>>>>> Pranith
>>>>>>>>>>>>
>>>>>>>>>>>>> Please let me know if there is anything else I can
>>>>>>>>>>>>> provide.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Patrick
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Thu, Jan 21, 2016 at 12:01 AM, Pranith Kumar
>>>>>>>>>>>>> Karampuri <pkarampu at redhat.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> hey,
>>>>>>>>>>>>>> Which process is consuming so much cpu? I went
>>>>>>>>>>>>>> through the logs you gave me. I see that the
>>>>>>>>>>>>>> following files are in gfid mismatch state:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> <066e4525-8f8b-43aa-b7a1-86bbcecc68b9/safebrowsing-backup>,
>>>>>>>>>>>>>> <1d48754b-b38c-403d-94e2-0f5c41d5f885/recovery.bak>,
>>>>>>>>>>>>>> <ddc92637-303a-4059-9c56-ab23b1bb6ae9/patch0008.cnvrg>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Could you give me the output of "ls
>>>>>>>>>>>>>> <brick-path>/indices/xattrop | wc -l" on all the
>>>>>>>>>>>>>> bricks which are acting this way? This will tell us
>>>>>>>>>>>>>> the number of pending self-heals on the system.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Pranith
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 01/20/2016 09:26 PM, David Robinson wrote:
>>>>>>>>>>>>>>> resending with parsed logs...
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I am having issues with 3.6.6 where the load will
>>>>>>>>>>>>>>>> spike up to 800% for one of the glusterfsd
>>>>>>>>>>>>>>>> processes and the users can no longer access the
>>>>>>>>>>>>>>>> system.
>>>>>>>>>>>>>>>> If I reboot the node, the heal will finish
>>>>>>>>>>>>>>>> normally after a few minutes and the system will
>>>>>>>>>>>>>>>> be responsive, but a few hours later the issue
>>>>>>>>>>>>>>>> will start again. It looks like it is hanging in a
>>>>>>>>>>>>>>>> heal and spinning up the load on one of the
>>>>>>>>>>>>>>>> bricks. The heal gets stuck and says it is
>>>>>>>>>>>>>>>> crawling and never returns. After a few minutes of
>>>>>>>>>>>>>>>> the heal saying it is crawling, the load spikes up
>>>>>>>>>>>>>>>> and the mounts become unresponsive. Any
>>>>>>>>>>>>>>>> suggestions on how to fix this? It has us stopped
>>>>>>>>>>>>>>>> cold, as the users can no longer access the
>>>>>>>>>>>>>>>> systems when the load spikes... Logs attached.
>>>>>>>>>>>>>>>> System setup info is:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> [root at gfs01a ~]# gluster volume info homegfs
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Volume Name: homegfs
>>>>>>>>>>>>>>>> Type: Distributed-Replicate
>>>>>>>>>>>>>>>> Volume ID: 1e32672a-f1b7-4b58-ba94-58c085e59071
>>>>>>>>>>>>>>>> Status: Started
>>>>>>>>>>>>>>>> Number of Bricks: 4 x 2 = 8
>>>>>>>>>>>>>>>> Transport-type: tcp
>>>>>>>>>>>>>>>> Bricks:
>>>>>>>>>>>>>>>> Brick1: gfsib01a.corvidtec.com:/data/brick01a/homegfs
>>>>>>>>>>>>>>>> Brick2: gfsib01b.corvidtec.com:/data/brick01b/homegfs
>>>>>>>>>>>>>>>> Brick3: gfsib01a.corvidtec.com:/data/brick02a/homegfs
>>>>>>>>>>>>>>>> Brick4: gfsib01b.corvidtec.com:/data/brick02b/homegfs
>>>>>>>>>>>>>>>> Brick5: gfsib02a.corvidtec.com:/data/brick01a/homegfs
>>>>>>>>>>>>>>>> Brick6: gfsib02b.corvidtec.com:/data/brick01b/homegfs
>>>>>>>>>>>>>>>> Brick7: gfsib02a.corvidtec.com:/data/brick02a/homegfs
>>>>>>>>>>>>>>>> Brick8: gfsib02b.corvidtec.com:/data/brick02b/homegfs
>>>>>>>>>>>>>>>> Options Reconfigured:
>>>>>>>>>>>>>>>> performance.io-thread-count: 32
>>>>>>>>>>>>>>>> performance.cache-size: 128MB
>>>>>>>>>>>>>>>> performance.write-behind-window-size: 128MB
>>>>>>>>>>>>>>>> server.allow-insecure: on
>>>>>>>>>>>>>>>> network.ping-timeout: 42
>>>>>>>>>>>>>>>> storage.owner-gid: 100
>>>>>>>>>>>>>>>> geo-replication.indexing: off
>>>>>>>>>>>>>>>> geo-replication.ignore-pid-check: on
>>>>>>>>>>>>>>>> changelog.changelog: off
>>>>>>>>>>>>>>>> changelog.fsync-interval: 3
>>>>>>>>>>>>>>>> changelog.rollover-time: 15
>>>>>>>>>>>>>>>> server.manage-gids: on
>>>>>>>>>>>>>>>> diagnostics.client-log-level: WARNING
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> [root at gfs01a ~]# rpm -qa | grep gluster
>>>>>>>>>>>>>>>> gluster-nagios-common-0.1.1-0.el6.noarch
>>>>>>>>>>>>>>>> glusterfs-fuse-3.6.6-1.el6.x86_64
>>>>>>>>>>>>>>>> glusterfs-debuginfo-3.6.6-1.el6.x86_64
>>>>>>>>>>>>>>>> glusterfs-libs-3.6.6-1.el6.x86_64
>>>>>>>>>>>>>>>> glusterfs-geo-replication-3.6.6-1.el6.x86_64
>>>>>>>>>>>>>>>> glusterfs-api-3.6.6-1.el6.x86_64
>>>>>>>>>>>>>>>> glusterfs-devel-3.6.6-1.el6.x86_64
>>>>>>>>>>>>>>>> glusterfs-api-devel-3.6.6-1.el6.x86_64
>>>>>>>>>>>>>>>> glusterfs-3.6.6-1.el6.x86_64
>>>>>>>>>>>>>>>> glusterfs-cli-3.6.6-1.el6.x86_64
>>>>>>>>>>>>>>>> glusterfs-rdma-3.6.6-1.el6.x86_64
>>>>>>>>>>>>>>>> samba-vfs-glusterfs-4.1.11-2.el6.x86_64
>>>>>>>>>>>>>>>> glusterfs-server-3.6.6-1.el6.x86_64
>>>>>>>>>>>>>>>> glusterfs-extra-xlators-3.6.6-1.el6.x86_64
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>>>> Gluster-devel mailing list
>>>>>>>>>>>>>>> Gluster-devel at gluster.org
>>>>>>>>>>>>>>> http://www.gluster.org/mailman/listinfo/gluster-devel
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>>> Gluster-users mailing list
>>>>>>>>>>>>>> Gluster-users at gluster.org
>>>>>>>>>>>>>> http://www.gluster.org/mailman/listinfo/gluster-users

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.gluster.org/pipermail/gluster-users/attachments/20160125/80f29e03/attachment.html>
It is doing it again... statedump from gfs02a is attached...

------ Original Message ------
From: "Pranith Kumar Karampuri" <pkarampu at redhat.com>
To: "Glomski, Patrick" <patrick.glomski at corvidtec.com>
Cc: "David Robinson" <drobinson at corvidtec.com>; "gluster-users at gluster.org" <gluster-users at gluster.org>; "Gluster Devel" <gluster-devel at gluster.org>
Sent: 1/24/2016 9:27:02 PM
Subject: Re: [Gluster-users] [Gluster-devel] heal hanging
of heal failed entries : 0 >>>>>>>>>> >>>>>>>>>>homegfs [b3-gfsib01b] : Starting time of crawl : Thu Jan >>>>>>>>>>21 12:53:04 2016 >>>>>>>>>>homegfs [b3-gfsib01b] : Ending time of crawl : Thu Jan >>>>>>>>>>21 12:53:04 2016 >>>>>>>>>>homegfs [b3-gfsib01b] : Type of crawl: INDEX >>>>>>>>>>homegfs [b3-gfsib01b] : No. of entries healed : 0 >>>>>>>>>>homegfs [b3-gfsib01b] : No. of entries in split-brain: 0 >>>>>>>>>>homegfs [b3-gfsib01b] : No. of heal failed entries : 0 >>>>>>>>>> >>>>>>>>>>homegfs [b4-gfsib02a] : Starting time of crawl : Thu Jan >>>>>>>>>>21 12:53:33 2016 >>>>>>>>>>homegfs [b4-gfsib02a] : Ending time of crawl : Thu Jan >>>>>>>>>>21 12:53:33 2016 >>>>>>>>>>homegfs [b4-gfsib02a] : Type of crawl: INDEX >>>>>>>>>>homegfs [b4-gfsib02a] : No. of entries healed : 0 >>>>>>>>>>homegfs [b4-gfsib02a] : No. of entries in split-brain: 0 >>>>>>>>>>homegfs [b4-gfsib02a] : No. of heal failed entries : 1 >>>>>>>>>> >>>>>>>>>>homegfs [b5-gfsib02b] : Starting time of crawl : Thu Jan >>>>>>>>>>21 12:53:14 2016 >>>>>>>>>>homegfs [b5-gfsib02b] : Ending time of crawl : Thu Jan >>>>>>>>>>21 12:53:15 2016 >>>>>>>>>>homegfs [b5-gfsib02b] : Type of crawl: INDEX >>>>>>>>>>homegfs [b5-gfsib02b] : No. of entries healed : 0 >>>>>>>>>>homegfs [b5-gfsib02b] : No. of entries in split-brain: 0 >>>>>>>>>>homegfs [b5-gfsib02b] : No. of heal failed entries : 3 >>>>>>>>>> >>>>>>>>>>homegfs [b6-gfsib02a] : Starting time of crawl : Thu Jan >>>>>>>>>>21 12:53:04 2016 >>>>>>>>>>homegfs [b6-gfsib02a] : Ending time of crawl : Thu Jan >>>>>>>>>>21 12:53:04 2016 >>>>>>>>>>homegfs [b6-gfsib02a] : Type of crawl: INDEX >>>>>>>>>>homegfs [b6-gfsib02a] : No. of entries healed : 0 >>>>>>>>>>homegfs [b6-gfsib02a] : No. of entries in split-brain: 0 >>>>>>>>>>homegfs [b6-gfsib02a] : No. 
of heal failed entries : 0 >>>>>>>>>> >>>>>>>>>>homegfs [b7-gfsib02b] : Starting time of crawl : Thu Jan >>>>>>>>>>21 12:53:09 2016 >>>>>>>>>>homegfs [b7-gfsib02b] : Ending time of crawl : Thu Jan >>>>>>>>>>21 12:53:09 2016 >>>>>>>>>>homegfs [b7-gfsib02b] : Type of crawl: INDEX >>>>>>>>>>homegfs [b7-gfsib02b] : No. of entries healed : 0 >>>>>>>>>>homegfs [b7-gfsib02b] : No. of entries in split-brain: 0 >>>>>>>>>>homegfs [b7-gfsib02b] : No. of heal failed entries : 0 >>>>>>>>>> >>>>>>>>>>*** gluster bug in 'gluster volume heal homegfs statistics' >>>>>>>>>>*** >>>>>>>>>>*** Use 'gluster volume heal homegfs info' until bug is fixed >>>>>>>>>>*** >>>>>>>>>> >>>>>>>>>>Brick gfs01a.corvidtec.com:/data/brick01a/homegfs/ >>>>>>>>>>Number of entries: 0 >>>>>>>>>> >>>>>>>>>>Brick gfs01b.corvidtec.com:/data/brick01b/homegfs/ >>>>>>>>>>Number of entries: 0 >>>>>>>>>> >>>>>>>>>>Brick gfs01a.corvidtec.com:/data/brick02a/homegfs/ >>>>>>>>>>Number of entries: 0 >>>>>>>>>> >>>>>>>>>>Brick gfs01b.corvidtec.com:/data/brick02b/homegfs/ >>>>>>>>>>Number of entries: 0 >>>>>>>>>> >>>>>>>>>>Brick gfs02a.corvidtec.com:/data/brick01a/homegfs/ >>>>>>>>>>/users/bangell/.gconfd - Is in split-brain >>>>>>>>>> >>>>>>>>>>Number of entries: 1 >>>>>>>>>> >>>>>>>>>>Brick gfs02b.corvidtec.com:/data/brick01b/homegfs/ >>>>>>>>>>/users/bangell/.gconfd - Is in split-brain >>>>>>>>>> >>>>>>>>>>/users/bangell/.gconfd/saved_state >>>>>>>>>>Number of entries: 2 >>>>>>>>>> >>>>>>>>>>Brick gfs02a.corvidtec.com:/data/brick02a/homegfs/ >>>>>>>>>>Number of entries: 0 >>>>>>>>>> >>>>>>>>>>Brick gfs02b.corvidtec.com:/data/brick02b/homegfs/ >>>>>>>>>>Number of entries: 0 >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>On Thu, Jan 21, 2016 at 11:10 AM, Pranith Kumar Karampuri >>>>>>>>>><pkarampu at redhat.com> wrote: >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>On 01/21/2016 09:26 PM, Glomski, Patrick wrote: >>>>>>>>>>>>I should mention that the problem is not currently occurring >>>>>>>>>>>>and there are no heals 
>>>>>>>>>>>>(output appended). By restarting the gluster services, we can stop the crawl, which lowers the load for a while. Subsequent crawls seem to finish properly. For what it's worth, files/folders that show up in the 'volume info' output during a hung crawl don't seem to be anything out of the ordinary.
>>>>>>>>>>>>
>>>>>>>>>>>>Over the past four days, the typical time before the problem recurs after suppressing it in this manner is an hour. Last night when we reached out to you was the last time it happened and the load has been low since (a relief). David believes that recursively listing the files (ls -alR or similar) from a client mount can force the issue to happen, but obviously I'd rather not unless we have some precise thing we're looking for. Let me know if you'd like me to attempt to drive the system unstable like that and what I should look for. As it's a production system, I'd rather not leave it in this state for long.
>>>>>>>>>>>
>>>>>>>>>>>Will it be possible to send glustershd and mount logs of the past 4 days? I would like to see if this is because of directory self-heal going wild (Ravi is working on a throttling feature for 3.8, which will allow us to put brakes on self-heal traffic).
>>>>>>>>>>>
>>>>>>>>>>>Pranith
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>[root at gfs01a xattrop]# gluster volume heal homegfs info
>>>>>>>>>>>>Brick gfs01a.corvidtec.com:/data/brick01a/homegfs/
>>>>>>>>>>>>Number of entries: 0
>>>>>>>>>>>>
>>>>>>>>>>>>Brick gfs01b.corvidtec.com:/data/brick01b/homegfs/
>>>>>>>>>>>>Number of entries: 0
>>>>>>>>>>>>
>>>>>>>>>>>>Brick gfs01a.corvidtec.com:/data/brick02a/homegfs/
>>>>>>>>>>>>Number of entries: 0
>>>>>>>>>>>>
>>>>>>>>>>>>Brick gfs01b.corvidtec.com:/data/brick02b/homegfs/
>>>>>>>>>>>>Number of entries: 0
>>>>>>>>>>>>
>>>>>>>>>>>>Brick gfs02a.corvidtec.com:/data/brick01a/homegfs/
>>>>>>>>>>>>Number of entries: 0
>>>>>>>>>>>>
>>>>>>>>>>>>Brick gfs02b.corvidtec.com:/data/brick01b/homegfs/
>>>>>>>>>>>>Number of entries: 0
>>>>>>>>>>>>
>>>>>>>>>>>>Brick gfs02a.corvidtec.com:/data/brick02a/homegfs/
>>>>>>>>>>>>Number of entries: 0
>>>>>>>>>>>>
>>>>>>>>>>>>Brick gfs02b.corvidtec.com:/data/brick02b/homegfs/
>>>>>>>>>>>>Number of entries: 0
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>On Thu, Jan 21, 2016 at 10:40 AM, Pranith Kumar Karampuri
>>>>>>>>>>>><pkarampu at redhat.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>On 01/21/2016 08:25 PM, Glomski, Patrick wrote:
>>>>>>>>>>>>>>Hello, Pranith. The typical behavior is that the %cpu on a glusterfsd process jumps to the number of processor cores available (800% or 1200%, depending on the pair of nodes involved) and the load average on the machine goes very high (~20). The volume's heal statistics output shows that it is crawling one of the bricks and trying to heal, but this crawl hangs and never seems to finish.
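[Editor's note] The per-second stack traces discussed in this thread can be captured with a small loop along these lines (a sketch, not from the thread itself; it assumes gdb's `gstack` wrapper is installed, and `sample_stacks`, `STACK_CMD`, and `INTERVAL` are names invented here):

```shell
#!/bin/bash
# Sketch: sample the stack of a spinning glusterfsd once per second for
# 20 seconds. STACK_CMD defaults to gstack (from gdb) but can be
# overridden; sample_stacks is a helper name invented for this sketch.
sample_stacks() {
    pid=$1
    count=${2:-20}
    out="glusterfsd-$pid.stacks"
    : > "$out"                              # truncate any previous capture
    i=0
    while [ "$i" -lt "$count" ]; do
        date +%T >> "$out"                  # timestamp each sample
        ${STACK_CMD:-gstack} "$pid" >> "$out" 2>&1
        sleep "${INTERVAL:-1}"
        i=$((i + 1))
    done
}

# Usage (on a brick server): sample_stacks "$(pidof -s glusterfsd)" 20
```

Comparing consecutive samples shows whether the process is stuck in one call chain (a hang) or churning through lock/heal paths (a spin).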
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>The number of files in the xattrop directory varies over time, so I ran a wc -l as you requested periodically for some time and then started including a datestamped list of the files that were in the xattrop directory on each brick to see which were persistent. All bricks had files in the xattrop folder, so all results are attached.
>>>>>>>>>>>>>Thanks, this info is helpful. I don't see a lot of files. Could you give the output of "gluster volume heal <volname> info"? Is there any directory in there which is LARGE?
>>>>>>>>>>>>>
>>>>>>>>>>>>>Pranith
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>Please let me know if there is anything else I can provide.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>Patrick
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>On Thu, Jan 21, 2016 at 12:01 AM, Pranith Kumar Karampuri
>>>>>>>>>>>>>><pkarampu at redhat.com> wrote:
>>>>>>>>>>>>>>>hey,
>>>>>>>>>>>>>>>      Which process is consuming so much cpu? I went through the logs you gave me. I see that the following files are in gfid mismatch state:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>><066e4525-8f8b-43aa-b7a1-86bbcecc68b9/safebrowsing-backup>,
>>>>>>>>>>>>>>><1d48754b-b38c-403d-94e2-0f5c41d5f885/recovery.bak>,
>>>>>>>>>>>>>>><ddc92637-303a-4059-9c56-ab23b1bb6ae9/patch0008.cnvrg>,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>Could you give me the output of "ls <brick-path>/indices/xattrop | wc -l" on all the bricks which are acting this way? This will tell us the number of pending self-heals on the system.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>Pranith
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>On 01/20/2016 09:26 PM, David Robinson wrote:
>>>>>>>>>>>>>>>>resending with parsed logs...
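[Editor's note] The pending-heal check Pranith asks for above can be scripted across all bricks on a server. A minimal sketch, assuming the brick layout shown in this thread (the index directory lives under `.glusterfs/indices/xattrop`, as the listings earlier show; `count_xattrop` is a helper name invented here):

```shell
#!/bin/bash
# Sketch: report the number of self-heal index entries per brick, i.e.
# "ls <brick-path>/indices/xattrop | wc -l" from the thread, run for
# every brick. count_xattrop is a helper name invented for this sketch.
count_xattrop() {
    # Prints the entry count for one brick root; 0 if the dir is missing.
    ls "$1/.glusterfs/indices/xattrop" 2>/dev/null | wc -l | tr -d ' '
}

for brick in /data/brick01a/homegfs /data/brick02a/homegfs \
             /data/brick01b/homegfs /data/brick02b/homegfs; do
    printf '%s: %s\n' "$brick" "$(count_xattrop "$brick")"
done
```

Run from cron with a datestamp, this produces exactly the kind of periodic record described above for spotting persistent entries.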
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>I am having issues with 3.6.6 where the load will spike up to 800% for one of the glusterfsd processes and the users can no longer access the system. If I reboot the node, the heal will finish normally after a few minutes and the system will be responsive, but a few hours later the issue will start again. It looks like it is hanging in a heal and spinning up the load on one of the bricks. The heal gets stuck and says it is crawling and never returns. After a few minutes of the heal saying it is crawling, the load spikes up and the mounts become unresponsive.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>Any suggestions on how to fix this? It has us stopped cold, as the users can no longer access the systems when the load spikes... Logs attached.
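[Editor's note] For files reported as being in split-brain earlier in the thread (e.g. /users/bangell/.gconfd), the per-brick AFR changelog xattrs show which replica holds pending operations; they can be read with `getfattr -d -m trusted.afr -e hex <brick-path>/<file>` on each brick. A hedged sketch for decoding the 24-hex-digit value into its three big-endian 32-bit counters (the usual data/metadata/entry AFR layout; `decode_afr` is a name invented here):

```shell
#!/bin/bash
# Sketch: decode a trusted.afr.* changelog value, as printed by
#   getfattr -d -m trusted.afr -e hex <brick-path>/<file>
# into its pending data/metadata/entry counters. Assumes the standard
# AFR layout of three big-endian 32-bit counters; decode_afr is a
# helper name invented for this sketch.
decode_afr() {
    hex=${1#0x}                              # strip the 0x prefix
    printf 'data=%d metadata=%d entry=%d\n' \
        "0x${hex:0:8}" "0x${hex:8:8}" "0x${hex:16:8}"
}

# Example: decode_afr 0x000000020000000100000000
```

Roughly speaking, when both replicas carry non-zero counters blaming each other for the same file, self-heal reports it as split-brain, as in the heal info output above.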
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>System setup info is:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>[root at gfs01a ~]# gluster volume info homegfs
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>Volume Name: homegfs
>>>>>>>>>>>>>>>>>>Type: Distributed-Replicate
>>>>>>>>>>>>>>>>>>Volume ID: 1e32672a-f1b7-4b58-ba94-58c085e59071
>>>>>>>>>>>>>>>>>>Status: Started
>>>>>>>>>>>>>>>>>>Number of Bricks: 4 x 2 = 8
>>>>>>>>>>>>>>>>>>Transport-type: tcp
>>>>>>>>>>>>>>>>>>Bricks:
>>>>>>>>>>>>>>>>>>Brick1: gfsib01a.corvidtec.com:/data/brick01a/homegfs
>>>>>>>>>>>>>>>>>>Brick2: gfsib01b.corvidtec.com:/data/brick01b/homegfs
>>>>>>>>>>>>>>>>>>Brick3: gfsib01a.corvidtec.com:/data/brick02a/homegfs
>>>>>>>>>>>>>>>>>>Brick4: gfsib01b.corvidtec.com:/data/brick02b/homegfs
>>>>>>>>>>>>>>>>>>Brick5: gfsib02a.corvidtec.com:/data/brick01a/homegfs
>>>>>>>>>>>>>>>>>>Brick6: gfsib02b.corvidtec.com:/data/brick01b/homegfs
>>>>>>>>>>>>>>>>>>Brick7: gfsib02a.corvidtec.com:/data/brick02a/homegfs
>>>>>>>>>>>>>>>>>>Brick8: gfsib02b.corvidtec.com:/data/brick02b/homegfs
>>>>>>>>>>>>>>>>>>Options Reconfigured:
>>>>>>>>>>>>>>>>>>performance.io-thread-count: 32
>>>>>>>>>>>>>>>>>>performance.cache-size: 128MB
>>>>>>>>>>>>>>>>>>performance.write-behind-window-size: 128MB
>>>>>>>>>>>>>>>>>>server.allow-insecure: on
>>>>>>>>>>>>>>>>>>network.ping-timeout: 42
>>>>>>>>>>>>>>>>>>storage.owner-gid: 100
>>>>>>>>>>>>>>>>>>geo-replication.indexing: off
>>>>>>>>>>>>>>>>>>geo-replication.ignore-pid-check: on
>>>>>>>>>>>>>>>>>>changelog.changelog: off
>>>>>>>>>>>>>>>>>>changelog.fsync-interval: 3
>>>>>>>>>>>>>>>>>>changelog.rollover-time: 15
>>>>>>>>>>>>>>>>>>server.manage-gids: on
>>>>>>>>>>>>>>>>>>diagnostics.client-log-level: WARNING
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>[root at gfs01a ~]# rpm -qa | grep gluster
>>>>>>>>>>>>>>>>>>gluster-nagios-common-0.1.1-0.el6.noarch
>>>>>>>>>>>>>>>>>>glusterfs-fuse-3.6.6-1.el6.x86_64
>>>>>>>>>>>>>>>>>>glusterfs-debuginfo-3.6.6-1.el6.x86_64
>>>>>>>>>>>>>>>>>>glusterfs-libs-3.6.6-1.el6.x86_64
>>>>>>>>>>>>>>>>>>glusterfs-geo-replication-3.6.6-1.el6.x86_64
>>>>>>>>>>>>>>>>>>glusterfs-api-3.6.6-1.el6.x86_64
>>>>>>>>>>>>>>>>>>glusterfs-devel-3.6.6-1.el6.x86_64
>>>>>>>>>>>>>>>>>>glusterfs-api-devel-3.6.6-1.el6.x86_64
>>>>>>>>>>>>>>>>>>glusterfs-3.6.6-1.el6.x86_64
>>>>>>>>>>>>>>>>>>glusterfs-cli-3.6.6-1.el6.x86_64
>>>>>>>>>>>>>>>>>>glusterfs-rdma-3.6.6-1.el6.x86_64
>>>>>>>>>>>>>>>>>>samba-vfs-glusterfs-4.1.11-2.el6.x86_64
>>>>>>>>>>>>>>>>>>glusterfs-server-3.6.6-1.el6.x86_64
>>>>>>>>>>>>>>>>>>glusterfs-extra-xlators-3.6.6-1.el6.x86_64
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>_______________________________________________
>>>>>>>>>>>>>>>>Gluster-devel mailing list
>>>>>>>>>>>>>>>>Gluster-devel at gluster.org
>>>>>>>>>>>>>>>>http://www.gluster.org/mailman/listinfo/gluster-devel
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>_______________________________________________
>>>>>>>>>>>>>>>Gluster-users mailing list
>>>>>>>>>>>>>>>Gluster-users at gluster.org
>>>>>>>>>>>>>>>http://www.gluster.org/mailman/listinfo/gluster-users
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.gluster.org/pipermail/gluster-users/attachments/20160125/4018f0a2/attachment-0001.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: data-brick02a-homegfs.4066.dump.1453742225.gz
Type: application/x-gzip
Size: 1138050 bytes
Desc: not available
URL: <http://www.gluster.org/pipermail/gluster-users/attachments/20160125/4018f0a2/attachment-0002.gz>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: data-brick01a-homegfs.4061.dump.1453742224.gz
Type: application/x-gzip
Size: 640151 bytes
Desc: not available
URL: <http://www.gluster.org/pipermail/gluster-users/attachments/20160125/4018f0a2/attachment-0003.gz>