----- Original Message -----> From: "Justin Clift" <justin at gluster.org> > To: "Benjamin Turner" <bennyturns at gmail.com> > Cc: "David F. Robinson" <david.robinson at corvidtec.com>, gluster-users at gluster.org, "Gluster Devel" > <gluster-devel at gluster.org>, "Ben Turner" <bturner at redhat.com> > Sent: Friday, February 6, 2015 3:27:53 PM > Subject: Re: [Gluster-devel] [Gluster-users] missing files > > On 6 Feb 2015, at 02:05, Benjamin Turner <bennyturns at gmail.com> wrote: > > I think that the multi threaded epoll changes that _just_ landed in master > > will help resolve this, but they are so new I haven't been able to test > > this. I'll know more when I get a chance to test tomorrow. > > Which multi-threaded epoll code just landed in master? Are you thinking > of this one? > > http://review.gluster.org/#/c/3842/ > > If so, it's not in master yet. ;)Doh! I just saw - "Required patches are all upstream now" and assumed they were merged. I have been in class all week so I am not up2date with everything. I gave instructions on compiling it from the gerrit patches + master so if David wants to give it a go he can. Sorry for the confusion. -b> + Justin > > > > -b > > > > On Thu, Feb 5, 2015 at 6:04 PM, David F. Robinson > > <david.robinson at corvidtec.com> wrote: > > Isn't rsync what geo-rep uses? > > > > David (Sent from mobile) > > > > ==============================> > David F. Robinson, Ph.D. > > President - Corvid Technologies > > 704.799.6944 x101 [office] > > 704.252.1310 [cell] > > 704.799.7974 [fax] > > David.Robinson at corvidtec.com > > http://www.corvidtechnologies.com > > > > > On Feb 5, 2015, at 5:41 PM, Ben Turner <bturner at redhat.com> wrote: > > > > > > ----- Original Message ----- > > >> From: "Ben Turner" <bturner at redhat.com> > > >> To: "David F. Robinson" <david.robinson at corvidtec.com> > > >> Cc: "Pranith Kumar Karampuri" <pkarampu at redhat.com>, "Xavier Hernandez" > > >> <xhernandez at datalab.es>, "Benjamin Turner" > > >> <bennyturns at gmail.com>, gluster-users at gluster.org, "Gluster Devel" > > >> <gluster-devel at gluster.org> > > >> Sent: Thursday, February 5, 2015 5:22:26 PM > > >> Subject: Re: [Gluster-users] [Gluster-devel] missing files > > >> > > >> ----- Original Message ----- > > >>> From: "David F. Robinson" <david.robinson at corvidtec.com> > > >>> To: "Ben Turner" <bturner at redhat.com> > > >>> Cc: "Pranith Kumar Karampuri" <pkarampu at redhat.com>, "Xavier Hernandez" > > >>> <xhernandez at datalab.es>, "Benjamin Turner" > > >>> <bennyturns at gmail.com>, gluster-users at gluster.org, "Gluster Devel" > > >>> <gluster-devel at gluster.org> > > >>> Sent: Thursday, February 5, 2015 5:01:13 PM > > >>> Subject: Re: [Gluster-users] [Gluster-devel] missing files > > >>> > > >>> I'll send you the emails I sent Pranith with the logs. What causes > > >>> these > > >>> disconnects? > > >> > > >> Thanks David! Disconnects happen when there are interruption in > > >> communication between peers, normally there is ping timeout that > > >> happens. > > >> It could be anything from a flaky NW to the system was to busy to > > >> respond > > >> to the pings. My initial take is more towards the ladder as rsync is > > >> absolutely the worst use case for gluster - IIRC it writes in 4kb > > >> blocks. I > > >> try to keep my writes at least 64KB as in my testing that is the > > >> smallest > > >> block size I can write with before perf starts to really drop off. I'll > > >> try > > >> something similar in the lab. 
> > >
> > > Ok, I do think that the file being self-healed is the RCA for what you
> > > were seeing.  Let's look at one of the disconnects:
> > >
> > > data-brick02a-homegfs.log:[2015-02-03 20:54:02.772180] I [server.c:518:server_rpc_notify] 0-homegfs-server: disconnecting connection from gfs01b.corvidtec.com-4175-2015/02/02-16:44:31:179119-homegfs-client-2-0-1
> > >
> > > And in the glustershd.log from the gfs01b_glustershd.log file:
> > >
> > > [2015-02-03 20:55:48.001797] I [afr-self-heal-entry.c:554:afr_selfheal_entry_do] 0-homegfs-replicate-0: performing entry selfheal on 6c79a368-edaa-432b-bef9-ec690ab42448
> > > [2015-02-03 20:55:49.341996] I [afr-self-heal-common.c:476:afr_log_selfheal] 0-homegfs-replicate-0: Completed entry selfheal on 6c79a368-edaa-432b-bef9-ec690ab42448. source=1 sinks=0
> > > [2015-02-03 20:55:49.343093] I [afr-self-heal-entry.c:554:afr_selfheal_entry_do] 0-homegfs-replicate-0: performing entry selfheal on 792cb0d6-9290-4447-8cd7-2b2d7a116a69
> > > [2015-02-03 20:55:50.463652] I [afr-self-heal-common.c:476:afr_log_selfheal] 0-homegfs-replicate-0: Completed entry selfheal on 792cb0d6-9290-4447-8cd7-2b2d7a116a69. source=1 sinks=0
> > > [2015-02-03 20:55:51.465289] I [afr-self-heal-metadata.c:54:__afr_selfheal_metadata_do] 0-homegfs-replicate-0: performing metadata selfheal on 403e661a-1c27-4e79-9867-c0572aba2b3c
> > > [2015-02-03 20:55:51.466515] I [afr-self-heal-common.c:476:afr_log_selfheal] 0-homegfs-replicate-0: Completed metadata selfheal on 403e661a-1c27-4e79-9867-c0572aba2b3c. source=1 sinks=0
> > > [2015-02-03 20:55:51.467098] I [afr-self-heal-entry.c:554:afr_selfheal_entry_do] 0-homegfs-replicate-0: performing entry selfheal on 403e661a-1c27-4e79-9867-c0572aba2b3c
> > > [2015-02-03 20:55:55.257808] I [afr-self-heal-common.c:476:afr_log_selfheal] 0-homegfs-replicate-0: Completed entry selfheal on 403e661a-1c27-4e79-9867-c0572aba2b3c. source=1 sinks=0
> > > [2015-02-03 20:55:55.258548] I [afr-self-heal-metadata.c:54:__afr_selfheal_metadata_do] 0-homegfs-replicate-0: performing metadata selfheal on c612ee2f-2fb4-4157-a9ab-5a2d5603c541
> > > [2015-02-03 20:55:55.259367] I [afr-self-heal-common.c:476:afr_log_selfheal] 0-homegfs-replicate-0: Completed metadata selfheal on c612ee2f-2fb4-4157-a9ab-5a2d5603c541. source=1 sinks=0
> > > [2015-02-03 20:55:55.259980] I [afr-self-heal-entry.c:554:afr_selfheal_entry_do] 0-homegfs-replicate-0: performing entry selfheal on c612ee2f-2fb4-4157-a9ab-5a2d5603c541
> > >
> > > As you can see, the self-heal logs are just spammed with files being
> > > healed, and when I looked at a couple of disconnects I saw self heals
> > > getting run shortly after on the bricks that were down.  Now we need to
> > > find the cause of the disconnects; I am thinking once the disconnects
> > > are resolved the files should be properly copied over without SH having
> > > to fix things.  Like I said, I'll give this a go on my lab systems and
> > > see if I can repro the disconnects; I'll have time to run through it
> > > tomorrow.  If in the meantime anyone else has a theory / anything to
> > > add here, it would be appreciated.
> > >
> > > -b
> > >
> > >> -b
> > >>
> > >>> David  (Sent from mobile)
> > >>>
> > >>> ==============================
> > >>> David F. Robinson, Ph.D.
> > >>> President - Corvid Technologies
> > >>> 704.799.6944 x101 [office]
> > >>> 704.252.1310 [cell]
> > >>> 704.799.7974 [fax]
> > >>> David.Robinson at corvidtec.com
> > >>> http://www.corvidtechnologies.com
> > >>>
> > >>>> On Feb 5, 2015, at 4:55 PM, Ben Turner <bturner at redhat.com> wrote:
> > >>>>
> > >>>> ----- Original Message -----
> > >>>>> From: "Pranith Kumar Karampuri" <pkarampu at redhat.com>
> > >>>>> To: "Xavier Hernandez" <xhernandez at datalab.es>, "David F. Robinson"
> > >>>>> <david.robinson at corvidtec.com>, "Benjamin Turner" <bennyturns at gmail.com>
> > >>>>> Cc: gluster-users at gluster.org, "Gluster Devel" <gluster-devel at gluster.org>
> > >>>>> Sent: Thursday, February 5, 2015 5:30:04 AM
> > >>>>> Subject: Re: [Gluster-users] [Gluster-devel] missing files
> > >>>>>
> > >>>>>> On 02/05/2015 03:48 PM, Pranith Kumar Karampuri wrote:
> > >>>>>> I believe David already fixed this. I hope this is the same
> > >>>>>> permissions issue he told us about.
> > >>>>> Oops, it is not. I will take a look.
> > >>>>
> > >>>> Yes David, exactly like these:
> > >>>>
> > >>>> data-brick02a-homegfs.log:[2015-02-03 19:09:34.568842] I [server.c:518:server_rpc_notify] 0-homegfs-server: disconnecting connection from gfs02a.corvidtec.com-18563-2015/02/03-19:07:58:519134-homegfs-client-2-0-0
> > >>>> data-brick02a-homegfs.log:[2015-02-03 19:09:41.286551] I [server.c:518:server_rpc_notify] 0-homegfs-server: disconnecting connection from gfs01a.corvidtec.com-12804-2015/02/03-19:09:38:497808-homegfs-client-2-0-0
> > >>>> data-brick02a-homegfs.log:[2015-02-03 19:16:35.906412] I [server.c:518:server_rpc_notify] 0-homegfs-server: disconnecting connection from gfs02b.corvidtec.com-27190-2015/02/03-19:15:53:458467-homegfs-client-2-0-0
> > >>>> data-brick02a-homegfs.log:[2015-02-03 19:51:22.761293] I [server.c:518:server_rpc_notify] 0-homegfs-server: disconnecting connection from gfs01a.corvidtec.com-25926-2015/02/03-19:51:02:89070-homegfs-client-2-0-0
> > >>>> data-brick02a-homegfs.log:[2015-02-03 20:54:02.772180] I [server.c:518:server_rpc_notify] 0-homegfs-server: disconnecting connection from gfs01b.corvidtec.com-4175-2015/02/02-16:44:31:179119-homegfs-client-2-0-1
> > >>>>
> > >>>> You can 100% verify my theory if you can correlate the time of the
> > >>>> disconnects to the time that the missing files were healed.  Can you
> > >>>> have a look at /var/log/glusterfs/glustershd.log?  That has all of the
> > >>>> healed files + timestamps; if we can see a disconnect during the rsync
> > >>>> and a self heal of the missing file, I think we can safely assume that
> > >>>> the disconnects may have caused this.  I'll try this on my test
> > >>>> systems.  How much data did you rsync?  What size-ish of files / an
> > >>>> idea of the dir layout?
> > >>>>
> > >>>> @Pranith - Could bricks flapping up and down during the rsync cause
> > >>>> the files to be missing on the first ls (written to 1 subvol but not
> > >>>> the other because it was down), the ls triggered SH, and that's why
> > >>>> the files were there for the second ls - could that be a possible
> > >>>> cause here?
> > >>>>
> > >>>> -b
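
(A quick way to line those timestamps up is to pull both sets of events straight out of the logs Ben mentions.  This is only a sketch against a default install - brick logs living under /var/log/glusterfs/bricks/ is an assumption, and the grep patterns are simply taken from the log lines quoted above:

    # disconnect events, with timestamps, from the brick logs
    grep "disconnecting connection" /var/log/glusterfs/bricks/*.log
    # heal completions, with timestamps, from the self-heal daemon log
    grep -E "Completed (entry|metadata|data) selfheal" /var/log/glusterfs/glustershd.log

If every "Completed ... selfheal" burst sits shortly after a "disconnecting connection" entry, that is exactly the correlation Ben is asking David to check for.)
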
> > >>>>> Pranith
> > >>>>>>
> > >>>>>> Pranith
> > >>>>>>> On 02/05/2015 03:44 PM, Xavier Hernandez wrote:
> > >>>>>>> Is the failure repeatable?  With the same directories?
> > >>>>>>>
> > >>>>>>> It's very weird that the directories appear on the volume when you
> > >>>>>>> do an 'ls' on the bricks.  Could it be that you only did a single
> > >>>>>>> 'ls' on the fuse mount, which did not show the directory?  Is it
> > >>>>>>> possible that this 'ls' triggered a self-heal that repaired the
> > >>>>>>> problem, whatever it was, and when you did another 'ls' on the fuse
> > >>>>>>> mount after the 'ls' on the bricks, the directories were there?
> > >>>>>>>
> > >>>>>>> The first 'ls' could have healed the files, causing the following
> > >>>>>>> 'ls' on the bricks to show the files as if nothing were damaged.
> > >>>>>>> If that's the case, it's possible that there were some
> > >>>>>>> disconnections during the copy.
> > >>>>>>>
> > >>>>>>> Added Pranith because he knows the replication and self-heal
> > >>>>>>> details better.
> > >>>>>>>
> > >>>>>>> Xavi
> > >>>>>>>
> > >>>>>>>> On 02/04/2015 07:23 PM, David F. Robinson wrote:
> > >>>>>>>> Distributed/replicated
> > >>>>>>>>
> > >>>>>>>> Volume Name: homegfs
> > >>>>>>>> Type: Distributed-Replicate
> > >>>>>>>> Volume ID: 1e32672a-f1b7-4b58-ba94-58c085e59071
> > >>>>>>>> Status: Started
> > >>>>>>>> Number of Bricks: 4 x 2 = 8
> > >>>>>>>> Transport-type: tcp
> > >>>>>>>> Bricks:
> > >>>>>>>> Brick1: gfsib01a.corvidtec.com:/data/brick01a/homegfs
> > >>>>>>>> Brick2: gfsib01b.corvidtec.com:/data/brick01b/homegfs
> > >>>>>>>> Brick3: gfsib01a.corvidtec.com:/data/brick02a/homegfs
> > >>>>>>>> Brick4: gfsib01b.corvidtec.com:/data/brick02b/homegfs
> > >>>>>>>> Brick5: gfsib02a.corvidtec.com:/data/brick01a/homegfs
> > >>>>>>>> Brick6: gfsib02b.corvidtec.com:/data/brick01b/homegfs
> > >>>>>>>> Brick7: gfsib02a.corvidtec.com:/data/brick02a/homegfs
> > >>>>>>>> Brick8: gfsib02b.corvidtec.com:/data/brick02b/homegfs
> > >>>>>>>> Options Reconfigured:
> > >>>>>>>> performance.io-thread-count: 32
> > >>>>>>>> performance.cache-size: 128MB
> > >>>>>>>> performance.write-behind-window-size: 128MB
> > >>>>>>>> server.allow-insecure: on
> > >>>>>>>> network.ping-timeout: 10
> > >>>>>>>> storage.owner-gid: 100
> > >>>>>>>> geo-replication.indexing: off
> > >>>>>>>> geo-replication.ignore-pid-check: on
> > >>>>>>>> changelog.changelog: on
> > >>>>>>>> changelog.fsync-interval: 3
> > >>>>>>>> changelog.rollover-time: 15
> > >>>>>>>> server.manage-gids: on
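
(That listing is the output of the standard volume queries, which are also the quickest things to re-run and paste whenever the configuration or brick state is in question - shown here for the homegfs volume from this thread:

    gluster volume info homegfs
    gluster volume status homegfs

The second command is worth capturing around the time of a failure, since it shows which brick processes were actually online.)
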
> > >>>>>>>>
> > >>>>>>>> ------ Original Message ------
> > >>>>>>>> From: "Xavier Hernandez" <xhernandez at datalab.es>
> > >>>>>>>> To: "David F. Robinson" <david.robinson at corvidtec.com>; "Benjamin Turner" <bennyturns at gmail.com>
> > >>>>>>>> Cc: "gluster-users at gluster.org" <gluster-users at gluster.org>; "Gluster Devel" <gluster-devel at gluster.org>
> > >>>>>>>> Sent: 2/4/2015 6:03:45 AM
> > >>>>>>>> Subject: Re: [Gluster-devel] missing files
> > >>>>>>>>
> > >>>>>>>>>> On 02/04/2015 01:30 AM, David F. Robinson wrote:
> > >>>>>>>>>> Sorry.  Thought about this a little more.  I should have been
> > >>>>>>>>>> clearer.  The files were on both bricks of the replica, not just
> > >>>>>>>>>> one side.  So, both bricks had to have been up... The
> > >>>>>>>>>> files/directories just don't show up on the mount.
> > >>>>>>>>>> I was reading and saw a related bug
> > >>>>>>>>>> (https://bugzilla.redhat.com/show_bug.cgi?id=1159484).  I saw it
> > >>>>>>>>>> suggested to run:
> > >>>>>>>>>>     find <mount> -d -exec getfattr -h -n trusted.ec.heal {} \;
> > >>>>>>>>>
> > >>>>>>>>> This command is specific to a dispersed volume.  It won't do
> > >>>>>>>>> anything (aside from the error you are seeing) on a replicated
> > >>>>>>>>> volume.
> > >>>>>>>>>
> > >>>>>>>>> I think you are using a replicated volume, right?
> > >>>>>>>>>
> > >>>>>>>>> In this case I'm not sure what can be happening.  Is your volume
> > >>>>>>>>> a pure replicated one or a distributed-replicated one?  On a pure
> > >>>>>>>>> replicated volume it doesn't make sense that some entries do not
> > >>>>>>>>> show up in an 'ls' when the file is in both replicas (at least
> > >>>>>>>>> without any error message in the logs).  On a
> > >>>>>>>>> distributed-replicated volume it could be caused by some problem
> > >>>>>>>>> while combining the contents of each replica set.
> > >>>>>>>>>
> > >>>>>>>>> What's the configuration of your volume?
> > >>>>>>>>>
> > >>>>>>>>> Xavi
> > >>>>>>>>>
> > >>>>>>>>>> I get a bunch of errors for operation not supported:
> > >>>>>>>>>> [root at gfs02a homegfs]# find wks_backup -d -exec getfattr -h -n trusted.ec.heal {} \;
> > >>>>>>>>>> find: warning: the -d option is deprecated; please use -depth
> > >>>>>>>>>> instead, because the latter is a POSIX-compliant feature.
> > >>>>>>>>>> wks_backup/homer_backup/backup: trusted.ec.heal: Operation not supported
> > >>>>>>>>>> wks_backup/homer_backup/logs/2014_05_20.log: trusted.ec.heal: Operation not supported
> > >>>>>>>>>> wks_backup/homer_backup/logs/2014_05_21.log: trusted.ec.heal: Operation not supported
> > >>>>>>>>>> wks_backup/homer_backup/logs/2014_05_18.log: trusted.ec.heal: Operation not supported
> > >>>>>>>>>> wks_backup/homer_backup/logs/2014_05_19.log: trusted.ec.heal: Operation not supported
> > >>>>>>>>>> wks_backup/homer_backup/logs/2014_05_22.log: trusted.ec.heal: Operation not supported
> > >>>>>>>>>> wks_backup/homer_backup/logs: trusted.ec.heal: Operation not supported
> > >>>>>>>>>> wks_backup/homer_backup: trusted.ec.heal: Operation not supported
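
(On a replicated volume the equivalent health check is the AFR changelog xattrs on the bricks, rather than trusted.ec.heal.  A minimal sketch - the path below just combines a brick path and a directory from earlier in this thread as an example:

    # run against a brick path, not the FUSE mount; non-zero trusted.afr.* values indicate pending heals
    getfattr -d -m . -e hex /data/brick01a/homegfs/wks_backup/homer_backup/backup

Entries whose trusted.afr.homegfs-client-* counters are all zero on both bricks of a replica pair have nothing left to heal.)
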
> > >>>>>>>>>>
> > >>>>>>>>>> ------ Original Message ------
> > >>>>>>>>>> From: "Benjamin Turner" <bennyturns at gmail.com>
> > >>>>>>>>>> To: "David F. Robinson" <david.robinson at corvidtec.com>
> > >>>>>>>>>> Cc: "Gluster Devel" <gluster-devel at gluster.org>; "gluster-users at gluster.org" <gluster-users at gluster.org>
> > >>>>>>>>>> Sent: 2/3/2015 7:12:34 PM
> > >>>>>>>>>> Subject: Re: [Gluster-devel] missing files
> > >>>>>>>>>>
> > >>>>>>>>>>> It sounds to me like the files were only copied to one replica,
> > >>>>>>>>>>> weren't there for the initial ls which triggered a self heal,
> > >>>>>>>>>>> and were there for the last ls because they were healed.  Is
> > >>>>>>>>>>> there any chance that one of the replicas was down during the
> > >>>>>>>>>>> rsync?  It could be that you lost a brick during the copy or
> > >>>>>>>>>>> something like that.  To confirm, I would look for disconnects
> > >>>>>>>>>>> in the brick logs as well as check glustershd.log to verify the
> > >>>>>>>>>>> missing files were actually healed.
> > >>>>>>>>>>>
> > >>>>>>>>>>> -b
> > >>>>>>>>>>>
> > >>>>>>>>>>> On Tue, Feb 3, 2015 at 5:37 PM, David F. Robinson
> > >>>>>>>>>>> <david.robinson at corvidtec.com> wrote:
> > >>>>>>>>>>>
> > >>>>>>>>>>> I rsync'd 20 TB over to my gluster system and noticed that I
> > >>>>>>>>>>> had some directories missing even though the rsync completed
> > >>>>>>>>>>> normally.  The rsync logs showed that the missing files were
> > >>>>>>>>>>> transferred.
> > >>>>>>>>>>> I went to the bricks and did an 'ls -al
> > >>>>>>>>>>> /data/brick*/homegfs/dir/*' and the files were on the bricks.
> > >>>>>>>>>>> After I did this 'ls', the files then showed up on the FUSE
> > >>>>>>>>>>> mounts.
> > >>>>>>>>>>> 1) Why are the files hidden on the fuse mount?
> > >>>>>>>>>>> 2) Why does the ls make them show up on the FUSE mount?
> > >>>>>>>>>>> 3) How can I prevent this from happening again?
> > >>>>>>>>>>> Note, I also mounted the gluster volume using NFS and saw the
> > >>>>>>>>>>> same behavior.  The files/directories were not shown until I
> > >>>>>>>>>>> did the "ls" on the bricks.
> > >>>>>>>>>>> David
> > >>>>>>>>>>> ==============================
> > >>>>>>>>>>> David F. Robinson, Ph.D.
> > >>>>>>>>>>> President - Corvid Technologies
> > >>>>>>>>>>> 704.799.6944 x101 [office]
> > >>>>>>>>>>> 704.252.1310 [cell]
> > >>>>>>>>>>> 704.799.7974 [fax]
> > >>>>>>>>>>> David.Robinson at corvidtec.com
> > >>>>>>>>>>> http://www.corvidtechnologies.com
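
(On questions 2 and 3: the lookup an 'ls' performs is what schedules the heal, but the self-heal daemon can be driven directly instead of waiting for someone to stat the right directory.  A sketch, assuming a reasonably current 3.x CLI:

    # list files/gfids that still need healing, per brick
    gluster volume heal homegfs info
    # ask the self-heal daemon to crawl the volume and heal everything it finds
    gluster volume heal homegfs full

Running the "info" form after a large rsync is a cheap way to confirm nothing is still waiting on a heal before trusting the copy.)
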
I don't think I understood what you sent enough to give it a try.  I'll wait until it comes out in a beta or release version.

David

------ Original Message ------
From: "Ben Turner" <bturner at redhat.com>
To: "Justin Clift" <justin at gluster.org>; "David F. Robinson" <david.robinson at corvidtec.com>
Cc: "Benjamin Turner" <bennyturns at gmail.com>; gluster-users at gluster.org; "Gluster Devel" <gluster-devel at gluster.org>
Sent: 2/6/2015 3:33:42 PM
Subject: Re: [Gluster-devel] [Gluster-users] missing files

> ----- Original Message -----
>> From: "Justin Clift" <justin at gluster.org>
>> Sent: Friday, February 6, 2015 3:27:53 PM
>>
>> Which multi-threaded epoll code just landed in master?  Are you thinking
>> of this one?
>>
>> http://review.gluster.org/#/c/3842/
>>
>> If so, it's not in master yet. ;)
>
> Doh!  I just saw - "Required patches are all upstream now" and assumed
> they were merged.  I have been in class all week so I am not up2date
> with everything.  I gave instructions on compiling it from the gerrit
> patches + master so if David wants to give it a go he can.  Sorry for
> the confusion.
>
> -b
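
(For anyone who does want to try the epoll patches before packages exist, the usual Gerrit workflow is roughly the following.  This is only a sketch of the general pattern, not Ben's actual instructions: the anonymous fetch URL and the trailing patch-set number "/1" are assumptions, while change 3842 is the review Justin linked above:

    git clone https://github.com/gluster/glusterfs.git && cd glusterfs
    # refs/changes/<last two digits of change>/<change number>/<patch set>
    git fetch https://review.gluster.org/glusterfs refs/changes/42/3842/1 && git cherry-pick FETCH_HEAD
    ./autogen.sh && ./configure && make && sudo make install

Waiting for the beta/release packages, as David suggests, avoids all of this.)
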
On 6 Feb 2015, at 20:33, Ben Turner <bturner at redhat.com> wrote:
> Doh!  I just saw - "Required patches are all upstream now" and assumed
> they were merged.  I have been in class all week so I am not up2date
> with everything.  I gave instructions on compiling it from the gerrit
> patches + master so if David wants to give it a go he can.  Sorry for
> the confusion.

Vijay merged the code into master yesterday, so it shouldn't be too long
until we can get some RPMs created for people to test with (easily). :)

+ Justin

--
GlusterFS - http://www.gluster.org

An open source, distributed file system scaling to several
petabytes, and handling thousands of clients.

My personal twitter: twitter.com/realjustinclift
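
(Once a build or release with the multi-threaded epoll changes is in place, the knobs it exposes are the event-thread counts.  The option names below are the ones that later shipped in the 3.7-era releases, so treat them as an assumption against whatever build actually gets tested:

    gluster volume set homegfs client.event-threads 4
    gluster volume set homegfs server.event-threads 4

More epoll threads mainly help when a client or brick is CPU-bound handling many small operations - which is exactly the rsync pattern discussed earlier in this thread.)
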