Is the failure repeatable? With the same directories?

It's very weird that the directories appear on the volume when you do an
'ls' on the bricks. Could it be that you only did a single 'ls' on the
fuse mount, which did not show the directory? Is it possible that this
'ls' triggered a self-heal that repaired the problem, whatever it was,
and when you did another 'ls' on the fuse mount after the 'ls' on the
bricks, the directories were there?

The first 'ls' could have healed the files, so that the following 'ls'
on the bricks showed the files as if nothing were damaged. If that's the
case, it's possible that there were some disconnections during the copy.

Added Pranith because he knows the replication and self-heal details
better.

Xavi

On 02/04/2015 07:23 PM, David F. Robinson wrote:
> Distributed/replicated
>
> Volume Name: homegfs
> Type: Distributed-Replicate
> Volume ID: 1e32672a-f1b7-4b58-ba94-58c085e59071
> Status: Started
> Number of Bricks: 4 x 2 = 8
> Transport-type: tcp
> Bricks:
> Brick1: gfsib01a.corvidtec.com:/data/brick01a/homegfs
> Brick2: gfsib01b.corvidtec.com:/data/brick01b/homegfs
> Brick3: gfsib01a.corvidtec.com:/data/brick02a/homegfs
> Brick4: gfsib01b.corvidtec.com:/data/brick02b/homegfs
> Brick5: gfsib02a.corvidtec.com:/data/brick01a/homegfs
> Brick6: gfsib02b.corvidtec.com:/data/brick01b/homegfs
> Brick7: gfsib02a.corvidtec.com:/data/brick02a/homegfs
> Brick8: gfsib02b.corvidtec.com:/data/brick02b/homegfs
> Options Reconfigured:
> performance.io-thread-count: 32
> performance.cache-size: 128MB
> performance.write-behind-window-size: 128MB
> server.allow-insecure: on
> network.ping-timeout: 10
> storage.owner-gid: 100
> geo-replication.indexing: off
> geo-replication.ignore-pid-check: on
> changelog.changelog: on
> changelog.fsync-interval: 3
> changelog.rollover-time: 15
> server.manage-gids: on
>
>
> ------ Original Message ------
> From: "Xavier Hernandez" <xhernandez at datalab.es>
> To: "David F. Robinson" <david.robinson at corvidtec.com>; "Benjamin
> Turner" <bennyturns at gmail.com>
> Cc: "gluster-users at gluster.org" <gluster-users at gluster.org>; "Gluster
> Devel" <gluster-devel at gluster.org>
> Sent: 2/4/2015 6:03:45 AM
> Subject: Re: [Gluster-devel] missing files
>
>> On 02/04/2015 01:30 AM, David F. Robinson wrote:
>>> Sorry. Thought about this a little more. I should have been clearer.
>>> The files were on both bricks of the replica, not just one side. So,
>>> both bricks had to have been up... The files/directories just don't
>>> show up on the mount.
>>> I was reading and saw a related bug
>>> (https://bugzilla.redhat.com/show_bug.cgi?id=1159484). I saw it
>>> suggested to run:
>>> find <mount> -d -exec getfattr -h -n trusted.ec.heal {} \;
>>
>> This command is specific to a dispersed volume. It won't do anything
>> (aside from the error you are seeing) on a replicated volume.
>>
>> I think you are using a replicated volume, right?
>>
>> In this case I'm not sure what can be happening. Is your volume a pure
>> replicated one or a distributed-replicated one? On a pure replicated
>> volume it doesn't make sense that some entries do not show up in an
>> 'ls' when the file is in both replicas (at least without any error
>> message in the logs). On a distributed-replicated volume it could be
>> caused by some problem while combining the contents of each replica
>> set.
>>
>> What's the configuration of your volume?
>>
>> Xavi
>>
>>>
>>> I get a bunch of errors for operation not supported:
>>> [root at gfs02a homegfs]# find wks_backup -d -exec getfattr -h -n
>>> trusted.ec.heal {} \;
>>> find: warning: the -d option is deprecated; please use -depth instead,
>>> because the latter is a POSIX-compliant feature.
>>> wks_backup/homer_backup/backup: trusted.ec.heal: Operation not supported
>>> wks_backup/homer_backup/logs/2014_05_20.log: trusted.ec.heal: Operation not supported
>>> wks_backup/homer_backup/logs/2014_05_21.log: trusted.ec.heal: Operation not supported
>>> wks_backup/homer_backup/logs/2014_05_18.log: trusted.ec.heal: Operation not supported
>>> wks_backup/homer_backup/logs/2014_05_19.log: trusted.ec.heal: Operation not supported
>>> wks_backup/homer_backup/logs/2014_05_22.log: trusted.ec.heal: Operation not supported
>>> wks_backup/homer_backup/logs: trusted.ec.heal: Operation not supported
>>> wks_backup/homer_backup: trusted.ec.heal: Operation not supported
>>>
>>> ------ Original Message ------
>>> From: "Benjamin Turner" <bennyturns at gmail.com>
>>> To: "David F. Robinson" <david.robinson at corvidtec.com>
>>> Cc: "Gluster Devel" <gluster-devel at gluster.org>;
>>> "gluster-users at gluster.org" <gluster-users at gluster.org>
>>> Sent: 2/3/2015 7:12:34 PM
>>> Subject: Re: [Gluster-devel] missing files
>>>> It sounds to me like the files were only copied to one replica,
>>>> weren't there for the initial ls which triggered a self-heal, and
>>>> were there for the last ls because they were healed. Is there any
>>>> chance that one of the replicas was down during the rsync? It could
>>>> be that you lost a brick during the copy or something like that. To
>>>> confirm I would look for disconnects in the brick logs as well as
>>>> check glustershd.log to verify the missing files were actually
>>>> healed.
>>>>
>>>> -b
>>>>
>>>> On Tue, Feb 3, 2015 at 5:37 PM, David F. Robinson
>>>> <david.robinson at corvidtec.com> wrote:
>>>>
>>>>     I rsync'd 20-TB over to my gluster system and noticed that I had
>>>>     some directories missing even though the rsync completed normally.
>>>>     The rsync logs showed that the missing files were transferred.
>>>>     I went to the bricks and did an 'ls -al /data/brick*/homegfs/dir/*';
>>>>     the files were on the bricks. After I did this 'ls', the files
>>>>     then showed up on the FUSE mounts.
>>>>     1) Why are the files hidden on the fuse mount?
>>>>     2) Why does the ls make them show up on the FUSE mount?
>>>>     3) How can I prevent this from happening again?
>>>>     Note, I also mounted the gluster volume using NFS and saw the same
>>>>     behavior. The files/directories were not shown until I did the
>>>>     "ls" on the bricks.
>>>>     David
>>>>     ==============================
>>>>     David F. Robinson, Ph.D.
>>>>     President - Corvid Technologies
>>>>     704.799.6944 x101 [office]
>>>>     704.252.1310 [cell]
>>>>     704.799.7974 [fax]
>>>>     David.Robinson at corvidtec.com
>>>>     http://www.corvidtechnologies.com
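A quick way to act on the suggestions above (check whether a brick dropped
out during the rsync and whether self-heal is still pending) is to query the
heal state and scan the server logs. This is an illustrative sketch rather
than something run in the thread; it assumes a GlusterFS 3.x installation
with the default log locations and uses the volume name homegfs from the
configuration quoted above.

    # Run on any server in the trusted pool: list entries still pending heal
    gluster volume heal homegfs info

    # Look for brick/client disconnects around the time of the rsync
    grep -i disconnect /var/log/glusterfs/bricks/*.log

    # Check the self-heal daemon log for files that were healed afterwards
    tail -n 100 /var/log/glusterfs/glustershd.log

If the disconnect timestamps line up with the rsync window, Benjamin's
explanation (a brick briefly down, entries healed by the later lookups)
would fit.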
Pranith Kumar Karampuri
2015-Feb-05 10:18 UTC
[Gluster-users] [Gluster-devel] missing files
I believe David already fixed this. I hope this is the same permissions
issue he told us about.

Pranith

On 02/05/2015 03:44 PM, Xavier Hernandez wrote:
> Is the failure repeatable? With the same directories?
Not repeatable. Once it shows up, it stays there.

I sent some other strange behavior I am seeing to Pranith earlier this
evening. Attached below...

David

Another issue I am having that might be related is that I cannot delete
some directories. It complains that the directories are not empty. But
when I list them out, there is nothing there. However, if I know the name
of the directory, I can cd into it and see the files.

[root at gfs01a Phase_1_SOCOM14-003_adv_armor]# pwd
/homegfs/documentation/programs/OLD_PROGRAMS/SBIR_TOM/Phase_1_SOCOM14-003_adv_armor
[root at gfs01a Phase_1_SOCOM14-003_adv_armor]# ls -al
total 0
drwxrws--x 7 root root 449 Feb 4 18:12 .
drwxrwx--- 3 root root 200 Feb 4 18:19 ..
drwxrws--- 3 root root 41 Feb 4 18:12 References
drwxrws--x 4 root root 54 Feb 4 18:12 Testing
drwxrws--- 4 root root 51 Feb 4 18:12 Velodyne
drwxrws--x 4 root root 38 Feb 4 18:12 progress_reports
[root at gfs01a Phase_1_SOCOM14-003_adv_armor]# rm -rf *
rm: cannot remove `References': Directory not empty
rm: cannot remove `Testing': Directory not empty
rm: cannot remove `Velodyne': Directory not empty
rm: cannot remove `progress_reports/pr2': Directory not empty
rm: cannot remove `progress_reports/pr3': Directory not empty
[root at gfs01a Phase_1_SOCOM14-003_adv_armor]# ls -alR
total 0
drwxrws--x 6 root root 449 Feb 4 18:12 .
drwxrwx--- 3 root root 200 Feb 4 18:19 ..
drwxrws--- 3 root root 41 Feb 4 18:12 References     *** Note that there is nothing in this References directory.
drwxrws--x 4 root root 54 Feb 4 18:12 Testing
drwxrws--- 4 root root 51 Feb 4 18:12 Velodyne
drwxrws--x 4 root root 38 Feb 4 18:12 progress_reports

However, from the bricks (see listings below), there are other directories
that are not shown. For example, the References directory contains the
USSOCOM_OPAQUE_ARMOR directory on the brick, but it doesn't show up on the
volume.

[root at gfs01a USSOCOM_OPAQUE_ARMOR]# pwd
/homegfs/documentation/programs/OLD_PROGRAMS/SBIR_TOM/Phase_1_SOCOM14-003_adv_armor
[root at gfs01a Phase_1_SOCOM14-003_adv_armor]# cd References/
[root at gfs01a References]# ls -al     *** There is nothing shown in the References directory
total 0
drwxrws--- 3 root root 133 Feb 4 18:12 .
drwxrws--x 7 root root 449 Feb 4 18:12 ..
[root at gfs01a References]# cd USSOCOM_OPAQUE_ARMOR     *** From the brick listing, I knew the directory name. Even though it isn't shown, I can cd to it and see the files.
[root at gfs01a USSOCOM_OPAQUE_ARMOR]# ls -al
total 6787
drwxrws--- 2 streadway sbir 244 Feb 5 21:28 .
drwxrws--- 3 root root 164 Feb 5 21:28 ..
-rwxrw---- 1 streadway sbir 42440 Jun 19 2014 ARMOR PACKAGES.one
-rwxrw---- 1 streadway sbir 17248 Jun 19 2014 COMPARISON OF SOLUTIONS.one
-rwxrw---- 1 streadway sbir 38184 Jun 19 2014 CURRENT STANDARD ARMORING.one
-rwxrw---- 1 sgilbert sbir 2974120 Jan 22 09:15 FEASABILITY STUDY.docx
-rwxrw---- 1 streadway sbir 3826704 Jan 21 14:57 FEASABILITY STUDY.one
-rwxrw---- 1 streadway sbir 49736 Jan 21 13:18 GIVEN TRADE SPACE.one

The recursive listing (ls -alR) from each of the bricks shows that there
are files/directories that do not show up on the /homegfs volume.

[root at gfs01a Phase_1_SOCOM14-003_adv_armor]# ls -alR /data/brick0*/homegfs/documentation/programs/OLD_PROGRAMS/SBIR_TOM/Phase_1_SOCOM14-003_adv_armor/References
/data/brick01a/homegfs/documentation/programs/OLD_PROGRAMS/SBIR_TOM/Phase_1_SOCOM14-003_adv_armor/References:
total 0
drwxrws--- 3 root root 41 Feb 4 18:12 .
drwxrws--x 7 root root 118 Feb 4 18:12 ..
drwxrws--- 2 streadway sbir 75 Jan 23 14:46 USSOCOM_OPAQUE_ARMOR

/data/brick01a/homegfs/documentation/programs/OLD_PROGRAMS/SBIR_TOM/Phase_1_SOCOM14-003_adv_armor/References/USSOCOM_OPAQUE_ARMOR:
total 6648
drwxrws--- 2 streadway sbir 75 Jan 23 14:46 .
drwxrws--- 3 root root 41 Feb 4 18:12 ..
-rwxrw---- 2 sgilbert sbir 2974120 Jan 22 09:15 FEASABILITY STUDY.docx
-rwxrw---- 2 streadway sbir 3826704 Jan 21 14:57 FEASABILITY STUDY.one

/data/brick02a/homegfs/documentation/programs/OLD_PROGRAMS/SBIR_TOM/Phase_1_SOCOM14-003_adv_armor/References:
total 0
drwxrws--- 2 root root 10 Feb 4 18:12 .
drwxrws--x 6 root root 95 Feb 4 18:12 ..

[root at gfs01b ~]# ls -alR /data/brick0*/homegfs/documentation/programs/OLD_PROGRAMS/SBIR_TOM/Phase_1_SOCOM14-003_adv_armor/References
/data/brick01b/homegfs/documentation/programs/OLD_PROGRAMS/SBIR_TOM/Phase_1_SOCOM14-003_adv_armor/References:
total 0
drwxrws--- 3 root root 41 Feb 4 18:12 .
drwxrws--x 7 root root 118 Feb 4 18:12 ..
drwxrws--- 2 streadway sbir 75 Jan 23 14:46 USSOCOM_OPAQUE_ARMOR

/data/brick01b/homegfs/documentation/programs/OLD_PROGRAMS/SBIR_TOM/Phase_1_SOCOM14-003_adv_armor/References/USSOCOM_OPAQUE_ARMOR:
total 6648
drwxrws--- 2 streadway sbir 75 Jan 23 14:46 .
drwxrws--- 3 root root 41 Feb 4 18:12 ..
-rwxrw---- 2 sgilbert sbir 2974120 Jan 22 09:15 FEASABILITY STUDY.docx
-rwxrw---- 2 streadway sbir 3826704 Jan 21 14:57 FEASABILITY STUDY.one

/data/brick02b/homegfs/documentation/programs/OLD_PROGRAMS/SBIR_TOM/Phase_1_SOCOM14-003_adv_armor/References:
total 0
drwxrws--- 2 root root 10 Feb 4 18:12 .
drwxrws--x 6 root root 95 Feb 4 18:12 ..

[root at gfs02a ~]# ls -alR /data/brick0*/homegfs/documentation/programs/OLD_PROGRAMS/SBIR_TOM/Phase_1_SOCOM14-003_adv_armor/References
/data/brick01a/homegfs/documentation/programs/OLD_PROGRAMS/SBIR_TOM/Phase_1_SOCOM14-003_adv_armor/References:
total 0
drwxrws--- 3 root root 41 Feb 4 18:12 .
drwxrws--x 7 root root 118 Feb 4 18:12 ..
drwxrws--- 2 streadway sbir 80 Jan 23 14:46 USSOCOM_OPAQUE_ARMOR

/data/brick01a/homegfs/documentation/programs/OLD_PROGRAMS/SBIR_TOM/Phase_1_SOCOM14-003_adv_armor/References/USSOCOM_OPAQUE_ARMOR:
total 72
drwxrws--- 2 streadway sbir 80 Jan 23 14:46 .
drwxrws--- 3 root root 41 Feb 4 18:12 ..
-rwxrw---- 2 streadway sbir 17248 Jun 19 2014 COMPARISON OF SOLUTIONS.one
-rwxrw---- 2 streadway sbir 49736 Jan 21 13:18 GIVEN TRADE SPACE.one

/data/brick02a/homegfs/documentation/programs/OLD_PROGRAMS/SBIR_TOM/Phase_1_SOCOM14-003_adv_armor/References:
total 0
drwxrws--- 3 root root 41 Feb 4 18:12 .
drwxrws--x 7 root root 118 Feb 4 18:12 ..
drwxrws--- 2 streadway sbir 79 Jan 23 14:46 USSOCOM_OPAQUE_ARMOR

/data/brick02a/homegfs/documentation/programs/OLD_PROGRAMS/SBIR_TOM/Phase_1_SOCOM14-003_adv_armor/References/USSOCOM_OPAQUE_ARMOR:
total 84
drwxrws--- 2 streadway sbir 79 Jan 23 14:46 .
drwxrws--- 3 root root 41 Feb 4 18:12 ..
-rwxrw---- 2 streadway sbir 42440 Jun 19 2014 ARMOR PACKAGES.one
-rwxrw---- 2 streadway sbir 38184 Jun 19 2014 CURRENT STANDARD ARMORING.one

[root at gfs02b ~]# ls -alR /data/brick0*/homegfs/documentation/programs/OLD_PROGRAMS/SBIR_TOM/Phase_1_SOCOM14-003_adv_armor/References
/data/brick01b/homegfs/documentation/programs/OLD_PROGRAMS/SBIR_TOM/Phase_1_SOCOM14-003_adv_armor/References:
total 0
drwxrws--- 3 root root 41 Feb 4 18:12 .
drwxrws--x 7 root root 118 Feb 4 18:12 ..
drwxrws--- 2 streadway sbir 80 Jan 23 14:46 USSOCOM_OPAQUE_ARMOR

/data/brick01b/homegfs/documentation/programs/OLD_PROGRAMS/SBIR_TOM/Phase_1_SOCOM14-003_adv_armor/References/USSOCOM_OPAQUE_ARMOR:
total 72
drwxrws--- 2 streadway sbir 80 Jan 23 14:46 .
drwxrws--- 3 root root 41 Feb 4 18:12 ..
-rwxrw---- 2 streadway sbir 17248 Jun 19 2014 COMPARISON OF SOLUTIONS.one
-rwxrw---- 2 streadway sbir 49736 Jan 21 13:18 GIVEN TRADE SPACE.one

/data/brick02b/homegfs/documentation/programs/OLD_PROGRAMS/SBIR_TOM/Phase_1_SOCOM14-003_adv_armor/References:
total 0
drwxrws--- 3 root root 41 Feb 4 18:12 .
drwxrws--x 7 root root 118 Feb 4 18:12 ..
drwxrws--- 2 streadway sbir 79 Jan 23 14:46 USSOCOM_OPAQUE_ARMOR

/data/brick02b/homegfs/documentation/programs/OLD_PROGRAMS/SBIR_TOM/Phase_1_SOCOM14-003_adv_armor/References/USSOCOM_OPAQUE_ARMOR:
total 84
drwxrws--- 2 streadway sbir 79 Jan 23 14:46 .
drwxrws--- 3 root root 41 Feb 4 18:12 ..
-rwxrw---- 2 streadway sbir 42440 Jun 19 2014 ARMOR PACKAGES.one
-rwxrw---- 2 streadway sbir 38184 Jun 19 2014 CURRENT STANDARD ARMORING.one

------ Original Message ------
From: "Xavier Hernandez" <xhernandez at datalab.es>
To: "David F. Robinson" <david.robinson at corvidtec.com>; "Benjamin
Turner" <bennyturns at gmail.com>; "Pranith Kumar Karampuri"
<pkarampu at redhat.com>
Cc: "gluster-users at gluster.org" <gluster-users at gluster.org>; "Gluster
Devel" <gluster-devel at gluster.org>
Sent: 2/5/2015 5:14:22 AM
Subject: Re: [Gluster-devel] missing files

> Is the failure repeatable? With the same directories?
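For the invisible References/USSOCOM_OPAQUE_ARMOR entries and the
"Directory not empty" failures above, a common first diagnostic step is to
compare the directory's GlusterFS extended attributes on every brick and
then force a named lookup from a client. The commands below are an
illustrative sketch and were not run in the thread; they assume root access
on the servers, reuse the brick and mount paths from the listings above,
and assume that a lookup from the FUSE mount is enough to trigger the entry
heal (which matches the behaviour seen earlier after the 'ls' on the
bricks).

    # On each server: dump the gluster xattrs (trusted.gfid, trusted.afr.*,
    # trusted.glusterfs.dht) for the directory on every brick. A gfid mismatch
    # between bricks, or a directory missing on one distribute subvolume, is a
    # typical reason entries fail to show up on the mount.
    getfattr -m . -d -e hex \
        /data/brick0*/homegfs/documentation/programs/OLD_PROGRAMS/SBIR_TOM/Phase_1_SOCOM14-003_adv_armor/References

    # From a FUSE client: name the hidden entry explicitly, then re-check heal state
    stat "/homegfs/documentation/programs/OLD_PROGRAMS/SBIR_TOM/Phase_1_SOCOM14-003_adv_armor/References/USSOCOM_OPAQUE_ARMOR"
    gluster volume heal homegfs info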