Hi again folks,

Sorry to bug you with another newbie question: with regard to file replication, does GlusterFS repair a damaged file *only* when someone tries to read it?

For example, let's say the filesystem is supposed to maintain 3 copies of a file and one of the copies is lost or removed from the system for whatever reason. Will the missing copy be created only the first time someone reads the file?

Thanks again for your help!
At 11:52 AM 1/14/2009, Gluster Novice wrote:

> Sorry to bug you with another newbie question: with regard to file
> replication, does GlusterFS repair a damaged file *only* when someone
> tries to read it?

Yes (or, I believe, when the directory containing the missing file is read).

> For example, let's say the filesystem is supposed to maintain 3 copies
> of a file and one of the copies is lost or removed from the system for
> whatever reason. Will the missing copy be created only the first time
> someone reads the file?

If you need to ensure synchronicity after a failure, there is a find command in the wiki that will force auto-healing of the whole filesystem:

    find . -exec head -1 {} \; > /dev/null

(I think that's it; it may not be syntactically valid.)

> Thanks again for your help!
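For anyone wanting to run that full-filesystem heal trigger, here is a minimal sketch, assuming the replicated volume is mounted via the Gluster client at /mnt/gluster (the mount point is an assumption; substitute your own). Reading the first line of every regular file forces a lookup and read on each one, which makes AFR self-heal any missing or stale copy:

    # Sketch only: /mnt/gluster is a hypothetical client mount point.
    cd /mnt/gluster
    # -type f skips directories; the redirect just discards the output.
    find . -type f -exec head -n 1 {} \; > /dev/null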
In the latest AFR code, the healing code is in the lookup call flow, not in open. Basically, a 'lookup' is done just before any kind of access to a file (stat, open, chmod, chown, rm, rename), so AFR heals the file when you try to access it. An "ls -lR", for example, will trigger a heal of the entire directory structure.

Krishna

On Thu, Jan 15, 2009 at 9:35 AM, Keith Freedman <freedman at freeformit.com> wrote:
> At 11:52 AM 1/14/2009, Gluster Novice wrote:
>> With regard to file replication, does GlusterFS repair a damaged
>> file *only* when someone tries to read it?
>
> Yes (or, I believe, when the directory containing the missing file is read).
>
> If you need to ensure synchronicity after a failure, there is a find
> command in the wiki that will force auto-healing of the whole filesystem:
>
>     find . -exec head -1 {} \; > /dev/null
>
> (I think that's it; it may not be syntactically valid.)
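A minimal sketch of the behavior Krishna describes, assuming a client mount at /mnt/gluster (both the mount point and the file path below are hypothetical): any operation that issues a lookup is enough, so a recursive listing heals the whole tree while a single stat heals just one file.

    # ls -lR stats every entry; each stat issues a lookup, and each lookup
    # triggers AFR self-heal wherever a copy is missing or stale.
    ls -lR /mnt/gluster > /dev/null

    # A single file can be healed on its own by any access, e.g. a stat.
    stat /mnt/gluster/path/to/file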
At 03:22 AM 1/16/2009, Krishna Srinivas wrote:

> In the latest AFR code, the healing code is in the lookup call flow, not
> in open. Basically, a 'lookup' is done just before any kind of access to
> a file (stat, open, chmod, chown, rm, rename), so AFR heals the file when
> you try to access it. An "ls -lR", for example, will trigger a heal of
> the entire directory structure.

This behavior is extremely handy from the perspective of data integrity; however, it's disastrous for I/O performance from an application's point of view. The idea that an application should have to wait while files completely unrelated to its needs are being auto-healed is an unnecessary one. There's got to be a way to handle this. The replication should happen in the background: gluster should be smart enough to first auto-heal the file in question, return control to the requesting process, and then continue healing in the background. In the case of a directory, the listing can be returned without actually copying over all the files therein; that should be a relatively quick operation.

Let's take the example case of a repository for large video files, each one being 1GB. I have a server down for a few hours, during which time 300 of these files have been updated. Now all I need to know is which ones changed recently (say, ls -alrtu | tail -5). Do I block waiting for 300GB of data to be transferred when I only need a directory listing? Similarly, if I get a request for just one of those files, do I have to wait for 300GB of data to move around before I can get access to the only 1GB that matters at that time?

If this is only temporary until the new healing methodology previously discussed on the list is in place, I suppose it's liveable, but if this is the way it's going to continue to work, I can't imagine it being useful in any practical real-world situation with either large directories or large files and a normal level of file updates/modifications.

Keith
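For the "which files changed recently" example above, a rough sketch (the /mnt/gluster mount point and the 4-hour window are assumptions). Note that, with the lookup-time healing described earlier, every entry these commands stat can still block on its own heal before the listing returns:

    # Five most recently touched entries, as in the example above
    # (/mnt/gluster is a hypothetical mount point).
    ls -alrtu /mnt/gluster | tail -5

    # Files modified in the last 4 hours (240 minutes), without reading
    # their contents; the stat alone may still trigger a per-file heal.
    find /mnt/gluster -type f -mmin -240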
We will be working on background file sync, and will at least give it as a configurable option.

Avati

On Jan 16, 2009 3:41 AM, "Keith Freedman" <freedman at freeformit.com> wrote:

> At 03:22 AM 1/16/2009, Krishna Srinivas wrote:
>> In the latest AFR code, the healing code is in the looku...
>
> This behavior is extremely handy from the perspective of data integrity;
> however, it's disastrous for I/O performance from an application's point
> of view. ...