Joe Landman
2011-May-02 22:08 UTC
[Gluster-users] Hopefully answering some mirroring questions asked here and offline
Hi folks
We've fielded a number of mirroring questions offline, and have watched
or taken part in discussions here. I thought it was important to make
sure some of these are answered and searchable on the lists.
One major question that kept arising was as follows:
q: If I have a large image file (say, a VM vmdk or similar format) on a
mirrored volume, will one small change of a few bytes result in a resync
of the entire file?
a: No.
To test this, we created a 20GB file on a mirror volume.
root@metal:/local2/home/landman# ls -alF /mirror1gfs/big.file
-rw-r--r-- 1 root root 21474836490 2011-05-02 12:44 /mirror1gfs/big.file
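(The post doesn't show how the file was created; for anyone reproducing
this, a sparse test file of that size can be made with something like

truncate -s 20G /mirror1gfs/big.file

though any way of producing a 20GB file will do.)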
Then, using the following quick-and-dirty Perl script (app.pl), we
appended about 10-20 bytes to the file.
#!/usr/bin/env perl
# app.pl -- append a short marker line to the end of a file
use strict;
use warnings;

my $file = shift or die "usage: $0 file\n";
open(my $fh, '>>', $file) or die "open $file: $!";
print $fh "end $$\n";    # tag the append with our PID
close($fh);
root@metal:/local2/home/landman# ./app.pl /mirror1gfs/big.file
Then I had to write a quick-and-dirty tail replacement, as I discovered
that tail doesn't seek here ... (yeah, it started reading every 'line'
of that file ...)
#!/usr/bin/env perl
# tail.pl -- print the last 200 bytes of a file without reading it all
use strict;
use warnings;

my $file = shift or die "usage: $0 file\n";
open(my $fh, '<', $file) or die "open $file: $!";
seek($fh, -200, 2) or die "seek: $!";    # whence 2 = SEEK_END
read($fh, my $buf, 200);
printf "buffer: '%s'\n", $buf;
close($fh);
root@metal:/local2/home/landman# ./tail.pl /mirror1gfs/big.file
buffer: 'end 19362'
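(In hindsight, tail's byte mode would likely have avoided the
line-by-line read as well, e.g. tail -c 200 /mirror1gfs/big.file,
though I did not test that here.)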
While running app.pl, I did not see any massive resyncs; I had dstat
running in another window to watch the traffic.
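(The exact dstat invocation isn't in the original; something like

dstat -d -n 1

gives per-second disk and network throughput, which is enough to spot a
whole-file resync going by.)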
You might say that this is irrelevant, since we only appended, and
appends could be special-cased.
So I wrote a random updater that writes at random spots throughout the
large file (somewhat like the I/O pattern of a VM vmdk and similar files).
#!/usr/bin/env perl
# randupd.pl -- write a short marker at a random offset in a file
use strict;
use warnings;

my $file = shift or die "usage: $0 file\n";
my @stat = stat($file) or die "stat $file: $!";
my $loc  = int(rand($stat[7]));    # $stat[7] is the file size in bytes
# '+<' opens read/write without truncating; the '>>+' in the original
# is not a valid open mode, and append mode would pin every write to
# end-of-file regardless of seek()
open(my $fh, '+<', $file) or die "open $file: $!";
seek($fh, $loc, 0) or die "seek: $!";
print $fh "I was here!!!";
printf "loc: %i\n", $loc;
close($fh);
root@metal:/local2/home/landman# ./randupd.pl /mirror1gfs/big.file
loc: 17598205436
root@metal:/local2/home/landman# ./randupd.pl /mirror1gfs/big.file
loc: 16468787891
root@metal:/local2/home/landman# ./randupd.pl /mirror1gfs/big.file
loc: 9271612568
root@metal:/local2/home/landman# ./randupd.pl /mirror1gfs/big.file
loc: 1356667302
root@metal:/local2/home/landman# ./randupd.pl /mirror1gfs/big.file
loc: 12365324308
root@metal:/local2/home/landman# ./randupd.pl /mirror1gfs/big.file
loc: 15654714313
root@metal:/local2/home/landman# ./randupd.pl /mirror1gfs/big.file
loc: 10127739152
root@metal:/local2/home/landman# ./randupd.pl /mirror1gfs/big.file
loc: 10259920623
And again, no massive resyncs.
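As a sanity check that the writes actually landed where reported (this
helper isn't part of the original post; chkupd.pl is hypothetical, in
the same style as the scripts above):

#!/usr/bin/env perl
# chkupd.pl -- hypothetical helper: read back the marker at a given offset
use strict;
use warnings;

my ($file, $loc) = @ARGV;
die "usage: $0 file offset\n" unless defined $loc;
open(my $fh, '<', $file) or die "open $file: $!";
seek($fh, $loc, 0) or die "seek: $!";
read($fh, my $buf, 13);    # 13 = length of "I was here!!!"
printf "at %i: '%s'\n", $loc, $buf;
close($fh);

e.g. ./chkupd.pl /mirror1gfs/big.file 17598205436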
So I think it's fairly safe to say that the concern over massive resyncs
for small updates is not something we see in the field.
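(For anyone who wants independent confirmation, replicate keeps its
pending-operation counters in trusted.afr changelog xattrs on the
backend bricks, so you can watch the bookkeeping directly; the brick
path below is illustrative:

getfattr -d -m . -e hex /data/brick1/big.file

A file with no heal pending shows all-zero changelog values.)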
Regards,
Joe
--
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics Inc.
email: landman@scalableinformatics.com
web : http://scalableinformatics.com
http://scalableinformatics.com/sicluster
phone: +1 734 786 8423 x121
fax : +1 866 888 3112
cell : +1 734 612 4615
Anand Avati
2011-May-03 05:06 UTC
[Gluster-users] Hopefully answering some mirroring questions asked here and offline
Thanks for the post, Joe. We introduced the "diff"-based self-heal algorithm in the 3.1 release.

Avati
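(For readers who want to check or control this on their own volumes, the
algorithm is exposed as a volume option; the option name below is from
memory, so verify it against your release:

gluster volume set <volname> cluster.data-self-heal-algorithm diff

where <volname> is your replicated volume.)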