thr3ads.net - Btrfs devel - Various Questions [Jan 2011]

Carl Cook

2011-Jan-07 17:15 UTC

Various Questions

On Fri 07 January 2011 08:14:17 Hubert Kario wrote:
> I'd suggest at least
> mkfs.btrfs -m raid1 -d raid0 /dev/sdc /dev/sdd
> if you really want raid0

I don't fully understand -m or -d.  Why would this make a truer raid0
than with no options?

Is it necessary to use fdisk on new drives in creating a BTRFS multi-drive
array?  Or is this all that's needed:
# mkfs.btrfs /dev/sdb /dev/sdc
# btrfs filesystem show

Is this related to 'subvolumes'?  The FAQ implies that a
subvolume is like a directory, but also like a partition.  What's the
rationale for being able to create a subvolume under a subvolume, as Hubert says
so he can "use the shadow_copy module for samba to publish the snapshots
to windows clients."  I don't have any Windows clients, but what
difference does his structure make?

I know that when using SATA+LVM you should turn off the writeback cache on the
drive, as it doesn't do cache flushing, and ensure NCQ is on.  But does this also
apply to a BTRFS array?  If so, is this done in rc.local with
hdparm -I /dev/sdb
hdparm -I /dev/sdc

How do you know what options to rsync are on by default?  I can't find
this anywhere.  For example, it seems to me that --perms -ogE --hard-links and
--delete-excluded should be on by default, for a true sync?

If using the  --numeric-ids switch for rsync, do you just have to manually make
sure the IDs and usernames are the same on source and destination machines?

For files that fail to transfer, wouldn't it be wise to use
--partial-dir=DIR to at least recover part of lost files?

The rsync man page says that rsync uses ssh by default, but is that the case?  I
think -e may be related to engaging ssh, but don't understand the
explanation.

So for my system where there is a backup server, I guess I run the rsync daemon
on the backup server which presents a port, then when the other systems decide
it's time for a backup (cron) they:
- stop mysql, dump the database somewhere, start mysql;
- connect to the backup server's rsync port and dump their data to
  (hopefully) some specific place there.
Right?

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

C Anthony Risinger

2011-Jan-07 17:37 UTC

Re: Various Questions

On Fri, Jan 7, 2011 at 11:15 AM, Carl Cook <CACook@quantum-sci.com> wrote:
>
> On Fri 07 January 2011 08:14:17 Hubert Kario wrote:
>> I'd suggest at least
>> mkfs.btrfs -m raid1 -d raid0 /dev/sdc /dev/sdd
>> if you really want raid0
>
> I don't fully understand -m or -d.  Why would this make a truer
> raid0 than with no options?

this will give you RAID0 for your data, but RAID1 for your metadata,
making it less likely that the FS itself gets corrupted, even though
you will lose some data in crash cases, if i understand correctly.
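
For illustration only, here is how the two switches line up with that explanation. This is a sketch using the device names from the thread; mkfs_cmd is a hypothetical helper that just prints the command, so nothing is formatted by running it.

```shell
#!/bin/sh
# -m picks the profile for metadata, -d the profile for file data.
# mkfs_cmd is a hypothetical helper: it prints the command, it does not run it.
mkfs_cmd() {
    meta_profile=$1
    data_profile=$2
    shift 2
    echo "mkfs.btrfs -m $meta_profile -d $data_profile $*"
}

# Metadata mirrored on both drives, data striped across them.
mkfs_cmd raid1 raid0 /dev/sdc /dev/sdd

# After creating and mounting the filesystem, the profiles in use can be
# checked with:  btrfs filesystem df /mnt/point   (mount point is a placeholder)
```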

> Is it necessary to use fdisk on new drives in creating a BTRFS multi-drive
> array?  Or is this all that's needed:
> # mkfs.btrfs /dev/sdb /dev/sdc
> # btrfs filesystem show

depends on whether you need /boot partitions or other partitions.
what you have works fine though.

> Is this related to 'subvolumes'?  The FAQ implies that a
> subvolume is like a directory, but also like a partition.  What's the
> rationale for being able to create a subvolume under a subvolume, as Hubert says
> so he can "use the shadow_copy module for samba to publish the snapshots
> to windows clients."  I don't have any windows clients, but what
> difference does his structure make?

just his preference to put it there... the snapshot of a snapshot can
go anywhere.  it doesn't have to reside under its "parent", the
parent was just used as a base, it's not bound to it in any way AFAIK.

> How do you know what options to rsync are on by default?  I can't
> find this anywhere.  For example, it seems to me that --perms -ogE --hard-links
> and --delete-excluded should be on by default, for a true sync?

the links and command Freddie Cash posted are a really good base to work from.

> So for my system where there is a backup server, I guess I run the rsync
> daemon on the backup server which presents a port, then when the other systems
> decide it's time for a backup (cron) they:
> - stop mysql, dump the database somewhere, start mysql;
> - connect to the backup server's rsync port and dump their data to
> (hopefully) some specific place there.
> Right?

you don't have to stop mysql, you just need to "freeze" any new,
incoming writes, and flush (ie. let finish) whatever is happening
right now.  this ensures mysql is _internally_ consistent on the disk.

see comment by Lloyd Standish here:

http://dev.mysql.com/doc/refman/5.1/en/backup-methods.html
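
A minimal sketch of dumping without stopping mysql, assuming mysqldump is the tool used for the dump. --single-transaction gives a consistent snapshot of InnoDB tables without blocking writers, while --lock-all-tables holds a global read lock for the whole dump, which non-transactional tables such as MyISAM need. dump_cmd and the output path are hypothetical, and the helper only prints the command.

```shell
#!/bin/sh
# dump_cmd is a hypothetical helper: it prints a mysqldump command line
# suited to the storage engine, without running it.
dump_cmd() {
    engine=$1
    outfile=$2
    case $engine in
        innodb) lock_opt=--single-transaction ;;  # consistent snapshot, writers not blocked
        *)      lock_opt=--lock-all-tables ;;     # global read lock for the whole dump
    esac
    echo "mysqldump $lock_opt --all-databases > $outfile"
}

dump_cmd innodb /var/backups/all-databases.sql
dump_cmd myisam /var/backups/all-databases.sql
```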

C Anthony

Freddie Cash

2011-Jan-07 17:41 UTC

Re: Various Questions

On Fri, Jan 7, 2011 at 9:15 AM, Carl Cook <CACook@quantum-sci.com> wrote:
> How do you know what options to rsync are on by default?  I can't
> find this anywhere.  For example, it seems to me that --perms -ogE --hard-links
> and --delete-excluded should be on by default, for a true sync?

Who cares which ones are on by default?  List the ones you want to use
on the command-line, every time.  That way, if the defaults change,
your setup won't.

> If using the --numeric-ids switch for rsync, do you just have to manually
> make sure the IDs and usernames are the same on source and destination machines?

You use the --numeric-ids switch so that it *doesn't* matter if the
IDs/usernames are the same.  It just sends the ID number on the wire.
Sure, if you do an ls on the backup box, the username will appear to
be messed up.  But if you compare the user ID assigned to the file
with the user ID in the backed-up etc/passwd file, they are correct.
Then, if you ever need to restore the HTPC from backups, the
etc/passwd file is transferred over, the user IDs are transferred
over, and when you do an ls on the HTPC, everything matches up
correctly.
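
A small sketch of why the names can look wrong while the IDs are right: the filesystem stores only the number, and the name is looked up in whatever passwd database the local machine has. It assumes GNU stat (the -c option) and uses a throwaway temp file.

```shell
#!/bin/sh
# The owner is stored as a number; the name is only a local lookup.
tmpfile=$(mktemp)
owner_uid=$(stat -c '%u' "$tmpfile")    # numeric owner as stored on disk
owner_name=$(stat -c '%U' "$tmpfile")   # name resolved via the local passwd database
echo "uid=$owner_uid name=$owner_name"
ls -ln "$tmpfile"                       # -n prints numeric IDs instead of names
rm -f "$tmpfile"
```

On the backup box, ls -ln shows the IDs exactly as rsync --numeric-ids wrote them, whatever names ls -l happens to map them to.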

> For files that fail to transfer, wouldn't it be wise to use
> --partial-dir=DIR to at least recover part of lost files?

Or, just run rsync again, if the connection is dropped.

> The rsync man page says that rsync uses ssh by default, but is that the
> case?  I think -e may be related to engaging ssh, but don't understand
> the explanation.

Does it matter what the default is, if you specify exactly how you
want it to work on the command-line?

> So for my system where there is a backup server, I guess I run the rsync
> daemon on the backup server which presents a port, then when the other systems
> decide it's time for a backup (cron) they:
> - stop mysql, dump the database somewhere, start mysql;
> - connect to the backup server's rsync port and dump their data to
> (hopefully) some specific place there.
> Right?

That's one way (push backups).  It works ok for small numbers of
systems being backed up.  But get above a handful of machines, and it
gets very hard to time everything so that you don't hammer the disks
on the backup server.

Pull backups (backups server does everything) work better, in my
experience.  Then you just script things up once, run 1 script, worry
about 1 schedule, and everything is stored on the backups server.  No
need to run rsync daemons everywhere, just run the rsync client, using
-e ssh, and let it do everything.

If you need it to run a script on the remote machine first, that's
easy enough to do:
  - ssh to remote system, run script to stop DBs, dump DBs, snapshot
    FS, whatever
  - then run rsync
  - ssh to remote system, run script to start DBs, delete snapshot, whatever

You're starting to over-think things.  Keep it simple, don't worry
about defaults, specify everything you want to do, and do it all from
the backups box.
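
The three steps above could be sketched roughly like this, run from the backups box. The function name and the two remote script paths are hypothetical placeholders; only the ssh and rsync invocations follow what is described in the thread.

```shell
#!/bin/sh
# Pull-backup sketch, meant to run on the backup server.
# The remote script paths are hypothetical placeholders.
pull_backup() {
    host=$1
    dest=$2
    # 1. prepare the remote machine (stop/dump DBs, snapshot FS, whatever)
    ssh "$host" /usr/local/sbin/pre-backup || return 1
    # 2. pull the data; the trailing slashes copy the directory contents
    status=0
    rsync --archive --hard-links --numeric-ids --delete -e ssh \
        "$host:/home/" "$dest/" || status=$?
    # 3. always undo the preparation, even if rsync failed
    ssh "$host" /usr/local/sbin/post-backup
    return $status
}

# Example call (host and destination taken from the thread):
#   pull_backup hex /media/backups/hex
```

Keeping the post-backup step unconditional means a failed transfer does not leave the databases stopped on the remote machine.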

-- 
Freddie Cash
fjwcash@gmail.com

Carl Cook

2011-Jan-07 18:55 UTC

Re: Various Questions

Wow, this rsync and backup system is pretty amazing.  I've always just
tarred each directory manually, but now find I can RELIABLY automate backups,
and have SOLID versioning to boot.  Thanks to everyone who advised, especially
Freddie and Anthony.

I am still waiting for hardware for my backup server, but have been preparing.
On the backup server I'll be doing pull backups for everything except
my phone (which is connected intermittently).  I'm going to set up a
cron script on the backup server to pull backups once a week (as opposed to
once/mo which I've done for 12 years).  I am at a loss how to lock
the database on the HTPC while exporting the dump, as per Lloyd Standish, but
will study it.  (Freddie gave a nice script, but it doesn't seem to
lock/flush first.)  Also don't know how to email results/success/fail on
completion, as I'm not a very good coder.

But here is my proposed cron:
btrfs subvolume snapshot hex:///home /media/backups/snapshots/hex-{DATE}
rsync --archive --hard-links --delete-during --delete-excluded --inplace \
    --numeric-ids -e ssh --exclude-from=/media/backups/exclude-hex hex:///home \
    /media/backups/hex
btrfs subvolume snapshot droog:///home /media/backups/snapshots/droog-{DATE}
rsync --archive --hard-links --delete-during --delete-excluded --inplace \
    --numeric-ids -e ssh --exclude-from=/media/backups/exclude-droog droog:///home \
    /media/backups/droog

My root filesystems are ext4, so I guess they cannot be snapshotted before
backup.  My home directories are/will be BTRFS though.
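
A hedged sketch of how the proposed cron job could be arranged. As far as I know, btrfs subvolume snapshot only works on a locally mounted subvolume, and rsync expects remote paths in the form host:/path, so this version pulls with rsync first and then snapshots the local copy. It assumes /media/backups/hex and /media/backups/droog are btrfs subvolumes on the backup server; the run and snapshot_name helpers are made up for illustration, and the script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch: pull each host's /home over ssh, then snapshot the local copy.
# Assumes /media/backups/<host> is a btrfs subvolume on the backup server.
BACKUP_ROOT=/media/backups
DRY_RUN=${DRY_RUN:-1}       # 1 = only print the commands (the default here)

run() {
    if [ "$DRY_RUN" = 1 ]; then echo "$*"; else "$@"; fi
}

snapshot_name() {           # e.g. hex-2011-01-07
    echo "$1-$(date +%F)"
}

backup_host() {
    host=$1
    # rsync remote paths are host:/path; the trailing slash copies the contents
    run rsync --archive --hard-links --delete-during --delete-excluded \
        --inplace --numeric-ids -e ssh \
        --exclude-from="$BACKUP_ROOT/exclude-$host" \
        "$host:/home/" "$BACKUP_ROOT/$host/" || return 1
    # btrfs snapshots are local: snapshot the copy that was just updated
    run btrfs subvolume snapshot "$BACKUP_ROOT/$host" \
        "$BACKUP_ROOT/snapshots/$(snapshot_name "$host")"
}

for host in hex droog; do
    backup_host "$host"
done
```

If cron on the backup server is set up to send mail, anything the script prints is mailed to the job's owner (or to the address in MAILTO), which may be enough for success/failure reports.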

On Fri 07 January 2011 08:14:17 Hubert Kario wrote:
>> I'd suggest at least
>> mkfs.btrfs -m raid1 -d raid0 /dev/sdc /dev/sdd
>> if you really want raid0
>
> I don't fully understand -m or -d.  Why would this make a truer
> raid0 than with no options?

I am beginning to suspect that this is the -default- behavior, as described in
the wiki:
"# Create a filesystem across four drives (metadata mirrored, data striped)"

Should I turn off the writeback cache on each drive when running BTRFS?


Carl Cook

2011-Jan-08 13:25 UTC

Re: Various Questions

In addition to the questions below, if anyone has a chance could you advise on
why my destination drive has more data than the source after this command:
# rsync --hard-links --delete --inplace --archive --numeric-ids /media/disk/* /home
sending incremental file list
sent 658660 bytes  received 2433 bytes  1322186.00 bytes/sec
total size is 1355368091626  speedup is 2050192.77

# df /media/disk
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/md2             1868468340 1315408384 553059956  71% /media/disk
# df /home
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/sdb             3907029168 1325491836 2581537332  34% /home




On Fri 07 January 2011 10:55:43 Carl Cook wrote:
>
> Wow, this rsync and backup system is pretty amazing.  I've always
> just tarred each directory manually, but now find I can RELIABLY automate
> backups, and have SOLID versioning to boot.  Thanks to everyone who advised,
> especially Freddie and Anthony.
>
> I am still waiting for hardware for my backup server, but have been
> preparing.  On the backup server I'll be doing pull backups for
> everything except my phone (which is connected intermittently).  I'm
> going to set up a cron script on the backup server to pull backups once a week
> (as opposed to once/mo which I've done for 12 years).  I am at a loss
> how to lock the database on the HTPC while exporting the dump, as per Lloyd
> Standish, but will study it.  (Freddie gave a nice script, but it
> doesn't seem to lock/flush first.)  Also don't know how to
> email results/success/fail on completion, as I'm not a very good
> coder.
>
> But here is my proposed cron:
> btrfs subvolume snapshot hex:///home /media/backups/snapshots/hex-{DATE}
> rsync --archive --hard-links --delete-during --delete-excluded --inplace \
>     --numeric-ids -e ssh --exclude-from=/media/backups/exclude-hex hex:///home \
>     /media/backups/hex
> btrfs subvolume snapshot droog:///home /media/backups/snapshots/droog-{DATE}
> rsync --archive --hard-links --delete-during --delete-excluded --inplace \
>     --numeric-ids -e ssh --exclude-from=/media/backups/exclude-droog droog:///home \
>     /media/backups/droog
>
> My root filesystems are ext4, so I guess they cannot be snapshotted before
> backup.  My home directories are/will be BTRFS though.
>
> On Fri 07 January 2011 08:14:17 Hubert Kario wrote:
> >> I'd suggest at least
> >> mkfs.btrfs -m raid1 -d raid0 /dev/sdc /dev/sdd
> >> if you really want raid0
> >
> > I don't fully understand -m or -d.  Why would this make a
> > truer raid0 than with no options?
>
> I am beginning to suspect that this is the -default- behavior, as described
> in the wiki:
> "# Create a filesystem across four drives (metadata mirrored, data striped)"
>
> Should I turn off the writeback cache on each drive when running BTRFS?

Ian! D. Allen

2011-Jan-08 15:40 UTC

Re: Various Questions

On Sat, Jan 08, 2011 at 05:25:19AM -0800, Carl Cook wrote:
> In addition to the questions below, if anyone has a chance could you
> advise on why my destination drive has more data than the source after
> this command:
> # rsync --hard-links --delete --inplace --archive --numeric-ids /media/disk/* /home
> sending incremental file list
> sent 658660 bytes  received 2433 bytes  1322186.00 bytes/sec
> total size is 1355368091626  speedup is 2050192.77
> 
> # df /media/disk
> Filesystem           1K-blocks      Used Available Use% Mounted on
> /dev/md2             1868468340 1315408384 553059956  71% /media/disk
> # df /home
> Filesystem           1K-blocks      Used Available Use% Mounted on
> /dev/sdb             3907029168 1325491836 2581537332  34% /home

This has little to do with btrfs; it happens with many file systems due
to file system infrastructure details such as directory sizes, sparse
file handling, file fragmentation, etc.

For example: If you have a directory with a huge number of file names
in it, the actual directory disk space used will be large and will not
be reclaimed when you delete all the file names from the directory.
You would have to remove the directory itself and recreate it to reclaim
that space.  Also, using rsync without --sparse (which can't work with
--inplace), sparse files on the source may get expanded to take real
disk blocks on the destination.

Unless you use "dd" to copy a partition exactly, including all the file
system infrastructure details, any copy you make will be subject to the
vagaries of how the file system decides to lay out the data.
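
The sparse-file effect is easy to see in a scratch directory. This sketch assumes GNU coreutils (truncate, du --apparent-size) and a filesystem that supports sparse files; the file name is arbitrary.

```shell
#!/bin/sh
# Create a sparse file and compare its apparent size with the blocks it occupies.
workdir=$(mktemp -d)
truncate -s 10M "$workdir/sparse.img"   # 10 MiB long, but no data blocks written

apparent_kb=$(du --apparent-size -k "$workdir/sparse.img" | cut -f1)
on_disk_kb=$(du -k "$workdir/sparse.img" | cut -f1)
echo "apparent: ${apparent_kb}K  on disk: ${on_disk_kb}K"
rm -rf "$workdir"
```

A copy made without sparse handling would occupy roughly the apparent size on the destination, which is one way df can report more used space than on the source.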

-- 
| Ian! D. Allen  -  idallen@idallen.ca  -  Ottawa, Ontario, Canada
| Home Page: http://idallen.com/   Contact Improv: http://contactimprov.ca/
| College professor (Free/Libre GNU+Linux) at: http://teaching.idallen.com/
| Defend digital freedom:  http://eff.org/  and have fun:  http://fools.ca/

Freddie Cash

2011-Jan-09 01:26 UTC

Re: Various Questions

On Sat, Jan 8, 2011 at 5:25 AM, Carl Cook <CACook@quantum-sci.com> wrote:
>
> In addition to the questions below, if anyone has a chance could you advise
> on why my destination drive has more data than the source after this command:
> # rsync --hard-links --delete --inplace --archive --numeric-ids /media/disk/* /home
> sending incremental file list

What happens if you delete /home, then run the command again, but
without the *?  You generally don't use wildcards for the source or
destination when using rsync.  You just tell it which directory to
start in.

If you do an "ls /home" and "ls /media/disk" are they different?

-- 
Freddie Cash
fjwcash@gmail.com

Carl Cook

2011-Jan-09 13:16 UTC

Re: Various Questions

I'd rather not do the copy again unless necessary, as it took a day.

Directories look identical, but who knows?  I'm going to try and figure
out how to do a file-by-file crc check, for peace of mind.


On Sat 08 January 2011 17:26:25 Freddie Cash wrote:
> On Sat, Jan 8, 2011 at 5:25 AM, Carl Cook <CACook@quantum-sci.com> wrote:
> >
> > In addition to the questions below, if anyone has a chance could you
> > advise on why my destination drive has more data than the source after this
> > command:
> > # rsync --hard-links --delete --inplace --archive --numeric-ids /media/disk/* /home
> > sending incremental file list
>
> What happens if you delete /home, then run the command again, but
> without the *?  You generally don't use wildcards for the source or
> destination when using rsync.  You just tell it which directory to
> start in.
>
> If you do an "ls /home" and "ls /media/disk" are they different?

Fajar A. Nugraha

2011-Jan-09 13:37 UTC

Re: Various Questions

On Sun, Jan 9, 2011 at 8:16 PM, Carl Cook <CACook@quantum-sci.com> wrote:
>
> I'd rather not do the copy again unless necessary, as it took a day.
>
> Directories look identical, but who knows?  I'm going to try and
> figure out how to do a file-by-file crc check, for peace of mind.

try "du --apparent-size -slh"
It should rule out any differences caused by sparse files and hardlinks.
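
For the file-by-file check itself, one possible sketch is to checksum every regular file in both trees and diff the two lists. It assumes GNU find and sha256sum are available; tree_sums and compare_trees are hypothetical helper names, and file names containing newlines are not handled.

```shell
#!/bin/sh
# List "checksum  ./relative/path" for every regular file under a directory.
tree_sums() {
    (cd "$1" && find . -type f -exec sha256sum {} + | sort -k 2)
}

# Return 0 when both trees hold the same files with the same contents;
# otherwise print the differing lines and return 1.
compare_trees() {
    left=$(mktemp)
    right=$(mktemp)
    tree_sums "$1" > "$left"
    tree_sums "$2" > "$right"
    if diff "$left" "$right"; then status=0; else status=1; fi
    rm -f "$left" "$right"
    return $status
}

# Example (paths from the thread):
#   compare_trees /media/disk /home && echo "trees match"
```

rsync can do a similar comparison on its own: adding --checksum --dry-run --itemize-changes to the original command lists the files whose contents differ without copying anything.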

> On Sat 08 January 2011 17:26:25 Freddie Cash wrote:
>> On Sat, Jan 8, 2011 at 5:25 AM, Carl Cook <CACook@quantum-sci.com> wrote:
>> >
>> > In addition to the questions below, if anyone has a chance could
>> > you advise on why my destination drive has more data than the source after this
>> > command:
>> > # rsync --hard-links --delete --inplace --archive --numeric-ids /media/disk/* /home

Are you SURE you didn't get the command mixed up? The last argument to
rsync should be the destination. Your command looks like you're
copying things to /home.

-- 
Fajar

Alan Chandler

2011-Jan-09 13:58 UTC

Re: Various Questions

On 09/01/11 13:37, Fajar A. Nugraha wrote:
> On Sun, Jan 9, 2011 at 8:16 PM, Carl Cook <CACook@quantum-sci.com> wrote:
>> I'd rather not do the copy again unless necessary, as it took a day.
>>
>> Directories look identical, but who knows?  I'm going to try
>> and figure out how to do a file-by-file crc check, for peace of mind.
>
> try "du --apparent-size -slh"
> It should rule out any differences caused by sparse files and hardlinks.
>
>> On Sat 08 January 2011 17:26:25 Freddie Cash wrote:
>>> On Sat, Jan 8, 2011 at 5:25 AM, Carl Cook <CACook@quantum-sci.com> wrote:
>>>> In addition to the questions below, if anyone has a chance
>>>> could you advise on why my destination drive has more data than the source
>>>> after this command:
>>>> # rsync --hard-links --delete --inplace --archive --numeric-ids /media/disk/* /home
>
> Are you SURE you don't get the command mixed up? The last argument to
> rsync should be the destination. Your command looks like you're
> copying things to /home.

What is also important is the use of * - it means all the . files at
the top level are NOT being copied.

rsync is clever enough to notice if you have the / at the end of the
source to know whether you want the directory to be put into the
destination or the contents of the directory.  The / at the end of the
source means copy the contents.

This could be (I am not sure of the exact scope of --delete) the reason
why the destination has more data than the source, if --delete is not
deleting /home/.* files (if there are any there).
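
The dotfile point can be checked without rsync, because the * is expanded by the shell before rsync ever sees it. A small sketch using a scratch directory and made-up file names:

```shell
#!/bin/sh
# A shell glob skips dotfiles, so "src/*" and "src/" name different sources.
src=$(mktemp -d)
touch "$src/visible" "$src/.hidden"

globbed=$(cd "$src" && echo *)          # what the shell expands "*" to
all=$(ls -A "$src" | tr '\n' ' ')       # everything actually in the directory
echo "glob sees: $globbed"
echo "directory holds: $all"
rm -rf "$src"
```

As I understand the rsync man page, --delete only removes extraneous files inside directories that rsync was asked to synchronize, so with a /media/disk/* source the top level of /home is not cleaned up; using /media/disk/ as the source covers both the dotfiles and the deletions.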

-- 
Alan Chandler
http://www.chandlerfamily.org.uk
