thr3ads.net - Btrfs devel - [Btrfs-devel] Initial Planning document for multiple device support [Jan 2008]

Chris Mason

2008-Jan-21 16:47 UTC

[Btrfs-devel] Initial Planning document for multiple device support

Hello everyone,

I've spent some time outlining the support for multiple devices, and here is
my basic plan so far.  Any comments are welcome:

http://oss.oracle.com/projects/btrfs/dist/documentation/btrfs-volumes.html

-chris

sftf

2008-Jan-22 00:56 UTC

[Btrfs-devel] Initial Planning document for multiple device support

CM> http://oss.oracle.com/projects/btrfs/dist/documentation/btrfs-volumes.html

Allocation Record is similar to LVM's physical extent (PE)
Storage Chunk is similar to LVM's logical extent (LE)
Storage Chunk1+Storage Chunk2+Storage Chunk3+... is similar to LVM's volume
group (VG)

And the filesystem uses an aggregation of Storage Chunks directly to store
metadata and data.

Am I right?

Chris Mason

2008-Jan-23 04:52 UTC

[Btrfs-devel] Re: Initial Planning document for multiple device support

On Wednesday 23 January 2008, Andi Kleen wrote:
> Chris Mason <chris.mason@oracle.com> writes:
>
> Just commenting on something that tripped me while reading
> the document.
>
> >If Btrfs were to rely on device mapper or MD for mirroring, it would
> >not be able to resolve checksum failures by checking the mirrored
> >copy. The lower layers don't know the checksum or granularity of the
> >filesystem blocks, and so they are not able to verify the data they
> >return.
>
> I cannot imagine it would be that difficult to add a new READ_OTHER_COPY
> io operation that would cause MD/LVM/... to return the other copy
> in a mirror set.

This is something SGI recently proposed, and I think it is a very good idea.
It also makes sense for hooks between MD and the FS to figure out which
blocks are in use during a rebuild, and for the FS to tell LVM when blocks
are freed to help make snapshots more efficient.

>
> Even without btrfs that might be even generally useful for other
> applications that do some checking on their files.
>
> e.g. I could well imagine a new system call to trigger this on the
> page cache level.
>
> There might be other reasons to reinvent another storage manager
> of course. Just that one above doesn't seem to be very convincing.
> I admit I haven't thought too deeply about the other issues you
> raise in the document.

The key problem that requires most of this infrastructure is mirroring
metadata on a single spindle.  Chunks aren't required to solve it, but they
do add flexibility to do lots of other things.  For example, relocating hot
blocks onto the SSD portion of a combined SSD/spindle drive, or writing to
the SSD when on battery and then transferring in bulk to the spindle.

The chunk code is basically a storage layer with three or four hooks into the
FS.  Once I have it working, I'll take a hard look at pushing it down into DM
where it can be used for other things.

-chris
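
The exchange above turns on one idea: the layer that knows the block checksum can retry a failed read against the other copy in a mirror set. As a rough illustration only, and not code from Btrfs, MD, or LVM, here is a minimal Python sketch of that read path. The names `read_verified` and `ChecksumError`, the use of in-memory `bytearray` objects as stand-ins for mirrored devices, and the choice of CRC32 as the checksum are all assumptions made for the example.

```python
import zlib
from typing import List


class ChecksumError(Exception):
    """Raised when no mirror holds a copy matching the expected checksum."""


def read_verified(mirrors: List[bytearray], offset: int, length: int,
                  expected_crc: int) -> bytes:
    """Return the first copy of a block whose CRC32 matches expected_crc.

    `mirrors` is a hypothetical stand-in for the devices of a mirror set.
    Copies that were read and found corrupt before the good copy turned up
    are overwritten with the good data; mirrors after the good copy are
    not examined.
    """
    bad_indexes = []
    good = None
    for index, device in enumerate(mirrors):
        data = bytes(device[offset:offset + length])
        if zlib.crc32(data) == expected_crc:
            good = data
            break
        # Checksum mismatch: remember this copy and try the next mirror.
        bad_indexes.append(index)

    if good is None:
        raise ChecksumError("no mirror holds a valid copy of the block")

    # Repair the corrupt copies that were seen along the way.
    for index in bad_indexes:
        mirrors[index][offset:offset + length] = good
    return good


if __name__ == "__main__":
    block = b"metadata block"
    primary = bytearray(b"XXtadata block")   # simulated corruption
    secondary = bytearray(block)
    data = read_verified([primary, secondary], 0, len(block),
                         zlib.crc32(block))
    print(data == block, bytes(primary) == block)
```

In this sketch the retry loop lives in the same layer that holds the checksum. A READ_OTHER_COPY style operation, as suggested in the thread, would instead let a filesystem ask a separate mirroring layer for the alternate copy while keeping the verification step in the filesystem.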

Andi Kleen

2008-Jan-30 05:10 UTC

[Btrfs-devel] Re: Initial Planning document for multiple device support

Chris Mason <chris.mason@oracle.com> writes:

Just commenting on something that tripped me while reading
the document.

>If Btrfs were to rely on device mapper or MD for mirroring, it would
>not be able to resolve checksum failures by checking the mirrored
>copy. The lower layers don't know the checksum or granularity of the
>filesystem blocks, and so they are not able to verify the data they
>return.

I cannot imagine it would be that difficult to add a new READ_OTHER_COPY
io operation that would cause MD/LVM/... to return the other copy
in a mirror set.

Even without btrfs that might be even generally useful for other
applications that do some checking on their files.

e.g. I could well imagine a new system call to trigger this on the
page cache level.

There might be other reasons to reinvent another storage manager
of course. Just that one above doesn't seem to be very convincing.
I admit I haven't thought too deeply about the other issues you
raise in the document.

-Andi
