thr3ads.net - Btrfs devel - Why btrfs inline small file by default? [Oct 2012]

ching

2012-Oct-30 11:04 UTC

Why btrfs inline small file by default?

Hi all,

I have been testing my btrfs root partition with "max_inline=0" and a 64k
leaf size for weeks, and it seems fine.


AFAIK btrfs inlines small files into metadata by default. I am curious why.

If there are only a few small files, there will be neither a noticeable
effect nor a benefit.
If there are a lot of small files, the metadata will grow undesirably
large, because metadata is duplicated (DUP) by default.

There are also some email threads about problems with metadata inlining (I
don't know whether they have been fixed in recent kernels):
http://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg16295.html
http://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg05265.html

How about turning off inlining so that btrfs works better "out of the
box"?

ching



--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs"
in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Felix Pepinghege

2012-Oct-30 12:04 UTC

Re: Why btrfs inline small file by default?

Hi ching!

On 30.10.2012 12:04, ching wrote:
> Hi all,
>
> I am testing my btrfs root partition with "max_inline=0", and 64k
> leaf size for weeks and it seems that it is fine.
>
>
> AFAIK btrfs inline small files into metadata by default, I am curious why?
>
> If there is only a few small files, then there will be neither effect nor
> benefit at all

I disagree on this point. There will be a (probably huge) benefit in
terms of performance. If the file data is inlined, you have a good
chance (especially with large leaf sizes, although even then it is not
guaranteed) that the data is in the same leaf as the inode item. So
you already have the file data, since you always read complete leaves. If
you stored the data in extents, you would need to read the
respective extent, which can be anywhere on the disk. That is, you would
most likely need to move the head. This adds overhead (especially with
small files, as the read itself is fast compared to the
time needed to position the head).

> If there is a lot of small files, then the size of metadata will be
> undesirable due to deduplication

Yes, that is a fact, but whether it really matters depends on the use case
(e.g., the ratio of small files to large files). But as btrfs is
designed explicitly as a general-purpose file system, you usually want
good performance rather than better disk usage (especially as
disk space isn't expensive anymore).

>
> there are also some email threads related to problem of metadata inline (i
> don't know whether they are fixed in recent kernel):
> http://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg16295.html
> http://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg05265.html
>
> How about turning off inline so that btrfs works better "out of the
> box"?
>
> ching
>
Regards,
Felix

Mitch Harder

2012-Oct-30 12:13 UTC

Re: Why btrfs inline small file by default?

On Tue, Oct 30, 2012 at 6:04 AM, ching <lsching17@gmail.com> wrote:
> Hi all,
>
> I am testing my btrfs root partition with "max_inline=0", and 64k
> leaf size for weeks and it seems that it is fine.
>
>
> AFAIK btrfs inline small files into metadata by default, I am curious why?
>
> If there is only a few small files, then there will be neither effect nor
> benefit at all
> If there is a lot of small files, then the size of metadata will be
> undesirable due to deduplication
>
> there are also some email threads related to problem of metadata inline (i
> don't know whether they are fixed in recent kernel):
> http://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg16295.html
> http://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg05265.html
>
> How about turning off inline so that btrfs works better "out of the
> box"?
>
> ching
>

I did some rough benchmarking around this a few weeks ago.  I'll try
to clean up my method and post the results.

I was working with multiple copies and rsyncs of kernel sources, which
have many candidate files for inlining.

To my surprise, my btrfs benchmarks were always the same or faster
when I let btrfs inline the files, even though metadata was much
larger.

cwillu

2012-Oct-30 12:17 UTC

Re: Why btrfs inline small file by default?

>> If there is a lot of small files, then the size of metadata will be
>> undesirable due to deduplication
>
>
> Yes, that is a fact, but if that really matters depends on the use-case
> (e.g., the small files to large files ratio, ...). But as btrfs is designed
> explicitly as a general purpose file system, you usually want the good
> performance instead of the better disk-usage (especially as disk space
> isn't expensive anymore).

As I understand it, in basically all cases the total storage used by
inlining will be _smaller_, as the allocation doesn't need to be
aligned to the sector size.

David Sterba

2012-Oct-30 16:38 UTC

Re: Why btrfs inline small file by default?

On Tue, Oct 30, 2012 at 07:04:59PM +0800, ching wrote:
> I am testing my btrfs root partition with "max_inline=0", and 64k leaf
> size for weeks and it seems that it is fine.

Related to inlining itself: ext4 and xfs are receiving inline data
support, so it would make sense to introduce a per-file attribute, e.g.
FS_NOINLINE_FL, that would be set on directories and inherited by
newly created files, with the expected outcome.

One reason is to give users a way to avoid potential
performance hits, e.g. when a file frequently switches between inline and
non-inline storage (provided that this is known to happen). Nice to have
IMHO, but more evaluation of real-world use cases where it hurts would be
desirable.

david

ching

2012-Oct-30 21:39 UTC

Re: Why btrfs inline small file by default?

On 10/30/2012 08:04 PM, Felix Pepinghege wrote:
> Hi ching!
>
> On 30.10.2012 12:04, ching wrote:
>> Hi all,
>>
>> I am testing my btrfs root partition with "max_inline=0", and
>> 64k leaf size for weeks and it seems that it is fine.
>>
>>
>> AFAIK btrfs inline small files into metadata by default, I am curious
>> why?
>>
>> If there is only a few small files, then there will be neither effect
>> nor benefit at all
>
> I disagree in this point. There will be a (probably huge) benefit in terms
> of performance. If the file data is inlined, you have a good chance
> (especially with large leaf sizes, although even then it is not guaranteed)
> that the data is in the same leaf as the inode element. So you already have
> the file data as you always access complete leafs. If you would store the
> data in extents, you would need to read the respective extent, which can be
> anywhere on the disk. That is, you most likely need to move the head. This
> brings overhead (especially with small files, as the reading process is
> relatively fast compared to the time you need to position the head).

You may be correct.

But I doubt that inline data brings a performance benefit; bigger metadata
always means more trouble for concurrency, performance, and the cache miss
ratio.

>
>> If there is a lot of small files, then the size of metadata will be
>> undesirable due to deduplication
>
> Yes, that is a fact, but if that really matters depends on the use-case
> (e.g., the small files to large files ratio, ...). But as btrfs is designed
> explicitly as a general purpose file system, you usually want the good
> performance instead of the better disk-usage (especially as disk space
> isn't expensive anymore).

Yes, but for a general-purpose filesystem, I guess the default behaviour
should be "safe"?

Not many users are patient enough to troubleshoot a metadata "explosion".

ching

2012-Oct-30 21:40 UTC

Re: Why btrfs inline small file by default?

On 10/30/2012 08:17 PM, cwillu wrote:
>>> If there is a lot of small files, then the size of metadata will be
>>> undesirable due to deduplication
>>
>> Yes, that is a fact, but if that really matters depends on the use-case
>> (e.g., the small files to large files ratio, ...). But as btrfs is designed
>> explicitly as a general purpose file system, you usually want the good
>> performance instead of the better disk-usage (especially as disk space
>> isn't expensive anymore).
> As I understand it, in basically all cases the total storage used by
> inlining will be _smaller_, as the allocation doesn't need to be
> aligned to the sector size.
>

If I have 10G of small files in total, then they will consume 20G by default.

ching

Hugo Mills

2012-Oct-30 22:14 UTC

Re: Why btrfs inline small file by default?

On Wed, Oct 31, 2012 at 05:40:25AM +0800, ching wrote:
> On 10/30/2012 08:17 PM, cwillu wrote:
> >>> If there is a lot of small files, then the size of metadata will be
> >>> undesirable due to deduplication
> >>
> >> Yes, that is a fact, but if that really matters depends on the use-case
> >> (e.g., the small files to large files ratio, ...). But as btrfs is
> >> designed explicitly as a general purpose file system, you usually want
> >> the good performance instead of the better disk-usage (especially as
> >> disk space isn't expensive anymore).
> > As I understand it, in basically all cases the total storage used by
> > inlining will be _smaller_, as the allocation doesn't need to be
> > aligned to the sector size.
> >
>
> if i have 10G small files in total, then it will consume 20G by default.

   If those small files are each 128 bytes in size, then you have
approximately 80 million of them, and they'd take up 80 million pages,
or 320 GiB of total disk space.

   Hugo.

-- 
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ==  PGP
key: 515C238D from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
             --- I always felt that as a C programmer, I ---             
                         was becoming typecast.

cwillu

2012-Oct-30 22:16 UTC

Re: Why btrfs inline small file by default?

On Tue, Oct 30, 2012 at 3:40 PM, ching <lsching17@gmail.com> wrote:
> On 10/30/2012 08:17 PM, cwillu wrote:
>>>> If there is a lot of small files, then the size of metadata will be
>>>> undesirable due to deduplication
>>>
>>> Yes, that is a fact, but if that really matters depends on the use-case
>>> (e.g., the small files to large files ratio, ...). But as btrfs is
>>> designed explicitly as a general purpose file system, you usually want
>>> the good performance instead of the better disk-usage (especially as
>>> disk space isn't expensive anymore).
>> As I understand it, in basically all cases the total storage used by
>> inlining will be _smaller_, as the allocation doesn't need to be
>> aligned to the sector size.
>>
>
> if i have 10G small files in total, then it will consume 20G by default.
>
> ching

No.  No they will not.  As I already explained.

root@repository:/mnt$ mount ~/inline /mnt -o loop
root@repository:/mnt$ mount ~/inline /mnt2 -o loop,max_inline=0

root@repository:/mnt$ mount
/dev/loop0 on /mnt type btrfs (rw)
/dev/loop1 on /mnt2 type btrfs (rw,max_inline=0)

root@repository:/mnt$ time for x in {1..243854}; do echo "some stuff" > /mnt/$x; done
real	1m5.447s
user	0m38.422s
sys	0m18.493s

root@repository:/mnt$ time for x in {1..243854}; do echo "some stuff" > /mnt2/$x; done
real    1m49.880s
user    0m40.379s
sys     0m26.210s

root@repository:/mnt$ df /mnt /mnt2
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/loop0            10485760    266952   8359680   4% /mnt
/dev/loop1            10485760   1311620   7384236  16% /mnt2

root@repository:/mnt$ btrfs fi df /mnt
Data: total=1.01GB, used=256.00KB
System, DUP: total=8.00MB, used=4.00KB
System: total=4.00MB, used=0.00
Metadata, DUP: total=1.00GB, used=130.22MB
Metadata: total=8.00MB, used=0.00

root@repository:/mnt$ btrfs fi df /mnt2
Data: total=2.01GB, used=953.05MB
System, DUP: total=8.00MB, used=4.00KB
System: total=4.00MB, used=0.00
Metadata, DUP: total=1.00GB, used=164.03MB
Metadata: total=8.00MB, used=0.00

root@repository:/mnt$ btrfs fi show
Label: none  uuid: e5440337-9f44-4b2d-9889-80ab0ab8f245
	Total devices 1 FS bytes used 130.47MB
	devid    1 size 10.00GB used 3.04GB path /dev/loop0

Label: none  uuid: cfcc4149-3102-465d-89b8-0a6bb6a4749a
	Total devices 1 FS bytes used 1.09GB
	devid    1 size 10.00GB used 4.04GB path /dev/loop1

Btrfs Btrfs v0.19

Any questions?
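
The figures in the session above can be cross-checked with a few lines of arithmetic. The following is only a sketch: it copies the numbers printed by `btrfs fi df`, and it assumes that the "DUP" metadata figure is stored twice on disk, which is what makes the totals line up with the `df` output.

```python
# Figures copied from the `btrfs fi df` output above (MB as printed).
FILES = 243854                      # number of files created by the loop

# Assumption: "Metadata, DUP" is written twice, so it counts double on disk.
inline_on_disk_mb = 0.25 + 2 * 130.22      # /mnt  (256.00KB data = 0.25MB)
extent_on_disk_mb = 953.05 + 2 * 164.03    # /mnt2 (max_inline=0)

# Data space consumed per file when every file gets its own extent.
per_file_kib = 953.05 * 1024 / FILES

ratio = extent_on_disk_mb / inline_on_disk_mb

print(f"data per file without inlining: {per_file_kib:.2f} KiB")
print(f"inline total:  {inline_on_disk_mb:.2f} MB")
print(f"extent total:  {extent_on_disk_mb:.2f} MB")
print(f"max_inline=0 uses {ratio:.1f}x the space of the default")
```

Each 11-byte file costs roughly one 4 KiB block once it is pushed out of the metadata tree, and the two totals come out close to the `df` "Used" columns (266952 and 1311620 1K-blocks).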

Hugo Mills

2012-Oct-30 22:19 UTC

Re: Why btrfs inline small file by default?

On Tue, Oct 30, 2012 at 10:14:12PM +0000, Hugo Mills wrote:
> On Wed, Oct 31, 2012 at 05:40:25AM +0800, ching wrote:
> > On 10/30/2012 08:17 PM, cwillu wrote:
> > >>> If there is a lot of small files, then the size of metadata will be
> > >>> undesirable due to deduplication
> > >>
> > >> Yes, that is a fact, but if that really matters depends on the
> > >> use-case (e.g., the small files to large files ratio, ...). But as
> > >> btrfs is designed explicitly as a general purpose file system, you
> > >> usually want the good performance instead of the better disk-usage
> > >> (especially as disk space isn't expensive anymore).
> > > As I understand it, in basically all cases the total storage used by
> > > inlining will be _smaller_, as the allocation doesn't need to be
> > > aligned to the sector size.
> > >
> >
> > if i have 10G small files in total, then it will consume 20G by default.
>
>    If those small files are each 128 bytes in size, then you have
> approximately 80 million of them, and they'd take up 80 million pages,
> or 320 GiB of total disk space.

   Sorry, to make that clear -- I meant if they were stored in Data.
If they're inlined in metadata, then they'll take approximately 20 GiB
as you claim, which is a lot less than the 320 GiB they'd be if
they're not.

   Hugo.

-- 
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ==  PGP
key: 515C238D from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
             --- I always felt that as a C programmer, I ---             
                         was becoming typecast.
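
Hugo's estimate can be reproduced directly. This is a rough sketch, assuming a 4 KiB page as the smallest data allocation and ignoring per-item headers in the metadata tree; the variable names are illustrative only.

```python
GIB = 1024 ** 3

total_bytes = 10 * GIB   # "10G small files in total"
file_size = 128          # bytes per file, as in Hugo's example
page_size = 4096         # assumed smallest allocation for a data extent

n_files = total_bytes // file_size

# Stored as extents: every file occupies a whole page.
as_extents = n_files * page_size

# Inlined: the payload lives in metadata, written twice under DUP
# (item headers are ignored in this estimate).
as_inline_dup = 2 * total_bytes

print(f"files:          {n_files:,}")
print(f"as extents:     {as_extents / GIB:.0f} GiB")
print(f"inline (DUP):   {as_inline_dup / GIB:.0f} GiB")
```

With these inputs the file count is a little under 84 million, which is the "approximately 80 million" in the message, and the two totals are 320 GiB and 20 GiB.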

ching

2012-Oct-30 23:41 UTC

Re: Why btrfs inline small file by default?

On 10/31/2012 06:16 AM, cwillu wrote:
> On Tue, Oct 30, 2012 at 3:40 PM, ching <lsching17@gmail.com> wrote:
>> On 10/30/2012 08:17 PM, cwillu wrote:
>>>>> If there is a lot of small files, then the size of metadata will be
>>>>> undesirable due to deduplication
>>>> Yes, that is a fact, but if that really matters depends on the use-case
>>>> (e.g., the small files to large files ratio, ...). But as btrfs is
>>>> designed explicitly as a general purpose file system, you usually want
>>>> the good performance instead of the better disk-usage (especially as
>>>> disk space isn't expensive anymore).
>>> As I understand it, in basically all cases the total storage used by
>>> inlining will be _smaller_, as the allocation doesn't need to be
>>> aligned to the sector size.
>>>
>> if i have 10G small files in total, then it will consume 20G by default.
>>
>> ching
> No.  No they will not.  As I already explained.
>
> root@repository:/mnt$ mount ~/inline /mnt -o loop
> root@repository:/mnt$ mount ~/inline /mnt2 -o loop,max_inline=0
>
> root@repository:/mnt$ mount
> /dev/loop0 on /mnt type btrfs (rw)
> /dev/loop1 on /mnt2 type btrfs (rw,max_inline=0)
>
> root@repository:/mnt$ time for x in {1..243854}; do echo "some stuff" > /mnt/$x; done
> real	1m5.447s
> user	0m38.422s
> sys	0m18.493s
>
> root@repository:/mnt$ time for x in {1..243854}; do echo "some stuff" > /mnt2/$x; done
> real    1m49.880s
> user    0m40.379s
> sys     0m26.210s
>
> root@repository:/mnt$ df /mnt /mnt2
> Filesystem           1K-blocks      Used Available Use% Mounted on
> /dev/loop0            10485760    266952   8359680   4% /mnt
> /dev/loop1            10485760   1311620   7384236  16% /mnt2
>
> root@repository:/mnt$ btrfs fi df /mnt
> Data: total=1.01GB, used=256.00KB
> System, DUP: total=8.00MB, used=4.00KB
> System: total=4.00MB, used=0.00
> Metadata, DUP: total=1.00GB, used=130.22MB
> Metadata: total=8.00MB, used=0.00
>
> root@repository:/mnt$ btrfs fi df /mnt2
> Data: total=2.01GB, used=953.05MB
> System, DUP: total=8.00MB, used=4.00KB
> System: total=4.00MB, used=0.00
> Metadata, DUP: total=1.00GB, used=164.03MB
> Metadata: total=8.00MB, used=0.00
>
> root@repository:/mnt$ btrfs fi show
> Label: none  uuid: e5440337-9f44-4b2d-9889-80ab0ab8f245
> 	Total devices 1 FS bytes used 130.47MB
> 	devid    1 size 10.00GB used 3.04GB path /dev/loop0
>
> Label: none  uuid: cfcc4149-3102-465d-89b8-0a6bb6a4749a
> 	Total devices 1 FS bytes used 1.09GB
> 	devid    1 size 10.00GB used 4.04GB path /dev/loop1
>
> Btrfs Btrfs v0.19
>
> Any questions?
>

Can the test be repeated for:
1. 3k per file with leaf size=4k
2. 60k per file with leaf size=64k

ching

2012-Oct-30 23:47 UTC

Re: Why btrfs inline small file by default?

On 10/31/2012 06:19 AM, Hugo Mills wrote:
> On Tue, Oct 30, 2012 at 10:14:12PM +0000, Hugo Mills wrote:
>> On Wed, Oct 31, 2012 at 05:40:25AM +0800, ching wrote:
>>> On 10/30/2012 08:17 PM, cwillu wrote:
>>>>>> If there is a lot of small files, then the size of metadata will be
>>>>>> undesirable due to deduplication
>>>>> Yes, that is a fact, but if that really matters depends on the
>>>>> use-case (e.g., the small files to large files ratio, ...). But as
>>>>> btrfs is designed explicitly as a general purpose file system, you
>>>>> usually want the good performance instead of the better disk-usage
>>>>> (especially as disk space isn't expensive anymore).
>>>> As I understand it, in basically all cases the total storage used by
>>>> inlining will be _smaller_, as the allocation doesn't need to be
>>>> aligned to the sector size.
>>>>
>>> if i have 10G small files in total, then it will consume 20G by default.
>>    If those small files are each 128 bytes in size, then you have
>> approximately 80 million of them, and they'd take up 80 million pages,
>> or 320 GiB of total disk space.
>    Sorry, to make that clear -- I meant if they were stored in Data.
> If they're inlined in metadata, then they'll take approximately 20 GiB
> as you claim, which is a lot less than the 320 GiB they'd be if
> they're not.
>
>    Hugo.
>

Is it the same for:
1. 3k per file with leaf size=4k
2. 60k per file with leaf size=64k



David Sterba

2012-Oct-31 00:12 UTC

Re: Why btrfs inline small file by default?

On Wed, Oct 31, 2012 at 07:47:14AM +0800, ching wrote:
> On 10/31/2012 06:19 AM, Hugo Mills wrote:
> > On Tue, Oct 30, 2012 at 10:14:12PM +0000, Hugo Mills wrote:
> >>> if i have 10G small files in total, then it will consume 20G by
> >>> default.
> >>    If those small files are each 128 bytes in size, then you have
> >> approximately 80 million of them, and they'd take up 80 million pages,
> >> or 320 GiB of total disk space.
> >    Sorry, to make that clear -- I meant if they were stored in Data.
> > If they're inlined in metadata, then they'll take approximately 20 GiB
> > as you claim, which is a lot less than the 320 GiB they'd be if
> > they're not.
> >
> is it the same for:
> 1. 3k per file with leaf size=4K
> 2. 60k per file with leaf size=64k

The inline limit is the minimum of
* 'max_inline' (8k by default)
* PAGE_SIZE
* leafsize - header

so 60k files for 64k leaves will not get inlined, unless you have a
system with 64k pages.
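
The rule above can be written down as a small function. This is a sketch of the rule as stated in this message, not the kernel's code: the function names are made up for illustration, and the `header` default of 128 bytes is only a placeholder, since the message does not say how large the header is.

```python
def inline_limit(max_inline=8192, page_size=4096, leaf_size=4096, header=128):
    """Largest file size in bytes that may be inlined, per the rule above.

    `header` is a placeholder for the per-leaf overhead; the real value
    depends on the kernel and is not given in this thread.
    """
    return min(max_inline, page_size, leaf_size - header)


def is_inlined(file_size, **limits):
    """True if a file of `file_size` bytes fits under the inline limit."""
    return file_size <= inline_limit(**limits)


KIB = 1024
print(is_inlined(3 * KIB, leaf_size=4 * KIB))                         # case 1
print(is_inlined(60 * KIB, leaf_size=64 * KIB))                       # case 2
print(is_inlined(60 * KIB, leaf_size=64 * KIB, page_size=64 * KIB))   # 64k pages
print(is_inlined(60 * KIB, leaf_size=64 * KIB, page_size=64 * KIB,
                 max_inline=64 * KIB))                                # and max_inline raised
```

Under this reading, a 3k file with 4k leaves is inlined, and a 60k file with 64k leaves is not on a 4k-page system. Taking the three-way minimum literally, 64k pages alone would still leave the default 8k `max_inline` as the cap, so that mount option would presumably need raising as well.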

cwillu

2012-Oct-31 00:18 UTC

Re: Why btrfs inline small file by default?

On Tue, Oct 30, 2012 at 5:47 PM, ching <lsching17@gmail.com> wrote:
> On 10/31/2012 06:19 AM, Hugo Mills wrote:
>> On Tue, Oct 30, 2012 at 10:14:12PM +0000, Hugo Mills wrote:
>>> On Wed, Oct 31, 2012 at 05:40:25AM +0800, ching wrote:
>>>> On 10/30/2012 08:17 PM, cwillu wrote:
>>>>>>> If there is a lot of small files, then the size of metadata will be
>>>>>>> undesirable due to deduplication
>>>>>> Yes, that is a fact, but if that really matters depends on the
>>>>>> use-case (e.g., the small files to large files ratio, ...). But as
>>>>>> btrfs is designed explicitly as a general purpose file system, you
>>>>>> usually want the good performance instead of the better disk-usage
>>>>>> (especially as disk space isn't expensive anymore).
>>>>> As I understand it, in basically all cases the total storage used by
>>>>> inlining will be _smaller_, as the allocation doesn't need to be
>>>>> aligned to the sector size.
>>>>>
>>>> if i have 10G small files in total, then it will consume 20G by default.
>>>    If those small files are each 128 bytes in size, then you have
>>> approximately 80 million of them, and they'd take up 80 million pages,
>>> or 320 GiB of total disk space.
>>    Sorry, to make that clear -- I meant if they were stored in Data.
>> If they're inlined in metadata, then they'll take approximately 20 GiB
>> as you claim, which is a lot less than the 320 GiB they'd be if
>> they're not.
>>
>>    Hugo.
>>
>
>
> is it the same for:
> 1. 3k per file with leaf size=4K
> 2. 60k per file with leaf size=64k
>
>

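import os
import sys

# Payload written to every file: 3 KiB of the character "1".
data = "1" * 1024 * 3

# Create 100,000 files in the directory given as the first argument.
for x in range(100 * 1000):
    with open(os.path.join(sys.argv[1], str(x)), 'a') as f:
        f.write(data)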

root@repository:~$ mount -o loop ~/inline /mnt
root@repository:~$ mount -o loop,max_inline=0 ~/noninline /mnt2

root@repository:~$ time python test.py /mnt
real	0m11.105s
user	0m1.328s
sys	0m5.416s
root@repository:~$ time python test.py /mnt2
real	0m21.905s
user	0m1.292s
sys	0m5.460s

root@repository:/$ btrfs fi df /mnt
Data: total=1.01GB, used=256.00KB
System, DUP: total=8.00MB, used=4.00KB
System: total=4.00MB, used=0.00
Metadata, DUP: total=1.00GB, used=652.70MB
Metadata: total=8.00MB, used=0.00

root@repository:/$ btrfs fi df /mnt2
Data: total=1.01GB, used=391.12MB
System, DUP: total=8.00MB, used=4.00KB
System: total=4.00MB, used=0.00
Metadata, DUP: total=1.00GB, used=60.98MB
Metadata: total=8.00MB, used=0.00

3k data, 4k leaf: inline is twice the speed, but 1.4x bigger.

----

root@repository:~$ mkfs.btrfs inline -l 64k
root@repository:~$ mkfs.btrfs noninline -l 64k
...
root@repository:~$ time python test.py /mnt
real	0m12.244s
user	0m1.396s
sys	0m8.101s
root@repository:~$ time python test.py /mnt2
real	0m13.047s
user	0m1.436s
sys	0m7.772s

root@repository:/$ btr\fs fi df /mnt
Data: total=8.00MB, used=256.00KB
System, DUP: total=8.00MB, used=64.00KB
System: total=4.00MB, used=0.00
Metadata, DUP: total=1.00GB, used=342.06MB
Metadata: total=8.00MB, used=0.00

root@repository:/$ btr\fs fi df /mnt2
Data: total=1.01GB, used=391.10MB
System, DUP: total=8.00MB, used=64.00KB
System: total=4.00MB, used=0.00
Metadata, DUP: total=1.00GB, used=50.06MB
Metadata: total=8.00MB, used=0.00

3k data, 64k leaf: inline is still 10% faster, and is now 25% smaller
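
The two size comparisons for 3k files can be recomputed from the `btrfs fi df` figures above. This is a sketch that only rearranges the printed numbers; the `dup` switch is an assumption that DUP metadata occupies twice its "used" figure on disk, and the summary lines in this message appear to compare the figures without that doubling.

```python
def used_mb(data_mb, metadata_mb, dup=False):
    """Total space in MB; with dup=True, metadata is counted twice."""
    return data_mb + metadata_mb * (2 if dup else 1)


def size_ratio(inline, extent, dup=False):
    """Space used with inlining divided by space used with max_inline=0."""
    return used_mb(*inline, dup=dup) / used_mb(*extent, dup=dup)


# (data MB, metadata MB) copied from the output above; 256.00KB = 0.25MB.
runs = {
    "3k data, 4k leaf":  {"inline": (0.25, 652.70), "extent": (391.12, 60.98)},
    "3k data, 64k leaf": {"inline": (0.25, 342.06), "extent": (391.10, 50.06)},
}

for name, r in runs.items():
    as_printed = size_ratio(r["inline"], r["extent"])
    on_disk = size_ratio(r["inline"], r["extent"], dup=True)
    print(f"{name}: {as_printed:.2f}x as printed, {on_disk:.2f}x if DUP counts twice")
```

A ratio above 1 means inlining uses more space. As printed, the 4k-leaf run comes out near 1.4x and the 64k-leaf run below 1x; counting DUP twice pushes both ratios up.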

----

data = "1" * 1024 * 32

... (mkfs, mount, etc)

root@repository:~$ time python test.py /mnt
real	0m17.834s
user	0m1.224s
sys	0m4.772s
root@repository:~$ time python test.py /mnt2
real	0m20.521s
user	0m1.304s
sys	0m6.344s

root@repository:/$ btrfs fi df /mnt
Data: total=4.01GB, used=3.05GB
System, DUP: total=8.00MB, used=64.00KB
System: total=4.00MB, used=0.00
Metadata, DUP: total=1.00GB, used=54.00MB
Metadata: total=8.00MB, used=0.00

root@repository:/$ btrfs fi df /mnt2
Data: total=4.01GB, used=3.05GB
System, DUP: total=8.00MB, used=64.00KB
System: total=4.00MB, used=0.00
Metadata, DUP: total=1.00GB, used=53.56MB
Metadata: total=8.00MB, used=0.00

32k data, 64k leaf: inline is still 10% faster, and is now the same
size (not dead sure why, probably some interaction with the size of
the actual write that happens)

----

data = "1" * 1024 * 7

... etc


root@repository:~$ time python test.py /mnt
real	0m9.628s
user	0m1.368s
sys	0m4.188s
root@repository:~$ time python test.py /mnt2
real	0m13.455s
user	0m1.608s
sys	0m7.884s

root@repository:/$ btrfs fi df /mnt
Data: total=3.01GB, used=1.91GB
System, DUP: total=8.00MB, used=64.00KB
System: total=4.00MB, used=0.00
Metadata, DUP: total=1.00GB, used=74.69MB
Metadata: total=8.00MB, used=0.00

root@repository:/$ btrfs fi df /mnt2
Data: total=3.01GB, used=1.91GB
System, DUP: total=8.00MB, used=64.00KB
System: total=4.00MB, used=0.00
Metadata, DUP: total=1.00GB, used=74.69MB
Metadata: total=8.00MB, used=0.00

7k data, 64k leaf:  30% faster, same data usage.
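
The rounded speed figures in the four summaries can be checked against the `real` times printed above. A small sketch, with the wall-clock seconds copied from the `time` output:

```python
# (inline real seconds, max_inline=0 real seconds) from the runs above.
timings = {
    "3k data, 4k leaf":   (11.105, 21.905),
    "3k data, 64k leaf":  (12.244, 13.047),
    "32k data, 64k leaf": (17.834, 20.521),
    "7k data, 64k leaf":  (9.628, 13.455),
}


def time_saved(inline_s, extent_s):
    """Fraction of wall-clock time saved by the default (inline) mount."""
    return 1 - inline_s / extent_s


for name, (inline_s, extent_s) in timings.items():
    print(f"{name}: {time_saved(inline_s, extent_s):.1%} less wall-clock time")
```

These are single runs on loop devices, so the percentages are indicative at best.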

----

Are we done yet?  Can I go home now? ;p

Ahmet Inan

2012-Oct-31 08:48 UTC

Re: Why btrfs inline small file by default?

I also don't see any benefit from inlining small files.

This example is a full installation of a prebuilt Gentoo system
onto a fresh HDD, from a squashfs image on a USB key, in under 5 minutes:

with defaults (inlining small files):

# mount -o noatime,compress=lzo /dev/sda2 /mnt/point
# time unsquashfs -f -d /mnt/point/ /dev/sdb2
real    4m39.253s
user    1m37.854s
sys     1m1.433s

# btrfs filesystem show
Label: 'root'  uuid: 6fb7104d-8f4a-4f3a-aff8-fdad0a39f569
        Total devices 1 FS bytes used 10.05GB
        devid    1 size 232.63GB used 14.04GB path /dev/sda2

Btrfs v0.20-rc1

# btrfs filesystem df /mnt/point/
Data: total=10.01GB, used=9.08GB
System, DUP: total=8.00MB, used=4.00KB
System: total=4.00MB, used=0.00
Metadata, DUP: total=2.00GB, used=992.48MB
Metadata: total=8.00MB, used=0.00

without inline:

# mount -o max_inline=0,noatime,compress=lzo /dev/sda2 /mnt/point
# time unsquashfs -f -d /mnt/point/ /dev/sdb2
real    4m42.085s
user    1m36.894s
sys     1m1.583s

# btrfs filesystem show
failed to read /dev/sr0
Label: 'root'  uuid: 97ad3c97-ab03-4197-86d3-72869604b368
        Total devices 1 FS bytes used 11.36GB
        devid    1 size 232.63GB used 13.04GB path /dev/sda2

Btrfs v0.20-rc1

# btrfs filesystem df /mnt/point/
Data: total=11.01GB, used=10.85GB
System, DUP: total=8.00MB, used=4.00KB
System: total=4.00MB, used=0.00
Metadata, DUP: total=1.00GB, used=518.59MB
Metadata: total=8.00MB, used=0.00

I will keep testing "no inlining" for some more time, though.

Ahmet

cwillu

2012-Oct-31 09:39 UTC

Re: Why btrfs inline small file by default?

On Wed, Oct 31, 2012 at 2:48 AM, Ahmet Inan
<ainan@mathematik.uni-freiburg.de> wrote:
> i also dont see any benefit from inlining small files:
> with defaults (inlining small files):
> real    4m39.253s
> Data: total=10.01GB, used=9.08GB
> Metadata, DUP: total=2.00GB, used=992.48MB
> without inline:
> real    4m42.085s
> Data: total=11.01GB, used=10.85GB
> Metadata, DUP: total=1.00GB, used=518.59MB

I suggest you take a closer look at your numbers.

Ahmet Inan

2012-Oct-31 10:48 UTC

Re: Why btrfs inline small file by default?

>> i also dont see any benefit from inlining small files:
>> with defaults (inlining small files):
>> real    4m39.253s
>> Data: total=10.01GB, used=9.08GB
>> Metadata, DUP: total=2.00GB, used=992.48MB
>> without inline:
>> real    4m42.085s
>> Data: total=11.01GB, used=10.85GB
>> Metadata, DUP: total=1.00GB, used=518.59MB
>
> I suggest you take a closer look at your numbers.

Both use 12GiB in total and both need 280 seconds.
Am I missing something?

Ahmet

Michael Kjörling

2012-Oct-31 10:55 UTC

Re: Why btrfs inline small file by default?

On 31 Oct 2012 11:48 +0100, from ainan@mathematik.uni-freiburg.de (Ahmet Inan):
>>> i also dont see any benefit from inlining small files:
> 
>>> with defaults (inlining small files):
>>> real    4m39.253s
>>> Data: total=10.01GB, used=9.08GB
>>> Metadata, DUP: total=2.00GB, used=992.48MB
This uses 10290.40 MB total, if we pad with zeroes (9.08GB plus
992.48MB).

>>> without inline:
>>> real    4m42.085s
>>> Data: total=11.01GB, used=10.85GB
>>> Metadata, DUP: total=1.00GB, used=518.59MB
Under the same assumption, this uses 11628.99 MB total (10.85GB +
518.59MB).

>> I suggest you take a closer look at your numbers.
> 
> both use 12GiB in total and both need 280 seconds.
> am i missing something?
With inlining, you're using about 1.3 GB less disk space and require a
few seconds less wall-clock time for the same thing. A 10% difference
in storage space requirement does not seem like "no benefit" to me,
and both sets of numbers favor the default (with inlining).

-- 
Michael Kjörling • http://michael.kjorling.se • michael@kjorling.se
                “People who think they know everything really annoy
                those of us who know we don’t.” (Bjarne Stroustrup)

cwillu

2012-Oct-31 10:57 UTC

Re: Why btrfs inline small file by default?

On Wed, Oct 31, 2012 at 4:48 AM, Ahmet Inan
<ainan@mathematik.uni-freiburg.de> wrote:
>>> i also dont see any benefit from inlining small files:
>
>>> with defaults (inlining small files):
>>> real    4m39.253s
>>> Data: total=10.01GB, used=9.08GB
>>> Metadata, DUP: total=2.00GB, used=992.48MB
>
>>> without inline:
>>> real    4m42.085s
>>> Data: total=11.01GB, used=10.85GB
>>> Metadata, DUP: total=1.00GB, used=518.59MB
>>
>> I suggest you take a closer look at your numbers.
>
> both use 12GiB in total and both need 280 seconds.
> am i missing something?
9.08GB + 992.48MB*2 == 11.02GB

10.85GB + 518MB*2 == 11.86GB

That's nearly a GB smaller.
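
The sums above can be checked with a few lines of Python. This is only a sketch: the helper name total_gib and its metadata_copies parameter are made up for illustration, and it assumes the "GB" printed by btrfs filesystem df is 1024 MB, as the arithmetic in this thread does. DUP metadata is counted twice; metadata_copies=1 gives the plain data-plus-metadata sum from the earlier reply.

```python
def total_gib(data_gib, metadata_mib, metadata_copies=2):
    """Space used on disk, in GiB.

    metadata_copies=2 models the "Metadata, DUP" profile (every metadata
    block is written twice); metadata_copies=1 models single metadata.
    """
    return data_gib + metadata_copies * metadata_mib / 1024.0


# "used" figures taken from the btrfs filesystem df output quoted in the thread
inline = (9.08, 992.48)       # defaults (inlining small files)
no_inline = (10.85, 518.59)   # max_inline=0

for copies in (2, 1):
    with_inline = total_gib(*inline, metadata_copies=copies)
    without_inline = total_gib(*no_inline, metadata_copies=copies)
    print("metadata x%d: inline %.2f GiB, no inline %.2f GiB, difference %.2f GiB"
          % (copies, with_inline, without_inline, without_inline - with_inline))
```

Either way of counting leaves the inlined filesystem smaller for this workload; only the size of the gap changes.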

Ahmet Inan

2012-Oct-31 11:10 UTC

Re: Why btrfs inline small file by default?

>>>> with defaults (inlining small files):
>>>> real    4m39.253s
>>>> Data: total=10.01GB, used=9.08GB
>>>> Metadata, DUP: total=2.00GB, used=992.48MB
>
> This uses 10290.40 MB total, if we pad with zeroes (9.08GB plus
> 992.48MB).
>>>> without inline:
>>>> real    4m42.085s
>>>> Data: total=11.01GB, used=10.85GB
>>>> Metadata, DUP: total=1.00GB, used=518.59MB
>
> Under the same assumption, this uses 11628.99 MB total (10.85GB +
> 518.59MB).
> With inlining, you're using about 1.3 GB less disk space and require a

thank you for clarifying this. 10% is indeed a benefit.

> few seconds less wall-clock time for the same thing.

those were only 2 tests; i'd have to make a lot more runs
to make a qualified judgement there.
one percent difference is noise floor to me.

Ahmet

Michael Kjörling

2012-Oct-31 11:56 UTC

Re: Why btrfs inline small file by default?

On 31 Oct 2012 04:57 -0600, from cwillu@cwillu.com (cwillu):
> 9.08GB + 992.48MB*2 == 11.02GB
> 
> 10.85GB + 518MB*2 == 11.86GB
> 
> That's nearly a GB smaller.
That, too; I missed the "DUP". Not quite as pronounced as in my
calculations, then, but still a significant enough difference.

-- 
Michael Kjörling • http://michael.kjorling.se • michael@kjorling.se
                “People who think they know everything really annoy
                those of us who know we don’t.” (Bjarne Stroustrup)

Ahmet Inan

2012-Oct-31 13:27 UTC

Re: Why btrfs inline small file by default?

>> 9.08GB + 992.48MB*2 == 11.02GB
>> 10.85GB + 518MB*2 == 11.86GB
>> That's nearly a GB smaller.
> That, too; I missed the "DUP". Not quite as pronounced as in my
> calculations, then, but still a significant enough difference.

great. now we're down to 7-8%

just FYI:

ive retested with max_inline=0 but with leafsize=64K this time:
# mkfs.btrfs -l 64K -L root /dev/sda2
...
real    4m45.878s
user    1m44.730s
sys     1m7.226s

thats 2% slower for this one test (no big deal really)
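
The wall-clock percentages can be recomputed from the "real" figures quoted in this thread. A small sketch follows; the helper name parse_real and the run labels are made up for illustration, and it assumes durations in the "NmS.SSSs" form that the shell's time builtin prints.

```python
import re


def parse_real(text):
    """Convert a duration such as '4m39.253s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", text.strip())
    if match is None:
        raise ValueError("unrecognised duration: %r" % text)
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


# "real" times as reported in the thread
runs = [
    ("defaults (inlining)", "4m39.253s"),
    ("max_inline=0", "4m42.085s"),
    ("max_inline=0, leafsize=64K", "4m45.878s"),
]

baseline = parse_real(runs[0][1])
for label, real in runs:
    seconds = parse_real(real)
    slowdown = 100.0 * (seconds - baseline) / baseline
    print("%-28s %8.3fs  %+.2f%% vs defaults" % (label, seconds, slowdown))
```

With one run per configuration, differences of this size say little by themselves; repeated runs would be needed to separate them from noise.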

# btrfs filesystem show
Label: 'root'  uuid: dd2951da-2529-4320-a952-e692ea5bdbc3
        Total devices 1 FS bytes used 11.37GB
        devid    1 size 232.63GB used 13.04GB path /dev/sda2

Btrfs v0.20-rc1

# btrfs filesystem df /mnt/point/
Data: total=11.01GB, used=10.89GB
System, DUP: total=8.00MB, used=64.00KB
System: total=4.00MB, used=0.00
Metadata, DUP: total=1.00GB, used=487.94MB
Metadata: total=8.00MB, used=0.00

(1024*10.89 + 2*487.94) / 1024 = 11.84GiB

still around 7-8%

now lets see what everyday use with
max_inline=0 and leafsize=64K
feels like :)

Ahmet

Roman Mamedov

2012-Oct-31 13:44 UTC

Re: Why btrfs inline small file by default?

On Wed, 31 Oct 2012 11:56:39 +0000
Michael Kjörling <michael@kjorling.se> wrote:
> On 31 Oct 2012 04:57 -0600, from cwillu@cwillu.com (cwillu):
> > 9.08GB + 992.48MB*2 == 11.02GB
> > 
> > 10.85GB + 518MB*2 == 11.86GB
> > 
> > That's nearly a GB smaller.
> 
> That, too; I missed the "DUP". Not quite as pronounced as in my
> calculations, then, but still a significant enough difference.

There are also a number of cases which justify disabling DUP for metadata, e.g.

- the underlying block device is an internally deduplicating SSD (i.e. possibly
  most of them)
- or the block device is a RAID incorporating redundancy
- or one simply wants to increase performance at the cost of some reliability

With non-DUP metadata your calculations showing inlining being more efficient
remain correct.

-- 
With respect,
Roman

~~~~~~~~~~~~~~~~~~~~~~~~~~~
"Stallman had a printer,
with code he could not see.
So he began to tinker,
and set the software free."

ching

2012-Oct-31 21:05 UTC

Re: Why btrfs inline small file by default?

On 10/31/2012 08:18 AM, cwillu wrote:
> import os
> import sys
>
> data = "1" * 1024 * 3
>
> for x in xrange(100 * 1000):
>   with open('%s/%s' % (sys.argv[1], x), 'a') as f:
>     f.write(data)
>
> root@repository:~$ mount -o loop ~/inline /mnt
> root@repository:~$ mount -o loop,max_inline=0 ~/noninline /mnt2
>
> root@repository:~$ time python test.py /mnt
> real	0m11.105s
> user	0m1.328s
> sys	0m5.416s
> root@repository:~$ time python test.py /mnt2
> real	0m21.905s
> user	0m1.292s
> sys	0m5.460s
>
> root@repository:/$ btrfs fi df /mnt
> Data: total=1.01GB, used=256.00KB
> System, DUP: total=8.00MB, used=4.00KB
> System: total=4.00MB, used=0.00
> Metadata, DUP: total=1.00GB, used=652.70MB
> Metadata: total=8.00MB, used=0.00
>
> root@repository:/$ btrfs fi df /mnt2
> Data: total=1.01GB, used=391.12MB
> System, DUP: total=8.00MB, used=4.00KB
> System: total=4.00MB, used=0.00
> Metadata, DUP: total=1.00GB, used=60.98MB
> Metadata: total=8.00MB, used=0.00
>
> 3k data, 4k leaf: inline is twice the speed, but 1.4x bigger.
>
> ----
>
> root@repository:~$ mkfs.btrfs inline -l 64k
> root@repository:~$ mkfs.btrfs noninline -l 64k
> ...
> root@repository:~$ time python test.py /mnt
> real	0m12.244s
> user	0m1.396s
> sys	0m8.101s
> root@repository:~$ time python test.py /mnt2
> real	0m13.047s
> user	0m1.436s
> sys	0m7.772s
>
> root@repository:/$ btr\fs fi df /mnt
> Data: total=8.00MB, used=256.00KB
> System, DUP: total=8.00MB, used=64.00KB
> System: total=4.00MB, used=0.00
> Metadata, DUP: total=1.00GB, used=342.06MB
> Metadata: total=8.00MB, used=0.00
>
> root@repository:/$ btr\fs fi df /mnt2
> Data: total=1.01GB, used=391.10MB
> System, DUP: total=8.00MB, used=64.00KB
> System: total=4.00MB, used=0.00
> Metadata, DUP: total=1.00GB, used=50.06MB
> Metadata: total=8.00MB, used=0.00
>
> 3k data, 64k leaf: inline is still 10% faster, and is now 25% smaller
>
> ----
>
> data = "1" * 1024 * 32
>
> ... (mkfs, mount, etc)
>
> root@repository:~$ time python test.py /mnt
> real	0m17.834s
> user	0m1.224s
> sys	0m4.772s
> root@repository:~$ time python test.py /mnt2
> real	0m20.521s
> user	0m1.304s
> sys	0m6.344s
>
> root@repository:/$ btrfs fi df /mnt
> Data: total=4.01GB, used=3.05GB
> System, DUP: total=8.00MB, used=64.00KB
> System: total=4.00MB, used=0.00
> Metadata, DUP: total=1.00GB, used=54.00MB
> Metadata: total=8.00MB, used=0.00
>
> root@repository:/$ btrfs fi df /mnt2
> Data: total=4.01GB, used=3.05GB
> System, DUP: total=8.00MB, used=64.00KB
> System: total=4.00MB, used=0.00
> Metadata, DUP: total=1.00GB, used=53.56MB
> Metadata: total=8.00MB, used=0.00
>
> 32k data, 64k leaf: inline is still 10% faster, and is now the same
> size (not dead sure why, probably some interaction with the size of
> the actual write that happens)
>
> ----
>
> data = "1" * 1024 * 7
>
> ... etc
>
>
> root@repository:~$ time python test.py /mnt
> real	0m9.628s
> user	0m1.368s
> sys	0m4.188s
> root@repository:~$ time python test.py /mnt2
> real	0m13.455s
> user	0m1.608s
> sys	0m7.884s
>
> root@repository:/$ btrfs fi df /mnt
> Data: total=3.01GB, used=1.91GB
> System, DUP: total=8.00MB, used=64.00KB
> System: total=4.00MB, used=0.00
> Metadata, DUP: total=1.00GB, used=74.69MB
> Metadata: total=8.00MB, used=0.00
>
> root@repository:/$ btrfs fi df /mnt2
> Data: total=3.01GB, used=1.91GB
> System, DUP: total=8.00MB, used=64.00KB
> System: total=4.00MB, used=0.00
> Metadata, DUP: total=1.00GB, used=74.69MB
> Metadata: total=8.00MB, used=0.00
>
> 7k data, 64k leaf:  30% faster, same data usage.
>
> ----
>
> Are we done yet?  Can I go home now? ;p
>

thanks for the test.

but the results just indicate that inlining small files is not a "safe"
optimization to be turned on by default.
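
How the size comparison comes out depends on whether DUP metadata is counted once or twice. The sketch below recomputes the inline / no-inline footprint ratio for the four runs quoted above, using only the "used" figures from the btrfs fi df output. The names RUNS, footprint and size_ratio are made up for illustration, and "GB" is assumed to mean 1024 MB.

```python
# (label, inline (data MiB, metadata MiB), max_inline=0 (data MiB, metadata MiB))
RUNS = [
    ("3k data, 4k leaf", (0.25, 652.70), (391.12, 60.98)),
    ("3k data, 64k leaf", (0.25, 342.06), (391.10, 50.06)),
    ("32k data, 64k leaf", (3.05 * 1024, 54.00), (3.05 * 1024, 53.56)),
    ("7k data, 64k leaf", (1.91 * 1024, 74.69), (1.91 * 1024, 74.69)),
]


def footprint(data_mib, metadata_mib, copies):
    """Space used in MiB, counting each metadata block `copies` times."""
    return data_mib + copies * metadata_mib


def size_ratio(inline, no_inline, copies):
    """Footprint with inlining divided by footprint without it."""
    return footprint(*inline, copies) / footprint(*no_inline, copies)


for label, inline, no_inline in RUNS:
    once = size_ratio(inline, no_inline, 1)
    twice = size_ratio(inline, no_inline, 2)
    print("%-20s inline/no-inline: %.2f (metadata x1), %.2f (metadata x2)"
          % (label, once, twice))
```

Counting metadata once gives about 1.44 and 0.78 for the two 3k runs, close to the "1.4x bigger" and "25% smaller" summaries quoted above. Counting DUP metadata twice gives about 2.54 and 1.39, so on a DUP filesystem the inlined 3k case comes out larger with either leaf size. The 32k and 7k runs are near 1.00 either way.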


ching

2012-Oct-31 21:07 UTC

Re: Why btrfs inline small file by default?

On 10/31/2012 08:12 AM, David Sterba wrote:
> On Wed, Oct 31, 2012 at 07:47:14AM +0800, ching wrote:
>> On 10/31/2012 06:19 AM, Hugo Mills wrote:
>>> On Tue, Oct 30, 2012 at 10:14:12PM +0000, Hugo Mills wrote:
>>>>> if i have 10G small files in total, then it will consume 20G by default.
>>>>    If those small files are each 128 bytes in size, then you have
>>>> approximately 80 million of them, and they'd take up 80 million pages,
>>>> or 320 GiB of total disk space.
>>>    Sorry, to make that clear -- I meant if they were stored in Data.
>>> If they're inlined in metadata, then they'll take approximately 20 GiB
>>> as you claim, which is a lot less than the 320 GiB they'd be if
>>> they're not.
>>>
>>>
>> is it the same for:
>> 1. 3k per file with leaf size=4K
>> 2. 60k per file with leaf size=64k
> The inline limit is minimum of
> * 'max_inline' (8k by default)
> * PAGE_SIZE
> * leafsize - header
>
> so 60k files for 64k leaves will not get inlined, unless you have a
> system with 64k pages.
>
thank you very much for your clear explanation :)

this is the first time i heard about this.
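
The rule quoted above can be written down as a small Python sketch. Several things here are assumptions rather than facts from the thread: the function names, the 101-byte header size (the mail only says "leafsize - header"), the 4k default page size, and the use of "<=" for the comparison.

```python
# ASSUMPTION: illustrative header size in bytes; the thread gives no number.
ASSUMED_HEADER_BYTES = 101


def inline_limit(leaf_size, page_size=4096, max_inline=8192,
                 header_bytes=ASSUMED_HEADER_BYTES):
    """Largest file size in bytes that could be inlined under the quoted rule:
    the minimum of max_inline, PAGE_SIZE and leafsize - header."""
    return min(max_inline, page_size, leaf_size - header_bytes)


def would_inline(file_size, **limits):
    """True if a file of file_size bytes fits under the inline limit."""
    return file_size <= inline_limit(**limits)


cases = [
    ("3k file, 4k leaf, 4k page", 3 * 1024, dict(leaf_size=4096)),
    ("60k file, 64k leaf, 4k page", 60 * 1024, dict(leaf_size=65536)),
    ("60k file, 64k leaf, 64k page", 60 * 1024,
     dict(leaf_size=65536, page_size=65536)),
    ("60k file, 64k leaf, 64k page, max_inline=64k", 60 * 1024,
     dict(leaf_size=65536, page_size=65536, max_inline=65536)),
]

for label, size, limits in cases:
    print("%-46s limit=%6d  inlined=%s"
          % (label, inline_limit(**limits), would_inline(size, **limits)))
```

Read literally, the rule suggests that a 60k file needs max_inline raised above its 8k default as well as 64k pages. It would also mean that the 7k and 32k payloads in the benchmark quoted earlier in this thread were too large to inline if that machine used 4k pages, which would be consistent with the identical btrfs fi df output for those runs.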

ching