Hi all,

I have been testing my btrfs root partition with "max_inline=0" and a 64k leaf size for weeks, and it seems to be fine.

AFAIK btrfs inlines small files into metadata by default. I am curious why.

If there are only a few small files, then there will be neither effect nor benefit at all.
If there are a lot of small files, then the size of the metadata will be undesirable due to deduplication.

There are also some email threads related to problems with metadata inlining (I don't know whether they are fixed in recent kernels):
http://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg16295.html
http://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg05265.html

How about turning off inline so that btrfs works better "out of the box"?

ching
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

Hi ching!

On 30.10.2012 12:04, ching wrote:
> Hi all,
>
> I am testing my btrfs root partition with "max_inline=0", and 64k leaf size for weeks and it seems that it is fine.
>
> AFAIK btrfs inline small files into metadata by default, I am curious why?
>
> If there is only a few small files, then there will be neither effect nor benefit at all

I disagree on this point. There will be a (probably huge) benefit in terms of performance. If the file data is inlined, you have a good chance (especially with large leaf sizes, although even then it is not guaranteed) that the data is in the same leaf as the inode element. So you already have the file data, as you always access complete leaves. If you stored the data in extents, you would need to read the respective extent, which can be anywhere on the disk. That is, you most likely need to move the head. This brings overhead (especially with small files, as the read itself is fast compared to the time needed to position the head).

> If there is a lot of small files, then the size of metadata will be undesirable due to deduplication

Yes, that is a fact, but whether it really matters depends on the use case (e.g., the ratio of small files to large files, ...). But as btrfs is designed explicitly as a general-purpose file system, you usually want the good performance rather than the better disk usage (especially as disk space isn't expensive anymore).

> there are also some email threads related to problem of metadata inline (i don't know whether they are fixed in recent kernel):
> http://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg16295.html
> http://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg05265.html
>
> How about turning off inline so that btrfs works better "out of the box"?
>
> ching

Regards,
Felix

On Tue, Oct 30, 2012 at 6:04 AM, ching <lsching17@gmail.com> wrote:
> Hi all,
>
> I am testing my btrfs root partition with "max_inline=0", and 64k leaf size for weeks and it seems that it is fine.
>
> AFAIK btrfs inline small files into metadata by default, I am curious why?
>
> If there is only a few small files, then there will be neither effect nor benefit at all
> If there is a lot of small files, then the size of metadata will be undesirable due to deduplication
>
> there are also some email threads related to problem of metadata inline (i don't know whether they are fixed in recent kernel):
> http://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg16295.html
> http://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg05265.html
>
> How about turning off inline so that btrfs works better "out of the box"?
>
> ching

I did some rough benchmarking around this a few weeks ago. I'll try to clean up my method and post the results. I was working with multiple copies and rsyncs of kernel sources, which have many candidate files for inlining. To my surprise, my btrfs benchmarks were always the same or faster when I let btrfs inline the files, even though metadata was much larger.


>> If there is a lot of small files, then the size of metadata will be
>> undesirable due to deduplication
>
> Yes, that is a fact, but if that really matters depends on the use-case
> (e.g., the small files to large files ratio, ...). But as btrfs is designed
> explicitly as a general purpose file system, you usually want the good
> performance instead of the better disk-usage (especially as disk space isn't
> expensive anymore).

As I understand it, in basically all cases the total storage used by inlining will be _smaller_, as the allocation doesn't need to be aligned to the sector size.

On Tue, Oct 30, 2012 at 07:04:59PM +0800, ching wrote:
> I am testing my btrfs root partition with "max_inline=0", and 64k leaf
> size for weeks and it seems that it is fine.

Related to inlining itself: ext4 and xfs are receiving inline data support, so it would make sense to introduce a per-file attribute, e.g. FS_NOINLINE_FL, that would get set on directories and be inherited by newly created files, with the expected outcome.

One of the reasons is to give the user a way to avoid potential performance hits, e.g. when a file frequently switches from inline to non-inline (provided that this is known to happen).

Nice to have IMHO, but more evaluation of real-world use cases where it hurts would be desirable.

david

On 10/30/2012 08:04 PM, Felix Pepinghege wrote:
> Hi ching!
>
> On 30.10.2012 12:04, ching wrote:
>> Hi all,
>>
>> I am testing my btrfs root partition with "max_inline=0", and 64k leaf size for weeks and it seems that it is fine.
>>
>> AFAIK btrfs inline small files into metadata by default, I am curious why?
>>
>> If there is only a few small files, then there will be neither effect nor benefit at all
>
> I disagree in this point. There will be a (probably huge) benefit in terms of performance. If the file data is inlined, you have a good chance (especially with large leaf sizes, although even then it is not guaranteed) that the data is in the same leaf as the inode element. So you already have the file data as you always access complete leafs. If you would store the data in extents, you would need to read the respective extent, which can be anywhere on the disk. That is, you most likely need to move the head. This brings overhead (especially with small files, as the reading process is relatively fast compared to the time you need to position the head).

You may be correct, but I doubt that inline data brings a performance benefit. Bigger metadata always means more trouble for concurrency, performance, and the cache miss ratio.

>> If there is a lot of small files, then the size of metadata will be undesirable due to deduplication
>
> Yes, that is a fact, but if that really matters depends on the use-case (e.g., the small files to large files ratio, ...). But as btrfs is designed explicitly as a general purpose file system, you usually want the good performance instead of the better disk-usage (especially as disk space isn't expensive anymore).

Yes, but for a general-purpose filesystem, I guess the default behaviour should be "safe"? Not many users are patient enough to troubleshoot a metadata "explosion".

On 10/30/2012 08:17 PM, cwillu wrote:
>>> If there is a lot of small files, then the size of metadata will be
>>> undesirable due to deduplication
>>
>> Yes, that is a fact, but if that really matters depends on the use-case
>> (e.g., the small files to large files ratio, ...). But as btrfs is designed
>> explicitly as a general purpose file system, you usually want the good
>> performance instead of the better disk-usage (especially as disk space isn't
>> expensive anymore).
> As I understand it, in basically all cases the total storage used by
> inlining will be _smaller_, as the allocation doesn't need to be
> aligned to the sector size.

If I have 10G of small files in total, then they will consume 20G by default.

ching

On Wed, Oct 31, 2012 at 05:40:25AM +0800, ching wrote:
> On 10/30/2012 08:17 PM, cwillu wrote:
> >>> If there is a lot of small files, then the size of metadata will be
> >>> undesirable due to deduplication
> >>
> >> Yes, that is a fact, but if that really matters depends on the use-case
> >> (e.g., the small files to large files ratio, ...). But as btrfs is designed
> >> explicitly as a general purpose file system, you usually want the good
> >> performance instead of the better disk-usage (especially as disk space isn't
> >> expensive anymore).
> > As I understand it, in basically all cases the total storage used by
> > inlining will be _smaller_, as the allocation doesn't need to be
> > aligned to the sector size.
>
> if i have 10G small files in total, then it will consume 20G by default.

If those small files are each 128 bytes in size, then you have approximately 80 million of them, and they'd take up 80 million pages, or 320 GiB of total disk space.

Hugo.

--
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ==
PGP key: 515C238D from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
--- I always felt that as a C programmer, I ---
was becoming typecast.

On Tue, Oct 30, 2012 at 3:40 PM, ching <lsching17@gmail.com> wrote:
> On 10/30/2012 08:17 PM, cwillu wrote:
>> As I understand it, in basically all cases the total storage used by
>> inlining will be _smaller_, as the allocation doesn't need to be
>> aligned to the sector size.
>
> if i have 10G small files in total, then it will consume 20G by default.
>
> ching

No. No they will not. As I already explained.

root@repository:/mnt$ mount ~/inline /mnt -o loop
root@repository:/mnt$ mount ~/inline /mnt2 -o loop,max_inline=0

root@repository:/mnt$ mount
/dev/loop0 on /mnt type btrfs (rw)
/dev/loop1 on /mnt2 type btrfs (rw,max_inline=0)

root@repository:/mnt$ time for x in {1..243854}; do echo "some stuff" > /mnt/$x; done
real    1m5.447s
user    0m38.422s
sys     0m18.493s

root@repository:/mnt$ time for x in {1..243854}; do echo "some stuff" > /mnt2/$x; done
real    1m49.880s
user    0m40.379s
sys     0m26.210s

root@repository:/mnt$ df /mnt /mnt2
Filesystem     1K-blocks    Used Available Use% Mounted on
/dev/loop0      10485760  266952   8359680   4% /mnt
/dev/loop1      10485760 1311620   7384236  16% /mnt2

root@repository:/mnt$ btrfs fi df /mnt
Data: total=1.01GB, used=256.00KB
System, DUP: total=8.00MB, used=4.00KB
System: total=4.00MB, used=0.00
Metadata, DUP: total=1.00GB, used=130.22MB
Metadata: total=8.00MB, used=0.00

root@repository:/mnt$ btrfs fi df /mnt2
Data: total=2.01GB, used=953.05MB
System, DUP: total=8.00MB, used=4.00KB
System: total=4.00MB, used=0.00
Metadata, DUP: total=1.00GB, used=164.03MB
Metadata: total=8.00MB, used=0.00

root@repository:/mnt$ btrfs fi show
Label: none  uuid: e5440337-9f44-4b2d-9889-80ab0ab8f245
        Total devices 1 FS bytes used 130.47MB
        devid    1 size 10.00GB used 3.04GB path /dev/loop0

Label: none  uuid: cfcc4149-3102-465d-89b8-0a6bb6a4749a
        Total devices 1 FS bytes used 1.09GB
        devid    1 size 10.00GB used 4.04GB path /dev/loop1

Btrfs Btrfs v0.19

Any questions?

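A rough cross-check of the figures above, as a sketch: sum the used= values from each "btrfs fi df" listing, counting lines tagged DUP twice, and compare the result with the Used column of df. The used_kib helper, the regular expression, and the reading of KB/MB/GB as powers of 1024 are assumptions made for illustration; none of them come from btrfs-progs.

```python
import re

# Assumption: the tool prints binary units even though it labels them KB/MB/GB.
UNIT = {"": 1, "KB": 1024, "MB": 1024 ** 2, "GB": 1024 ** 3}

# Matches lines such as "Metadata, DUP: total=1.00GB, used=130.22MB".
LINE = re.compile(r"^(\w+)(?:, (\w+))?: total=[\d.]+\w*, used=([\d.]+)(\w*)$")


def used_kib(fi_df_output):
    """Sum the used= figures in KiB, counting DUP block groups twice."""
    total_bytes = 0.0
    for line in fi_df_output.strip().splitlines():
        match = LINE.match(line.strip())
        if match is None:
            continue  # skip anything that is not a block-group line
        _kind, profile, value, unit = match.groups()
        copies = 2 if profile == "DUP" else 1
        total_bytes += float(value) * UNIT[unit] * copies
    return total_bytes / 1024


INLINE_FI_DF = """
Data: total=1.01GB, used=256.00KB
System, DUP: total=8.00MB, used=4.00KB
System: total=4.00MB, used=0.00
Metadata, DUP: total=1.00GB, used=130.22MB
Metadata: total=8.00MB, used=0.00
"""

NOINLINE_FI_DF = """
Data: total=2.01GB, used=953.05MB
System, DUP: total=8.00MB, used=4.00KB
System: total=4.00MB, used=0.00
Metadata, DUP: total=1.00GB, used=164.03MB
Metadata: total=8.00MB, used=0.00
"""

inline_kib = used_kib(INLINE_FI_DF)
noinline_kib = used_kib(NOINLINE_FI_DF)
print("inline:    ", round(inline_kib), "KiB (df reported 266952)")
print("max_inline=0:", round(noinline_kib), "KiB (df reported 1311620)")
```

Under those assumptions both sums land within a few hundred KiB of what df reports, which suggests that the df Used column already includes both copies of the DUP metadata.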
On Tue, Oct 30, 2012 at 10:14:12PM +0000, Hugo Mills wrote:
> On Wed, Oct 31, 2012 at 05:40:25AM +0800, ching wrote:
> > On 10/30/2012 08:17 PM, cwillu wrote:
> > > As I understand it, in basically all cases the total storage used by
> > > inlining will be _smaller_, as the allocation doesn't need to be
> > > aligned to the sector size.
> >
> > if i have 10G small files in total, then it will consume 20G by default.
>
> If those small files are each 128 bytes in size, then you have
> approximately 80 million of them, and they'd take up 80 million pages,
> or 320 GiB of total disk space.

Sorry, to make that clear -- I meant if they were stored in Data. If they're inlined in metadata, then they'll take approximately 20 GiB as you claim, which is a lot less than the 320 GiB they'd be if they're not.

Hugo.

--
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ==
PGP key: 515C238D from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
--- I always felt that as a C programmer, I ---
was becoming typecast.

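The arithmetic in the message above can be reproduced in a few lines of Python. This is only a sketch: the 4 KiB page size is an assumption (it is what "80 million pages, or 320 GiB" implies), and the variable names are illustrative.

```python
GIB = 1024 ** 3

total_payload = 10 * GIB   # "10G small files in total"
file_size = 128            # bytes per file, as in the example above
page_size = 4096           # assumption: 4 KiB pages

n_files = total_payload // file_size
# Stored as separate data extents, every file occupies at least one full page.
space_as_extents = n_files * page_size

print(n_files)                  # 83886080, i.e. "approximately 80 million"
print(space_as_extents / GIB)   # 320.0
```

The 20 GiB figure for the inlined case is taken from the thread as stated and is not derived here.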
On 10/31/2012 06:16 AM, cwillu wrote:
> On Tue, Oct 30, 2012 at 3:40 PM, ching <lsching17@gmail.com> wrote:
>> if i have 10G small files in total, then it will consume 20G by default.
>>
>> ching
> No. No they will not. As I already explained.
>
> [...]
>
> Any questions?

Can the test be repeated for:

1. 3k per file with leaf size=4K
2. 60k per file with leaf size=64k

On 10/31/2012 06:19 AM, Hugo Mills wrote:
> On Tue, Oct 30, 2012 at 10:14:12PM +0000, Hugo Mills wrote:
>> On Wed, Oct 31, 2012 at 05:40:25AM +0800, ching wrote:
>>> if i have 10G small files in total, then it will consume 20G by default.
>> If those small files are each 128 bytes in size, then you have
>> approximately 80 million of them, and they'd take up 80 million pages,
>> or 320 GiB of total disk space.
> Sorry, to make that clear -- I meant if they were stored in Data.
> If they're inlined in metadata, then they'll take approximately 20 GiB
> as you claim, which is a lot less than the 320 GiB they'd be if
> they're not.
>
> Hugo.

Is it the same for:

1. 3k per file with leaf size=4K
2. 60k per file with leaf size=64k

On Wed, Oct 31, 2012 at 07:47:14AM +0800, ching wrote:
> On 10/31/2012 06:19 AM, Hugo Mills wrote:
> > On Tue, Oct 30, 2012 at 10:14:12PM +0000, Hugo Mills wrote:
> >>> if i have 10G small files in total, then it will consume 20G by default.
> >> If those small files are each 128 bytes in size, then you have
> >> approximately 80 million of them, and they'd take up 80 million pages,
> >> or 320 GiB of total disk space.
> > Sorry, to make that clear -- I meant if they were stored in Data.
> > If they're inlined in metadata, then they'll take approximately 20 GiB
> > as you claim, which is a lot less than the 320 GiB they'd be if
> > they're not.
>
> is it the same for:
> 1. 3k per file with leaf size=4K
> 2. 60k per file with leaf size=64k

The inline limit is the minimum of

* 'max_inline' (8k by default)
* PAGE_SIZE
* leafsize - header

so 60k files for 64k leaves will not get inlined, unless you have a system with 64k pages.

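The three-way minimum described above can be written out as a small sketch. Everything here is illustrative rather than kernel code: the function names are made up, the 101-byte header is a stand-in value, and treating "size <= limit" as the inlining condition is an assumption.

```python
def inline_limit(max_inline=8192, page_size=4096, leafsize=4096, header=101):
    """Largest file size (bytes) that could be inlined under the rule above.

    header=101 is a placeholder for the per-leaf header overhead (assumption).
    """
    return min(max_inline, page_size, leafsize - header)


def would_inline(file_size, **params):
    # Assumption: a file is inlined when it fits within the limit.
    return file_size <= inline_limit(**params)


# Case 1 from the question: 3k files, 4k leaves, 4k pages, default max_inline.
case_3k = would_inline(3 * 1024, leafsize=4096)

# Case 2 from the question: 60k files, 64k leaves, but still 4k pages.
case_60k = would_inline(60 * 1024, leafsize=65536)

# Same 60k files on a hypothetical 64k-page system with max_inline raised to 64k.
case_60k_big_pages = would_inline(
    60 * 1024, max_inline=65536, page_size=65536, leafsize=65536
)

print(case_3k, case_60k, case_60k_big_pages)
```

With the defaults shown, PAGE_SIZE (or the leaf size minus its header) is the binding term on a 4k-page machine, so raising the leaf size alone does not raise the limit. Note that under this formula the default max_inline of 8k would also have to be raised before a 60k file could be inlined on a 64k-page system.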
On Tue, Oct 30, 2012 at 5:47 PM, ching <lsching17@gmail.com> wrote:
> is it the same for:
> 1. 3k per file with leaf size=4K
> 2. 60k per file with leaf size=64k

    import os
    import sys

    data = "1" * 1024 * 3

    for x in xrange(100 * 1000):
        with open('%s/%s' % (sys.argv[1], x), 'a') as f:
            f.write(data)

root@repository:~$ mount -o loop ~/inline /mnt
root@repository:~$ mount -o loop,max_inline=0 ~/noninline /mnt2

root@repository:~$ time python test.py /mnt
real    0m11.105s
user    0m1.328s
sys     0m5.416s

root@repository:~$ time python test.py /mnt2
real    0m21.905s
user    0m1.292s
sys     0m5.460s

root@repository:/$ btrfs fi df /mnt
Data: total=1.01GB, used=256.00KB
System, DUP: total=8.00MB, used=4.00KB
System: total=4.00MB, used=0.00
Metadata, DUP: total=1.00GB, used=652.70MB
Metadata: total=8.00MB, used=0.00

root@repository:/$ btrfs fi df /mnt2
Data: total=1.01GB, used=391.12MB
System, DUP: total=8.00MB, used=4.00KB
System: total=4.00MB, used=0.00
Metadata, DUP: total=1.00GB, used=60.98MB
Metadata: total=8.00MB, used=0.00

3k data, 4k leaf: inline is twice the speed, but 1.4x bigger.

----

root@repository:~$ mkfs.btrfs inline -l 64k
root@repository:~$ mkfs.btrfs noninline -l 64k
...

root@repository:~$ time python test.py /mnt
real    0m12.244s
user    0m1.396s
sys     0m8.101s

root@repository:~$ time python test.py /mnt2
real    0m13.047s
user    0m1.436s
sys     0m7.772s

root@repository:/$ btrfs fi df /mnt
Data: total=8.00MB, used=256.00KB
System, DUP: total=8.00MB, used=64.00KB
System: total=4.00MB, used=0.00
Metadata, DUP: total=1.00GB, used=342.06MB
Metadata: total=8.00MB, used=0.00

root@repository:/$ btrfs fi df /mnt2
Data: total=1.01GB, used=391.10MB
System, DUP: total=8.00MB, used=64.00KB
System: total=4.00MB, used=0.00
Metadata, DUP: total=1.00GB, used=50.06MB
Metadata: total=8.00MB, used=0.00

3k data, 64k leaf: inline is still 10% faster, and is now 25% smaller

----

    data = "1" * 1024 * 32

... (mkfs, mount, etc)

root@repository:~$ time python test.py /mnt
real    0m17.834s
user    0m1.224s
sys     0m4.772s

root@repository:~$ time python test.py /mnt2
real    0m20.521s
user    0m1.304s
sys     0m6.344s

root@repository:/$ btrfs fi df /mnt
Data: total=4.01GB, used=3.05GB
System, DUP: total=8.00MB, used=64.00KB
System: total=4.00MB, used=0.00
Metadata, DUP: total=1.00GB, used=54.00MB
Metadata: total=8.00MB, used=0.00

root@repository:/$ btrfs fi df /mnt2
Data: total=4.01GB, used=3.05GB
System, DUP: total=8.00MB, used=64.00KB
System: total=4.00MB, used=0.00
Metadata, DUP: total=1.00GB, used=53.56MB
Metadata: total=8.00MB, used=0.00

32k data, 64k leaf: inline is still 10% faster, and is now the same size (not dead sure why, probably some interaction with the size of the actual write that happens)

----

    data = "1" * 1024 * 7

... etc

root@repository:~$ time python test.py /mnt
real    0m9.628s
user    0m1.368s
sys     0m4.188s

root@repository:~$ time python test.py /mnt2
real    0m13.455s
user    0m1.608s
sys     0m7.884s

root@repository:/$ btrfs fi df /mnt
Data: total=3.01GB, used=1.91GB
System, DUP: total=8.00MB, used=64.00KB
System: total=4.00MB, used=0.00
Metadata, DUP: total=1.00GB, used=74.69MB
Metadata: total=8.00MB, used=0.00

root@repository:/$ btrfs fi df /mnt2
Data: total=3.01GB, used=1.91GB
System, DUP: total=8.00MB, used=64.00KB
System: total=4.00MB, used=0.00
Metadata, DUP: total=1.00GB, used=74.69MB
Metadata: total=8.00MB, used=0.00

7k data, 64k leaf: 30% faster, same data usage.

----

Are we done yet? Can I go home now? ;p

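The "1.4x bigger" and "25% smaller" summaries for the two 3k runs roughly match a sum that counts each metadata figure once. Because the metadata lines are tagged DUP, counting them twice gives a different picture. The sketch below shows both conventions side by side, using only the used= figures from the output above; the tiny System chunks are ignored, and the names are illustrative.

```python
# (data_used_mb, metadata_used_mb) per mount, copied from the btrfs fi df output.
RUNS = {
    "3k data, 4k leaf": {"inline": (0.25, 652.70), "noinline": (391.12, 60.98)},
    "3k data, 64k leaf": {"inline": (0.25, 342.06), "noinline": (391.10, 50.06)},
}


def total_mb(data_mb, meta_mb, meta_copies):
    """Data plus metadata, with metadata counted meta_copies times."""
    return data_mb + meta_copies * meta_mb


def inline_ratio(run, meta_copies):
    """Space used with inlining divided by space used with max_inline=0."""
    inline = total_mb(*RUNS[run]["inline"], meta_copies)
    noinline = total_mb(*RUNS[run]["noinline"], meta_copies)
    return inline / noinline


for run in RUNS:
    once = inline_ratio(run, meta_copies=1)
    twice = inline_ratio(run, meta_copies=2)
    print(f"{run}: inline/noinline = {once:.2f} (metadata x1), {twice:.2f} (metadata x2)")
```

Which convention is the right one depends on whether the goal is to compare logical usage or raw disk consumption; the sketch takes no position on that, it only makes the difference visible.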
I also don't see any benefit from inlining small files.

This example is me doing a fully fledged prebuilt gentoo system installation on a fresh HDD from a squashfs image on a usb key in under 5 minutes.

With defaults (inlining small files):

# mount -o noatime,compress=lzo /dev/sda2 /mnt/point
# time unsquashfs -f -d /mnt/point/ /dev/sdb2
real    4m39.253s
user    1m37.854s
sys     1m1.433s

# btrfs filesystem show
Label: 'root'  uuid: 6fb7104d-8f4a-4f3a-aff8-fdad0a39f569
        Total devices 1 FS bytes used 10.05GB
        devid    1 size 232.63GB used 14.04GB path /dev/sda2

Btrfs v0.20-rc1

# btrfs filesystem df /mnt/point/
Data: total=10.01GB, used=9.08GB
System, DUP: total=8.00MB, used=4.00KB
System: total=4.00MB, used=0.00
Metadata, DUP: total=2.00GB, used=992.48MB
Metadata: total=8.00MB, used=0.00

Without inline:

# mount -o max_inline=0,noatime,compress=lzo /dev/sda2 /mnt/point
# time unsquashfs -f -d /mnt/point/ /dev/sdb2
real    4m42.085s
user    1m36.894s
sys     1m1.583s

# btrfs filesystem show
failed to read /dev/sr0
Label: 'root'  uuid: 97ad3c97-ab03-4197-86d3-72869604b368
        Total devices 1 FS bytes used 11.36GB
        devid    1 size 232.63GB used 13.04GB path /dev/sda2

Btrfs v0.20-rc1

# btrfs filesystem df /mnt/point/
Data: total=11.01GB, used=10.85GB
System, DUP: total=8.00MB, used=4.00KB
System: total=4.00MB, used=0.00
Metadata, DUP: total=1.00GB, used=518.59MB
Metadata: total=8.00MB, used=0.00

I will test "no inlining" for some more time though.

Ahmet

On Wed, Oct 31, 2012 at 2:48 AM, Ahmet Inan <ainan@mathematik.uni-freiburg.de> wrote:
> i also dont see any benefit from inlining small files:

> with defaults (inlining small files):
> real 4m39.253s
> Data: total=10.01GB, used=9.08GB
> Metadata, DUP: total=2.00GB, used=992.48MB

> without inline:
> real 4m42.085s
> Data: total=11.01GB, used=10.85GB
> Metadata, DUP: total=1.00GB, used=518.59MB

I suggest you take a closer look at your numbers.


>> i also dont see any benefit from inlining small files:

>> with defaults (inlining small files):
>> real 4m39.253s
>> Data: total=10.01GB, used=9.08GB
>> Metadata, DUP: total=2.00GB, used=992.48MB

>> without inline:
>> real 4m42.085s
>> Data: total=11.01GB, used=10.85GB
>> Metadata, DUP: total=1.00GB, used=518.59MB
>
> I suggest you take a closer look at your numbers.

Both use 12GiB in total and both need 280 seconds. Am I missing something?

Ahmet

On 31 Oct 2012 11:48 +0100, from ainan@mathematik.uni-freiburg.de (Ahmet Inan):
>>> i also dont see any benefit from inlining small files:
>
>>> with defaults (inlining small files):
>>> real 4m39.253s
>>> Data: total=10.01GB, used=9.08GB
>>> Metadata, DUP: total=2.00GB, used=992.48MB

This uses 10290.40 MB total, if we pad with zeroes (9.08GB plus 992.48MB).

>>> without inline:
>>> real 4m42.085s
>>> Data: total=11.01GB, used=10.85GB
>>> Metadata, DUP: total=1.00GB, used=518.59MB

Under the same assumption, this uses 11628.99 MB total (10.85GB + 518.59MB).

>> I suggest you take a closer look at your numbers.
>
> both use 12GiB in total and both need 280 seconds.
> am i missing something?

With inlining, you're using about 1.3 GB less disk space and require a few seconds less wall-clock time for the same thing.

A 10% difference in storage space requirement does not seem like "no benefit" to me, and both sets of numbers favor the default (with inlining).

--
Michael Kjörling • http://michael.kjorling.se • michael@kjorling.se
“People who think they know everything really annoy those of us who know we don’t.” (Bjarne Stroustrup)

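The sums above count each metadata figure once. Since the quoted metadata lines are tagged DUP, a second reasonable convention is to count them twice. A small sketch of both, using only the figures quoted above and assuming 1 GB = 1024 MB as the message does (function and variable names are illustrative):

```python
MB_PER_GB = 1024  # same conversion the message above uses


def total_mb(data_gb, meta_mb, meta_copies=1):
    """Data plus metadata in MB, with metadata counted meta_copies times."""
    return data_gb * MB_PER_GB + meta_copies * meta_mb


# (data used in GB, metadata used in MB) from the two btrfs filesystem df runs.
with_inline = (9.08, 992.48)
without_inline = (10.85, 518.59)

# Metadata counted once, as in the message above.
once_inline = round(total_mb(*with_inline), 2)
once_noinline = round(total_mb(*without_inline), 2)

# Metadata counted twice, to account for the DUP profile.
twice_inline_gb = round(total_mb(*with_inline, meta_copies=2) / MB_PER_GB, 2)
twice_noinline_gb = round(total_mb(*without_inline, meta_copies=2) / MB_PER_GB, 2)

print(once_inline, once_noinline)          # 10290.4 11628.99
print(twice_inline_gb, twice_noinline_gb)  # 11.02 11.86
```

Either way the default (inlining) run uses less space; the gap narrows from roughly 1.3 GB to roughly 0.85 GB when the duplicated metadata is counted in full.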
On Wed, Oct 31, 2012 at 4:48 AM, Ahmet Inan <ainan@mathematik.uni-freiburg.de> wrote:>>> i also dont see any benefit from inlining small files: > >>> with defaults (inlining small files): >>> real 4m39.253s >>> Data: total=10.01GB, used=9.08GB >>> Metadata, DUP: total=2.00GB, used=992.48MB > >>> without inline: >>> real 4m42.085s >>> Data: total=11.01GB, used=10.85GB >>> Metadata, DUP: total=1.00GB, used=518.59MB >> >> I suggest you take a closer look at your numbers. > > both use 12GiB in total and both need 280 seconds. > am i missing something?9.08GB + 992.48MB*2 == 11.02GB 10.85GB + 518MB*2 == 11.86GB That''s nearly a GB smaller. -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
>>>> with defaults (inlining small files):
>>>> real 4m39.253s
>>>> Data: total=10.01GB, used=9.08GB
>>>> Metadata, DUP: total=2.00GB, used=992.48MB
>
> This uses 10290.40 MB total, if we pad with zeroes (9.08GB plus
> 992.48MB).

>>>> without inline:
>>>> real 4m42.085s
>>>> Data: total=11.01GB, used=10.85GB
>>>> Metadata, DUP: total=1.00GB, used=518.59MB
>
> Under the same assumption, this uses 11628.99 MB total (10.85GB +
> 518.59MB).

> With inlining, you're using about 1.3 GB less disk space and require a

thank you for clarifying this. 10% is indeed a benefit.

> few seconds less wall-clock time for the same thing.

those were only 2 tests, have to make a lot more runs to make a qualified judgement there. one percent difference is noise floor to me.

Ahmet
On 31 Oct 2012 04:57 -0600, from cwillu@cwillu.com (cwillu):
> 9.08GB + 992.48MB*2 == 11.02GB
>
> 10.85GB + 518MB*2 == 11.86GB
>
> That's nearly a GB smaller.

That, too; I missed the "DUP". Not quite as pronounced as in my calculations, then, but still a significant enough difference.

-- 
Michael Kjörling • http://michael.kjorling.se • michael@kjorling.se
>> 9.08GB + 992.48MB*2 == 11.02GB
>> 10.85GB + 518MB*2 == 11.86GB
>> That's nearly a GB smaller.

> That, too; I missed the "DUP". Not quite as pronounced as in my
> calculations, then, but still a significant enough difference.

great. now were down to 7-8%

just FYI: ive retested with max_inline=0 but with leafsize=64K this time:

# mkfs.btrfs -l 64K -L root /dev/sda2
...
real 4m45.878s
user 1m44.730s
sys 1m7.226s

thats 2% slower for this one test (no big deal really)

# btrfs filesystem show
Label: 'root' uuid: dd2951da-2529-4320-a952-e692ea5bdbc3
Total devices 1 FS bytes used 11.37GB
devid 1 size 232.63GB used 13.04GB path /dev/sda2

Btrfs v0.20-rc1

# btrfs filesystem df /mnt/point/
Data: total=11.01GB, used=10.89GB
System, DUP: total=8.00MB, used=64.00KB
System: total=4.00MB, used=0.00
Metadata, DUP: total=1.00GB, used=487.94MB
Metadata: total=8.00MB, used=0.00

(1024*10.89 + 2*487.94) / 1024 = 11.84GiB

still around 7-8%

now lets see what everyday use with max_inline=0 and leafsize=64K feels like :)

Ahmet
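The same total can be derived mechanically from the `btrfs filesystem df` text instead of by hand. The following is only a sketch: the regular expression is written against the output format quoted in this thread (Btrfs v0.20-rc1) and may not match other versions, the names are illustrative, and it assumes KB/MB/GB are 1024-based and that a value without a unit is in bytes. Every "DUP" line is counted twice.

```python
import re

# Assumption: the tool's KB/MB/GB are binary units; a bare number means bytes.
UNIT_TO_MB = {"KB": 1 / 1024, "MB": 1.0, "GB": 1024.0}
BYTES_TO_MB = 1 / (1024 * 1024)

# Matches lines such as "Metadata, DUP: total=1.00GB, used=487.94MB".
LINE_RE = re.compile(
    r"^(?P<kind>\w+)(?:, (?P<profile>\w+))?: "
    r"total=[\d.]+(?:[KMG]B)?, "
    r"used=(?P<used>[\d.]+)(?P<unit>[KMG]B)?$"
)


def raw_used_mb(df_output):
    """Sum the 'used' figures, doubling every block group with the DUP profile."""
    total_mb = 0.0
    for line in df_output.strip().splitlines():
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # ignore anything that is not a usage line
        factor = UNIT_TO_MB.get(match.group("unit"), BYTES_TO_MB)
        copies = 2 if match.group("profile") == "DUP" else 1
        total_mb += copies * float(match.group("used")) * factor
    return total_mb


SAMPLE = """
Data: total=11.01GB, used=10.89GB
System, DUP: total=8.00MB, used=64.00KB
System: total=4.00MB, used=0.00
Metadata, DUP: total=1.00GB, used=487.94MB
Metadata: total=8.00MB, used=0.00
"""

print("%.2f GB raw" % (raw_used_mb(SAMPLE) / 1024))
```

Unlike the hand calculation, this also counts the System block group, which is too small to change the rounded result here.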
On Wed, 31 Oct 2012 11:56:39 +0000
Michael Kjörling <michael@kjorling.se> wrote:

> On 31 Oct 2012 04:57 -0600, from cwillu@cwillu.com (cwillu):
> > 9.08GB + 992.48MB*2 == 11.02GB
> >
> > 10.85GB + 518MB*2 == 11.86GB
> >
> > That's nearly a GB smaller.
>
> That, too; I missed the "DUP". Not quite as pronounced as in my
> calculations, then, but still a significant enough difference.

There are also a number of cases which justify disabling DUP for metadata, e.g.

- the underlying block device is an internally deduplicating SSD (i.e. possibly most of them)
- or the block device is a RAID incorporating redundancy
- or one simply wants to increase performance at the cost of some reliability

With non-DUP metadata your calculations showing inlining being more efficient remain correct.

-- 
With respect,
Roman

~~~~~~~~~~~~~~~~~~~~~~~~~~~
"Stallman had a printer,
with code he could not see.
So he began to tinker,
and set the software free."
On 10/31/2012 08:18 AM, cwillu wrote:
> import os
> import sys
>
> data = "1" * 1024 * 3
>
> for x in xrange(100 * 1000):
>   with open('%s/%s' % (sys.argv[1], x), 'a') as f:
>     f.write(data)
>
> root@repository:~$ mount -o loop ~/inline /mnt
> root@repository:~$ mount -o loop,max_inline=0 ~/noninline /mnt2
>
> root@repository:~$ time python test.py /mnt
> real 0m11.105s
> user 0m1.328s
> sys 0m5.416s
> root@repository:~$ time python test.py /mnt2
> real 0m21.905s
> user 0m1.292s
> sys 0m5.460s
>
> root@repository:/$ btrfs fi df /mnt
> Data: total=1.01GB, used=256.00KB
> System, DUP: total=8.00MB, used=4.00KB
> System: total=4.00MB, used=0.00
> Metadata, DUP: total=1.00GB, used=652.70MB
> Metadata: total=8.00MB, used=0.00
>
> root@repository:/$ btrfs fi df /mnt2
> Data: total=1.01GB, used=391.12MB
> System, DUP: total=8.00MB, used=4.00KB
> System: total=4.00MB, used=0.00
> Metadata, DUP: total=1.00GB, used=60.98MB
> Metadata: total=8.00MB, used=0.00
>
> 3k data, 4k leaf: inline is twice the speed, but 1.4x bigger.
>
> ----
>
> root@repository:~$ mkfs.btrfs inline -l 64k
> root@repository:~$ mkfs.btrfs noninline -l 64k
> ...
> root@repository:~$ time python test.py /mnt
> real 0m12.244s
> user 0m1.396s
> sys 0m8.101s
> root@repository:~$ time python test.py /mnt2
> real 0m13.047s
> user 0m1.436s
> sys 0m7.772s
>
> root@repository:/$ btrfs fi df /mnt
> Data: total=8.00MB, used=256.00KB
> System, DUP: total=8.00MB, used=64.00KB
> System: total=4.00MB, used=0.00
> Metadata, DUP: total=1.00GB, used=342.06MB
> Metadata: total=8.00MB, used=0.00
>
> root@repository:/$ btrfs fi df /mnt2
> Data: total=1.01GB, used=391.10MB
> System, DUP: total=8.00MB, used=64.00KB
> System: total=4.00MB, used=0.00
> Metadata, DUP: total=1.00GB, used=50.06MB
> Metadata: total=8.00MB, used=0.00
>
> 3k data, 64k leaf: inline is still 10% faster, and is now 25% smaller
>
> ----
>
> data = "1" * 1024 * 32
>
> ... (mkfs, mount, etc)
>
> root@repository:~$ time python test.py /mnt
> real 0m17.834s
> user 0m1.224s
> sys 0m4.772s
> root@repository:~$ time python test.py /mnt2
> real 0m20.521s
> user 0m1.304s
> sys 0m6.344s
>
> root@repository:/$ btrfs fi df /mnt
> Data: total=4.01GB, used=3.05GB
> System, DUP: total=8.00MB, used=64.00KB
> System: total=4.00MB, used=0.00
> Metadata, DUP: total=1.00GB, used=54.00MB
> Metadata: total=8.00MB, used=0.00
>
> root@repository:/$ btrfs fi df /mnt2
> Data: total=4.01GB, used=3.05GB
> System, DUP: total=8.00MB, used=64.00KB
> System: total=4.00MB, used=0.00
> Metadata, DUP: total=1.00GB, used=53.56MB
> Metadata: total=8.00MB, used=0.00
>
> 32k data, 64k leaf: inline is still 10% faster, and is now the same
> size (not dead sure why, probably some interaction with the size of
> the actual write that happens)
>
> ----
>
> data = "1" * 1024 * 7
>
> ... etc
>
>
> root@repository:~$ time python test.py /mnt
> real 0m9.628s
> user 0m1.368s
> sys 0m4.188s
> root@repository:~$ time python test.py /mnt2
> real 0m13.455s
> user 0m1.608s
> sys 0m7.884s
>
> root@repository:/$ btrfs fi df /mnt
> Data: total=3.01GB, used=1.91GB
> System, DUP: total=8.00MB, used=64.00KB
> System: total=4.00MB, used=0.00
> Metadata, DUP: total=1.00GB, used=74.69MB
> Metadata: total=8.00MB, used=0.00
>
> root@repository:/$ btrfs fi df /mnt2
> Data: total=3.01GB, used=1.91GB
> System, DUP: total=8.00MB, used=64.00KB
> System: total=4.00MB, used=0.00
> Metadata, DUP: total=1.00GB, used=74.69MB
> Metadata: total=8.00MB, used=0.00
>
> 7k data, 64k leaf: 30% faster, same data usage.
>
> ----
>
> Are we done yet? Can I go home now? ;p
>

thanks for the test.

but the results just indicate that inlining small files is not a "safe" optimization to be turned on by default.
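The quoted test script is Python 2 (it uses xrange). For anyone repeating the experiment on a current system, a Python 3 rendering might look like the sketch below. It is not the script that produced the numbers above: the function name, the count parameter and the optional size argument are additions for illustration, and it writes bytes rather than a text string.

```python
import os
import sys


def write_small_files(directory, file_size, count=100 * 1000):
    """Create `count` files of `file_size` bytes each, named 0, 1, 2, ..."""
    data = b"1" * file_size
    for index in range(count):
        path = os.path.join(directory, str(index))
        # Append mode, as in the quoted script; new files are simply created.
        with open(path, "ab") as handle:
            handle.write(data)


# Hypothetical usage: python3 test.py <mount point> [file size in KiB]
if __name__ == "__main__" and len(sys.argv) > 1:
    size_kib = int(sys.argv[2]) if len(sys.argv) > 2 else 3
    write_small_files(sys.argv[1], size_kib * 1024)
```

Timing it with `time`, as in the quoted runs, measures file creation only; nothing here forces the data to disk or drops caches, so results will vary from run to run.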
On 10/31/2012 08:12 AM, David Sterba wrote:
> On Wed, Oct 31, 2012 at 07:47:14AM +0800, ching wrote:
>> On 10/31/2012 06:19 AM, Hugo Mills wrote:
>>> On Tue, Oct 30, 2012 at 10:14:12PM +0000, Hugo Mills wrote:
>>>>> if i have 10G small files in total, then it will consume 20G by default.
>>>> If those small files are each 128 bytes in size, then you have
>>>> approximately 80 million of them, and they'd take up 80 million pages,
>>>> or 320 GiB of total disk space.
>>> Sorry, to make that clear -- I meant if they were stored in Data.
>>> If they're inlined in metadata, then they'll take approximately 20 GiB
>>> as you claim, which is a lot less than the 320 GiB they'd be if
>>> they're not.
>>>
>> is it the same for:
>> 1. 3k per file with leaf size=4K
>> 2. 60k per file with leaf size=64k
> The inline limit is minimum of
> * 'max_inline' (8k by default)
> * PAGE_SIZE
> * leafsize - header
>
> so 60k files for 64k leaves will not get inlined, unless you have a
> system with 64k pages.
>

thank you very much for your clear explanation :)

this is the first time i heard about this.

ching
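The rule quoted above can be written down as a small sketch. This only restates the three-way minimum as described in this thread; it is not taken from the kernel source. The function name is made up, and the header size is a placeholder assumption, since the thread gives no value for it.

```python
# Assumption for illustration only: the thread does not state the header size.
ASSUMED_HEADER_BYTES = 101


def effective_inline_limit(max_inline, page_size, leaf_size,
                           header_size=ASSUMED_HEADER_BYTES):
    """Smallest of the three bounds named in the quoted explanation."""
    return min(max_inline, page_size, leaf_size - header_size)


KIB = 1024
cases = [
    # (label, max_inline, page size, leaf size)
    ("defaults, 4k leaf", 8 * KIB, 4 * KIB, 4 * KIB),
    ("defaults, 64k leaf", 8 * KIB, 4 * KIB, 64 * KIB),
    ("64k pages, 64k leaf", 8 * KIB, 64 * KIB, 64 * KIB),
    ("64k pages, 64k leaf, max_inline=64k", 64 * KIB, 64 * KIB, 64 * KIB),
]

for label, max_inline, page_size, leaf_size in cases:
    limit = effective_inline_limit(max_inline, page_size, leaf_size)
    print("%-38s limit = %d bytes" % (label, limit))
```

Read this way, a 3k file fits under the limit with either leaf size on a 4k-page system, while 7k and 32k files do not. That would be consistent with the 7k and 32k benchmark runs elsewhere in this thread showing the same usage with and without max_inline=0. Under the same reading, a 60k file would need max_inline raised as well as 64k pages, since the 8k default would otherwise remain the smallest bound.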