Maciej Marcin Piechotka
2011-Sep-18 23:32 UTC
Inefficient storing of ISO images with compress=lzo
I've noticed that:

- with x86-64 Fedora 15 DVD install images:
  - du -sh <ROOT VOLUME> was 36 GB
  - btrfs df | grep -i data showed over 40 GB used
- without:
  - du -sh <ROOT VOLUME> is 34 GB
  - btrfs df | grep -i data showed less than 34 GB used

It seems that ISO files are considered compressible while they may not be (and the penalty is severe - 3x).

Regards
Maciej Marcin Piechotka wrote:
> It seems that iso files are considered compressable while they may not be (and penalty is severe - 3x).

With the compress option specified, btrfs tries to compress the file at most 128K at a time, and if the compressed result is not smaller, the file is marked as uncompressible.

I just tried with Fedora-14-i386-DVD.iso: the first 896K is compressed, with a compression ratio of about 71.7%, and the remaining data is not compressed.

--
Li Zefan
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
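The behavior described in this reply (compress up to 128K at a time, and stop trying once a chunk does not shrink) can be modeled roughly as follows. This is only a user-space sketch, not the kernel code: zlib stands in for LZO because it ships with Python, and the function name `compress_chunks` and the `give_up` flag are made up for illustration.

```python
import os
import zlib

CHUNK = 128 * 1024  # compression unit mentioned in the thread


def compress_chunks(data: bytes, chunk_size: int = CHUNK):
    """Model of "compress until one chunk fails to shrink".

    Returns (stored, gave_up_at). `stored` is a list of
    (was_compressed, payload) pairs, one per chunk. `gave_up_at` is the
    byte offset of the first chunk that did not shrink, or None.
    """
    stored = []
    give_up = False      # hypothetical stand-in for "file marked uncompressible"
    gave_up_at = None
    for off in range(0, len(data), chunk_size):
        chunk = data[off:off + chunk_size]
        if not give_up:
            packed = zlib.compress(chunk)  # assumption: zlib instead of LZO
            if len(packed) < len(chunk):
                stored.append((True, packed))
                continue
            # The compressed result is not smaller: stop trying from here on.
            give_up = True
            gave_up_at = off
        stored.append((False, chunk))
    return stored, gave_up_at


# Synthetic input: 7 compressible chunks (896K), one random chunk, then
# one more compressible chunk that is no longer attempted.
data = bytes(7 * CHUNK) + os.urandom(CHUNK) + bytes(CHUNK)
stored, gave_up_at = compress_chunks(data)
flags = [was_compressed for was_compressed, _ in stored]
```

With this synthetic input the model stops compressing at offset 896K, and the trailing compressible chunk is stored as-is, which matches the "beginning compressed, rest not" outcome reported for the ISO.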
Li Zefan wrote:
> I just tried with Fedora-14-i386-DVD.iso, and the first 896K is compressed,
> with a compress ratio about 71.7%, and the remaining data is not compressed.

Correction: the compression ratio is 38.3%, pretty good :)
Maciej Marcin Piechotka
2011-Sep-19 23:06 UTC
Re: Inefficient storing of ISO images with compress=lzo
On Mon, 2011-09-19 at 10:53 +0800, Li Zefan wrote:
> With compress option specified, btrfs will try to compress the file, at most
> 128K at one time, and if the compressed result is not smaller, the file will
> be marked as uncompressable.
>
> I just tried with Fedora-14-i386-DVD.iso, and the first 896K is compressed,
> with a compress ratio about 71.7%, and the remaining data is not compressed.
>
> --
> Li Zefan

Just a question from a person who doesn't know how btrfs operates: what if the beginning of the file is well compressible and the rest is not?

In any case, compression was my uneducated guess as to where the missing 4 GB went.

Regards
07:06, Maciej Marcin Piechotka wrote:
> Just a question from person who don't know how btrfs operates - what if
> the beginning of file is well compressable and the rest is not?

It's explained in the previous mail: the beginning part will be compressed, and the rest will not.

> In any case the compression was my uneducated guess where is missing
> 4GB.

It probably has nothing to do with compression. You can try without compress=lzo and see if the issue still exists.
David Sterba
2011-Sep-20 14:29 UTC
Re: Inefficient storing of ISO images with compress=lzo
On Mon, Sep 19, 2011 at 10:53:45AM +0800, Li Zefan wrote:
> With compress option specified, btrfs will try to compress the file, at most
> 128K at one time, and if the compressed result is not smaller, the file will
> be marked as uncompressable.
>
> I just tried with Fedora-14-i386-DVD.iso, and the first 896K is compressed,
> with a compress ratio about 71.7%, and the remaining data is not compressed.

I'm curious how you obtained that number, and whether it's a rough estimate (i.e. some rounding up to 4k or such) or the % comes from exact numbers.

AFAIK there are two possibilities to read compressed sizes:

rough:
* traverse extents, look for compressed extents and sum up extent_map->block_len, or just extent_map->len for uncompressed
* block_len is rounded up to 4k
* compressed inline size is not stored in any structure member, at most 4k

exact:
As you know, the only place where the exact size of compressed data is stored is the first 4 bytes of every compressed extent, so counting the exact size of a compressed extent naturally means reading those bytes. Touching non-metadata just to read the compressed size does not look nice.

I did some research in that area, and my conclusion is that there's a missing structure member "compressed_length" in extent_map (in-memory structure, no problem to add it there) which would be filled from struct btrfs_file_extent_item (on-disk structure, e.g. holding compression type) -- disk format change :(

Other members could not be used to calculate the compressed size, being either estimates by definition (ram_size) or containing a size that depends on other data (disk_num_bytes, depends on checksum size). Although there are 2 bytes spare for other compression types, there are none to hold the actual compression or encryption or whatever-encoding length.

So until there's a format change, there are the two ways, rough or slow, to read the compressed size. (Unless I've missed something obvious etc.)
Looking forward to your input or patches :)

Thanks,
david
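To illustrate how far the "rough" method in the message above can drift from the "exact" one, here is a small arithmetic sketch. It only models the rounding of each compressed extent up to 4k. The extent sizes are invented for the example and do not come from a real filesystem, and the helper names are hypothetical.

```python
BLOCK = 4096  # rounding granularity mentioned in the message


def round_up(n: int, block: int = BLOCK) -> int:
    """Round n up to the next multiple of block."""
    return -(-n // block) * block


def rough_compressed_size(exact_sizes):
    """Sum of per-extent sizes after rounding each one up to 4k."""
    return sum(round_up(n) for n in exact_sizes)


def exact_compressed_size(exact_sizes):
    """Sum of the exact per-extent compressed byte counts."""
    return sum(exact_sizes)


# Made-up compressed extent sizes in bytes (assumption, for illustration).
extents = [50_000, 4_097, 4_096]
rough = rough_compressed_size(extents)   # 53248 + 8192 + 4096
exact = exact_compressed_size(extents)   # 50000 + 4097 + 4096
print(rough, exact, rough - exact)       # 65536 58193 7343
```

The overestimate is bounded by just under 4k per extent, so it matters most when a file consists of many small compressed extents.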