Hello zfs-discuss,

During Christmas I managed to add my own compression to zfs - it was quite
easy. Actually I registered a new compression named milek with functions
milek_compress and milek_decompress - but I copied the same algorithm used
for lzjb - first I wanted to learn how to register/compile a new compression.

One thing I noticed - I had to also add the new compression name to libzfs -
I think that available compression methods should be exported from the zfs
module and not directly coded in libzfs. Anyway it worked.

Now in my free time (well, no vacations right now...) I'm trying to put zlib
into ZFS... Does anyone know any ready compression functions written in C
which can be used in a kernel (no mallocs, etc.)? Perhaps something from BSD
(so no GPL<->CDDL problems will show up)?

Another thing - why is there a 12.5% limit on using compression (so that if
the compression method compressed by less than 12.5%, zfs writes the
uncompressed data)? Why 12.5 and not a different value? Why is it hard-coded?

--
Best regards,
Robert                 mailto:rmilkowski at task.gda.pl
                       http://milek.blogspot.com
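For readers following along, the registration Robert describes boils down to appending an entry to a hardcoded table of named algorithms (and, today, teaching libzfs the same name by hand). Here is a hedged userland sketch; the struct layout, function signatures, and stub algorithms are illustrative stand-ins, not the actual kernel definitions:

```c
#include <stddef.h>
#include <string.h>

/* Illustrative signatures; the real kernel typedefs differ in detail. */
typedef size_t (*compress_fn)(const void *src, void *dst, size_t len);
typedef int (*decompress_fn)(const void *src, void *dst,
    size_t srclen, size_t dstlen);

typedef struct compress_info {
	const char   *ci_name;        /* name the zfs(1M) property maps to */
	compress_fn   ci_compress;
	decompress_fn ci_decompress;
} compress_info_t;

/* Stand-ins for lzjb_compress/lzjb_decompress and Robert's copies. */
static size_t stub_compress(const void *s, void *d, size_t n)
{
	memcpy(d, s, n);                    /* "compresses" to same size */
	return (n);
}

static int stub_decompress(const void *s, void *d, size_t sn, size_t dn)
{
	memcpy(d, s, sn < dn ? sn : dn);
	return (0);
}

/* Adding an algorithm means appending an entry to a table like this. */
static const compress_info_t compress_table[] = {
	{ "off",   NULL,          NULL },
	{ "lzjb",  stub_compress, stub_decompress },
	{ "milek", stub_compress, stub_decompress },  /* Robert's addition */
};

static const compress_info_t *compress_lookup(const char *name)
{
	for (size_t i = 0;
	    i < sizeof (compress_table) / sizeof (compress_table[0]); i++)
		if (strcmp(compress_table[i].ci_name, name) == 0)
			return (&compress_table[i]);
	return (NULL);
}
```

The libzfs complaint Robert mentions is exactly the symptom of a second, separate copy of the name list: the kernel table above knows "milek" but the userland tool does not.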
Hi Robert,

Robert Milkowski wrote:
> During Christmas I managed to add my own compression to zfs - it was
> quite easy. Actually I registered a new compression named milek with
> functions milek_compress and milek_decompress - but I copied the
> same algorithm used for lzjb - first I wanted to learn how to
> register/compile a new compression.

*cool* --- and "people" said that nobody outside of Sun would be
interested in making use of Solaris kernel code. OpenSolaris
Community +1, naysayers 0!

> One thing I noticed - I had to also add the new compression name to
> libzfs - I think that available compression methods should be
> exported from the zfs module and not directly coded in libzfs.
> Anyway it worked.

Sounds like a reasonable RFE to me.

> Now in my free time (well, no vacations right now...) I'm trying to
> put zlib into ZFS...

umm, zlib as in /usr/lib/libz.* or something else? If it's /usr/lib/libz
then don't we already have access to all that already?

> Does anyone know any ready compression functions written in C which
> can be used in a kernel (no mallocs, etc.)? Perhaps something from
> BSD (so no GPL<->CDDL problems will show up)?

I think it's time to dig out your copy of Knuth.....

> Another thing - why is there a 12.5% limit on using compression (so
> that if the compression method compressed by less than 12.5%, zfs
> writes the uncompressed data)? Why 12.5 and not a different value?
> Why is it hard-coded?

That's a good question. I don't know for sure so I'm guessing -- no doubt
during the design and early scoping-of-work phases the 12.5% figure was
determined. It's probably got something to do with the overhead of storing
the actual data such that over 12.5% we get a > X% benefit, and below
that the cost of the compression is either too high or not sufficiently
different. I await some facts (to spoil my hypothesizing!) from team ZFS.

So have you blogged about your Christmas coding yet?

cheers,
James C. McPherson
--
Solaris Datapath Engineering
Data Management Group
Sun Microsystems
On Wed, Jan 04, 2006 at 04:53:34PM +1100, James C. McPherson wrote:
> > One thing I noticed - I had to also add the new compression name to
> > libzfs - I think that available compression methods should be
> > exported from the zfs module and not directly coded in libzfs.
> > Anyway it worked.
>
> Sounds like a reasonable RFE to me.

Feel free to file it, but I wouldn't expect it to be fixed any time soon.
Adding a compression algorithm requires modifying a hardcoded table and
making changes to the on-disk format, so I don't see why modifying libzfs
is a problem.

We would like to support a more pluggable architecture (and there is an
open RFE), but as you can see there is more work to do than just exporting
the table to libzfs.

- Eric

--
Eric Schrock, Solaris Kernel Development       http://blogs.sun.com/eschrock
> > Another thing - why is there a 12.5% limit on using compression (so
> > that if the compression method compressed by less than 12.5%, zfs
> > writes the uncompressed data)? Why 12.5 and not a different value?
> > Why is it hard-coded?
>
> That's a good question. I don't know for sure so I'm guessing -- no doubt
> during the design and early scoping-of-work phases the 12.5% figure was
> determined. It's probably got something to do with the overhead of storing
> the actual data such that over 12.5% we get a > X% benefit, and below
> that the cost of the compression is either too high or not sufficiently
> different. I await some facts (to spoil my hypothesizing!) from team ZFS.

Right. The threshold is somewhat arbitrary, and not terribly important
in practice. Data tends to compress either quite well (2x is common)
or not at all (e.g. JPEG files, which are already compressed).

It would be trivial to make the threshold a tunable, but we're trying to
avoid this sort of thing. I don't want there to be a ZFS tuning guide,
ever. That would mean we failed.

Jeff
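One observation on the 12.5% figure: one eighth is a single shift in integer arithmetic, so the check is essentially free. A hedged sketch of such a threshold test (the function name is illustrative, not the actual kernel symbol):

```c
#include <stddef.h>

/*
 * Keep the compressed copy only if it is at most 87.5% of the
 * original size, i.e. the compression saved at least 1/8 (12.5%)
 * of the block.  "1/8" costs one cheap right shift, which may be
 * part of why the cutoff is 12.5% rather than a rounder decimal.
 */
static int worth_compressing(size_t lsize, size_t psize)
{
	size_t max_psize = lsize - (lsize >> 3);   /* 87.5% of lsize */
	return (psize <= max_psize);
}
```

For a 4K block this keeps the compressed copy only when it fits in 3584 bytes or less; anything larger is written uncompressed.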
> During Christmas I managed to add my own compression to zfs

Cool!

> Now in my free time (well, no vacations right now...) I'm trying to
> put zlib into ZFS...

We actually have the decompress half of zlib in the kernel already
(to support CTF data). We'll be adding the compress half soon so that
ZFS can use it.

> Does anyone know any ready compression functions written in C

I'd be interested too. There's no shortage of compression algorithms,
but not many are suitable for in-kernel, hot-code-path usage.
The advantages of lzjb are that it's fast, reentrant, stateless,
and incredibly small (85 lines of code). But it doesn't squeeze
the bits as hard as zlib does.

Jeff
Jeff Bonwick wrote:
> The advantages of lzjb are that it's fast, reentrant, stateless,
> and incredibly small (85 lines of code). But it doesn't squeeze
> the bits as hard as zlib does.

Is the lzjb algorithm documented somewhere that's not CDDL source code?
(GPL compatibility probs with CDDL source code)

Luke.
> Is the lzjb algorithm documented somewhere that's not CDDL source code?
> (GPL compatibility probs with CDDL source code)

Sorry, no.

Jeff
On Wed, 2006-01-04 at 06:53, Eric Schrock wrote:
> We would like to support a more pluggable architecture (and there is an
> open RFE), but as you can see there is more work to do than just
> exporting the table to libzfs.

One thing that might help with this is if we added compression support
to the crypto framework. Compression ops are basically the same type of
op as crypto anyway. We have already been asked by Hifn if we would
consider doing this because they have a patented compression algorithm
that they implement in hardware; this would allow ZFS to use it (on
machines where that hardware exists).

--
Darren J Moffat
> One thing that might help with this is if we added compression support
> to the crypto framework. Compression ops are basically the same type of
> op as crypto anyway. We have already been asked by Hifn if we would
> consider doing this because they have a patented compression algorithm
> that they implement in hardware; this would allow ZFS to use it (on
> machines where that hardware exists).

That's interesting. Bill recently restructured the zio pipeline so that
all CPU-intensive operations (compress, encrypt, checksum) are
consecutive pipeline stages, with the thought that we'd eventually like
to merge them so that we only make one pass over the data. (I'd also
like to fold in RAID-Z parity generation, but that's a little trickier.)
We should talk.

Jeff
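The single-pass idea Jeff describes can be illustrated with a toy fused loop: rather than walking the buffer once to encrypt and again to checksum, both stages consume each byte in one pass. The XOR "cipher" and additive checksum below are deliberately trivial stand-ins, nothing like what ZFS or the crypto framework actually use:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * One pass over the data: "encrypt" each byte, then fold the
 * ciphertext byte into a running checksum without re-reading the
 * buffer.  A real pipeline would fuse a real cipher and hash the
 * same way, halving the memory traffic of two separate stages.
 */
static uint32_t encrypt_and_checksum(uint8_t *buf, size_t len, uint8_t key)
{
	uint32_t sum = 0;

	for (size_t i = 0; i < len; i++) {
		buf[i] ^= key;     /* toy cipher stage */
		sum += buf[i];     /* toy checksum stage, same pass */
	}
	return (sum);
}
```

The win is purely in data movement: for blocks much larger than the cache, touching each byte once instead of twice (or three times, with compression as a third stage) matters more than the per-byte arithmetic.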
Casper.Dik at Sun.COM
2006-Jan-04 11:20 UTC
[zfs-discuss] Adding my own compression to zfs
>> One thing that might help with this is if we added compression support
>> to the crypto framework. Compression ops are basically the same type of
>> op as crypto anyway. We have already been asked by Hifn if we would
>> consider doing this because they have a patented compression algorithm
>> that they implement in hardware; this would allow ZFS to use it (on
>> machines where that hardware exists).
>
> That's interesting. Bill recently restructured the zio pipeline
> so that all CPU-intensive operations (compress, encrypt, checksum)
> are consecutive pipeline stages, with the thought that we'd
> eventually like to merge them so that we only make one pass
> over the data. (I'd also like to fold in RAID-Z parity
> generation, but that's a little trickier.) We should talk.

I'd imagine that compression is different in two possibly important
details:

 - the amount of data in and out is not equal (less data out when
   compressing, (much) more when decompressing);

 - compression can presumably fail (data does not compress) without it
   being an error for the process.

Casper
On Wed, 2006-01-04 at 11:14, Jeff Bonwick wrote:
> > One thing that might help with this is if we added compression support
> > to the crypto framework. Compression ops are basically the same type of
> > op as crypto anyway. We have already been asked by Hifn if we would
> > consider doing this because they have a patented compression algorithm
> > that they implement in hardware; this would allow ZFS to use it (on
> > machines where that hardware exists).
>
> That's interesting. Bill recently restructured the zio pipeline
> so that all CPU-intensive operations (compress, encrypt, checksum)
> are consecutive pipeline stages, with the thought that we'd
> eventually like to merge them so that we only make one pass
> over the data. (I'd also like to fold in RAID-Z parity
> generation, but that's a little trickier.) We should talk.

and this is where the crypto framework should be able to help you out,
since we already have the ability to do hash and encrypt at the same
time, doing only a single pass over the data. Of course we would be
compressing, then encrypting and hashing the data, so it will always be
"two" passes (compress the real data, then encrypt and hash the
compressed data in a single pass).

--
Darren J Moffat
On Wed, 2006-01-04 at 06:20, Casper.Dik at Sun.COM wrote:
> I'd imagine that compression is different in two possibly important
> details:
>  - the amount of data in and out is not equal
>    (less data out when compressing, (much) more when decompressing)

This is also true for some encryption modes.

					- Bill
> > Is the lzjb algorithm documented somewhere that's not CDDL source code?
> > (GPL compatibility probs with CDDL source code)
>
> Sorry, no.

Shame, I was hoping it might not be too hard for me to write a read-only
linux driver for zfs, as that would help lower the barrier for linux
people to try solaris. I don't think I can look at any solaris ZFS code
if I want to do that though.

Does anyone know what lzjb stands for? I presume that "lz" refers to
Lempel & Ziv and their compression algorithms:

http://en.wikipedia.org/wiki/LZ77_and_LZ78_%28algorithms%29

...so I guess lzjb would be a derivative of Lempel & Ziv's algorithms?

I wonder who J. and B. were.

Luke.

This message posted from opensolaris.org
> Shame, I was hoping it might not be too hard for me to write a read-only
> linux driver for zfs, as that would help lower the barrier for linux
> people to try solaris. I don't think I can look at any solaris ZFS code
> if I want to do that though.

Right. (But please join me in resisting the urge to turn this thread
into a license debate. It's all been said before.)

> ...so I guess lzjb would be a derivative of Lempel & Ziv's algorithms?
>
> I wonder who J. and B. were

That would be me (first.last, following the convention set by LZRW).
I wrote it many years ago to compress crash dumps. That meant that it
had to work in panic() context, which is *really* restricted -- no
malloc, no threads, no nothing -- and it had to be fast, because
compression time is down time. We put it in ZFS because we have it,
we know it's safe, and it does a decent job in very little time.

Jeff
Hello James,

Wednesday, January 4, 2006, 6:53:34 AM, you wrote:

JCM> Hi Robert,

>> Now in my free time (well, no vacations right now...) I'm trying to
>> put zlib into ZFS...

JCM> umm, zlib as in /usr/lib/libz.* or something else? If it's
JCM> /usr/lib/libz then don't we already have access to all that already?

zlib which is libz - I downloaded the sources from the official zlib
page. Where in the kernel is libz used?

JCM> So have you blogged about your Christmas coding yet?

Not yet - I wanted to do it later when it would be more useful - but
I'll blog about it sooner and eventually update it later :) Right now
it's hard to find some free time...

--
Best regards,
Robert                 mailto:rmilkowski at task.gda.pl
Hello Eric,

Wednesday, January 4, 2006, 7:53:43 AM, you wrote:

ES> On Wed, Jan 04, 2006 at 04:53:34PM +1100, James C. McPherson wrote:
>>
>> > One thing I noticed - I had to also add the new compression name to
>> > libzfs - I think that available compression methods should be
>> > exported from the zfs module and not directly coded in libzfs.
>> > Anyway it worked.
>>
>> Sounds like a reasonable RFE to me.

ES> Feel free to file it, but I wouldn't expect it to be fixed any time
ES> soon. Adding a compression algorithm requires modifying a hardcoded
ES> table and making changes to the on-disk format, so I don't see why
ES> modifying libzfs is a problem.

ES> We would like to support a more pluggable architecture (and there is
ES> an open RFE), but as you can see there is more work to do than just
ES> exporting the table to libzfs.

I know it's not critical - it's just that I was surprised that the zfs
command didn't work after I added "another" compression to zfs. The
changes to libzfs were trivial, of course.

Another thing - the zfs GUI doesn't work either after I added the
compression (some kind of exception saying it doesn't know a compression
named "milek"). To be honest I haven't tried the GUI without my
modification either - but as it complains about this new compression
name, I take it it would work without my modifications.

--
Best regards,
Robert                 mailto:rmilkowski at task.gda.pl
Hello Jeff,

Wednesday, January 4, 2006, 8:58:19 AM, you wrote:

>> > Another thing - why is there a 12.5% limit on using compression (so
>> > that if the compression method compressed by less than 12.5%, zfs
>> > writes the uncompressed data)? Why 12.5 and not a different value?
>> > Why is it hard-coded?
>>
>> That's a good question. I don't know for sure so I'm guessing -- no
>> doubt during the design and early scoping-of-work phases the 12.5%
>> figure was determined. It's probably got something to do with the
>> overhead of storing the actual data such that over 12.5% we get a
>> > X% benefit, and below that the cost of the compression is either
>> too high or not sufficiently different. I await some facts (to spoil
>> my hypothesizing!) from team ZFS.

JB> Right. The threshold is somewhat arbitrary, and not terribly important
JB> in practice. Data tends to compress either quite well (2x is common)
JB> or not at all (e.g. JPEG files, which are already compressed).

ok

JB> It would be trivial to make the threshold a tunable, but we're
JB> trying to avoid this sort of thing. I don't want there to be a
JB> ZFS tuning guide, ever. That would mean we failed.

In this case I believe it's not a problem - it's similar to specifying
different compression algorithms and their parameters. Sometimes this
12.5% can make a difference (when you've got lots of TBs of data).

--
Best regards,
Robert                 mailto:rmilkowski at task.gda.pl
Hello Eric,

ES> Feel free to file it, but I wouldn't expect it to be fixed any time
ES> soon. Adding a compression algorithm requires modifying a hardcoded
ES> table and making changes to the on-disk format, so I don't see why
ES> modifying libzfs is a problem.

One thing that just occurred to me - does that mean that the ZFS on-disk
format will change again due to the new (coming?) compression in ZFS, in
such a way that the new bits won't be able to mount data from current ZFS?

--
Best regards,
Robert                 mailto:rmilkowski at task.gda.pl
                       http://milek.blogspot.com
Hi Robert,

You asked about compression functions in C which were kernel-friendly.
At my previous employer I adapted LZO for this purpose and it was
trivial.

http://www.oberhumer.com/opensource/lzo/

Actually, there are large classes of compression algorithms which use
bounded memory, and any of those would probably be suitable for
in-kernel use with minor tweaks.

As always, if you use someone else's code for other than personal use,
there may be licensing issues, and if you use someone else's algorithm
there may be patent issues. It keeps the lawyers in business. :-)

Anton
On Thu, Jan 05, 2006 at 03:04:10PM +0100, Robert Milkowski wrote:
> Hello Eric,
>
> ES> Feel free to file it, but I wouldn't expect it to be fixed any time
> ES> soon. Adding a compression algorithm requires modifying a hardcoded
> ES> table and making changes to the on-disk format, so I don't see why
> ES> modifying libzfs is a problem.
>
> One thing that just occurred to me - does that mean that the ZFS on-disk
> format will change again due to the new (coming?) compression in ZFS, in
> such a way that the new bits won't be able to mount data from current ZFS?

It's not clear exactly how we'll do this, but the change will be
backwards-compatible, not upwards-compatible. Old filesystems will mount
under the new version, but new filesystems that choose to use a new
compression algorithm will not be accessible under old versions.

- Eric

--
Eric Schrock, Solaris Kernel Development       http://blogs.sun.com/eschrock
Hello Eric,

Thursday, January 5, 2006, 7:32:28 PM, you wrote:

ES> It's not clear exactly how we'll do this, but the change will be
ES> backwards-compatible, not upwards-compatible. Old filesystems will
ES> mount under the new version, but new filesystems that choose to use
ES> a new compression algorithm will not be accessible under old versions.

Uffff... this is great news :)

Thank you.

--
Best regards,
Robert                 mailto:rmilkowski at task.gda.pl
                       http://milek.blogspot.com
If I were to hazard a wild-ass guess, I'd say JB was Jeff Bonwick; which
would explain why he could speak so authoritatively on it :)

alan.

Luke wrote:
>>> Is the lzjb algorithm documented somewhere that's not CDDL source code?
>>> (GPL compatibility probs with CDDL source code)
>>
>> Sorry, no.
>
> Shame, I was hoping it might not be too hard for me to write a read-only
> linux driver for zfs, as that would help lower the barrier for linux
> people to try solaris. I don't think I can look at any solaris ZFS code
> if I want to do that though.
>
> Does anyone know what lzjb stands for? I presume that "lz" refers to
> Lempel & Ziv and their compression algorithms:
>
> http://en.wikipedia.org/wiki/LZ77_and_LZ78_%28algorithms%29
>
> ...so I guess lzjb would be a derivative of Lempel & Ziv's algorithms?
>
> I wonder who J. and B. were
>
> Luke.

--
Alan Hargreaves - http://blogs.sun.com/tpenta
Kernel/VOSJEC/Performance Staff Engineer
Product Technical Support (APAC)
Sun Microsystems
> Right. (But please join me in resisting the urge to turn this
> thread into a license debate. It's all been said before.)

Yeah, I'm over license debates, I'm already suffering as a consequence
of one :P

Pragmatically speaking, I'm hopeful there's a way to read ZFS
filesystems in a linux module if I jump through the right hoops (e.g.
document the lzjb algorithm from opensolaris source and then find a
friend to reimplement the algorithm?). I think it would be great for
raising awareness of opensolaris among linux users and making it a bit
easier for people to dual-boot to try out opensolaris.

> > I wonder who J. and B. were
> That would be me (first.last, following the convention set by LZRW).

Oh of course, nice work Jeff :-)
is it planned to add some other compression algorithm to zfs ?

lzjb is quite good and especially performing very well, but i`d like to
have better compression (bzip2?) - no matter how much performance drops
with it.

regards
roland
Wade.Stuart at fallon.com
2007-Jan-28 21:36 UTC
[zfs-discuss] Re: Adding my own compression to zfs
zfs-discuss-bounces at opensolaris.org wrote on 01/27/2007 06:48:17 AM:

> is it planned to add some other compression algorithm to zfs ?
>
> lzjb is quite good and especially performing very well, but i`d like
> to have better compression (bzip2?) - no matter how much performance
> drops with it.
>
> regards
> roland

From the code it looks like they have the flags set to allow multiple
types of compression. One thing you may notice is that compression is
per block: bzip2, or other compression schemes that depend very heavily
on a large dictionary keyspace for optimal compression, should not
perform very well here.

-Wade
Have a look at:

http://blogs.sun.com/ahl/entry/a_little_zfs_hack

On 27/01/07, roland <devzero at web.de> wrote:
> is it planned to add some other compression algorithm to zfs ?
>
> lzjb is quite good and especially performing very well, but i`d like to
> have better compression (bzip2?) - no matter how much performance drops
> with it.
>
> regards
> roland

--
Rasputin :: Jack of All Trades - Master of Nuns
http://number9.hellooperator.net/
Cindy.Swearingen at Sun.COM
2007-Jan-29 17:12 UTC
[zfs-discuss] Re: Adding my own compression to zfs
See the following bug:

http://bugs.opensolaris.org/view_bug.do?bug_id=6280662

Cindy

roland wrote:
> is it planned to add some other compression algorithm to zfs ?
>
> lzjb is quite good and especially performing very well, but i`d like to
> have better compression (bzip2?) - no matter how much performance drops
> with it.
>
> regards
> roland
> Have a look at:
>
> http://blogs.sun.com/ahl/entry/a_little_zfs_hack

thanks for the link, dick !

this sounds fantastic ! is the source for that available (yet) somewhere ?

> Adam Leventhal's Weblog
> inside the sausage factory

btw - just wondering - is this some english phrase or some running gag ?
i have seen it once before on another blog, so i`m wondering....

greetings from the beer and sausage nation ;)

roland
Matt Ingenthron
2007-Jan-29 22:15 UTC
[zfs-discuss] Re: Re: Adding my own compression to zfs
roland wrote:
>> Adam Leventhal's Weblog
>> inside the sausage factory
>
> btw - just wondering - is this some english phrase or some running gag ?
> i have seen it once before on another blog, so i`m wondering....
>
> greetings from the beer and sausage nation ;)

It's a response to a common English colloquialism which says "nearly
everybody likes eating sausage, but many people would probably rather
not see how it's made".

Adam is a sausage maker in the Solaris world. OpenSolaris is the newly
expanded, room-for-everyone Solaris sausage factory. His blog covers
topics relating to what goes on in his sausage-making duties.

- Matt

p.s.: The web says a German word for colloquialism is umgangssprachlich.

--
Matt Ingenthron - Web Infrastructure Solutions Architect
Sun Microsystems, Inc. - Global Systems Practice
http://blogs.sun.com/mingenthron/
email: matt.ingenthron at sun.com    Phone: 310-242-6439
The lzjb compression implementation (IMO) is the fastest one on SPARC
Solaris systems. I've seen it beat lzo in speed while not necessarily in
compressibility. I've measured both implementations inside Solaris SPARC
kernels, and would love to hear from others about their experiences.

As someone else alluded, multithreading the compression implementation
will certainly improve performance.

Sri
hey, thanks for your overwhelming private lesson in english colloquialisms :D

now back to the technical :)

> # zfs create pool/gzip
> # zfs set compression=gzip pool/gzip
> # cp -r /pool/lzjb/* /pool/gzip
> # zfs list
> NAME        USED  AVAIL  REFER  MOUNTPOINT
> pool/gzip  64.9M  33.2G  64.9M  /pool/gzip
> pool/lzjb   128M  33.2G   128M  /pool/lzjb
>
> That's with a 1.2G crash dump (pretty much the most compressible file
> imaginable). Here are the compression ratios with a pile of ELF binaries
> (/usr/bin and /usr/lib):
>
> # zfs get compressratio
> NAME       PROPERTY       VALUE  SOURCE
> pool/gzip  compressratio  3.27x  -
> pool/lzjb  compressratio  1.89x  -

this looks MUCH better than i would have ever expected for smaller files.

any real-world data on how good or bad the compressratio gets with lots
of very small but well-compressible files, for example some (evil for
those solaris evangelists) untarred linux-source tree ?

i'm rather excited how effectively gzip will compress here.

for comparison:

sun1:/comptest # bzcat /tmp/linux-2.6.19.2.tar.bz2 | tar xvf -
--snipp--

sun1:/comptest # du -s -k *
143895  linux-2.6.19.2
1       pax_global_header

sun1:/comptest # du -s -k --apparent-size *
224282  linux-2.6.19.2
1       pax_global_header

sun1:/comptest # zfs get compressratio comptest
NAME      PROPERTY       VALUE  SOURCE
comptest  compressratio  1.79x  -
Bill Sommerfeld
2007-Jan-29 22:40 UTC
[zfs-discuss] Re: Re: Adding my own compression to zfs
On Mon, 2007-01-29 at 14:15 -0800, Matt Ingenthron wrote:
> It's a response to a common English colloquialism which says "nearly
> everybody likes eating sausage, but many people would probably rather
> not see how it's made".

I've actually seen the quote attributed to a German, Otto von Bismarck,
rendered in English as:

	"Laws are like sausages -- it is better not to see them
	being made."

or

	"If you like laws and sausages, you should never watch
	either one being made."

Of course, the same can, and has, been said about software...

					- Bill
Adam Leventhal
2007-Jan-30 04:54 UTC
[zfs-discuss] Re: Re: Adding my own compression to zfs
On Mon, Jan 29, 2007 at 02:39:13PM -0800, roland wrote:
> > # zfs get compressratio
> > NAME       PROPERTY       VALUE  SOURCE
> > pool/gzip  compressratio  3.27x  -
> > pool/lzjb  compressratio  1.89x  -
>
> this looks MUCH better than i would have ever expected for smaller files.
>
> any real-world data on how good or bad the compressratio gets with lots
> of very small but well-compressible files, for example some untarred
> linux-source tree ?
>
> for comparison:
>
> sun1:/comptest # du -s -k *
> 143895  linux-2.6.19.2
>
> sun1:/comptest # zfs get compressratio comptest
> NAME      PROPERTY       VALUE  SOURCE
> comptest  compressratio  1.79x  -

Don't start sending me your favorite files to compress (it really should
work about the same as gzip), but here's the result for the above (I
found a tar file that's about 235M uncompressed):

# du -ks linux-2.6.19.2/
80087   linux-2.6.19.2

# zfs get compressratio pool/gzip
NAME       PROPERTY       VALUE  SOURCE
pool/gzip  compressratio  3.40x  -

Doing a gzip with the default compression level (6 -- the same setting
I'm using in ZFS) yields a file that's about 52M. The small files are
hurting a bit here, but it's still pretty good -- and considerably
better than LZJB.

Adam

--
Adam Leventhal, Solaris Kernel Development       http://blogs.sun.com/ahl
an lzo in-kernel implementation for solaris/sparc ? your answer makes me
believe it exists. could you give a comment ?

roland
any news on additional compression-schemes for zfs ?

this is an interesting research-topic, imho :)

so, some more real-world tests with zfs-fuse + the lzo patch :

-LZO------------------------------------------------
zfs set compression=lzo mypool

time cp /vmware/vserver1/vserver1.vmdk /mypool

real    7m8.540s
user    0m0.708s
sys     0m24.839s

zfs get compressratio mypool
NAME    PROPERTY       VALUE  SOURCE
mypool  compressratio  1.74x  -

1.7G  vserver1.vmdk  compressed
3.0G  vserver1.vmdk  uncompressed

-LZJB------------------------------------------------
zfs set compression=lzjb mypool

time cp /vmware/vserver1/vserver1.vmdk /mypool

real    7m16.392s
user    0m0.709s
sys     0m25.107s

zfs get compressratio mypool
NAME    PROPERTY       VALUE  SOURCE
mypool  compressratio  1.47x  -

2.0G  vserver1.vmdk  compressed
3.0G  vserver1.vmdk  uncompressed

-GZIP------------------------------------------------
zfs set compression=gzip mypool

time cp /vmware/vserver1/vserver1.vmdk /mypool/

real    12m54.183s
user    0m0.653s
sys     0m24.933s

zfs get compressratio
NAME    PROPERTY       VALUE  SOURCE
mypool  compressratio  2.02x  -

1.5G  vserver1.vmdk  compressed
3.0G  vserver1.vmdk  uncompressed

btw - the lzo patch for zfs-fuse (applies to the latest zfs-fuse sources)
is at
http://groups.google.com/group/zfs-fuse/attach/a489f630aa4aa189/zfs-lzo.diff.bz2?part=4
Hi,

No news. I received some very good suggestions, but unfortunately I didn't get as much discussion as I had hoped. I'm sending the project proposal again. I think there are a lot of interesting things to research and develop regarding the subject, and I hope this time we discuss it a bit more. I would like to point out Adam Leventhal's suggestion of an adaptive compression scheme: I think it would be a challenging and interesting direction to take. Besides, there are some new results about BWT that I'm sure would be of interest in this context.

Kind Regards,

Domingos.

Follows the text of my original proposal:

-----------------------------------------------------------------------------------------------------------------

Below follows a proposal for a new opensolaris project. Of course, this is open to change, since I just wrote down some ideas I had months ago while researching the topic as a graduate student in Computer Science, and since I'm not an opensolaris/ZFS expert at all. I would really appreciate any suggestions or comments.

PROJECT PROPOSAL: ZFS Compression Algorithms.

The main purpose of this project is the development of new compression schemes for the ZFS file system. We plan to start with the development of a fast implementation of an algorithm based on the Burrows-Wheeler Transform (BWT). BWT is an outstanding tool, and the currently known lossless compression algorithms based on it outperform the compression ratio of algorithms derived from the well known Ziv-Lempel algorithm, while being somewhat more expensive in time and space. Therefore, there is room for improvement: recent results show that the running time and space needs of such algorithms can be significantly reduced, and the same results suggest that BWT is likely to become the new standard in compression algorithms[1]. Suffix sorting (i.e. the problem of sorting the suffixes of a given string) is the main bottleneck of BWT, and really significant progress has been made in this area since the first algorithms of Manber and Myers[2] and Larsson and Sadakane[3], notably the new linear-time algorithms of Karkkainen and Sanders[4], Kim, Sim and Park[5], and Ko and Aluru[6], and also the promising O(n log n) algorithm of Karkkainen and Burkhardt[7].

As a conjecture, we believe that some intrinsic properties of ZFS and file systems in general (e.g. sparseness and data entropy in blocks) could be exploited in order to produce brand new and really efficient compression algorithms, as well as to adapt existing ones to the task. The study might be extended to the analysis of data in specific applications (e.g. web servers, mail servers and others) in order to develop compression schemes for specific environments and/or modify the existing Ziv-Lempel based scheme to deal better with such environments.

[1] "The Burrows-Wheeler Transform: Theory and Practice". Manzini, Giovanni. Proc. 24th Int. Symposium on Mathematical Foundations of Computer Science.

[2] "Suffix Arrays: A New Method for On-Line String Searches". Manber, Udi and Myers, Eugene W. SIAM Journal on Computing, Vol. 22, Issue 5, 1990.

[3] "Faster Suffix Sorting". Larsson, N. Jesper and Sadakane, Kunihiko. Technical report, Department of Computer Science, Lund University, 1999.

[4] "Simple Linear Work Suffix Array Construction". Karkkainen, Juha and Sanders, Peter. Proc. 30th International Colloquium on Automata, Languages and Programming, 2003.

[5] "Linear-Time Construction of Suffix Arrays". D.K. Kim, J.S. Sim, H. Park, K. Park. CPM, LNCS Vol. 2676, 2003.

[6] "Space Efficient Linear Time Construction of Suffix Arrays". P. Ko and S. Aluru. CPM, 2003.

[7] "Fast Lightweight Suffix Array Construction and Checking". Burkhardt, Stefan and Kärkkäinen, Juha. 14th Annual Symposium, CPM, 2003.

Domingos Soares Neto
University of Sao Paulo
Institute of Mathematics and Statistics

and

IBM Software Group.

__________________________________________________________________________

On 10/7/07, roland <devzero at web.de> wrote:
> any news on additional compression-schemes for zfs ?
>
> this is an interesting research-topic, imho :)
>
> so, some more real-world tests with zfs-fuse + lzo patch :
>
> -LZO------------------------------------------------
> zfs set compression=lzo mypool
>
> time cp /vmware/vserver1/vserver1.vmdk /mypool
>
> real    7m8.540s
> user    0m0.708s
> sys     0m24.839s
>
> zfs get compressratio mypool
> NAME    PROPERTY       VALUE  SOURCE
> mypool  compressratio  1.74x  -
>
> 1.7G vserver1.vmdk compressed
> 3.0G vserver1.vmdk uncompressed
>
> -LZJB------------------------------------------------
> zfs set compression=lzjb mypool
>
> time cp /vmware/vserver1/vserver1.vmdk /mypool
>
> real    7m16.392s
> user    0m0.709s
> sys     0m25.107s
>
> zfs get compressratio mypool
> NAME    PROPERTY       VALUE  SOURCE
> mypool  compressratio  1.47x  -
>
> 2.0G vserver1.vmdk compressed
> 3.0G vserver1.vmdk uncompressed
>
> -GZIP------------------------------------------------
> zfs set compression=gzip mypool
>
> time cp /vmware/vserver1/vserver1.vmdk /mypool/
>
> real    12m54.183s
> user    0m0.653s
> sys     0m24.933s
>
> zfs get compressratio
> NAME    PROPERTY       VALUE  SOURCE
> mypool  compressratio  2.02x  -
>
> 1.5G vserver1.vmdk compressed
> 3.0G vserver1.vmdk uncompressed
>
> btw - the lzo patch for zfs-fuse (applies to the latest zfs-fuse sources) is at http://groups.google.com/group/zfs-fuse/attach/a489f630aa4aa189/zfs-lzo.diff.bz2?part=4
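The quoted benchmark figures can be cross-checked with a little arithmetic. This is a sketch only: the sizes and wall-clock times are the rounded values from roland's test, and `zfs` computes compressratio from exact byte counts, so the derived ratios only approximate the reported 1.74x/1.47x/2.02x.

```python
def summarize(uncompressed_gib, compressed_gib, wall_seconds):
    """Derive the compression ratio and effective logical write
    throughput from the sizes and timings reported above."""
    ratio = uncompressed_gib / compressed_gib
    throughput_mib_s = uncompressed_gib * 1024 / wall_seconds
    return ratio, throughput_mib_s

# (name, compressed GiB, wall seconds) from the quoted 3.0 GiB copy:
for name, comp, secs in [("lzo", 1.7, 428.5), ("lzjb", 2.0, 436.4), ("gzip", 1.5, 774.2)]:
    ratio, tput = summarize(3.0, comp, secs)
    print(f"{name}: {ratio:.2f}x at {tput:.1f} MiB/s")
```

The interesting point the numbers make: lzo achieved a better ratio than lzjb at essentially the same speed, while gzip bought its 2x ratio with nearly double the wall-clock time.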
> Besides, there are some new results about BWT that I'm sure would be
> of interest in this context.

I thought bzip2/BWT is a compression scheme that has a heavy footprint and is generally brain-damaging to implement?

-mg
Hi Mario,

This is common knowledge, but not completely true. The bottleneck of BWT is the suffix sorting step, and there have been many recent advances that significantly reduce the time and space needs of the algorithm. Of course, it will probably never be as fast as a lightweight Ziv-Lempel implementation such as lzo, but I believe (as do many others) that it can be made to run as fast as a medium-weight LZ implementation while compressing a lot more. Also, note that we are dealing with a specific application, not a general-purpose library/compression utility such as bzip2. Exactly how fast can BWT be made to run under ZFS? I don't know, and I don't have a clue. It will require a lot of investigation to find out.

Now, regarding the hardness of implementing BWT, you are right in the sense that fast suffix sorting algorithms are not at all trivial to implement.

Kind regards,

Domingos.

On 10/8/07, Mario Goebbels <me at tomservo.cc> wrote:
> > Besides, there are some new results about BWT that I'm sure would
> > be of interest in this context.
>
> I thought bzip2/BWT is a compression scheme that has a heavy footprint
> and is generally brain-damaging to implement?
>
> -mg
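For readers unfamiliar with the transform under discussion, a minimal illustrative sketch of the forward and inverse BWT in Python follows. It uses naive rotation sorting, which is exactly the suffix-sorting bottleneck the proposal's cited linear-time algorithms would replace; a kernel implementation would of course look nothing like this.

```python
def bwt(s: str) -> str:
    """Forward Burrows-Wheeler Transform via naive rotation sorting.
    With a unique sentinel appended, sorting rotations is equivalent
    to suffix sorting -- the step that dominates BWT's running time."""
    assert "\0" not in s
    s += "\0"  # unique smallest sentinel marks the original rotation
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rot[-1] for rot in rotations)  # last column only

def ibwt(last: str) -> str:
    """Inverse transform: repeatedly prepend the last column and sort,
    rebuilding the sorted rotation table one column per iteration."""
    table = [""] * len(last)
    for _ in range(len(last)):
        table = sorted(last[i] + table[i] for i in range(len(last)))
    original = next(row for row in table if row.endswith("\0"))
    return original[:-1]
```

The transform itself compresses nothing: it merely clusters equal characters (`bwt("banana")` yields `"annb\0aa"`, grouping the a's and n's) so that a cheap second stage such as move-to-front plus run-length or entropy coding can do the actual compression.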
besides re-inventing the wheel, somebody at sun should wake up and go ask mr. oberhumer and pay him $$$ to get lzo into ZFS.

this is taken from http://www.oberhumer.com/opensource/lzo/lzodoc.php :

Copyright
---------
LZO is Copyright (C) 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005 Markus Franz Xaver Johannes Oberhumer

LZO is distributed under the terms of the GNU General Public License (GPL). See the file COPYING.

Special licenses for commercial and other applications which are not willing to accept the GNU General Public License are available by contacting the author.

so, lzo with opensolaris doesn't sound like a no-go to me.

if Sun doesn't jump in to pay for that - let's create some LZO-into-ZFS fund.

i'm here with the first $100. :)
On 8-Oct-07, at 5:39 PM, roland wrote:
> besides re-inventing the wheel, somebody at sun should wake up and
> go ask mr. oberhumer and pay him $$$ to get lzo into ZFS.
>
> [LZO copyright and licensing details snipped]
>
> so, lzo with opensolaris doesn't sound like a no-go to me.
>
> if Sun doesn't jump in to pay for that - let's create some
> LZO-into-ZFS fund.
>
> i'm here with the first $100. :)

I'm in too. :-) LZO is a great product (my company relies on OpenVPN).

--Toby
for those who are interested in lzo with zfs, i have made a special version of the patch taken from the zfs-fuse mailing list:

http://82.141.46.148/tmp/zfs-fuse-lzo.tgz

this file contains the patch in unified diff format and also a broken-out version (i.e. split into single files). maybe this makes integrating into an onnv tree easier and is also better for review.

i took a quick look and compared it to the onnv sources, and it looks like it's not too hard to integrate - most lines are new files, and the onnv files seem to be changed only a little.

unfortunately i have no solaris build environment around for now, so i cannot give it a try, and i also have no clue if this will compile at all. maybe the code needs much rework to be able to run in kernel space, maybe not - but some solaris kernel hacker will know better....
I haven't heard from any other core contributors, but this sounds like a worthy project to me. Someone from the ZFS team should follow through to create the project on os.org[1].

It sounds like Domingos and Roland might constitute the initial "project team".

- Eric

[1] http://www.opensolaris.org/os/community/ogb/policies/project-instantiation.txt

On Sun, Oct 07, 2007 at 03:56:04PM -0300, Domingos Soares wrote:
> Hi,
>
> No news. I received some very good suggestions, but unfortunately I
> didn't get as much discussion as I had hoped. I'm sending the
> project proposal again.
>
> [full project proposal and quoted benchmarks snipped - quoted in the
> original message above]

--
Eric Schrock, Solaris Kernel Development    http://blogs.sun.com/eschrock
> I haven't heard from any other core contributors, but this sounds like a
> worthy project to me. Someone from the ZFS team should follow through
> to create the project on os.org[1]
>
> It sounds like Domingos and Roland might constitute the initial
> "project team".

In my opinion, the project should also include an effort to get LZO into ZFS, as a fast but still efficient variant.

For that matter, if it were up to me, there would be an effort to modularize the ZFS compression algorithms into loadable kernel modules, allowing easy addition of algorithms. I suppose the same should apply to other components where possible, e.g. the spacemap allocator discussed on this list. But I'm a mere C# coder, so I can't really help with that.

-mg
Yes, I think that was the original intent of the project proposal. It could probably be reworded to decrease the emphasis on a single algorithm, but I read it as a generic exploration of alternative algorithms.

Pluggable algorithms are tricky, because compression is encoded as a single 8-bit quantity in the block pointer. This doesn't make it impossible, just difficult. One could imagine, for example, reserving the top bit to indicate that the remainder of the value is an index into some auxiliary table that can identify compression schemes in some extended manner. This avoids the centralized repository, but introduces a number of interesting failure modes, such as being unable to open a pool because it uses an unsupported compression scheme. All very doable, but it's a lot of work for (IMO) little gain, not to mention the increased difficulty of maintaining compatibility across disparate versions (what is the set of compression algorithms needed to be 100% compatible?).

- Eric

On Sun, Oct 14, 2007 at 03:28:24AM -0700, me at tomservo.cc wrote:
> In my opinion, the project should also include an effort to get LZO
> into ZFS, as a fast but still efficient variant.
>
> For that matter, if it were up to me, there would be an effort to
> modularize the ZFS compression algorithms into loadable kernel modules,
> allowing easy addition of algorithms.
>
> -mg

--
Eric Schrock, Solaris Kernel Development    http://blogs.sun.com/eschrock
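Eric's top-bit idea can be sketched in a few lines. This is purely illustrative Python, not the on-disk ZFS format: the built-in values and the contents of the auxiliary table are invented for the example.

```python
# Hypothetical decoding of the 8-bit compression field in a block
# pointer. Values with the top bit clear name built-in algorithms;
# with the top bit set, the low 7 bits index an auxiliary per-pool
# table of extended schemes.
BUILTIN = {0: "inherit", 1: "on", 2: "off", 3: "lzjb"}  # illustrative values
EXTENDED = ["lzo", "bwt"]  # auxiliary table stored with the pool

def decode_compression(field: int) -> str:
    if field & 0x80:
        index = field & 0x7F
        if index >= len(EXTENDED):
            # the failure mode Eric mentions: the pool was written with
            # a compression scheme this system does not know about
            raise ValueError("unsupported extended compression scheme")
        return EXTENDED[index]
    return BUILTIN[field]
```

With this scheme, `decode_compression(3)` yields `"lzjb"` and `decode_compression(0x81)` yields `"bwt"`, while an index past the end of the table reproduces the "can't open the pool" failure mode rather than silently misreading data.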
we're at $300 now - a friend of mine just added another $100
> Yes, I think that was the original intent of the project proposal. It
> could probably be reworded to decrease the emphasis on a single algorithm,
> but I read it as a generic exploration of alternative algorithms.

Yes, you read right. The original intent was to investigate alternative algorithms in a general way instead of focusing on a single one.

> All very
> doable, but it's a lot of work for (IMO) little gain, not to mention
> the increased difficulty of maintaining compatibility across disparate
> versions (what is the set of compression algorithms needed to be 100%
> compatible?).

Do you have any strong reason to believe that there would be little gain, or is it just a conjecture? I think I could agree with you, but in my case it's just a conjecture. IMHO, I have no idea how much gain would be possible, and I think that is the main point we need to discover. I'm performing some tests to try to find out if different compression algorithms really are better suited to different entropy scenarios. A lot of papers claim this, but no one, as far as I know, has given arguments to support it. As soon as I have some results I will post them here. If the claim is true, I'm sure loadable compression modules would be a very good improvement to ZFS. If it's not true, I think the best thing to do is just port LZO to ZFS and leave aside any other compression algorithm.

> > > I haven't heard from any other core contributors, but this sounds like a
> > > worthy project to me. Someone from the ZFS team should follow through
> > > to create the project on os.org[1]

Thank you very much for the support.

Domingos.

[remainder of quoted thread snipped]
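The entropy-scenario question above can be prototyped cheaply in userland before touching the kernel. The sketch below is a hypothetical selection policy, not part of any existing ZFS code: the thresholds and algorithm names are invented for illustration, and a real adaptive scheme (as in Adam Leventhal's suggestion) would need measurement rather than guesses.

```python
import math
from collections import Counter

def block_entropy(block: bytes) -> float:
    """Shannon entropy of a data block in bits per byte (0.0 .. 8.0)."""
    if not block:
        return 0.0
    n = len(block)
    return -sum((c / n) * math.log2(c / n) for c in Counter(block).values())

def choose_algorithm(block: bytes) -> str:
    """Illustrative adaptive policy: route each block to a compressor
    based on its byte entropy. The thresholds are made up."""
    h = block_entropy(block)
    if h > 7.5:
        return "none"   # near-incompressible (e.g. already-compressed data)
    if h > 5.0:
        return "lzjb"   # fast, modest ratio
    return "bwt"        # slower, but high ratio on redundant data
```

Running `block_entropy` over representative blocks from web-server, mail-server, and VM-image workloads would be one concrete way to test whether distinct entropy regimes actually occur often enough to justify pluggable algorithms.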
*bump*

just wanted to keep this in the discussion. i think it could be important to zfs if it could compress faster with a better compression ratio.
> Robert Milkowski wrote:
> During christmas I managed to add my own compression to zfs - it was quite easy.

Great to see innovation, but unless your personal compression method is somehow better (very fast with excellent compression), would it not be a better idea to use an existing (leading edge) compression method?

7-Zip's (http://www.7-zip.org/) newest methods are LZMA and PPMD (http://www.7-zip.org/7z.html). There is a "proprietary license" for LZMA that _might_ interest Sun, but PPMD has "no explicit license" - see this link:

Using PPMD for compression
http://www.codeproject.com/KB/recipes/ppmd.aspx

Rob
Hello Rob,

Sunday, July 20, 2008, 12:11:56 PM, you wrote:

>> Robert Milkowski wrote:
>> During christmas I managed to add my own compression to zfs - it was quite easy.

RC> Great to see innovation, but unless your personal compression
RC> method is somehow better (very fast with excellent
RC> compression), would it not be a better idea to use an
RC> existing (leading edge) compression method?

Well, it was just an exercise on my side to get a better understanding of ZFS internals - it definitely wasn't about writing any new compression algorithm.

--
Best regards,
Robert                          mailto:milek at task.gda.pl
                                http://milek.blogspot.com
> It would be trivial to make the threshold a tunable, but we're
> trying to avoid this sort of thing. I don't want there to be a
> ZFS tuning guide, ever. That would mean we failed.
>
> Jeff

harumph... http://www.solarisinternals.com/wiki/index.php/ZFS_Evil_Tuning_Guide :-)

Well, now that that battle is lost (and, unrelated, lzma is in the kernel), how about a renewed interest in zfs compression :)