Hello zfs-discuss,

During Christmas I managed to add my own compression to zfs - it was quite
easy. Actually I registered a new compression named milek with functions
milek_compress and milek_decompress - but I copied the same algorithm used
for lzjb - first I wanted to learn how to register/compile a new compression.

One thing I noticed - I had to also add the new compression name to libzfs -
I think that available compression methods should be exported from the zfs
module and not directly coded in libzfs. Anyway it worked.

Now in my free time (well, no vacations right now...) I'm trying to put zlib
into ZFS... Does anyone know any ready compression functions written in C
which can be used in a kernel (no mallocs, etc.)? Perhaps something from BSD
(so no GPL<->CDDL problems will show up)?

Another thing - why is there a 12.5% limit on using compression (so that if
the compression method compressed by less than 12.5%, zfs writes the
uncompressed data)? Why 12.5 and not a different value? Why is it hard-coded?

--
Best regards,
Robert                 mailto:rmilkowski at task.gda.pl
                       http://milek.blogspot.com
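For readers following along, the registration Robert describes boils down to appending an entry to a hardcoded table of named algorithms (and, today, teaching libzfs the same name by hand). Here is a hedged userland sketch; the struct layout, function signatures, and stub algorithms are illustrative stand-ins, not the actual kernel definitions:

```c
#include <stddef.h>
#include <string.h>

/* Illustrative signatures; the real kernel typedefs differ in detail. */
typedef size_t (*compress_fn)(const void *src, void *dst, size_t len);
typedef int (*decompress_fn)(const void *src, void *dst,
    size_t srclen, size_t dstlen);

typedef struct compress_info {
	const char   *ci_name;        /* name the zfs(1M) property maps to */
	compress_fn   ci_compress;
	decompress_fn ci_decompress;
} compress_info_t;

/* Stand-ins for lzjb_compress/lzjb_decompress and Robert's copies. */
static size_t stub_compress(const void *s, void *d, size_t n)
{
	memcpy(d, s, n);                    /* "compresses" to same size */
	return (n);
}

static int stub_decompress(const void *s, void *d, size_t sn, size_t dn)
{
	memcpy(d, s, sn < dn ? sn : dn);
	return (0);
}

/* Adding an algorithm means appending an entry to a table like this. */
static const compress_info_t compress_table[] = {
	{ "off",   NULL,          NULL },
	{ "lzjb",  stub_compress, stub_decompress },
	{ "milek", stub_compress, stub_decompress },  /* Robert's addition */
};

static const compress_info_t *compress_lookup(const char *name)
{
	for (size_t i = 0;
	    i < sizeof (compress_table) / sizeof (compress_table[0]); i++)
		if (strcmp(compress_table[i].ci_name, name) == 0)
			return (&compress_table[i]);
	return (NULL);
}
```

The libzfs complaint Robert mentions is exactly the symptom of a second, separate copy of the name list: the kernel table above knows "milek" but the userland tool does not.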
Hi Robert,

Robert Milkowski wrote:
> During Christmas I managed to add my own compression to zfs - it was
> quite easy. Actually I registered a new compression named milek with
> functions milek_compress and milek_decompress - but I copied the
> same algorithm used for lzjb - first I wanted to learn how to
> register/compile a new compression.

*cool* --- and "people" said that nobody outside of Sun would be
interested in making use of Solaris kernel code. OpenSolaris
Community +1, naysayers 0!

> One thing I noticed - I had to also add the new compression name to
> libzfs - I think that available compression methods should be
> exported from the zfs module and not directly coded in libzfs.
> Anyway it worked.

Sounds like a reasonable RFE to me.

> Now in my free time (well, no vacations right now...) I'm trying to
> put zlib into ZFS...

umm, zlib as in /usr/lib/libz.* or something else? If it's /usr/lib/libz
then don't we already have access to all that already?

> Does anyone know any ready compression functions written in C which
> can be used in a kernel (no mallocs, etc.)? Perhaps something from
> BSD (so no GPL<->CDDL problems will show up)?

I think it's time to dig out your copy of Knuth.....

> Another thing - why is there a 12.5% limit on using compression (so
> that if the compression method compressed by less than 12.5%, zfs
> writes the uncompressed data)? Why 12.5 and not a different value?
> Why is it hard-coded?

That's a good question. I don't know for sure so I'm guessing -- no doubt
during the design and early scoping-of-work phases the 12.5% figure was
determined. It's probably got something to do with the overhead of storing
the actual data such that over 12.5% we get a > X% benefit, and below
that the cost of the compression is either too high or not sufficiently
different. I await some facts (to spoil my hypothesizing!) from team ZFS.

So have you blogged about your Christmas coding yet?

cheers,
James C. McPherson
--
Solaris Datapath Engineering
Data Management Group
Sun Microsystems
On Wed, Jan 04, 2006 at 04:53:34PM +1100, James C. McPherson wrote:
> > One thing I noticed - I had to also add the new compression name to
> > libzfs - I think that available compression methods should be
> > exported from the zfs module and not directly coded in libzfs.
> > Anyway it worked.
>
> Sounds like a reasonable RFE to me.

Feel free to file it, but I wouldn't expect it to be fixed any time soon.
Adding a compression algorithm requires modifying a hardcoded table and
making changes to the on-disk format, so I don't see why modifying libzfs
is a problem.

We would like to support a more pluggable architecture (and there is an
open RFE), but as you can see there is more work to do than just exporting
the table to libzfs.

- Eric

--
Eric Schrock, Solaris Kernel Development       http://blogs.sun.com/eschrock
> > Another thing - why is there a 12.5% limit on using compression (so
> > that if the compression method compressed by less than 12.5%, zfs
> > writes the uncompressed data)? Why 12.5 and not a different value?
> > Why is it hard-coded?
>
> That's a good question. I don't know for sure so I'm guessing -- no doubt
> during the design and early scoping-of-work phases the 12.5% figure was
> determined. It's probably got something to do with the overhead of storing
> the actual data such that over 12.5% we get a > X% benefit, and below
> that the cost of the compression is either too high or not sufficiently
> different. I await some facts (to spoil my hypothesizing!) from team ZFS.

Right. The threshold is somewhat arbitrary, and not terribly important
in practice. Data tends to compress either quite well (2x is common)
or not at all (e.g. JPEG files, which are already compressed).

It would be trivial to make the threshold a tunable, but we're trying to
avoid this sort of thing. I don't want there to be a ZFS tuning guide,
ever. That would mean we failed.

Jeff
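One observation on the 12.5% figure: one eighth is a single shift in integer arithmetic, so the check is essentially free. A hedged sketch of such a threshold test (the function name is illustrative, not the actual kernel symbol):

```c
#include <stddef.h>

/*
 * Keep the compressed copy only if it is at most 87.5% of the
 * original size, i.e. the compression saved at least 1/8 (12.5%)
 * of the block.  "1/8" costs one cheap right shift, which may be
 * part of why the cutoff is 12.5% rather than a rounder decimal.
 */
static int worth_compressing(size_t lsize, size_t psize)
{
	size_t max_psize = lsize - (lsize >> 3);   /* 87.5% of lsize */
	return (psize <= max_psize);
}
```

For a 4K block this keeps the compressed copy only when it fits in 3584 bytes or less; anything larger is written uncompressed.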
> During Christmas I managed to add my own compression to zfs

Cool!

> Now in my free time (well, no vacations right now...) I'm trying to
> put zlib into ZFS...

We actually have the decompress half of zlib in the kernel already
(to support CTF data). We'll be adding the compress half soon so that
ZFS can use it.

> Does anyone know any ready compression functions written in C

I'd be interested too. There's no shortage of compression algorithms,
but not many are suitable for in-kernel, hot-code-path usage.
The advantages of lzjb are that it's fast, reentrant, stateless,
and incredibly small (85 lines of code). But it doesn't squeeze
the bits as hard as zlib does.

Jeff
Jeff Bonwick wrote:
> The advantages of lzjb are that it's fast, reentrant, stateless,
> and incredibly small (85 lines of code). But it doesn't squeeze
> the bits as hard as zlib does.

Is the lzjb algorithm documented somewhere that's not CDDL source code?
(GPL compatibility probs with CDDL source code)

Luke.
> Is the lzjb algorithm documented somewhere that's not CDDL source code?
> (GPL compatibility probs with CDDL source code)

Sorry, no.

Jeff
On Wed, 2006-01-04 at 06:53, Eric Schrock wrote:
> We would like to support a more pluggable architecture (and there is an
> open RFE), but as you can see there is more work to do than just
> exporting the table to libzfs.

One thing that might help with this is if we added compression support
to the crypto framework. Compression ops are basically the same type of
op as crypto anyway. We have already been asked by Hifn if we would
consider doing this because they have a patented compression algorithm
that they implement in hardware; this would allow ZFS to use it (on
machines where that hardware exists).

--
Darren J Moffat
> One thing that might help with this is if we added compression support
> to the crypto framework. Compression ops are basically the same type of
> op as crypto anyway. We have already been asked by Hifn if we would
> consider doing this because they have a patented compression algorithm
> that they implement in hardware; this would allow ZFS to use it (on
> machines where that hardware exists).

That's interesting. Bill recently restructured the zio pipeline so that
all CPU-intensive operations (compress, encrypt, checksum) are
consecutive pipeline stages, with the thought that we'd eventually like
to merge them so that we only make one pass over the data. (I'd also
like to fold in RAID-Z parity generation, but that's a little trickier.)
We should talk.

Jeff
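The single-pass idea Jeff describes can be illustrated with a toy fused loop: rather than walking the buffer once to encrypt and again to checksum, both stages consume each byte in one pass. The XOR "cipher" and additive checksum below are deliberately trivial stand-ins, nothing like what ZFS or the crypto framework actually use:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * One pass over the data: "encrypt" each byte, then fold the
 * ciphertext byte into a running checksum without re-reading the
 * buffer.  A real pipeline would fuse a real cipher and hash the
 * same way, halving the memory traffic of two separate stages.
 */
static uint32_t encrypt_and_checksum(uint8_t *buf, size_t len, uint8_t key)
{
	uint32_t sum = 0;

	for (size_t i = 0; i < len; i++) {
		buf[i] ^= key;     /* toy cipher stage */
		sum += buf[i];     /* toy checksum stage, same pass */
	}
	return (sum);
}
```

The win is purely in data movement: for blocks much larger than the cache, touching each byte once instead of twice (or three times, with compression as a third stage) matters more than the per-byte arithmetic.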
Casper.Dik at Sun.COM
2006-Jan-04 11:20 UTC
[zfs-discuss] Adding my own compression to zfs
>> One thing that might help with this is if we added compression support
>> to the crypto framework. Compression ops are basically the same type of
>> op as crypto anyway. We have already been asked by Hifn if we would
>> consider doing this because they have a patented compression algorithm
>> that they implement in hardware; this would allow ZFS to use it (on
>> machines where that hardware exists).
>
> That's interesting. Bill recently restructured the zio pipeline
> so that all CPU-intensive operations (compress, encrypt, checksum)
> are consecutive pipeline stages, with the thought that we'd
> eventually like to merge them so that we only make one pass
> over the data. (I'd also like to fold in RAID-Z parity
> generation, but that's a little trickier.) We should talk.

I'd imagine that compression is different in two possibly important
details:

 - the amount of data in and out is not equal (less data out when
   compressing, (much) more when decompressing);

 - compression can presumably fail (data does not compress) without it
   being an error for the process.

Casper
On Wed, 2006-01-04 at 11:14, Jeff Bonwick wrote:
> > One thing that might help with this is if we added compression support
> > to the crypto framework. Compression ops are basically the same type of
> > op as crypto anyway. We have already been asked by Hifn if we would
> > consider doing this because they have a patented compression algorithm
> > that they implement in hardware; this would allow ZFS to use it (on
> > machines where that hardware exists).
>
> That's interesting. Bill recently restructured the zio pipeline
> so that all CPU-intensive operations (compress, encrypt, checksum)
> are consecutive pipeline stages, with the thought that we'd
> eventually like to merge them so that we only make one pass
> over the data. (I'd also like to fold in RAID-Z parity
> generation, but that's a little trickier.) We should talk.

and this is where the crypto framework should be able to help you out,
since we already have the ability to do hash and encrypt at the same
time, doing only a single pass over the data. Of course we would be
compressing, then encrypting and hashing the data, so it will always be
"two" passes (compress the real data, then encrypt and hash the
compressed data in a single pass).

--
Darren J Moffat
On Wed, 2006-01-04 at 06:20, Casper.Dik at Sun.COM wrote:
> I'd imagine that compression is different in two possibly important
> details:
>  - the amount of data in and out is not equal
>    (less data out when compressing, (much) more when decompressing)

This is also true for some encryption modes.

					- Bill
> > Is the lzjb algorithm documented somewhere that's not CDDL source code?
> > (GPL compatibility probs with CDDL source code)
>
> Sorry, no.

Shame, I was hoping it might not be too hard for me to write a read-only
linux driver for zfs, as that would help lower the barrier for linux
people to try solaris. I don't think I can look at any solaris ZFS code
if I want to do that though.

Does anyone know what lzjb stands for? I presume that "lz" refers to
Lempel & Ziv and their compression algorithms:

http://en.wikipedia.org/wiki/LZ77_and_LZ78_%28algorithms%29

...so I guess lzjb would be a derivative of Lempel & Ziv's algorithms?

I wonder who J. and B. were.

Luke.

This message posted from opensolaris.org
> Shame, I was hoping it might not be too hard for me to write a read-only
> linux driver for zfs, as that would help lower the barrier for linux
> people to try solaris. I don't think I can look at any solaris ZFS code
> if I want to do that though.

Right. (But please join me in resisting the urge to turn this thread
into a license debate. It's all been said before.)

> ...so I guess lzjb would be a derivative of Lempel & Ziv's algorithms?
>
> I wonder who J. and B. were

That would be me (first.last, following the convention set by LZRW).
I wrote it many years ago to compress crash dumps. That meant that it
had to work in panic() context, which is *really* restricted -- no
malloc, no threads, no nothing -- and it had to be fast, because
compression time is down time. We put it in ZFS because we have it,
we know it's safe, and it does a decent job in very little time.

Jeff
Hello James,

Wednesday, January 4, 2006, 6:53:34 AM, you wrote:

JCM> Hi Robert,

>> Now in my free time (well, no vacations right now...) I'm trying to
>> put zlib into ZFS...

JCM> umm, zlib as in /usr/lib/libz.* or something else? If it's
JCM> /usr/lib/libz then don't we already have access to all that already?

zlib which is libz - I downloaded the sources from the official zlib
page. Where in the kernel is libz used?

JCM> So have you blogged about your Christmas coding yet?

Not yet - I wanted to do it later when it would be more useful - but
I'll blog about it sooner and eventually update it later :) Right now
it's hard to find some free time...

--
Best regards,
Robert                 mailto:rmilkowski at task.gda.pl
Hello Eric,

Wednesday, January 4, 2006, 7:53:43 AM, you wrote:

ES> On Wed, Jan 04, 2006 at 04:53:34PM +1100, James C. McPherson wrote:
>>
>> > One thing I noticed - I had to also add the new compression name to
>> > libzfs - I think that available compression methods should be
>> > exported from the zfs module and not directly coded in libzfs.
>> > Anyway it worked.
>>
>> Sounds like a reasonable RFE to me.

ES> Feel free to file it, but I wouldn't expect it to be fixed any time
ES> soon. Adding a compression algorithm requires modifying a hardcoded
ES> table and making changes to the on-disk format, so I don't see why
ES> modifying libzfs is a problem.

ES> We would like to support a more pluggable architecture (and there is
ES> an open RFE), but as you can see there is more work to do than just
ES> exporting the table to libzfs.

I know it's not critical - it's just that I was surprised that the zfs
command didn't work after I added "another" compression to zfs. The
changes to libzfs were trivial, of course.

Another thing - the zfs GUI doesn't work either after I added the
compression (some kind of exception saying it doesn't know a compression
named "milek"). To be honest I haven't tried the GUI without my
modification either - but as it complains about this new compression
name, I take it it would work without my modifications.

--
Best regards,
Robert                 mailto:rmilkowski at task.gda.pl
Hello Jeff,

Wednesday, January 4, 2006, 8:58:19 AM, you wrote:

>> > Another thing - why is there a 12.5% limit on using compression (so
>> > that if the compression method compressed by less than 12.5%, zfs
>> > writes the uncompressed data)? Why 12.5 and not a different value?
>> > Why is it hard-coded?
>>
>> That's a good question. I don't know for sure so I'm guessing -- no
>> doubt during the design and early scoping-of-work phases the 12.5%
>> figure was determined. It's probably got something to do with the
>> overhead of storing the actual data such that over 12.5% we get a
>> > X% benefit, and below that the cost of the compression is either
>> too high or not sufficiently different. I await some facts (to spoil
>> my hypothesizing!) from team ZFS.

JB> Right. The threshold is somewhat arbitrary, and not terribly important
JB> in practice. Data tends to compress either quite well (2x is common)
JB> or not at all (e.g. JPEG files, which are already compressed).

ok

JB> It would be trivial to make the threshold a tunable, but we're
JB> trying to avoid this sort of thing. I don't want there to be a
JB> ZFS tuning guide, ever. That would mean we failed.

In this case I believe it's not a problem - it's similar to specifying
different compression algorithms and their parameters. Sometimes this
12.5% can make a difference (when you've got lots of TBs of data).

--
Best regards,
Robert                 mailto:rmilkowski at task.gda.pl
Hello Eric,

ES> Feel free to file it, but I wouldn't expect it to be fixed any time
ES> soon. Adding a compression algorithm requires modifying a hardcoded
ES> table and making changes to the on-disk format, so I don't see why
ES> modifying libzfs is a problem.

One thing that just occurred to me - does that mean that the ZFS on-disk
format will change again due to the new (coming?) compression in ZFS, in
such a way that the new bits won't be able to mount data from current ZFS?

--
Best regards,
Robert                 mailto:rmilkowski at task.gda.pl
                       http://milek.blogspot.com
Hi Robert,

You asked about compression functions in C which were kernel-friendly.
At my previous employer I adapted LZO for this purpose and it was
trivial.

http://www.oberhumer.com/opensource/lzo/

Actually, there are large classes of compression algorithms which use
bounded memory, and any of those would probably be suitable for
in-kernel use with minor tweaks.

As always, if you use someone else's code for other than personal use,
there may be licensing issues, and if you use someone else's algorithm
there may be patent issues. It keeps the lawyers in business. :-)

Anton
On Thu, Jan 05, 2006 at 03:04:10PM +0100, Robert Milkowski wrote:
> Hello Eric,
>
> ES> Feel free to file it, but I wouldn't expect it to be fixed any time
> ES> soon. Adding a compression algorithm requires modifying a hardcoded
> ES> table and making changes to the on-disk format, so I don't see why
> ES> modifying libzfs is a problem.
>
> One thing that just occurred to me - does that mean that the ZFS on-disk
> format will change again due to the new (coming?) compression in ZFS, in
> such a way that the new bits won't be able to mount data from current ZFS?

It's not clear exactly how we'll do this, but the change will be
backwards-compatible, not upwards-compatible. Old filesystems will mount
under the new version, but new filesystems that choose to use a new
compression algorithm will not be accessible under old versions.

- Eric

--
Eric Schrock, Solaris Kernel Development       http://blogs.sun.com/eschrock
Hello Eric,

Thursday, January 5, 2006, 7:32:28 PM, you wrote:

ES> It's not clear exactly how we'll do this, but the change will be
ES> backwards-compatible, not upwards-compatible. Old filesystems will
ES> mount under the new version, but new filesystems that choose to use
ES> a new compression algorithm will not be accessible under old versions.

Uffff... this is great news :)

Thank you.

--
Best regards,
Robert                 mailto:rmilkowski at task.gda.pl
                       http://milek.blogspot.com
If I were to hazard a wild-ass guess, I'd say JB was Jeff Bonwick; which
would explain why he could speak so authoritatively on it :)

alan.

Luke wrote:
>>> Is the lzjb algorithm documented somewhere that's not CDDL source code?
>>> (GPL compatibility probs with CDDL source code)
>>
>> Sorry, no.
>
> Shame, I was hoping it might not be too hard for me to write a read-only
> linux driver for zfs, as that would help lower the barrier for linux
> people to try solaris. I don't think I can look at any solaris ZFS code
> if I want to do that though.
>
> Does anyone know what lzjb stands for? I presume that "lz" refers to
> Lempel & Ziv and their compression algorithms:
>
> http://en.wikipedia.org/wiki/LZ77_and_LZ78_%28algorithms%29
>
> ...so I guess lzjb would be a derivative of Lempel & Ziv's algorithms?
>
> I wonder who J. and B. were
>
> Luke.

--
Alan Hargreaves - http://blogs.sun.com/tpenta
Kernel/VOSJEC/Performance Staff Engineer
Product Technical Support (APAC)
Sun Microsystems
> Right. (But please join me in resisting the urge to turn this
> thread into a license debate. It's all been said before.)

Yeah, I'm over license debates, I'm already suffering as a consequence
of one :P

Pragmatically speaking, I'm hopeful there's a way to read ZFS
filesystems in a linux module if I jump through the right hoops (e.g.
document the lzjb algorithm from opensolaris source and then find a
friend to reimplement the algorithm?). I think it would be great for
raising awareness of opensolaris among linux users and making it a bit
easier for people to dual-boot to try out opensolaris.

> > I wonder who J. and B. were
> That would be me (first.last, following the convention set by LZRW).

Oh of course, nice work Jeff :-)
is it planned to add some other compression algorithm to zfs ?

lzjb is quite good and especially performing very well, but i`d like to
have better compression (bzip2?) - no matter how much performance drops
with it.

regards
roland
Wade.Stuart at fallon.com
2007-Jan-28 21:36 UTC
[zfs-discuss] Re: Adding my own compression to zfs
zfs-discuss-bounces at opensolaris.org wrote on 01/27/2007 06:48:17 AM:

> is it planned to add some other compression algorithm to zfs ?
>
> lzjb is quite good and especially performing very well, but i`d like
> to have better compression (bzip2?) - no matter how much performance
> drops with it.
>
> regards
> roland

From the code it looks like they have the flags set to allow multiple
types of compression. One thing you may notice is that compression is
per block: bzip2, or other compression schemes that depend very heavily
on a large dictionary keyspace for optimal compression, should not
perform very well here.

-Wade
Have a look at:

http://blogs.sun.com/ahl/entry/a_little_zfs_hack

On 27/01/07, roland <devzero at web.de> wrote:
> is it planned to add some other compression algorithm to zfs ?
>
> lzjb is quite good and especially performing very well, but i`d like to
> have better compression (bzip2?) - no matter how much performance drops
> with it.
>
> regards
> roland

--
Rasputin :: Jack of All Trades - Master of Nuns
http://number9.hellooperator.net/
Cindy.Swearingen at Sun.COM
2007-Jan-29 17:12 UTC
[zfs-discuss] Re: Adding my own compression to zfs
See the following bug:

http://bugs.opensolaris.org/view_bug.do?bug_id=6280662

Cindy

roland wrote:
> is it planned to add some other compression algorithm to zfs ?
>
> lzjb is quite good and especially performing very well, but i`d like to
> have better compression (bzip2?) - no matter how much performance drops
> with it.
>
> regards
> roland
> Have a look at:
>
> http://blogs.sun.com/ahl/entry/a_little_zfs_hack

thanks for the link, dick !

this sounds fantastic ! is the source for that available (yet) somewhere ?

> Adam Leventhal's Weblog
> inside the sausage factory

btw - just wondering - is this some english phrase or some running gag ?
i have seen it once before on another blog, so i`m wondering....

greetings from the beer and sausage nation ;)

roland
Matt Ingenthron
2007-Jan-29 22:15 UTC
[zfs-discuss] Re: Re: Adding my own compression to zfs
roland wrote:
>> Adam Leventhal's Weblog
>> inside the sausage factory
>
> btw - just wondering - is this some english phrase or some running gag ?
> i have seen it once before on another blog, so i`m wondering....
>
> greetings from the beer and sausage nation ;)

It's a response to a common English colloquialism which says "nearly
everybody likes eating sausage, but many people would probably rather
not see how it's made".

Adam is a sausage maker in the Solaris world. OpenSolaris is the newly
expanded, room-for-everyone Solaris sausage factory. His blog covers
topics relating to what goes on in his sausage-making duties.

- Matt

p.s.: The web says a German word for colloquialism is umgangssprachlich.

--
Matt Ingenthron - Web Infrastructure Solutions Architect
Sun Microsystems, Inc. - Global Systems Practice
http://blogs.sun.com/mingenthron/
email: matt.ingenthron at sun.com    Phone: 310-242-6439
The lzjb compression implementation (IMO) is the fastest one on SPARC
Solaris systems. I've seen it beat lzo in speed while not necessarily in
compressibility. I've measured both implementations inside Solaris SPARC
kernels, and would love to hear from others about their experiences.

As someone else alluded, multithreading the compression implementation
will certainly improve performance.

Sri
hey, thanks for your overwhelming private lesson in english colloquialisms :D

now back to the technical :)

> # zfs create pool/gzip
> # zfs set compression=gzip pool/gzip
> # cp -r /pool/lzjb/* /pool/gzip
> # zfs list
> NAME        USED  AVAIL  REFER  MOUNTPOINT
> pool/gzip  64.9M  33.2G  64.9M  /pool/gzip
> pool/lzjb   128M  33.2G   128M  /pool/lzjb
>
> That's with a 1.2G crash dump (pretty much the most compressible file
> imaginable). Here are the compression ratios with a pile of ELF binaries
> (/usr/bin and /usr/lib):
>
> # zfs get compressratio
> NAME       PROPERTY       VALUE  SOURCE
> pool/gzip  compressratio  3.27x  -
> pool/lzjb  compressratio  1.89x  -

this looks MUCH better than i would have ever expected for smaller files.

any real-world data on how good or bad the compressratio gets with lots
of very small but well-compressible files, for example some (evil for
those solaris evangelists) untarred linux-source tree ?

i'm rather excited how effectively gzip will compress here.

for comparison:

sun1:/comptest # bzcat /tmp/linux-2.6.19.2.tar.bz2 | tar xvf -
--snipp--

sun1:/comptest # du -s -k *
143895  linux-2.6.19.2
1       pax_global_header

sun1:/comptest # du -s -k --apparent-size *
224282  linux-2.6.19.2
1       pax_global_header

sun1:/comptest # zfs get compressratio comptest
NAME      PROPERTY       VALUE  SOURCE
comptest  compressratio  1.79x  -
Bill Sommerfeld
2007-Jan-29 22:40 UTC
[zfs-discuss] Re: Re: Adding my own compression to zfs
On Mon, 2007-01-29 at 14:15 -0800, Matt Ingenthron wrote:
> It's a response to a common English colloquialism which says "nearly
> everybody likes eating sausage, but many people would probably rather
> not see how it's made".

I've actually seen the quote attributed to a German, Otto von Bismarck,
rendered in English as:

	"Laws are like sausages -- it is better not to see them
	being made."

or

	"If you like laws and sausages, you should never watch
	either one being made."

Of course, the same can, and has, been said about software...

					- Bill
Adam Leventhal
2007-Jan-30 04:54 UTC
[zfs-discuss] Re: Re: Adding my own compression to zfs
On Mon, Jan 29, 2007 at 02:39:13PM -0800, roland wrote:
> > # zfs get compressratio
> > NAME       PROPERTY       VALUE  SOURCE
> > pool/gzip  compressratio  3.27x  -
> > pool/lzjb  compressratio  1.89x  -
>
> this looks MUCH better than i would have ever expected for smaller files.
>
> any real-world data on how good or bad the compressratio gets with lots
> of very small but well-compressible files, for example some untarred
> linux-source tree ?
>
> for comparison:
>
> sun1:/comptest # du -s -k *
> 143895  linux-2.6.19.2
>
> sun1:/comptest # zfs get compressratio comptest
> NAME      PROPERTY       VALUE  SOURCE
> comptest  compressratio  1.79x  -

Don't start sending me your favorite files to compress (it really should
work about the same as gzip), but here's the result for the above (I
found a tar file that's about 235M uncompressed):

# du -ks linux-2.6.19.2/
80087   linux-2.6.19.2

# zfs get compressratio pool/gzip
NAME       PROPERTY       VALUE  SOURCE
pool/gzip  compressratio  3.40x  -

Doing a gzip with the default compression level (6 -- the same setting
I'm using in ZFS) yields a file that's about 52M. The small files are
hurting a bit here, but it's still pretty good -- and considerably
better than LZJB.

Adam

--
Adam Leventhal, Solaris Kernel Development       http://blogs.sun.com/ahl
an lzo in-kernel implementation for solaris/sparc ? your answer makes me
believe it exists. could you give a comment ?

roland
any news on additional compression-schemes for zfs ?

this is an interesting research-topic, imho :)

so, some more real-world tests with zfs-fuse + the lzo patch :

-LZO------------------------------------------------
zfs set compression=lzo mypool

time cp /vmware/vserver1/vserver1.vmdk /mypool

real    7m8.540s
user    0m0.708s
sys     0m24.839s

zfs get compressratio mypool
NAME    PROPERTY       VALUE  SOURCE
mypool  compressratio  1.74x  -

1.7G  vserver1.vmdk  compressed
3.0G  vserver1.vmdk  uncompressed

-LZJB------------------------------------------------
zfs set compression=lzjb mypool

time cp /vmware/vserver1/vserver1.vmdk /mypool

real    7m16.392s
user    0m0.709s
sys     0m25.107s

zfs get compressratio mypool
NAME    PROPERTY       VALUE  SOURCE
mypool  compressratio  1.47x  -

2.0G  vserver1.vmdk  compressed
3.0G  vserver1.vmdk  uncompressed

-GZIP------------------------------------------------
zfs set compression=gzip mypool

time cp /vmware/vserver1/vserver1.vmdk /mypool/

real    12m54.183s
user    0m0.653s
sys     0m24.933s

zfs get compressratio
NAME    PROPERTY       VALUE  SOURCE
mypool  compressratio  2.02x  -

1.5G  vserver1.vmdk  compressed
3.0G  vserver1.vmdk  uncompressed

btw - the lzo patch for zfs-fuse (applies to the latest zfs-fuse sources)
is at
http://groups.google.com/group/zfs-fuse/attach/a489f630aa4aa189/zfs-lzo.diff.bz2?part=4
Hi,

No news. I received some very good suggestions, but unfortunately I didn't get as much discussion as I had hoped. I'm sending the project proposal again. I think there are a lot of interesting things to research and develop regarding the subject, and I hope this time we discuss it a bit more. I would like to point out Adam Leventhal's suggestion of an adaptive compression scheme: I think it would be a challenging and interesting direction to take. Besides, there are some new results about BWT that I'm sure would be of interest in this context.

Kind Regards,

Domingos.

Follows the text of my original proposal:

-----------------------------------------------------------------------------------------------------------------

Below follows a proposal for a new opensolaris project. Of course, this is open to change, since I just wrote down some ideas I had months ago while researching the topic as a graduate student in Computer Science, and since I'm not an opensolaris/ZFS expert at all. I would really appreciate any suggestions or comments.

PROJECT PROPOSAL: ZFS Compression Algorithms.

The main purpose of this project is the development of new compression schemes for the ZFS file system. We plan to start with the development of a fast implementation of an algorithm based on the Burrows-Wheeler Transform (BWT). BWT is an outstanding tool, and the currently known lossless compression algorithms based on it outperform the compression ratio of algorithms derived from the well known Ziv-Lempel algorithm, while being somewhat more expensive in time and space. Therefore, there is room for improvement: recent results show that the running time and space needs of such algorithms can be significantly reduced, and the same results suggest that BWT is likely to become the new standard in compression algorithms[1]. Suffix sorting (i.e. the problem of sorting the suffixes of a given string) is the main bottleneck of BWT, and really significant progress has been made in this area since the first algorithms of Manber and Myers[2] and Larsson and Sadakane[3], notably the new linear-time algorithms of Karkkainen and Sanders[4], Kim, Sim and Park[5], and Ko and Aluru[6], and also the promising O(n log n) algorithm of Karkkainen and Burkhardt[7].

As a conjecture, we believe that some intrinsic properties of ZFS and file systems in general (e.g. sparseness and data entropy in blocks) could be exploited in order to produce brand new and really efficient compression algorithms, as well as to adapt existing ones to the task. The study might be extended to the analysis of data in specific applications (e.g. web servers, mail servers and others) in order to develop compression schemes for specific environments and/or modify the existing Ziv-Lempel based scheme to deal better with such environments.

[1] "The Burrows-Wheeler Transform: Theory and Practice". Manzini, Giovanni. Proc. 24th Int. Symposium on Mathematical Foundations of Computer Science.

[2] "Suffix Arrays: A New Method for On-Line String Searches". Manber, Udi and Myers, Eugene W. SIAM Journal on Computing, Vol. 22, Issue 5, 1990.

[3] "Faster Suffix Sorting". Larsson, N. Jesper and Sadakane, Kunihiko. Technical report, Department of Computer Science, Lund University, 1999.

[4] "Simple Linear Work Suffix Array Construction". Karkkainen, Juha and Sanders, Peter. Proc. 30th International Colloquium on Automata, Languages and Programming, 2003.

[5] "Linear-Time Construction of Suffix Arrays". D.K. Kim, J.S. Sim, H. Park, K. Park. CPM, LNCS Vol. 2676, 2003.

[6] "Space Efficient Linear Time Construction of Suffix Arrays". P. Ko and S. Aluru. CPM, 2003.

[7] "Fast Lightweight Suffix Array Construction and Checking". Burkhardt, Stefan and Kärkkäinen, Juha. 14th Annual Symposium, CPM, 2003.

Domingos Soares Neto
University of Sao Paulo
Institute of Mathematics and Statistics

and

IBM Software Group.

__________________________________________________________________________

On 10/7/07, roland <devzero at web.de> wrote:
> any news on additional compression-schemes for zfs ?
>
> this is an interesting research-topic, imho :)
>
> so, some more real-world tests with zfs-fuse + lzo patch :
>
> -LZO------------------------------------------------
> zfs set compression=lzo mypool
>
> time cp /vmware/vserver1/vserver1.vmdk /mypool
>
> real    7m8.540s
> user    0m0.708s
> sys     0m24.839s
>
> zfs get compressratio mypool
> NAME    PROPERTY       VALUE  SOURCE
> mypool  compressratio  1.74x  -
>
> 1.7G vserver1.vmdk compressed
> 3.0G vserver1.vmdk uncompressed
>
> -LZJB------------------------------------------------
> zfs set compression=lzjb mypool
>
> time cp /vmware/vserver1/vserver1.vmdk /mypool
>
> real    7m16.392s
> user    0m0.709s
> sys     0m25.107s
>
> zfs get compressratio mypool
> NAME    PROPERTY       VALUE  SOURCE
> mypool  compressratio  1.47x  -
>
> 2.0G vserver1.vmdk compressed
> 3.0G vserver1.vmdk uncompressed
>
> -GZIP------------------------------------------------
> zfs set compression=gzip mypool
>
> time cp /vmware/vserver1/vserver1.vmdk /mypool/
>
> real    12m54.183s
> user    0m0.653s
> sys     0m24.933s
>
> zfs get compressratio
> NAME    PROPERTY       VALUE  SOURCE
> mypool  compressratio  2.02x  -
>
> 1.5G vserver1.vmdk compressed
> 3.0G vserver1.vmdk uncompressed
>
> btw - the lzo patch for zfs-fuse (applies to the latest zfs-fuse sources) is at http://groups.google.com/group/zfs-fuse/attach/a489f630aa4aa189/zfs-lzo.diff.bz2?part=4
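The quoted benchmark figures can be cross-checked with a little arithmetic. This is a sketch only: the sizes and wall-clock times are the rounded values from roland's test, and `zfs` computes compressratio from exact byte counts, so the derived ratios only approximate the reported 1.74x/1.47x/2.02x.

```python
def summarize(uncompressed_gib, compressed_gib, wall_seconds):
    """Derive the compression ratio and effective logical write
    throughput from the sizes and timings reported above."""
    ratio = uncompressed_gib / compressed_gib
    throughput_mib_s = uncompressed_gib * 1024 / wall_seconds
    return ratio, throughput_mib_s

# (name, compressed GiB, wall seconds) from the quoted 3.0 GiB copy:
for name, comp, secs in [("lzo", 1.7, 428.5), ("lzjb", 2.0, 436.4), ("gzip", 1.5, 774.2)]:
    ratio, tput = summarize(3.0, comp, secs)
    print(f"{name}: {ratio:.2f}x at {tput:.1f} MiB/s")
```

The interesting point the numbers make: lzo achieved a better ratio than lzjb at essentially the same speed, while gzip bought its 2x ratio with nearly double the wall-clock time.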
> Besides, there are some new results about BWT that I'm sure would be
> of interest in this context.

I thought bzip2/BWT is a compression scheme that has a heavy footprint and is generally brain-damaging to implement?

-mg
Hi Mario,

This is common knowledge, but not completely true. The bottleneck of BWT is the suffix sorting step, and there have been many recent advances that significantly reduce the time and space needs of the algorithm. Of course, it will probably never be as fast as a lightweight Ziv-Lempel implementation such as lzo, but I believe (as do many others) that it can be made to run as fast as a medium-weight LZ implementation while compressing a lot more. Also, note that we are dealing with a specific application, not a general-purpose library/compression utility such as bzip2. Exactly how fast can BWT be made to run under ZFS? I don't know, and I don't have a clue. It will require a lot of investigation to find out.

Now, regarding the hardness of implementing BWT, you are right in the sense that fast suffix sorting algorithms are not at all trivial to implement.

Kind regards,

Domingos.

On 10/8/07, Mario Goebbels <me at tomservo.cc> wrote:
> > Besides, there are some new results about BWT that I'm sure would
> > be of interest in this context.
>
> I thought bzip2/BWT is a compression scheme that has a heavy footprint
> and is generally brain-damaging to implement?
>
> -mg
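For readers unfamiliar with the transform under discussion, a minimal illustrative sketch of the forward and inverse BWT in Python follows. It uses naive rotation sorting, which is exactly the suffix-sorting bottleneck the proposal's cited linear-time algorithms would replace; a kernel implementation would of course look nothing like this.

```python
def bwt(s: str) -> str:
    """Forward Burrows-Wheeler Transform via naive rotation sorting.
    With a unique sentinel appended, sorting rotations is equivalent
    to suffix sorting -- the step that dominates BWT's running time."""
    assert "\0" not in s
    s += "\0"  # unique smallest sentinel marks the original rotation
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rot[-1] for rot in rotations)  # last column only

def ibwt(last: str) -> str:
    """Inverse transform: repeatedly prepend the last column and sort,
    rebuilding the sorted rotation table one column per iteration."""
    table = [""] * len(last)
    for _ in range(len(last)):
        table = sorted(last[i] + table[i] for i in range(len(last)))
    original = next(row for row in table if row.endswith("\0"))
    return original[:-1]
```

The transform itself compresses nothing: it merely clusters equal characters (`bwt("banana")` yields `"annb\0aa"`, grouping the a's and n's) so that a cheap second stage such as move-to-front plus run-length or entropy coding can do the actual compression.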
besides re-inventing the wheel, somebody at sun should wake up and go ask mr. oberhumer and pay him $$$ to get lzo into ZFS.

this is taken from http://www.oberhumer.com/opensource/lzo/lzodoc.php :

Copyright
---------
LZO is Copyright (C) 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005 Markus Franz Xaver Johannes Oberhumer

LZO is distributed under the terms of the GNU General Public License (GPL). See the file COPYING.

Special licenses for commercial and other applications which are not willing to accept the GNU General Public License are available by contacting the author.

so, lzo with opensolaris doesn't sound like a no-go to me.

if Sun doesn't jump in to pay for that - let's create some LZO-into-ZFS fund.

i'm here with the first $100. :)
On 8-Oct-07, at 5:39 PM, roland wrote:
> besides re-inventing the wheel, somebody at sun should wake up and
> go ask mr. oberhumer and pay him $$$ to get lzo into ZFS.
>
> [LZO copyright and licensing details snipped]
>
> so, lzo with opensolaris doesn't sound like a no-go to me.
>
> if Sun doesn't jump in to pay for that - let's create some
> LZO-into-ZFS fund.
>
> i'm here with the first $100. :)

I'm in too. :-) LZO is a great product (my company relies on OpenVPN).

--Toby
for those who are interested in lzo with zfs, i have made a special version of the patch taken from the zfs-fuse mailing list:

http://82.141.46.148/tmp/zfs-fuse-lzo.tgz

this file contains the patch in unified diff format and also a broken-out version (i.e. split into single files). maybe this makes integrating into an onnv tree easier and is also better for review.

i took a quick look and compared it to the onnv sources, and it looks like it's not too hard to integrate - most lines are new files, and the onnv files seem to be changed only a little.

unfortunately i have no solaris build environment around for now, so i cannot give it a try, and i also have no clue if this will compile at all. maybe the code needs much rework to be able to run in kernel space, maybe not - but some solaris kernel hacker will know better....
I haven't heard from any other core contributors, but this sounds like a worthy project to me. Someone from the ZFS team should follow through to create the project on os.org[1].

It sounds like Domingos and Roland might constitute the initial "project team".

- Eric

[1] http://www.opensolaris.org/os/community/ogb/policies/project-instantiation.txt

On Sun, Oct 07, 2007 at 03:56:04PM -0300, Domingos Soares wrote:
> Hi,
>
> No news. I received some very good suggestions, but unfortunately I
> didn't get as much discussion as I had hoped. I'm sending the
> project proposal again.
>
> [full project proposal and quoted benchmarks snipped - quoted in the
> original message above]

--
Eric Schrock, Solaris Kernel Development    http://blogs.sun.com/eschrock
> I haven't heard from any other core contributors, but this sounds like a
> worthy project to me. Someone from the ZFS team should follow through
> to create the project on os.org[1]
>
> It sounds like Domingos and Roland might constitute the initial
> "project team".

In my opinion, the project should also include an effort to get LZO into ZFS, as a fast but still efficient variant.

For that matter, if it were up to me, there would be an effort to modularize the ZFS compression algorithms into loadable kernel modules, allowing easy addition of algorithms. I suppose the same should apply to other components where possible, e.g. the spacemap allocator discussed on this list. But I'm a mere C# coder, so I can't really help with that.

-mg
Yes, I think that was the original intent of the project proposal. It could probably be reworded to decrease the emphasis on a single algorithm, but I read it as a generic exploration of alternative algorithms.

Pluggable algorithms are tricky, because compression is encoded as a single 8-bit quantity in the block pointer. This doesn't make it impossible, just difficult. One could imagine, for example, reserving the top bit to indicate that the remainder of the value is an index into some auxiliary table that can identify compression schemes in some extended manner. This avoids the centralized repository, but introduces a number of interesting failure modes, such as being unable to open a pool because it uses an unsupported compression scheme. All very doable, but it's a lot of work for (IMO) little gain, not to mention the increased difficulty of maintaining compatibility across disparate versions (what is the set of compression algorithms needed to be 100% compatible?).

- Eric

On Sun, Oct 14, 2007 at 03:28:24AM -0700, me at tomservo.cc wrote:
> In my opinion, the project should also include an effort to get LZO
> into ZFS, as a fast but still efficient variant.
>
> For that matter, if it were up to me, there would be an effort to
> modularize the ZFS compression algorithms into loadable kernel modules,
> allowing easy addition of algorithms.
>
> -mg

--
Eric Schrock, Solaris Kernel Development    http://blogs.sun.com/eschrock
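Eric's top-bit idea can be sketched in a few lines. This is purely illustrative Python, not the on-disk ZFS format: the built-in values and the contents of the auxiliary table are invented for the example.

```python
# Hypothetical decoding of the 8-bit compression field in a block
# pointer. Values with the top bit clear name built-in algorithms;
# with the top bit set, the low 7 bits index an auxiliary per-pool
# table of extended schemes.
BUILTIN = {0: "inherit", 1: "on", 2: "off", 3: "lzjb"}  # illustrative values
EXTENDED = ["lzo", "bwt"]  # auxiliary table stored with the pool

def decode_compression(field: int) -> str:
    if field & 0x80:
        index = field & 0x7F
        if index >= len(EXTENDED):
            # the failure mode Eric mentions: the pool was written with
            # a compression scheme this system does not know about
            raise ValueError("unsupported extended compression scheme")
        return EXTENDED[index]
    return BUILTIN[field]
```

With this scheme, `decode_compression(3)` yields `"lzjb"` and `decode_compression(0x81)` yields `"bwt"`, while an index past the end of the table reproduces the "can't open the pool" failure mode rather than silently misreading data.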
we're at $300 now - a friend of mine just added another $100
> Yes, I think that was the original intent of the project proposal. It
> could probably be reworded to decrease the emphasis on a single algorithm,
> but I read it as a generic exploration of alternative algorithms.

Yes, you read right. The original intent was to investigate alternative algorithms in a general way instead of focusing on a single one.

> All very
> doable, but it's a lot of work for (IMO) little gain, not to mention
> the increased difficulty of maintaining compatibility across disparate
> versions (what is the set of compression algorithms needed to be 100%
> compatible?).

Do you have any strong reason to believe that there would be little gain, or is it just a conjecture? I think I could agree with you, but in my case it's just a conjecture. IMHO, I have no idea how much gain would be possible, and I think that is the main point we need to discover. I'm performing some tests to try to find out if different compression algorithms really are better suited to different entropy scenarios. A lot of papers claim this, but no one, as far as I know, has given arguments to support it. As soon as I have some results I will post them here. If the claim is true, I'm sure loadable compression modules would be a very good improvement to ZFS. If it's not true, I think the best thing to do is just port LZO to ZFS and leave aside any other compression algorithm.

> > > I haven't heard from any other core contributors, but this sounds like a
> > > worthy project to me. Someone from the ZFS team should follow through
> > > to create the project on os.org[1]

Thank you very much for the support.

Domingos.

[remainder of quoted thread snipped]
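The entropy-scenario question above can be prototyped cheaply in userland before touching the kernel. The sketch below is a hypothetical selection policy, not part of any existing ZFS code: the thresholds and algorithm names are invented for illustration, and a real adaptive scheme (as in Adam Leventhal's suggestion) would need measurement rather than guesses.

```python
import math
from collections import Counter

def block_entropy(block: bytes) -> float:
    """Shannon entropy of a data block in bits per byte (0.0 .. 8.0)."""
    if not block:
        return 0.0
    n = len(block)
    return -sum((c / n) * math.log2(c / n) for c in Counter(block).values())

def choose_algorithm(block: bytes) -> str:
    """Illustrative adaptive policy: route each block to a compressor
    based on its byte entropy. The thresholds are made up."""
    h = block_entropy(block)
    if h > 7.5:
        return "none"   # near-incompressible (e.g. already-compressed data)
    if h > 5.0:
        return "lzjb"   # fast, modest ratio
    return "bwt"        # slower, but high ratio on redundant data
```

Running `block_entropy` over representative blocks from web-server, mail-server, and VM-image workloads would be one concrete way to test whether distinct entropy regimes actually occur often enough to justify pluggable algorithms.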
*bump*

just wanted to keep this in the discussion. i think it could be important to zfs if it could compress faster with a better compression ratio.
> Robert Milkowski wrote:
> During christmas I managed to add my own compression to zfs - it was quite easy.

Great to see innovation, but unless your personal compression method is somehow better (very fast with excellent compression), would it not be a better idea to use an existing (leading edge) compression method?

7-Zip's (http://www.7-zip.org/) newest methods are LZMA and PPMD (http://www.7-zip.org/7z.html). There is a "proprietary license" for LZMA that _might_ interest Sun, but PPMD has "no explicit license" - see this link:

Using PPMD for compression
http://www.codeproject.com/KB/recipes/ppmd.aspx

Rob
Hello Rob,

Sunday, July 20, 2008, 12:11:56 PM, you wrote:

>> Robert Milkowski wrote:
>> During christmas I managed to add my own compression to zfs - it was quite easy.

RC> Great to see innovation, but unless your personal compression
RC> method is somehow better (very fast with excellent
RC> compression), would it not be a better idea to use an
RC> existing (leading edge) compression method?

Well, it was just an exercise on my side to get a better understanding of ZFS internals - it definitely wasn't about writing any new compression algorithm.

--
Best regards,
Robert                          mailto:milek at task.gda.pl
                                http://milek.blogspot.com
> It would be trivial to make the threshold a tunable, but we're
> trying to avoid this sort of thing. I don't want there to be a
> ZFS tuning guide, ever. That would mean we failed.
>
> Jeff

harumph... http://www.solarisinternals.com/wiki/index.php/ZFS_Evil_Tuning_Guide :-)

Well, now that that battle is lost (and, unrelated, lzma is in the kernel), how about a renewed interest in zfs compression :)