Hello zfs-discuss,

http://leaf.dragonflybsd.org/mailarchive/kernel/2007-10/msg00006.html
http://leaf.dragonflybsd.org/mailarchive/kernel/2007-10/msg00008.html

-- 
Best regards,
 Robert Milkowski                          mailto:rmilkowski at task.gda.pl
                                           http://milek.blogspot.com
and what about compression?

:D
you mean c9n ?  ;)

does anyone actually *use* compression ?  i'd like to see a poll on how many
people are using (or would use) compression on production systems that are
larger than your little department catch-all dumping ground server.  i mean,
unless you had some NDMP interface directly to ZFS, daily tape backups for
any large system will likely be an exercise in futility unless the systems
are largely just archive servers, at which point it's probably smarter to
perform backups less often, coinciding with the workflow of migrating
archive data to it.  otherwise wouldn't the system just plain get pounded?

-=dave

----- Original Message -----
From: "roland" <devzero at web.de>
To: <zfs-discuss at opensolaris.org>
Sent: Tuesday, October 16, 2007 12:44 PM
Subject: Re: [zfs-discuss] HAMMER

> and what about compression?
>
> :D
We use compression on almost all of our zpools.  We see very little if any
I/O slowdown because of this, and you get free disk space.  In fact, I
believe read I/O gets a boost from this, since decompression is cheap
compared to normal disk I/O.

Jon

Dave Johnson wrote:
> does anyone actually *use* compression ?  i'd like to see a poll on how many
> people are using (or would use) compression on production systems that are
> larger than your little department catch-all dumping ground server.

-- 
- Jonathan Loran            -              IT Manager
- Space Sciences Laboratory, UC Berkeley
- (510) 643-5146            jloran at ssl.berkeley.edu
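For anyone who wants to try the same thing, enabling compression and seeing
what it buys you is one command each way; "tank/data" below is only a
placeholder dataset name:

    # turn on lzjb compression; only blocks written from now on get compressed
    zfs set compression=on tank/data

    # later, check the achieved ratio
    zfs get compression,compressratio tank/data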
On Oct 16, 2007, at 4:36 PM, Jonathan Loran wrote:
> We use compression on almost all of our zpools.  We see very little
> if any I/O slowdown because of this, and you get free disk space.
> In fact, I believe read I/O gets a boost from this, since
> decompression is cheap compared to normal disk I/O.

Same here.  For our workload (many writes, relatively few reads), we saw at
least a 5x increase in performance (says a developer offhandedly when I
asked him) when we enabled compression.  I was expecting a boost, but I
recall being surprised by how much quicker it was.

I have not enabled it everywhere, just in specific places where disk I/O is
being contended for, and CPU is in abundance.

-- 
bda
cyberpunk is dead.  long live cyberpunk.
http://bda.mirrorshades.net/
Hello Dave,

Tuesday, October 16, 2007, 9:17:30 PM, you wrote:

DJ> does anyone actually *use* compression ?  i'd like to see a poll on how many
DJ> people are using (or would use) compression on production systems that are
DJ> larger than your little department catch-all dumping ground server.
DJ> [...] otherwise wouldn't the system just plain get pounded?

LDAP servers with several dozen million accounts?
Why? First, you get about a 2:1 compression ratio with lzjb, and you also
get better performance.

-- 
Best regards,
 Robert Milkowski                          mailto:rmilkowski at task.gda.pl
                                           http://milek.blogspot.com
From: "Robert Milkowski" <rmilkowski at task.gda.pl>> LDAP servers with several dozen millions accounts? > Why? First you get about 2:1 compression ratio with lzjb, and you also > get better performance.a busy ldap server certainly seems a good fit for compression but when i said "large" i meant, as in bytes and numbers of files :) seriously, is anyone out there using zfs for large "storage" servers? you know, the same usage that 90% of the storage sold in the world is used for ? (yes, i pulled that figure out of my *ss ;) are my concerns invalid with the current implementation of zfs with compression ? is the compression so lightweight that it can be decompressed as fast as the disks can stream uncompressed backup data to tape while the server is still servicing clients ? the days of "nightly" backups seem long gone in the space I''ve been working in the last several years... backups run almost ''round the clock it seems on our biggest systems (15-30Tb and 150-300mil files , which may be small by the standard of others of you out there.) what really got my eyes rolling about c9n and prompted my question was all this talk about gzip compression and other even heavierweight compression algor''s. lzjb is relatively lightweight but i could still see it being a bottleneck in a ''weekly full backups'' scenario unless you had a very new system with kilowatts of cpu to spare. gzip ? pulease. bzip and lzma someone has *got* to be joking ? i see these as ideal candiates for AVS scenarios where the aplication never requires full dumps to tape, but on a typical storage server ? the compression would be ideal but would also make it impossible to backup in any reasonable "window". back to my postulation, if it is correct, what about some NDMP interface to ZFS ? it seems a more than natural candidate. in this scenario, compression would be a boon since the blocks would already be in a compressed state. I''d imagine this fitting into the ''zfs send'' codebase somewhere. thoughts (on either c9n and/or ''zfs send ndmp'') ? -=dave ----- Original Message ----- From: "Robert Milkowski" <rmilkowski at task.gda.pl> To: "Dave Johnson" <dj4904 at hotmail.com> Cc: "roland" <devzero at web.de>; <zfs-discuss at opensolaris.org> Sent: Wednesday, October 17, 2007 2:35 AM Subject: Re[2]: [zfs-discuss] HAMMER> Hello Dave, > > Tuesday, October 16, 2007, 9:17:30 PM, you wrote: > > DJ> you mean c9n ? ;) > > DJ> does anyone actually *use* compression ? i''d like to see a poll on > how many > DJ> people are using (or would use) compression on production systems that > are > DJ> larger than your little department catch-all dumping ground server. i > mean, > DJ> unless you had some NDMP interface directly to ZFS, daily tape backups > for > DJ> any large system will likely be an excersize in futility unless the > systems > DJ> are largely just archive servers, at which point it''s probably smarter > to > DJ> perform backups less often, coinciding with the workflow of migrating > DJ> archive data to it. otherwise wouldn''t the system just plain get > pounded? > > LDAP servers with several dozen millions accounts? > Why? First you get about 2:1 compression ratio with lzjb, and you also > get better performance. > > > -- > Best regards, > Robert Milkowski mailto:rmilkowski at task.gda.pl > http://milek.blogspot.com > >
Dave Johnson wrote:> From: "Robert Milkowski" <rmilkowski at task.gda.pl> > >> LDAP servers with several dozen millions accounts? >> Why? First you get about 2:1 compression ratio with lzjb, and you also >> get better performance. >> > > a busy ldap server certainly seems a good fit for compression but when i > said "large" i meant, as in bytes and numbers of files :) > > seriously, is anyone out there using zfs for large "storage" servers? you > know, the same usage that 90% of the storage sold in the world is used for ? > (yes, i pulled that figure out of my *ss ;)We''re using ZFS compression on Netbackup Disk Cache Media Servers. I have 3 media servers with 42TB usable each, with compression enabled. I had to wait for Sol10 U4 to run compression because these are T2000''s and there was a problem that zfs was using only 1 compression thread per pool which made it too slow. But after U4, I have no problem handling bursts of nearly 2Gbit/s of backup streams in over the network while still spooling to a pair of 30MByte/s tape drives on each server. -Andy
We are using zfs compression across 5 zpools, about 45TB of data on iSCSI
storage.  I/O is very fast, with small fractional CPU usage (seat of the
pants metrics here, sorry).  We have one other large 10TB volume for
nearline Networker backups, and that one isn't compressed.  We already
compress these data on the backup client, and there wasn't any more
compression to be had on the zpool, so it isn't worth it there.

There's no doubt that heavier-weight compression would be a problem as you
say.  One thing that would be ultra cool on the backup pool would be to
have post-write compression.  After backups are done, the backup server
sits more or less idle.  It would be cool to do a compress-on-scrub
operation that could do some real high-level compression.  Then we could
zfs send | ssh remote | zfs receive to an off-site location with far less
network bandwidth, not to mention the remote storage could be really small.
Datadomain (www.datadomain.com) does block-level checksumming to save files
as linked lists of common blocks.  They get very high compression ratios
(in our tests about 6:1, but with more frequent full backups, more like
20:1).  Then off-site transfers go that much faster.

Jon

Dave Johnson wrote:
> back to my postulation, if it is correct, what about some NDMP interface to
> ZFS ?  it seems a more than natural candidate.  in this scenario,
> compression would be a boon since the blocks would already be in a
> compressed state.  I'd imagine this fitting into the 'zfs send' codebase
> somewhere.
>
> thoughts (on either c9n and/or 'zfs send ndmp') ?

-- 
- Jonathan Loran            -              IT Manager
- Space Sciences Laboratory, UC Berkeley
- (510) 643-5146            jloran at ssl.berkeley.edu
Jonathan Loran wrote:> > We are using zfs compression across 5 zpools, about 45TB of data on > iSCSI storage. I/O is very fast, with small fractional CPU usage (seat > of the pants metrics here, sorry). We have one other large 10TB volume > for nearline Networker backups, and that one isn''t compressed. We > already compress these data on the backup client, and there wasn''t any > more compression to be had on the zpool, so it isn''t worth it there.cool.> There''s no doubt that heavier weight compression would be a problem as > you say. One thing that would be ultra cool on the backup pool would be > to have post write compression. After backups are done, the backup > server sits more or less idle. It would be cool to do a compress on > scrub operation that cold do some real high level compression. Then we > could zfssend | ssh-remote | zfsreceive to an off site location with far > less less network bandwidth, not to mention the remote storage could be > really small. Datadomain (www.datadomain.com > <http://www.datadomain.com>) does block level checksumming to save files > as link lists of common blocks. They get very high compression ratios > (in our tests about 6/1, but with more frequent full backups, more like > 20/1). Then off site transfers go that much faster.Do not assume that a compressed file system will send compressed. IIRC, it does not. But since UNIX is a land of pipe dreams, you can always compress anyway :-) zfs send ... | compress | ssh ... | uncompress | zfs receive ... -- richard
Richard Elling wrote:> Jonathan Loran wrote:<snip>...> Do not assume that a compressed file system will send compressed. > IIRC, it > does not.Let''s say, if it were possible to detect the remote compression support, couldn''t we send it compressed? With higher compression rates, wouldn''t that be smart? The Internet is not the land of infinite bandwidth that we often think it is.> > But since UNIX is a land of pipe dreams, you can always compress > anyway :-) > zfs send ... | compress | ssh ... | uncompress | zfs receive ... >Gosh, how obvious is that, eh? Thanks Richard.> -- richardJon -- - _____/ _____/ / - Jonathan Loran - - - / / / IT Manager - - _____ / _____ / / Space Sciences Laboratory, UC Berkeley - / / / (510) 643-5146 jloran at ssl.berkeley.edu - ______/ ______/ ______/ AST:7731^29u18e3
Jonathan Loran wrote:> Richard Elling wrote: > >> Jonathan Loran wrote: >> > <snip>... > > >> Do not assume that a compressed file system will send compressed. >> IIRC, it >> does not. >> > Let''s say, if it were possible to detect the remote compression support, > couldn''t we send it compressed? With higher compression rates, wouldn''t > that be smart? The Internet is not the land of infinite bandwidth that > we often think it is. > > >> But since UNIX is a land of pipe dreams, you can always compress >> anyway :-) >> zfs send ... | compress | ssh ... | uncompress | zfs receive ... >> >> > Gosh, how obvious is that, eh? Thanks Richard. > > >> -- richard >> > > Jon > >even better is the ability of ssh to compress the stream for you via ssh -C :)
Richard Elling wrote:> Do not assume that a compressed file system will send compressed. IIRC, it > does not. > > But since UNIX is a land of pipe dreams, you can always compress anyway :-) > zfs send ... | compress | ssh ... | uncompress | zfs receive ...zfs send | ssh -C | zfs recv -- Darren J Moffat
On 10/18/07, Darren J Moffat <darrenm at opensolaris.org> wrote:
> zfs send | ssh -C | zfs recv

I was going to suggest this, but I think (I could be wrong...) that ssh
would then use zlib for compression and that ssh is still a single-threaded
process.  This has two effects:

1) gzip compression instead of compress - may or may not be right for
   the application
2) encryption + compression happen in the same thread.  While this may be
   fine for systems that can do both at wire or file system speed, it is
   not ideal if transfer rates are already constrained by CPU speed.

The Niagara 2 CPU likely changes the importance of 2 a bit.

-- 
Mike Gerdts
http://mgerdts.blogspot.com/
Mike Gerdts wrote:> On 10/18/07, Darren J Moffat <darrenm at opensolaris.org> wrote: >> zfs send | ssh -C | zfs recv > > I was going to suggest this, but I think (I could be wrong...) that > ssh would then use zlib for compression and that ssh is still a > single-threaded process. This has two effects: > > 1) gzip compression instead of compress - may or may not be right for > the application > 2) encryption + compression happens in same thread. While this may be > fine for systems that can do both at wire or file system speed, it is > not ideal if transfer rates are already constrained by CPU speed. > > The Niagara 2 CPU likely changes the importance of 2 a bit.Unfortunately it doesn''t yet because ssh can''t yet use the N2 crypto - because it uses OpenSSL''s libcrypto without using the ENGINE API. -- Darren J Moffat
On 10/18/07, Darren J Moffat <darrenm at opensolaris.org> wrote:
> Unfortunately it doesn't yet, because ssh can't yet use the N2 crypto: it
> uses OpenSSL's libcrypto without using the ENGINE API.

Marketing needs to get in line with the technology.  The word I received
was that any application that linked against the included version of
OpenSSL automatically gets to take advantage of the N2 crypto engine, so
long as it is using one of the algorithms supported by the N2 engine.

-- 
Mike Gerdts
http://mgerdts.blogspot.com/
Mike Gerdts wrote:> On 10/18/07, Darren J Moffat <darrenm at opensolaris.org> wrote: >> Unfortunately it doesn''t yet because ssh can''t yet use the N2 crypto - >> because it uses OpenSSL''s libcrypto without using the ENGINE API. > > Marketing needs to get in line with the technology. The word I > received was that any application that linked against the included > version of OpenSSL automatically gets to take advantage of the N2 > crypto engine, so long as it is using one of the algorithms supported > by N2 engine.Which marketing documentation (not person) says that ? It isn''t actually false but it has a caveat that the application must be using the OpenSSL ENGINE API, which Apache mod_ssl does and it must use the EVP_ interfaces in OpenSSL''s libcrypto (not the lower level direct software algorithm ones). Remember marketing info his very high level, the devil as aways is in the code. -- Darren J Moffat
On 10/18/07, Darren J Moffat <darrenm at opensolaris.org> wrote:
> Which marketing documentation (not person) says that ?

It was a person giving a technology brief in the past 6 weeks or so.  It
kinda went like "so long as they link against the bundled openssl and not a
private copy of openssl they will automatically take advantage of the
offload engine."

> It isn't actually false, but it has a caveat: the application must be using
> the OpenSSL ENGINE API, which Apache mod_ssl does, and it must use the EVP_
> interfaces in OpenSSL's libcrypto (not the lower-level direct software
> algorithm ones).
>
> Remember, marketing info is very high level; the devil, as always, is in
> the code.

Yeah, I know.  It's often difficult to find the right code even when you
know what you are looking for.  When you don't know that you should be
fact-checking, the code rarely finds its way in front of you.

-- 
Mike Gerdts
http://mgerdts.blogspot.com/
On 10/16/07, Dave Johnson <dj4904 at hotmail.com> wrote:
> does anyone actually *use* compression ?  i'd like to see a poll on how many
> people are using (or would use) compression on production systems that are
> larger than your little department catch-all dumping ground server.

We don't use compression on our thumpers - they're mostly for image storage
where the original (eg. jpeg) is already compressed.  What will be
interesting is to look at the effect of compression on the attribute files
(largely text and xml) as we start to deploy zfs there as well.

> i mean, unless you had some NDMP interface directly to ZFS, daily tape
> backups for any large system will likely be an exercise in futility unless
> the systems are largely just archive servers, at which point it's probably
> smarter to perform backups less often, coinciding with the workflow of
> migrating archive data to it.  otherwise wouldn't the system just plain
> get pounded?

I'm not worried about the compression effect.  Where I see problems is
backing up millions/tens of millions of files in a single dataset.  Backing
up each file is essentially a random read (and this isn't helped by raidz,
which gives you a single disk's worth of random read I/O per vdev).  I
would love to see better ways of backing up huge numbers of files.

-- 
-Peter Tribble
http://www.petertribble.co.uk/ - http://ptribble.blogspot.com/
Peter Tribble wrote:> I''m not worried about the compression effect. Where I see problems is > backing up million/tens of millions of files in a single > dataset. Backing up > each file is essentially a random read (and this isn''t helped by raidz > which gives you a single disks worth of random read I/O per vdev). I > would love to see better ways of backing up huge numbers of files.It''s worth correcting this point... the RAIDZ behavior you mention only occurs if the read size is not aligned to the dataset''s block size. The checksum verifier must read the entire stripe to validate the data, but it does that in parallel across the stripe''s vdevs. The whole block is then available for delivery to the application. Although, backing up millions/tens of millions of files in a single backup dataset is a bad idea anyway. The metadata searches will kill you, no matter what backend filesystem is supporting it. "zfs send" is the faster way of backing up huge numbers of files. But you pay the price in restore time. (But that''s the normal tradeoff) --Joe
again i say (eventually) some "zfs send ndmp" type of mechanism seems the
right way to go here  *shrug*

-=dave

> Date: Mon, 5 Nov 2007 05:54:15 -0800
> From: jmoore at ugs.com
> To: zfs-discuss at opensolaris.org
> Subject: Re: [zfs-discuss] HAMMER
>
> "zfs send" is the faster way of backing up huge numbers of files.  But
> you pay the price in restore time.  (But that's the normal tradeoff.)