I am confused by the numerical value of compressratio. I copied a compressed
ZFS filesystem that is 38.5G in size (zfs list USED and REFER value) and
reports a compressratio value of "2.52x" to an uncompressed ZFS filesystem,
and it expanded to 198G. So why is the compressratio 2.52 rather than
198/38.5 = 5.14?

As an artificial test, I created a filesystem with compression enabled and
ran "mkfile 1g", and the reported compressratio for that filesystem is 1.00x
even though this 1GB file uses only 1kB on disk.

Note, this was done with ZFS version 4 on S10U4.

I would appreciate any help in understanding what compressratio means.

Thanks.

--
Stuart Anderson  anderson at ligo.caltech.edu
http://www.ligo.caltech.edu/~anderson
Stuart Anderson wrote:
> As an artificial test, I created a filesystem with compression enabled
> and ran "mkfile 1g", and the reported compressratio for that filesystem
> is 1.00x even though this 1GB file uses only 1kB on disk.

ZFS seems to treat files filled with zeroes as sparse files, regardless of
whether or not compression is enabled. Try "dd if=/dev/urandom of=1g.dat
bs=1024 count=1048576" to create a file that won't exhibit this behavior.
Creating this file is a lot slower than writing zeroes (mostly due to the
speed of the urandom device), but ZFS won't treat it like a sparse file,
and it won't compress very well either.

-Luke
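For anyone who wants to see both behaviors side by side, a minimal sketch
along these lines should do it (the pool and dataset names are placeholders,
not from this thread):

    # Hypothetical dataset; adjust pool/filesystem names and mount point.
    zfs create tank/ctest
    zfs set compression=on tank/ctest
    # Incompressible data: allocation and compressratio should stay near 1.00x.
    dd if=/dev/urandom of=/tank/ctest/random.dat bs=1024 count=1048576
    # Zero-filled data: with compression on, the all-zero blocks are not allocated.
    dd if=/dev/zero of=/tank/ctest/zero.dat bs=1024 count=1048576
    du -k /tank/ctest/random.dat /tank/ctest/zero.dat
    zfs get compressratio tank/ctest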
On Mon, Apr 14, 2008 at 09:59:48AM -0400, Luke Scharf wrote:
> ZFS seems to treat files filled with zeroes as sparse files, regardless
> of whether or not compression is enabled. Try "dd if=/dev/urandom
> of=1g.dat bs=1024 count=1048576" to create a file that won't exhibit
> this behavior. Creating this file is a lot slower than writing zeroes
> (mostly due to the speed of the urandom device), but ZFS won't treat it
> like a sparse file, and it won't compress very well either.

However, I am still trying to reconcile the compression ratio as reported
by compressratio with the ratio of file sizes to disk blocks used (whether
or not ZFS is creating sparse files).

Regarding sparse files, I recently found that the built-in heuristic for
auto-detecting and creating sparse files in the GNU cp program "works" on
ZFS filesystems. In particular, if you use GNU cp to copy a file from ZFS
and it has a string of null characters in it (whether or not it is stored
as a sparse file), the output file (regardless of the destination
filesystem type) will be a sparse file. I have not seen this behavior when
copying such files from other source filesystems.

Thanks.

--
Stuart Anderson  anderson at ligo.caltech.edu
http://www.ligo.caltech.edu/~anderson
Stuart Anderson wrote:
> However, I am still trying to reconcile the compression ratio as reported
> by compressratio with the ratio of file sizes to disk blocks used (whether
> or not ZFS is creating sparse files).

Can you describe the data you're storing a bit? Any big disk images?

-Luke
On Mon, Apr 14, 2008 at 05:22:03PM -0400, Luke Scharf wrote:
> Can you describe the data you're storing a bit? Any big disk images?

Understanding the "mkfile" case would be a start, but the initial filesystem
that started my confusion is one that has a number of ~50GByte mysql
database files as well as a number of application code files.

Here is another simple test to avoid any confusion/bugs related to NULL
character sequences being compressed to nothing versus being treated as
sparse files. In particular, a 2GByte file full of the output of /bin/yes:

> zfs create export-cit/compress
> cd /export/compress
> /bin/df -k .
Filesystem            kbytes       used      avail capacity  Mounted on
export-cit/compress   1704858624   55        1261199742  1%  /export/compress
> zfs get compression export-cit/compress
NAME                 PROPERTY     VALUE  SOURCE
export-cit/compress  compression  on     inherited from export-cit
> /bin/yes | head -1073741824 > yes.dat
> /bin/ls -ls yes.dat
185017 -rw-r--r--   1 root  root  2147483648 Apr 14 15:31 yes.dat
> /bin/df -k .
Filesystem            kbytes       used      avail capacity  Mounted on
export-cit/compress   1704858624   92563     1261107232  1%  /export/compress
> zfs get compressratio export-cit/compress
NAME                 PROPERTY       VALUE   SOURCE
export-cit/compress  compressratio  28.39x  -

So compressratio reports 28.39, but the ratio of file size to used disk for
the only regular file on this filesystem, i.e., excluding the initial 55kB
allocated for the "empty" filesystem, is:

    2147483648 / (185017 * 512) = 22.67

Calculated another way from "zfs list" for the entire filesystem:

> zfs list /export/compress
NAME                  USED  AVAIL  REFER  MOUNTPOINT
export-cit/compress  90.4M  1.17T  90.4M  /export/compress

is 2GB/90.4M = 2048 / 90.4 = 22.65.

That still leaves me puzzled as to what the precise definition of
compressratio is.

Thanks.

--
Stuart Anderson  anderson at ligo.caltech.edu
http://www.ligo.caltech.edu/~anderson
This may be my ignorance, but I thought all modern unix filesystems created
sparse files in this way?

-----Original Message-----
From: Stuart Anderson <anderson at ligo.caltech.edu>
Date: Mon, 14 Apr 2008 15:45:03
To: Luke Scharf <luke.scharf at clusterbee.net>
Cc: zfs-discuss at opensolaris.org
Subject: Re: [zfs-discuss] Confused by compressratio
You can fill up an ext3 filesystem with the following command:

    dd if=/dev/zero of=delme.dat

You can't really fill up a ZFS filesystem that way. I guess you could, but
I've never had the patience -- when several GB worth of zeroes takes 1kB
worth of data, it would take a very long time.

AFAIK, ext3 supports sparse files just like it should -- but it doesn't
dynamically figure out what to write based on the contents of the file.

-Luke

Jeremy F. wrote:
> This may be my ignorance, but I thought all modern unix filesystems
> created sparse files in this way?
On Tue, 15 Apr 2008, Luke Scharf wrote:
> AFAIK, ext3 supports sparse files just like it should -- but it doesn't
> dynamically figure out what to write based on the contents of the file.

Since zfs inspects all data anyway in order to compute the block checksum,
it can easily know if a block is all zeros. For ext3, inspecting all blocks
for zeros would be viewed as unnecessary overhead.

Bob
======================================
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/
>> zfs list /export/compress
>> NAME                  USED  AVAIL  REFER  MOUNTPOINT
>> export-cit/compress  90.4M  1.17T  90.4M  /export/compress
>>
>> is 2GB/90.4M = 2048 / 90.4 = 22.65.
>>
>> That still leaves me puzzled as to what the precise definition of
>> compressratio is.

My guess is that the compressratio doesn't include any of those runs of
null characters that weren't actually written to the disk. What I'm
thinking is that if you have a disk image (of a new computer) in there,
the 4GB worth of actual data is counted against the compressratio, but
the 36GB worth of empty (zeroed) space isn't counted.

But I don't have hard numbers, or a good way to prove it. Not without
reading all of the OP's data, anyway... :-)

-Luke

P.S. This "don't bother writing zeroes" behavior is wonderful when working
with Xen disk images. I'm a fan!
On Tue, Apr 15, 2008 at 01:37:43PM -0400, Luke Scharf wrote:
> My guess is that the compressratio doesn't include any of those runs of
> null characters that weren't actually written to the disk.

This test was done with a file created via "/bin/yes | head", i.e., it does
not contain any null characters, precisely to rule out this possibility.

--
Stuart Anderson  anderson at ligo.caltech.edu
http://www.ligo.caltech.edu/~anderson
UTSL. compressratio is the ratio of uncompressed bytes to compressed bytes.
http://cvs.opensolaris.org/source/search?q=ZFS_PROP_COMPRESSRATIO&defs=&refs=&path=zfs&hist=&project=%2Fonnv

IMHO, you will (almost) never get the same number looking at bytes as you
get from counting blocks.
 -- richard

Stuart Anderson wrote:
> So compressratio reports 28.39, but the ratio of file size to used disk
> for the only regular file on this filesystem, i.e., excluding the initial
> 55kB allocated for the "empty" filesystem, is:
>
>     2147483648 / (185017 * 512) = 22.67
>
> Calculated another way from "zfs list" for the entire filesystem,
> 2GB/90.4M = 2048 / 90.4 = 22.65.
>
> That still leaves me puzzled as to what the precise definition of
> compressratio is.
On Tue, Apr 15, 2008 at 03:51:17PM -0700, Richard Elling wrote:
> UTSL. compressratio is the ratio of uncompressed bytes to compressed bytes.
> http://cvs.opensolaris.org/source/search?q=ZFS_PROP_COMPRESSRATIO&defs=&refs=&path=zfs&hist=&project=%2Fonnv
>
> IMHO, you will (almost) never get the same number looking at bytes as you
> get from counting blocks.

If I can't use /bin/ls to get an accurate measure of the number of
compressed blocks used (-s) and the original number of uncompressed bytes
(-l), what is a more accurate way to measure these?

As a gedanken experiment, what command(s) can I run to examine a compressed
ZFS filesystem and determine how much space it will require to replicate to
an uncompressed ZFS filesystem? I can add up the file sizes, e.g.,

    /bin/ls -lR | grep ^- | nawk '{SUM+=$5}END{print SUM}'

but I would have thought there was a more efficient way using the already
aggregated filesystem metadata via "/bin/df" or "zfs list" and the
compressratio.

Thanks.

--
Stuart Anderson  anderson at ligo.caltech.edu
http://www.ligo.caltech.edu/~anderson
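If GNU coreutils happens to be available (an assumption; the stock Solaris
du has no such option), the apparent sizes can also be summed without the
ls/nawk pipeline:

    # GNU du only (often installed as gdu on Solaris); not the native /usr/bin/du.
    # Logical (apparent) kilobytes, independent of compression and holes:
    du --apparent-size -sk /export/compress
    # Allocated blocks on disk, for comparison:
    du -sk /export/compress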
Hello Luke,

Tuesday, April 15, 2008, 4:50:17 PM, you wrote:

LS> You can fill up an ext3 filesystem with the following command:
LS>     dd if=/dev/zero of=delme.dat
LS> You can't really fill up a ZFS filesystem that way. I guess you could,
LS> but I've never had the patience -- when several GB worth of zeroes
LS> takes 1kB worth of data, it would take a very long time.

Unless something changed recently, without compression ZFS will actually
write the zero blocks.

--
Best regards,
Robert Milkowski                       mailto:milek at task.gda.pl
                                       http://milek.blogspot.com
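One way to check which behavior a given release actually has is to compare
a zero-filled file on a compressed and an uncompressed dataset. A sketch
with hypothetical dataset names (the result may well differ between ZFS
versions, which is the point in question):

    # Hypothetical datasets; adjust pool/filesystem names.
    zfs create tank/nocomp
    zfs set compression=off tank/nocomp
    zfs create tank/comp
    zfs set compression=on tank/comp
    # 1GB of zeroes written through the normal write path
    # (no seeks, so no application-created holes).
    dd if=/dev/zero of=/tank/nocomp/zero.dat bs=128k count=8192
    dd if=/dev/zero of=/tank/comp/zero.dat bs=128k count=8192
    zfs list -o name,used,referenced,compressratio tank/nocomp tank/comp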
Stuart Anderson wrote:
> If I can't use /bin/ls to get an accurate measure of the number of
> compressed blocks used (-s) and the original number of uncompressed
> bytes (-l), what is a more accurate way to measure these?

ls -s should give the proper number of blocks used.
ls -l should give the proper file length.
Do not assume that compressed data in a block consumes the whole block.

> As a gedanken experiment, what command(s) can I run to examine a
> compressed ZFS filesystem and determine how much space it will require
> to replicate to an uncompressed ZFS filesystem? I can add up the file
> sizes, e.g.,
>     /bin/ls -lR | grep ^- | nawk '{SUM+=$5}END{print SUM}'
> but I would have thought there was a more efficient way using the already
> aggregated filesystem metadata via "/bin/df" or "zfs list" and the
> compressratio.

IMHO, this is a by-product of the dynamic nature of ZFS. Personally, I'd
estimate using du rather than ls.
 -- richard
On Wed, Apr 16, 2008 at 10:09:00AM -0700, Richard Elling wrote:
> ls -s should give the proper number of blocks used.
> ls -l should give the proper file length.
> Do not assume that compressed data in a block consumes the whole block.

Not even on a pristine ZFS filesystem where just one file has been created?

> IMHO, this is a by-product of the dynamic nature of ZFS.

Are you saying it can't be done except by adding up all the individual
file sizes?

> Personally, I'd estimate using du rather than ls.

They report the exact same number as far as I can tell, with the caveat
that Solaris ls -s returns the number of 512-byte blocks, whereas GNU
ls -s returns the number of 1024-byte blocks by default.

Thanks.

--
Stuart Anderson  anderson at ligo.caltech.edu
http://www.ligo.caltech.edu/~anderson
Stuart Anderson wrote:
> Not even on a pristine ZFS filesystem where just one file has been created?

In theory, yes. Blocks are compressed, not files.

> Are you saying it can't be done except by adding up all the individual
> file sizes?

I'm saying that adding up all of the individual file sizes, rounded up to
the smallest block size for the target file system, plus some estimate of
metadata space requirements, will be the most pessimistic estimate.
Metadata is also compressed and copied, by default.

> They report the exact same number as far as I can tell, with the caveat
> that Solaris ls -s returns the number of 512-byte blocks, whereas GNU
> ls -s returns the number of 1024-byte blocks by default.

That is file-system dependent. Some file systems have larger blocks and
ls -s shows the size in blocks. ZFS uses dynamic block sizes, but you knew
that already... :-)
 -- richard
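To actually see the per-block view described here, the zdb debugger can
dump a file's block pointers. A rough sketch (zdb is an unstable diagnostic
interface, so the exact options and output format vary by release, and the
object number below is simply whatever ls -i reports):

    # The inode number doubles as the ZFS object number.
    ls -i /export/compress/yes.dat
    # Dump that object's block tree; each block pointer reports a logical (L)
    # and physical (P) size, i.e. the uncompressed vs. compressed size of
    # that block.
    zdb -ddddd export-cit/compress <object-number>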
Stuart Anderson <anderson at ligo.caltech.edu> wrote:
> They report the exact same number as far as I can tell, with the caveat
> that Solaris ls -s returns the number of 512-byte blocks, whereas GNU
> ls -s returns the number of 1024-byte blocks by default.

IIRC, this may be controlled by environment variables.

Jörg

--
 EMail: joerg at schily.isdn.cs.tu-berlin.de (home)  Jörg Schilling  D-13353 Berlin
        js at cs.tu-berlin.de                 (uni)
        schilling at fokus.fraunhofer.de      (work)  Blog: http://schily.blogspot.com/
 URL:   http://cdrecord.berlios.de/old/private/ ftp://ftp.berlios.de/pub/schily
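For GNU ls specifically, the unit used by -s can indeed be steered from the
environment or the command line; for example (assuming GNU coreutils is the
ls on the path):

    # Each of these should make GNU ls -s count in 512-byte units,
    # matching the Solaris ls.
    POSIXLY_CORRECT=1 ls -s yes.dat
    BLOCK_SIZE=512 ls -s yes.dat
    ls -s --block-size=512 yes.dat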
On Wed, Apr 16, 2008 at 02:07:53PM -0700, Richard Elling wrote:
> That is file-system dependent. Some file systems have larger blocks and
> ls -s shows the size in blocks. ZFS uses dynamic block sizes, but you knew
> that already... :-)

OK, we are now clearly exposing my ignorance, so hopefully I can learn
something new about ZFS.

What is the distinction/relationship between recordsize (which as I
understand it is a fixed quantity for each ZFS dataset) and dynamic block
sizes? Are blocks what are allocated for metadata, and records what are
allocated for data, i.e., the contents of files?

What does it mean that blocks are compressed for a ZFS dataset with
"compression=off"? Is this equivalent to saying that ZFS metadata is
always compressed?

Is there any ZFS documentation that shows by example exactly how to
interpret the various numbers from ls, du, df, and zfs used/referenced/
available/compressratio in the context of compression={on,off}, possibly
also referring to both sparse and non-sparse files?

Thanks.

--
Stuart Anderson  anderson at ligo.caltech.edu
http://www.ligo.caltech.edu/~anderson
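For what it's worth, recordsize is a per-dataset, tunable upper bound on
the size of a file's data blocks; roughly speaking, a file smaller than one
record is stored in a single block about the size of the file, which is
where the "dynamic" block sizes come from. Viewing and changing the
property is straightforward (dataset name reused from the earlier test):

    zfs get recordsize export-cit/compress
    # Only affects blocks written after the change; existing files keep
    # their current block layout.
    zfs set recordsize=8K export-cit/compress
    zfs get recordsize export-cit/compress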
Stuart Anderson wrote:
> Is there any ZFS documentation that shows by example exactly how to
> interpret the various numbers from ls, du, df, and zfs used/referenced/
> available/compressratio in the context of compression={on,off}, possibly
> also referring to both sparse and non-sparse files?

ls, du, and df have fairly rigorous definitions in their respective man
pages and specifications, but only df has an estimate of remaining
available space. More detailed descriptions of space usage and the
properties which affect it are in the ZFS Admin Guide:
http://www.opensolaris.org/os/community/zfs/docs/zfsadmin.pdf

The elephant in the room is the question of how much physical space you
have free. That is not easy to predict, especially with compression. You
might sleep better if you can convince yourself that you'll never really
know what is down the road until you get there :-)
 -- richard
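As a practical starting point for the bookkeeping questions above, the
dataset-level counters can be pulled together in one command, and df/du
then show the same data rounded into allocated blocks (dataset and mount
point taken from the earlier test; these are all long-standing ZFS
properties):

    zfs get used,available,referenced,compressratio,compression export-cit/compress
    df -k /export/compress
    du -sk /export/compress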