Leonardo Francalanci
2007-Apr-17 10:37 UTC
[zfs-discuss] Update/append of compressed files
Hi,

regarding the ZFS compression method: what happens when a compressed file is updated/appended? Is it ALL uncompressed first, updated/appended, and then re-compressed? Or are only the affected blocks uncompressed and then recompressed?

And what happens exactly when a portion of a compressed file is to be read: is the whole file read and uncompressed, or only the requested portion?

Thank you in advance,
Leonardo
Leonardo Francalanci wrote:
> Hi,
>
> regarding the ZFS compression method: what happens when a compressed file is
> updated/appended? Is it ALL uncompressed first, updated/appended, and
> then re-compressed? Or are only the affected blocks uncompressed and
> then recompressed?

ZFS does NOT compress files. It compresses the individual blocks that are written to comprise the file.

What this means is that if you have a 10G file and you need to change one small part in the middle, you seek(2) to that point, read(2) the data, make your change to your in-memory copy, and write(2) it back out again. ZFS will update (and recompress) only those blocks that actually need to change.

--
Darren J Moffat
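To make that read-modify-write pattern concrete, here is a minimal userland sketch (C, using pread(2)/pwrite(2) in place of a separate seek(2) plus read(2)/write(2); the file path, offset, and buffer size are illustrative assumptions, and error handling is trimmed):

    /* Minimal sketch: patch a small region in the middle of a large
     * file. Nothing here is compression-aware; ZFS uncompresses and
     * recompresses only the blocks this range touches. */
    #define _FILE_OFFSET_BITS 64   /* 64-bit offsets on 32-bit builds */
    #include <fcntl.h>
    #include <string.h>
    #include <unistd.h>

    int main(void)
    {
        int fd = open("/tank/data/bigfile", O_RDWR);   /* path is made up */
        if (fd < 0)
            return 1;

        char buf[8192];
        off_t offset = 5LL * 1024 * 1024 * 1024;   /* middle of a 10G file */

        /* ZFS decompresses just the block(s) backing this range. */
        if (pread(fd, buf, sizeof(buf), offset) != (ssize_t)sizeof(buf)) {
            close(fd);
            return 1;
        }

        memset(buf, 0xff, 16);                     /* change one small part in memory */

        /* Copy-on-write: only this block gets recompressed and rewritten. */
        pwrite(fd, buf, sizeof(buf), offset);

        close(fd);
        return 0;
    }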
Hi folks

On Tuesday 17 April 2007 11:49, Darren J Moffat wrote:
> Leonardo Francalanci wrote:
>> Hi,
>>
>> regarding the ZFS compression method: what happens when a compressed file is
>> updated/appended? Is it ALL uncompressed first, updated/appended, and
>> then re-compressed? Or are only the affected blocks uncompressed and
>> then recompressed?
>
> ZFS does NOT compress files. It compresses the individual blocks that
> are written to comprise the file.
>
> What this means is that if you have a 10G file and you need to change one
> small part in the middle, you seek(2) to that point, read(2) the data,
> make your change to your in-memory copy, and write(2) it back out again.
> ZFS will update (and recompress) only those blocks that actually need to
> change.

How can this work? With compressed data, it's hard to predict the final size before compression. How is the seek(2) accurate? How can the seek know where the uncompressed block (say) 345 is after the data is compressed? It will be physically on a different actual block...?? Or I probably don't understand how compression with ZFS works... ;-)

Or - does it compress an 8K block into a smaller 6K or 4K block, but with the same block number, so the seek to a block will still work but less disk space is taken up?

Regards
Jerome
--
Jerome Haynes-Smith
Sun TSC Storage Backline
Jerome Haynes-Smith wrote:
>>> regarding the ZFS compression method: what happens when a compressed file is
>>> updated/appended? Is it ALL uncompressed first, updated/appended, and
>>> then re-compressed? Or are only the affected blocks uncompressed and
>>> then recompressed?
>>
>> ZFS does NOT compress files. It compresses the individual blocks that
>> are written to comprise the file.
>>
>> What this means is that if you have a 10G file and you need to change one
>> small part in the middle, you seek(2) to that point, read(2) the data,
>> make your change to your in-memory copy, and write(2) it back out again.
>> ZFS will update (and recompress) only those blocks that actually need to
>> change.
>
> How can this work? With compressed data, it's hard to predict the final size
> before compression.

Because you are NOT compressing the file, only compressing the blocks as they get written to disk.

> How is the seek(2) accurate? How can the seek know where
> the uncompressed block (say) 345 is after the data is compressed?

seek(2) doesn't know anything about compression; that is all hidden below in the implementation of ZFS. seek(2) works on file offsets, not block numbers on the raw storage device.

> It will be
> physically on a different actual block...?? Or I probably don't understand
> how compression with ZFS works... ;-)
>
> Or - does it compress an 8K block into a smaller 6K or 4K block, but with the
> same block number, so the seek to a block will still work but less disk
> space is taken up?

A bit more like that, yes, except that seek(2) doesn't (and shouldn't) know anything about block numbers, only file offsets.

--
Darren J Moffat
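One way to picture what is "hidden below": each ZFS block pointer records both a block's logical (uncompressed) size and its physical (compressed, allocated) size, along with where the block lives on disk. The sketch below is a deliberately simplified, hypothetical stand-in for the real blkptr_t (which also carries multiple DVAs, checksums, and more); it only shows why logical offsets never depend on physical placement:

    /* Hypothetical, heavily simplified stand-in for a ZFS block
     * pointer; field names are illustrative, not the real blkptr_t. */
    #include <stdint.h>
    #include <stdio.h>

    typedef struct {
        uint64_t dva;    /* device virtual address: where the stored block lives */
        uint32_t lsize;  /* logical size: what read(2) sees after decompression */
        uint32_t psize;  /* physical size actually allocated on disk */
    } toy_blkptr_t;

    int main(void)
    {
        /* A 128K logical block that happened to compress down to 32K. */
        toy_blkptr_t bp = { 200000, 128 * 1024, 32 * 1024 };

        /* Applications only ever address the lsize side; dva and psize
         * are ZFS's private business, so compressing a block never
         * moves any file offset. */
        printf("logical %u bytes stored as %u bytes at DVA %llu\n",
               bp.lsize, bp.psize, (unsigned long long)bp.dva);
        return 0;
    }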
>> How can this work? With compressed data, it's hard to predict its
>> final size before compression.
>
> Because you are NOT compressing the file, only compressing the blocks as
> they get written to disk.

I guess this implies that the compression can only save integral numbers of blocks.
Hello Dan,

Tuesday, April 17, 2007, 9:44:45 PM, you wrote:

>>> How can this work? With compressed data, it's hard to predict its
>>> final size before compression.
>>
>> Because you are NOT compressing the file, only compressing the blocks as
>> they get written to disk.

DM> I guess this implies that the compression can only save integral numbers of
DM> blocks.

Can you clarify please? I don't understand the above...

--
Best regards,
Robert    mailto:rmilkowski at task.gda.pl
          http://milek.blogspot.com
Robert Milkowski wrote:
> Hello Dan,
>
> Tuesday, April 17, 2007, 9:44:45 PM, you wrote:
>
>>>>> How can this work? With compressed data, it's hard to predict its
>>>>> final size before compression.
>>>> Because you are NOT compressing the file, only compressing the blocks as
>>>> they get written to disk.
>
> DM> I guess this implies that the compression can only save integral numbers of
> DM> blocks.
>
> Can you clarify please?
> I don't understand the above...

If compression is done block-wise, then if I compress a 512-byte block to 2 bytes, I still need a 512-byte block to store it.

Similarly, if I compress 1000 blocks to 999.001 blocks, I still need 1000 blocks to store them.

This is not a significant problem, I'm sure, but it's worth remembering. Many tiny files probably don't benefit from compression at all, rather than "only a little".
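Dan's point is simple sector-rounding arithmetic; a toy illustration follows (not ZFS source, which makes its own decisions about when storing the compressed form is worthwhile):

    /* Toy illustration, not ZFS code: after block-wise compression,
     * each block still occupies a whole number of 512-byte sectors. */
    #include <stdio.h>

    static unsigned long long alloc_size(unsigned long long compressed_bytes)
    {
        const unsigned long long sector = 512;
        if (compressed_bytes == 0)
            return 0;
        /* Round up to the next whole sector. */
        return ((compressed_bytes + sector - 1) / sector) * sector;
    }

    int main(void)
    {
        printf("%llu\n", alloc_size(2));     /* 512: a 2-byte result still costs a sector */
        printf("%llu\n", alloc_size(513));   /* 1024: one byte over costs an extra sector */
        printf("%llu\n", alloc_size(8192));  /* 8192: incompressible data saves nothing */
        return 0;
    }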
Hello Dan,

Tuesday, April 17, 2007, 10:59:53 PM, you wrote:

DM> Robert Milkowski wrote:
>> Hello Dan,
>>
>> Tuesday, April 17, 2007, 9:44:45 PM, you wrote:
>>
>>>>> How can this work? With compressed data, it's hard to predict its
>>>>> final size before compression.
>>>> Because you are NOT compressing the file, only compressing the blocks as
>>>> they get written to disk.
>>
>> DM> I guess this implies that the compression can only save integral numbers of
>> DM> blocks.
>>
>> Can you clarify please?
>> I don't understand the above...

DM> If compression is done block-wise, then if I compress a 512-byte block to 2
DM> bytes, I still need a 512-byte block to store it.

DM> Similarly, if I compress 1000 blocks to 999.001 blocks, I still need 1000
DM> blocks to store them.

DM> This is not a significant problem, I'm sure, but it's worth remembering.
DM> Many tiny files probably don't benefit from compression at all, rather than
DM> "only a little".

Yep, that's true, as the smallest block in ZFS is 512 bytes. But there's one exception - if you're creating small files (and also large ones) filled with zeros, then you will gain storage even if each file is less than 512B, as no data block is allocated at all :)

--
Best regards,
Robert    mailto:rmilkowski at task.gda.pl
          http://milek.blogspot.com
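Robert's zero-block exception can be observed from userland by comparing a file's logical size to its allocated blocks. A hedged sketch (the dataset path is an assumption, compression must be enabled on it, and block accounting may lag until the current transaction group syncs):

    /* Hedged sketch: a file full of zeros on a compression-enabled
     * ZFS dataset consumes (almost) no data blocks, because all-zero
     * blocks compress away entirely. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/stat.h>
    #include <unistd.h>

    int main(void)
    {
        char zeros[8192];
        memset(zeros, 0, sizeof(zeros));

        int fd = open("/tank/fs/zeros.dat",           /* path is made up */
                      O_CREAT | O_WRONLY | O_TRUNC, 0644);
        if (fd < 0)
            return 1;
        if (write(fd, zeros, sizeof(zeros)) < 0)
            return 1;
        fsync(fd);

        struct stat st;
        fstat(fd, &st);
        printf("logical size: %lld bytes\n", (long long)st.st_size);
        printf("allocated:    %lld 512-byte blocks\n", (long long)st.st_blocks);
        close(fd);
        return 0;
    }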
Remember that ZFS is a copy-on-write file system. ZFS, much like UFS, uses indirect blocks to point to file contents. However, unlike UFS (which supports only 8K and 1K blocks, and 1K blocks only at the end of a file), the underlying stored data blocks can be of different sizes.

An uncompressed file might look conceptually like this:

    File offset 0    => Disk block 100000, 256 blocks
    File offset 128K => Disk block 100256, 256 blocks
    File offset 256K => Disk block 100512, 256 blocks

If compression were enabled and the "middle" block of that file were rewritten, it might look like this:

    File offset 0    => Disk block 100000, 256 blocks
    File offset 128K => Disk block 200000,  64 blocks
    File offset 256K => Disk block 100512, 256 blocks

Seeking to an offset in the file is the same in both cases: examine the index to the file (direct & indirect blocks) until you get to the right file offset, then retrieve the address of the stored block on disk. Then you can read the data; in the compressed case, after reading the data, you uncompress it into the user's buffer.

Writing to the file is easy because you allocate new space each time, so it doesn't matter if the compressed size grows or shrinks from the original block.
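A minimal sketch of that lookup, mirroring the example mapping above (the type and names are hypothetical; the flat array stands in for the real tree of direct and indirect blocks):

    /* Minimal sketch of the offset lookup described above: one
     * pointer per 128K logical block, flat array instead of the real
     * indirect-block tree. Numbers mirror the compressed example. */
    #include <stdint.h>
    #include <stdio.h>

    typedef struct {
        uint64_t disk_block;  /* where the stored (maybe compressed) data lives */
        uint32_t nsectors;    /* 512-byte sectors actually allocated */
    } toy_ptr_t;

    #define LBLK (128 * 1024)  /* 128K logical block size */

    int main(void)
    {
        toy_ptr_t index[3] = {
            { 100000, 256 },
            { 200000,  64 },   /* middle block: compressed and relocated by COW */
            { 100512, 256 },
        };

        uint64_t file_offset = 130 * 1024;           /* seek target: 130K */
        toy_ptr_t *bp = &index[file_offset / LBLK];  /* purely logical arithmetic */

        printf("offset %llu -> disk block %llu (%u sectors allocated)\n",
               (unsigned long long)file_offset,
               (unsigned long long)bp->disk_block, bp->nsectors);
        return 0;
    }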
Dan Mick writes:
 > Robert Milkowski wrote:
 > > Hello Dan,
 > >
 > > Tuesday, April 17, 2007, 9:44:45 PM, you wrote:
 > >
 > >>>> How can this work? With compressed data, it's hard to predict its
 > >>>> final size before compression.
 > >>> Because you are NOT compressing the file, only compressing the blocks as
 > >>> they get written to disk.
 > >
 > > DM> I guess this implies that the compression can only save integral numbers of
 > > DM> blocks.
 > >
 > > Can you clarify please?
 > > I don't understand the above...
 >
 > If compression is done block-wise, then if I compress a 512-byte block to 2
 > bytes, I still need a 512-byte block to store it.
 >
 > Similarly, if I compress 1000 blocks to 999.001 blocks, I still need 1000
 > blocks to store them.
 >
 > This is not a significant problem, I'm sure, but it's worth remembering.
 > Many tiny files probably don't benefit from compression at all, rather than
 > "only a little".

I guess this is true for sub-sector (512-byte) file sizes. But for anything else, we could see gains.

-r