Hello list,

After reading a lot of documentation, I am confused. What exactly is the advantage of sparse files over "normal" files of fixed length?

At first I thought a sparse file was something like an automatically growing file. But if I take a 2GB partition and add two sparse files of 1GB each, I can't add an additional one because the disk is full?

So what about this mystic advantage? Is it only the faster creation of the file with dd, because it is not completely filled? That's all?

Please enlighten me ;)

Kind regards,
Florian
Rustedt, Florian wrote:
> What exactly is the advantage of sparse files over "normal" files of
> fixed length?

There are both advantages and disadvantages.

> At first I thought a sparse file was something like an automatically
> growing file. But if I take a 2GB partition and add two sparse files of
> 1GB each, I can't add an additional one because the disk is full?

No, that's not it.

> So what about this mystic advantage? Is it only the faster creation of
> the file with dd, because it is not completely filled? That's all?

If you create yourself a nice big sparse file like this

    dd bs=1M seek=10240 count=0 if=/dev/zero of=huge

and then look at what you've got with "ls -lh", you'll see you have a 10GB file that was created almost instantly. On the other hand, "ls -sh" will show that the file is actually occupying no space at all (well, almost no space). You can make this file bigger like this:

    dd bs=1M seek=20480 count=0 if=/dev/zero of=huge

and this will make it 20GB while still not occupying much space.

I suspect you already know this, but if you didn't, you do now :-)

The advantage of this 20GB file is precisely that it occupies next to no space on the disk that holds it. I can start writing data into it (that is, use it as a guest's disk) and the blocks needed will be allocated as they are used. In fact, I could have a 200GB guest disk image even though the disk I have at the moment is only 120GB and I'm already using quite a lot of it -- it would only be a problem if the guest actually wanted to use all that space.

There are some problems with sparse files: they compress beautifully (gzip reports 99.9%), but it takes a while to read the empty space, and when you uncompress the file you discover that it now actually occupies disk space: there's no good way to distinguish between an unallocated block and a block full of zeroes. This also means that you need to be careful how you back these files up: you need something a little cleverer than gzip.

Another problem with sparse files, especially when using them as domU disks, is that blocks that are contiguous in the file are not contiguous on the disk. That means that if, in the guest, you just "dd if=/dev/xvda of=/dev/null", domU will be seeking back and forth all over the place to return the blocks in the order they're being asked for. You don't need Xen to see this -- when I downloaded the DVD image of Fedora 10 using Transmission (a BitTorrent client), a checksum on the resulting file only managed to read it at about 4MB/s. On the other hand, when I copied the file, the checksum on the copy ran at closer to 100MB/s -- BitTorrent clients like Transmission really ought to pre-allocate the disk space so that you get something contiguous, and so that you don't embarrassingly run out of space half way through.

In a nutshell, though:

pros: over-committed disk space
cons: performance

jch
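To see both of the numbers jch mentions side by side, a quick check along these lines should work on any Linux box with GNU coreutils (the exact allocated figures will vary a little with the filesystem):

    dd bs=1M seek=10240 count=0 if=/dev/zero of=huge
    ls -lh huge                    # apparent size: about 10G, created instantly
    ls -sh huge                    # allocated size: essentially nothing
    du -h --apparent-size huge     # matches ls -lh; plain "du -h" matches ls -sh

    # Writing into the file allocates only the blocks actually touched, which
    # is where the over-commit comes from (zeroes written explicitly do get
    # allocated -- the filesystem does not detect them):
    dd if=/dev/zero of=huge bs=1M count=1 seek=5120 conv=notrunc
    ls -sh huge                    # roughly 1M allocated now; apparent size still 10G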
Thank you!

That was exactly what I needed ;) I must have done something wrong in my tests, because I couldn't "overbook" my partition the way these examples do, and that's exactly what I hoped was possible with sparse files! Now it works.

Does the chosen block size have an impact on the formatting? That is, do I need to use a smaller block size if I want the filesystem I choose to use the space more efficiently? Or is it only used for the size calculation, so that "dd bs=512K seek=2048" results in exactly the same filesystem layout after formatting as "dd bs=1M seek=1024" would? So in both cases, can I use "mkfs.ext2 -b 512 huge", and will the resulting file mount equally in both cases with a 256-byte block size?

Kind regards,
Florian
The over-committed disk space can also be a double-edged sword. While it allows you to commit disk space you don't actually have, if you're creating a lot of domUs with a lot of sparse disk files, you can easily lose track of how much data you actually have on the filesystem versus how much disk space you've allocated. So if you have a 10GB filesystem and you create 4 x 5GB sparse files, it doesn't take much use of each of those sparse files to use up that 10GB without realizing what you've done.

It is a pro, but you have to make sure that you're somehow keeping track of how much disk space you're actually using. If you pre-allocate the disk files, you'll know fairly easily from the df command when your filesystem is close to capacity or completely out of space.

-Nick
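One lightweight way to keep that in view is to compare apparent and allocated sizes across the image directory from time to time. The path and the .img naming below are only an example; adjust them to wherever your domU images actually live:

    ls -lh /var/lib/xen/images/*.img   # apparent sizes: what the guests have been promised
    du -sh /var/lib/xen/images         # space the images really consume on the host
    df -h  /var/lib/xen/images         # how close the host filesystem is to filling up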
Rustedt, Florian wrote:
> Thank you!
>
> That was exactly what I needed ;)

My pleasure.

> Does the chosen block size have an impact on the formatting? That is, do I
> need to use a smaller block size if I want the filesystem I choose to use
> the space more efficiently? Or is it only used for the size calculation, so
> that "dd bs=512K seek=2048" results in exactly the same filesystem layout
> after formatting as "dd bs=1M seek=1024" would? So in both cases, can I use
> "mkfs.ext2 -b 512 huge", and will the resulting file mount equally in both
> cases with a 256-byte block size?

Files don't have block sizes :-) dd just multiplies the block size by the seek count to find the starting offset. In fact, as the count is zero (write nothing at all to the file), "dd bs=1M seek=20480 count=0" turns into this:

    open("/dev/zero", O_RDONLY)             = 3
    dup2(3, 0)                              = 0
    close(3)                                = 0
    lseek(0, 0, SEEK_CUR)                   = 0
    open("huge", O_RDWR|O_CREAT, 0666)      = 3
    dup2(3, 1)                              = 1
    close(3)                                = 0
    ftruncate(1, 21474836480)               = 0

The important call is the last one.

jch
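For what it's worth, the same sparse file can be made without dd at all. Assuming a reasonably recent GNU coreutils (truncate(1) is a fairly new addition), the following makes that ftruncate() call directly, and stat shows the apparent size next to the blocks actually allocated:

    truncate -s 20G huge      # issues ftruncate(fd, 21474836480) under the hood
    stat -c '%s bytes apparent, %b blocks of %B bytes allocated' huge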
Oh, sh.t ;)

Well, it shouldn't have taken much to see that, given that the real size written is zero! I hope I'll think a little harder before asking next time. Thanks again for the time you spent answering.

Kind regards,
Florian
> There are some problems with sparse files: they compress beautifully (gzip
> reports 99.9%), but it takes a while to read the empty space, and when you
> uncompress the file you discover that it now actually occupies disk space:
> there's no good way to distinguish between an unallocated block and a block
> full of zeroes. This also means that you need to be careful how you back
> these files up: you need something a little cleverer than gzip.

You can use tar with the -S (sparse) option here. You should also be able to use cp with its --sparse option. To distinguish the apparent size from the allocated size, the du command helps.

Cheers,
Todd

--
Todd Deshane
http://todddeshane.net
http://runningxen.com
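For completeness, the sparse-aware variants Todd mentions look roughly like this; --sparse for tar and cp and --apparent-size for du are GNU options, so the exact spelling may differ on other systems:

    tar -cSzf huge.tar.gz huge         # -S / --sparse: store holes as holes, not as zeroes
    cp --sparse=always huge huge.copy  # re-create holes for any runs of zero bytes
    du -h huge                         # blocks actually allocated
    du -h --apparent-size huge         # the size ls -l reports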