Hi. While doing a scan of disk usage, I noticed the following oddity.
I have a directory of files (named file.dat for this example) that all
appear as ~1.5GB when using 'ls -l', but that (correctly) appear as
~250KB files when using 'ls -s' or du commands:

edmudama$ ls -l file.dat
-rwxrwx---+ 1 remlab staff 1447317088 Jun  5  2010 file.dat
edmudama$ /usr/bin/ls -l file.dat
-rwxrwx---+ 1 remlab staff 1447317088 Jun  5  2010 file.dat
edmudama$ ls -ls file.dat
521 -rwxrwx---+ 1 remlab staff 1447317088 Jun  5  2010 file.dat
edmudama$ du -sh file.dat
260K file.dat
edmudama$ /usr/bin/du -s file.dat
521 file.dat
edmudama$ /usr/bin/ls -s file.dat
521 file.dat

I am running oi_148, though the files were likely created back when we
were using an older OpenSolaris (2008.11) on this same machine. The
results with both GNU ls and Solaris ls are identical.

Dedup is not enabled on any pool, nor has it ever been enabled. The
filesystem is ZFS version 4; the pool is ZFS pool version 28. A scrub
of the pool is consistent and shows no errors, and the sizing reported
in 'zpool list' would appear to match the du block counts from what I
can tell.

edmudama$ zpool list
NAME    SIZE  ALLOC   FREE   CAP  DEDUP  HEALTH  ALTROOT
rpool  29.8G  22.0G  7.79G   73%  1.00x  ONLINE  -
tank   1.81T   879G   977G   47%  1.00x  ONLINE  -

Is something broken? Any idea why I am seeing the wrong sizes in ls?

--eric

--
Eric D. Mudama
edmudama at bounceswoosh.org
On Mon, 2 May 2011, Eric D. Mudama wrote:
> Hi. While doing a scan of disk usage, I noticed the following oddity.
> I have a directory of files (named file.dat for this example) that all
> appear as ~1.5GB when using 'ls -l', but that (correctly) appear as ~250KB
> files when using 'ls -s' or du commands:

These are probably just sparse files. Nothing to be alarmed about.

Bob
--
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/
> > Hi. While doing a scan of disk usage, I noticed the following oddity.
> > I have a directory of files (named file.dat for this example) that all
> > appear as ~1.5GB when using 'ls -l', but that (correctly) appear as ~250KB
> > files when using 'ls -s' or du commands:
>
> These are probably just sparse files. Nothing to be alarmed about.

or very tightly compressed files (like those from a dd if=/dev/zero).
du and ls -ls will show how much is stored; ls -l will show the
theoretical file size.

Vennlige hilsener / Best regards

roy
--
Roy Sigurd Karlsbakk
(+47) 97542685
roy at karlsbakk.net
http://blogg.karlsbakk.net/
--
In all pedagogy it is essential that the curriculum be presented
intelligibly. It is an elementary imperative for all pedagogues to
avoid excessive use of idioms of foreign origin. In most cases,
adequate and relevant synonyms exist in Norwegian.
On Mon, May 2 at 14:01, Bob Friesenhahn wrote:
>On Mon, 2 May 2011, Eric D. Mudama wrote:
>
>> Hi. While doing a scan of disk usage, I noticed the following oddity.
>> I have a directory of files (named file.dat for this example) that all
>> appear as ~1.5GB when using 'ls -l', but that (correctly) appear as ~250KB
>> files when using 'ls -s' or du commands:
>
>These are probably just sparse files. Nothing to be alarmed about.

They were created via CIFS. I thought sparse files were an iSCSI
concept, no?

--eric

--
Eric D. Mudama
edmudama at bounceswoosh.org
On 05/ 2/11 08:41 PM, Eric D. Mudama wrote:
> On Mon, May 2 at 14:01, Bob Friesenhahn wrote:
>> On Mon, 2 May 2011, Eric D. Mudama wrote:
>>
>>> Hi. While doing a scan of disk usage, I noticed the following oddity.
>>> I have a directory of files (named file.dat for this example) that all
>>> appear as ~1.5GB when using 'ls -l', but that (correctly) appear as
>>> ~250KB
>>> files when using 'ls -s' or du commands:
>>
>> These are probably just sparse files. Nothing to be alarmed about.
>
> They were created via CIFS. I thought sparse files were an iSCSI
> concept, no?

iSCSI is a block-level protocol. Sparse files are a filesystem-level
concept that is understood by many filesystems, including CIFS and ZFS
and many others.

--
Darren J Moffat
On 2011-May-02 19:53 UTC, Casper.Dik at oracle.com wrote in
[zfs-discuss] "ls reports incorrect file size":
>On Mon, May 2 at 14:01, Bob Friesenhahn wrote:
>>On Mon, 2 May 2011, Eric D. Mudama wrote:
>>
>>>Hi. While doing a scan of disk usage, I noticed the following oddity.
>>>I have a directory of files (named file.dat for this example) that all
>>>appear as ~1.5GB when using 'ls -l', but that (correctly) appear as ~250KB
>>>files when using 'ls -s' or du commands:
>>
>>These are probably just sparse files. Nothing to be alarmed about.
>
>They were created via CIFS. I thought sparse files were an iSCSI
>concept, no?

"Sparse files" are a concept of the underlying filesystem. E.g., if
you lseek() past the end of the file and then write, your filesystem
may not need to allocate the intervening empty blocks. Most Unix
filesystems allow sparse files; FAT/FAT32 filesystems do not.

Casper
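The lseek()-past-EOF-then-write sequence Casper describes can be
sketched in a few lines; here it is from Python, whose os.lseek wraps
the same syscall. The path is made up for the example, and whether the
result actually occupies few blocks depends on the filesystem
supporting holes:

```python
import os
import tempfile

# Create a file whose logical size is ~1 MiB but which contains only a
# single written byte: seek past EOF, then write.  On a hole-aware
# filesystem (ZFS, ext4, ...) the skipped range is never allocated; on
# one without holes, the OS must fill it with real zeros.
path = os.path.join(tempfile.mkdtemp(), "file.dat")  # hypothetical name
fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o644)
os.lseek(fd, 1024 * 1024, os.SEEK_SET)  # seek 1 MiB past the start
os.write(fd, b"x")                      # this write defines the new EOF
os.close(fd)

st = os.stat(path)
print("ls -l style size:", st.st_size)            # logical length
print("du style size:   ", st.st_blocks * 512)    # bytes actually allocated
```

The logical size is 1048577 bytes regardless of filesystem; only the
allocated-block count differs between hole-aware and dense storage.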
Also, sparseness need not be apparent to applications. Until recent
improvements to lseek(2) to expose hole/non-hole offsets, the only way
to know about sparseness was to notice that a file's reported size is
more than the file's reported filesystem blocks times the block size.
Sparse files in Unix go back at least to the early 80s.

If a filesystem protocol, such as CIFS (I've no idea if it supports
sparse files), were not to support sparse files, all that would mean
is that the server must report a number of blocks that matches the
file's size (assuming the protocol in question even supports any
notion of reporting a file's size in blocks).

There are really two ways in which a filesystem protocol could support
sparse files: a) by reporting file size in bytes and blocks; b) by
reporting lists of file offsets demarcating holes from non-holes. (b)
is a very new idea; Lustre may be the only filesystem I know of that
supports this (see the Linux FIEMAP APIs), though work is in progress
to add it to NFSv4.

Nico
--
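The pre-SEEK_HOLE detection Nico describes, comparing a file's
reported size against its reported blocks, looks like this in Python.
Note that st_blocks is counted in 512-byte units per POSIX, regardless
of the filesystem's actual block size; the filename is hypothetical:

```python
import os

def looks_sparse(path):
    """Pre-SEEK_HOLE heuristic: the file is probably sparse (or, on
    filesystems like ZFS, compressed) if fewer bytes are allocated on
    disk than its logical length would require."""
    st = os.stat(path)
    return st.st_blocks * 512 < st.st_size

# A fully written file should not look sparse.  The data is random so
# transparent compression can't shrink it, and we fsync so the blocks
# are really allocated before we stat.
with open("dense.dat", "wb") as f:   # hypothetical filename
    f.write(os.urandom(64 * 1024))
    f.flush()
    os.fsync(f.fileno())
print(looks_sparse("dense.dat"))
```

As the thread notes, this heuristic can't distinguish holes from
compression; it only reports that allocation is smaller than the
logical size.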
On Mon, 2 May 2011, Eric D. Mudama wrote:
>> These are probably just sparse files. Nothing to be alarmed about.
>
> They were created via CIFS. I thought sparse files were an iSCSI concept,
> no?

Sparse files are a traditional Unix filesystem feature. Many/most
database files are sparse. All that is needed to create a sparse
portion of a file is to seek beyond the current end of the file and
write something. The ftruncate() function may be used to create
all/part of a file which is sparse. A smart file writer (or the
filesystem) could easily convert long runs of zeros to a sparse
allocation.

Bob
--
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/
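The ftruncate() route Bob mentions grows a file without writing
anything at all; Python exposes it as the truncate() method on file
objects. A minimal sketch, with a hypothetical filename:

```python
import os

# Extending a file with ftruncate() logically fills the new range with
# zeros; a hole-aware filesystem allocates nothing for that range.
with open("grown.dat", "wb") as f:   # hypothetical name
    f.write(b"header")               # 6 real bytes at the front
    f.truncate(100 * 1024 * 1024)    # grow to 100 MiB without writing

st = os.stat("grown.dat")
print("logical size:   ", st.st_size)
print("allocated bytes:", st.st_blocks * 512)
```

On ZFS or ext4 the allocated figure stays near one block; a filesystem
without holes would have to materialize all 100 MiB of zeros.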
On 05/02/11 14:02, Nico Williams wrote:
> Also, sparseness need not be apparent to applications. Until recent
> improvements to lseek(2) to expose hole/non-hole offsets, the only way
> to know about sparseness was to notice that a file's reported size is
> more than the file's reported filesystem blocks times the block size.
> Sparse files in Unix go back at least to the early 80s.
>
> There are really two ways in which a filesystem protocol could support
> sparse files: a) by reporting file size in bytes and blocks; b) by
> reporting lists of file offsets demarcating holes from non-holes. (b)
> is a very new idea; Lustre may be the only filesystem I know of that
> supports this (see the Linux FIEMAP APIs), though work is in progress
> to add it to NFSv4.

I enhanced the lseek interface a while back to return information
about sparse files, by adding 2 new whence values: SEEK_HOLE and
SEEK_DATA. See man -s2 lseek

Neil.
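Those SEEK_HOLE/SEEK_DATA whence values are reachable from Python too
(os.SEEK_HOLE, where the platform provides it). A sketch that reports
the offset of a file's first hole; by convention, seeking to SEEK_HOLE
on a file with no interior holes lands on the implicit hole at EOF,
i.e. returns the file size:

```python
import errno
import os

def first_hole(path):
    """Offset of the first hole in the file, or None when the platform
    or filesystem doesn't support SEEK_HOLE."""
    if not hasattr(os, "SEEK_HOLE"):
        return None
    fd = os.open(path, os.O_RDONLY)
    try:
        try:
            return os.lseek(fd, 0, os.SEEK_HOLE)
        except OSError as e:
            # Filesystems without hole support reject the whence value.
            if e.errno in (errno.EINVAL, errno.ENOTSUP, errno.ENXIO):
                return None
            raise
    finally:
        os.close(fd)

# A fully written file: the first "hole" is the implicit one at EOF.
with open("plain.dat", "wb") as f:   # hypothetical filename
    f.write(b"z" * 8192)
print(first_hole("plain.dat"))
```

Unlike the size-versus-blocks heuristic, this interface can tell holes
apart from compression, since compressed-but-written ranges still
count as data.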
On Mon, May 2 at 20:50, Darren J Moffat wrote:
>On 05/ 2/11 08:41 PM, Eric D. Mudama wrote:
>>On Mon, May 2 at 14:01, Bob Friesenhahn wrote:
>>>On Mon, 2 May 2011, Eric D. Mudama wrote:
>>>
>>>>Hi. While doing a scan of disk usage, I noticed the following oddity.
>>>>I have a directory of files (named file.dat for this example) that all
>>>>appear as ~1.5GB when using 'ls -l', but that (correctly) appear as
>>>>~250KB
>>>>files when using 'ls -s' or du commands:
>>>
>>>These are probably just sparse files. Nothing to be alarmed about.
>>
>>They were created via CIFS. I thought sparse files were an iSCSI
>>concept, no?
>
>iSCSI is a block-level protocol. Sparse files are a filesystem-level
>concept that is understood by many filesystems, including CIFS and ZFS
>and many others.

Yea, kept googling and it makes sense. I guess I am simply surprised
that the application would have done the seek+write combination, since
on NTFS (which doesn't support sparse) these would have been real
1.5GB files, and there would be hundreds or thousands of them in
normal usage.

thx!

--
Eric D. Mudama
edmudama at bounceswoosh.org
On Mon, May 2, 2011 at 3:56 PM, Eric D. Mudama
<edmudama at bounceswoosh.org> wrote:
> Yea, kept googling and it makes sense. I guess I am simply surprised
> that the application would have done the seek+write combination, since
> on NTFS (which doesn't support sparse) these would have been real
> 1.5GB files, and there would be hundreds or thousands of them in
> normal usage.

It could have been smbd compressing long runs of zeros.
On Mon, 2 May 2011, Eric D. Mudama wrote:
> Yea, kept googling and it makes sense. I guess I am simply surprised
> that the application would have done the seek+write combination, since
> on NTFS (which doesn't support sparse) these would have been real
> 1.5GB files, and there would be hundreds or thousands of them in
> normal usage.

This is a reason why a Solaris server may be much faster than a
Windows/NTFS server when there is a seek followed by a write. In my
own application, I see that a long seek past the end of the file
followed by a write is quite slow on NTFS but is quite fast on Unix
filesystems which support holes.

Apple's HFS Plus is another popular filesystem which does not support
holes and is therefore quite slow at creating large files comprised of
zeros.

Bob
--
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/
Then again, Windows apps may be doing seek+write to pre-allocate storage.
On Mon, May 2, 2011 at 1:56 PM, Eric D. Mudama
<edmudama at bounceswoosh.org> wrote:
> that the application would have done the seek+write combination, since
> on NTFS (which doesn't support sparse) these would have been real
> 1.5GB files, and there would be hundreds or thousands of them in
> normal usage.

NTFS supports sparse files:
http://www.flexhex.com/docs/articles/sparse-files.phtml

-B
--
Brandon High : bhigh at freaks.com
On Mon, May 2 at 15:30, Brandon High wrote:
>On Mon, May 2, 2011 at 1:56 PM, Eric D. Mudama
><edmudama at bounceswoosh.org> wrote:
>> that the application would have done the seek+write combination, since
>> on NTFS (which doesn't support sparse) these would have been real
>> 1.5GB files, and there would be hundreds or thousands of them in
>> normal usage.
>
>NTFS supports sparse files:
>http://www.flexhex.com/docs/articles/sparse-files.phtml

OK, corrected; my google-fu had indicated otherwise. Thanks!

--
Eric D. Mudama
edmudama at bounceswoosh.org