Filipe Brandenburger
2009-Mar-11 21:29 UTC
[CentOS] Disk usage for small files in ext3 in CentOS 5
Hello,

I noticed something unusual today.

If I "du" a small file (a couple of bytes) in CentOS 5, it tells me the file is using 8 KB, while I was expecting 4 KB, which is the block size I'm using.

I tried this on several CentOS 5 machines, both x86_64 and i386:

$ echo test >test.txt
$ ls -l test.txt
-rw-rw-r-- 1 filbranden filbranden 5 Mar 11 17:24 test.txt
$ du -h test.txt
8.0K    test.txt

If I do the same on a CentOS 4 machine:

$ echo test >test.txt
$ ls -l test.txt
-rw-rw-r-- 1 filbranden filbranden 5 Mar 11 17:25 test.txt
$ du -h test.txt
4.0K    test.txt

On all machines I tested, both CentOS 4 and CentOS 5:

# tune2fs -l /dev/xxxxx
...
Block size:               4096
Fragment size:            4096

I could not find any differences that would explain the behaviour. Have you seen this before? Can you reproduce it on your systems? Do you know how to get the CentOS 4 behaviour?

More to the point: I'm migrating some data from CentOS 4 to CentOS 5, around 70GB made up of millions of small files. I would like it to still take 70GB, not 140GB. For now, I'm working around this issue by passing "-T small" to mke2fs, but I'm not sure it will have the effect I want, and I'm not sure what other impact (performance?) it might have on my filesystem.

Thanks,
Filipe
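The 70GB versus 140GB concern above is plain block arithmetic: if every small file occupies two 4096-byte blocks instead of one, the total doubles. A minimal sketch of that calculation follows. The function name `estimate_gib` and the file count of 18350080 (simply 70 GiB divided by 4 KiB) are illustrative assumptions, not figures from the thread.

```shell
# Estimate total disk usage in GiB for files that each occupy a fixed
# number of filesystem blocks (integer arithmetic, rounds down).
#   $1 = number of files   (hypothetical value in the examples)
#   $2 = blocks per file
#   $3 = block size in bytes
estimate_gib() {
    echo $(( $1 * $2 * $3 / 1024 / 1024 / 1024 ))
}
```

With these assumed inputs, `estimate_gib 18350080 1 4096` prints 70 and `estimate_gib 18350080 2 4096` prints 140.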
Nicolas Thierry-Mieg
2009-Mar-11 21:51 UTC
[CentOS] Disk usage for small files in ext3 in CentOS 5
Filipe Brandenburger wrote:
> Hello,
>
> I noticed something unusual today.
>
> If I "du" a small file (couple of bytes) in CentOS 5, it tells me the
> file is using 8kb, while I was expecting 4kb which is the block size
> I'm using.
>
> I tried this on several CentOS 5 machines, both x86_64 and i386:
>
> $ echo test >test.txt
> $ ls -l test.txt
> -rw-rw-r-- 1 filbranden filbranden 5 Mar 11 17:24 test.txt
> $ du -h test.txt
> 8.0K test.txt

<snip>

> I could not find any differences that would explain the behaviour.
> Have you seen this before? Can you reproduce it on your systems? Do
> you know how to get the CentOS 4 behaviour?

Strange. I can't reproduce this on an x86_64 CentOS 5 machine:

[nthierry@localhost ~]$ echo test >test.txt
[nthierry@localhost ~]$ ls -l test.txt
-rw-rw-r-- 1 nthierry nthierry 5 Mar 11 22:44 test.txt
[nthierry@localhost ~]$ du -h test.txt
4.0K    test.txt

I'm pretty sure I did nothing special when making the fs.

HTH
Robert Nichols
2009-Mar-11 22:08 UTC
[CentOS] Disk usage for small files in ext3 in CentOS 5
Filipe Brandenburger wrote:
> Hello,
>
> I noticed something unusual today.
>
> If I "du" a small file (couple of bytes) in CentOS 5, it tells me the
> file is using 8kb, while I was expecting 4kb which is the block size
> I'm using.
>
> I tried this on several CentOS 5 machines, both x86_64 and i386:
>
> $ echo test >test.txt
> $ ls -l test.txt
> -rw-rw-r-- 1 filbranden filbranden 5 Mar 11 17:24 test.txt
> $ du -h test.txt
> 8.0K test.txt

Odd. I'm not seeing this on CentOS 5.2:

$ echo test >test.txt
$ ls -ls test.txt
4 -rw-rw-r-- 1 rnichols rnichols 5 Mar 11 16:57 test.txt
$ du -h test.txt
4.0K    test.txt
$ stat test.txt
  File: `test.txt'
  Size: 5          Blocks: 8          IO Block: 4096   regular file
Device: 341h/833d  Inode: 4325491     Links: 1
Access: (0664/-rw-rw-r--)  Uid: (  500/rnichols)   Gid: (  500/rnichols)
Access: 2009-03-11 16:57:18.000000000 -0500
Modify: 2009-03-11 16:57:18.000000000 -0500
Change: 2009-03-11 16:57:18.000000000 -0500
$ df .
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/hdb1            487397840 320075816 142902824  70% /xstore
$ su - -c "tune2fs -l /dev/hdb1" | egrep 'features|size'
Password:
Filesystem features:      has_journal resize_inode dir_index filetype needs_recovery sparse_super large_file
Block size:               4096
Fragment size:            4096
Inode size:               128

Everything exactly as expected.

--
Bob Nichols     "NOSPAM" is really part of my email address.
                Do NOT delete it.
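The stat output above shows "Blocks: 8" next to a du result of 4.0K, which can look contradictory. On Linux, stat counts allocated space in 512-byte units (its `%B` format), so 8 blocks is one 4096-byte filesystem block. The sketch below, assuming GNU stat, prints both figures for a file. The function names `alloc_report` and `blocks_to_kib` are illustrative, not from the thread.

```shell
# Print apparent size and allocated size, both in bytes, for one file.
# GNU stat formats: %s = size in bytes, %b = allocated blocks,
# %B = size in bytes of each block counted by %b (normally 512).
alloc_report() {
    stat -c '%s %b %B' -- "$1" | {
        read -r size blocks unit
        echo "apparent=$size allocated=$(( blocks * unit ))"
    }
}

# Convert a stat "Blocks:" count to KiB, assuming 512-byte units.
blocks_to_kib() {
    echo $(( $1 * 512 / 1024 ))
}
```

`blocks_to_kib 8` prints 4, matching the du output above. A file reported by du as 8.0K would presumably show 16 blocks in stat, although no such stat output appears in the thread.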
William L. Maltby
2009-Mar-11 22:58 UTC
[CentOS] Disk usage for small files in ext3 in CentOS 5
On Wed, 2009-03-11 at 17:29 -0400, Filipe Brandenburger wrote:
> Hello,
>
> I noticed something unusual today.
>
> If I "du" a small file (couple of bytes) in CentOS 5, it tells me the
> file is using 8kb, while I was expecting 4kb which is the block size
> I'm using.
>
> I tried this on several CentOS 5 machines, both x86_64 and i386:
>
> $ echo test >test.txt
> $ ls -l test.txt
> -rw-rw-r-- 1 filbranden filbranden 5 Mar 11 17:24 test.txt
> $ du -h test.txt
> 8.0K test.txt
>
> If I do the same on a CentOS 4 machine:
>
> $ echo test >test.txt
> $ ls -l test.txt
> -rw-rw-r-- 1 filbranden filbranden 5 Mar 11 17:25 test.txt
> $ du -h test.txt
> 4.0K test.txt
>
> On all machines I tested, both CentOS 4 and CentOS 5:
>
> # tune2fs -l /dev/xxxxx
> ...
> Block size: 4096
> Fragment size: 4096
>
> I could not find any differences that would explain the behaviour.
> Have you seen this before? Can you reproduce it on your systems? Do
> you know how to get the CentOS 4 behaviour?
>
> More on the point: I'm migrating some data from CentOS 4 to CentOS 5,
> it's around 70GB of millions of small files. I would like it to still
> take 70GB, not 140GB. For now, I'm working around this issue by using
> "-T small" to mke2fs, I'm not sure if it's going to have the effect I
> want, and I'm not sure about any other impact (performance?) it might
> have on my filesystem.

I'm a gambler, so I'll bet on this. Very large disks? If so, it may be that some of the tunables specify two blocks per "fragment" or the bytes-per-inode specifies more than 4K.

I've been able, in the past, to affect things like this by tuning the number of i-nodes up/down when making the file system. Generally though, I'm reducing the number as there is a lot of space that can be gained since normally there will be 1 per block, IIRC. Since my desktop FS doesn't experience that much growth, and lots of the files are large, this is safe. YMMV.

The output of the tune2fs command might give some hints. Also, using mke2fs with the "-n" parameter will tell you what it would do if you were to (re)make the file system.

> <snip sig stuff>

HTH
--
Bill
Filipe Brandenburger
2009-Mar-11 23:36 UTC
[CentOS] Disk usage for small files in ext3 in CentOS 5
Hi,

On Wed, Mar 11, 2009 at 17:29, Filipe Brandenburger <filbranden at gmail.com> wrote:
> If I "du" a small file (couple of bytes) in CentOS 5, it tells me the
> file is using 8kb, while I was expecting 4kb which is the block size
> I'm using.

Found it!

It's not related to CentOS 4 or 5 (I found a C4 machine on which small files took 8 KB of disk space and a C5 machine on which small files took 4 KB).

It's related to whether SELinux is enabled. Coincidentally, most of my C4 machines had SELinux disabled and most of my C5 machines have it enabled. I have now dug out some machines with the opposite configuration and checked.

I believe that if SELinux is enabled, it uses extended attributes to store the file's SELinux context (you can see it with "ls -Z"; for some reason you cannot see it with "getfattr -d", which I was expecting to be possible). I guess that when a file has extended attributes, an additional block is used to store them.

That basically doubles the storage requirements if you have millions of tiny files...

ACLs would probably have the same effect (I did not test that, though).

I wonder if there is a way to override this, for instance by mounting a filesystem with extended attributes disabled, or by specifying the SELinux context for all the files in the mount options or something. I know that is possible for NFS, but not for local filesystems...

I'll dig in and let you know if I find anything.

Thanks!
Filipe
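On the getfattr puzzle above: according to the getfattr manual page, `-d` dumps only attributes whose names match the default pattern `^user\.`, and the SELinux context is stored under `security.selinux`, so it is filtered out unless the pattern is widened with `-m`. A minimal sketch follows, assuming the getfattr tool from the attr package is installed. The function names are illustrative.

```shell
# Dump every extended attribute on a file, not just the user.* namespace.
# "-m -" replaces the default "^user\." pattern with one that matches all names.
show_all_xattrs() {
    getfattr -m - -d -- "$1"
}

# Print only the raw SELinux context, if the file carries one.
show_selinux_label() {
    getfattr -n security.selinux --only-values -- "$1"
}
```

Running `show_all_xattrs test.txt` on an SELinux-enabled machine should list a `security.selinux` entry. The 128-byte inode size shown in the tune2fs output earlier in the thread leaves no spare room inside the inode for attributes, which would be consistent with the extra block; whether a larger inode size avoids it was not tested in this thread.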
Kai Schaetzl
2009-Mar-12 08:31 UTC
[CentOS] Disk usage for small files in ext3 in CentOS 5
Filipe Brandenburger wrote on Wed, 11 Mar 2009 17:29:06 -0400:
> $ du -h test.txt
> 8.0K test.txt

And just "du test.txt"? I.e. without "translation"?

Kai

--
Kai Schätzl, Berlin, Germany
Get your web at Conactive Internet Services: http://www.conactive.com
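To take the `-h` rounding out of the picture as suggested above, du can be asked for raw numbers. The sketch below assumes GNU du and prints the allocated size in KiB and the apparent size in bytes. The function names are illustrative.

```shell
# Allocated size in KiB, as "du -k" reports it (no human-readable rounding).
du_allocated_kib() {
    du -k -- "$1" | cut -f1
}

# Apparent size in bytes (GNU du: -b means --apparent-size --block-size=1),
# i.e. the same figure "ls -l" shows.
du_apparent_bytes() {
    du -b -- "$1" | cut -f1
}
```

For the 5-byte test.txt from the thread, `du_apparent_bytes test.txt` prints 5, while `du_allocated_kib test.txt` prints the number of KiB actually allocated on that filesystem.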