Thorn Roby
2012-Dec-04 19:44 UTC
Mounted compress-force=zlib, compresses when files copied in, but not when written directly by application
I am trying to use a btrfs filesystem (Oracle Linux 6.3 x86_64, UEK kernel) with data RAID0 and metadata RAID1, mounted as follows:

/dev/sda3 on /data type btrfs (rw,noatime,compress-force=zlib,noacl)

dmesg shows:

btrfs: force zlib compression
btrfs: disk space caching is enabled

My application is MongoDB, which creates a series of 2GB datafiles, possibly using sparse allocation (at least I know it does on XFS), and reports the following for each datafile:

Tue Dec 4 12:05:26.538 [FileAllocator] allocating new datafile /data/db/lfs/lfs.19, filling with zeroes...
Tue Dec 4 12:05:26.544 [FileAllocator] done allocating datafile /data/db/lfs/lfs.19, size: 2047MB, took 0.006 secs

If I create a new filesystem and mount it with compress-force=zlib, then rsync a number of these 2GB files containing MongoDB data from another system, I get roughly 4:1 compression. I also noticed that CPU usage remains at 100% and the rsync speed is roughly 70% of network speed. If I do the same thing using LZO, I see low CPU usage, transmission speed is 100% of network speed, and compression is about 2:1. These results are what I expect.

However, if I run the mongodb process directly on the machine, using compress-force=zlib (haven't tried LZO yet) with the same mount options, and load data into the database from another system via database inserts, there is no evidence of compression. This is consistently confirmed by the output of df, btrfs fi df, and an internal report from the database which shows row count and average row size (560 bytes). I also see low CPU usage (however this is not conclusive, since the file write rate from the database process is roughly 10 times slower than a direct write using rsync).

[root@eng-mongodb-t1 lfs]# btrfs fi df /data
Data, RAID0: total=37.95GB, used=33.84GB
Data: total=8.00MB, used=0.00
System, RAID1: total=8.00MB, used=4.00KB
System: total=4.00MB, used=0.00
Metadata, RAID1: total=1.00GB, used=45.35MB
Metadata: total=8.00MB, used=0.00

[root@eng-mongodb-t1 lfs]# df -h
/dev/sda3        40G   34G  4.2G  90% /data

I tried chattr -c on the database directory, but it doesn't appear to be supported in Oracle Linux (nor was it necessary in the case where I rsynced the files and saw the expected compression).

Is there something about the (possible) initial sparse allocation followed by zero-filling that might be disabling compression as the datafile is later overwritten by the database process, despite the compress-force flag? filefrag -v shows this, but I'm still not sure how to interpret it:

Filesystem type is: 9123683e
File size of lfs.18 is 2146435072 (524032 blocks, blocksize 4096)
 ext logical physical expected length flags
   0       0  9182208               2
   1       2  9706240  9182209      1
   2       3  9182211  9706240  14348
   3   14351  9706241  9196558      1
   4   14352  9196560  9706241  28126
   5   42478  9224686        481554  eof
lfs.18: 5 extents found
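For what it's worth, the 0.006-second "filling with zeroes" step above is far too fast for a real 2GB zero-fill, which suggests the allocator is leaving a sparse hole instead. A minimal C sketch of that pattern is below; this is only an assumption about what MongoDB's FileAllocator might do (not its actual code), with the path and size taken from the log above.

/* Hypothetical sketch of a sparse ~2GB datafile allocation, consistent
 * with the FileAllocator timing above.  NOT MongoDB's actual code. */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    const off_t size = 2047LL * 1024 * 1024;   /* matches the 2146435072-byte datafile */
    int fd = open("/data/db/lfs/lfs.19", O_CREAT | O_RDWR, 0644);
    if (fd < 0) { perror("open"); return 1; }

    /* Extend the file without writing any data blocks: the file is one big hole. */
    if (ftruncate(fd, size) != 0) { perror("ftruncate"); return 1; }

    close(fd);
    return 0;
}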
Liu Bo
2012-Dec-05 02:01 UTC
Re: Mounted compress-force=zlib, compresses when files copied in, but not when written directly by application
On Tue, Dec 04, 2012 at 07:44:16PM +0000, Thorn Roby wrote:
> [...]
> Is there something about the (possible) initial sparse allocation followed by
> zero-filling that might be disabling compression as the datafile is later
> overwritten by the database process, despite the compress-force flag?

Well, it shouldn't be that, since we have compress-force: as the name shows, with compress-force it'll always try to compress the data via zlib/lzo.

So the previous rsync case is just 'write a number of 2G files', while the current application case is 'write a number of 2G files' plus 'overwritten by others later', is that right?

And is the above 'btrfs fi df' output captured after the 'overwritten' stage, or not?

thanks,
Thorn Roby
2012-Dec-05 17:21 UTC
RE: Mounted compress-force=zlib, compresses when files copied in, but not when written directly by application
Thanks for your response. Yes, the rsync copies a number of the 2GB files written by the same database software on another system on XFS, and compression is successful. These files consist mostly of plain text log output and are highly compressible (4:1 via zlib).

When the database is running locally on btrfs, the I/O pattern is that the database allocates a new 2GB datafile (perhaps using a sparse allocation call; I know it does on XFS) and then says it fills the file with zeroes (but I'm not certain this actually happens). At that point database inserts are appended initially to a transaction log (a separate non-sparse 2GB file), and shortly thereafter there is a batch copy operation (I'm not sure of the size) of a set of new rows, which are copied from the transaction log and appended to the main datafiles (which are sparse). Up until now I have included both the transaction log and the datafiles within the btrfs filesystem; next I'm going to try putting the transaction log on a separate filesystem.

The btrfs fi df output was taken after the database load process was stopped, so it reflects a static set of the 2GB datafiles (plus the transaction log and a few other database system files).
Thorn Roby
2012-Dec-05 17:42 UTC
RE: Mounted compress-force=zlib, compresses when files copied in, but not when written directly by application
My previous reply was incorrect on one point: the data is never copied from the transaction log into the sparse datafiles; instead, the application writes the same data independently to both locations.

Also, I failed to mention that the files are memory-mapped, and it's possible that the write operations attempt to use DIRECT_IO, which I believe is disabled by btrfs with compression. Is it possible that an attempt to use DIRECT_IO or memory-mapped files would prevent compression?
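To make the memory-mapped case concrete, here is a minimal sketch of how a write to one of these datafiles through mmap would look from userspace. It is illustrative only, not MongoDB's implementation; the filename comes from the filefrag output earlier and the row contents are made up.

/* Minimal sketch of a memory-mapped write to an existing 2GB datafile.
 * Illustrative only -- not MongoDB's code. */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
    size_t len = 2047UL * 1024 * 1024;          /* map the whole datafile */
    int fd = open("/data/db/lfs/lfs.18", O_RDWR);
    if (fd < 0) { perror("open"); return 1; }

    char *map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED) { perror("mmap"); return 1; }

    /* Writes go through the page cache: memcpy just dirties pages,
     * which the kernel writes back to the filesystem later. */
    memcpy(map + 4096, "example row data", 16);

    msync(map, len, MS_ASYNC);                  /* schedule writeback */
    munmap(map, len);
    close(fd);
    return 0;
}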
Liu Bo
2012-Dec-06 06:59 UTC
Re: Mounted compress-force=zlib, compresses when files copied in, but not when written directly by application
On Wed, Dec 05, 2012 at 05:42:51PM +0000, Thorn Roby wrote:
> My previous reply was incorrect on one point: the data is never copied from the
> transaction log into the sparse datafiles; instead, the application writes the
> same data independently to both locations.
>
> Also, I failed to mention that the files are memory-mapped, and it's possible
> that the write operations attempt to use DIRECT_IO, which I believe is disabled
> by btrfs with compression. Is it possible that an attempt to use DIRECT_IO or
> memory-mapped files would prevent compression?

Actually, writing with DIRECT_IO will fall back to a buffered write for safety, and mmap'ed files just dirty pages, which should be the same as a buffered write, so it might be some other reason.

thanks,
liubo
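For comparison, this is roughly what an O_DIRECT write looks like from the application side; whether btrfs actually honors the direct path or falls back to a buffered (and therefore compressible) write when compression is forced is decided inside the filesystem, as described above. This is a hedged sketch, not taken from any of the applications discussed here.

/* Sketch of an O_DIRECT write from userspace (illustrative only).
 * O_DIRECT requires the buffer, offset and length to be suitably aligned. */
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    int fd = open("/data/db/lfs/lfs.18", O_WRONLY | O_DIRECT);
    if (fd < 0) { perror("open"); return 1; }

    void *buf;
    if (posix_memalign(&buf, 4096, 4096) != 0) {
        fprintf(stderr, "posix_memalign failed\n");
        return 1;
    }
    memset(buf, 'x', 4096);

    /* Per the explanation above, the filesystem may still service this
     * as a buffered write when compression is in effect. */
    if (pwrite(fd, buf, 4096, 0) != 4096) perror("pwrite");

    free(buf);
    close(fd);
    return 0;
}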
Thorn Roby
2012-Dec-06 20:34 UTC
RE: Mounted compress-force=zlib, compresses when files copied in, but not when written directly by application
It appears to be an issue with the initial sparse file allocation. When I manually create 2GB datafiles using dd from /dev/zero, instead of allowing MongoDB to allocate them as needed, compression seems to be working correctly.
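For reference, the working approach amounts to writing real zero-filled blocks up front instead of leaving a sparse hole. The C sketch below does the equivalent of the dd-from-/dev/zero preallocation described above; the path and size are illustrative.

/* Sketch of preallocating a datafile by actually writing zeroes
 * (the equivalent of dd if=/dev/zero), so no sparse hole is left.
 * Illustrative only; path and size are examples. */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    static char buf[1 << 20];                   /* 1MB of zeroes (static, so zero-initialized) */
    int fd = open("/data/db/lfs/lfs.19", O_CREAT | O_WRONLY | O_TRUNC, 0644);
    if (fd < 0) { perror("open"); return 1; }

    /* 2047 x 1MB writes: every block is written for real, matching the
     * case above where compression was then observed to work. */
    for (int i = 0; i < 2047; i++) {
        if (write(fd, buf, sizeof(buf)) != (ssize_t)sizeof(buf)) {
            perror("write");
            return 1;
        }
    }
    close(fd);
    return 0;
}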