If you create a filesystem, build an ON workspace in it, and then delete
the workspace, disk usage as reported by df & zfs list doesn't shrink back
down all the way to what it was when the filesystem was first created.

I first noticed this when cleaning up a filesystem which had contained a
couple nightly build workspaces on a system running nv_35; "du" reported
~2 blocks usage, but "df"/"zfs list" reported on the order of 600MB
usage (!).

Since there have been lots of post-35 fixes I tried reproducing this on a
system running yesterday's ON nightly. The amount of hidden "leftover"
usage seems to be greatly improved but it's still substantially more than
I'd expect.

On this system, a freshly created filesystem shows 9K of usage:

# df /export/a/build/sommerfeld
Filesystem          1024-blocks      Used  Available  Capacity  Mounted on
a/build/sommerfeld    105338880         9   96721092        1%  /export/a/build/sommerfeld

After a build + workspace delete, we get back to an empty filesystem again:

# du .
0       ./.zfs/snapshot
0       ./.zfs
2       .

but it's still using 1.4MB:

# df -k .
Filesystem          1024-blocks      Used  Available  Capacity  Mounted on
a/build/sommerfeld    105338880      1413   96719691        1%  /export/a/build/sommerfeld

From looking at "zdb -bbb -L a" output before and after a "zfs destroy/zfs
create" on the filesystem, I see a number of entries which are different.
The most significant change is the "ZFS directory" line which goes from:

 1.17K   1.71M    617K    617K     526    2.84    0.27  ZFS directory

to

    80    150K   52.5K   52.5K     672    2.85    0.02  ZFS directory

Here's everything which changed in the zdb output:

Before destroy:

Blocks   LSIZE   PSIZE   ASIZE     avg    comp  %Total  Type
     5   80.0K   12.5K   12.5K   2.50K    6.40    0.01  L1 deferred free
     9   84.0K   21.5K   21.5K   2.39K    3.91    0.01  L0 deferred free
    14    164K   34.0K   34.0K   2.43K    4.82    0.01  deferred free
...
   386   6.03M    787K    787K   2.04K    7.85    0.34  L1 SPA space map
 9.03K   36.1M   30.8M   30.8M   3.41K    1.17   13.67  L0 SPA space map
 9.41K   42.2M   31.6M   31.6M   3.36K    1.34   14.01  SPA space map
     1   28.0K   28.0K   28.0K   28.0K    1.00    0.01  ZIL intent log
...
     7    112K   23.5K   23.5K   3.36K    4.77    0.01  L2 DMU dnode
    38    608K   85.0K   85.0K   2.24K    7.15    0.04  L1 DMU dnode
   795   12.4M   1.09M   1.09M   1.40K   11.42    0.48  L0 DMU dnode
   862   13.5M   1.26M   1.26M   1.50K   10.70    0.56  DMU dnode
...
 1.17K   1.71M    617K    617K     526    2.84    0.27  ZFS directory
...
     6   73.0K   15.0K   15.0K   2.50K    4.87    0.01  ZFS delete queue
...
     8    128K   25.0K   25.0K   3.12K    5.12    0.01  L2 Total
   487   7.61M   1005K   1005K   2.06K    7.76    0.44  L1 Total
 14.3K    268M    224M    224M   15.7K    1.20   99.52  L0 Total
 14.8K    276M    225M    225M   15.2K    1.23  100.00  Total

After destroy/create:

Blocks   LSIZE   PSIZE   ASIZE     avg    comp  %Total  Type
     4     64K   9.00K   9.00K   2.25K    7.11    0.00  L1 deferred free
     8   80.0K   20.0K   20.0K   2.50K    4.00    0.01  L0 deferred free
    12    144K   29.0K   29.0K   2.42K    4.97    0.01  deferred free
...
   386   6.03M    787K    787K   2.04K    7.85    0.34  L1 SPA space map
 9.03K   36.1M   30.8M   30.8M   3.41K    1.17   13.76  L0 SPA space map
 9.41K   42.2M   31.6M   31.6M   3.36K    1.34   14.10  SPA space map
     -       -       -       -       -       -       -  ZIL intent log
...
     7    112K   22.0K   22.0K   3.14K    5.09    0.01  L2 DMU dnode
     8    128K   26.0K   26.0K   3.25K    4.92    0.01  L1 DMU dnode
    80   1.25M    346K    346K   4.32K    3.70    0.15  L0 DMU dnode
   117   1.83M    461K    461K   3.94K    4.06    0.20  DMU dnode
...
    80    150K   52.5K   52.5K     672    2.85    0.02  ZFS directory
...
     6   3.00K   3.00K   3.00K     512    1.00    0.00  ZFS delete queue
...
     8    128K   23.5K   23.5K   2.94K    5.45    0.01  L2 Total
   456   7.12M    942K    942K   2.07K    7.75    0.41  L1 Total
 12.5K    255M    223M    223M   17.8K    1.14   99.55  L0 Total
 13.0K    263M    224M    224M   17.3K    1.17  100.00  Total
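(For completeness, the comparison above amounts to roughly this sequence;
the /tmp file names are arbitrary and the middle step is the usual
bringover + nightly build:)

  # fresh filesystem
  zfs create a/build/sommerfeld
  df -k /export/a/build/sommerfeld        # ~9K used

  # ... bringover + nightly build, then delete the workspace ...

  du -sk /export/a/build/sommerfeld       # du sees an essentially empty fs
  df -k /export/a/build/sommerfeld        # df still shows ~1.4MB used

  # pool-wide block stats before and after destroying/recreating the fs
  zdb -bbb -L a > /tmp/zdb.before
  zfs destroy a/build/sommerfeld
  zfs create a/build/sommerfeld
  zdb -bbb -L a > /tmp/zdb.after
  diff /tmp/zdb.before /tmp/zdb.after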
On Wed, Mar 15, 2006 at 10:51:33AM -0500, Bill Sommerfeld wrote:
> I first noticed this when cleaning up a filesystem which had contained a
> couple nightly build workspaces on a system running nv_35; "du" reported
> ~2 blocks usage, but "df"/"zfs list" reported on the order of 600MB
> usage (!).

Yep, you were seeing

6391873 metadata compression should be turned back on

> Since there have been lots of post-35 fixes I tried reproducing this on
> a system running yesterday's ON nightly. The amount of hidden
> "leftover" usage seems to be greatly improved but it's still
> substantially more than I'd expect.

Hmm, I wasn't able to reproduce this. Did you create the fs under the
new bits as well? On my system, after removing everything in a fs that
contained a workspace, I see that it is using 9KB.

Could you run 'zdb -vvv pool/emptyfs'? This will tell us what is using
space in that fs.

I would believe that the space used in the *pool* does not go completely
back to where it began after creating and deleting a workspace, due to
the space maps. For example on your system the space maps are using 32MB:

> 9.41K   42.2M   31.6M   31.6M   3.36K    1.34   14.01  SPA space map

--matt
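P.S. If the full zdb dump turns out to be huge, even a rough filter like
this will show which object types are still holding blocks in the fs; the
grep patterns and the /tmp path are just suggestions:

  zdb -vvv pool/emptyfs > /tmp/zdb.emptyfs
  grep -c 'ZFS plain file'   /tmp/zdb.emptyfs     # leftover file objects
  grep -c 'ZFS directory'    /tmp/zdb.emptyfs     # leftover directory objects
  grep    'ZFS delete queue' /tmp/zdb.emptyfs     # pending deletes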
On Wed, 2006-03-15 at 13:47, Matthew Ahrens wrote:
> Hmm, I wasn't able to reproduce this. Did you create the fs under the
> new bits as well?

I believe so, but I'm trying again with a freshly created fs just to be
sure. I don't 100% trust my memory that I nuked & recreated the fs
*after* the bfu+reboot to the onnv-gate:2006-03-14 build.

> On my system, after removing everything in a fs that
> contained a workspace, I see that it is using 9KB.
>
> Could you run 'zdb -vvv pool/emptyfs'? This will tell us what is using
> space in that fs.

I'll let you know once I reproduce it again -- or not.
(This isn't merely creating a workspace but also building it, so it will
be a while.)

- Bill
On Mar 15, 2006, at 11:46 AM, Bill Sommerfeld wrote:
> On Wed, 2006-03-15 at 13:47, Matthew Ahrens wrote:
>> Hmm, I wasn't able to reproduce this. Did you create the fs under the
>> new bits as well?
>
> I believe so, but I'm trying again with a freshly created fs just to be
> sure. I don't 100% trust my memory that I nuked & recreated the fs
> *after* the bfu+reboot to the onnv-gate:2006-03-14 build.
>
>> On my system, after removing everything in a fs that
>> contained a workspace, I see that it is using 9KB.
>>
>> Could you run 'zdb -vvv pool/emptyfs'? This will tell us what is using
>> space in that fs.
>
> I'll let you know once I reproduce it again -- or not.
> (this isn't merely creating a workspace but also building it, so it will
> be a while..)
>
> - Bill

You might also try comparing the states 1) completely clean, 2) after
bringover/build, 3) after rm -rf, and 4) after another bringover/build.
Granted, 1 and 3 aren't quite equal, but I'd be more curious about
comparing 2 and 4.

ckl
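P.S. A rough way to capture that comparison (untested; the pool name and
the /tmp paths are just placeholders based on the earlier messages):

  zdb -bbb -L a > /tmp/zdb.1-clean          # after zfs create, before any build
  # ... bringover + build ...
  zdb -bbb -L a > /tmp/zdb.2-built
  # ... rm -rf the workspace ...
  zdb -bbb -L a > /tmp/zdb.3-removed
  # ... bringover + build again ...
  zdb -bbb -L a > /tmp/zdb.4-rebuilt

  diff /tmp/zdb.1-clean /tmp/zdb.3-removed  # won't be quite equal
  diff /tmp/zdb.2-built /tmp/zdb.4-rebuilt  # the interesting comparison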
On Wed, 2006-03-15 at 13:47, Matthew Ahrens wrote:
> Could you run 'zdb -vvv pool/emptyfs'? This will tell us what is using
> space in that fs.

That command produced about 500KB of output, so I won't include the
whole thing here; if you want to see the whole thing, let me know.

The usage dropped from 1463KB to 11KB when I unmounted and remounted the
filesystem (something I didn't try previously).

- Bill

# zdb -vvv a/build/sommerfeld
Dataset a/build/sommerfeld [ZPL], ID 469, cr_txg 2277942, last_txg 2285133,
1.43M, 1166 objects, rootbp [L0 DMU objset] vdev=1 offset=6045a2a00
size=400L/200P/200A fletcher4 lzjb BE contiguous birth=2285133 fill=1166
cksum=cdd3d3e64:518d561ae2:108ac5a5796c5:24a18ca4b6521a

ZIL header: claim_txg 0, seq 0
first block: [L0 ZIL intent log] vdev=2 offset=1b43c5000 size=7000L/7000P/7000A
zilog uncompressed BE contiguous birth=2283341 fill=0
cksum=5ff72d775e025f3:a96fbf4821809c06:1d5:1b95d
Block seqno 112989, won't claim

    Object  lvl   iblk   dblk  lsize  psize  type
         0    7    16K    16K   152M   862K  DMU dnode

    Object  lvl   iblk   dblk  lsize  psize  type
         1    1    16K    512    512    512  ZFS master node
        microzap: 512 bytes, 3 entries

                VERSION = 1
                ROOT = 3
                DELETE_QUEUE = 2

    Object  lvl   iblk   dblk  lsize  psize  type
         2    1    16K  73.0K  73.0K  13.0K  ZFS delete queue
        microzap: 74752 bytes, 1162 entries

                14294 = 82580
                29fa = 10746
                e208 = 57864
                1425a = 82522
                5d5a = 23898
                98c5 = 39109
                870c = 34572
                ...
                3ea9 = 16041
                a4b8 = 42168
                ed69 = 60777
                8d74 = 36212
                146f2 = 83698
                15872 = 88178
                13c6d = 81005
                8765 = 34661
                13fa7 = 81831

    Object  lvl   iblk   dblk  lsize  psize  type
         3    1    16K    512    512    512  ZFS directory
                                 264  bonus  ZFS znode
        path    /
        atime   Wed Mar 15 20:13:28 2006
        mtime   Wed Mar 15 20:13:28 2006
        ctime   Wed Mar 15 20:13:28 2006
        crtime  Wed Mar 15 10:13:43 2006
        gen     2277942
        mode    40755
        size    2
        parent  3
        links   2
        xattr   0
        rdev    0x0000000000000000
        microzap: 512 bytes, 0 entries

    Object  lvl   iblk   dblk  lsize  psize  type
         6    1    16K    512    512    512  ZFS directory
                                 264  bonus  ZFS znode
        path    ???<object#6>
        atime   Wed Mar 15 20:13:28 2006
        mtime   Wed Mar 15 20:13:28 2006
        ctime   Wed Mar 15 20:13:28 2006
        crtime  Wed Mar 15 14:37:10 2006
        gen     2281104
        mode    40775
        size    2
        parent  5
        links   0
        xattr   0
        rdev    0x0000000000000000
        microzap: 512 bytes, 0 entries

    Object  lvl   iblk   dblk  lsize  psize  type
        11    1    16K    512    512    512  ZFS directory
                                 264  bonus  ZFS znode
        path    ???<object#11>
        atime   Wed Mar 15 20:13:24 2006
        mtime   Wed Mar 15 20:13:24 2006
        ctime   Wed Mar 15 20:13:24 2006
        crtime  Wed Mar 15 14:37:12 2006
        gen     2281104
        mode    40775
        size    2
        parent  3
        links   0
        xattr   0
        rdev    0x0000000000000000
        microzap: 512 bytes, 0 entries
...
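P.S. For the record, the remount that released the space was nothing
fancier than the following (the grep at the end is just a quick way to
spot the delete-queue backlog without wading through the whole dump):

  df -k /export/a/build/sommerfeld        # stuck at ~1.4MB used
  zfs unmount a/build/sommerfeld
  zfs mount a/build/sommerfeld
  df -k /export/a/build/sommerfeld        # back down to ~11KB

  # before the remount, the delete queue is the object still pinning space
  zdb -vvv a/build/sommerfeld | grep 'ZFS delete queue'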
> I would believe that the space used in the *pool* does not go completely
> back to where it began after creating and deleting a workspace, due to
> the space maps. For example on your system the space maps are using
> 32MB:
>
> > 9.41K   42.2M   31.6M   31.6M   3.36K    1.34   14.01  SPA space map

Right. FYI, the space maps (the things that keep track of which blocks
are in use) are stored as segment lists. On average, this consumes much
less space than a bitmap, and the AVL algorithms scale better.

Space maps have an interesting property: they consume essentially no
disk space when a pool is either 0% full or 100% full. (At 0%, the
allocated space is represented by an empty segment list; at 100%, the
allocated space is represented by a single segment.) All else being
equal, the space maps will be maximally fragmented (and thus using the
most disk space) when the pool is around 50% full. Sort of like income
tax: http://en.wikipedia.org/wiki/Laffer_curve

This is nice because it means that as your pool gets closer and closer
to full, the metadata overhead for space maps goes to zero. So if you
see non-trivial amounts of disk space in your space maps, don't worry:
we'll give it back when you need it.

Jeff
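P.S. If you're curious what that overhead amounts to on your own pool at
its current occupancy, the per-type block stats break it out; "tank" here
is just a placeholder pool name:

  zpool list tank                          # how full the pool is
  zdb -bbb tank | grep 'SPA space map'     # how much the space maps cost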
On Wed, Mar 15, 2006 at 08:26:48PM -0500, Bill Sommerfeld wrote:
> On Wed, 2006-03-15 at 13:47, Matthew Ahrens wrote:
> > Could you run 'zdb -vvv pool/emptyfs'? This will tell us what is using
> > space in that fs.
>
> That command produced about 500KB of output, so I won't include the
> whole thing here; if you want to see the whole thing, let me know.
>
> The usage dropped from 1463KB to 11KB when I unmounted and remounted the
> filesystem (something I didn't try previously).
>
>     Object  lvl   iblk   dblk  lsize  psize  type
>          2    1    16K  73.0K  73.0K  13.0K  ZFS delete queue
>         microzap: 74752 bytes, 1162 entries
>                 14294 = 82580
>                 29fa = 10746
>                 e208 = 57864
>                 1425a = 82522
>                 5d5a = 23898
>                 98c5 = 39109
>                 870c = 34572
>                 ...

Hmm, sounds like there's a problem with processing the delete queue.
Were there any messages printed on the console while or after you
removed the files? (I'm assuming that after you removed the files you
waited until the space used stopped decreasing before running zdb.)

--matt
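P.S. Something as crude as this is enough to tell when the space has
settled before kicking off zdb, if that's useful (path taken from your
earlier mail):

  # watch the "used" column until it stops shrinking
  while :; do
          df -k /export/a/build/sommerfeld | tail -1
          sleep 10
  done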
On Wed, 2006-03-15 at 20:33, Matthew Ahrens wrote:
> Hmm, sounds like there's a problem with processing the delete queue.
> Were there any messages printed on the console while or after you
> removed the files?

No. Nothing has come out of the console since early this afternoon.

> (I'm assuming that after you removed the files you
> waited until the space used stopped decreasing before running zdb.)

Yes. The space used, as reported by df, dropped rapidly after "workspace
delete" finished and then "stuck" at 1463, and remained unchanged across
the run of "zdb -vvv".

- Bill