I've been struggling to fully understand why disk space seems to vanish. I've dug through bits of code and reviewed all the mails on the subject that I can find, but I still don't have a proper understanding of what's going on.

I did a test with a local zpool on snv_97... zfs list, zpool list, and zdb all seem to disagree on how much space is available. In this case it's only a discrepancy of about 20G or so, but I've got Thumpers that have a discrepancy of over 6TB!

Can someone give a really detailed explanation about what's going on?

block traversal size 670225837056 != alloc 720394438144 (leaked 50168601088)

        bp count:            15182232
        bp logical:      672332631040    avg:  44284
        bp physical:     669020836352    avg:  44066    compression:  1.00
        bp allocated:    670225837056    avg:  44145    compression:  1.00
        SPA allocated:   720394438144    used: 96.40%

Blocks  LSIZE   PSIZE   ASIZE     avg    comp   %Total  Type
    12   120K   26.5K   79.5K   6.62K    4.53     0.00  deferred free
     1    512     512   1.50K   1.50K    1.00     0.00  object directory
     3  1.50K   1.50K   4.50K   1.50K    1.00     0.00  object array
     1    16K   1.50K   4.50K   4.50K   10.67     0.00  packed nvlist
     -      -       -       -       -       -        -  packed nvlist size
    72  8.45M    889K   2.60M   37.0K    9.74     0.00  bplist
     -      -       -       -       -       -        -  bplist header
     -      -       -       -       -       -        -  SPA space map header
   974  4.48M   2.65M   7.94M   8.34K    1.70     0.00  SPA space map
     -      -       -       -       -       -        -  ZIL intent log
 96.7K  1.51G    389M    777M   8.04K    3.98     0.12  DMU dnode
    17  17.0K   8.50K   17.5K   1.03K    2.00     0.00  DMU objset
     -      -       -       -       -       -        -  DSL directory
    13  6.50K   6.50K   19.5K   1.50K    1.00     0.00  DSL directory child map
    12  6.00K   6.00K   18.0K   1.50K    1.00     0.00  DSL dataset snap map
    14  38.0K   10.0K   30.0K   2.14K    3.80     0.00  DSL props
     -      -       -       -       -       -        -  DSL dataset
     -      -       -       -       -       -        -  ZFS znode
     2     1K      1K      2K      1K    1.00     0.00  ZFS V0 ACL
 5.81M   558G    557G    557G   95.8K    1.00    89.27  ZFS plain file
  382K   301M    200M    401M   1.05K    1.50     0.06  ZFS directory
     9  4.50K   4.50K   9.00K      1K    1.00     0.00  ZFS master node
    12   482K   20.0K   40.0K   3.33K   24.10     0.00  ZFS delete queue
 8.20M  66.1G   65.4G   65.8G   8.03K    1.01    10.54  zvol object
     1    512     512      1K      1K    1.00     0.00  zvol prop
     -      -       -       -       -       -        -  other uint8[]
     -      -       -       -       -       -        -  other uint64[]
     -      -       -       -       -       -        -  other ZAP
     -      -       -       -       -       -        -  persistent error log
     1   128K   10.5K   31.5K   31.5K   12.19     0.00  SPA history
     -      -       -       -       -       -        -  SPA history offsets
     -      -       -       -       -       -        -  Pool properties
     -      -       -       -       -       -        -  DSL permissions
     -      -       -       -       -       -        -  ZFS ACL
     -      -       -       -       -       -        -  ZFS SYSACL
     -      -       -       -       -       -        -  FUID table
     -      -       -       -       -       -        -  FUID table size
     5  3.00K   2.50K   7.50K   1.50K    1.20     0.00  DSL dataset next clones
     -      -       -       -       -       -        -  scrub work queue
 14.5M   626G    623G    624G   43.1K    1.00   100.00  Total

real    21m16.862s
user    0m36.984s
sys     0m5.757s

======================================================
Looking at the data:

root@quadra ~$ zfs list backup && zpool list backup
NAME     USED  AVAIL  REFER  MOUNTPOINT
backup   685G   237K    27K  /backup
NAME     SIZE   USED  AVAIL   CAP  HEALTH  ALTROOT
backup   696G   671G  25.1G   96%  ONLINE  -

So zdb says 626GB is used, zfs list says 685GB is used, and zpool list says 671GB is used. The pool was filled to 100% capacity via dd; this is confirmed, since I can't write any more data, yet zpool list says it's only at 96%.

benr.
-- 
This message posted from opensolaris.org
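(A per-object-type breakdown like the one above is what a timed zdb block traversal prints. A minimal sketch of how such a report is produced, assuming the pool name "backup" from the listing and assuming zdb -bb was the invocation used:

    # time zdb -bb backup

zdb -bb walks every block pointer in the pool, compares the traversal total against the SPA space maps, which is where the "leaked" line comes from, and prints the per-type table; the real/user/sys lines come from time(1).)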
No takers? :)

benr.
-- 
This message posted from opensolaris.org
Hello there... I have seen that already, and talked with some people about it without getting an answer either. ;-)

Actually, this week I did not see a discrepancy between the tools, but the pool information (space used) was wrong. Exporting/importing, scrubbing, etc. did not solve it. I know that ZFS is "async" in its status reporting ;-), but only after a reboot was the status OK again.

ps.: b89

Leal.
-- 
This message posted from opensolaris.org
> No takers? :)
>
> benr.

I'm quite curious about finding out about this too, to be honest. :) And it's not just ZFS on Solaris, because I've filled up and imported pools into ZFS Fuse 0.5.0 (which is based on the latest ZFS code) on Linux, and on FreeBSD too.
-- 
This message posted from opensolaris.org
I guess difficult questions go unanswered. :(
-- 
This message posted from opensolaris.org
On Sun, 2 Nov 2008, Turanga Leela wrote:

> I guess difficult questions go unanswered :(

Mailing lists are not an efficient substitute for reading the documentation, blogs, wikis, and existing mailing list postings. Your question is a FAQ. Google is your friend.

Bob
=====================================
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/
Are you running this on a live pool? If so, zdb can't get a reliable block count -- and zdb -L [live pool] emits a warning to that effect.

Jeff

On Thu, Oct 16, 2008 at 03:36:25AM -0700, Ben Rockwood wrote:
> I've been struggling to fully understand why disk space seems to vanish. I've dug through bits of code and reviewed all the mails on the subject that I can find, but I still don't have a proper understanding of what's going on.
>
> I did a test with a local zpool on snv_97... zfs list, zpool list, and zdb all seem to disagree on how much space is available. In this case it's only a discrepancy of about 20G or so, but I've got Thumpers that have a discrepancy of over 6TB!
>
> Can someone give a really detailed explanation about what's going on?
>
> [...]
>
> So zdb says 626GB is used, zfs list says 685GB is used, and zpool list says 671GB is used. The pool was filled to 100% capacity via dd; this is confirmed, since I can't write any more data, yet zpool list says it's only at 96%.
>
> benr.
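(A minimal sketch of how to get a block count zdb can stand behind, assuming the pool can be taken offline briefly and that this build's zdb accepts -e for examining exported pools: export the pool so nothing changes underneath the traversal, run the block statistics, then import it again.

    # zpool export backup
    # zdb -e -bb backup
    # zpool import backup

With the pool quiesced, the block traversal total and the SPA allocated figure should agree and the "leaked" message should go away.)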
Ben Rockwood wrote:
> I've been struggling to fully understand why disk space seems to vanish. I've dug through bits of code and reviewed all the mails on the subject that I can find, but I still don't have a proper understanding of what's going on.
>
> I did a test with a local zpool on snv_97... zfs list, zpool list, and zdb all seem to disagree on how much space is available. In this case it's only a discrepancy of about 20G or so, but I've got Thumpers that have a discrepancy of over 6TB!
>
> [...]
>
> So zdb says 626GB is used, zfs list says 685GB is used, and zpool list says 671GB is used. The pool was filled to 100% capacity via dd; this is confirmed, since I can't write any more data, yet zpool list says it's only at 96%.

Unconsumed reservations would cause the space used according to "zfs list" to be more than according to "zpool list". Also, I assume you are not using RAID-Z. As Jeff mentioned, zdb is not reliable on pools that are changing.

A percentage of the total space is reserved for pool overhead and is not allocatable, but shows up as available in "zpool list".

--matt
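(A quick way to test the reservation theory, sketched against the pool name from the original post: list the reservation properties for every dataset and zvol in the pool. Non-sparse zvols carry a reservation equal to their volsize by default, which "zfs list" counts as used even before the blocks are written, and the zdb output shows this pool holds roughly 66G of zvol objects, so an under-filled zvol is a plausible suspect. Note that the refreservation property may not exist on older builds.

    # zfs get -r -o name,property,value reservation,refreservation backup
    # zfs list -r -o name,used,available,referenced,reservation backup

If the reservations roughly match the gap between the zfs list and zpool list figures, that accounts for the 685G vs 671G difference.)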
> A percentage of the total space is reserved for pool overhead and is not
> allocatable, but shows up as available in "zpool list".

Something to change/show in the future?

-- 
Leal
[http://www.posix.brte.com.br/blog]
-- 
This message posted from opensolaris.org