Hi,

OS: Solaris 10 11/06

zpool list doesn't reflect pool usage stats instantly. Why?

    # ls -l
    total 209769330
    -rw------T   1 root     root     107374182400 Apr 30 14:28 deleteme

    # zpool list
    NAME                    SIZE    USED   AVAIL    CAP  HEALTH     ALTROOT
    wo                      136G    100G   36.0G    73%  ONLINE     -

    # rm deleteme
    # zpool list
    NAME                    SIZE    USED   AVAIL    CAP  HEALTH     ALTROOT
    wo                      136G    100G   36.0G    73%  ONLINE     -      ---> why

..... time passes

    # zpool list
    NAME                    SIZE    USED   AVAIL    CAP  HEALTH     ALTROOT
    wo                      136G   1.06M    136G     0%  ONLINE     -

thanks
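One way to watch the deferred free complete from the shell is to poll
the pool until its capacity settles. A minimal sketch, assuming the
pool name "wo" from the output above; the 10% threshold and 5-second
interval are arbitrary illustrative choices:

    #!/bin/sh
    # Poll until the pool's CAP figure drops below a threshold,
    # i.e. the deferred frees from the rm have completed.
    # Pool name "wo" and the 10% threshold are assumptions.
    while :; do
        cap=`zpool list -H -o capacity wo | tr -d '%'`
        [ "$cap" -lt 10 ] && break
        sleep 5
    done
    zpool list wo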
> zpool list doesn't reflect pool usage stats instantly. Why?
>
>     # ls -l
>     total 209769330
>     -rw------T   1 root     root     107374182400 Apr 30 14:28 deleteme
>
>     # zpool list
>     NAME                    SIZE    USED   AVAIL    CAP  HEALTH     ALTROOT
>     wo                      136G    100G   36.0G    73%  ONLINE     -
>
>     # rm deleteme
>     # zpool list
>     NAME                    SIZE    USED   AVAIL    CAP  HEALTH     ALTROOT
>     wo                      136G    100G   36.0G    73%  ONLINE     -      ---> why
>
> ..... time passes

Same thing happened with logging filesystems at first.  The file is
only marked for deletion; the recovery of the space takes longer.  So
the zpool list information is probably "current", but perhaps not what
you expect.

Logging UFS now reports the result of the rm even if it hasn't been
completed yet, by searching the delete queue.  I would suppose that if
this is a similar issue, it might be addressed as well.  Of course,
actually doing this in ZFS is more complex because of snapshots.

--
Darren Dunham                                          ddunham at taos.com
Senior Technical Consultant         TAOS            http://www.taos.com/
Got some Dr Pepper?                           San Francisco, CA bay area
         < This line left intentionally blank to confuse you. >
> zpool list doesn't reflect pool usage stats instantly. Why?

This is no different to how UFS behaves.  If you rm a file, this uses
the system call unlink(2) to do the work, which is asynchronous.  In
other words, unlink(2) almost immediately returns a successful return
code to rm (which can then exit and return the user to a shell prompt),
while leaving a kernel thread running to actually finish off freeing up
the used space.

Normally you don't see this because it happens very quickly, but once
in a while you blow a 100GB file away, which may well have a
significant amount of metadata associated with it that needs clearing
down.

I guess if you wanted to force this to be synchronous you could do
something like this:

    rm /tank/myfs/bigfile && lockfs /tank/myfs

which would not return until the whole filesystem was flushed back to
disk.  I don't think you can force a flush at a finer granularity than
that.  Anyone?

regards,
--justin
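A small reproduction of the asynchronous behaviour described above, as
a sketch; the path /tank/myfs and the 10GB file size are illustrative
assumptions, and the timings will vary with the pool:

    #!/bin/sh
    # rm returns almost immediately, while the space drains back
    # into the pool asynchronously.
    mkfile 10g /tank/myfs/bigfile
    zpool list                  # USED now includes the 10GB
    time rm /tank/myfs/bigfile  # returns at once despite the size
    zpool list                  # USED may still show the old figure
    sleep 30
    zpool list                  # ... until the deferred frees finish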
> > zpool list doesn't reflect pool usage stats instantly. Why?
>
> This is no different to how UFS behaves.

It is different now (although I spent about 5 minutes looking for an
old bug that would point to *when* the UFS change went in, I wasn't
able to find one).

> If you rm a file, this uses the system call unlink(2) to do the work,
> which is asynchronous.  In other words, unlink(2) almost immediately
> returns a successful return code to rm (which can then exit and
> return the user to a shell prompt), while leaving a kernel thread
> running to actually finish off freeing up the used space.  Normally
> you don't see this because it happens very quickly, but once in a
> while you blow a 100GB file away, which may well have a significant
> amount of metadata associated with it that needs clearing down.

The UFS stat doesn't require synchronization, but rather prediction
based on the delete queue.  Such a solution may be impractical for
ZFS, but it does suggest that this may be a problem worth solving.

Actually, I care less about synchronicity in the 'zpool' output if the
traditional interfaces for the specific filesystem involved can
reflect the intention.

--
Darren Dunham                                          ddunham at taos.com
Senior Technical Consultant         TAOS            http://www.taos.com/
Got some Dr Pepper?                           San Francisco, CA bay area
         < This line left intentionally blank to confuse you. >
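To check whether the filesystem-level interface already reflects the
intention while the pool-level view lags, a comparison along these
lines would do; the dataset name wo/fs and its mountpoint are
assumptions for illustration:

    # Compare the statvfs(2)-backed view (df) with the pool view
    # immediately after the rm.
    rm /wo/fs/deleteme
    df -k /wo/fs       # filesystem view via statvfs(2)
    zpool list wo      # pool view; USED may lag until frees complete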
On Mon, 2007-04-30 at 16:53 -0700, Darren Dunham wrote:
> > > zpool list doesn't reflect pool usage stats instantly. Why?
> >
> > This is no different to how UFS behaves.
>
> It is different now (although I spent about 5 minutes looking for an
> old bug that would point to *when* the UFS change went in, I wasn't
> able to find one).

The motivation for the change was standards compliance -- though there
was at least one instance reported on comp.unix.solaris where deferred
free caused an actual application problem: the application was coded so
that when free space reported by statvfs(2) crossed a low water mark,
it iterated through a log directory deleting files until available
space went above a high water mark.

Anyhow, the primary bug related to this was:

    5012326 delay between unlink/close and space becoming available
            may be arbitrarily long

which integrated into a late build of s10.  IIRC there were some
followup fixes to this which further improved the behavior.

PSARC reviewed a proposal, 2004/372, to add a tunable to select between
"fast" and "correct" behavior.  The proposal was withdrawn when the
current approach was found to be sufficient for UFS.

                                        - Bill
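For illustration, a rough sketch of the pattern that broke; the
directory, thresholds, and use of df -k are assumptions.  With
deferred frees, the available figure did not rise after each rm, so a
loop like this could delete far more than intended:

    #!/bin/sh
    # Delete oldest logs when available space falls below a low water
    # mark, stopping once it rises above a high water mark.
    # Paths and thresholds are illustrative.
    LOGDIR=/var/log/myapp
    LOW=1048576         # low water mark, in KB
    HIGH=5242880        # high water mark, in KB

    avail() {
        # statvfs(2)-derived free space, as reported by df -k
        df -k "$LOGDIR" | awk 'NR == 2 { print $4 }'
    }

    if [ "`avail`" -lt "$LOW" ]; then
        for f in `ls -tr "$LOGDIR"`; do     # oldest files first
            rm -f "$LOGDIR/$f"
            # With deferred frees, avail may not rise here right
            # away -- which is how this loop over-deleted.
            [ "`avail`" -ge "$HIGH" ] && break
        done
    fi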