Hi All,

out of curiosity: Can anyone come up with a good idea about why my
snv_111 laptop computer should run more than 1000 zil_clean threads?

ffffff0009a9dc60 fffffffffbc2c030 0 tq:zil_clean
ffffff0009aa3c60 fffffffffbc2c030 0 tq:zil_clean
ffffff0009aa9c60 fffffffffbc2c030 0 tq:zil_clean
ffffff0009aafc60 fffffffffbc2c030 0 tq:zil_clean
ffffff0009ab5c60 fffffffffbc2c030 0 tq:zil_clean
ffffff0009abbc60 fffffffffbc2c030 0 tq:zil_clean
ffffff0009ac1c60 fffffffffbc2c030 0 tq:zil_clean

> ::threadlist!grep zil_clean| wc -l
1037

Thanks, Nils

P.S.: Please don't spend too much time on this. For me, this question is
really academic, but I'd be grateful for any good answers.
Nils,

A zil_clean() is started for each dataset after every txg.
This includes snapshots (which is perhaps a bit inefficient).
Still, zil_clean() is fairly lightweight if there's nothing
to do (grab a non-contended lock; find nothing on a list;
drop the lock & exit).

Neil.

On 09/21/09 08:08, Nils Goroll wrote:
> Hi All,
>
> out of curiosity: Can anyone come up with a good idea about why my
> snv_111 laptop computer should run more than 1000 zil_clean threads?
> [...]
> _______________________________________________
> zfs-discuss mailing list
> zfs-discuss at opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
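The "nothing to do" path Neil describes (take an uncontended lock, find the list empty, drop the lock, return) can be illustrated with a small Python sketch. This is only a model of the described behaviour, not the ZFS source: the class `ZilSketch`, its `pending` list and its `clean()` method are hypothetical names invented for illustration.

```python
import threading
from collections import deque


class ZilSketch:
    """Hypothetical stand-in for a per-dataset intent log (not the real zilog)."""

    def __init__(self):
        self.lock = threading.Lock()
        self.pending = deque()  # records waiting to be cleaned (assumed structure)
        self.cleaned = 0        # running total of records cleaned so far

    def clean(self):
        """Clean all pending records and return how many were cleaned."""
        with self.lock:               # grab the (normally uncontended) lock
            if not self.pending:      # find nothing on the list
                return 0              # lock is dropped on exit: the cheap path
            count = len(self.pending)
            self.pending.clear()
            self.cleaned += count
            return count
```

For a dataset with no outstanding records, such as a snapshot in this model, each call costs one lock acquisition and one emptiness check.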
Thinking more about this, I'm confused about what you are seeing.
The function dsl_pool_zil_clean() will serialise separate calls to
zil_clean() within a pool. I don't expect you have >1037 pools on your laptop!
So I don't know what's going on. What is the typical call stack for those
zil_clean() threads?

Neil.

On 09/21/09 08:53, Neil Perrin wrote:
> Nils,
>
> A zil_clean() is started for each dataset after every txg.
> [...]
Hi Neil and all,

thank you very much for looking into this:

> So I don't know what's going on. What is the typical call stack for those
> zil_clean() threads?

I'd say they are all blocking on their respective CVs:

ffffff0009066c60 fffffffffbc2c030 0 0 60 ffffff01d25e1180
PC: _resume_from_idle+0xf1 TASKQ: zil_clean
stack pointer for thread ffffff0009066c60: ffffff0009066b60
[ ffffff0009066b60 _resume_from_idle+0xf1() ]
swtch+0x147()
cv_wait+0x61()
taskq_thread+0x10b()
thread_start+8()

I should add that I have quite a lot of datasets:

root@haggis:~# zfs list -r -t filesystem | wc -l
49
root@haggis:~# zfs list -r -t volume | wc -l
14
root@haggis:~# zfs list -r -t snapshot | wc -l
6018

Nils
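For comparison with the 1037 threads, the three `wc -l` results above can be turned into a rough dataset total. The sketch below assumes each `zfs list` invocation printed one header line (no `-H` flag appears in the commands), so each count is reduced by one. The variable names are illustrative only.

```python
# Line counts reported by `zfs list ... | wc -l` in the message above.
wc_counts = {"filesystem": 49, "volume": 14, "snapshot": 6018}

# Assumption: every listing included exactly one header line,
# because the commands were run without -H.
HEADER_LINES = 1

datasets = {kind: lines - HEADER_LINES for kind, lines in wc_counts.items()}
total = sum(datasets.values())

print(datasets)
print("total datasets:", total)
```

Under that assumption the total is 6078 datasets, most of them snapshots, which is well above the 1037 zil_clean threads counted earlier.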
> I should add that I have quite a lot of datasets:

and maybe I should also add that I'm still running an old zpool version in order
to keep the ability to boot snv_98:

aggis:~$ zpool upgrade
This system is currently running ZFS pool version 14.

The following pools are out of date, and can be upgraded. After being
upgraded, these pools will no longer be accessible by older software versions.

VER  POOL
---  ------------
13   rpool
I wonder whether a taskq pool suffers from an effect similar to the one
observed for the nfsd pool:

6467988 Minimize the working set of nfsd threads

Created threads are taken round robin out of the taskq loop, do little work,
but wake up at least once every 5 minutes and so are never reaped.

-r

Nils Goroll writes:
> I'd say they are all blocking on their respective CVs:
> [...]
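The effect suggested here can be illustrated with a toy simulation. It is not the Solaris taskq or nfsd code. It assumes tasks arrive at a fixed interval and finish instantly, and that a thread is reaped once it has been idle longer than a timeout. The function name, the parameters and the "mru" comparison policy are all invented for illustration.

```python
def surviving_threads(n_threads, task_interval, idle_timeout, duration, policy):
    """Count threads still alive after `duration` seconds in a toy model.

    policy "round_robin": each task goes to the next live thread in turn.
    policy "mru": each task goes to the same (first live) thread.
    Times are whole seconds; tasks are assumed to complete instantly.
    """
    last_used = [0] * n_threads   # time each thread last ran a task
    alive = [True] * n_threads
    turn = 0
    t = task_interval
    while t <= duration:
        # Reap every thread that has been idle longer than the timeout.
        for i in range(n_threads):
            if alive[i] and t - last_used[i] > idle_timeout:
                alive[i] = False
        live = [i for i in range(n_threads) if alive[i]]
        if live:
            if policy == "round_robin":
                worker = live[turn % len(live)]
                turn += 1
            else:
                worker = live[0]
            last_used[worker] = t   # dispatching resets the idle clock
        t += task_interval
    return sum(alive)


# 10 threads, one task every 10 s, 300 s idle timeout, one simulated hour.
print(surviving_threads(10, 10, 300, 3600, "round_robin"))
print(surviving_threads(10, 10, 300, 3600, "mru"))
```

In this model, round-robin dispatch touches every thread every 100 seconds, so none reaches the 300-second timeout and all 10 survive. Handing every task to the same thread lets the other nine idle out.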