Robert Milkowski wrote:
> Hi,
>
> I'm replicating some zfs file systems between some servers.
> Recently, from time to time, a server which is doing zfs recv hangs all
> operations while writing to the zfs pool. zfs recv is "stuck", and so is an
> attempt to zfs create xxx or sync.
> Creating a new file system or files in another pool on the same server works
> fine.
>
> All vdevs and pools are healthy and there are no other issues.
> It looks like some kind of deadlock. IIRC there was some kind of 3-way
> deadlock, but it was fixed in b111b (or the new code which introduced it was
> backed out), so it is probably something else.
>
> I can provide a forced crashdump if someone is interested in helping... :)
>
>
Sounds like bug 6826836, introduced in build 111 and fixed in build 113.
-tim
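
To make the build range in the reply above concrete, here is a minimal sketch that checks whether a given build string falls between the build where the bug was introduced (111) and the build where it was fixed (113). It assumes build strings of the form snv_NNN with an optional letter suffix, as in the snv_111b quoted below. The function names build_number and possibly_affected are hypothetical, chosen only for illustration, and a match on the build range does not prove that a given hang is this bug.

```python
import re

# Build range taken from the reply above: bug 6826836 was introduced in
# build 111 and fixed in build 113.
INTRODUCED_IN = 111
FIXED_IN = 113


def build_number(release):
    """Extract the numeric build from a string such as 'snv_111b'.

    Assumption: the string contains 'snv_' followed by digits; any
    trailing letter (the 'b' in 'snv_111b') is ignored.
    """
    match = re.search(r"snv_(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognised build string: {release!r}")
    return int(match.group(1))


def possibly_affected(release):
    """Return True if the build lies in the range [INTRODUCED_IN, FIXED_IN)."""
    return INTRODUCED_IN <= build_number(release) < FIXED_IN


if __name__ == "__main__":
    # The server in the quoted report runs snv_111b.
    print(possibly_affected("snv_111b"))
```

On a live system the build string could be taken from the output of uname -v or from /etc/release, which is again an assumption about the environment rather than something stated in this thread.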
> x4500 with Open Solaris snv_111b
>
> SolarisCAT(live/11X)> ps -ef | egrep "zfs|sync"
> root 24820 24819 0 13:40:45 ? 0:00 bash -c /usr/local/bin/mbuffer -s 65536 -m 1024M | pfexec /usr/sbin/zfs recv
> root 25356 25326 0 13:50:56 pts/8 0:00 zfs create archive-2/tmp
> root 24822 24820 0 13:40:45 ? 0:13 /usr/sbin/zfs recv -dF -v archive-2/repl
> root 25576 25567 0 13:54:04 pts/9 0:00 sync
> SolarisCAT(live/11X)> tlist -l proc 24822
> ==== user (LWP_SYS) thread: 0xffffff04e6604080 PID: 24822 ===> cmd: /usr/sbin/zfs recv -dF -v archive-2/repl
> fmri: svc:/network/shell:default
> t_wchan: 0xffffff04de46055e sobj: condition var (from zfs:txg_wait_open+0x7a)
> t_procp: 0xffffff05a26f16e8
> p_as: 0xffffff04e51c7548 size: 5472256 RSS: 2707456
> hat: 0xffffff04dbe9c6e8
> cpuset:
> zone: global
> t_stk: 0xffffff001f484f10 sp: 0xffffff001f484620 t_stkbase: 0xffffff001f480000
> t_pri: 59(TS) pctcpu: 0.000000
> t_lwp: 0xffffff04e5f216c0 lwp_regs: 0xffffff001f484f10
> mstate: LMS_SLEEP ms_prev: LMS_SYSTEM
> ms_state_start: 11 minutes 9.816633850 seconds earlier
> ms_start: 12 minutes 9.081459673 seconds earlier
> psrset: 0 last CPU: 2
> idle: 79631 ticks (13 minutes 16.31 seconds)
> start: Thu Jun 11 13:40:45 2009
> age: 855 seconds (14 minutes 15 seconds)
> syscall: #54 ioctl(, 0x0) ()
> tstate: TS_SLEEP - awaiting an event
> tflg: T_DFLTSTK - stack is default size
> tpflg: TP_TWAIT - wait to be freed by lwp_wait
> TP_MSACCT - collect micro-state accounting information
> tsched: TS_LOAD - thread is in memory
> TS_DONT_SWAP - thread/LWP should not be swapped
> pflag: SMSACCT - process is keeping micro-state accounting
> SMSFORK - child inherits micro-state accounting
>
> pc: unix:_resume_from_idle+0xf1 resume_return: addq $0x8,%rsp
>
> unix:_resume_from_idle+0xf1 resume_return()
> unix:swtch+0x147()
> genunix:cv_wait+0x61()
> zfs:txg_wait_open+0x7a()
> zfs:dmu_tx_wait+0xb3()
> zfs:dmu_tx_assign+0x4b()
> zfs:dmu_free_long_range_impl+0x12a()
> zfs:dmu_free_long_range+0x5b()
> zfs:dmu_object_reclaim+0x112()
> zfs:restore_object+0xff()
> zfs:dmu_recv_stream+0x48d()
> zfs:zfs_ioc_recv+0x2c0()
> zfs:zfsdev_ioctl+0x10b()
> genunix:cdev_ioctl+0x45()
> specfs:spec_ioctl+0x83()
> genunix:fop_ioctl+0x7b()
> genunix:ioctl+0x18e()
> unix:_syscall32_save+0xbf()
> -- switch to user thread's user stack --
>
>
> 1 thread for that process found.
>
> SolarisCAT(live/11X)> tlist -l proc 25356
> ==== user (LWP_SYS) thread: 0xffffff05a36bae00 PID: 25356 ===> cmd: zfs create archive-2/tmp
> fmri: svc:/network/ssh:default
> t_wchan: 0xffffff04de46055a sobj: condition var (from zfs:txg_wait_synced+0x7f)
> t_procp: 0xffffff051e5996c8
> p_as: 0xffffff051a234e38 size: 7806976 RSS: 2531328
> hat: 0xffffff052f664448
> cpuset:
> zone: global
> t_stk: 0xffffff001eccbf10 sp: 0xffffff001eccb990 t_stkbase: 0xffffff001ecc7000
> t_pri: 59(TS) pctcpu: 0.000000
> t_lwp: 0xffffff05d9201220 lwp_regs: 0xffffff001eccbf10
> mstate: LMS_SLEEP ms_prev: LMS_SYSTEM
> ms_state_start: 2 minutes 2.018084726 seconds earlier
> ms_start: 2 minutes 2.064748896 seconds earlier
> psrset: 0 last CPU: 2
> idle: 24851 ticks (4 minutes 8.51 seconds)
> start: Thu Jun 11 13:50:56 2009
> age: 248 seconds (4 minutes 8 seconds)
> syscall: #54 ioctl(, 0x0) ()
> tstate: TS_SLEEP - awaiting an event
> tflg: T_DFLTSTK - stack is default size
> tpflg: TP_TWAIT - wait to be freed by lwp_wait
> TP_MSACCT - collect micro-state accounting information
> tsched: TS_LOAD - thread is in memory
> TS_DONT_SWAP - thread/LWP should not be swapped
> pflag: SMSACCT - process is keeping micro-state accounting
> SMSFORK - child inherits micro-state accounting
>
> pc: unix:_resume_from_idle+0xf1 resume_return: addq $0x8,%rsp
>
> unix:_resume_from_idle+0xf1 resume_return()
> unix:swtch+0x147()
> genunix:cv_wait+0x61()
> zfs:txg_wait_synced+0x7f()
> zfs:dsl_sync_task_group_wait+0xee()
> zfs:dsl_sync_task_do+0x65()
> zfs:dmu_objset_create+0x142()
> zfs:zfs_ioc_create+0x1e7()
> zfs:zfsdev_ioctl+0x10b()
> genunix:cdev_ioctl+0x45()
> specfs:spec_ioctl+0x83()
> genunix:fop_ioctl+0x7b()
> genunix:ioctl+0x18e()
> unix:_syscall32_save+0xbf()
> -- switch to user thread's user stack --
>
>
> 1 thread for that process found.
>
> SolarisCAT(live/11X)> tlist -l proc 25576
> ==== user (LWP_SYS) thread: 0xffffff05a3246ac0 PID: 25576 ===> cmd: sync
> fmri: svc:/network/ssh:default
> t_wchan: 0xffffff04de46055a sobj: condition var (from zfs:txg_wait_synced+0x7f)
> t_procp: 0xffffff05a2657a90
> p_as: 0xffffff051e9f1c60 size: 4165632 RSS: 1081344
> hat: 0xffffff051e8977a8
> cpuset:
> zone: global
> t_stk: 0xffffff001f96bf10 sp: 0xffffff001f96bd50 t_stkbase: 0xffffff001f967000
> t_pri: 59(TS) pctcpu: 0.000000
> t_lwp: 0xffffff05d778a280 lwp_regs: 0xffffff001f96bf10
> mstate: LMS_SLEEP ms_prev: LMS_SYSTEM
> ms_state_start: 56.644445127 seconds later
> ms_start: 56.641730136 seconds later
> psrset: 0 last CPU: 2
> idle: 6985 ticks (1 minutes 9.85 seconds)
> start: Thu Jun 11 13:54:03 2009
> age: 70 seconds (1 minutes 10 seconds)
> syscall: #36 sync(, 0x0) ()
> tstate: TS_SLEEP - awaiting an event
> tflg: T_DFLTSTK - stack is default size
> tpflg: TP_TWAIT - wait to be freed by lwp_wait
> TP_MSACCT - collect micro-state accounting information
> tsched: TS_LOAD - thread is in memory
> TS_DONT_SWAP - thread/LWP should not be swapped
> pflag: SMSACCT - process is keeping micro-state accounting
> SMSFORK - child inherits micro-state accounting
>
> pc: unix:_resume_from_idle+0xf1 resume_return: addq $0x8,%rsp
>
> unix:_resume_from_idle+0xf1 resume_return()
> unix:swtch+0x147()
> genunix:cv_wait+0x61()
> zfs:txg_wait_synced+0x7f()
> zfs:spa_sync_allpools+0x76()
> zfs:zfs_sync+0xce()
> genunix:vfs_sync+0x9c()
> genunix:syssync+0xb()
> unix:_syscall32_save+0xbf()
> -- switch to user thread's user stack --
>
>
> 1 thread for that process found.
>
> SolarisCAT(live/11X)>