G'Day Pawel,
Pawel Jakub Dawidek wrote:
> Hi.
>
> I discovered the following deadlock, which can occur when adding a cache
> device. When we call this command:
>
> # zpool add <pool> cache <disk>
>
> It hangs here:
>
> mutex_enter(&l2arc_dev_mtx)
> l2arc_add_vdev()
> spa_load_l2cache()
> spa_vdev_add()
> zfs_ioc_vdev_add()
> zfsdev_ioctl()
> ioctl()
> syscall()
>
> It cannot acquire the l2arc_dev_mtx mutex, because it is already held by
> the l2arc_feed_thread thread. The l2arc_feed_thread cannot release it
> because it hangs here:
>
> cv_wait(&scl->scl_cv, &scl->scl_lock)
> spa_config_enter()
> zio_create()
> zio_write_phys()
> l2arc_feed_thread()
>
> It will wait here forever, because the first process is itself blocked:
> spa_config_exit(), which calls cv_broadcast() for this condvar, is only
> called at the end of spa_vdev_add() (via spa_vdev_exit()).
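For anyone following along, the cycle described above can be modelled in user space. This is only a rough Python sketch under stated assumptions, not ZFS code: two plain locks stand in for the SPA config lock and l2arc_dev_mtx (the real feed thread waits on a condvar inside spa_config_enter()), the function name and thread labels are made up for illustration, and timeouts are used so the model reports the stall instead of hanging.

```python
import threading


def simulate_lock_inversion(timeout=0.2):
    """Model two threads taking the same two locks in opposite order.

    Returns a dict saying whether each thread obtained its second lock.
    All names here are hypothetical stand-ins for the kernel objects.
    """
    config_lock = threading.Lock()    # stand-in for the SPA config lock
    l2arc_dev_mtx = threading.Lock()  # stand-in for l2arc_dev_mtx
    both_hold_first = threading.Barrier(2)
    both_tried_second = threading.Barrier(2)
    got_second = {}

    def worker(name, first, second):
        with first:
            # Make sure each thread holds its first lock before either
            # one goes after the lock held by the other.
            both_hold_first.wait()
            acquired = second.acquire(timeout=timeout)
            got_second[name] = acquired
            if acquired:
                second.release()
            # Keep holding the first lock until both attempts are over,
            # so a timed-out thread cannot unblock the other by exiting.
            both_tried_second.wait()

    threads = [
        # "zpool add" path: holds the config lock, wants l2arc_dev_mtx.
        threading.Thread(target=worker,
                         args=("zpool_add", config_lock, l2arc_dev_mtx)),
        # Feed thread path: holds l2arc_dev_mtx, wants the config lock.
        threading.Thread(target=worker,
                         args=("l2arc_feed", l2arc_dev_mtx, config_lock)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return got_second


if __name__ == "__main__":
    print(simulate_lock_inversion())
```

Neither thread gets its second lock, because each is waiting on something only the other can release. Without the timeouts, both would block indefinitely, which is the hang seen with zpool add.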
Sorry about this deadlock, and thanks - your analysis is spot on, I've
added it to the bug description:
http://bugs.opensolaris.org/view_bug.do?bug_id=6701480
(it might take a few moments before the new description field appears on
the opensolaris.org website).
I've been working on the fix.
cheers,
Brendan
--
Brendan
[CA, USA]