In 11 and 12 the taskqueue code has been rewritten in this area but
under 10 this bug still occurs.
On our appliances this bug stops the system from mounting the ZFS
root, so it is quite severe.
Basically while the thread is sleeping during the ZFS mount of root
(in the while loop), another thread can free the 'task' item it is
checking in that while loop and it can be reused or filled with
'deadcode' etc., with the waiting code unaware of the change.. The fix
is to refetch the item at the end of the queue each time around the loop.
I don't really want to do the bigger change of MFCing the change in
11, as it is more extensive, though if someone else does, that's ok by
me. (If it's ABI compatible)
Any comments or suggestions?
here's the fix in diff form:
[robot at porridge /usr/src]$ p4 diff -du ...
--- //depot/pbranches/jelischer/FreeBSD-PZ/10.3/sys/kern/subr_taskqueue.c
2016-09-27 09:14:59.000000000 -0700
+++ /usr/src/sys/kern/subr_taskqueue.c 2016-09-27 09:14:59.000000000 -0700
@@ -441,9 +441,10 @@
TQ_LOCK(queue);
task = STAILQ_LAST(&queue->tq_queue, task, ta_link);
- if (task != NULL)
- while (task->ta_pending != 0)
- TQ_SLEEP(queue, task, &queue->tq_mutex, PWAIT,
"-", 0);
+ while (task != NULL && task->ta_pending != 0) {
+ TQ_SLEEP(queue, task, &queue->tq_mutex, PWAIT,
"-", 0);
+ task = STAILQ_LAST(&queue->tq_queue, task, ta_link);
+ }
taskqueue_drain_running(queue);
KASSERT(STAILQ_EMPTY(&queue->tq_queue),
("taskqueue queue is not empty after draining"));
Julian Elischer
2016-Oct-05 23:59 UTC
fix for use-after-free problem in 10.x [review please].
Please review..
https://reviews.freebsd.org/D8160
Direct fix for stable/10 as bug is not present in 11+ in this form.
Julian
On 4/10/2016 8:06 PM, Julian Elischer wrote:> In 11 and 12 the taskqueue code has been rewritten in this area but
> under 10 this bug still occurs.
>
> On our appliances this bug stops the system from mounting the ZFS
> root, so it is quite severe.
> Basically while the thread is sleeping during the ZFS mount of root
> (in the while loop), another thread can free the 'task' item it is
> checking in that while loop and it can be reused or filled with
> 'deadcode' etc., with the waiting code unaware of the change.. The
> fix is to refetch the item at the end of the queue each time around
> the loop.
> I don't really want to do the bigger change of MFCing the change in
> 11, as it is more extensive, though if someone else does, that's ok
> by me. (If it's ABI compatible)
>
> Any comments or suggestions?
>
> here's the fix in diff form:
A slightly better fix is at
https://reviews.freebsd.org/D8160
>
>
> [robot at porridge /usr/src]$ p4 diff -du ...
> ---
> //depot/pbranches/jelischer/FreeBSD-PZ/10.3/sys/kern/subr_taskqueue.c
> 2016-09-27 09:14:59.000000000 -0700
> +++ /usr/src/sys/kern/subr_taskqueue.c 2016-09-27
> 09:14:59.000000000 -0700
> @@ -441,9 +441,10 @@
>
> TQ_LOCK(queue);
> task = STAILQ_LAST(&queue->tq_queue, task, ta_link);
> - if (task != NULL)
> - while (task->ta_pending != 0)
> - TQ_SLEEP(queue, task, &queue->tq_mutex,
> PWAIT, "-", 0);
> + while (task != NULL && task->ta_pending != 0) {
> + TQ_SLEEP(queue, task, &queue->tq_mutex, PWAIT,
"-", 0);
> + task = STAILQ_LAST(&queue->tq_queue, task, ta_link);
> + }
> taskqueue_drain_running(queue);
> KASSERT(STAILQ_EMPTY(&queue->tq_queue),
> ("taskqueue queue is not empty after draining"));
>
On 10/5/16, Julian Elischer <julian at freebsd.org> wrote:> In 11 and 12 the taskqueue code has been rewritten in this area but > under 10 this bug still occurs. > > On our appliances this bug stops the system from mounting the ZFS > root, so it is quite severe. > Basically while the thread is sleeping during the ZFS mount of root > (in the while loop), another thread can free the 'task' item it is > checking in that while loop and it can be reused or filled with > 'deadcode' etc., with the waiting code unaware of the change.. The fix > is to refetch the item at the end of the queue each time around the loop. > I don't really want to do the bigger change of MFCing the change in > 11, as it is more extensive, though if someone else does, that's ok by > me. (If it's ABI compatible) > > Any comments or suggestions?Yes, please commit them. This patch fixes the ZFS + GELI + INVARIANTS problem for us. There is the FreeBSD PR about the issue: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=209580> > here's the fix in diff form: > > > [robot at porridge /usr/src]$ p4 diff -du ... > --- //depot/pbranches/jelischer/FreeBSD-PZ/10.3/sys/kern/subr_taskqueue.c > 2016-09-27 09:14:59.000000000 -0700 > +++ /usr/src/sys/kern/subr_taskqueue.c 2016-09-27 09:14:59.000000000 -0700 > @@ -441,9 +441,10 @@ > > TQ_LOCK(queue); > task = STAILQ_LAST(&queue->tq_queue, task, ta_link); > - if (task != NULL) > - while (task->ta_pending != 0) > - TQ_SLEEP(queue, task, &queue->tq_mutex, PWAIT, "-", > 0); > + while (task != NULL && task->ta_pending != 0) { > + TQ_SLEEP(queue, task, &queue->tq_mutex, PWAIT, "-", 0); > + task = STAILQ_LAST(&queue->tq_queue, task, ta_link); > + } > taskqueue_drain_running(queue); > KASSERT(STAILQ_EMPTY(&queue->tq_queue), > ("taskqueue queue is not empty after draining")); > > _______________________________________________ > freebsd-hackers at freebsd.org mailing list > https://lists.freebsd.org/mailman/listinfo/freebsd-hackers > To unsubscribe, send any mail to "freebsd-hackers-unsubscribe at freebsd.org" >