Cornelia Huck
2016-Aug-17 09:06 UTC
[PATCH 06/15] genhd: Add return code to device_add_disk
On Wed, 17 Aug 2016 16:48:23 +0800 Fam Zheng <famz at redhat.com> wrote:> On Wed, 08/17 10:49, Cornelia Huck wrote: > > On Wed, 17 Aug 2016 15:15:06 +0800 > > Fam Zheng <famz at redhat.com> wrote: > > > > > @@ -613,10 +614,8 @@ void device_add_disk(struct device *parent, struct gendisk *disk) > > > disk->flags |= GENHD_FL_UP; > > > > > > retval = blk_alloc_devt(&disk->part0, &devt); > > > - if (retval) { > > > - WARN_ON(1); > > > - return; > > > - } > > > + if (retval) > > > + goto fail; > > > disk_to_dev(disk)->devt = devt; > > > > > > /* ->major and ->first_minor aren't supposed to be > > > @@ -625,16 +624,26 @@ void device_add_disk(struct device *parent, struct gendisk *disk) > > > disk->major = MAJOR(devt); > > > disk->first_minor = MINOR(devt); > > > > > > - disk_alloc_events(disk); > > > + retval = disk_alloc_events(disk); > > > + if (retval) > > > + goto fail; > > > > > > /* Register BDI before referencing it from bdev */ > > > bdi = &disk->queue->backing_dev_info; > > > - bdi_register_owner(bdi, disk_to_dev(disk)); > > > + retval = bdi_register_owner(bdi, disk_to_dev(disk)); > > > + if (retval) > > > + goto fail; > > > > > > - blk_register_region(disk_devt(disk), disk->minors, NULL, > > > - exact_match, exact_lock, disk); > > > - register_disk(parent, disk); > > > - blk_register_queue(disk); > > > + retval = blk_register_region(disk_devt(disk), disk->minors, NULL, > > > + exact_match, exact_lock, disk); > > > + if (retval) > > > + goto fail; > > > + retval = register_disk(parent, disk); > > > + if (retval) > > > + goto fail; > > > + retval = blk_register_queue(disk); > > > + if (retval) > > > + goto fail; > > > > > > /* > > > * Take an extra ref on queue which will be put on disk_release() > > > @@ -644,10 +653,20 @@ void device_add_disk(struct device *parent, struct gendisk *disk) > > > > > > retval = sysfs_create_link(&disk_to_dev(disk)->kobj, &bdi->dev->kobj, > > > "bdi"); > > > + if (retval) > > > + goto fail; > > > + > > > + retval = disk_add_events(disk); > > > + if (retval) > > > + goto fail; > > > + > > > + retval = blk_integrity_add(disk); > > > + if (retval) > > > + goto fail; > > > + return 0; > > > +fail: > > > WARN_ON(retval); > > > - > > > - disk_add_events(disk); > > > - blk_integrity_add(disk); > > > + return retval; > > > } > > > > Noticed this when trying to figure out whether the error handling in > > virtio_blk was correct: > > > > Shouldn't you try to cleanup/rewind so that any structures are in a > > sane state after failure? The caller doesn't know where device_add_disk > > failed, and calling del_gendisk unconditionally like virtio_blk does is > > probably not the right thing to do (at the very least, I don't think > > unregistering a device that has not been registered is likely to work). > > > > Yes, I think all callers need to be reviewed before device_add_disk do the > clean up on error. For this patchset I wanted to keep the change small.But do the callers even have a chance to do this correctly right now? They will either clean up too much, or too little. ('Too little' is probably the more common case, given that you just added error propagation...) Can you make del_gendisk handle devices partially setup via device_add_disk in all cases? Then you could mandate pairing device_add_disk with del_gendisk in all cases, error or not, and you should have a better chance on avoiding introducing new errors.
On Wed, 08/17 11:06, Cornelia Huck wrote:> On Wed, 17 Aug 2016 16:48:23 +0800 > Fam Zheng <famz at redhat.com> wrote: > > > On Wed, 08/17 10:49, Cornelia Huck wrote: > > > On Wed, 17 Aug 2016 15:15:06 +0800 > > > Fam Zheng <famz at redhat.com> wrote: > > > > > > > @@ -613,10 +614,8 @@ void device_add_disk(struct device *parent, struct gendisk *disk) > > > > disk->flags |= GENHD_FL_UP; > > > > > > > > retval = blk_alloc_devt(&disk->part0, &devt); > > > > - if (retval) { > > > > - WARN_ON(1); > > > > - return; > > > > - } > > > > + if (retval) > > > > + goto fail; > > > > disk_to_dev(disk)->devt = devt; > > > > > > > > /* ->major and ->first_minor aren't supposed to be > > > > @@ -625,16 +624,26 @@ void device_add_disk(struct device *parent, struct gendisk *disk) > > > > disk->major = MAJOR(devt); > > > > disk->first_minor = MINOR(devt); > > > > > > > > - disk_alloc_events(disk); > > > > + retval = disk_alloc_events(disk); > > > > + if (retval) > > > > + goto fail; > > > > > > > > /* Register BDI before referencing it from bdev */ > > > > bdi = &disk->queue->backing_dev_info; > > > > - bdi_register_owner(bdi, disk_to_dev(disk)); > > > > + retval = bdi_register_owner(bdi, disk_to_dev(disk)); > > > > + if (retval) > > > > + goto fail; > > > > > > > > - blk_register_region(disk_devt(disk), disk->minors, NULL, > > > > - exact_match, exact_lock, disk); > > > > - register_disk(parent, disk); > > > > - blk_register_queue(disk); > > > > + retval = blk_register_region(disk_devt(disk), disk->minors, NULL, > > > > + exact_match, exact_lock, disk); > > > > + if (retval) > > > > + goto fail; > > > > + retval = register_disk(parent, disk); > > > > + if (retval) > > > > + goto fail; > > > > + retval = blk_register_queue(disk); > > > > + if (retval) > > > > + goto fail; > > > > > > > > /* > > > > * Take an extra ref on queue which will be put on disk_release() > > > > @@ -644,10 +653,20 @@ void device_add_disk(struct device *parent, struct gendisk *disk) > > > > > > > > retval = sysfs_create_link(&disk_to_dev(disk)->kobj, &bdi->dev->kobj, > > > > "bdi"); > > > > + if (retval) > > > > + goto fail; > > > > + > > > > + retval = disk_add_events(disk); > > > > + if (retval) > > > > + goto fail; > > > > + > > > > + retval = blk_integrity_add(disk); > > > > + if (retval) > > > > + goto fail; > > > > + return 0; > > > > +fail: > > > > WARN_ON(retval); > > > > - > > > > - disk_add_events(disk); > > > > - blk_integrity_add(disk); > > > > + return retval; > > > > } > > > > > > Noticed this when trying to figure out whether the error handling in > > > virtio_blk was correct: > > > > > > Shouldn't you try to cleanup/rewind so that any structures are in a > > > sane state after failure? The caller doesn't know where device_add_disk > > > failed, and calling del_gendisk unconditionally like virtio_blk does is > > > probably not the right thing to do (at the very least, I don't think > > > unregistering a device that has not been registered is likely to work). > > > > > > > Yes, I think all callers need to be reviewed before device_add_disk do the > > clean up on error. For this patchset I wanted to keep the change small. > > But do the callers even have a chance to do this correctly right now? > They will either clean up too much, or too little. ('Too little' is > probably the more common case, given that you just added error > propagation...)Right, which is pre-exising.> > Can you make del_gendisk handle devices partially setup via > device_add_disk in all cases? Then you could mandate pairing > device_add_disk with del_gendisk in all cases, error or not, and you > should have a better chance on avoiding introducing new errors. >Of course, the plan is to write patches on top. I'm not cleaning up anything here because I'm concerned callers may double free (and I didn't look hard into that). Fam
On 08/17/2016 05:14 AM, Fam Zheng wrote: ...> Of course, the plan is to write patches on top. I'm not cleaning up anything > here because I'm concerned callers may double free (and I didn't look hard into > that).Aside from Huck's concerns, the changes looked OK from aoe's perspective. -- Ed
Reasonably Related Threads
- [PATCH 06/15] genhd: Add return code to device_add_disk
- [PATCH 06/15] genhd: Add return code to device_add_disk
- [PATCH 06/15] genhd: Add return code to device_add_disk
- [PATCH 06/15] genhd: Add return code to device_add_disk
- [PATCH 06/15] genhd: Add return code to device_add_disk