Steven Rostedt
2006-Sep-29 04:00 UTC
[Xen-devel] [PATCH] Turn blktap tapfds into a link list
The current implementation of blktap has the tapfds descriptors as a static array. Which means that what isn''t used is taking up memory. It also means that if we ever want to dynamically increase the number of descriptors that we allow, we can''t do so. So this patch converts the tapfds from an array into a link list. I use the list_add_rcu and list_for_each_entry_rcu so that I don''t need to protect the list while walking or adding items to it. I never delete a device, so once it is added to the list, it stays. But it wouldn''t be hard with the rcu list operations to add a deletion in the future. This patch also fixed a few bugs that were lying in there. Some where the tapfds[idx] was offset outside the checking of the idx size (although they would probably be very hard to hit). Anyway, I booted this up and it runs. I started a guest with a block device on the blktap, twice, and there was no problems. But... I have to admit, I wrote this when I was tired, so it may have introduced a few bugs as well. I went over it twice, but I could have missed something twice. Please review! But there shouldn''t be anything major wrong with it. -- Steve Signed-off-by: Steven Rostedt <srostedt@redhat.com> _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Andrew Warfield
2006-Sep-29 17:13 UTC
[Xen-devel] Re: [PATCH] Turn blktap tapfds into a link list
Hi Steven, thanks for getting to this so quickly! The patch generally looks good -- a couple of quick thoughts: - Linear searches of the tapfds list are a little grim where they appear in the data path (blktap_ioctl, blktap_kick_user, fast_flush_area, do_block_io_op, dispatch_rw_block_io). If we are happy with a limit of 254 concurrent devices for the immediate term, I wonder if a lookup array indexed by minor and allocated on use might be better? - I enjoyed seeing the domid_translate array go away, I think we can kill this translation all together though by moving the domid/busid lookup out of blktapctrl and into xenbus, and filling it in directly when a new vbd is connected. - With dynamic allocation, MAX_TAP_DEV seems a little unnecessary. Shouldn''t we just allocate until we run out of minors now? This is a great improvement. I know of at least one person that is regularly running blktap with 60-80 vbds -- I''d like to get them to try out the patch as an additional check. Also, because of the changes in allocation and locking I''m inclined to wait until immediately after the 3.0.3 barrier with this one. Sound okay? a. On 9/28/06, Steven Rostedt <srostedt@redhat.com> wrote:> The current implementation of blktap has the tapfds descriptors as a > static array. Which means that what isn''t used is taking up memory. It > also means that if we ever want to dynamically increase the number of > descriptors that we allow, we can''t do so. > > So this patch converts the tapfds from an array into a link list. > > I use the list_add_rcu and list_for_each_entry_rcu so that I don''t need > to protect the list while walking or adding items to it. I never delete > a device, so once it is added to the list, it stays. But it wouldn''t be > hard with the rcu list operations to add a deletion in the future. > > This patch also fixed a few bugs that were lying in there. Some where > the tapfds[idx] was offset outside the checking of the idx size > (although they would probably be very hard to hit). > > Anyway, I booted this up and it runs. I started a guest with a block > device on the blktap, twice, and there was no problems. > > But... I have to admit, I wrote this when I was tired, so it may have > introduced a few bugs as well. I went over it twice, but I could have > missed something twice. Please review! But there shouldn''t be anything > major wrong with it. > > -- Steve > > Signed-off-by: Steven Rostedt <srostedt@redhat.com> > > >_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Steven Rostedt
2006-Sep-29 18:32 UTC
[Xen-devel] Re: [PATCH] Turn blktap tapfds into a link list
Andrew Warfield wrote:> Hi Steven, > > thanks for getting to this so quickly! The patch generally looks goodThanks, it was some what fresh in my head, so I decided to dump it out.> -- a > couple of quick thoughts: > > - Linear searches of the tapfds list are a little grim where they > appear in > the data path (blktap_ioctl, blktap_kick_user, fast_flush_area,I didn''t like this either. Perhaps I could switch it back to an array of pointers. And I could even have the array be able to resize, with the use of rcu locks.> do_block_io_op, dispatch_rw_block_io). If we are happy with a limit of > 254 concurrent devices for the immediate term, I wonder if a lookup > array > indexed by minor and allocated on use might be better?Yeah, I think I do agree with you on this. I really don''t like that linear search. Maybe I did it because I was tired and it seemed cool. ;)> > - I enjoyed seeing the domid_translate array go away, I think we can kill > this translation all together though by moving the domid/busid lookup > out of blktapctrl and into xenbus, and filling it in directly when a > new vbd is connected.This is a separate issue, and would need to be looked at later. (I''m not to sure on the interworkings of that code).> > - With dynamic allocation, MAX_TAP_DEV seems a little unnecessary. > Shouldn''t > we just allocate until we run out of minors now?Sure! I just was keeping it in sync with what was there. The old code didn''t allocate more than MAX_TAP_DEV so I wasn''t about to change it.> > This is a great improvement. I know of at least one person that is > regularly > running blktap with 60-80 vbds -- I''d like to get them to try out the > patch as > an additional check. Also, because of the changes in allocation and > locking > I''m inclined to wait until immediately after the 3.0.3 barrier with this > one. > Sound okay?Sounds fine with me. Thanks, -- Steve _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Andrew Warfield
2006-Sep-29 18:41 UTC
[Xen-devel] Re: [PATCH] Turn blktap tapfds into a link list
> > - Linear searches of the tapfds list are a little grim where they > > appear in > > the data path (blktap_ioctl, blktap_kick_user, fast_flush_area, > > I didn''t like this either. Perhaps I could switch it back to an array > of pointers. And I could even have the array be able to resize, with > the use of rcu locks. > > > do_block_io_op, dispatch_rw_block_io). If we are happy with a limit of > > 254 concurrent devices for the immediate term, I wonder if a lookup > > array > > indexed by minor and allocated on use might be better? > > Yeah, I think I do agree with you on this. I really don''t like that > linear search. Maybe I did it because I was tired and it seemed cool. ;)Fair enough. :) Resizable lookup array of pointers sounds great. Then again 254 pointers may not be the worst overhead. ;)> > - I enjoyed seeing the domid_translate array go away, I think we can kill > > this translation all together though by moving the domid/busid lookup > > out of blktapctrl and into xenbus, and filling it in directly when a > > new vbd is connected. > > This is a separate issue, and would need to be looked at later. (I''m not > to sure on the interworkings of that code).Absolutely, that comment was partially a note-to-self. ;)> > - With dynamic allocation, MAX_TAP_DEV seems a little unnecessary. > > Shouldn''t > > we just allocate until we run out of minors now? > > Sure! I just was keeping it in sync with what was there. The old code > didn''t allocate more than MAX_TAP_DEV so I wasn''t about to change it.Change away -- the current maximum is pretty arbitrary.> > I''m inclined to wait until immediately after the 3.0.3 barrier with this > > one. > > Sound okay? > > Sounds fine with me. Thanks,excellent -- thanks again to both you and Stephen for all the patches this week. The code is much improved now and it''s great to have all the feedback. a. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel