On Fri, Nov 02, 2018 at 09:14:51AM -0700, Linus Torvalds wrote:> On Fri, Nov 2, 2018 at 6:04 AM Michael S. Tsirkin <mst at redhat.com> wrote: > > > > I've tried making access_ok mask the parameter it gets. > > PLEASE don't do this.Okay.> Just use "copy_to/from_user()".Just for completeness I'd like to point out for vhost the copies are done from the kernel thread. So yes we can switch to copy_to/from_user but for e.g. 32-bit userspace running on top of a 64 bit kernel it is IIUC not sufficient - we must *also* do access_ok checks on control path when addresses are passed to the kernel and when current points to the correct task struct.> We have had lots of bugs because code bitrots.Yes, I wish we did not need these access_ok checks and could just rely on copy_to/from_user.> And no, the access_ok() checks aren't expensive, not even in a loop. > They *used* to be somewhat expensive compared to the access, but that > simply isn't true any more. The real expense in copy_to_user and > friends are in the user access bit setting (STAC and CLAC on x86), > which easily an order of magnitude more expensive than access_ok(). > > So just get rid of the double-underscore version. It's basically > always a mis-optimization due to entirely historical reasons. I can > pretty much guarantee that it's not visible in profiles. > > LinusOK. So maybe we should focus on switching to user_access_begin/end + unsafe_get_user/unsafe_put_user in a loop which does seem to be measureable. That moves the barrier out of the loop, which seems to be consistent with what you would expect. -- MST
On Fri, Nov 2, 2018 at 9:59 AM Michael S. Tsirkin <mst at redhat.com> wrote:> > Just for completeness I'd like to point out for vhost the copies are > done from the kernel thread. So yes we can switch to copy_to/from_user > but for e.g. 32-bit userspace running on top of a 64 bit kernel it is > IIUC not sufficient - we must *also* do access_ok checks on control path > when addresses are passed to the kernel and when current points to the > correct task struct.Don't you take over the VM with "use_mm()" when you do the copies? So yes, it's a kernel thread, but it has a user VM, and though that should have the user limits. No? Linus
On Fri, Nov 2, 2018 at 10:10 AM Linus Torvalds <torvalds at linux-foundation.org> wrote:> > Don't you take over the VM with "use_mm()" when you do the copies? So > yes, it's a kernel thread, but it has a user VM, and though that > should have the user limits.Oooh. *Just* as I sent this, I realized that "use_mm()" doesn't update the thread addr_limit. That actually looks like a bug to me - although one that you've apparently been aware of and worked around. Wouldn't it be nicer to just make "use_mm()" do set_fs(USER_DS); instead? And undo it on unuse_mm()? And, in fact, maybe we should default kernel threads to have a zero address limit, so that they can't do any user accesses at all without doing this? Adding Al to the cc, because I think he's been looking at set_fs() in general. Linus
On Fri, Nov 02, 2018 at 10:10:45AM -0700, Linus Torvalds wrote:> On Fri, Nov 2, 2018 at 9:59 AM Michael S. Tsirkin <mst at redhat.com> wrote: > > > > Just for completeness I'd like to point out for vhost the copies are > > done from the kernel thread. So yes we can switch to copy_to/from_user > > but for e.g. 32-bit userspace running on top of a 64 bit kernel it is > > IIUC not sufficient - we must *also* do access_ok checks on control path > > when addresses are passed to the kernel and when current points to the > > correct task struct. > > Don't you take over the VM with "use_mm()" when you do the copies?Yes we do.> So > yes, it's a kernel thread, but it has a user VM, and though that > should have the user limits. > > No? > > LinusHere's what I meant: we have #define access_ok(type, addr, size) \ ({ \ WARN_ON_IN_IRQ(); \ likely(!__range_not_ok(addr, size, user_addr_max())); \ }) and #define user_addr_max() (current->thread.addr_limit.seg) it seems that it depends on current not on the active mm. get_user and friends are similar: ENTRY(__get_user_1) mov PER_CPU_VAR(current_task), %_ASM_DX cmp TASK_addr_limit(%_ASM_DX),%_ASM_AX jae bad_get_user sbb %_ASM_DX, %_ASM_DX /* array_index_mask_nospec() */ and %_ASM_DX, %_ASM_AX ASM_STAC 1: movzbl (%_ASM_AX),%edx xor %eax,%eax ASM_CLAC ret ENDPROC(__get_user_1) EXPORT_SYMBOL(__get_user_1) -- MST