thr3ads.net - llvm dev - [LLVMdev] RFC: implicit null checks in llvm [Apr 2015]

Reid Kleckner

2015-Apr-24 15:26 UTC

[LLVMdev] RFC: implicit null checks in llvm

On Thu, Apr 23, 2015 at 5:17 PM, Andrew Trick <atrick at apple.com> wrote:
> The scheduler itself doesn’t move anything around labels. But any pass can
> perform code motion or load/store optimization. Also, any pass can insert
> an instruction, like a copy, between the label and the load.
>
> I’m not really sure how EH_LABEL ends up translating into exception
> tables, but my guess is that it’s encoding a range that may include any
> arbitrary instructions as long as the call is within the range. So as long
> as calls aren’t reordered with labels and appears to have side effects it
> would work.
>
> So, you could add a different kind of stack map entry that encodes ranges
> instead of exact addresses, then survey all passes to ensure they don’t
> optimize loads across labels. I would have more confidence doing this with
> a pseudo instruction though.
>
I guess your concern is that you need an exact address to do patching.

If this feature is limited to simply allow handling exceptions raised from
loads and stores, then I'm not worried about copies, adds, or other code
being moved into the label range. We don't support catching traps from
those kinds of instructions. I'm only worried about memory accesses being
moved into the range. If we can address that, I think the EH_LABEL approach
works, and it generalizes to other trapping instructions like division or
FP exceptions.
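
To illustrate the range-based entry discussed above, here is a minimal sketch. It is not LLVM's actual exception table or stack map format; FaultRange and findHandler are hypothetical names. The only point is that a fault anywhere inside a label range maps to one handler, so extra non-trapping instructions inside the range are harmless, while an unrelated memory access moved into the range would wrongly be caught.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical table entry: a fault whose PC lies in [begin, end) is
// redirected to handler. Illustrative only, not an LLVM data structure.
struct FaultRange {
  std::uintptr_t begin;
  std::uintptr_t end;      // exclusive
  std::uintptr_t handler;
};

// Linear scan over the table. Returns the matching entry, or nullptr if
// the faulting PC is not covered by any range.
const FaultRange *findHandler(const std::vector<FaultRange> &table,
                              std::uintptr_t faultPC) {
  for (const FaultRange &r : table)
    if (faultPC >= r.begin && faultPC < r.end)
      return &r;
  return nullptr;
}
```

An exact-address scheme would instead compare faultPC for equality against one recorded address per load, which is what patching would need.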

Andrew Trick

2015-Apr-24 21:24 UTC

[LLVMdev] RFC: implicit null checks in llvm

> On Apr 24, 2015, at 8:26 AM, Reid Kleckner <rnk at google.com> wrote:
> 
> On Thu, Apr 23, 2015 at 5:17 PM, Andrew Trick <atrick at apple.com> wrote:
> > The scheduler itself doesn’t move anything around labels. But any pass can
> > perform code motion or load/store optimization. Also, any pass can insert
> > an instruction, like a copy, between the label and the load.
> >
> > I’m not really sure how EH_LABEL ends up translating into exception
> > tables, but my guess is that it’s encoding a range that may include any
> > arbitrary instructions as long as the call is within the range. So as long
> > as calls aren’t reordered with labels and appears to have side effects it
> > would work.
> >
> > So, you could add a different kind of stack map entry that encodes ranges
> > instead of exact addresses, then survey all passes to ensure they don’t
> > optimize loads across labels. I would have more confidence doing this with
> > a pseudo instruction though.
>
> I guess your concern is that you need an exact address to do patching.
>
> If this feature is limited to simply allow handling exceptions raised from
> loads and stores, then I'm not worried about copies, adds, or other code
> being moved into the label range. We don't support catching traps from
> those kinds of instructions. I'm only worried about memory accesses being
> moved into the range. If we can address that, I think the EH_LABEL approach
> works, and it generalizes to other trapping instructions like division or
> FP exceptions.

I think this is riskier than EH_LABELing calls because targets do aggressive
load/store optimization. It does sound plausible though. I imagine teaching
those passes to treat labels as equivalent to unknown stores.
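
A minimal sketch of that rule, on a toy instruction list rather than real MachineInstrs. Kind, Instr, mayWriteMemory and canHoistLoad are all hypothetical names, not LLVM APIs; the sketch only shows a load-motion check refusing to cross a label because the label is treated like a store to unknown memory.

```cpp
#include <cstddef>
#include <vector>

// Toy instruction kinds; illustrative only.
enum class Kind { Load, Store, Label, Other };

struct Instr {
  Kind kind;
};

// Conservative rule: a label is assumed to write unknown memory,
// exactly like a store whose address we know nothing about.
bool mayWriteMemory(const Instr &i) {
  return i.kind == Kind::Store || i.kind == Kind::Label;
}

// Can the load at index `from` be hoisted to just before index `to`?
// Requires to <= from; every instruction skipped over must not write memory.
bool canHoistLoad(const std::vector<Instr> &block, std::size_t from,
                  std::size_t to) {
  if (from >= block.size() || to > from || block[from].kind != Kind::Load)
    return false;
  for (std::size_t i = to; i < from; ++i)
    if (mayWriteMemory(block[i]))
      return false;
  return true;
}
```

The same predicate would have to be honored by every pass that moves or merges loads and stores, which is where the risk mentioned above comes from.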

We don’t need to support patching at the load. Patch points will be needed to
“heal” bad implicit null checks, but that is probably better done by patching
call sites into the optimized code. Eventually, someone may want to be able to
patch their implicit null checks, and they’ll just need to use a patchpoint to
do that instead.

Obviously, if this approach works, it’s better to define a separate
llvm.load_with_trap intrinsic.

FWIW: Now that I think about it, the EH_LABEL approach makes sense and is more
elegant (although it’s risky), but implementing the pseudo instruction would
also be fairly straightforward. You would just mimic the patchpoint logic for
lowering and MC streaming.

Andy



Sanjoy Das

2015-Apr-24 23:14 UTC

[LLVMdev] RFC: implicit null checks in llvm

I don't think we can expose the memory operations directly from a
semantic, theoretical point of view.  Whether practically we can do
this or not is a different question.

Does LLVM do optimizations like these at the machine instruction
level?


   if (condition)
     T = *X  // normal load, condition guards against null

   EH_LABEL // clobbers all
   U = *X  // implicit null check, branches out on fault
   EH_LABEL // clobbers all
   ...

=>

  since the second "load" from X always happens, X must be
  dereferenceable


   T = *X  // miscompile here

   EH_LABEL // clobbers all
   U = *X  // implicit null check, branches out on fault
   EH_LABEL // clobbers all
   ...

The fundamental problem, of course, is that we're hiding the real
control flow which is

 if (!is_dereferenceable(X))  branch_out;
 U = *X
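
For comparison, a minimal sketch of that explicit form in C++. loadChecked is a hypothetical name, and modeling is_dereferenceable(X) as a plain null test is a simplifying assumption. With the branch visible, the load is control-dependent on the check, so nothing can conclude that X is always dereferenceable.

```cpp
// Explicit version of the check that the implicit form hides.
// Returns false (the "branch out" path) instead of loading through null.
bool loadChecked(const int *x, int &out) {
  if (x == nullptr)  // stand-in for !is_dereferenceable(X)
    return false;    // branch_out
  out = *x;          // U = *X, reached only when the check passed
  return true;
}
```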

> We don’t need to support patching at the load. Patch points will be needed
> to “heal” bad implicit null checks, but that is probably better done by
> patching call sites into the optimized code. Eventually, someone may want to
> be able to patch their implicit null checks, and they’ll just need to use a
> patchpoint to do that instead.

Agreed.

-- Sanjoy
