thr3ads.net - llvm dev - [LLVMdev] RFC: implicit null checks in llvm [Apr 2015]

Sanjoy Das

2015-Apr-24 23:14 UTC

[LLVMdev] RFC: implicit null checks in llvm

I don't think we can expose the memory operations directly from a
semantic, theoretical point of view.  Whether practically we can do
this or not is a different question.

Does LLVM do optimizations like these at the machine instruction
level?


   if (condition)
     T = *X  // normal load, condition guards against null

   EH_LABEL // clobbers all
   U = *X  // implicit null check, branches out on fault
   EH_LABEL // clobbers all
   ...

=>

  since the second "load" from X always happens, X must be
  dereferenceable


   T = *X  // miscompile here

   EH_LABEL // clobbers all
   U = *X  // implicit null check, branches out on fault
   EH_LABEL // clobbers all
   ...

The fundamental problem, of course, is that we're hiding the real
control flow which is

 if (!is_dereferenceable(X))  branch_out;
 U = *X
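
As a rough illustration of that hidden control flow, the sketch below writes the check out explicitly in C++. The names `Loaded`, `checked_load` and `guarded_then_checked` are hypothetical, and `is_dereferenceable` is reduced to a plain null test, which is an assumption made only for this example.

```cpp
#include <cassert>

// Result of a checked load: either a value, or "branched out" on a bad pointer.
struct Loaded {
  bool branched_out;
  int value;
};

// Hypothetical stand-in for is_dereferenceable(X); this sketch only treats
// null as non-dereferenceable.
inline bool is_dereferenceable(const int *x) { return x != nullptr; }

// The real control flow behind an implicit null check, written explicitly.
inline Loaded checked_load(const int *x) {
  if (!is_dereferenceable(x))
    return Loaded{true, 0};   // branch_out
  return Loaded{false, *x};   // U = *X
}

// The original program shape: T is loaded only under `condition`, which is
// what guards against null; U is loaded through the check.
inline Loaded guarded_then_checked(bool condition, const int *x, int *t_out) {
  if (condition) {
    assert(x != nullptr && "condition is supposed to guard against null");
    *t_out = *x;              // T = *X, must stay under the guard
  }
  return checked_load(x);
}
```

With the branch visible, nothing suggests that X is dereferenceable before the check, so hoisting `T = *X` out of its guard is clearly not justified.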

> We don’t need to support patching at the load. Patch points will be needed
> to “heal” bad implicit null checks, but that is probably better done by
> patching call sites into the optimized code. Eventually, someone may want to
> be able to patch their implicit null checks, and they’ll just need to use a
> patchpoint to do that instead.

Agreed.

-- Sanjoy

Andrew Trick

2015-Apr-30 01:52 UTC

[LLVMdev] RFC: implicit null checks in llvm

> On Apr 24, 2015, at 4:14 PM, Sanjoy Das <sanjoy at playingwithpointers.com> wrote:
> 
> I don't think we can expose the memory operations directly from a
> semantic, theoretical point of view.  Whether practically we can do
> this or not is a different question.
> 
> Does LLVM do optimizations like these at the machine instruction
> level?
> 
> 
>   if (condition)
>     T = *X  // normal load, condition guards against null
> 
>   EH_LABEL // clobbers all
>   U = *X  // implicit null check, branches out on fault
>   EH_LABEL // clobbers all
>   ...
> 
> =>
> 
>  since the second "load" from X always happens, X must be
>  dereferenceable
> 
> 
>   T = *X  // miscompile here
> 
>   EH_LABEL // clobbers all
>   U = *X  // implicit null check, branches out on fault
>   EH_LABEL // clobbers all
>   ...
> 
> The fundamental problem, of course, is that we're hiding the real
> control flow which is
> 
> if (!is_dereferenceable(X))  branch_out;
> U = *X
That’s a good description of the problem.

Lowering to real loads will *probably* just work because you are being saved by
the EH_LABEL instructions, which are conservatively modeled as having unknown
side effects. The feature that saves you will also defeat optimization of those
loads. I don't see any advantage of this in terms of optimizing codegen. It
is just a workaround to avoid defining pseudo instructions.
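
To make the trade-off concrete, here is a minimal sketch of a toy redundant-load scan in which a barrier instruction with unknown side effects invalidates everything known so far. The `Inst` and `Kind` types and `count_redundant_loads` are illustrative names, not LLVM's MachineInstr API, and the scan is far simpler than any real machine-level optimization.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy instruction kinds; Barrier plays the role of EH_LABEL here.
enum class Kind { Load, Barrier };

struct Inst {
  Kind kind;
  int addr_reg;  // address register for Load; ignored for Barrier
};

// Counts loads that a simple forward scan could prove redundant: a load is
// redundant if an earlier load from the same register is still available.
// A Barrier is treated as having unknown side effects, so it clears the set.
inline std::size_t count_redundant_loads(const std::vector<Inst> &block) {
  std::vector<int> available;
  std::size_t redundant = 0;
  for (const Inst &inst : block) {
    if (inst.kind == Kind::Barrier) {
      available.clear();
      continue;
    }
    assert(inst.addr_reg >= 0 && "loads need a valid address register");
    bool seen = false;
    for (int reg : available)
      if (reg == inst.addr_reg)
        seen = true;
    if (seen)
      ++redundant;
    else
      available.push_back(inst.addr_reg);
  }
  return redundant;
}
```

Two back-to-back loads from the same register give one redundant load, while the same two loads separated by a barrier give none. The barrier that blocks the miscompile blocks the optimization too.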

The optimal implementation would be to leave the explicit null check in place.
Late in the pipeline, just before post-ra scheduling, a pass would combine
and+cmp+br+load when it is profitable using target hooks like
getLdStBaseRegImmOfsWidth(). Note that we still have alias information in the
form of machine mem operands.
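
A hedged sketch of the kind of legality test such a late pass might apply is shown below. Everything here is an assumption for illustration: `LoadDesc` stands in for whatever a target hook would report about the load, `kAssumedGuardBytes` is a made-up size for the unmapped region at address zero, and the rule "same base register, access entirely inside that region" is one plausible criterion, not something stated in this thread.

```cpp
#include <cassert>
#include <cstdint>

// Toy description of a load: base register, immediate offset, access width.
struct LoadDesc {
  int base_reg;
  std::int64_t offset;
  std::int64_t width;
};

// Toy description of the explicit check that precedes the load.
struct NullCheck {
  int checked_reg;  // register compared against null before the branch
};

// Assumed size of the unmapped region at address zero; platform-specific.
constexpr std::int64_t kAssumedGuardBytes = 4096;

// Returns true if the explicit check could be dropped and the load itself
// relied upon to fault when the checked register is null.
inline bool can_fold(const NullCheck &check, const LoadDesc &load) {
  assert(check.checked_reg >= 0 && load.width > 0);
  if (load.base_reg != check.checked_reg)
    return false;  // the load does not go through the checked pointer
  // null + offset must still land inside the region that is known to fault.
  return load.offset >= 0 && load.offset + load.width <= kAssumedGuardBytes;
}
```

A load at a small offset from the checked register qualifies. A load at a large offset, or through a different register, keeps its explicit check.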

You could take a step in that direction without doing much backend work by
lowering to pseudo-loads during ISEL instead of using EH_LABEL. Then the various
load/store optimizations could be taught to explicitly optimize normal loads and
stores over the pseudo loads but not among them.

Andy

Sanjoy Das

2015-Jun-02 21:42 UTC

[LLVMdev] RFC: implicit null checks in llvm

I decided to go with Andy's suggestion of lowering explicit null
checks into implicit null checks late, after register allocation.  The
tip of the change is at http://reviews.llvm.org/D10201.

-- Sanjoy

