>> >> >> I think this is a good point, here's a suggestion:
>> >> >>
>> >> >> Have the metadata name two functions, both assumed to have the
>> >> >> same signature as the tagged function, one which returns the
>> >> >> offset of the start of the allocated region and one which
>> >> >> returns the length of the allocated region. Alternatively,
>> >> >> these functions could take the same signature and additionally
>> >> >> the returned pointer of the tagged function, and then one
>> >> >> function can return the start of the region and the other the
>> >> >> length.
>> >> >
>> >> > Ok, so this seems to be the most general proposal, which can
>> >> > obviously handle all cases.
>> >>
>> >> I agree. Variation: have one function return the offset of the
>> >> start of the memory, and the other the offset of the end of the
>> >> memory (or the end plus 1), i.e. a range. This seems more uniform
>> >> to me, but I don't have a strong preference.
>> >>
>> >> > Something like this would work:
>> >> >
>> >> > define i8* @foo() {
>> >> >   %0 = tail call i32 @get_realloc_size(i8* null, i32 42)
>> >> >   %call = tail call i8* @my_recalloc(i8* null, i32 42)
>> >> >           nounwind, !alloc_size !{i32 %0}
>> >> >   ret i8* %call
>> >> > }
>> >> >
>> >> > Basically I just added a function call as the metadata (it's not
>> >> > currently possible to add the function itself to the metadata;
>> >> > the function call is required instead).
>> >> > As long as the function is marked as readnone, I think it
>> >> > shouldn't interfere with the optimizers, and we can have a later
>> >> > pass to drop the metadata and remove the calls. I still don't
>> >> > like having the explicit calls there, though. Any suggestions
>> >> > for removing the function calls from there?
>> >>
>> >> How about this:
>> >>
>> >> define i32 @lo(i32) {
>> >>   ret i32 0
>> >> }
>> >>
>> >> define i32 @hi(i32 %n) {
>> >>   ret i32 %n
>> >> }
>> >>
>> >> declare i8* @wonder_allocator(i32)
>> >>
>> >> define i8* @foo(i32 %n) {
>> >>   %r = call i8* @wonder_allocator(i32 %n), !alloc !0
>> >>   ret i8* %r
>> >> }
>> >>
>> >> !0 = metadata !{ i32 (i32)* @lo, i32 (i32)* @hi }
>> >
>> > This is the format that I had in mind.
>> >
>> >> The main problem I see is that if you declare @lo and @hi to have
>> >> internal linkage then the optimizers will zap them. Maybe there's
>> >> a neat solution to that.
>> >
>> > I would consider the optimizer doing this a feature, not a problem.
>> > That having been said, we need to make sure that the optimizer does
>> > not zap them before the analysis/instrumentation passes get to run.
>>
>> This is actually non-trivial to accomplish.
>> Metadata doesn't count as a user, so internal functions with no
>> other usage will get removed.
>
> I thought that it is possible to have passes run before the optimizer
> performs such deletions. Is this not practical? Another option is to
> change the current implementation to delete such functions in two
> phases: in the first phase we leave functions with metadata
> references. In the second phase (which runs near the end of the
> pipeline) we delete functions regardless of metadata references.

Right now, if you list the users of a Value, the references coming from
metadata won't appear. Metadata is not a user and doesn't count towards
the number of uses of a value. That's why anything referenced only from
metadata, including constant expressions, risks disappearing.
Leaving unused functions to be removed only after all optimizations
have run could be done. But then you would probably want to, for
example, patch the pass manager so that it didn't run a function pass
over dead functions, and so on.

>> On the other hand, we could use @llvm.used to make sure the functions
>> had (at least) one user, but that's probably equivalent to not using
>> internal linkage. And I still want to make sure that these functions
>> disappear in the final binary.
>>
>> Another thing that bothers me is the implementation of the objectsize
>> intrinsic. This intrinsic returns the *constant* size of the pointed
>> object given as argument (if the object has a constant size). However,
>> with this function scheme, the implementation would be a bit heavy,
>> since it would need to inline the @lo and @hi functions, simplify the
>> resulting expression, and then check if the result is a ConstantInt.
>> And remember that in general these functions can be arbitrarily
>> complex.
>
> I agree; we'd need to use SCEV or some other heavyweight mechanism to
> do the analysis. In some sense, however, that would be the price of
> generality. On the other hand, I see no reason why we could not write
> a utility function that could accomplish all of that, so we'd only
> need to work out the details once.

SCEV is not the answer here. You just want to know if the result of a
function is constant given a set of parameters. Inlining +
simplifications should do it. But doing an inlining trial is expensive.

> I think that this kind of issue will come up again in the future. Any
> time someone asks, "how can a frontend pass <some complicated
> constraint or information> to the backend", this kind of functionality
> will be needed.

Yes. Maybe we should have a separate mini-expression language for the
metadata? I dunno if it's worth the effort.

Nuno
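[For concreteness, pinning @lo and @hi with @llvm.used as mentioned
above would look roughly like this sketch, in the typed-pointer IR used
elsewhere in this thread:

  @llvm.used = appending global [2 x i8*] [
      i8* bitcast (i32 (i32)* @lo to i8*),
      i8* bitcast (i32 (i32)* @hi to i8*)
  ], section "llvm.metadata"

This keeps the two functions from being discarded as dead, but, as
noted above, they would then survive into the final binary unless a
late pass strips the array again.]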
Hi Nuno,

>>> This is actually non-trivial to accomplish.
>>> Metadata doesn't count as a user, so internal functions with no
>>> other usage will get removed.
>>
>> I thought that it is possible to have passes run before the optimizer
>> performs such deletions. Is this not practical? Another option is to
>> change the current implementation to delete such functions in two
>> phases: in the first phase we leave functions with metadata
>> references. In the second phase (which runs near the end of the
>> pipeline) we delete functions regardless of metadata references.
>
> Right now, if you list the users of a Value, the references coming
> from metadata won't appear. Metadata is not a user and doesn't count
> towards the number of uses of a value. That's why anything referenced
> only from metadata, including constant expressions, risks
> disappearing.
> Leaving unused functions to be removed only after all optimizations
> have run could be done. But then you would probably want to, for
> example, patch the pass manager so that it didn't run a function pass
> over dead functions, and so on.

The functions could be declared to have linkonce_odr linkage. That way
they will be zapped after the inliner runs, but shouldn't be removed
before.

>>> Another thing that bothers me is the implementation of the
>>> objectsize intrinsic. This intrinsic returns the *constant* size of
>>> the pointed object given as argument (if the object has a constant
>>> size). However, with this function scheme, the implementation would
>>> be a bit heavy, since it would need to inline the @lo and @hi
>>> functions, simplify the resulting expression, and then check if the
>>> result is a ConstantInt. And remember that in general these
>>> functions can be arbitrarily complex.
>>
>> I agree; we'd need to use SCEV or some other heavyweight mechanism to
>> do the analysis. In some sense, however, that would be the price of
>> generality. On the other hand, I see no reason why we could not write
>> a utility function that could accomplish all of that, so we'd only
>> need to work out the details once.
>
> SCEV is not the answer here. You just want to know if the result of a
> function is constant given a set of parameters. Inlining +
> simplifications should do it. But doing an inlining trial is
> expensive.

The hi/lo functions could be declared always_inline. Thus they will
always be inlined, either by the always-inliner pass or the usual one.
You would need to insert the instrumentation code or whatever that uses
hi/lo before any inliner runs, and run optimizations such as turning
objectsize into a constant after the inliner runs.

Ciao,

Duncan.
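[Duncan's suggestion, expressed in IR, would look something like the
sketch below; note that in the IR itself the attribute is spelled
alwaysinline:

  define linkonce_odr i32 @lo(i32 %n) alwaysinline {
    ret i32 0
  }

  define linkonce_odr i32 @hi(i32 %n) alwaysinline {
    ret i32 %n
  }

With linkonce_odr the definitions become discardable once no references
remain, and alwaysinline guarantees the inliner substitutes their
bodies wherever a pass has materialized calls to them.]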
Hi,

Sorry for the delay; comments below.

>>>> This is actually non-trivial to accomplish.
>>>> Metadata doesn't count as a user, so internal functions with no
>>>> other usage will get removed.
>>>
>>> I thought that it is possible to have passes run before the
>>> optimizer performs such deletions. Is this not practical? Another
>>> option is to change the current implementation to delete such
>>> functions in two phases: in the first phase we leave functions with
>>> metadata references. In the second phase (which runs near the end
>>> of the pipeline) we delete functions regardless of metadata
>>> references.
>>
>> Right now, if you list the users of a Value, the references coming
>> from metadata won't appear. Metadata is not a user and doesn't count
>> towards the number of uses of a value. That's why anything referenced
>> only from metadata, including constant expressions, risks
>> disappearing.
>> Leaving unused functions to be removed only after all optimizations
>> have run could be done. But then you would probably want to, for
>> example, patch the pass manager so that it didn't run a function pass
>> over dead functions, and so on.
>
> The functions could be declared to have linkonce_odr linkage. That way
> they will be zapped after the inliner runs, but shouldn't be removed
> before.

I'm certainly not convinced. You cannot force all analyses to be run
before inlining. You're basically saying that all passes that do
analysis on buffer sizes must run quite early. The inliner is run
pretty early! At least in the case of the buffer overflow pass, I want
it to run late, after most cleanups have been done. ASan does exactly
the same.

>>>> Another thing that bothers me is the implementation of the
>>>> objectsize intrinsic. This intrinsic returns the *constant* size
>>>> of the pointed object given as argument (if the object has a
>>>> constant size). However, with this function scheme, the
>>>> implementation would be a bit heavy, since it would need to inline
>>>> the @lo and @hi functions, simplify the resulting expression, and
>>>> then check if the result is a ConstantInt. And remember that in
>>>> general these functions can be arbitrarily complex.
>>>
>>> I agree; we'd need to use SCEV or some other heavyweight mechanism
>>> to do the analysis. In some sense, however, that would be the price
>>> of generality. On the other hand, I see no reason why we could not
>>> write a utility function that could accomplish all of that, so we'd
>>> only need to work out the details once.
>>
>> SCEV is not the answer here. You just want to know if the result of
>> a function is constant given a set of parameters. Inlining +
>> simplifications should do it. But doing an inlining trial is
>> expensive.
>
> The hi/lo functions could be declared always_inline. Thus they will
> always be inlined, either by the always-inliner pass or the usual
> one. You would need to insert the instrumentation code or whatever
> that uses hi/lo before any inliner runs, and run optimizations such
> as turning objectsize into a constant after the inliner runs.

The semantics of the objectsize intrinsic is that it returns a constant
value if it can figure out the object size, and returns 0/-1 otherwise.
So you cannot simply inline the functions and hope for the best. You
need to run an inlining trial: inline, try to fold the resulting
expression into a constant, and remove the inlined code if it didn't
fold to a constant. You may say this is the price of generality; I
don't know how slow it would be, though.
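[To make the objectsize contract concrete, here is a sketch using the
intrinsic as it existed at the time; the wrapper name @bytes_left is
invented for the example, and the i1 argument selects the 0-on-unknown
versus -1-on-unknown behaviour:

  declare i32 @llvm.objectsize.i32(i8*, i1)

  define i32 @bytes_left(i8* %p) {
    ; Folds to a ConstantInt when the allocation is statically known;
    ; with i1 false it folds to -1 when the size cannot be determined.
    %size = call i32 @llvm.objectsize.i32(i8* %p, i1 false)
    ret i32 %size
  }

Whatever lowers this intrinsic is exactly the code that would have to
perform the inlining trial on @lo/@hi described above.]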
Today, after playing around with these things, I found another problem:
inlining functions with this alloc metadata. Assuming that we attach
the metadata to call sites in the front-end, if the function later gets
inlined, then the metadata is lost. We can, however, allow the metadata
to be attached to arbitrary instructions, so that the inliner can be
taught to attach it to the returned expression.

Nuno
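[A minimal illustration of the problem, reusing !0 from earlier in the
thread; the wrapper @my_alloc and caller @user are invented for the
example:

  declare i8* @wonder_allocator(i32)

  define i8* @my_alloc(i32 %n) {
    %p = call i8* @wonder_allocator(i32 %n)
    ret i8* %p
  }

  define i8* @user(i32 %n) {
    ; The front-end tags the call site with the alloc metadata.
    %r = call i8* @my_alloc(i32 %n), !alloc !0
    ret i8* %r
  }

  !0 = metadata !{ i32 (i32)* @lo, i32 (i32)* @hi }

Once @my_alloc is inlined into @user, the tagged call instruction
disappears and !alloc !0 goes with it; the inliner would have to
re-attach the metadata to the inner call to @wonder_allocator, or to
whatever instruction ends up producing the returned pointer.]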