thr3ads.net - llvm dev - [LLVMdev] IC profiling infrastructure [Jun 2015]

If this information is useful, please help other people find it:
Share via:

Justin Bogner

2015-Jun-16 04:16 UTC

[LLVMdev] IC profiling infrastructure

Xinliang David Li <davidxl at google.com> writes:> On Mon, Jun 15, 2015 at 5:35 PM, Justin Bogner <mail at
justinbogner.com> wrote:
>> "Betul Buyukkurt" <betulb at codeaurora.org> writes:
>>>> - We don't need to store the value profiling kind in the
data at all.
>>>>   The frontend knows which invocations of the intrinsic are for
each kind
>>>>   implicitly, much like it knows the difference between a
counter for an
>>>>   "if" and a "for" apart implicitly. Just
store one set of profiling data
>>>>   and let the frontend sort it out.
>>>
>>> I think so too. However, value_kind should stay around until the
kinds of
>>> expressions whose values are computed are included in the hash
>>> computation. Otherwise, if a value-profiled expression is to be
removed
>>> from source, then all the rest of the values would get attached to
the MD
>>> fields of wrong types. Currently hash checking verifies if the
counter
>>> assigned regions in the source has been changed across profile
consumption
>>> runs.
>>
>> This isn't a good reason to keep value_kind - the function hash not
>> reflecting these kinds of changes is a problem regardless of whether or
>> not we have multiple types. We should just add hash computations for
>> these indirect call counters. As long as we only add this when indirect
>> profiling is turned on it won't invalidate old hashes so it's
trivial to
>> stay backwards compatible.
>
> I think there are other better reasons why keeping value kind is
> desirable. Different kinds of value profile data needs different
> 'preprocessing' during reading. For instance, indirect call target
> needs to translate target address into function name. For length data,
> we may want to compute average length; for data address, we may want
> to compute the min-alignment for each site. It will be hard to do so
> with out value kind info.
But since the consumer is the frontend, and it knows which counts are
which, it knows the kind, no? I don't understand how storing the kind
info helps or hurts - it's just redundant.
>>
>> The other thing we should do is store which profiling options are
>> enabled, in both formats. When we specify profile-instr-use it'll
be
>> less error prone if we can detect whether or not this includes indirect
>> call profiling without checking other options, and it's probably a
good
>> idea for "llvm-profdata merge" to disallow merging profiles
that are
>> gathering different sets of data. A bitfield seems suitable for this.
>
> For value profiling, allowing profile-gen and profile-use passes using
> different options is a useful feature. Consider the following
> scenarios:
> 1) collect the same profile data once with all kinds of value profile
>    data collected. The exact same profile data can be used in performance
>    experiments with different kinds of value profiling enabled/disabled
This use case is pretty easy to accomodate for with the model where the
profile stores its type. We could autodetect the type to decide how to
interpret the profile, and if more specific profile flags are set simply
ignore some of the data. The thing that the file format storing the type
gives us is that it's harder to misinterpret the data by supplying the
mismatching flags between reading and writing.
> 2) work around profile-use/value transformation bug selectively for
>    some file without the need to change anything in instrumentation pass
I'm not sure I understand what you mean here.
> Besides, with the latest patch, the value_kind is not recorded in each
> profile value thus the overhead is minimized.
The overhead is low, sure, but the code complexity of dealing with the
multiple kinds in this way is quite real. Since I'm not convinced we're
getting real benefit from storing the kind, I don't believe this
trade-off is worth it.
> thanks,
>
> David

Betul Buyukkurt

2015-Jun-16 04:31 UTC

head link

[LLVMdev] IC profiling infrastructure

> Xinliang David Li <davidxl at google.com> writes:
>> On Mon, Jun 15, 2015 at 5:35 PM, Justin Bogner <mail at
justinbogner.com>
>> wrote:
>>> "Betul Buyukkurt" <betulb at codeaurora.org>
writes:
>>>>> - We don't need to store the value profiling kind in
the data at all.
>>>>>   The frontend knows which invocations of the intrinsic are
for each
>>>>> kind
>>>>>   implicitly, much like it knows the difference between a
counter for
>>>>> an
>>>>>   "if" and a "for" apart implicitly.
Just store one set of profiling
>>>>> data
>>>>>   and let the frontend sort it out.
>>>>
>>>> I think so too. However, value_kind should stay around until
the kinds
>>>> of
>>>> expressions whose values are computed are included in the hash
>>>> computation. Otherwise, if a value-profiled expression is to be
>>>> removed
>>>> from source, then all the rest of the values would get attached
to the
>>>> MD
>>>> fields of wrong types. Currently hash checking verifies if the
counter
>>>> assigned regions in the source has been changed across profile
>>>> consumption
>>>> runs.
>>>
>>> This isn't a good reason to keep value_kind - the function hash
not
>>> reflecting these kinds of changes is a problem regardless of
whether or
>>> not we have multiple types. We should just add hash computations
for
>>> these indirect call counters. As long as we only add this when
indirect
>>> profiling is turned on it won't invalidate old hashes so
it's trivial
>>> to
>>> stay backwards compatible.
>>
>> I think there are other better reasons why keeping value kind is
>> desirable. Different kinds of value profile data needs different
>> 'preprocessing' during reading. For instance, indirect call
target
>> needs to translate target address into function name. For length data,
>> we may want to compute average length; for data address, we may want
>> to compute the min-alignment for each site. It will be hard to do so
>> with out value kind info.
>
> But since the consumer is the frontend, and it knows which counts are
> which, it knows the kind, no? I don't understand how storing the kind
> info helps or hurts - it's just redundant.
>
>>>
>>> The other thing we should do is store which profiling options are
>>> enabled, in both formats. When we specify profile-instr-use
it'll be
>>> less error prone if we can detect whether or not this includes
indirect
>>> call profiling without checking other options, and it's
probably a good
>>> idea for "llvm-profdata merge" to disallow merging
profiles that are
>>> gathering different sets of data. A bitfield seems suitable for
this.
>>
>> For value profiling, allowing profile-gen and profile-use passes using
>> different options is a useful feature. Consider the following
>> scenarios:
>> 1) collect the same profile data once with all kinds of value profile
>>    data collected. The exact same profile data can be used in
>> performance
>>    experiments with different kinds of value profiling enabled/disabled
>
> This use case is pretty easy to accomodate for with the model where the
> profile stores its type. We could autodetect the type to decide how to
> interpret the profile, and if more specific profile flags are set simply
> ignore some of the data. The thing that the file format storing the type
> gives us is that it's harder to misinterpret the data by supplying the
> mismatching flags between reading and writing.
How would you plan to merge profile files where one profile is collected
w/ one value_kind turned on with another file where another value_kind is
turned on. This might be a valid use case. The value_kinds at this time
are not stored per raw value in the raw profile files. The newer is design
is extremely compact. Having the value_kind is extremely useful when doing
the merges and when processing (i.e. derivation of function names from
target addresses) data differently per value_kind. You may read the design
discussions in the past threads.
>> 2) work around profile-use/value transformation bug selectively for
>>    some file without the need to change anything in instrumentation
pass
>
> I'm not sure I understand what you mean here.
>
>> Besides, with the latest patch, the value_kind is not recorded in each
>> profile value thus the overhead is minimized.
>
> The overhead is low, sure, but the code complexity of dealing with the
> multiple kinds in this way is quite real. Since I'm not convinced
we're
> getting real benefit from storing the kind, I don't believe this
> trade-off is worth it.
>
>> thanks,
>>
>> David
>

Xinliang David Li

2015-Jun-16 05:47 UTC

head link

[LLVMdev] IC profiling infrastructure

> But since the consumer is the frontend, and it knows which counts are
> which, it knows the kind, no? I don't understand how storing the kind
> info helps or hurts - it's just redundant.
Yes, the frontend consumer knows, but the raw reader does not. It is
natural to do those processing when processing the raw profile data
(for instance when function pointer to name mapping still exists).
>
>>>
>>> The other thing we should do is store which profiling options are
>>> enabled, in both formats. When we specify profile-instr-use
it'll be
>>> less error prone if we can detect whether or not this includes
indirect
>>> call profiling without checking other options, and it's
probably a good
>>> idea for "llvm-profdata merge" to disallow merging
profiles that are
>>> gathering different sets of data. A bitfield seems suitable for
this.
>>
>> For value profiling, allowing profile-gen and profile-use passes using
>> different options is a useful feature. Consider the following
>> scenarios:
>> 1) collect the same profile data once with all kinds of value profile
>>    data collected. The exact same profile data can be used in
performance
>>    experiments with different kinds of value profiling enabled/disabled
>
> This use case is pretty easy to accomodate for with the model where the
> profile stores its type. We could autodetect the type to decide how to
> interpret the profile, and if more specific profile flags are set simply
> ignore some of the data. The thing that the file format storing the type
> gives us is that it's harder to misinterpret the data by supplying the
> mismatching flags between reading and writing.
Yes, this is achievable with some extra effort -- for instance by
collecting the profile sites of all kinds regardless of the flags.
After profiling matching, simply drop selected sites according to the
flags.

This adds complexity of the code. With value kinds encoded in profile
data, the handling is much simpler -- each value profiler only needs
to deal with its own kind.
>
>> 2) work around profile-use/value transformation bug selectively for
>>    some file without the need to change anything in instrumentation
pass
>
> I'm not sure I understand what you mean here.
Similar to the above, but for correctness.
>
>> Besides, with the latest patch, the value_kind is not recorded in each
>> profile value thus the overhead is minimized.
>
> The overhead is low, sure, but the code complexity of dealing with the
> multiple kinds in this way is quite real.
I actually believe having value-kind can help reduce code complexity
instead of the other way around. What is the extra complexity?

thanks,

David

>Since I'm not convinced we're
> getting real benefit from storing the kind, I don't believe this
> trade-off is worth it.
>
>> thanks,
>>
>> David

Xinliang David Li

2015-Jun-22 21:48 UTC

head link

[LLVMdev] IC profiling infrastructure

Justin, do you have more concerns on keeping value_kind?

If there is still disagreement, can we agree to move on with it for
now ?  After the initial version of the patches checked in, we can do
more serious testings with large apps and revisit this if there are
problems discovered.

thanks,

David



On Mon, Jun 15, 2015 at 10:47 PM, Xinliang David Li <davidxl at
google.com> wrote:>> But since the consumer is the frontend, and it knows which counts are
>> which, it knows the kind, no? I don't understand how storing the
kind
>> info helps or hurts - it's just redundant.
>
> Yes, the frontend consumer knows, but the raw reader does not. It is
> natural to do those processing when processing the raw profile data
> (for instance when function pointer to name mapping still exists).
>
>>
>>>>
>>>> The other thing we should do is store which profiling options
are
>>>> enabled, in both formats. When we specify profile-instr-use
it'll be
>>>> less error prone if we can detect whether or not this includes
indirect
>>>> call profiling without checking other options, and it's
probably a good
>>>> idea for "llvm-profdata merge" to disallow merging
profiles that are
>>>> gathering different sets of data. A bitfield seems suitable for
this.
>>>
>>> For value profiling, allowing profile-gen and profile-use passes
using
>>> different options is a useful feature. Consider the following
>>> scenarios:
>>> 1) collect the same profile data once with all kinds of value
profile
>>>    data collected. The exact same profile data can be used in
performance
>>>    experiments with different kinds of value profiling
enabled/disabled
>>
>> This use case is pretty easy to accomodate for with the model where the
>> profile stores its type. We could autodetect the type to decide how to
>> interpret the profile, and if more specific profile flags are set
simply
>> ignore some of the data. The thing that the file format storing the
type
>> gives us is that it's harder to misinterpret the data by supplying
the
>> mismatching flags between reading and writing.
>
> Yes, this is achievable with some extra effort -- for instance by
> collecting the profile sites of all kinds regardless of the flags.
> After profiling matching, simply drop selected sites according to the
> flags.
>
> This adds complexity of the code. With value kinds encoded in profile
> data, the handling is much simpler -- each value profiler only needs
> to deal with its own kind.
>
>>
>>> 2) work around profile-use/value transformation bug selectively for
>>>    some file without the need to change anything in instrumentation
pass
>>
>> I'm not sure I understand what you mean here.
>
> Similar to the above, but for correctness.
>
>>
>>> Besides, with the latest patch, the value_kind is not recorded in
each
>>> profile value thus the overhead is minimized.
>>
>> The overhead is low, sure, but the code complexity of dealing with the
>> multiple kinds in this way is quite real.
>
> I actually believe having value-kind can help reduce code complexity
> instead of the other way around. What is the extra complexity?
>
> thanks,
>
> David
>
>
>>Since I'm not convinced we're
>> getting real benefit from storing the kind, I don't believe this
>> trade-off is worth it.
>>
>>> thanks,
>>>
>>> David

Maybe Matching Threads

Search for more reasonably related threads

llvm dev - Jun 2015 - [LLVMdev] IC profiling infrastructure

[LLVMdev] IC profiling infrastructure

[LLVMdev] IC profiling infrastructure

[LLVMdev] IC profiling infrastructure

[LLVMdev] IC profiling infrastructure

Maybe Matching Threads