thr3ads.net - llvm dev - [llvm-dev] RFC: On non 8-bit bytes and the target for it [Nov 2019]

If this information is useful, please help other people find it:
Share via:

David Blaikie via llvm-dev

2019-Nov-05 00:07 UTC

[llvm-dev] RFC: On non 8-bit bytes and the target for it

On Mon, Nov 4, 2019 at 4:00 PM James Molloy <james at jamesmolloy.co.uk>
wrote:
> Hi,
>
> To throw my hat into the ring, we also maintain a downstream target that
> has a subset of this problem - it has non-8-bit-addressable memory. This
> means that %uglygeps don't work, pointers can't be casted to i8*
and expect
> to be byte-addressed (conversions to memcpy/memset that use i8* aren't
> welcome either), and GEP lowering code is slightly different.
>
> We maintain this currently as a pile of known technical debt. We have a
> CGP pass to decompose GEPs and custom-expand them taking our word size into
> account, but everything before that is just waiting for InstCombine to
> break something. AliasAnalysis also makes offsetting assumptions that are
> totally bogus because our pointers are word-addressed and therefore so are
> pointer offsets.
>
> We'd be really keen to help here. We're keen to upstream anything
we
> possibly can (and have been, over the past few months). We've have
several
> discussions about how best to approach upstream with this, and the sticking
> point has always been lack of a testing target. It's always felt to me
that
> the idea of addressable granule should be a fairly reasonable DataLayout
> addition; We can test DataLayout changes purely via opt without requiring a
> target that uses them. Lowering to instructions was always the testing
> sticking point.
>
> We'd be keen to help out what the community decides to do here. I
> personally feel it's reasonable that:
>   - LangRef/DataLayout is updated with semantically coherent changes.
>   - The midend optimizer is updated by someone who cares about those
> changes and tests are added that use the new DataLayout.
>   - Developers that don't care about those changes maintain a
best-effort
> approach, which is exactly what we do right now; there are features that
> are tested but are still esoteric enough that I might reasonably break them
> without realising (OperandBundles come to mind), so I don't think
there's
> any change in mindset here.
>   - Developers that care perform downstream testing and provide review
> feedback / revert if really bad / fixes. Again, this is how LLVM works
> right now - I'd guess that >80% of our real-world test coverage
comes from
> downstream users deploying ToT LLVM rather than the upstream LIT tests /
> builders.
>
This last bit is the bit I'd worry about if the bug were only visible in an
out-of-tree target. If it's visible with a DataLayout change that can be
tested upstream, then I'd be OK with an upstream patch being reverted given
a test case with custom DataLayout showing the failure.

But if the failure is for, say, non-8-bit-bytes that are only actually
exercised downstream, I'm less OK with patches being reverted upstream no
matter the severity of the breakage to such targets downstream.

- Dave


>
> Cheers,
>
> James
>
> On Mon, 4 Nov 2019 at 12:16, David Blaikie via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>>
>>
>> On Sat, Nov 2, 2019 at 12:45 AM Jorg Brown via llvm-dev <
>> llvm-dev at lists.llvm.org> wrote:
>>
>>> On Fri, Nov 1, 2019 at 8:40 AM Adrian Prantl via llvm-dev <
>>> llvm-dev at lists.llvm.org> wrote:
>>>
>>>> > On Nov 1, 2019, at 3:41 AM, Dmitriy Borisenkov via
llvm-dev <
>>>> llvm-dev at lists.llvm.org> wrote:
>>>> > It seems that there are two possible solutions on how to
move forward
>>>> with non 8 bits byte:
>>>> >
>>>> > 1. Commit changes without tests. Chris Lattner, Mikael
Holmen, Jeroen
>>>> Dobbelaere, Jesper Antonsson support this idea.
>>>> > James Y Knight says that at least magic numbers should be
removed "at
>>>> least where it arguably helps code clarity". This might be
not exactly the
>>>> scope of the changes discussed, but it's probably worth do
discuss code
>>>> clarity having concrete patches.
>>>> > GCC (according to James Y Knight) has the same practice
meaning non-8
>>>> bits byte is supported but there are no tests in upstream and
we have
>>>> downstream contributors who will fix the bugs if they appear in
the LLVM
>>>> core.
>>>> > David Chisnall raised a question about what to count as a
byte (which
>>>> defines the scope of the changes) and we suggest to use all 5
criteria he
>>>> granted:
>>>> > > - The smallest unit that can be loaded / stored at a
time.
>>>> > > - The smallest unit that can be addressed with a raw
pointer in a
>>>> specific address space.
>>>> > > - The largest unit whose encoding is opaque to
anything above the
>>>> ISA.
>>>> > > - The type used to represent `char` in C.
>>>> > > - The type that has a size that all other types are a
multiple of.
>>>> > But if DSPs are less restrictive about byte, some of the
criteria
>>>> could be removed.
>>>> >
>>>> > 2. Use an iconic target. PDP10 was suggested as a
candidate. This
>>>> opinion found support from Tim Northover, Joerg Sonenberger,
Mehdi AMINI,
>>>> Philip Reames. It's not clear though does this opinion
oppose upstreaming
>>>> non-8-bits byte without tests or just a dummy and TVM targets
options.
>>>> >
>>>> > So if there is no strong opposition to the solution 1 from
the people
>>>> supporting an iconic target option, we could probably move to
the patches.
>>>>
>>>> I'm in camp (2). Any changes that are not tested are an
invitation to
>>>> upstream developers to "simplify" the code, not
knowing that those changes
>>>> are important. Anyone who commits untested changes to LLVM will
inevitably
>>>> face an uphill battle against benevolent NFC refactorings that
break these
>>>> changes because the expectation of how the code is supposed to
behave is
>>>> not codified in a test. In the short term option (1) sounds
more appealing
>>>> because they can start right away, but I'm going to predict
that it will be
>>>> more expensive for the downstream maintainers of non 8-bit
targets in the
>>>> long term.
>>>>
>>>
>>> I've worked on multiple codebases where an option existed in
order to
>>> satisfy an extremely small userbase, with little or no testing,
>>>
>>
>> In those situations, were the core developers responsible for those
>> features/users? Yeah, if I needed to support a certain observable
feature
>> of clang continuing to work, I'd want tests (I'm pretty serious
about
>> testing, FWIW).
>>
>>
>>> and as such, I'm adamantly opposed to repeating it.
>>>
>>
>>>
>> In addition to what Adrian said, where the weird option exists but is
>>> constantly being incidentally broken, I've seen the opposite
problem:
>>> people become afraid to refactor a section of code because it might
break
>>> the weird option.  You might say "if there aren't any
tests, people should
>>> feel free to refactor the code; their only responsibility is to
make sure
>>> the tests will still work", but honestly, I've seen the
opposite: without
>>> tests, it's just presumed that touching certain parts of code
is likely to
>>> break things, and so after a time, an aura of untouchability starts
to
>>> surround regions of the code.  And then, the more time goes by, the
more
>>> that code becomes unfamiliar to everyone (because no one is
actively
>>> maintaining it).  In the long run, the cost of an unmaintained
option may
>>> be far more than the cost of a maintained one.
>>>
>>
>>
>> I'm not actually opposed to this situation - LLVM as a project is
pretty
>> happy about making big structural changes to the codebase & holding
the
>> test coverage and downstream users accountable for ensuring quality. We
>> rarely avoid changes due to risk of breakage & as a community push
back a
>> fair bit on reviewers suggesting we should - if someone can't
demonstrate
>> the breakage in an upstream test (or has a pretty good track record of
true
>> positives that may take some time to investigate - and thus it might be
>> better in the short term to revert while waiting for that evidence to
be
>> provided) the changes tend to go in and stay in.
>>
>> Yeah, I think some parts of the code may become complicated enough to
>> warrant separate testing - but most of the code that might move to
>> constants for byte width, iterate over bits to that byte width, etc,
will
>> be tested on one value & might have bugs on other values that will
be found
>> (or not) downstream - best effort and all. But in cases where the code
to
>> handle novel byte widths becomes more complicated - some abstraction
and
>> unit testing would seem quite appropriate.
>>
>>
>>
>>> In short: please don't commit changes without tests.  Even if
the test
>>> is nothing but making sure this works:
>>>
>>> int main(int argc, char *argv[]) {
>>>   return argv[argc - 1][0];
>>> }
>>>
>>> That at least would give some freedom from the guilt of breaking
>>> something important.
>>>
>>
>> It's hard to make sure that works in a meaningful sense in this
context -
>> without a non-8-bit-byte target in upstream LLVM, which is the point of
>> contention/discussion. It's unclear if there's a suitable
target/community
>> to provide/maintain such a target upstream. I don't think
there's a
>> "cheap"/stub/trivial target we could create that would
provide what you're
>> suggesting without bitrotting quickly and being removed (more quickly
than,
>> I think, the sort of patches to support but not provide, non-8-bit-byte
>> targets).
>>
>> Though it's hard to guess without seeing the sort of patches
that'd be
>> needed.
>>
>> - Dave
>>
>>
>>>
>>> -- Jorg
>>>
>>> _______________________________________________
>>> LLVM Developers mailing list
>>> llvm-dev at lists.llvm.org
>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20191104/0d86ed01/attachment-0001.html>

James Molloy via llvm-dev

2019-Nov-05 00:09 UTC

head link

[llvm-dev] RFC: On non 8-bit bytes and the target for it

> But if the failure is for, say, non-8-bit-bytes that are only actuallyexercised downstream, I'm less OK with patches being reverted upstream no
matter the severity of the breakage to such targets downstream.

I think that's totally reasonable and is the same yardstick applied already
- reversion is only justifiable if a testcase can be provided. Anything
else would be undue burden on the original committer and would require
shotgun debugging.

On Mon, 4 Nov 2019 at 16:07, David Blaikie <dblaikie at gmail.com> wrote:
>
>
> On Mon, Nov 4, 2019 at 4:00 PM James Molloy <james at
jamesmolloy.co.uk>
> wrote:
>
>> Hi,
>>
>> To throw my hat into the ring, we also maintain a downstream target
that
>> has a subset of this problem - it has non-8-bit-addressable memory.
This
>> means that %uglygeps don't work, pointers can't be casted to
i8* and expect
>> to be byte-addressed (conversions to memcpy/memset that use i8*
aren't
>> welcome either), and GEP lowering code is slightly different.
>>
>> We maintain this currently as a pile of known technical debt. We have a
>> CGP pass to decompose GEPs and custom-expand them taking our word size
into
>> account, but everything before that is just waiting for InstCombine to
>> break something. AliasAnalysis also makes offsetting assumptions that
are
>> totally bogus because our pointers are word-addressed and therefore so
are
>> pointer offsets.
>>
>> We'd be really keen to help here. We're keen to upstream
anything we
>> possibly can (and have been, over the past few months). We've have
several
>> discussions about how best to approach upstream with this, and the
sticking
>> point has always been lack of a testing target. It's always felt to
me that
>> the idea of addressable granule should be a fairly reasonable
DataLayout
>> addition; We can test DataLayout changes purely via opt without
requiring a
>> target that uses them. Lowering to instructions was always the testing
>> sticking point.
>>
>> We'd be keen to help out what the community decides to do here. I
>> personally feel it's reasonable that:
>>   - LangRef/DataLayout is updated with semantically coherent changes.
>>   - The midend optimizer is updated by someone who cares about those
>> changes and tests are added that use the new DataLayout.
>>   - Developers that don't care about those changes maintain a
best-effort
>> approach, which is exactly what we do right now; there are features
that
>> are tested but are still esoteric enough that I might reasonably break
them
>> without realising (OperandBundles come to mind), so I don't think
there's
>> any change in mindset here.
>>   - Developers that care perform downstream testing and provide review
>> feedback / revert if really bad / fixes. Again, this is how LLVM works
>> right now - I'd guess that >80% of our real-world test coverage
comes from
>> downstream users deploying ToT LLVM rather than the upstream LIT tests
/
>> builders.
>>
>
> This last bit is the bit I'd worry about if the bug were only visible
in
> an out-of-tree target. If it's visible with a DataLayout change that
can be
> tested upstream, then I'd be OK with an upstream patch being reverted
given
> a test case with custom DataLayout showing the failure.
>
> But if the failure is for, say, non-8-bit-bytes that are only actually
> exercised downstream, I'm less OK with patches being reverted upstream
no
> matter the severity of the breakage to such targets downstream.
>
> - Dave
>
>
>
>>
>> Cheers,
>>
>> James
>>
>> On Mon, 4 Nov 2019 at 12:16, David Blaikie via llvm-dev <
>> llvm-dev at lists.llvm.org> wrote:
>>
>>>
>>>
>>> On Sat, Nov 2, 2019 at 12:45 AM Jorg Brown via llvm-dev <
>>> llvm-dev at lists.llvm.org> wrote:
>>>
>>>> On Fri, Nov 1, 2019 at 8:40 AM Adrian Prantl via llvm-dev <
>>>> llvm-dev at lists.llvm.org> wrote:
>>>>
>>>>> > On Nov 1, 2019, at 3:41 AM, Dmitriy Borisenkov via
llvm-dev <
>>>>> llvm-dev at lists.llvm.org> wrote:
>>>>> > It seems that there are two possible solutions on how
to move
>>>>> forward with non 8 bits byte:
>>>>> >
>>>>> > 1. Commit changes without tests. Chris Lattner, Mikael
Holmen,
>>>>> Jeroen Dobbelaere, Jesper Antonsson support this idea.
>>>>> > James Y Knight says that at least magic numbers should
be removed
>>>>> "at least where it arguably helps code clarity".
This might be not exactly
>>>>> the scope of the changes discussed, but it's probably
worth do discuss code
>>>>> clarity having concrete patches.
>>>>> > GCC (according to James Y Knight) has the same
practice meaning
>>>>> non-8 bits byte is supported but there are no tests in
upstream and we have
>>>>> downstream contributors who will fix the bugs if they
appear in the LLVM
>>>>> core.
>>>>> > David Chisnall raised a question about what to count
as a byte
>>>>> (which defines the scope of the changes) and we suggest to
use all 5
>>>>> criteria he granted:
>>>>> > > - The smallest unit that can be loaded / stored
at a time.
>>>>> > > - The smallest unit that can be addressed with a
raw pointer in a
>>>>> specific address space.
>>>>> > > - The largest unit whose encoding is opaque to
anything above the
>>>>> ISA.
>>>>> > > - The type used to represent `char` in C.
>>>>> > > - The type that has a size that all other types
are a multiple of.
>>>>> > But if DSPs are less restrictive about byte, some of
the criteria
>>>>> could be removed.
>>>>> >
>>>>> > 2. Use an iconic target. PDP10 was suggested as a
candidate. This
>>>>> opinion found support from Tim Northover, Joerg
Sonenberger, Mehdi AMINI,
>>>>> Philip Reames. It's not clear though does this opinion
oppose upstreaming
>>>>> non-8-bits byte without tests or just a dummy and TVM
targets options.
>>>>> >
>>>>> > So if there is no strong opposition to the solution 1
from the
>>>>> people supporting an iconic target option, we could
probably move to the
>>>>> patches.
>>>>>
>>>>> I'm in camp (2). Any changes that are not tested are an
invitation to
>>>>> upstream developers to "simplify" the code, not
knowing that those changes
>>>>> are important. Anyone who commits untested changes to LLVM
will inevitably
>>>>> face an uphill battle against benevolent NFC refactorings
that break these
>>>>> changes because the expectation of how the code is supposed
to behave is
>>>>> not codified in a test. In the short term option (1) sounds
more appealing
>>>>> because they can start right away, but I'm going to
predict that it will be
>>>>> more expensive for the downstream maintainers of non 8-bit
targets in the
>>>>> long term.
>>>>>
>>>>
>>>> I've worked on multiple codebases where an option existed
in order to
>>>> satisfy an extremely small userbase, with little or no testing,
>>>>
>>>
>>> In those situations, were the core developers responsible for those
>>> features/users? Yeah, if I needed to support a certain observable
feature
>>> of clang continuing to work, I'd want tests (I'm pretty
serious about
>>> testing, FWIW).
>>>
>>>
>>>> and as such, I'm adamantly opposed to repeating it.
>>>>
>>>
>>>>
>>> In addition to what Adrian said, where the weird option exists but
is
>>>> constantly being incidentally broken, I've seen the
opposite problem:
>>>> people become afraid to refactor a section of code because it
might break
>>>> the weird option.  You might say "if there aren't any
tests, people should
>>>> feel free to refactor the code; their only responsibility is to
make sure
>>>> the tests will still work", but honestly, I've seen
the opposite: without
>>>> tests, it's just presumed that touching certain parts of
code is likely to
>>>> break things, and so after a time, an aura of untouchability
starts to
>>>> surround regions of the code.  And then, the more time goes by,
the more
>>>> that code becomes unfamiliar to everyone (because no one is
actively
>>>> maintaining it).  In the long run, the cost of an unmaintained
option may
>>>> be far more than the cost of a maintained one.
>>>>
>>>
>>>
>>> I'm not actually opposed to this situation - LLVM as a project
is pretty
>>> happy about making big structural changes to the codebase &
holding the
>>> test coverage and downstream users accountable for ensuring
quality. We
>>> rarely avoid changes due to risk of breakage & as a community
push back a
>>> fair bit on reviewers suggesting we should - if someone can't
demonstrate
>>> the breakage in an upstream test (or has a pretty good track record
of true
>>> positives that may take some time to investigate - and thus it
might be
>>> better in the short term to revert while waiting for that evidence
to be
>>> provided) the changes tend to go in and stay in.
>>>
>>> Yeah, I think some parts of the code may become complicated enough
to
>>> warrant separate testing - but most of the code that might move to
>>> constants for byte width, iterate over bits to that byte width,
etc, will
>>> be tested on one value & might have bugs on other values that
will be found
>>> (or not) downstream - best effort and all. But in cases where the
code to
>>> handle novel byte widths becomes more complicated - some
abstraction and
>>> unit testing would seem quite appropriate.
>>>
>>>
>>>
>>>> In short: please don't commit changes without tests.  Even
if the test
>>>> is nothing but making sure this works:
>>>>
>>>> int main(int argc, char *argv[]) {
>>>>   return argv[argc - 1][0];
>>>> }
>>>>
>>>> That at least would give some freedom from the guilt of
breaking
>>>> something important.
>>>>
>>>
>>> It's hard to make sure that works in a meaningful sense in this
context
>>> - without a non-8-bit-byte target in upstream LLVM, which is the
point of
>>> contention/discussion. It's unclear if there's a suitable
target/community
>>> to provide/maintain such a target upstream. I don't think
there's a
>>> "cheap"/stub/trivial target we could create that would
provide what you're
>>> suggesting without bitrotting quickly and being removed (more
quickly than,
>>> I think, the sort of patches to support but not provide,
non-8-bit-byte
>>> targets).
>>>
>>> Though it's hard to guess without seeing the sort of patches
that'd be
>>> needed.
>>>
>>> - Dave
>>>
>>>
>>>>
>>>> -- Jorg
>>>>
>>>> _______________________________________________
>>>> LLVM Developers mailing list
>>>> llvm-dev at lists.llvm.org
>>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>
>>> _______________________________________________
>>> LLVM Developers mailing list
>>> llvm-dev at lists.llvm.org
>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>
>>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20191104/f34766a6/attachment.html>

Sean Silva via llvm-dev

2019-Nov-06 00:35 UTC

head link

[llvm-dev] RFC: On non 8-bit bytes and the target for it

On Mon, Nov 4, 2019 at 4:07 PM David Blaikie via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
>
>
> On Mon, Nov 4, 2019 at 4:00 PM James Molloy <james at
jamesmolloy.co.uk>
> wrote:
>
>> Hi,
>>
>> To throw my hat into the ring, we also maintain a downstream target
that
>> has a subset of this problem - it has non-8-bit-addressable memory.
This
>> means that %uglygeps don't work, pointers can't be casted to
i8* and expect
>> to be byte-addressed (conversions to memcpy/memset that use i8*
aren't
>> welcome either), and GEP lowering code is slightly different.
>>
>> We maintain this currently as a pile of known technical debt. We have a
>> CGP pass to decompose GEPs and custom-expand them taking our word size
into
>> account, but everything before that is just waiting for InstCombine to
>> break something. AliasAnalysis also makes offsetting assumptions that
are
>> totally bogus because our pointers are word-addressed and therefore so
are
>> pointer offsets.
>>
>> We'd be really keen to help here. We're keen to upstream
anything we
>> possibly can (and have been, over the past few months). We've have
several
>> discussions about how best to approach upstream with this, and the
sticking
>> point has always been lack of a testing target. It's always felt to
me that
>> the idea of addressable granule should be a fairly reasonable
DataLayout
>> addition; We can test DataLayout changes purely via opt without
requiring a
>> target that uses them. Lowering to instructions was always the testing
>> sticking point.
>>
>> We'd be keen to help out what the community decides to do here. I
>> personally feel it's reasonable that:
>>   - LangRef/DataLayout is updated with semantically coherent changes.
>>   - The midend optimizer is updated by someone who cares about those
>> changes and tests are added that use the new DataLayout.
>>   - Developers that don't care about those changes maintain a
best-effort
>> approach, which is exactly what we do right now; there are features
that
>> are tested but are still esoteric enough that I might reasonably break
them
>> without realising (OperandBundles come to mind), so I don't think
there's
>> any change in mindset here.
>>   - Developers that care perform downstream testing and provide review
>> feedback / revert if really bad / fixes. Again, this is how LLVM works
>> right now - I'd guess that >80% of our real-world test coverage
comes from
>> downstream users deploying ToT LLVM rather than the upstream LIT tests
/
>> builders.
>>
>
> This last bit is the bit I'd worry about if the bug were only visible
in
> an out-of-tree target. If it's visible with a DataLayout change that
can be
> tested upstream, then I'd be OK with an upstream patch being reverted
given
> a test case with custom DataLayout showing the failure.
>
> But if the failure is for, say, non-8-bit-bytes that are only actually
> exercised downstream, I'm less OK with patches being reverted upstream
no
> matter the severity of the breakage to such targets downstream.
>
Suppose that the failure is build breakage for the downstream target. This
could cause the entirety of the downstream LLVM/Clang build to break in a
way that isn’t easy to untangle or fix. This could prevent an entire
organization from integrating LLVM for an extended period of time. So it
isn’t necessarily “no matter the severity of the breakage for such
*targets* downstream” but could be the entire organization unless a
tactical downstream revert of the offending change can be made.

— Sean Silva

> - Dave
>
>
>
>>
>> Cheers,
>>
>> James
>>
>> On Mon, 4 Nov 2019 at 12:16, David Blaikie via llvm-dev <
>> llvm-dev at lists.llvm.org> wrote:
>>
>>>
>>>
>>> On Sat, Nov 2, 2019 at 12:45 AM Jorg Brown via llvm-dev <
>>> llvm-dev at lists.llvm.org> wrote:
>>>
>>>> On Fri, Nov 1, 2019 at 8:40 AM Adrian Prantl via llvm-dev <
>>>> llvm-dev at lists.llvm.org> wrote:
>>>>
>>>>> > On Nov 1, 2019, at 3:41 AM, Dmitriy Borisenkov via
llvm-dev <
>>>>> llvm-dev at lists.llvm.org> wrote:
>>>>> > It seems that there are two possible solutions on how
to move
>>>>> forward with non 8 bits byte:
>>>>> >
>>>>> > 1. Commit changes without tests. Chris Lattner, Mikael
Holmen,
>>>>> Jeroen Dobbelaere, Jesper Antonsson support this idea.
>>>>> > James Y Knight says that at least magic numbers should
be removed
>>>>> "at least where it arguably helps code clarity".
This might be not exactly
>>>>> the scope of the changes discussed, but it's probably
worth do discuss code
>>>>> clarity having concrete patches.
>>>>> > GCC (according to James Y Knight) has the same
practice meaning
>>>>> non-8 bits byte is supported but there are no tests in
upstream and we have
>>>>> downstream contributors who will fix the bugs if they
appear in the LLVM
>>>>> core.
>>>>> > David Chisnall raised a question about what to count
as a byte
>>>>> (which defines the scope of the changes) and we suggest to
use all 5
>>>>> criteria he granted:
>>>>> > > - The smallest unit that can be loaded / stored
at a time.
>>>>> > > - The smallest unit that can be addressed with a
raw pointer in a
>>>>> specific address space.
>>>>> > > - The largest unit whose encoding is opaque to
anything above the
>>>>> ISA.
>>>>> > > - The type used to represent `char` in C.
>>>>> > > - The type that has a size that all other types
are a multiple of.
>>>>> > But if DSPs are less restrictive about byte, some of
the criteria
>>>>> could be removed.
>>>>> >
>>>>> > 2. Use an iconic target. PDP10 was suggested as a
candidate. This
>>>>> opinion found support from Tim Northover, Joerg
Sonenberger, Mehdi AMINI,
>>>>> Philip Reames. It's not clear though does this opinion
oppose upstreaming
>>>>> non-8-bits byte without tests or just a dummy and TVM
targets options.
>>>>> >
>>>>> > So if there is no strong opposition to the solution 1
from the
>>>>> people supporting an iconic target option, we could
probably move to the
>>>>> patches.
>>>>>
>>>>> I'm in camp (2). Any changes that are not tested are an
invitation to
>>>>> upstream developers to "simplify" the code, not
knowing that those changes
>>>>> are important. Anyone who commits untested changes to LLVM
will inevitably
>>>>> face an uphill battle against benevolent NFC refactorings
that break these
>>>>> changes because the expectation of how the code is supposed
to behave is
>>>>> not codified in a test. In the short term option (1) sounds
more appealing
>>>>> because they can start right away, but I'm going to
predict that it will be
>>>>> more expensive for the downstream maintainers of non 8-bit
targets in the
>>>>> long term.
>>>>>
>>>>
>>>> I've worked on multiple codebases where an option existed
in order to
>>>> satisfy an extremely small userbase, with little or no testing,
>>>>
>>>
>>> In those situations, were the core developers responsible for those
>>> features/users? Yeah, if I needed to support a certain observable
feature
>>> of clang continuing to work, I'd want tests (I'm pretty
serious about
>>> testing, FWIW).
>>>
>>>
>>>> and as such, I'm adamantly opposed to repeating it.
>>>>
>>>
>>>>
>>> In addition to what Adrian said, where the weird option exists but
is
>>>> constantly being incidentally broken, I've seen the
opposite problem:
>>>> people become afraid to refactor a section of code because it
might break
>>>> the weird option.  You might say "if there aren't any
tests, people should
>>>> feel free to refactor the code; their only responsibility is to
make sure
>>>> the tests will still work", but honestly, I've seen
the opposite: without
>>>> tests, it's just presumed that touching certain parts of
code is likely to
>>>> break things, and so after a time, an aura of untouchability
starts to
>>>> surround regions of the code.  And then, the more time goes by,
the more
>>>> that code becomes unfamiliar to everyone (because no one is
actively
>>>> maintaining it).  In the long run, the cost of an unmaintained
option may
>>>> be far more than the cost of a maintained one.
>>>>
>>>
>>>
>>> I'm not actually opposed to this situation - LLVM as a project
is pretty
>>> happy about making big structural changes to the codebase &
holding the
>>> test coverage and downstream users accountable for ensuring
quality. We
>>> rarely avoid changes due to risk of breakage & as a community
push back a
>>> fair bit on reviewers suggesting we should - if someone can't
demonstrate
>>> the breakage in an upstream test (or has a pretty good track record
of true
>>> positives that may take some time to investigate - and thus it
might be
>>> better in the short term to revert while waiting for that evidence
to be
>>> provided) the changes tend to go in and stay in.
>>>
>>> Yeah, I think some parts of the code may become complicated enough
to
>>> warrant separate testing - but most of the code that might move to
>>> constants for byte width, iterate over bits to that byte width,
etc, will
>>> be tested on one value & might have bugs on other values that
will be found
>>> (or not) downstream - best effort and all. But in cases where the
code to
>>> handle novel byte widths becomes more complicated - some
abstraction and
>>> unit testing would seem quite appropriate.
>>>
>>>
>>>
>>>> In short: please don't commit changes without tests.  Even
if the test
>>>> is nothing but making sure this works:
>>>>
>>>> int main(int argc, char *argv[]) {
>>>>   return argv[argc - 1][0];
>>>> }
>>>>
>>>> That at least would give some freedom from the guilt of
breaking
>>>> something important.
>>>>
>>>
>>> It's hard to make sure that works in a meaningful sense in this
context
>>> - without a non-8-bit-byte target in upstream LLVM, which is the
point of
>>> contention/discussion. It's unclear if there's a suitable
target/community
>>> to provide/maintain such a target upstream. I don't think
there's a
>>> "cheap"/stub/trivial target we could create that would
provide what you're
>>> suggesting without bitrotting quickly and being removed (more
quickly than,
>>> I think, the sort of patches to support but not provide,
non-8-bit-byte
>>> targets).
>>>
>>> Though it's hard to guess without seeing the sort of patches
that'd be
>>> needed.
>>>
>>> - Dave
>>>
>>>
>>>>
>>>> -- Jorg
>>>>
>>>> _______________________________________________
>>>> LLVM Developers mailing list
>>>> llvm-dev at lists.llvm.org
>>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>
>>> _______________________________________________
>>> LLVM Developers mailing list
>>> llvm-dev at lists.llvm.org
>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>
>> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20191105/34c850f3/attachment.html>

David Blaikie via llvm-dev

2019-Nov-06 01:02 UTC

head link

[llvm-dev] RFC: On non 8-bit bytes and the target for it

On Tue, Nov 5, 2019 at 4:35 PM Sean Silva <chisophugis at gmail.com>
wrote:
>
>
> On Mon, Nov 4, 2019 at 4:07 PM David Blaikie via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>>
>>
>> On Mon, Nov 4, 2019 at 4:00 PM James Molloy <james at
jamesmolloy.co.uk>
>> wrote:
>>
>>> Hi,
>>>
>>> To throw my hat into the ring, we also maintain a downstream target
that
>>> has a subset of this problem - it has non-8-bit-addressable memory.
This
>>> means that %uglygeps don't work, pointers can't be casted
to i8* and expect
>>> to be byte-addressed (conversions to memcpy/memset that use i8*
aren't
>>> welcome either), and GEP lowering code is slightly different.
>>>
>>> We maintain this currently as a pile of known technical debt. We
have a
>>> CGP pass to decompose GEPs and custom-expand them taking our word
size into
>>> account, but everything before that is just waiting for InstCombine
to
>>> break something. AliasAnalysis also makes offsetting assumptions
that are
>>> totally bogus because our pointers are word-addressed and therefore
so are
>>> pointer offsets.
>>>
>>> We'd be really keen to help here. We're keen to upstream
anything we
>>> possibly can (and have been, over the past few months). We've
have several
>>> discussions about how best to approach upstream with this, and the
sticking
>>> point has always been lack of a testing target. It's always
felt to me that
>>> the idea of addressable granule should be a fairly reasonable
DataLayout
>>> addition; We can test DataLayout changes purely via opt without
requiring a
>>> target that uses them. Lowering to instructions was always the
testing
>>> sticking point.
>>>
>>> We'd be keen to help out what the community decides to do here.
I
>>> personally feel it's reasonable that:
>>>   - LangRef/DataLayout is updated with semantically coherent
changes.
>>>   - The midend optimizer is updated by someone who cares about
those
>>> changes and tests are added that use the new DataLayout.
>>>   - Developers that don't care about those changes maintain a
>>> best-effort approach, which is exactly what we do right now; there
are
>>> features that are tested but are still esoteric enough that I might
>>> reasonably break them without realising (OperandBundles come to
mind), so I
>>> don't think there's any change in mindset here.
>>>   - Developers that care perform downstream testing and provide
review
>>> feedback / revert if really bad / fixes. Again, this is how LLVM
works
>>> right now - I'd guess that >80% of our real-world test
coverage comes from
>>> downstream users deploying ToT LLVM rather than the upstream LIT
tests /
>>> builders.
>>>
>>
>> This last bit is the bit I'd worry about if the bug were only
visible in
>> an out-of-tree target. If it's visible with a DataLayout change
that can be
>> tested upstream, then I'd be OK with an upstream patch being
reverted given
>> a test case with custom DataLayout showing the failure.
>>
>> But if the failure is for, say, non-8-bit-bytes that are only actually
>> exercised downstream, I'm less OK with patches being reverted
upstream no
>> matter the severity of the breakage to such targets downstream.
>>
>
> Suppose that the failure is build breakage for the downstream target. This
> could cause the entirety of the downstream LLVM/Clang build to break in a
> way that isn’t easy to untangle or fix. This could prevent an entire
> organization from integrating LLVM for an extended period of time. So it
> isn’t necessarily “no matter the severity of the breakage for such
> *targets* downstream” but could be the entire organization unless a
> tactical downstream revert of the offending change can be made.
>
That's already the world they live in today having out of tree targets and
especially ones with non-8-bit-bytes. I don't think upstream can sign up
for any stronger support here in that situation.

>
> — Sean Silva
>
>
>> - Dave
>>
>>
>>
>>>
>>> Cheers,
>>>
>>> James
>>>
>>> On Mon, 4 Nov 2019 at 12:16, David Blaikie via llvm-dev <
>>> llvm-dev at lists.llvm.org> wrote:
>>>
>>>>
>>>>
>>>> On Sat, Nov 2, 2019 at 12:45 AM Jorg Brown via llvm-dev <
>>>> llvm-dev at lists.llvm.org> wrote:
>>>>
>>>>> On Fri, Nov 1, 2019 at 8:40 AM Adrian Prantl via llvm-dev
<
>>>>> llvm-dev at lists.llvm.org> wrote:
>>>>>
>>>>>> > On Nov 1, 2019, at 3:41 AM, Dmitriy Borisenkov via
llvm-dev <
>>>>>> llvm-dev at lists.llvm.org> wrote:
>>>>>> > It seems that there are two possible solutions on
how to move
>>>>>> forward with non 8 bits byte:
>>>>>> >
>>>>>> > 1. Commit changes without tests. Chris Lattner,
Mikael Holmen,
>>>>>> Jeroen Dobbelaere, Jesper Antonsson support this idea.
>>>>>> > James Y Knight says that at least magic numbers
should be removed
>>>>>> "at least where it arguably helps code
clarity". This might be not exactly
>>>>>> the scope of the changes discussed, but it's
probably worth do discuss code
>>>>>> clarity having concrete patches.
>>>>>> > GCC (according to James Y Knight) has the same
practice meaning
>>>>>> non-8 bits byte is supported but there are no tests in
upstream and we have
>>>>>> downstream contributors who will fix the bugs if they
appear in the LLVM
>>>>>> core.
>>>>>> > David Chisnall raised a question about what to
count as a byte
>>>>>> (which defines the scope of the changes) and we suggest
to use all 5
>>>>>> criteria he granted:
>>>>>> > > - The smallest unit that can be loaded /
stored at a time.
>>>>>> > > - The smallest unit that can be addressed
with a raw pointer in a
>>>>>> specific address space.
>>>>>> > > - The largest unit whose encoding is opaque
to anything above the
>>>>>> ISA.
>>>>>> > > - The type used to represent `char` in C.
>>>>>> > > - The type that has a size that all other
types are a multiple of.
>>>>>> > But if DSPs are less restrictive about byte, some
of the criteria
>>>>>> could be removed.
>>>>>> >
>>>>>> > 2. Use an iconic target. PDP10 was suggested as a
candidate. This
>>>>>> opinion found support from Tim Northover, Joerg
Sonenberger, Mehdi AMINI,
>>>>>> Philip Reames. It's not clear though does this
opinion oppose upstreaming
>>>>>> non-8-bits byte without tests or just a dummy and TVM
targets options.
>>>>>> >
>>>>>> > So if there is no strong opposition to the
solution 1 from the
>>>>>> people supporting an iconic target option, we could
probably move to the
>>>>>> patches.
>>>>>>
>>>>>> I'm in camp (2). Any changes that are not tested
are an invitation to
>>>>>> upstream developers to "simplify" the code,
not knowing that those changes
>>>>>> are important. Anyone who commits untested changes to
LLVM will inevitably
>>>>>> face an uphill battle against benevolent NFC
refactorings that break these
>>>>>> changes because the expectation of how the code is
supposed to behave is
>>>>>> not codified in a test. In the short term option (1)
sounds more appealing
>>>>>> because they can start right away, but I'm going to
predict that it will be
>>>>>> more expensive for the downstream maintainers of non
8-bit targets in the
>>>>>> long term.
>>>>>>
>>>>>
>>>>> I've worked on multiple codebases where an option
existed in order to
>>>>> satisfy an extremely small userbase, with little or no
testing,
>>>>>
>>>>
>>>> In those situations, were the core developers responsible for
those
>>>> features/users? Yeah, if I needed to support a certain
observable feature
>>>> of clang continuing to work, I'd want tests (I'm pretty
serious about
>>>> testing, FWIW).
>>>>
>>>>
>>>>> and as such, I'm adamantly opposed to repeating it.
>>>>>
>>>>
>>>>>
>>>> In addition to what Adrian said, where the weird option exists
but is
>>>>> constantly being incidentally broken, I've seen the
opposite problem:
>>>>> people become afraid to refactor a section of code because
it might break
>>>>> the weird option.  You might say "if there aren't
any tests, people should
>>>>> feel free to refactor the code; their only responsibility
is to make sure
>>>>> the tests will still work", but honestly, I've
seen the opposite: without
>>>>> tests, it's just presumed that touching certain parts
of code is likely to
>>>>> break things, and so after a time, an aura of
untouchability starts to
>>>>> surround regions of the code.  And then, the more time goes
by, the more
>>>>> that code becomes unfamiliar to everyone (because no one is
actively
>>>>> maintaining it).  In the long run, the cost of an
unmaintained option may
>>>>> be far more than the cost of a maintained one.
>>>>>
>>>>
>>>>
>>>> I'm not actually opposed to this situation - LLVM as a
project is
>>>> pretty happy about making big structural changes to the
codebase & holding
>>>> the test coverage and downstream users accountable for ensuring
quality. We
>>>> rarely avoid changes due to risk of breakage & as a
community push back a
>>>> fair bit on reviewers suggesting we should - if someone
can't demonstrate
>>>> the breakage in an upstream test (or has a pretty good track
record of true
>>>> positives that may take some time to investigate - and thus it
might be
>>>> better in the short term to revert while waiting for that
evidence to be
>>>> provided) the changes tend to go in and stay in.
>>>>
>>>> Yeah, I think some parts of the code may become complicated
enough to
>>>> warrant separate testing - but most of the code that might move
to
>>>> constants for byte width, iterate over bits to that byte width,
etc, will
>>>> be tested on one value & might have bugs on other values
that will be found
>>>> (or not) downstream - best effort and all. But in cases where
the code to
>>>> handle novel byte widths becomes more complicated - some
abstraction and
>>>> unit testing would seem quite appropriate.
>>>>
>>>>
>>>>
>>>>> In short: please don't commit changes without tests. 
Even if the test
>>>>> is nothing but making sure this works:
>>>>>
>>>>> int main(int argc, char *argv[]) {
>>>>>   return argv[argc - 1][0];
>>>>> }
>>>>>
>>>>> That at least would give some freedom from the guilt of
breaking
>>>>> something important.
>>>>>
>>>>
>>>> It's hard to make sure that works in a meaningful sense in
this context
>>>> - without a non-8-bit-byte target in upstream LLVM, which is
the point of
>>>> contention/discussion. It's unclear if there's a
suitable target/community
>>>> to provide/maintain such a target upstream. I don't think
there's a
>>>> "cheap"/stub/trivial target we could create that
would provide what you're
>>>> suggesting without bitrotting quickly and being removed (more
quickly than,
>>>> I think, the sort of patches to support but not provide,
non-8-bit-byte
>>>> targets).
>>>>
>>>> Though it's hard to guess without seeing the sort of
patches that'd be
>>>> needed.
>>>>
>>>> - Dave
>>>>
>>>>
>>>>>
>>>>> -- Jorg
>>>>>
>>>>> _______________________________________________
>>>>> LLVM Developers mailing list
>>>>> llvm-dev at lists.llvm.org
>>>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>>
>>>> _______________________________________________
>>>> LLVM Developers mailing list
>>>> llvm-dev at lists.llvm.org
>>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>
>>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20191105/0079b78b/attachment.html>

llvm dev - Nov 2019 - RFC: On non 8-bit bytes and the target for it

[llvm-dev] RFC: On non 8-bit bytes and the target for it

[llvm-dev] RFC: On non 8-bit bytes and the target for it

[llvm-dev] RFC: On non 8-bit bytes and the target for it

[llvm-dev] RFC: On non 8-bit bytes and the target for it