thr3ads.net - llvm dev - [llvm-dev] RFC: On removing magic numbers assuming 8-bit bytes [May 2019]

If this information is useful, please help other people find it:
Share via:

Jesper Antonsson via llvm-dev

2019-May-09 12:29 UTC

[llvm-dev] RFC: On removing magic numbers assuming 8-bit bytes

On Wed, 2019-05-08 at 11:12 -0700, Philip Reames wrote:> On 5/8/19 1:25 AM, Jesper Antonsson wrote:
> > On Mon, 2019-05-06 at 15:56 -0700, Philip Reames via llvm-dev
> > wrote:
> > > On 5/6/19 2:43 AM, Tim Northover via llvm-dev wrote:
> > > > On Mon, 6 May 2019 at 10:13, James Courtier-Dutton via
llvm-dev
> > > > <llvm-dev at lists.llvm.org> wrote:
> > > > > Although the above is mentioning bytes, looking at the
"/
> > > > > 8"   and "& 0x7" makes it look like
the author meant octets
> > > > > and
> > > > > not bytes.
> > > > > Bytes can be any size of bits.
> > > > 
> > > > I don't think you'll have much luck trying to make
that stick
> > > > for a
> > > > general audience, or even a general compiler-writer
audience.
> > > > Byte
> > > > is
> > > > far too strongly associated with 8 bits these days.
> > > 
> > > +1 Please don't try; insisting on a distinction will confuse
many
> > > new
> > > contributors.
> > 
> > Yes, my interpretation is that the community is leaning toward
> > addressable unit as terminology.
> > 
> > > > > Octets are only ever 8 bits.
> > > > 
> > > > You might be able to convert all uses of byte to octet and
> > > > abandon
> > > > byte entirely, but at that point why bother? It feels like a
> > > > change
> > > > just for the sake of pedantry.
> > > > 
> > > > I like the "addressable unit" name, though
it's a bit long
> > > > (AddrUnit
> > > > seems OK). It at least signals to a reader that there might
be
> > > > something weird going on. Getting someone writing new code
to
> > > > think
> > > > in
> > > > those terms is a different matter, of course, but I
don't think
> > > > any
> > > > of
> > > > the changes under discussion really help there.
> > > > 
> > > > BTW, is there an open source backend (in a fork, I assume)
that
> > > > does
> > > > this? So that we can get some kind of idea of the real scope
of
> > > > the
> > > > changes needed.
> > > 
> > > Strongly agreed.
> > > 
> > > My personal take is this is an invasive enough change with enough
> > > likely
> > > ongoing maintenance fall out to require substantial justification
> > > before
> > > the work was undertaken upstream.
> > 
> > My hope and belief is that having good names instead of these
> > magical
> > numbers won't be a burden but rather an improvement long-term. 
> > 
> > >   A open source backend proposed for
> > > inclusion upstream would be one part of that.
> > 
> > That is not on the table right now. However, as the work required
> > to
> > make such an inclusion happen will be reduced by this cleanup, the
> > likelihood that it happens in the future should increase.
> 
> Given this, I'm not sure the community as a whole should take on the
> burden of supporting non b-byte addressable units.  I see this as a
> precondition.  To be clear, I don't care *which* backend there is,
> doesn't have to be yours, but the presence of at least one would seem
> necessary for testing if nothing else.
I agree that an in-tree target is needed for actual support. However,
we're merely suggesting a gradual cleanup of magic numbers in order to
make the code a bit more readable and make life easier for a number of
downstream targets. It will not result in support, but it would make
any effort to create support (or maintain support downstream)
significantly smaller. This would also make it a bit more likely that
LLVM is the compiler of choice for such targets, some of which might
want to upstream eventually.

The onus is on interested parties to maintain any gains, and Ericsson
is offering to do that in a no-drama way with the help of other
companies that have voiced their interest. We continuously merge and
test against top of tree and would act accordingly, if allowed.

As the discussion is subsiding, I'm unsure about how to conclude this
RFC. Several parties have said they support this effort, others have
pitched in with suggestions on terminology and such (which perhaps
indicates that they are not opposed in general). JF Bastien and you ask
for in-tree targets, although JF did indicate that it made sense to
first clean up.

On "byte" vs "addressable unit", we've been thinking a
bit and are
leaning toward staying with the prevalent "byte" terminology for as
long as upstream is 8-bit-only to avoid mixed terminology or larger
patches. However, we're flexible on this, and I've uploaded a twin
patch (in D61725) to my original showcase showing how "addressable
unit" could look. 
> > 
> > >   Active contribution from
> > > the sponsors in other areas would also be a key factor.
> > 
> > I'm not sure how to interpret that, but our team here at Ericsson
> > is
> > fairly large, we have been working with this out-of-tree backend
> > since
> > 2011 and as a group, we contribute to upstream e.g. by helping out
> > with
> > the fixedpoint upstreaming, by solving and filing TRs (we're
pretty
> > good at testing I'd say), improving debug information and more.
> 
> Ok.
> 
> p.s. If my wording came across as implying any disrespect, sorry!  I
> was
> making a general point, not thinking about how it might be read in
> context. 
No problem, thanks!
> 
> > 
> > > > Cheers.
> > > > 
> > > > Tim.
> > > > _______________________________________________
> > > > LLVM Developers mailing list
> > > > llvm-dev at lists.llvm.org
> > > > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> > > 
> > > _______________________________________________
> > > LLVM Developers mailing list
> > > llvm-dev at lists.llvm.org
> > > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

JF Bastien via llvm-dev

2019-May-09 17:29 UTC

head link

[llvm-dev] RFC: On removing magic numbers assuming 8-bit bytes

> On May 9, 2019, at 5:29 AM, Jesper Antonsson via llvm-dev <llvm-dev at
lists.llvm.org> wrote:
> 
> On Wed, 2019-05-08 at 11:12 -0700, Philip Reames wrote:
>> On 5/8/19 1:25 AM, Jesper Antonsson wrote:
>>> On Mon, 2019-05-06 at 15:56 -0700, Philip Reames via llvm-dev
>>> wrote:
>>>> On 5/6/19 2:43 AM, Tim Northover via llvm-dev wrote:
>>>>> On Mon, 6 May 2019 at 10:13, James Courtier-Dutton via
llvm-dev
>>>>> <llvm-dev at lists.llvm.org> wrote:
>>>>>> Although the above is mentioning bytes, looking at the
"/
>>>>>> 8"   and "& 0x7" makes it look like
the author meant octets
>>>>>> and
>>>>>> not bytes.
>>>>>> Bytes can be any size of bits.
>>>>> 
>>>>> I don't think you'll have much luck trying to make
that stick
>>>>> for a
>>>>> general audience, or even a general compiler-writer
audience.
>>>>> Byte
>>>>> is
>>>>> far too strongly associated with 8 bits these days.
>>>> 
>>>> +1 Please don't try; insisting on a distinction will
confuse many
>>>> new
>>>> contributors.
>>> 
>>> Yes, my interpretation is that the community is leaning toward
>>> addressable unit as terminology.
>>> 
>>>>>> Octets are only ever 8 bits.
>>>>> 
>>>>> You might be able to convert all uses of byte to octet and
>>>>> abandon
>>>>> byte entirely, but at that point why bother? It feels like
a
>>>>> change
>>>>> just for the sake of pedantry.
>>>>> 
>>>>> I like the "addressable unit" name, though
it's a bit long
>>>>> (AddrUnit
>>>>> seems OK). It at least signals to a reader that there might
be
>>>>> something weird going on. Getting someone writing new code
to
>>>>> think
>>>>> in
>>>>> those terms is a different matter, of course, but I
don't think
>>>>> any
>>>>> of
>>>>> the changes under discussion really help there.
>>>>> 
>>>>> BTW, is there an open source backend (in a fork, I assume)
that
>>>>> does
>>>>> this? So that we can get some kind of idea of the real
scope of
>>>>> the
>>>>> changes needed.
>>>> 
>>>> Strongly agreed.
>>>> 
>>>> My personal take is this is an invasive enough change with
enough
>>>> likely
>>>> ongoing maintenance fall out to require substantial
justification
>>>> before
>>>> the work was undertaken upstream.
>>> 
>>> My hope and belief is that having good names instead of these
>>> magical
>>> numbers won't be a burden but rather an improvement long-term. 
>>> 
>>>>  A open source backend proposed for
>>>> inclusion upstream would be one part of that.
>>> 
>>> That is not on the table right now. However, as the work required
>>> to
>>> make such an inclusion happen will be reduced by this cleanup, the
>>> likelihood that it happens in the future should increase.
>> 
>> Given this, I'm not sure the community as a whole should take on
the
>> burden of supporting non b-byte addressable units.  I see this as a
>> precondition.  To be clear, I don't care *which* backend there is,
>> doesn't have to be yours, but the presence of at least one would
seem
>> necessary for testing if nothing else.
> 
> I agree that an in-tree target is needed for actual support. However,
> we're merely suggesting a gradual cleanup of magic numbers in order to
> make the code a bit more readable and make life easier for a number of
> downstream targets. It will not result in support, but it would make
> any effort to create support (or maintain support downstream)
> significantly smaller. This would also make it a bit more likely that
> LLVM is the compiler of choice for such targets, some of which might
> want to upstream eventually.
> 
> The onus is on interested parties to maintain any gains, and Ericsson
> is offering to do that in a no-drama way with the help of other
> companies that have voiced their interest. We continuously merge and
> test against top of tree and would act accordingly, if allowed.
> 
> As the discussion is subsiding, I'm unsure about how to conclude this
> RFC. Several parties have said they support this effort, others have
> pitched in with suggestions on terminology and such (which perhaps
> indicates that they are not opposed in general). JF Bastien and you ask
> for in-tree targets, although JF did indicate that it made sense to
> first clean up.
I don’t think you have consensus to move forward at this point in time. My
expectation, which I think represents LLVM’s historical approach, is that a path
to full support be planned out before this effort starts. Concretely, I expect a
real-world backend to be committed to LLVM as a necessary step. What I meant
upthread was: yes it makes sense to do cleanups before landing a backend, but
someone has to commit to upstreaming a backend before you start the cleanups.
When I say a backend I don’t mean a toy, I mean a real backend.

Right now we have no commitment on anybody landing a backend, and we don’t
really have a concrete idea of what you’re even proposing to change or how.
You’re focusing on “magic numbers” like everyone agrees 8 is the root of all
evil, and it’s really not. Let’s say someone promises to upstream a backend,
what concretely do you need to change, and in which projects, to get there? Are
you changing clang, and how? What about libc++? Linker? LLVM, and how? Is IR
going to change? If not, do you keep all the i8* around, and how do you work
around not having void* in IR?

The above is, I think, necessary but not sufficient to moving forward.

> On "byte" vs "addressable unit", we've been
thinking a bit and are
> leaning toward staying with the prevalent "byte" terminology for
as
> long as upstream is 8-bit-only to avoid mixed terminology or larger
> patches. However, we're flexible on this, and I've uploaded a twin
> patch (in D61725) to my original showcase showing how "addressable
> unit" could look. 
> 
>>> 
>>>>  Active contribution from
>>>> the sponsors in other areas would also be a key factor.
>>> 
>>> I'm not sure how to interpret that, but our team here at
Ericsson
>>> is
>>> fairly large, we have been working with this out-of-tree backend
>>> since
>>> 2011 and as a group, we contribute to upstream e.g. by helping out
>>> with
>>> the fixedpoint upstreaming, by solving and filing TRs (we're
pretty
>>> good at testing I'd say), improving debug information and more.
>> 
>> Ok.
>> 
>> p.s. If my wording came across as implying any disrespect, sorry!  I
>> was
>> making a general point, not thinking about how it might be read in
>> context. 
> 
> No problem, thanks!
> 
>> 
>>> 
>>>>> Cheers.
>>>>> 
>>>>> Tim.
>>>>> _______________________________________________
>>>>> LLVM Developers mailing list
>>>>> llvm-dev at lists.llvm.org
>>>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>> 
>>>> _______________________________________________
>>>> LLVM Developers mailing list
>>>> llvm-dev at lists.llvm.org
>>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20190509/0b1134d5/attachment-0001.html>

Eric Christopher via llvm-dev

2019-May-09 19:46 UTC

head link

[llvm-dev] RFC: On removing magic numbers assuming 8-bit bytes

I agree that consensus seems to be missing. There's definitely some
assumptions, and more in particular, API and usage assumptions around
8 bit bytes in the backends. Also: How do you plan on keeping these
assumptions from creeping back in?

-eric

On Thu, May 9, 2019 at 10:30 AM JF Bastien via llvm-dev
<llvm-dev at lists.llvm.org> wrote:>
>
>
> On May 9, 2019, at 5:29 AM, Jesper Antonsson via llvm-dev <llvm-dev at
lists.llvm.org> wrote:
>
> On Wed, 2019-05-08 at 11:12 -0700, Philip Reames wrote:
>
> On 5/8/19 1:25 AM, Jesper Antonsson wrote:
>
> On Mon, 2019-05-06 at 15:56 -0700, Philip Reames via llvm-dev
> wrote:
>
> On 5/6/19 2:43 AM, Tim Northover via llvm-dev wrote:
>
> On Mon, 6 May 2019 at 10:13, James Courtier-Dutton via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
>
> Although the above is mentioning bytes, looking at the "/
> 8"   and "& 0x7" makes it look like the author meant
octets
> and
> not bytes.
> Bytes can be any size of bits.
>
>
> I don't think you'll have much luck trying to make that stick
> for a
> general audience, or even a general compiler-writer audience.
> Byte
> is
> far too strongly associated with 8 bits these days.
>
>
> +1 Please don't try; insisting on a distinction will confuse many
> new
> contributors.
>
>
> Yes, my interpretation is that the community is leaning toward
> addressable unit as terminology.
>
> Octets are only ever 8 bits.
>
>
> You might be able to convert all uses of byte to octet and
> abandon
> byte entirely, but at that point why bother? It feels like a
> change
> just for the sake of pedantry.
>
> I like the "addressable unit" name, though it's a bit long
> (AddrUnit
> seems OK). It at least signals to a reader that there might be
> something weird going on. Getting someone writing new code to
> think
> in
> those terms is a different matter, of course, but I don't think
> any
> of
> the changes under discussion really help there.
>
> BTW, is there an open source backend (in a fork, I assume) that
> does
> this? So that we can get some kind of idea of the real scope of
> the
> changes needed.
>
>
> Strongly agreed.
>
> My personal take is this is an invasive enough change with enough
> likely
> ongoing maintenance fall out to require substantial justification
> before
> the work was undertaken upstream.
>
>
> My hope and belief is that having good names instead of these
> magical
> numbers won't be a burden but rather an improvement long-term.
>
>  A open source backend proposed for
> inclusion upstream would be one part of that.
>
>
> That is not on the table right now. However, as the work required
> to
> make such an inclusion happen will be reduced by this cleanup, the
> likelihood that it happens in the future should increase.
>
>
> Given this, I'm not sure the community as a whole should take on the
> burden of supporting non b-byte addressable units.  I see this as a
> precondition.  To be clear, I don't care *which* backend there is,
> doesn't have to be yours, but the presence of at least one would seem
> necessary for testing if nothing else.
>
>
> I agree that an in-tree target is needed for actual support. However,
> we're merely suggesting a gradual cleanup of magic numbers in order to
> make the code a bit more readable and make life easier for a number of
> downstream targets. It will not result in support, but it would make
> any effort to create support (or maintain support downstream)
> significantly smaller. This would also make it a bit more likely that
> LLVM is the compiler of choice for such targets, some of which might
> want to upstream eventually.
>
> The onus is on interested parties to maintain any gains, and Ericsson
> is offering to do that in a no-drama way with the help of other
> companies that have voiced their interest. We continuously merge and
> test against top of tree and would act accordingly, if allowed.
>
> As the discussion is subsiding, I'm unsure about how to conclude this
> RFC. Several parties have said they support this effort, others have
> pitched in with suggestions on terminology and such (which perhaps
> indicates that they are not opposed in general). JF Bastien and you ask
> for in-tree targets, although JF did indicate that it made sense to
> first clean up.
>
>
> I don’t think you have consensus to move forward at this point in time. My
expectation, which I think represents LLVM’s historical approach, is that a path
to full support be planned out before this effort starts. Concretely, I expect a
real-world backend to be committed to LLVM as a necessary step. What I meant
upthread was: yes it makes sense to do cleanups before landing a backend, but
someone has to commit to upstreaming a backend before you start the cleanups.
When I say a backend I don’t mean a toy, I mean a real backend.
>
> Right now we have no commitment on anybody landing a backend, and we don’t
really have a concrete idea of what you’re even proposing to change or how.
You’re focusing on “magic numbers” like everyone agrees 8 is the root of all
evil, and it’s really not. Let’s say someone promises to upstream a backend,
what concretely do you need to change, and in which projects, to get there? Are
you changing clang, and how? What about libc++? Linker? LLVM, and how? Is IR
going to change? If not, do you keep all the i8* around, and how do you work
around not having void* in IR?
>
> The above is, I think, necessary but not sufficient to moving forward.
>
>
> On "byte" vs "addressable unit", we've been
thinking a bit and are
> leaning toward staying with the prevalent "byte" terminology for
as
> long as upstream is 8-bit-only to avoid mixed terminology or larger
> patches. However, we're flexible on this, and I've uploaded a twin
> patch (in D61725) to my original showcase showing how "addressable
> unit" could look.
>
>
>  Active contribution from
> the sponsors in other areas would also be a key factor.
>
>
> I'm not sure how to interpret that, but our team here at Ericsson
> is
> fairly large, we have been working with this out-of-tree backend
> since
> 2011 and as a group, we contribute to upstream e.g. by helping out
> with
> the fixedpoint upstreaming, by solving and filing TRs (we're pretty
> good at testing I'd say), improving debug information and more.
>
>
> Ok.
>
> p.s. If my wording came across as implying any disrespect, sorry!  I
> was
> making a general point, not thinking about how it might be read in
> context.
>
>
> No problem, thanks!
>
>
>
> Cheers.
>
> Tim.
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

James Y Knight via llvm-dev

2019-May-09 19:59 UTC

head link

[llvm-dev] RFC: On removing magic numbers assuming 8-bit bytes

*From: *JF Bastien via llvm-dev <llvm-dev at lists.llvm.org>
*Date: *Thu, May 9, 2019 at 1:30 PM
*To: *Jesper Antonsson
*Cc: *dag at cray.com, llvm-dev at lists.llvm.org

I don’t think you have consensus to move forward at this point in time.
My> expectation, which I think represents LLVM’s historical approach, is that a
> path to full support be planned out before this effort starts. Concretely,
> I expect a real-world backend to be committed to LLVM as a necessary step.
> What I meant upthread was: yes it makes sense to do cleanups before landing
> a backend, but *someone* has to commit to upstreaming a backend before
> you start the cleanups. When I say a backend I don’t mean a toy, I mean a
> real backend.
>
IMO, the argument of removing "magic constants", and replacing them
with a
semantically meaningful value does have some merit, even lacking any
backend that uses a number other than "8".

I guess I'd say that my feeling is that if llvm's codebase already had a
"bitsPerUnit()" accessor (as GCC calls this concept), I would probably
not
say that we should replace all the calls with literal '8's across the
code-base simply because no current target uses any value other than 8.
It's an abstraction which does make sense on its own, even if it's
currently always 8.

So, I'm somewhat agnostic here -- the change itself has little value
upstream, but if the state of the codebase afterwards is not degraded by
the change (and on the contrary, can be ever-so-marginally improved), that
seems like it could be basically okay.

Right now we have no commitment on anybody landing a backend, and we
don’t> really have a concrete idea of what you’re even proposing to change or how.
> You’re focusing on “magic numbers” like everyone agrees 8 is the root of
> all evil, and it’s really not. Let’s say someone promises to upstream a
> backend, what concretely do you need to change, and in which projects, to
> get there? Are you changing clang, and how? What about libc++? Linker?
> LLVM, and how? Is IR going to change? If not, do you keep all the i8*
> around, and how do you work around not having void* in IR?
>
This is a great list of questions that it would be good to have answers to,
though.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20190509/e3e635c5/attachment.html>

Reasonably Related Threads

Search for more apparently analagous threads

llvm dev - May 2019 - RFC: On removing magic numbers assuming 8-bit bytes

[llvm-dev] RFC: On removing magic numbers assuming 8-bit bytes

[llvm-dev] RFC: On removing magic numbers assuming 8-bit bytes

[llvm-dev] RFC: On removing magic numbers assuming 8-bit bytes

[llvm-dev] RFC: On removing magic numbers assuming 8-bit bytes

Reasonably Related Threads