thr3ads.net - llvm dev - [llvm-dev] RFC: Harvard architectures and default address spaces [Jul 2017]

Björn Pettersson A via llvm-dev

2017-Jul-13 17:25 UTC

[llvm-dev] RFC: Harvard architectures and default address spaces

> -----Original Message-----
> From: Hal Finkel [mailto:hfinkel at anl.gov]
> Sent: 13 July 2017 16:01
> To: Björn Pettersson A <bjorn.a.pettersson at ericsson.com>; David Chisnall
> <David.Chisnall at cl.cam.ac.uk>; Dylan McKay <me at dylanmckay.io>
> Cc: llvm-dev at lists.llvm.org; Carl Peto <carl.peto at me.com>
> Subject: Re: [llvm-dev] RFC: Harvard architectures and default address
> spaces
>
> On 07/13/2017 05:38 AM, Björn Pettersson A via llvm-dev wrote:
> > My experience of having the address space for functions (or function
> > pointers) in the DataLayout is that when the .ll file is parsed we need to
> > parse the DataLayout before any function declarations. That is needed
> > because we want to attribute the functions with the correct address space
> > (according to DataLayout) when inserting them in the symbol table. An
> > alternative would be to update address space info for functions after
> > having parsed the DataLayout.
> >
> > Is the DataLayout normally used when parsing the .ll file (or .bc)? Or
> > would this be the first case of doing that?
> >
> > Is it guaranteed that DataLayout is specified/parsed before function
> > declaration, or that the DataLayout specification is context sensitive and
> > only is valid for the following declarations?
>
> The DataLayout is a required part of the .ll/.bc file. In the .ll file
> (*), it's the part of the module that looks like this:
>
>    target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
>
> it is global to the entire module and always available.
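
For readers unfamiliar with the string quoted above, here is a minimal Python sketch that splits it into its '-'-separated specifications. The function name `describe_datalayout` and the dictionary keys are illustrative and not part of LLVM. Only a few well-known specs are interpreted (`e`/`E` for endianness, `n` for native integer widths, `S` for stack alignment in bits); everything else is kept verbatim.

```python
def describe_datalayout(layout: str) -> dict:
    """Toy reader for a datalayout string; interprets only a handful of specs."""
    info = {
        "endian": None,            # "little" or "big"
        "native_int_widths": [],   # from the n<size>:<size>... spec
        "stack_align_bits": None,  # from the S<size> spec
        "other": [],               # specs this sketch does not interpret
    }
    for spec in layout.split("-"):
        if spec == "e":
            info["endian"] = "little"
        elif spec == "E":
            info["endian"] = "big"
        elif spec.startswith("n") and not spec.startswith("ni"):
            info["native_int_widths"] = [int(w) for w in spec[1:].split(":")]
        elif spec.startswith("S"):
            info["stack_align_bits"] = int(spec[1:])
        elif spec:
            info["other"].append(spec)
    return info


layout_info = describe_datalayout("e-m:e-i64:64-f80:128-n8:16:32:64-S128")
```

The relevant point for this thread is that none of this is known to a reader of the file until the `target datalayout` line itself has been consumed.
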
My point was that the DataLayout isn't available inside the LLParser and the
BitcodeReader, up until the point when it has been parsed. So I would not say
"always available".

Both LLParser and BitcodeReader are, for example, using getAddressSpace(), both
directly and maybe also indirectly through different interfaces (perhaps not on
function pointers though). My concern is that the incorrect address space
might be used while parsing, and then it might be hard to find all the places
to fix up at a later stage, once the DataLayout has been parsed and we know
what the default address space really is.
Alternatively, if there is some undocumented(?) rule that the DataLayout always
comes before function declarations in the ll/bc file, then all functions can
get the default address space attribute directly (as indicated by the
DataLayout) when being parsed.

I think the whole idea from Dylan was to do this as a fixup after
LLParser/BitcodeReader, i.e. not trying to look up a function's address space
already when parsing the function declaration. So then the rule would be: do
not use Pointer<Function>::getAddressSpace(), or PointerType::get() etc.
during ll/bc parsing because it might give the wrong result. Maybe it is
possible to assert on that?

We could of course give functions some kind of undefined address space value
when parsing ll/bc and adding functions to the symbol table. That might help us
catch situations when someone tries to fetch the address space for a function
pointer before the
set-default-address-space-as-given-by-datalayout-on-all-functions pass has
executed.
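
The "undefined address space" idea can be sketched as a toy model in Python. Nothing here is LLVM API: the `Function` class, the `UNRESOLVED` sentinel, the `default-function-addrspace` line format and the `apply_default_addrspace` step are all hypothetical names, chosen only to show how a sentinel would turn a premature query into a loud failure instead of a silently wrong answer.

```python
UNRESOLVED = object()  # hypothetical sentinel: address space not known yet


class Function:
    """Toy stand-in for a parsed function declaration."""

    def __init__(self, name):
        self.name = name
        self._addrspace = UNRESOLVED

    @property
    def addrspace(self):
        # Reading the address space before the fixup step is a bug; fail loudly.
        if self._addrspace is UNRESOLVED:
            raise RuntimeError(f"address space of @{self.name} read before fixup")
        return self._addrspace


def parse_module(lines):
    """Toy parser: collects declarations and the default in file order."""
    functions, default_as = [], 0
    for line in lines:
        line = line.strip()
        if line.startswith("declare "):
            name = line.split("@", 1)[1].split("(", 1)[0]
            functions.append(Function(name))
        elif line.startswith("default-function-addrspace "):
            # Hypothetical stand-in for information carried by the DataLayout.
            default_as = int(line.split()[1])
    return functions, default_as


def apply_default_addrspace(functions, default_as):
    """The fixup step: resolve every function still carrying the sentinel."""
    for fn in functions:
        if fn._addrspace is UNRESOLVED:
            fn._addrspace = default_as
```

In this model the default may appear after the declarations in the input, which is exactly the ordering problem described above: the declarations are created first, and only the later fixup step makes `addrspace` readable.
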
>
> (*) It is true that you can write tests without specifying one of these,
> but in such cases, you just get the builtin default. For all real cases
> you'll need to have a target-appropriate DataLayout string.
>
> > What if there are several address spaces for functions? Or is that a
> > silly thing that no one ever will use? Having the address space specified
> > in the DataLayout would be insufficient, since we would need to attribute
> > the functions separately, right?
> >
> > I do not say that having the info in the DataLayout is a totally bad idea
> > (since our out-of-tree target is using that trick), but I think it might
> > impose some problems as well. And perhaps it isn't the most general
> > solution.
>
> If different functions might be in different address spaces, you'll need
> some other mechanism to set the address space (as a single default won't
> suffice). You might use source-level function attributes, for example.
>
>   -Hal
>
> >
> > /Björn
> >
> >> -----Original Message-----
> >> From: llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org] On Behalf Of
> >> David Chisnall via llvm-dev
> >> Sent: 12 July 2017 17:26
> >> To: Dylan McKay <me at dylanmckay.io>
> >> Cc: llvm-dev <llvm-dev at lists.llvm.org>; Carl Peto <carl.peto at me.com>
> >> Subject: Re: [llvm-dev] RFC: Harvard architectures and default address
> >> spaces
> >>
> >> On 11 Jul 2017, at 23:18, Dylan McKay via llvm-dev
> >> <llvm-dev at lists.llvm.org> wrote:
> >>>> Add this information to DataLayout and to use that information in
> >>>> relevant places.
> >>> This sounds like a much better/cleaner idea, thanks!
> >> I’d suggest taking a look at the alloca address space changes, which were
> >> recently added based on a cleaned-up version of our code.  We have a
> >> similar issue (function and data pointers have the same representation for
> >> us, but casting requires different handling[1]) and have considered adding
> >> address spaces to functions.
> >>
> >> David
> >>
> >> [1] Probably not relevant for this discussion, but if anyone cares: in our
> >> world we have 128-bit fat pointers that contain base, bounds and
> >> permissions, and 64-bit pointers that are implicitly relative to one of
> >> two special fat pointer registers, one for code and one for data.  We must
> >> therefore handle 64-bit to 128-bit pointer casts differently depending on
> >> whether we’re casting code or data pointers.  We currently do this with
> >> some fairly ugly hacks, but being able to put all functions in a different
> >> AS would make this much easier for us.
> >> _______________________________________________
> >> LLVM Developers mailing list
> >> llvm-dev at lists.llvm.org
> >> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> 
> --
> Hal Finkel
> Lead, Compiler Technology and Programming Languages
> Leadership Computing Facility
> Argonne National Laboratory

Hal Finkel via llvm-dev

2017-Jul-13 17:33 UTC

[llvm-dev] RFC: Harvard architectures and default address spaces

On 07/13/2017 12:25 PM, Björn Pettersson A wrote:
> My point was that the DataLayout isn't available inside the LLParser and
> the BitcodeReader, up until the point when it has been parsed. So I would
> not say "always available".
>
> Both LLParser and BitcodeReader are, for example, using getAddressSpace(),
> both directly and maybe also indirectly through different interfaces
> (perhaps not on function pointers though). My concern is that the incorrect
> address space might be used while parsing, and then it might be hard to find
> all the places to fix up at a later stage, once the DataLayout has been
> parsed and we know what the default address space really is.
> Alternatively, if there is some undocumented(?) rule that the DataLayout
> always comes before function declarations in the ll/bc file, then all
> functions can get the default address space attribute directly (as indicated
> by the DataLayout) when being parsed.
>
> I think the whole idea from Dylan was to do this as a fixup after
> LLParser/BitcodeReader, i.e. not trying to look up a function's address
> space already when parsing the function declaration. So then the rule would
> be: do not use Pointer<Function>::getAddressSpace(), or PointerType::get()
> etc. during ll/bc parsing because it might give the wrong result. Maybe it
> is possible to assert on that?
>
> We could of course give functions some kind of undefined address space
> value when parsing ll/bc and adding functions to the symbol table. That
> might help us catch situations when someone tries to fetch the address
> space for a function pointer before the
> set-default-address-space-as-given-by-datalayout-on-all-functions pass has
> executed.

I understand your point. We need to extend the syntax to enable
tagging functions with address spaces directly (i.e. as a function
attribute). DataLayout can't, as you noted, and shouldn't, be used to 
adjust defaults during parsing. The point of the defaults in DataLayout 
is to provide some way to supply correct defaults should optimizations 
need them.

  -Hal
-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory

Dylan McKay via llvm-dev

2017-Jul-14 05:43 UTC

[llvm-dev] RFC: Harvard architectures and default address spaces

> My point was that the DataLayout isn't available inside the LLParser and
> the BitcodeReader, up until the point when it has been parsed. So I would
> not say "always available".

I have noticed this.

Normally, the frontend compilation adds functions, and then emits some form
of code. There are no issues with normal .o or .s compilation, but if we
emit IR, we lose the address space information because it is not actually a
part of the textual IR form.

The standard LLParser process has no data layout available until the end of
the process (because we create an empty temporary module for parsing).

Because of both of these problems, the best way to implement this is to modify
the textual IR form to support address space attributes on function
declarations.

> I think the whole idea from Dylan was to do this as a fixup after
> LLParser/BitcodeReader

Originally, that was the only workable solution I could come up with, but I
think the solution Hal suggested is better.

As I understand it:

* New functions created by the frontend will default to the address space
specified in the DataLayout
* When these functions get lowered to LL, we emit the address space on the
function like 'addrspace 1' if it is not the default address space 0
* When parsing, all functions will either have no addrspace attribute and
default to '0', or take the addrspace specified

This will let frontends specify explicit address spaces manually (which
would support Mikael's and others' use cases), and also keep track of the
address space throughout the pipeline so that we don't make any incorrect
assumptions about the address space.
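
The three rules listed above can be illustrated with a small round-trip sketch in Python. This is a toy model only: the function names are made up, the declarations are restricted to `declare void @name()`, and the `addrspace 1` spelling is simply the one used in this message, not a claim about the final IR syntax.

```python
import re


def create_function(name, datalayout_default, explicit=None):
    """Rule 1: a new function takes the DataLayout default unless overridden."""
    return name, (datalayout_default if explicit is None else explicit)


def print_declaration(name, addrspace=0):
    """Rule 2: emit the address space only when it is not address space 0."""
    suffix = f" addrspace {addrspace}" if addrspace != 0 else ""
    return f"declare void @{name}(){suffix}"


_DECL = re.compile(r"declare void @(\w+)\(\)(?: addrspace (\d+))?$")


def parse_declaration(text):
    """Rule 3: no annotation means address space 0; otherwise use the given one."""
    match = _DECL.match(text.strip())
    if match is None:
        raise ValueError(f"not a toy declaration: {text!r}")
    name, space = match.group(1), match.group(2)
    return name, (int(space) if space is not None else 0)


# A frontend targeting a hypothetical Harvard machine whose default is 1.
name, space = create_function("isr", datalayout_default=1)
text = print_declaration(name, space)
```

Because the address space is written into the declaration itself, the parser in this sketch never needs to consult the DataLayout, so the order in which the two appear in the file no longer matters.
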



