Björn Pettersson A via llvm-dev
2017-Jul-13 10:38 UTC
[llvm-dev] RFC: Harvard architectures and default address spaces
My experience of having the address space for functions (or function pointers) in the DataLayout is that when the .ll file is parsed, we need to parse the DataLayout before any function declarations. That is needed because we want to give the functions the correct address space (according to the DataLayout) when inserting them in the symbol table. An alternative would be to update the address space info for functions after having parsed the DataLayout.

Is the DataLayout normally used when parsing the .ll file (or .bc)? Or would this be the first case of doing that?

Is it guaranteed that the DataLayout is specified/parsed before any function declaration? Or is the DataLayout specification context sensitive, so that it is only valid for the declarations that follow it?

What if there are several address spaces for functions? Or is that a silly thing that no one will ever use? In that case, having the address space specified in the DataLayout would be insufficient, since we would need to attribute the functions separately, right?

I am not saying that having the info in the DataLayout is a totally bad idea (our out-of-tree target uses that trick), but I think it might cause some problems as well. And perhaps it isn't the most general solution.

/Björn

> -----Original Message-----
> From: llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org] On Behalf Of David Chisnall via llvm-dev
> Sent: 12 July 2017 17:26
> To: Dylan McKay <me at dylanmckay.io>
> Cc: llvm-dev <llvm-dev at lists.llvm.org>; Carl Peto <carl.peto at me.com>
> Subject: Re: [llvm-dev] RFC: Harvard architectures and default address spaces
>
> On 11 Jul 2017, at 23:18, Dylan McKay via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> >
> > > Add this information to DataLayout and to use that information in relevant places.
> >
> > This sounds like a much better/cleaner idea, thanks!
>
> I’d suggest taking a look at the alloca address space changes, which were recently added based on a cleaned-up version of our code. We have a similar issue (function and data pointers have the same representation for us, but casting requires different handling[1]) and have considered adding address spaces to functions.
>
> David
>
> [1] Probably not relevant for this discussion, but if anyone cares: in our world we have 128-bit fat pointers that contain base, bounds and permissions, and 64-bit pointers that are implicitly relative to one of two special fat pointer registers, one for code and one for data. We must therefore handle 64-bit to 128-bit pointer casts differently depending on whether we’re casting code or data pointers. We currently do this with some fairly ugly hacks, but being able to put all functions in a different AS would make this much easier for us.
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
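The ordering concern in this message can be shown with a toy model. The sketch below is not LLVM's parser: the `("declare", ...)` and `("datalayout", ...)` records merely stand in for top-level entities of a .ll file, and the `"P"` key is a hypothetical way to spell "default address space for functions". It only shows what goes wrong if each function gets its address space at the moment it is declared.

```python
# Toy model only. The record format and the "P" key are assumptions made for
# illustration; nothing here is LLVM's real LLParser or BitcodeReader.

def parse_eagerly(entities):
    """Assign each function an address space the moment it is declared."""
    default_as = 0      # builtin default until a datalayout has been seen
    symbol_table = {}
    for kind, value in entities:
        if kind == "datalayout":
            default_as = value["P"]
        elif kind == "declare":
            symbol_table[value] = default_as
    return symbol_table


layout_first = [("datalayout", {"P": 1}), ("declare", "foo"), ("declare", "bar")]
layout_last = [("declare", "foo"), ("datalayout", {"P": 1}), ("declare", "bar")]

print(parse_eagerly(layout_first))  # {'foo': 1, 'bar': 1}
print(parse_eagerly(layout_last))   # {'foo': 0, 'bar': 1}
```

In the second case `foo` was declared before the datalayout was seen, so it keeps the builtin default. That is the situation that either an ordering guarantee or a later fix-up would have to rule out.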
Hal Finkel via llvm-dev
2017-Jul-13 14:00 UTC
[llvm-dev] RFC: Harvard architectures and default address spaces
On 07/13/2017 05:38 AM, Björn Pettersson A via llvm-dev wrote:
> My experience of having the address space for functions (or function pointers) in the DataLayout is that when the .ll file is parsed we need to parse the DataLayout before any function declarations. That is needed because we want to attribute the functions with correct address space (according to DataLayout) when inserting them in the symbol table. An alternative would be to update address space info for functions after having parsed the DataLayout.
>
> Is the DataLayout normally used when parsing the .ll file (or .bc)? Or would this be the first case of doing that?
>
> Is it guaranteed that DataLayout is specified/parsed before function declaration, or that the DataLayout specification is context sensitive and only is valid for the following declarations?

The DataLayout is a required part of the .ll/.bc file. In the .ll file (*), it's the part of the module that looks like this:

    target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"

It is global to the entire module and always available.

(*) It is true that you can write tests without specifying one of these, but in such cases, you just get the builtin default. For all real cases you'll need to have a target-appropriate DataLayout string.

> What if there are several address spaces for functions? Or is that a silly thing that no one ever will use? Having the address space specified in the DataLayout would be insufficient, since we would need to attribute the functions separately, right?
>
> I do not say that having the info in the DataLayout is a totally bad idea (since our out-of-tree target is using that trick), but I think it might impose some problems as well. And perhaps it isn't the most general solution.

If different functions might be in different address spaces, you'll need some other mechanism to set the address space (as a single default won't suffice). You might use source-level function attributes, for example.
-Hal

-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory
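As a rough illustration of "a module-wide default plus some other mechanism per function", here is a small Python sketch. It assumes a hypothetical `P<n>` component in the datalayout string that names the default address space for functions; that component, the helper names, and the `harvard_layout` string are made up for this example. Only `x86_layout` is the string quoted in the message above.

```python
def program_address_space(datalayout: str) -> int:
    """Return the default function address space named by a datalayout string.

    Assumption: a hypothetical "P<n>" component carries that default. When it
    is absent (or the string is empty), fall back to address space 0, which
    mirrors "you just get the builtin default".
    """
    for component in datalayout.split("-"):
        if component.startswith("P") and component[1:].isdigit():
            return int(component[1:])
    return 0


def function_address_space(datalayout: str, explicit=None) -> int:
    """An explicit per-function address space wins over the module default."""
    if explicit is not None:
        return explicit
    return program_address_space(datalayout)


x86_layout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"  # string from the thread
harvard_layout = "e-P1-p:16:8-n8"                     # made-up example

print(function_address_space(x86_layout))                  # 0
print(function_address_space(harvard_layout))              # 1
print(function_address_space(harvard_layout, explicit=2))  # 2
```

The `explicit` argument plays the role of whatever per-function mechanism is chosen (for example a source-level attribute); a single default in the datalayout cannot express the third case on its own.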
Björn Pettersson A via llvm-dev
2017-Jul-13 17:25 UTC
[llvm-dev] RFC: Harvard architectures and default address spaces
> -----Original Message-----
> From: Hal Finkel [mailto:hfinkel at anl.gov]
> Sent: 13 July 2017 16:01
> To: Björn Pettersson A <bjorn.a.pettersson at ericsson.com>; David Chisnall <David.Chisnall at cl.cam.ac.uk>; Dylan McKay <me at dylanmckay.io>
> Cc: llvm-dev at lists.llvm.org; Carl Peto <carl.peto at me.com>
> Subject: Re: [llvm-dev] RFC: Harvard architectures and default address spaces
>
> On 07/13/2017 05:38 AM, Björn Pettersson A via llvm-dev wrote:
> > My experience of having the address space for functions (or function pointers) in the DataLayout is that when the .ll file is parsed we need to parse the DataLayout before any function declarations. That is needed because we want to attribute the functions with correct address space (according to DataLayout) when inserting them in the symbol table. An alternative would be to update address space info for functions after having parsed the DataLayout.
> >
> > Is the DataLayout normally used when parsing the .ll file (or .bc)? Or would this be the first case of doing that?
> >
> > Is it guaranteed that DataLayout is specified/parsed before function declaration, or that the DataLayout specification is context sensitive and only is valid for the following declarations?
>
> The DataLayout is a required part of the .ll/.bc file. In the .ll file (*), it's the part of the module that looks like this:
>
>     target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
>
> it is global to the entire module and always available.

My point was that the DataLayout isn't available inside the LLParser and the BitcodeReader until it has been parsed, so I would not say "always available". Both LLParser and BitcodeReader are, for example, using getAddressSpace(), both directly and maybe also indirectly through different interfaces (perhaps not on function pointers though).

My concern is that the incorrect address space might be used while parsing, and that it might then be hard to find all the places to fix up at a later stage, once the DataLayout has been parsed and we know what the default address space really is.

Alternatively, if there is some undocumented(?) rule that the DataLayout always comes before function declarations in the ll/bc file, then all functions can get the default address space (as indicated by the DataLayout) directly when being parsed.

I think the whole idea from Dylan was to do this as a fix-up after LLParser/BitcodeReader, i.e. not trying to look up a function's address space already when parsing the function declaration. So then the rule would be: do not use Pointer<Function>::getAddressSpace(), PointerType::get() etc. during ll/bc parsing, because it might give the wrong result. Maybe it is possible to assert on that?

We could of course give functions some kind of undefined address space value when parsing ll/bc and adding functions to the symbol table. That might help us catch situations where someone tries to fetch the address space for a function pointer before the set-default-address-space-as-given-by-datalayout-on-all-functions pass has executed.

> (*) It is true that you can write tests without specifying one of these, but in such cases, you just get the builtin default. For all real cases you'll need to have a target-appropriate DataLayout string.
>
> > What if there are several address spaces for functions? Or is that a silly thing that no one ever will use? Having the address space specified in the DataLayout would be insufficient, since we would need to attribute the functions separately, right?
> >
> > I do not say that having the info in the DataLayout is a totally bad idea (since our out-of-tree target is using that trick), but I think it might impose some problems as well. And perhaps it isn't the most general solution.
>
> If different functions might be in different address spaces, you'll need some other mechanism to set the address space (as a single default won't suffice). You might use source-level function attributes, for example.
>
> -Hal
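The "undefined address space value plus assert" idea from this message could look roughly like the following. This is a sketch under stated assumptions, not LLVM code: the `UNRESOLVED` sentinel, the `Function` class, the record format and the `"P"` key are all hypothetical names introduced for the example.

```python
UNRESOLVED = -1  # hypothetical sentinel; not a real LLVM address space value


class Function:
    def __init__(self, name):
        self.name = name
        self._address_space = UNRESOLVED  # placeholder while parsing

    @property
    def address_space(self):
        # Catch any query made before the fix-up step has run.
        if self._address_space == UNRESOLVED:
            raise AssertionError(
                f"address space of @{self.name} queried before fix-up")
        return self._address_space


def parse_module(entities):
    """Parse entities in any order, then patch every function at the end."""
    default_as = 0  # builtin default if no datalayout is present
    functions = []
    for kind, value in entities:
        if kind == "datalayout":
            default_as = value["P"]
        elif kind == "declare":
            functions.append(Function(value))
    # Fix-up step: only now is the default address space known for certain.
    for fn in functions:
        fn._address_space = default_as
    return functions


module = parse_module([("declare", "foo"), ("datalayout", {"P": 1})])
print(module[0].address_space)  # 1
```

With this shape the position of the datalayout no longer matters, and a lookup on a function that has not been through the fix-up fails loudly instead of silently returning the builtin default.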