On 16 Nov 2016, at 21:56, Daniel Berlin <dberlin at dberlin.org> wrote:
>
> You keep talking about platforms, but llvm ir itself is not platform dependent.
> Can you give a reference in the language reference that says that this is not legal?

Nothing in the LangRef (apart from the note about non-integral pointers, which was added recently) makes any claim about the representation of pointers. Pointers in LLVM IR have always been opaque and must explicitly be bitcast or inttoptr / ptrtoint cast to be used as if they were integers.

We have had discussions on the list previously about tightening up the semantics of inttoptr and ptrtoint.

> IE what loads do *on your platform* is completely irrelevant to whether the IR code is legal or not, only what it codegens to.
>
> LLVM's type semantics (and pointers may not have types, but the load operations produce values that do) are also not defined in terms of platform, but in terms of what datalayout says, etc.

GVN is materialising loads that go beyond the bounds of an object. This is undefined behaviour in C and there is nothing in the LangRef that indicates that this should be valid. It is only potentially valid because, on platforms with a page-based MMU as the sole form of memory protection, if you only round up to a power of two then you will still be in the same page (and, likely, cache line) so you will get some unspecified data and can ignore it.

> What you want seems to be non-integral pointer types.
>
> Which are experimental:
> "LLVM IR optionally allows the frontend to denote pointers in certain address spaces as “non-integral” via the datalayout string. Non-integral pointer types represent pointers that have an unspecified bitwise representation; that is, the integral representation may be target dependent or unstable (not backed by a fixed integer).
> inttoptr instructions converting integers to non-integral pointer types are ill-typed, and so are ptrtoint instructions converting values of non-integral pointer types to integers. Vector versions of said instructions are ill-typed as well."
>
> One of the reasons it's experimental is because nobody has made it work in all cases.
> I think whoever wants this to work is going to have to drive fixing it and making it work sanely.

Actually, that isn’t what I want, because we do define inttoptr and ptrtoint for our architecture. You can’t implement C without them (or some equivalent) working and we have a fully working C / Objective-C compiler (C++ in progress) using LLVM. ptrtoint is always valid for us, inttoptr may give null depending on the ABI and environment.

I gave a talk in the LLVM track at FOSDEM a couple of years ago about the things that are needed to make LLVM work correctly for targets where integers are not pointers. We have done most of this work, but it is not helped by people propagating the ‘integers are pointers’ assumption (which the LangRef has always been *very* careful not to state) in passes.

David
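
As an aside on the page argument in the message above: the sketch below, in Python, checks when a load whose width has been rounded up to a power of two stays inside a single page. The 4096-byte page size and the helper names next_power_of_two and same_page are assumptions made for illustration; nothing here is taken from GVN or from the LangRef. The property appears to hold only when the widened access is naturally aligned, which the second check demonstrates.

```python
PAGE_SIZE = 4096  # assumed page size; real targets vary


def next_power_of_two(n: int) -> int:
    """Smallest power of two that is >= n, for n >= 1."""
    return 1 << (n - 1).bit_length()


def same_page(addr: int, width: int, page_size: int = PAGE_SIZE) -> bool:
    """True if the byte range [addr, addr + width) lies within one page."""
    return addr // page_size == (addr + width - 1) // page_size


# A 3-byte object whose load is widened to the next power of two.
object_size = 3
widened = next_power_of_two(object_size)  # 4 bytes

# Naturally aligned base (4092 is a multiple of 4): the widened load
# ends exactly at the page boundary and does not cross it.
aligned_ok = same_page(4092, widened)

# Unaligned base: the original 3-byte load fits in the page, but the
# widened 4-byte load spills into the next page.
original_ok = same_page(4093, object_size)
unaligned_ok = same_page(4093, widened)

# Bytes read beyond the end of the object in either case.
bytes_past_object = widened - object_size
```

In both cases the widened load reads one byte past the end of the 3-byte object. Staying within the page only means that a page-based MMU will not fault; it says nothing about targets whose memory protection is finer-grained than a page.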
Very nice to see it!

Piotr

2016-11-17 9:31 GMT+01:00 David Chisnall via llvm-dev <llvm-dev at lists.llvm.org>:
> [...]
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
On Thu, Nov 17, 2016 at 12:31 AM, David Chisnall <David.Chisnall at cl.cam.ac.uk> wrote:
> [...]
>
> I gave a talk in the LLVM track at FOSDEM a couple of years ago about the things that are needed to make LLVM work correctly for targets where integers are not pointers. We have done most of this work, but it is not helped by people propagating the ‘integers are pointers’ assumption (which the LangRef has always been *very* careful not to state) in passes.

Do you happen to have a link for the talk? We'll try to make sure this works in the new pass.

--
Davide

"There are no solved problems; there are only problems that are more or less solved" -- Henri Poincare
Davide,

Slides and video available at https://archive.fosdem.org/2015/schedule/event/the_cheri_cpu/

Kind regards,
Arnaud

> -----Original Message-----
> From: llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org] On Behalf Of Davide Italiano via llvm-dev
> Sent: 18 November 2016 03:55
> To: David Chisnall
> Cc: llvm-dev
> Subject: Re: [llvm-dev] [RFC] NewGVN
>
> [...]
>
> Do you happen to have a link for the talk? We'll try to make sure this works in the new pass.