Villmow, Micah
2012-Aug-27 15:25 UTC
[LLVMdev] FW: RFC: Supporting different sized address space arithmetic
Most likely this code was added before getSExtOrTruncate was added, but not 100% sure. It seems to assume that no pointer can be more than 64bits in size.> -----Original Message----- > From: Hal Finkel [mailto:hfinkel at anl.gov] > Sent: Friday, August 24, 2012 4:27 PM > To: Villmow, Micah > Cc: LLVM Developers Mail > Subject: Re: [LLVMdev] FW: RFC: Supporting different sized address > space arithmetic > > Micah, > > There are a number of variable names in this patch that don't follow > the naming convention (which specifies that they should start with an > uppercase letter). > > > if (PtrBits < 64) > > - OffsVal = DAG.getNode(ISD::TRUNCATE, getCurDebugLoc(), > > - TLI.getPointerTy(), > > + OffsVal = DAG.getNode(ISD::TRUNCATE, getCurDebugLoc(), > > PtrTy, DAG.getConstant(Offs, MVT::i64)); > > This is not necessarily specific to your patch, but can you explain the > special role of 64-bits here? > > Thanks again, > Hal > > On Fri, 24 Aug 2012 23:07:54 +0000 > "Villmow, Micah" <Micah.Villmow at amd.com> wrote: > > > > > > > > -----Original Message----- > > > From: Villmow, Micah > > > Sent: Friday, August 24, 2012 2:56 PM > > > To: 'Eli Friedman' > > > Cc: LLVM Developers Mailing List > > > Subject: RE: [LLVMdev] RFC: Supporting different sized address > space > > > arithmetic > > > > > > Eli, > > > There is a patch that implements the beginning what I think is the > > > correct approach to support the backend overloading the size of the > > > pointer for an address space. > > > > > > Does this seem like the right way to do this? > > > > > > I'm guessing I need to handle a few more nodes(IntToPtr/PtrToInt) > > > and atomics, but this already works for what we want to do. > > > > > > Micah > > > > > > > -----Original Message----- > > > > From: Eli Friedman [mailto:eli.friedman at gmail.com] > > > > Sent: Friday, August 17, 2012 3:48 PM > > > > To: Villmow, Micah > > > > Cc: LLVM Developers Mailing List > > > > Subject: Re: [LLVMdev] RFC: Supporting different sized address > > > > space arithmetic > > > > > > > > On Fri, Aug 17, 2012 at 3:35 PM, Villmow, Micah > > > > <Micah.Villmow at amd.com> > > > > wrote: > > > > > > > > > > > > > > >> -----Original Message----- > > > > >> From: Eli Friedman [mailto:eli.friedman at gmail.com] > > > > >> Sent: Friday, August 17, 2012 3:16 PM > > > > >> To: Villmow, Micah > > > > >> Cc: LLVM Developers Mailing List > > > > >> Subject: Re: [LLVMdev] RFC: Supporting different sized address > > > > >> space arithmetic > > > > >> > > > > >> On Fri, Aug 17, 2012 at 2:53 PM, Villmow, Micah > > > > >> <Micah.Villmow at amd.com> > > > > >> wrote: > > > > >> > Currently LLVM only supports address calculations for a > > > > >> > single address space(the default). This is problematic when > > > > >> > the device pointer type is 32/64bits, but there are small > > > > >> > distinct memories that only have 16 bits of addressing(think > > > > >> > GPU's with small software controlled memory > > > > >> segments). > > > > >> > > > > > >> > > > > > >> > > > > > >> > I am proposing a modification to a few API's in SelectionDAG > > > > >> > that I believe will fix this problem. > > > > >> > > > > > >> > > > > > >> > > > > > >> > In TargetLowering > > > > >> > > > > > >> > * Add getPointerTy to take an argument, the address > > > space, > > > > >> this > > > > >> > defaults to 0. > > > > >> > > > > > >> > * Add a virtual API call that returns the default > > > > >> > address > > > > >> space for > > > > >> > the target, this defaults to returning 0. > > > > >> > > > > >> Under what circumstances would the default address space for a > > > > >> target not be 0? > > > > > [Villmow, Micah] In OpenCL, the default address space is the > > > > > private address space. This is closer to TLS in the X86 world, > > > > > whereas the llvm default address space of 0, is closer to the > > > > > global address space in OpenCL. For our GPU's when supporting > > > > > 64bit pointers, there is a huge drop(think ~60%) in LDS > > > > > bandwidth performance because of the extra computation 64bit > > > > > pointers requires. What would be ideal is that our default > > > > > would be treated as global(AS 1) and be 64bit pointers, > > > > the private(AS 0) would be 32bit pointers and constant(AS 2) and > > > > local(AS 3) address spaces can be considered as 16bit pointers. > > > > This would allow us to highly optimize our memory access and only > > > > pay for the 64bit pointer addressing costs where it is required > > > > by hardware. So we want the default address space to be the > > > > global address space for all computations the use the pointer > > > > type, but on non-OpenCL systems, this behavior might not be > > > > warranted. > > > > > > > > Okay... that mostly makes sense. (Although it sounds like > > > > long-term, you'd want to eliminate all uses of > > > > getDefaultAddressSpace from target- independent code.) > > > > > > > > >> > * Have the original getPointerTy implementation with > > > > >> > no > > > > >> arguments > > > > >> > query for the default address space. > > > > >> > > > > > >> > In SelectionDAG > > > > >> > > > > > >> > * Add a new API to getIntPtrConstant that takes an > > > address > > > > >> space as > > > > >> > the second argument > > > > >> > > > > >> Do we actually need this API? Normally, if you need a > > > > >> pointer-sized constant, it means you have a value of pointer > > > > >> type somewhere nearby; it's probably easier to just use the > > > > >> existing API which takes a constant and a type than to try to > > > > >> dig the address space out of a memory operation. > > > > > [Villmow, Micah] This is only there so that the current > > > > > implementation does not change and break anything. Instead of > > > > > changing all of the code to the new API, I leave the code alone > > > > > and only change the > > > > locations that need the new functionality. > > > > > > > > I understand that part; it just seems like nobody will actually > > > > end up using the two-argument form getIntPtrConstant(), in favor > > > > of just using getConstant(). > > > > > > > > >> > * Modify the implementation of the original > > > > getIntPtrConstant > > > > >> > function to call the new function with > > > > >> > getDefaultAddressSpace(). > > > > >> > > > > > >> > As far as I can tell, this should not affect any backends > > > > >> > behavior, but will allow the targets with disjoint address > > > > >> > spaces to directly address them in the most efficient > manner. > > > > >> > > > > > >> > > > > > >> > > > > > >> > So, what do you think? Is this the correct approach? Or does > > > > >> > it actually require more fundamental changes. > > > > >> > > > > >> This is the right direction; I don't think you need anything > > > > >> else, at least not in the SelectionDAG. > > > > >> > > > > >> Are you planning on introducing an in-tree target that > actually > > > > >> uses this functionality? I'm afraid it'll bitrot without good > > > testing. > > > > > [Villmow, Micah] We have the AMDIL branch that this will be > > > > > added > > > to. > > > > > > > > That's good. > > > > > > > > -Eli > > > > > > -- > Hal Finkel > Postdoctoral Appointee > Leadership Computing Facility > Argonne National Laboratory
Hal Finkel
2012-Aug-27 15:37 UTC
[LLVMdev] FW: RFC: Supporting different sized address space arithmetic
On Mon, 27 Aug 2012 15:25:50 +0000 "Villmow, Micah" <Micah.Villmow at amd.com> wrote:> Most likely this code was added before getSExtOrTruncate was added, > but not 100% sure. It seems to assume that no pointer can be more > than 64bits in size.Does LLVM generally support pointers of greater than 64 bits? -Hal> > > -----Original Message----- > > From: Hal Finkel [mailto:hfinkel at anl.gov] > > Sent: Friday, August 24, 2012 4:27 PM > > To: Villmow, Micah > > Cc: LLVM Developers Mail > > Subject: Re: [LLVMdev] FW: RFC: Supporting different sized address > > space arithmetic > > > > Micah, > > > > There are a number of variable names in this patch that don't follow > > the naming convention (which specifies that they should start with > > an uppercase letter). > > > > > if (PtrBits < 64) > > > - OffsVal = DAG.getNode(ISD::TRUNCATE, getCurDebugLoc(), > > > - TLI.getPointerTy(), > > > + OffsVal = DAG.getNode(ISD::TRUNCATE, getCurDebugLoc(), > > > PtrTy, DAG.getConstant(Offs, MVT::i64)); > > > > This is not necessarily specific to your patch, but can you explain > > the special role of 64-bits here? > > > > Thanks again, > > Hal > > > > On Fri, 24 Aug 2012 23:07:54 +0000 > > "Villmow, Micah" <Micah.Villmow at amd.com> wrote: > > > > > > > > > > > > -----Original Message----- > > > > From: Villmow, Micah > > > > Sent: Friday, August 24, 2012 2:56 PM > > > > To: 'Eli Friedman' > > > > Cc: LLVM Developers Mailing List > > > > Subject: RE: [LLVMdev] RFC: Supporting different sized address > > space > > > > arithmetic > > > > > > > > Eli, > > > > There is a patch that implements the beginning what I think is > > > > the correct approach to support the backend overloading the > > > > size of the pointer for an address space. > > > > > > > > Does this seem like the right way to do this? > > > > > > > > I'm guessing I need to handle a few more > > > > nodes(IntToPtr/PtrToInt) and atomics, but this already works > > > > for what we want to do. > > > > > > > > Micah > > > > > > > > > -----Original Message----- > > > > > From: Eli Friedman [mailto:eli.friedman at gmail.com] > > > > > Sent: Friday, August 17, 2012 3:48 PM > > > > > To: Villmow, Micah > > > > > Cc: LLVM Developers Mailing List > > > > > Subject: Re: [LLVMdev] RFC: Supporting different sized address > > > > > space arithmetic > > > > > > > > > > On Fri, Aug 17, 2012 at 3:35 PM, Villmow, Micah > > > > > <Micah.Villmow at amd.com> > > > > > wrote: > > > > > > > > > > > > > > > > > >> -----Original Message----- > > > > > >> From: Eli Friedman [mailto:eli.friedman at gmail.com] > > > > > >> Sent: Friday, August 17, 2012 3:16 PM > > > > > >> To: Villmow, Micah > > > > > >> Cc: LLVM Developers Mailing List > > > > > >> Subject: Re: [LLVMdev] RFC: Supporting different sized > > > > > >> address space arithmetic > > > > > >> > > > > > >> On Fri, Aug 17, 2012 at 2:53 PM, Villmow, Micah > > > > > >> <Micah.Villmow at amd.com> > > > > > >> wrote: > > > > > >> > Currently LLVM only supports address calculations for a > > > > > >> > single address space(the default). This is problematic > > > > > >> > when the device pointer type is 32/64bits, but there are > > > > > >> > small distinct memories that only have 16 bits of > > > > > >> > addressing(think GPU's with small software controlled > > > > > >> > memory > > > > > >> segments). > > > > > >> > > > > > > >> > > > > > > >> > > > > > > >> > I am proposing a modification to a few API's in > > > > > >> > SelectionDAG that I believe will fix this problem. > > > > > >> > > > > > > >> > > > > > > >> > > > > > > >> > In TargetLowering > > > > > >> > > > > > > >> > * Add getPointerTy to take an argument, the > > > > > >> > address > > > > space, > > > > > >> this > > > > > >> > defaults to 0. > > > > > >> > > > > > > >> > * Add a virtual API call that returns the default > > > > > >> > address > > > > > >> space for > > > > > >> > the target, this defaults to returning 0. > > > > > >> > > > > > >> Under what circumstances would the default address space > > > > > >> for a target not be 0? > > > > > > [Villmow, Micah] In OpenCL, the default address space is the > > > > > > private address space. This is closer to TLS in the X86 > > > > > > world, whereas the llvm default address space of 0, is > > > > > > closer to the global address space in OpenCL. For our > > > > > > GPU's when supporting 64bit pointers, there is a huge > > > > > > drop(think ~60%) in LDS bandwidth performance because of > > > > > > the extra computation 64bit pointers requires. What would > > > > > > be ideal is that our default would be treated as global(AS > > > > > > 1) and be 64bit pointers, > > > > > the private(AS 0) would be 32bit pointers and constant(AS 2) > > > > > and local(AS 3) address spaces can be considered as 16bit > > > > > pointers. This would allow us to highly optimize our memory > > > > > access and only pay for the 64bit pointer addressing costs > > > > > where it is required by hardware. So we want the default > > > > > address space to be the global address space for all > > > > > computations the use the pointer type, but on non-OpenCL > > > > > systems, this behavior might not be warranted. > > > > > > > > > > Okay... that mostly makes sense. (Although it sounds like > > > > > long-term, you'd want to eliminate all uses of > > > > > getDefaultAddressSpace from target- independent code.) > > > > > > > > > > >> > * Have the original getPointerTy implementation > > > > > >> > with no > > > > > >> arguments > > > > > >> > query for the default address space. > > > > > >> > > > > > > >> > In SelectionDAG > > > > > >> > > > > > > >> > * Add a new API to getIntPtrConstant that takes > > > > > >> > an > > > > address > > > > > >> space as > > > > > >> > the second argument > > > > > >> > > > > > >> Do we actually need this API? Normally, if you need a > > > > > >> pointer-sized constant, it means you have a value of > > > > > >> pointer type somewhere nearby; it's probably easier to > > > > > >> just use the existing API which takes a constant and a > > > > > >> type than to try to dig the address space out of a memory > > > > > >> operation. > > > > > > [Villmow, Micah] This is only there so that the current > > > > > > implementation does not change and break anything. Instead > > > > > > of changing all of the code to the new API, I leave the > > > > > > code alone and only change the > > > > > locations that need the new functionality. > > > > > > > > > > I understand that part; it just seems like nobody will > > > > > actually end up using the two-argument form > > > > > getIntPtrConstant(), in favor of just using getConstant(). > > > > > > > > > > >> > * Modify the implementation of the original > > > > > getIntPtrConstant > > > > > >> > function to call the new function with > > > > > >> > getDefaultAddressSpace(). > > > > > >> > > > > > > >> > As far as I can tell, this should not affect any backends > > > > > >> > behavior, but will allow the targets with disjoint > > > > > >> > address spaces to directly address them in the most > > > > > >> > efficient > > manner. > > > > > >> > > > > > > >> > > > > > > >> > > > > > > >> > So, what do you think? Is this the correct approach? Or > > > > > >> > does it actually require more fundamental changes. > > > > > >> > > > > > >> This is the right direction; I don't think you need > > > > > >> anything else, at least not in the SelectionDAG. > > > > > >> > > > > > >> Are you planning on introducing an in-tree target that > > actually > > > > > >> uses this functionality? I'm afraid it'll bitrot without > > > > > >> good > > > > testing. > > > > > > [Villmow, Micah] We have the AMDIL branch that this will be > > > > > > added > > > > to. > > > > > > > > > > That's good. > > > > > > > > > > -Eli > > > > > > > > > > > -- > > Hal Finkel > > Postdoctoral Appointee > > Leadership Computing Facility > > Argonne National Laboratory > >-- Hal Finkel Postdoctoral Appointee Leadership Computing Facility Argonne National Laboratory
Villmow, Micah
2012-Aug-27 15:45 UTC
[LLVMdev] FW: RFC: Supporting different sized address space arithmetic
> -----Original Message----- > From: Hal Finkel [mailto:hfinkel at anl.gov] > Sent: Monday, August 27, 2012 8:38 AM > To: Villmow, Micah > Cc: LLVM Developers Mail > Subject: Re: [LLVMdev] FW: RFC: Supporting different sized address > space arithmetic > > On Mon, 27 Aug 2012 15:25:50 +0000 > "Villmow, Micah" <Micah.Villmow at amd.com> wrote: > > > Most likely this code was added before getSExtOrTruncate was added, > > but not 100% sure. It seems to assume that no pointer can be more > > than 64bits in size. > > Does LLVM generally support pointers of greater than 64 bits?[Villmow, Micah] I really don't see why it couldn't, but there might be issues that exist since I don't think any backends do.> > -Hal > > > > > > -----Original Message----- > > > From: Hal Finkel [mailto:hfinkel at anl.gov] > > > Sent: Friday, August 24, 2012 4:27 PM > > > To: Villmow, Micah > > > Cc: LLVM Developers Mail > > > Subject: Re: [LLVMdev] FW: RFC: Supporting different sized address > > > space arithmetic > > > > > > Micah, > > > > > > There are a number of variable names in this patch that don't > follow > > > the naming convention (which specifies that they should start with > > > an uppercase letter). > > > > > > > if (PtrBits < 64) > > > > - OffsVal = DAG.getNode(ISD::TRUNCATE, getCurDebugLoc(), > > > > - TLI.getPointerTy(), > > > > + OffsVal = DAG.getNode(ISD::TRUNCATE, getCurDebugLoc(), > > > > PtrTy, DAG.getConstant(Offs, MVT::i64)); > > > > > > This is not necessarily specific to your patch, but can you explain > > > the special role of 64-bits here? > > > > > > Thanks again, > > > Hal > > > > > > On Fri, 24 Aug 2012 23:07:54 +0000 > > > "Villmow, Micah" <Micah.Villmow at amd.com> wrote: > > > > > > > > > > > > > > > > -----Original Message----- > > > > > From: Villmow, Micah > > > > > Sent: Friday, August 24, 2012 2:56 PM > > > > > To: 'Eli Friedman' > > > > > Cc: LLVM Developers Mailing List > > > > > Subject: RE: [LLVMdev] RFC: Supporting different sized address > > > space > > > > > arithmetic > > > > > > > > > > Eli, > > > > > There is a patch that implements the beginning what I think is > > > > > the correct approach to support the backend overloading the > > > > > size of the pointer for an address space. > > > > > > > > > > Does this seem like the right way to do this? > > > > > > > > > > I'm guessing I need to handle a few more > > > > > nodes(IntToPtr/PtrToInt) and atomics, but this already works > > > > > for what we want to do. > > > > > > > > > > Micah > > > > > > > > > > > -----Original Message----- > > > > > > From: Eli Friedman [mailto:eli.friedman at gmail.com] > > > > > > Sent: Friday, August 17, 2012 3:48 PM > > > > > > To: Villmow, Micah > > > > > > Cc: LLVM Developers Mailing List > > > > > > Subject: Re: [LLVMdev] RFC: Supporting different sized > address > > > > > > space arithmetic > > > > > > > > > > > > On Fri, Aug 17, 2012 at 3:35 PM, Villmow, Micah > > > > > > <Micah.Villmow at amd.com> > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > >> -----Original Message----- > > > > > > >> From: Eli Friedman [mailto:eli.friedman at gmail.com] > > > > > > >> Sent: Friday, August 17, 2012 3:16 PM > > > > > > >> To: Villmow, Micah > > > > > > >> Cc: LLVM Developers Mailing List > > > > > > >> Subject: Re: [LLVMdev] RFC: Supporting different sized > > > > > > >> address space arithmetic > > > > > > >> > > > > > > >> On Fri, Aug 17, 2012 at 2:53 PM, Villmow, Micah > > > > > > >> <Micah.Villmow at amd.com> > > > > > > >> wrote: > > > > > > >> > Currently LLVM only supports address calculations for a > > > > > > >> > single address space(the default). This is problematic > > > > > > >> > when the device pointer type is 32/64bits, but there are > > > > > > >> > small distinct memories that only have 16 bits of > > > > > > >> > addressing(think GPU's with small software controlled > > > > > > >> > memory > > > > > > >> segments). > > > > > > >> > > > > > > > >> > > > > > > > >> > > > > > > > >> > I am proposing a modification to a few API's in > > > > > > >> > SelectionDAG that I believe will fix this problem. > > > > > > >> > > > > > > > >> > > > > > > > >> > > > > > > > >> > In TargetLowering > > > > > > >> > > > > > > > >> > * Add getPointerTy to take an argument, the > > > > > > >> > address > > > > > space, > > > > > > >> this > > > > > > >> > defaults to 0. > > > > > > >> > > > > > > > >> > * Add a virtual API call that returns the > default > > > > > > >> > address > > > > > > >> space for > > > > > > >> > the target, this defaults to returning 0. > > > > > > >> > > > > > > >> Under what circumstances would the default address space > > > > > > >> for a target not be 0? > > > > > > > [Villmow, Micah] In OpenCL, the default address space is > the > > > > > > > private address space. This is closer to TLS in the X86 > > > > > > > world, whereas the llvm default address space of 0, is > > > > > > > closer to the global address space in OpenCL. For our > > > > > > > GPU's when supporting 64bit pointers, there is a huge > > > > > > > drop(think ~60%) in LDS bandwidth performance because of > > > > > > > the extra computation 64bit pointers requires. What would > > > > > > > be ideal is that our default would be treated as global(AS > > > > > > > 1) and be 64bit pointers, > > > > > > the private(AS 0) would be 32bit pointers and constant(AS 2) > > > > > > and local(AS 3) address spaces can be considered as 16bit > > > > > > pointers. This would allow us to highly optimize our memory > > > > > > access and only pay for the 64bit pointer addressing costs > > > > > > where it is required by hardware. So we want the default > > > > > > address space to be the global address space for all > > > > > > computations the use the pointer type, but on non-OpenCL > > > > > > systems, this behavior might not be warranted. > > > > > > > > > > > > Okay... that mostly makes sense. (Although it sounds like > > > > > > long-term, you'd want to eliminate all uses of > > > > > > getDefaultAddressSpace from target- independent code.) > > > > > > > > > > > > >> > * Have the original getPointerTy implementation > > > > > > >> > with no > > > > > > >> arguments > > > > > > >> > query for the default address space. > > > > > > >> > > > > > > > >> > In SelectionDAG > > > > > > >> > > > > > > > >> > * Add a new API to getIntPtrConstant that takes > > > > > > >> > an > > > > > address > > > > > > >> space as > > > > > > >> > the second argument > > > > > > >> > > > > > > >> Do we actually need this API? Normally, if you need a > > > > > > >> pointer-sized constant, it means you have a value of > > > > > > >> pointer type somewhere nearby; it's probably easier to > > > > > > >> just use the existing API which takes a constant and a > > > > > > >> type than to try to dig the address space out of a memory > > > > > > >> operation. > > > > > > > [Villmow, Micah] This is only there so that the current > > > > > > > implementation does not change and break anything. Instead > > > > > > > of changing all of the code to the new API, I leave the > > > > > > > code alone and only change the > > > > > > locations that need the new functionality. > > > > > > > > > > > > I understand that part; it just seems like nobody will > > > > > > actually end up using the two-argument form > > > > > > getIntPtrConstant(), in favor of just using getConstant(). > > > > > > > > > > > > >> > * Modify the implementation of the original > > > > > > getIntPtrConstant > > > > > > >> > function to call the new function with > > > > > > >> > getDefaultAddressSpace(). > > > > > > >> > > > > > > > >> > As far as I can tell, this should not affect any > backends > > > > > > >> > behavior, but will allow the targets with disjoint > > > > > > >> > address spaces to directly address them in the most > > > > > > >> > efficient > > > manner. > > > > > > >> > > > > > > > >> > > > > > > > >> > > > > > > > >> > So, what do you think? Is this the correct approach? Or > > > > > > >> > does it actually require more fundamental changes. > > > > > > >> > > > > > > >> This is the right direction; I don't think you need > > > > > > >> anything else, at least not in the SelectionDAG. > > > > > > >> > > > > > > >> Are you planning on introducing an in-tree target that > > > actually > > > > > > >> uses this functionality? I'm afraid it'll bitrot without > > > > > > >> good > > > > > testing. > > > > > > > [Villmow, Micah] We have the AMDIL branch that this will be > > > > > > > added > > > > > to. > > > > > > > > > > > > That's good. > > > > > > > > > > > > -Eli > > > > > > > > > > > > > > > > -- > > > Hal Finkel > > > Postdoctoral Appointee > > > Leadership Computing Facility > > > Argonne National Laboratory > > > > > > > > -- > Hal Finkel > Postdoctoral Appointee > Leadership Computing Facility > Argonne National Laboratory
Maybe Matching Threads
- [LLVMdev] FW: RFC: Supporting different sized address space arithmetic
- [LLVMdev] FW: RFC: Supporting different sized address space arithmetic
- [LLVMdev] FW: RFC: Supporting different sized address space arithmetic
- [LLVMdev] RFC: Supporting different sized address space arithmetic
- [LLVMdev] FW: RFC: Supporting different sized address space arithmetic