Mon P Wang
2008-Jul-17 16:48 UTC
[LLVMdev] Casting between address spaces and address space semantics
In ISO/IEC WG14 n1169 on the C extensions to support embedded processors, any two address spaces must be disjoint, must be equivalent, or must be nested. As Eli indicated, the actual relationship is platform specific depending on what makes the most sense for your hardware and how the program will behave will depend on that relationship. -- Mon Ping On Jul 17, 2008, at 7:25 AM, Eli Friedman wrote:> On Thu, Jul 17, 2008 at 5:08 AM, Matthijs Kooijman > <matthijs at stdin.nl> wrote: >> Now, we are using a function which reads a value from one of these >> memories >> and does some processing. Since we want to execute this function >> for multiple >> memories, we make it accept a pointer in the generic address space >> (ie, no >> address space attribute): >> >> void do_stuff(char* mem); > > The "generic address space" you're referring to is really just address > space 0, at least in the current implementation. Assuming alternate > address spaces are actually separate, passing a pointer from a > different address space to this function is undefined behavior. > >> However, this brings me to my actual question: How are address spaces >> semantically defined? I see two options here: >> >> a) Every address space has the full range of addresses and they >> completely >> live side by side. This means that, for example, i32 addrspace(1) * >> 100 points >> to a different piece of memory than i32 addrspace(2) * 100. Also, >> this means >> that a bitcast from one address space to another (possibly 0), >> makes the >> pointer point to something different when loaded. >> >> b) Every address space is really a subspace of the full range of >> addresses, >> but always disjoint. This means that, for example, i32 addrspace(1) >> * 100 >> points to the same memory as i32 addrspace(2) * 100, though one, or >> possibly >> both of them can be invalid (since the pointer lies outside of that >> address >> space). This also means that bitcasting a pointer from one address >> space to >> another doesn't change it's meaning, though it can potentially >> become invalid. > > Address spaces are platform-specific, so you can define them any way > you want, I suppose. But if you can map the alternate address spaces > into address space 0, is there really any point to keeping the other > address spaces around? > > -Eli > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Matthijs Kooijman
2008-Jul-18 21:04 UTC
[LLVMdev] Casting between address spaces and address space semantics
Hi Eli, Mon Ping,> In ISO/IEC WG14 n1169 on the C extensions to support embedded > processors, any two address spaces must be disjoint, must be > equivalent, or must be nested.Ah, that standard is a lot clearer on this subject than the DSP-C one I read was.> As Eli indicated, the actual relationship is platform specific depending on > what makes the most sense for your hardware and how the program will behave > will depend on that relationship.If I read the standard correctly, the properties of these address spaces can be fully captured by defining the relationship between every pair of address spaces (disjoint, identical, subset/superset). I think it would make sense to make these relationships backend/platform specific, but for clang and the optimization passes to properly work with address spaces, these relationships should be known. It seems to me that the proper way to handle this, is to include this information in the TargetData class (or whatever platform-specific class is appropriate, I'm not really into this currently). When this information is known, we can define a "any memory" address space, which is a superset of the specific address spaces. This will allow a function to be (temporarily) defined to work with any memory, while still properly propagating information about the different memories through the program (and preventing mixup, since you can cast to the "any memory" address space, but not back). This requires clang to know about the relationships between address spaces, so it knows when to insert implicit casts and when to error out. This also requires the transformation passes to know about these relationships, so they know when a bitcast can be removed and types propagated. So, is it a plan to make these relationships available to the frontend/transformations in some way? If so, in what way? Gr. Matthijs -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 189 bytes Desc: Digital signature URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20080718/5bf51108/attachment.sig>
Cédric Venet
2008-Jul-18 21:40 UTC
[LLVMdev] Casting between address spaces and address space semantics
Hi,> When this information is known, we can define a "any memory" address > space, which is a superset of the specific address spaces. This willThe two uses I see for memory space have completely separate memory space (even different instruction to access). You can't define an 'any memory space', if your memory space is already mapped on a global memory, why bother to use memory space anyways... [ the two use cas I see are: - on small µC, you have acces to the RAM and to the program Flash as a ROM and sometimes to EEPROM, each with there own address (not event all the same size) and own instructions for load/store. - on one embedded µC I use, the I/O port isn't mapped in the RAM space and use different load/store instructions as RAM ]> allow a function to be (temporarily) defined to work with any memory, > while still properly propagating information about the different > memories through the program (and preventing mixup, since you can cast > to the "any memory" address space, but not back).I think that what you want is a basic form of C++ template allowing a function to take any address space. Like template, these function would be codegened as many time as there is of different (used) combination of address space in there parameter (since load and store are different for each address space). Regards, -- Cédric
Matthijs Kooijman
2008-Jul-21 15:05 UTC
[LLVMdev] Casting between address spaces and address space semantics
Hi all,> If I read the standard correctly, the properties of these address spaces can > be fully captured by defining the relationship between every pair of address > spaces (disjoint, identical, subset/superset). > > I think it would make sense to make these relationships backend/platform > specific, but for clang and the optimization passes to properly work with > address spaces, these relationships should be known. It seems to me that the > proper way to handle this, is to include this information in the TargetData > class (or whatever platform-specific class is appropriate, I'm not really into > this currently).To have something to actually talk about, I thought I'd create some code. I've got a patch attached which consists roughly of two parts: it adds the concept of address space relations to TargetData, and teaches InstructionCombining to remove casts based on those relations. The instcombine is pretty straightforward: If a bitcast changes the address space of a pointer in a way that is redundant, and all of the users of the bitcasts can be rewritten to use the original pointer as well (currently this means loads, stores and geps only), this is done. The TargetData part is a lot more tricky. I originally planned to simply add a virtual method to TargetData and create a specific subclass for our compiler, when I discovered that this is not how TargetData is supposed to work. Currently, TargetData consists only of a TargetDescription, which is basically only sizes and alignment info for different types. I solved this by creating a new class AddrspacesDescriptor, which has a single virtual method that defines the relationship between address spaces. This class contains a default implementation and can be subclassed to get a different implementation. A reference to a AddrspaceDescriptor is passed to TargetData's constructor, with a default. This means that all of the current code uses the default implementation. I'm not so sure what would be the correct way to make the various tool select other configurations, currently. Also, I'm quite positive that TargetData is not really the right place to put something like this. At least not in the way it is currently written, passing around references to AddrspacesDescriptor is far from ideal. However, I do not think we should be trying to define these relations between address spaces in a text-based format similar to the current TargetDescription. In that way, we would either loose a lot of flexibility, or end up with a huge TargetDescription (with one element for each combination of address spaces...). So, any comments on the patch and/or hints on how to do things better? Any thoughts on how selecting a address spaces configuration should work in the different tools? Gr. Matthijs Index: lib/Transforms/Scalar/InstructionCombining.cpp ==================================================================--- lib/Transforms/Scalar/InstructionCombining.cpp (revision 53716) +++ lib/Transforms/Scalar/InstructionCombining.cpp (working copy) @@ -400,6 +400,8 @@ unsigned GetOrEnforceKnownAlignment(Value *V, unsigned PrefAlign = 0); + /// Propagate address spaces through bitcasts + Instruction *PropagateAddressSpace(BitCastInst &CI); }; } @@ -7841,6 +7843,81 @@ return 0; } +/// Try to propagate the address space from the source of the bitcast into all +/// of its uses. CI casts between two pointer types with different address spaces. +Instruction *InstCombiner::PropagateAddressSpace(BitCastInst &CI) { + // Determine the new type. This is the old dest type with the address space + // changed. We can't simply use the source type, since the element type can be + // changed in the cast in addition to the address space. + const PointerType *DestTy = cast<PointerType>(CI.getDestTy()); + const PointerType *SrcTy = cast<PointerType>(CI.getSrcTy()); + const PointerType *NTy = PointerType::get(DestTy->getElementType(), SrcTy->getAddressSpace()); + + // Only propagate the address space when the source and dest address spaces + // are equivalent, or the source address space is a subset of the target + // address space. + AddrspacesDescriptor::AddrspaceRelation Rel; + Rel = TD->getAddrspaceRelation(SrcTy->getAddressSpace(), + DestTy->getAddressSpace()); + if (Rel != AddrspacesDescriptor::Equivalent + && Rel != AddrspacesDescriptor::Subset) + return 0; + + // We can propagate the address space if we can modify the type of CI. This + // requires that any uses of it should be modifiable. For now, we'll just + // propagate if all uses are loads and stores. + for (User::use_iterator I = CI.use_begin(), E = CI.use_end(); I != E; ++I) { + if (isa<StoreInst>(*I)) { + if (I.getOperandNo() != StoreInst::getPointerOperandIndex()) + return 0; // Can only handle store to this pointer + } else if (isa<GetElementPtrInst>(*I)) { + if (I.getOperandNo() != GetElementPtrInst::getPointerOperandIndex()) + return 0; + } else if (!isa<LoadInst>(*I)) { + // Can't handle other instructions + return 0; + } + } + // Insert a new bitcast, if needed + Value *NewVal = InsertBitCastBefore(CI.getOperand(0), NTy, CI); + + for (User::use_iterator I = CI.use_begin(), E = CI.use_end(); I != E; ++I) { + Instruction *NewInst; + Instruction *OldInst = cast<Instruction>(*I); + // Declare this here, so we can reuse the dyn_cast + GetElementPtrInst *GEP = NULL; + // Create a new instruction + if (isa<StoreInst>(*I)) + NewInst = new StoreInst(I->getOperand(0), NewVal); + else if ((GEP = dyn_cast<GetElementPtrInst>(*I))) { + // Insert a new GEP + std::vector<Value*> Indices(GEP->idx_begin(), GEP->idx_end()); + NewInst = GetElementPtrInst::Create(NewVal, Indices.begin(), Indices.end()); + } else // Must be LoadInst + NewInst = new LoadInst(NewVal); + + // Replace the old instruction + InsertNewInstBefore(NewInst, *OldInst); + NewInst->takeName(OldInst); + + // Put this in a Value*, since InsertBitCastBefore could return a Value* + // below. + Value *ReplaceWith = NewInst; + if (GEP) { + // Now insert a bitcast as well. We couldn't do this before, because the + // above two lines don't hold for the GEPs + const Type* GEPTy = PointerType::get(GEP->getType()->getElementType(), DestTy->getAddressSpace()); + ReplaceWith = InsertBitCastBefore(NewInst, GEPTy, *GEP); + } + ReplaceInstUsesWith(*OldInst, ReplaceWith); + EraseInstFromFunction(*OldInst); + + } + + // Return the old instruction, so it can be removed. + return &CI; +} + Instruction *InstCombiner::visitBitCast(BitCastInst &CI) { // If the operands are integer typed then apply the integer transforms, // otherwise just apply the common ones. @@ -7873,7 +7950,9 @@ // If the address spaces don't match, don't eliminate the bitcast, which is // required for changing types. if (SrcPTy->getAddressSpace() != DstPTy->getAddressSpace()) - return 0; + // If we're casting to address space zero, see if we can't just propagate + // the address space into the uses of CI. + return PropagateAddressSpace(CI); // If we are casting a malloc or alloca to a pointer to a type of the same // size, rewrite the allocation instruction to allocate the "right" type. Index: lib/Target/TargetData.cpp ==================================================================--- lib/Target/TargetData.cpp (revision 53716) +++ lib/Target/TargetData.cpp (working copy) @@ -231,8 +231,8 @@ } } -TargetData::TargetData(const Module *M) - : ImmutablePass((intptr_t)&ID) { +TargetData::TargetData(const Module *M, const AddrspacesDescriptor &Addrspaces) + : ImmutablePass((intptr_t)&ID), Addrspaces(Addrspaces) { init(M->getDataLayout()); } @@ -604,3 +604,5 @@ unsigned TargetData::getPreferredAlignmentLog(const GlobalVariable *GV) const { return Log2_32(getPreferredAlignment(GV)); } + +const AddrspacesDescriptor TargetData::DefaultAddrspaces; Index: include/llvm/Target/TargetData.h ==================================================================--- include/llvm/Target/TargetData.h (revision 53716) +++ include/llvm/Target/TargetData.h (working copy) @@ -63,12 +63,41 @@ std::ostream &dump(std::ostream &os) const; }; +/// This class is used to describe relations between different address spaces. +/// The default implementation defines all different address spaces as +/// equivalent. +/// +/// Create a subclass and pass it to TargetData's constructors to define +/// different behaviour. +class AddrspacesDescriptor { +public: + typedef enum { + Equivalent, + Disjoint, + Subset, + Superset + } AddrspaceRelation; + + /// getAddrspaceRelation - Returns the relation between the given address + /// spaces. Should be read left to right, for example Subset means that + /// Addrspace1 is a subset of Addrspace2. + virtual AddrspaceRelation getAddrspaceRelation( + int Addrspace1, + int Addrspace2) const { + return Equivalent; + } + + virtual ~AddrspacesDescriptor() {} +}; + class TargetData : public ImmutablePass { private: bool LittleEndian; ///< Defaults to false unsigned char PointerMemSize; ///< Pointer size in bytes unsigned char PointerABIAlign; ///< Pointer ABI alignment unsigned char PointerPrefAlign; ///< Pointer preferred alignment + const AddrspacesDescriptor &Addrspaces; ///< Address spaces descriptor + static const AddrspacesDescriptor DefaultAddrspaces; ///< Default address spaces descriptor //! Where the primitive type alignment data is stored. /*! @@ -109,20 +138,23 @@ /// /// @note This has to exist, because this is a pass, but it should never be /// used. - TargetData() : ImmutablePass(intptr_t(&ID)) { + TargetData() : ImmutablePass(intptr_t(&ID)), Addrspaces(DefaultAddrspaces) { assert(0 && "ERROR: Bad TargetData ctor used. " "Tool did not specify a TargetData to use?"); abort(); } /// Constructs a TargetData from a specification string. See init(). - explicit TargetData(const std::string &TargetDescription) - : ImmutablePass(intptr_t(&ID)) { + explicit TargetData(const std::string &TargetDescription, + const AddrspacesDescriptor &Addrspaces + = DefaultAddrspaces) + : ImmutablePass(intptr_t(&ID)), Addrspaces(Addrspaces) { init(TargetDescription); } /// Initialize target data from properties stored in the module. - explicit TargetData(const Module *M); + explicit TargetData(const Module *M, const AddrspacesDescriptor &Addrspaces + = DefaultAddrspaces); TargetData(const TargetData &TD) : ImmutablePass(intptr_t(&ID)), @@ -130,6 +162,7 @@ PointerMemSize(TD.PointerMemSize), PointerABIAlign(TD.PointerABIAlign), PointerPrefAlign(TD.PointerPrefAlign), + Addrspaces(TD.Addrspaces), Alignments(TD.Alignments) { } @@ -240,6 +273,15 @@ /// requested alignment (if the global has one). unsigned getPreferredAlignmentLog(const GlobalVariable *GV) const; + /// getAddrspaceRelation - Returns the relation between the given address + /// spaces. Should be read left to right, for example Subset means that + /// Addrspace1 is a subset of Addrspace2. + AddrspacesDescriptor::AddrspaceRelation getAddrspaceRelation(int Addrspace1, + int Addrspace2) { + + return Addrspaces.getAddrspaceRelation(Addrspace1, Addrspace2); + } + static char ID; // Pass identification, replacement for typeid }; -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 189 bytes Desc: Digital signature URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20080721/e5fb5120/attachment.sig>
Mon P Wang
2008-Jul-21 16:48 UTC
[LLVMdev] Casting between address spaces and address space semantics
Hi Matthijs, I was going to check in a change into clang assuming that by default, address spaces are separate unless there was some additional information (which since we didn't have any was always true) so that implicit cast will always generate an error. I'll hold off on this until we resolved what we want to do here. -- Mon Ping On Jul 18, 2008, at 2:04 PM, Matthijs Kooijman wrote:> When this information is known, we can define a "any memory" address > space, > which is a superset of the specific address spaces. This will allow > a function > to be (temporarily) defined to work with any memory, while still > properly > propagating information about the different memories through the > program (and > preventing mixup, since you can cast to the "any memory" address > space, but > not back). > > This requires clang to know about the relationships between address > spaces, so > it knows when to insert implicit casts and when to error out.
Possibly Parallel Threads
- [LLVMdev] Casting between address spaces and address space semantics
- [LLVMdev] Casting between address spaces and address space semantics
- [LLVMdev] Casting between address spaces and address space semantics
- [LLVMdev] Casting between address spaces and address space semantics
- [LLVMdev] replacing instructions