Hey, I'm curious about two of the design questions in LLVM after reading the language reference. If there's preexisting material explaining this, just point me at it (I looked and couldn't find any).

- Why unsigned types rather than signed-only like Java/JVM? If I understand correctly, their behavior is only distinguishable in overflow situations, and the availability of hardware-assisted overflow detection varies quite a bit across platforms.

  In other words, abandoning overflow detection makes the duplication of types redundant, while requiring it would be a great burden on CPUs that don't have overflow exception hardware.

- Why identify functions by their type signatures? I have to assume that this is meant to allow overloading, but overloading is generally a higher-level concept (at the "language-aware" level).

  Chances are quite good that the sets of arguments to two different overloads of a function would wind up mapping down to the same LLVM-level types (since type information is partially lost in the transition). For example, the types for polar coordinates and cartesian coordinates are both {float,float}. So most languages will need to mangle symbols anyway.

  Actually, having an equal-by-name type in LLVM (in addition to the equal-by-structure array/struct type operators) would let compilers encode the distinction. But it might complicate linking conventions.

Thanks! Overall I was hugely impressed. I'd had my attention focused on architecture-independent code generation libraries for a while, but being able to specify an IR outside of the language used to manipulate it opens up way more possibilities. Neat stuff.

- a

On Wed, 2005-05-04 at 23:27 -0700, Adam Megacz wrote:
> Hey, I'm curious about two of the design questions in LLVM after
> reading the language reference. If there's preexisting material
> explaining this, just point me at it (looked and couldn't find any).
>
> - Why unsigned types rather than signed-only like Java/JVM? If I
>   understand correctly, their behavior is only distinguishable in
>   overflow situations, and the availability of hardware-assisted
>   overflow detection varies quite a bit across platforms.
>
>   In other words, abandoning overflow detection makes the
>   duplication of types redundant, while requiring it would be a
>   great burden on CPUs that don't have overflow exception hardware.

Yes, you're right. This has been a desired change for quite some time now. Unfortunately, it has a huge impact on nearly every part of LLVM. We will probably do it around the 2.0 time frame, when we can afford to break bytecode compatibility and generally clean up a lot of other things as well.

> - Why identify functions by their type signatures? I have to assume
>   that this is meant to allow overloading, but overloading is
>   generally a higher-level concept (at the "language-aware" level).

As with all other things in LLVM, values are partitioned by their type, not by their name. That is, identity is determined by type (structure) equivalence. In order to distinguish functions we must type them. A byproduct of this is that we get overloading for free, but it's not the main concern.

>   Chances are quite good that the sets of arguments to two different
>   overloads of a function would wind up mapping down to the same
>   LLVM-level types (since type information is partially lost in the
>   transition). For example, the types for polar coordinates and
>   cartesian coordinates are both {float,float}. So most languages
>   will need to mangle symbols anyways.

Remember that LLVM is "low level". How a higher-level language decides to deal with ambiguity in its runtime library is up to it. LLVM just provides the capability to express what the higher-level language needs.

>   Actually, having an equal-by-name (in addition to the
>   equal-by-structure array/struct type operators) type in LLVM would
>   let compilers encode the distinction. But it might complicate
>   linking conventions.

We've thought about this and it gets debated from time to time. I think I'll let Chris answer it, however.

> Thanks! Overall I was hugely impressed. I'd had my attention focused
> on architecture-independant code generation libraries for a while, but
> being able to specify an IR outside of the language used to manipulate
> it opens up way more possibilities. Neat stuff.

Yup! Glad you like it :)

Reid.
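
The structural-equivalence point above can be illustrated with a small model. This is only a sketch, not LLVM's implementation: types are modeled as plain Python tuples, and `struct`, `signature`, and `mangle` are hypothetical helpers invented for the example. It shows that two source-level types with the same layout become indistinguishable once lowered, so a front end that wants to keep two overloads apart has to encode the distinction in the symbol name.

```python
FLOAT = "float"


def struct(*fields):
    """Model a structural struct type: its identity is just its field types."""
    return ("struct", fields)


# Two distinct source-level types that lower to the same structural type.
POLAR = struct(FLOAT, FLOAT)      # (radius, angle)
CARTESIAN = struct(FLOAT, FLOAT)  # (x, y)


def signature(name, ret, *params):
    """Model a function identified by its name plus its structural type."""
    return (name, ret, params)


def mangle(name, *source_type_names):
    """Hypothetical front-end mangling: append length-prefixed source type names."""
    return name + "".join(f"_{len(t)}{t}" for t in source_type_names)


if __name__ == "__main__":
    # Structurally, the two overloads of "norm" are one and the same function.
    print(signature("norm", FLOAT, POLAR) == signature("norm", FLOAT, CARTESIAN))
    # Mangling with the source-level type names keeps them apart.
    print(mangle("norm", "Polar"), mangle("norm", "Cartesian"))
```

The mangling scheme here is arbitrary; any injective encoding of the source-level type names would serve the same purpose.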

On Wed, 4 May 2005, Adam Megacz wrote:
> Hey, I'm curious about two of the design questions in LLVM after
> reading the language reference. If there's preexisting material
> explaining this, just point me at it (looked and couldn't find any).

No problem. :)

> - Why unsigned types rather than signed-only like Java/JVM? If I
>   understand correctly, their behavior is only distinguishable in
>   overflow situations, and the availability of hardware-assisted
>   overflow detection varies quite a bit across platforms.

In this case, it's not about overflow detection. Some operators behave differently on signed vs. unsigned data (e.g. division, remainder, <, >, etc.). Over time, I would like to slowly move the signedness distinction out of the type system and into the operators (e.g. we would only have i1/i8/i16/i32/i64, but would get SMOD vs. UMOD).

> - Why identify functions by their type signatures? I have to assume
>   that this is meant to allow overloading, but overloading is
>   generally a higher-level concept (at the "language-aware" level).

You're right.

>   Chances are quite good that the sets of arguments to two different
>   overloads of a function would wind up mapping down to the same
>   LLVM-level types (since type information is partially lost in the
>   transition). For example, the types for polar coordinates and
>   cartesian coordinates are both {float,float}. So most languages
>   will need to mangle symbols anyways.

You're right.

>   Actually, having an equal-by-name (in addition to the
>   equal-by-structure array/struct type operators) type in LLVM would
>   let compilers encode the distinction. But it might complicate
>   linking conventions.

Yup, you're absolutely right :) This is something that exists for historical reasons. It is another minor thing that we will be moving away from in time.

> Thanks! Overall I was hugely impressed. I'd had my attention focused
> on architecture-independant code generation libraries for a while, but
> being able to specify an IR outside of the language used to manipulate
> it opens up way more possibilities. Neat stuff.

Great! :)

-Chris

-- 
http://nondot.org/sabre/
http://llvm.cs.uiuc.edu/
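
To make the point about operators concrete, here is a minimal sketch in Python that models 32-bit two's-complement values as bit patterns. The helper names (`as_signed`, `add`, `udiv`, `sdiv`, `urem`, `srem`, `ult`, `slt`) are chosen for illustration only and are not taken from the thread; signed division is assumed to truncate toward zero, as in C. Addition yields the same bit pattern under either interpretation, while division, remainder, and comparison do not, which is why signedness can live on the operator rather than on the type.

```python
MASK32 = 0xFFFFFFFF


def as_signed(bits):
    """Interpret a 32-bit pattern as a two's-complement signed integer."""
    bits &= MASK32
    return bits - (1 << 32) if bits & 0x80000000 else bits


def add(a, b):
    """Wrapping add: the result bits do not depend on signedness."""
    return (a + b) & MASK32


def udiv(a, b):
    """Unsigned division on the raw bit patterns."""
    return (a & MASK32) // (b & MASK32)


def urem(a, b):
    """Unsigned remainder on the raw bit patterns."""
    return (a & MASK32) % (b & MASK32)


def _trunc_quot(x, y):
    """Quotient rounded toward zero (Python's // rounds toward -infinity)."""
    q = abs(x) // abs(y)
    return -q if (x < 0) != (y < 0) else q


def sdiv(a, b):
    """Signed division, result returned as a 32-bit pattern."""
    return _trunc_quot(as_signed(a), as_signed(b)) & MASK32


def srem(a, b):
    """Signed remainder (sign follows the dividend), as a 32-bit pattern."""
    x, y = as_signed(a), as_signed(b)
    return (x - y * _trunc_quot(x, y)) & MASK32


def ult(a, b):
    """Unsigned less-than."""
    return (a & MASK32) < (b & MASK32)


def slt(a, b):
    """Signed less-than."""
    return as_signed(a) < as_signed(b)


if __name__ == "__main__":
    a, b = 0xFFFFFFF9, 2  # a is -7 when read as signed
    print(hex(add(a, b)), hex(udiv(a, b)), hex(sdiv(a, b)))
    print(hex(urem(a, b)), hex(srem(a, b)), ult(a, b), slt(a, b))
```

With `a = 0xFFFFFFF9` and `b = 2`, the sum is one bit pattern that reads as -5 signed, but the signed and unsigned quotients, remainders, and comparisons all disagree.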