>> The compiler remembers for debugging purpose that x is defined as >> unsigned, and y as signed, no ? I'm not familiar with LLVM debug info, >> but maybe I can find this info there ? > > probably it can be extracted from debug info, but what if there is no debug > info? Can you please explain what you intend to do with this information - > then maybe we can suggest another way to solve your problem. >Thanks for your help. Actually, I'm working on a static analyzer that computes invariants at each basicBlock: "In basicBlock B, what is the set of possible assignments for each live values ?" and I obtains results such as "In B, we have 0 <= x <= 42" For an function's argument of type T, at the beginning of my analysis, I consider its set of possible values is all the set of all elements of type T. In the case of an int, it is [-2^31; 2^31-1], whereas it is [0, 2^32-1] for an unsigned... That's the reason why I was searching in the llvm bitcode something distinguishing these two types. Julien Henry
Hi Julien,> For an function's argument of type T, at the beginning of my analysis, I > consider its set of possible values is all the set of all elements of > type T. > In the case of an int, it is [-2^31; 2^31-1], whereas it is [0, 2^32-1] > for an unsigned... > That's the reason why I was searching in the llvm bitcode something > distinguishing these two types.there is no such information. You can still consider every type to have values in, say, T = [-2^31; 2^31-1]. Probably you are trying to deduce an interval of possible values for each register. You will need to allow intervals to wrap around the end of T since (eg) the basic "add" operator in LLVM uses modulo arithmetic, i.e. if you add 1 to 2^31-1 you get -2^31, which forces you to use wrapping intervals. Having wrapping intervals makes interval arithmetic more tricky but it can still be done. Whether an operation is signed (eg: sdiv, i.e. signed division) or unsigned (eg: udiv, i.e. unsigned division) you need to calculate the smallest interval that contains all possible result values given the intervals the arguments are known to belong to, or a conservative approximation to it. This is perfectly do-able, it's just that your interval for the possible values for the result probably won't always be as precise as you would like [*]. Some operations may be annotated with the "nsw" (no signed wrap) or "nuw" (no unsigned wrap) flags which allows you to do a better job if you are willing to assume that the program does not perform undefined behaviour. Ciao, Duncan. [*] LLVM has a ConstantRange that implements logic of this kind, so you don't even need to write it yourself!
On Wed, Mar 30, 2011 at 03:19, Julien Henry <Julien.Henry at imag.fr> wrote:> > Actually, I'm working on a static analyzer that computes invariants at > each basicBlock: "In basicBlock B, what is the set of possible > assignments for each live values ?" > and I obtains results such as "In B, we have 0 <= x <= 42" >Well, you have to find that from the comparisons that control the branches to the blocks, so you determine signedness from the signedness of the comparison instructions. (And get to have the fun of figuring out the right thing to do when the instructions mix signedness.) ~ Scott
me22 wrote:> On Wed, Mar 30, 2011 at 03:19, Julien Henry<Julien.Henry at imag.fr> wrote: >> >> Actually, I'm working on a static analyzer that computes invariants at >> each basicBlock: "In basicBlock B, what is the set of possible >> assignments for each live values ?" >> and I obtains results such as "In B, we have 0<= x<= 42" >> > > Well, you have to find that from the comparisons that control the > branches to the blocks, so you determine signedness from the > signedness of the comparison instructions. (And get to have the fun > of figuring out the right thing to do when the instructions mix > signedness.)It's not that hard :) There's already two ways to get a ConstantRange out of an ICmpInst; ICmpInst::makeConstantRange(predicate, apint) (eg., produce the range representing everything signed-less than 100) and ConstantRange::makeICmpRegion(predicate, constant-range) (eg., produce the range of all values that *could satisty* the predicate (signed less-than, etc.) any of the values in a range). Then you union them (if either way leads to block) or intersect them (if the conditions are and'd together). Nick> > ~ Scott > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev >
> there is no such information. You can still consider every type to have values > in, say, T = [-2^31; 2^31-1]. Probably you are trying to deduce an interval of > possible values for each register. You will need to allow intervals to wrap > around the end of T since (eg) the basic "add" operator in LLVM uses modulo > arithmetic, i.e. if you add 1 to 2^31-1 you get -2^31, which forces you to use > wrapping intervals. Having wrapping intervals makes interval arithmetic more > tricky but it can still be done. Whether an operation is signed (eg: sdiv, i.e. > signed division) or unsigned (eg: udiv, i.e. unsigned division) you need to > calculate the smallest interval that contains all possible result values given > the intervals the arguments are known to belong to, or a conservative > approximation to it. This is perfectly do-able, it's just that your interval > for the possible values for the result probably won't always be as precise as > you would like [*].Thanks for your suggestions. Actually, for now, I consider integers as mathematical integers (in ]-oo; +oo[), and that operators such as add don't overflow. (indeed, that's incorrect, but I'll work on this later ;) ) Suppose I approximate the set of possible values for my variable "x" with intervals. I think there are some cases for which I'll be unable to find precise enough invariants, for example, I'd like this: unsigned x; // x >= 0 if (x != 0) { // x >= 1 ... } but since I consider x in ]-oo; +oo[, I have this: unsigned x; // x in ]-oo; +oo[ if (x != 0) { // x in ]-oo; +oo[ (because of the abstraction by intervals) ... } Unfortunately, I think this loss of precision is unavoidable. Maybe if I consider programs that never do comparisons between signed and unsigned, I could retrieve types from the signedness of the comparison, as me22 suggests.> Some operations may be annotated with the "nsw" (no signed > wrap) or "nuw" (no unsigned wrap) flags which allows you to do a better job > if you are willing to assume that the program does not perform undefined > behaviour.I tried some examples to understand when these flags are set, because maybe it could help me retrieve types as well, but I confess I didn't understand well in which case they are set or not. Julien