I propose a new intrinsic "llvm.memcmp" that compares a block of memory
for equality (a subset of the libc behavior). Backends are free to use the
alignment to optimize using wider than byte operations. Since the result is
only equal/not-equal, byte order is not important.

For languages that support array compares, this would be very useful.

Syntax:

declare i1 @llvm.memcmp(i8* <arg1>, i8* <arg2>, i32 <len>, i32 <align>)
declare i1 @llvm.memcmp(i8* <arg1>, i8* <arg2>, i64 <len>, i32 <align>)

Overview:

The 'llvm.memcmp.*' intrinsic compares two blocks of memory for equality,
returning true if they are equal.

Arguments:

The first two arguments are pointers to the memory to be compared.
The third argument is an integer argument specifying the number of bytes to
compare, the fourth argument is the alignment of the two memory locations.

If the call to this intrinsic has an alignment value that is not 0 or 1,
then the caller guarantees that both source pointers are aligned to that boundary.
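As a rough illustration of the semantics described in the proposal, the following C sketch models an equality-only compare that takes an alignment hint. The function name `mem_equal` and the fixed 8-byte chunk width are assumptions made for this example only; they are not part of the proposal or of LLVM, and a real backend would choose chunk widths per target.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical reference model of the proposed intrinsic.
 * Returns true if the first len bytes at a and b are equal.
 * An align of 0 or 1 means "no alignment guarantee". */
bool mem_equal(const void *a, const void *b, size_t len, unsigned align)
{
    const unsigned char *pa = a;
    const unsigned char *pb = b;

    if (align >= 8) {
        /* The caller promised this alignment; verify it in debug builds. */
        assert((uintptr_t)pa % 8 == 0 && (uintptr_t)pb % 8 == 0);

        /* Compare 8 bytes at a time. Only equality matters, so the
         * byte order inside each chunk is irrelevant. */
        while (len >= 8) {
            uint64_t wa, wb;
            memcpy(&wa, pa, sizeof wa);
            memcpy(&wb, pb, sizeof wb);
            if (wa != wb)
                return false;
            pa += 8;
            pb += 8;
            len -= 8;
        }
    }

    /* Remaining tail, or the whole range when no alignment is known. */
    for (; len > 0; --len) {
        if (*pa++ != *pb++)
            return false;
    }
    return true;
}
```

The chunk loads go through `memcpy` so the sketch stays within standard C; the alignment hint only decides whether the wide path is attempted.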
On Fri, Aug 20, 2010 at 1:03 PM, Bagel <bagel99 at gmail.com> wrote:
> I propose a new intrinsic "llvm.memcmp" that compares a block of memory
> for equality (a subset of the libc behavior). Backends are free to use the
> alignment to optimize using wider than byte operations. Since the result is
> only equal/not-equal, byte order is not important.
>
> For languages that support array compares, this would be very useful.
>
> Syntax:
>
> declare i1 @llvm.memcmp(i8* <arg1>, i8* <arg2>, i32 <len>, i32 <align>)
> declare i1 @llvm.memcmp(i8* <arg1>, i8* <arg2>, i64 <len>, i32 <align>)

The following would be preferred:
declare i1 @llvm.memcmp.i32(i8* <arg1>, i8* <arg2>, i32 <len>, i32 <align>)
declare i1 @llvm.memcmp.i64(i8* <arg1>, i8* <arg2>, i64 <len>, i32 <align>)

> Overview:
>
> The 'llvm.memcmp.*' intrinsic compares two blocks of memory for equality,
> returning true if they are equal.
>
> Arguments:
>
> The first two arguments are pointers to the memory to be compared.
> The third argument is an integer argument specifying the number of bytes to
> compare, the fourth argument is the alignment of the two memory locations.
>
> If the call to this intrinsic has an alignment value that is not 0 or 1,
> then the caller guarantees that both source pointers are aligned to that boundary.

I assume <align> is required to be a constant integer?

Also, I assume this is supposed to guarantee that arg1 and arg2 point
to len bytes of valid memory?

Most importantly, is the overhead of calling memcmp actually
significant for your application? Are there enough other people in
the same situation to make this worth implementing? This is unlikely
to provide significant performance improvements for C/C++ code...

-Eli
On 08/20/2010 04:06 PM, Eli Friedman wrote:
> On Fri, Aug 20, 2010 at 1:03 PM, Bagel <bagel99 at gmail.com> wrote:
>> I propose a new intrinsic "llvm.memcmp" that compares a block of memory
>> for equality (a subset of the libc behavior). Backends are free to use the
>> alignment to optimize using wider than byte operations. Since the result is
>> only equal/not-equal, byte order is not important.
>>
>> For languages that support array compares, this would be very useful.
>>
>> Syntax:
>>
>> declare i1 @llvm.memcmp(i8* <arg1>, i8* <arg2>, i32 <len>, i32 <align>)
>> declare i1 @llvm.memcmp(i8* <arg1>, i8* <arg2>, i64 <len>, i32 <align>)
>
> The following would be preferred:
> declare i1 @llvm.memcmp.i32(i8* <arg1>, i8* <arg2>, i32 <len>, i32 <align>)
> declare i1 @llvm.memcmp.i64(i8* <arg1>, i8* <arg2>, i64 <len>, i32 <align>)

OK. I had assumed that it would be overloaded as memcpy is.

>> Overview:
>>
>> The 'llvm.memcmp.*' intrinsic compares two blocks of memory for equality,
>> returning true if they are equal.
>>
>> Arguments:
>>
>> The first two arguments are pointers to the memory to be compared.
>> The third argument is an integer argument specifying the number of bytes to
>> compare, the fourth argument is the alignment of the two memory locations.
>>
>> If the call to this intrinsic has an alignment value that is not 0 or 1,
>> then the caller guarantees that both source pointers are aligned to that boundary.
>
> I assume <align> is required to be a constant integer?

Yes.

> Also, I assume this is supposed to guarantee that arg1 and arg2 point
> to len bytes of valid memory?

Yes.

> Most importantly, is the overhead of calling memcmp actually
> significant for your application? Are there enough other people in
> the same situation to make this worth implementing? This is unlikely
> to provide significant performance improvements for C/C++ code...

I suppose it wouldn't help C/C++ much. But with languages that support
array compares directly, e.g. "D", memcmp() is not actually called and
array compares can be quite common. For example, in IPv6 address compares,
where 16 bytes are compared, using bigger chunks (assuming the addresses
are suitably aligned) can be a big saving in an IPv6 stack.

Of course, each front end could expand an aligned memcmp into chunk
compares, but this does require knowledge of what widths the target
can handle.

bagel