Is it possible that the AVX support in the JIT engine or x86-64 backend
is not mature? I am getting segfaults when switching from a vector
length 4 to 8 in my application. I isolated the barfing function and it
still segfaults in the minimal setup:
The IR attached implements the following simple function:
void bar(int start, int end, int ignore, bool add, bool addme,
         float* out, float* in)
{
   int loop_start = add ? start+addme : start;
   int loop_end = add ? end+addme : end;
   loop_start /= 8;
   loop_end /= 8;
   for ( int i = loop_start ; i < loop_end ; ++i )
     for ( int q = 0 ; q < 8 ; ++q )
       out[ i * 8 + q ] = in[ i * 8 + q ];
}
The main.cc program implements the following:
Set loop vectorizer min trip count to 4
Create Module from file (given as program argument)
Set loop vectorizer debug info output
Optimize including loop vectorization
Create payload
Call function
Check result
I tried this on various CPUs. On those that support only SSE, the program
works fine (vectorized to length 4). On a CPU with AVX support, the loop
vectorizer picks vector length 8; JIT'ing the function still works, but
the call to the function then segfaults.
The LLVM website says that the x86-64 backend includes support for ISA
extensions such as MMX and SSE, but it doesn't explicitly mention AVX. Is
this on purpose, and is there no AVX support?
Frank
-------------- next part --------------
define void @main(i64 %arg0, i64 %arg1, i64 %arg2, i1 %arg3, i64 %arg4, float*
noalias %arg5, float* noalias %arg6) {
entrypoint:
  br i1 %arg3, label %L0, label %L1
L0:                                               ; preds = %entrypoint
  %0 = add nsw i64 %arg0, %arg4
  %1 = add nsw i64 %arg1, %arg4
  br label %L2
L1:                                               ; preds = %entrypoint
  br label %L2
L2:                                               ; preds = %L0, %L1
  %2 = phi i64 [ %arg0, %L1 ], [ %0, %L0 ]
  %3 = phi i64 [ %arg1, %L1 ], [ %1, %L0 ]
  %4 = sdiv i64 %2, 8
  %5 = sdiv i64 %3, 8
  br label %L5
L3:                                               ; preds = %L7, %L5
  %6 = phi i64 [ %33, %L7 ], [ 0, %L5 ]
  %7 = mul i64 %32, 8
  %8 = add nsw i64 %7, %6
  %9 = mul i64 %32, 1
  %10 = add nsw i64 %9, 0
  %11 = mul i64 %10, 1
  %12 = add nsw i64 %11, 0
  %13 = mul i64 %12, 1
  %14 = add nsw i64 %13, 0
  %15 = mul i64 %14, 8
  %16 = add nsw i64 %15, %6
  %17 = getelementptr float* %arg6, i64 %16
  %18 = load float* %17
  %19 = mul i64 %32, 8
  %20 = add nsw i64 %19, %6
  %21 = mul i64 %32, 1
  %22 = add nsw i64 %21, 0
  %23 = mul i64 %22, 1
  %24 = add nsw i64 %23, 0
  %25 = mul i64 %24, 1
  %26 = add nsw i64 %25, 0
  %27 = mul i64 %26, 8
  %28 = add nsw i64 %27, %6
  %29 = getelementptr float* %arg5, i64 %28
  store float %18, float* %29
  br label %L7
L4:                                               ; preds = %L7
  %30 = add nsw i64 %32, 1
  %31 = icmp sge i64 %30, %5
  br i1 %31, label %L6, label %L5
L5:                                               ; preds = %L4, %L2
  %32 = phi i64 [ %30, %L4 ], [ %4, %L2 ]
  br label %L3
L6:                                               ; preds = %L4
  ret void
L7:                                               ; preds = %L3
  %33 = add nsw i64 %6, 1
  %34 = icmp sge i64 %33, 8
  br i1 %34, label %L4, label %L3
}
-------------- next part --------------
A non-text attachment was scrubbed...
Name: main.cc
Type: text/x-c++src
Size: 5310 bytes
Desc: not available
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20131110/3fa2b862/attachment.cc>
-------------- next part --------------
LLVM=${HOME}/toolchain/install/llvm
CONFIG=$(LLVM)/bin/llvm-config
CXXFLAGS=$(shell $(CONFIG) --cxxflags) -I. -std=c++0x
LDFLAGS=$(shell $(CONFIG) --ldflags)
LIBS=$(shell $(CONFIG) --libs) 
CXX=g++-4.8
OBJS= main.o
TARGET=main
all: $(TARGET)
main: $(OBJS)
	$(CXX) -o $@ $(CXXFLAGS) $^ $(LIBS) $(LDFLAGS)
%.o: %.cc
	$(CXX) $(CXXFLAGS) -c $<
clean:
	rm -rf $(TARGET) $(OBJS) *~
Do you have a stack trace of the segfault?

We have two different code emitters for X86 in LLVM: one used by the
normal compiler and MCJIT, and another used by the legacy JIT. All of the
test cases for AVX support go through the first one, so it gets the most
attention. We try to keep the legacy JIT in sync with it, but have a
history of failing at that. The stack trace of the segfault may point to
what part is missing.

On Sun, Nov 10, 2013 at 3:32 PM, Frank Winter <fwinter at jlab.org> wrote:
> Is it possible that the AVX support in the JIT engine or x86-64 backend
> is not mature? I am getting segfaults when switching from a vector
> length 4 to 8 in my application.
> [...]

--
~Craig
It's not much.

(gdb) bt
#0  0x00007ffff7f6506b in ?? ()
#1  0x000000000045d01a in main () at main.cc:165

Line 165 is the call to the function that was compiled by the JIT. So
JIT'ing the function went well, but either the generated code or the
function pointer is somehow corrupt.

There is no particular reason why I am working with the legacy interface.
Would you recommend using the MCJIT interface in general?

Any ideas how to proceed here?

Frank

On 10/11/13 20:16, Craig Topper wrote:
> Do you have a stack trace of the segfault?
> [...]