thr3ads.net - search: "rsp"

Displaying 20 results from an estimated 1929 matches for "rsp".

[PATCH 0/3] x86: adjust entry frame generation

2012 Oct 02

This set of patches converts the way frames get created from using PUSHes/POPs to using MOVes, thus making it possible (in certain cases) to avoid saving/restoring part of the register set. While the place where the (small) win comes from varies between CPUs, the net effect is a 1 to 2% reduction on a combined interruption entry and exit when the full state save can be avoided. 1: use MOV

[LLVMdev] Reserved call frame

2011 Sep 09

Hello, I am trying to disable the reserved call frame in the x86 backend by setting the hasReservedCallFrame function, in lib/Target/X86/X86RegisterInfo.cpp, to always return false. When doing this I get the correct frame size and the correct sub/add rsp instructions around the call, but it seems that the offsets from rsp are not correctly updated between the sub instruction and the call. Is this some kind of bug, or should I make more changes to disable the reserved call frame? Here is a piece of the output of the testcase with the re...

What does a dead register mean?

2018 Feb 06

Hi, My understanding of a "dead" register is a def that is never used. However, when I dump the MI after reg alloc on a simple program I see the following sequence: ADJCALLSTACKDOWN64 0, 0, 0, *implicit-def dead %rsp*, implicit-def dead %eflags, implicit-def dead %ssp, implicit %rsp, implicit %ssp CALL64pcrel32 @foo, <regmask %bh %bl %bp %bpl %bx %ebp %ebx %rbp %rbx %r12 %r13 %r14 %r15 %r12b %r13b %r14b %r15b %r12d %r13d %r14d %r15d %r12w %r13w %r14w %r15w>, *implicit %rsp*, implicit %ssp, implicit-def %r...

[Codegen bug in LLVM 3.8?] br following `fcmp une` is present in ll, absent in asm

2017 Mar 01

...%603 = fcmp une double %rtb_Sum3_737, 0.000000e+00 %_rtB_739 = load %B_repro_T*, %B_repro_T** %_rtB_, align 8 br i1 %603, label %true73, label %false74 Now, in broken.asm, notice the same merge128 is missing the branch instruction: .LBB6_55: # %merge128 movq 184(%rsp), %rcx movq %rax, 728(%rcx) movq 184(%rsp), %rax movq 728(%rax), %rcx movq %rcx, 736(%rax) movq 184(%rsp), %rax movq $0, 744(%rax) movq 184(%rsp), %rax movq $0, 752(%rax) movq 184(%rsp), %rax movq $0, 760(%rax) movq 176(%rsp), %rax movsd 5608(%rax), %xmm0 # xmm0 = mem[0],zero movq 184(%rsp),...

[LLVMdev] Wrong assembly is written for x86_64 target in JIT without optimization?

2011 Jan 12

...8 to i32* %v10 = load i32* %v9 %op.dual.plus.uint32 = add i32 %v5, %v10 br label %lbl2 lbl2: ret i32 %op.dual.plus.uint32 } declare i32 @yfunc(i32, i32, i32) --- assembly obtained in gdb for JITted code --- 0x0000000800989bf0: push %rbp 0x0000000800989bf1: mov %rsp,%rbp 0x0000000800989bf4: sub $0x30,%rsp 0x0000000800989bf8: mov %rdi,0xfffffffffffffff8(%rbp) 0x0000000800989bfc: mov %rsi,0xfffffffffffffff0(%rbp) 0x0000000800989c00: mov $0x1,%edi 0x0000000800989c05: xor %eax,%eax 0x0000000800989c07...

[LLVMdev] "equivalent" .ll files diverge after optimizations are applied

2010 Aug 31

...ection __TEXT,__text,regular,pure_instructions .globl __ZN7WebCore15GraphicsContext19roundToDevicePixelsERKNS_9FloatRectE .align 4, 0x90 __ZN7WebCore15GraphicsContext19roundToDevicePixelsERKNS_9FloatRectE: ## @_ZN7WebCore15GraphicsContext19roundToDevicePixelsERKNS_9FloatRectE ## BB#0: subq $24, %rsp movq %rsi, %rdx movl $0, 16(%rsp) movl $0, 20(%rsp) movl $0, 8(%rsp) movl $0, 12(%rsp) movq 8(%rdi), %rsi leaq 16(%rsp), %rcx leaq 8(%rsp), %r8 callq __ZN7WebCore5mouniEPNS_15GraphicsContextEPNS_30GraphicsContextPlatformPrivateERKNS_9FloatRectERNS_10FloatPointES8_ movss 8(%rsp), %xmm1 mo...

xenstore-write segfault

2010 Jul 12

...e 0 (vif) could not be connected. Hotplug scripts not working. After that I checked my var/log/messages and it showed lots of xenstore-write segfaults; the exact error is attached inline. Jul 12 07:26:46 centosxcat2 kernel: xenstore-read[16643]: segfault at 0000000000000000 rip 0000000000000000 rsp 00007fff2036def8 error 4 Now I don't know whether the segfault errors were caused by the first error or are independent; I think they occurred after I tried to create a domU with bridge networking. Also, this error (segfault) is occurring continuously without stop and f...

[LLVMdev] Problem with MachineFunctionPass and JMP

2013 May 13

...4)).addMBB(origBB.at(1)); newEntry->push_back(plop); return false; } And here is the resulting code (it's a simple program with some 'if'): (null) BB#4 JMP_4 <BB#0> if.end BB#3 %RDI<def> = LEA64r %RIP, 1, %noreg, <ga:@.str2>, %noreg ADJCALLSTACKDOWN64 0, %RSP<imp-def>, %EFLAGS<imp-def>, %RSP<imp-use> %AL<def> = MOV8ri 0 CALL64pcrel32 <ga:@printf>, <regmask>, %RSP<imp-use>, %AL<imp-use,kill>, %RDI<imp-use,kill>, %EAX<imp-def> ADJCALLSTACKUP64 0, 0, %RSP<imp-def>, %EFLAGS<imp-def>, %R...

[LLVMdev] "equivalent" .ll files diverge after optimizations are applied

2010 Aug 31

...memcpy. I would certainly expect the memcpy expansion to be smart enough to avoid using MM registers, though; that's a serious bug if it isn't. movd %xmm0, %rax movd %rax, %mm0 movq2dq %mm0, %xmm1 movq2dq %mm0, %xmm2 punpcklqdq %xmm2, %xmm1 ## xmm1 = xmm1[0],xmm2[0] movq 16(%rsp), %rax movd %rax, %mm0 movq2dq %mm0, %xmm0 punpcklqdq %xmm2, %xmm0 ## xmm0 = xmm0[0],xmm2[0] On Aug 31, 2010, at 11:18 AMPDT, Argyrios Kyrtzidis wrote: > Hi, > > I've attached 2 .ll files which are supposed to be equivalent but > 'unopt-fail.ll' causes a crash in...

[LLVMdev] "equivalent" .ll files diverge after optimizations are applied

2010 Aug 31

...ect the memcpy expansion to be smart enough to avoid using MM registers, though; that's a serious bug if it isn't. > > movd %xmm0, %rax > movd %rax, %mm0 > movq2dq %mm0, %xmm1 > movq2dq %mm0, %xmm2 > punpcklqdq %xmm2, %xmm1 ## xmm1 = xmm1[0],xmm2[0] > movq 16(%rsp), %rax > movd %rax, %mm0 > movq2dq %mm0, %xmm0 > punpcklqdq %xmm2, %xmm0 ## xmm0 = xmm0[0],xmm2[0] > > > On Aug 31, 2010, at 11:18 AMPDT, Argyrios Kyrtzidis wrote: > >> Hi, >> >> I've attached 2 .ll files which are supposed to be equivalent but...

Vectorization of math function failed?

2020 Aug 31

...i] = sinf(x[i]); } Which I compiled with: clang++ -O3 -march=native -mtune=native -c -o vec.o vec.cc -lmvec -fno-math-errno And here is what I get: vec.o: file format elf64-x86-64 Disassembly of section .text: 0000000000000000 <_Z4fct1Dv4_f>: 0: 48 83 ec 48 sub $0x48,%rsp 4: c5 f8 29 04 24 vmovaps %xmm0,(%rsp) 9: e8 00 00 00 00 callq e <_Z4fct1Dv4_f+0xe> e: c5 f8 29 44 24 30 vmovaps %xmm0,0x30(%rsp) 14: c5 fa 16 04 24 vmovshdup (%rsp),%xmm0 19: e8 00 00 00 00 callq 1e <_Z4fct1Dv4_f+0x1e> 1e: c5 f8 29 44 24...

[LLVMdev] "equivalent" .ll files diverge after optimizations are applied

2010 Aug 31

...nough to avoid using MM >> registers, though; that's a serious bug if it isn't. >> >> movd %xmm0, %rax >> movd %rax, %mm0 >> movq2dq %mm0, %xmm1 >> movq2dq %mm0, %xmm2 >> punpcklqdq %xmm2, %xmm1 ## xmm1 = xmm1[0],xmm2[0] >> movq 16(%rsp), %rax >> movd %rax, %mm0 >> movq2dq %mm0, %xmm0 >> punpcklqdq %xmm2, %xmm0 ## xmm0 = xmm0[0],xmm2[0] >> >> >> On Aug 31, 2010, at 11:18 AMPDT, Argyrios Kyrtzidis wrote: >> >>> Hi, >>> >>> I've attached 2 .ll files which...

UnoC function in survAUC for censoring-adjusted C-index

2011 Jun 24

...nt to avoid using the right-end tail of the KM curve). Copying from the example in the help file: TR <- ovarian[1:16,] TE <- ovarian[17:26,] train.fit <- coxph(Surv(futime, fustat) ~ age,x=TRUE, y=TRUE, method="breslow", data=TR) lpnew <- predict(train.fit, newdata=TE) Surv.rsp <- Surv(TR$futime, TR$fustat) Surv.rsp.new <- Surv(TE$futime, TE$fustat) If time is left NULL, the maximum futime in the TE data is assumed: UnoC(Surv.rsp, Surv.rsp.new, lpnew, time=NULL) #same as UnoC(Surv.rsp, Surv.rsp.new, lpnew, time=max(Surv.rsp.new[,1])) #which incident...

[LLVMdev] Strange behaviour with x86-64 windows, bad call instruction address

2012 Feb 14

...'ve seen when debugging the assembly is that the 3 that work all have JIT function pointer addresses less than a 32 bit value but the one that is failing has a 64 bit address, as indicated in the snippet below: 000007FFFFC511D7 pop rbp 000007FFFFC511D8 ret 000007FFFFC511D9 sub rsp,20h 000007FFFFC511DD mov rcx,qword ptr [rbp-70h] 000007FFFFC511E1 mov edx,0FFFFFFFEh 000007FFFFC511E6 xor r8d,r8d 000007FFFFC511E9 call rsi 000007FFFFC511EB add rsp,20h 000007FFFFC511EF test al,1 000007FFFFC511F2 je 000007FFFFC511C3 0000...

Code generation option for wide integers on x86_64?

2020 Aug 17

Is there an existing option in X86_64 target code generator to emit a loop for the following code: define i4096 @add(i4096 %a, i4096 %b) alwaysinline { %c = add i4096 %a, %b ret i4096 %c } instead of: movq %rdi, %rax addq 96(%rsp), %rsi adcq 104(%rsp), %rdx movq %rdx, 8(%rdi) movq %rsi, (%rdi) adcq 112(%rsp), %rcx movq %rcx, 16(%rdi) adcq 120(%rsp), %r8 movq %r8, 24(%rdi) adcq 128(%rsp), %r9 movq %r9, 32(%rdi) movq 8(%rsp), %rcx adcq 136(%rsp), %rcx movq %rcx, 40(%...

[PATCH v1 06/27] x86/entry/64: Adapt assembly for PIE support

2017 Oct 11

...This call instruction is handled specially in stub_ptregs_64. * It might end up jumping to the slow path. If it jumps, RAX * and all argument registers are clobbered. */ - call *sys_call_table(, %rax, 8) + call *(%r11, %rax, 8) .Lentry_SYSCALL_64_after_fastpath_call: movq %rax, RAX(%rsp) @@ -334,7 +337,8 @@ ENTRY(stub_ptregs_64) * RAX stores a pointer to the C function implementing the syscall. * IRQs are on. */ - cmpq $.Lentry_SYSCALL_64_after_fastpath_call, (%rsp) + leaq .Lentry_SYSCALL_64_after_fastpath_call(%rip), %r11 + cmpq %r11, (%rsp) jne 1f /* @@ -1172,7 +1...

[LLVMdev] Wrong assembly is written for x86_64 target in JIT without optimization?

2011 Jan 12

...alue > 0xffffffffffffffe0(%rbp) is used without being ever initialized (see my > comment in asm). > Same code on i386 works fine, with and w/out optimization. > > My guess is that this is a bug in LLVM. > > Yuri The uninitialized load is there in the llc version as well (16(%rsp)). It looks like it's been erroneously moved ahead of the spill to that slot. Please file a PR. (I can see this on Darwin with llc -O0). > --- llvm --- > %struct.mystruct = type { i32, i8, i8, i8, i8 } > > define i32 @xfunc(%struct.mystruct* %a1, %struct.mystruct* %a2) { >...

Suggestion: Custom filename patterns for non-Sweave vignettes

2013 Feb 15

...e registered "weave" function, (d) and possibly post process the generated weave artifact (e.g. a *.tex file). I'd like to propose to extend this non-Sweave mechanism to allow for any filename patterns still using a very similar setup. Here is how I'd like to see it work with RSP vignettes (cf. the R.rsp package): tools::vignetteEngine("rsp", weave=rspWeave, tangle=rspTangle, patterns="[.]rsp$") Argument 'patterns' could default to patterns=c("[.][RrSs](nw|tex)$", "[.]Rmd$"). This is just a sketch/mock up and it may be th...

[LLVMdev] RegAllocFast uses too much stack

2011 Jul 11

...he registers to the stack before each call, we also set up 0, 1 and 2 into regs first, then spill them and don't even get a chance to reuse stack slots. That's just bad: pushq %rax movl $2, %edi movl $1, %eax movl $0, %ecx movl %edi, 4(%rsp) # 4-byte Spill movl %ecx, %edi movl %eax, (%rsp) # 4-byte Spill callq foo movl (%rsp), %edi # 4-byte Reload callq foo movl 4(%rsp), %edi # 4-byte Reload callq foo popq %ra...

[LLVMdev] Fwd: Profile (-pg) segfault

2013 Sep 13

...nu/libc.so.6 (gdb) bt #0 0x00007ffff7b1245b in mcount () from /lib/x86_64-linux-gnu/libc.so.6 #1 0x00007ffff7dd6588 in ?? () from /lib/x86_64-linux-gnu/libc.so.6 #2 0x0000000000000000 in ?? () (gdb) disas Dump of assembler code for function mcount: 0x00007ffff7b12430 <+0>: sub $0x38,%rsp 0x00007ffff7b12434 <+4>: mov %rax,(%rsp) 0x00007ffff7b12438 <+8>: mov %rcx,0x8(%rsp) 0x00007ffff7b1243d <+13>: mov %rdx,0x10(%rsp) 0x00007ffff7b12442 <+18>: mov %rsi,0x18(%rsp) 0x00007ffff7b12447 <+23>: mov %rdi,0x20(%rsp) 0x00007ffff7b...
