similar to: [LLVMdev] Wrong encoding of movd on x64

Displaying 20 results from an estimated 1100 matches similar to: "[LLVMdev] Wrong encoding of movd on x64"

2009 Jul 09
0
[LLVMdev] Wrong encoding of movd on x64
On Thu, Jul 9, 2009 at 8:44 AM, Nicolas Capens<nicolas at capens.net> wrote: > I believe I’ve found a bug in the encoding of the movd instruction on x64. > Here’s some IR code to reproduce it: [snip > Note the last movq. What was probably intended to be generated was “movd > ecx, mm0”. LLVM mistakenly sets the ‘wide’ bit of the REX prefix to 1, > turning movd into movq. Also,
2008 Jul 31
2
[LLVMdev] Generating movq2dq using IRBuilder
Hi all, How do I generate the movq2dq SSE2 instruction using the IRBuilder? There is no zext from 64-bit to 128-bit (corresponding to MMX to XMM register transfer) as far as I can tell. So I've tried inserting an i64 into a v2i64, which generates valid code but rather a number of stores and loads on the stack instead of a single movq2dq. Looking though the code, I found a pattern for
2011 Oct 26
0
[LLVMdev] Lowering to MMX
On Oct 26, 2011, at 1:18 PM, Nicolas Capens wrote: > On 24/10/2011 9:50 PM, Bill Wendling wrote: >> On Oct 20, 2011, at 8:42 AM, Nicolas Capens wrote: >> >>> Hi all, >>> >>> I'm working on a graphics project which uses LLVM for dynamic code >>> generation, and I noticed a major performance regression when upgrading >>> from LLVM
2011 Oct 26
2
[LLVMdev] Lowering to MMX
Hi Bill, Comments inline: On 24/10/2011 9:50 PM, Bill Wendling wrote: > On Oct 20, 2011, at 8:42 AM, Nicolas Capens wrote: > >> Hi all, >> >> I'm working on a graphics project which uses LLVM for dynamic code >> generation, and I noticed a major performance regression when upgrading >> from LLVM 2.8 to 3.0-rc1 (LLVM 2.9 didn't support Win64 so I
2010 Aug 31
0
[LLVMdev] "equivalent" .ll files diverge after optimizations are applied
Using MM registers is wrong unless the user has specifically asked for it, which doesn't seem to be the case here. In the awesome MMX architecture, touching an MM register makes subsequent x87 operations fail unless an EMMS instruction is issued first; none of the compilers here are smart enough to insert EMMS instructions in the right places, so the only safe thing is not to use
2010 Aug 31
2
[LLVMdev] "equivalent" .ll files diverge after optimizations are applied
Here's the optimized versions: $ opt -std-compile-opts unopt-pass.ll -o - | llvm-dis -o - [...] define %3 @_ZN7WebCore15GraphicsContext19roundToDevicePixelsERKNS_9FloatRectE(%"class.WebCore::GraphicsContext"* %this, %"struct.WebCore::FloatRect"* %rect) nounwind ssp align 2 { %roundedOrigin = alloca %"class.WebCore::FloatSize", align 4 ;
2010 Aug 31
5
[LLVMdev] "equivalent" .ll files diverge after optimizations are applied
Hi, I've attached 2 .ll files which are supposed to be equivalent but 'unopt-fail.ll' causes a crash in webkit's test suite while 'unopt-pass.ll' does not. I can't give more details about the crash, when I run the crashing test it in isolation it passes, when I run the full suite it crashes; it boggles the mind. Below I provide the optimized asm that is produced from
2007 Dec 12
2
[LLVMdev] Bogus X86-64 Patterns
Tracking down a problem with one of our benchmark codes, we've discovered that some of the patterns in X86InstrX86-64.td are wrong. Specifically: def MOV64toPQIrm : RPDI<0x6E, MRMSrcMem, (outs VR128:$dst), (ins i64mem:$src), "mov{d|q}\t{$src, $dst|$dst, $src}", [(set VR128:$dst, (v2i64 (scalar_to_vector
2017 Jan 10
3
Problems with bind9_dlz when rndc is reloaded
Hi guys, I'm facing a problems with samba4 + bind9_dlz that consuming my time for several days. Everything is working fine until samba4 need to update dns when I'm work with more than one DC server. When samba (or bind) need to reload all zones, the module bind9_dlz is shutting down and then all my environment stops and I need to restart the bind to up again. See my log: ... Jan
2017 Jan 12
2
Problems with bind9_dlz when rndc is reloaded
Mathias, Thanks for your reply. Please, try to start your bind with some debug level and run commando "rndc reload" and see the end of the log. I saw samba source code and found the destroy dns function in dlz_bind9.c and called by turture blz_bind9.c. When dlz_bind9.c is shutting down, I get this error when I try to update dns. update failed: NOTAUTH Failed nsupdate: 2
2008 Jul 31
0
[LLVMdev] Generating movq2dq using IRBuilder
In the same breath I'd also like to kindly ask if someone could have a look at the reverse operations, namely trunk from 128 to 64 bit using movdq2q, and 128 to 32 and 64 to 32 using movd. This also seems related to Bug 2585. Thanks again. From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of Nicolas Capens Sent: Thursday, 31 July, 2008 16:03 To:
2017 Jan 12
2
Problems with bind9_dlz when rndc is reloaded
Using your log parameters, the shutting down message is not showed, but when I reload rndc a get the same effect. Everything is working fine until bond9_dlz needs to reload (and no restart) rndc. When this happens, I need to restart bind and everything works fine again. I'm starting named with named -d 3 -u named and using /var/log/messages. See log using your parameters: # rndc reload
2004 Sep 10
2
An assembly optimization and fix
I have optimized FLAC__fixed_compute_best_predictor_asm_ia32_mmx_cmov function and fixed bug when data_len == 0. Now the function is about 50% faster and flac -5 is about 5% faster on my box. I have tested it thoroughly, I think it can go to flac 1.0.4. -- Miroslav Lichvar -------------- next part -------------- --- src/libFLAC/ia32/fixed_asm.nasm.orig 2002-01-26 19:05:12.000000000 +0100 +++
2010 Aug 31
0
[LLVMdev] "equivalent" .ll files diverge after optimizations are applied
On Aug 31, 2010, at 1:21 PMPDT, Argyrios Kyrtzidis wrote: > > Just to be clear, are you saying that the fact that, after using llc > on the second IR, the produced asm is using MM registers, indicates > a bug ? Yes. It's not immediately obvious whether it's in the opt or llc, though. Chris was doing work involving <2 x float> and may know about this. >
2005 Aug 17
2
MMX loop filter for theora-exp
Hello, I would like to announce the semi-optimized oc_state_loop_filter_frag_rows It gains like 7% speedup. Unfortunately it has some issues: 1) wont compile on 64bit (I will fix it later hopefully) 2) is not yet fully optimized (instruction stalls) Here are the results. CPU: Athlon, speed 1466.91 MHz (estimated) Counted CPU_CLK_UNHALTED events (Cycles outside of halt state) with a unit mask
2011 Oct 25
0
[LLVMdev] Lowering to MMX
On Oct 20, 2011, at 8:42 AM, Nicolas Capens wrote: > Hi all, > > I'm working on a graphics project which uses LLVM for dynamic code > generation, and I noticed a major performance regression when upgrading > from LLVM 2.8 to 3.0-rc1 (LLVM 2.9 didn't support Win64 so I skipped it > entirely). > > I found out that the performance regression is due to removing
2004 Aug 24
5
MMX/mmxext optimisations
quite some speed improvement indeed. attached the updated patch to apply to svn/trunk. j -------------- next part -------------- A non-text attachment was scrubbed... Name: theora-mmx.patch.gz Type: application/x-gzip Size: 8648 bytes Desc: not available Url : http://lists.xiph.org/pipermail/theora-dev/attachments/20040824/5a5f2731/theora-mmx.patch-0001.bin
2009 Aug 30
3
experimental patch for libtheora1.1beta3
Good morning in the Lord Regarding the port of libtheora1.1beta3 for OpenBSD for amd64 and the problem I described at: http://lists.xiph.org/pipermail/theora/2009-August/002640.html Attached is a patch for libtheora/patches/patch-lib_x86_mmxencfrag_c I can play videos with it. ?Does it work for you? Best regards -- Dios, gracias por tu amor infinito.
2014 Apr 16
2
[LLVMdev] X86 mmx movq disassembler fail
0x0f 0x6f 0xc8 And 0x0f 0x7f 0xc1 Should both be movq % mm0, % mm1. (AT&T) However, llvm 3.4 at least does not recognise the second variant as being a valid instruction. We are currently compiling up latest src incase it has been fixed. If not, could someone take a look or recommend how to fix? Lee -------------- next part -------------- An HTML attachment was scrubbed... URL:
2017 Dec 11
2
New x86 instruction with opcode 0x0F 0x7A
Hi all, I'm trying to simulate an extended x86 architecture on gem5 with several new instructions. My hardware setup is done and now I'd like llvm to accept the existence of the new instruction passed in inline assembly and output the correct opcode and registers. I chose the two-byte opcode 0x0F 0x7A and I would like the instruction to have the same operands and return values as CVTPS2PI