thr3ads.net - similar to: "[LLVMdev] problem with X86's AVX assembler?"

Displaying 20 results from an estimated 400 matches similar to: "[LLVMdev] problem with X86's AVX assembler?"

[LLVMdev] problem with X86's AVX assembler?

2014 Jun 26

On Thu, Jun 26, 2014 at 5:47 AM, Adam Nemet <anemet at apple.com> wrote: > Hi Jun, > > On Jun 25, 2014, at 8:14 AM, Jun Koi <junkoi2004 at gmail.com> wrote: > > > Hi, > > > > I am trying to assemble below instruction with latest LLVM code, but > fail. Am I doing something wrong, or is this a bug? > > > > > > $ echo "vaddps zmm7

[LLVMdev] problem with X86's AVX assembler?

2014 Jun 26

On Thu, Jun 26, 2014 at 10:23 AM, Adam Nemet <anemet at apple.com> wrote: > > > On Jun 25, 2014, at 7:05 PM, Jun Koi <junkoi2004 at gmail.com> wrote: > > > > > On Thu, Jun 26, 2014 at 5:47 AM, Adam Nemet <anemet at apple.com> wrote: > >> Hi Jun, >> >> On Jun 25, 2014, at 8:14 AM, Jun Koi <junkoi2004 at gmail.com> wrote: >>

KNL Assembly Code for Matrix Multiplication

2017 Jul 01

Thank you. It means vmovdqa64 zmm22, zmmword ptr [rip + .LCPI0_0] # zmm22 = [8,9,10,11,12,13,14,15] loads 64-bit constants into zmm22; these are the index values 8, 9, 10, 11, 12, 13, 14, 15 themselves, not the values loaded from those locations. zmm2 contains the constant 4000, so vpmuludq zmm14, zmm10, zmm2 multiplies the index values by 4000, because the stride for array b is 4000. zmm14=

AVX2 codegen - question reg. FMA generation

2019 Sep 02

Hello, On the appended, reasonably simple test case that has an fmul/fadd sequence on <8 x float> vector types, I don't see the x86-64 code generator (with cpu set to haswell or later types) turning it into AVX2 FMA instructions. Here's the snippet in the output it generates: $ llc -O3 -mcpu=skylake --------------------- .LBB0_2: # =>This Inner

[LLVMdev] use AVX automatically if present

2012 May 24

I wonder why AVX is not used automatically if available at the host machine. In contrast to that, SSE41 instructions (like pmulld) are automatically used if the host machine supports SSE41. E.g. $ cat avx.ll define void @_fun1(<8 x float>*, <8 x float>*) { _L1: %x = load <8 x float>* %0 %y = load <8 x float>* %1 %z = fadd <8 x float> %x, %y store

[LLVMdev] use AVX automatically if present

2012 May 24

On Thu, 24 May 2012, Pan, Wei wrote: > Very likely AVX is not enabled in your llc. This feature was enabled > just recently (in late April). I forgot to mention that I am using a recent LLVM-3.1 and in principle my llc knows about avx, as I have shown in the second example. But avx does not seem to be used by default. On Thu, 24 May 2012, Henning Thielemann wrote: > $ llc -o - -mattr

[LLVMdev] Calling conventions for YMM registers on AVX

2012 Jan 10

This is the wrong code: declare <16 x float> @foo(<16 x float>) define <16 x float> @test(<16 x float> %x, <16 x float> %y) nounwind { entry: %x1 = fadd <16 x float> %x, %y %call = call <16 x float> @foo(<16 x float> %x1) nounwind %y1 = fsub <16 x float> %call, %y ret <16 x float> %y1 } ./llc -mattr=+avx

Creating extension groups

2005 Feb 23

Hi, I want to create 2 groups of extensions. For example, group 1 can't make outgoing calls; they can only call other extensions and extensions of group 2. Group 2 can call any of the extensions, and they can also make outgoing calls using our SIP server. Please let me know how to do this. I was going through the docs and I saw that I have to specify a group in zapta.conf; this is not clear, please

Tables in R

2017 Jun 06

Hi doblett, I think the stargazer package can help you: https://cran.r-project.org/web/packages/stargazer/index.html http://jakeruss.com/cheatsheets/stargazer.html Regards. On June 6, 2017, at 15:17, Álvaro Hernández <alvarohv at um.es> wrote: > Hi, doblett: > > I normally use the 'tables' package, which has > fairly good documentation.

Can't get F77_CALL(dgemm) to work [SEC=UNCLASSIFIED]

2009 Jun 12

Hi, I am new to writing C code and am trying to write an R extension in C. I have hit a wall with F77_CALL(dgemm) in that it produces wrong results. The code below is a simplified example that multiplies the matrices Am and Bm to give Cm. The results below show clearly that Cm is wrong. Am= 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 Bm= 1 1 1 1 1

input in markdown

2017 Jan 13

Hello list: A quick question (I hope). Is it possible to do an "include" or an "input" in markdown, without using Rmarkdown or knitr, just pure markdown, so that, for example, GitHub or a basic markdown viewer can render it? The idea is for one file to show (contain) everything while keeping the content spread across several files. Thanks. -- Antonio Maurandi López Sec. Apoyo

[LLVMdev] Merging AVX

2009 Dec 17

I'd like to start moving some of our AVX work into trunk. We've got quite a bit of it implemented already. I wanted to make sure we got something that would work and remain relatively stable. Here's how I'd like to do this. First, I have some more TableGen fixes and enhancements that are prereqs for AVX templates. I'd like to get those in first. Then there are a number of

[LLVMdev] SLP vectorizer on AVX feature

2015 Jul 01

Hi Frank, What does --debug-only=vectorize say? You may try to get the datalayout and the triple on the IR header, just to make sure you got everything right. LLVM will honour those, and front-ends should create them correctly. --renato On 1 July 2015 at 19:06, Frank Winter <fwinter at jlab.org> wrote: > I realized that the function parameters had no alignment attributes on them.

[LLVMdev] AVX code gen

2013 Dec 12

It probably does not pick the right processor architecture. You could try “clang -mavx” or “clang -march=corei7-avx” for ivy-bridge and “clang -march=core-avx2” or “clang -mavx2” for haswell. $ clang -march=core-avx2 -O3 -S -o - test.c .section __TEXT,__text,regular,pure_instructions .globl _f .align 4, 0x90 _f: ## @f

[LLVMdev] SLP vectorizer on AVX feature

2015 Jul 01

Frank, It sounds like the SLP vectorizer thinks that it is more profitable to use 128bit wide operations (because 256bit operations are double pumped on Sandybridge). Did you see a different result on Haswell? Thanks, Nadav > On Jul 1, 2015, at 11:06 AM, Frank Winter <fwinter at jlab.org> wrote: > > I realized that the function parameters had no alignment attributes on them.

deleting text in a plot

2016 Oct 26

Hello everyone, I am sending you a question that I consider simple but that is proving impossible for me to solve. If you run the following code, you will get the plot I have attached: library(ltm) modelo <- rasch(LSAT) plot(modelo, main="Curva probabilidad pregunta 1",legend = TRUE, cx = "bottomright", items=1,xlab="Conocimiento",ylab="Probabilidad") It turns out

VBROADCAST Implementation Issues

2017 Aug 06

I want to implement gather for v64i32. I wrote the following code. def GATHER_256B : I<0x68, MRMSrcMem, (outs VR_2048:$dst), (ins i2048mem:$src), "GATHER_256B\t{$src, $dst|$dst, $src}", [(set VR_2048:$dst, (v64i32 (masked_gather addr:$src)))], IIC_MOV_MEM>, TA; def: Pat<(v64f32 (masked_gather addr:$src)), (GATHER_256B

Fwd: Seminar on small area estimation. "Small Area estimation"

2013 Oct 02

Dear colleagues, I am writing to announce to the list that next Tuesday (October 8) at *13:00*, at the *Instituto de Ciencias Matemáticas del CSIC-ICMAT (Universidad Autónoma, Cantoblanco, see location <http://www.icmat.es/facilities/howtoarrive>) (Aula Gris 1), Madrid*, there will be a seminar on "Estimacion para areas pequeñas" ( "Small

VBROADCAST Implementation Issues

2017 Aug 07

Hello, I did as you said. Please tell me whether the following is correct now. def GATHER_256B : I<0x68, MRMSrcMem, (outs VR_2048:$dst, _.KRCWM:$mask_wb), (VR_2048:$src1, _.KRCWM:$mask, ins i2048mem:$src2), "GATHER_256B\t{$src2, {$dst}{${mask}}|${dst} {${mask}}, $src2}"), [(set VR_2048:$dst, _.KRCWM:$mask_wb, (v64i32 (GatherNode

Potential clue for Bug 16975 - lme fixed sigma - inconsistent REML estimation

2017 Mar 07

Dear list, I was trying to create a VarClass for nlme to work with Fay-Herriot (FH) models. The idea was to create a modification of VarComb that, instead of multiplying the variance functions, summed them (I called it varSum). After several failed attempts, I found that I was not getting the expected results because I needed to make sigma fixed. Trying to find out how to make sigma fixed, I ran into
