Displaying 20 results from an estimated 400 matches similar to: "[LLVMdev] problem with X86's AVX assembler?"
2014 Jun 26
2
[LLVMdev] problem with X86's AVX assembler?
On Thu, Jun 26, 2014 at 5:47 AM, Adam Nemet <anemet at apple.com> wrote:
> Hi Jun,
>
> On Jun 25, 2014, at 8:14 AM, Jun Koi <junkoi2004 at gmail.com> wrote:
>
> > Hi,
> >
> > I am trying to assemble the instruction below with the latest LLVM code, but it
> > fails. Am I doing something wrong, or is this a bug?
> >
> >
> > $ echo "vaddps zmm7
2014 Jun 26
2
[LLVMdev] problem with X86's AVX assembler?
On Thu, Jun 26, 2014 at 10:23 AM, Adam Nemet <anemet at apple.com> wrote:
>
>
> On Jun 25, 2014, at 7:05 PM, Jun Koi <junkoi2004 at gmail.com> wrote:
>
>
>
>
> On Thu, Jun 26, 2014 at 5:47 AM, Adam Nemet <anemet at apple.com> wrote:
>
>> Hi Jun,
>>
>> On Jun 25, 2014, at 8:14 AM, Jun Koi <junkoi2004 at gmail.com> wrote:
>>
2017 Jul 01
2
KNL Assembly Code for Matrix Multiplication
Thank you.
It means that after
vmovdqa64 zmm22, zmmword ptr [rip + .LCPI0_0] # zmm22 = [8,9,10,11,12,13,14,15]
zmm22 contains 64-bit constant values that are the indexes themselves
(zmm22 = 8, 9, 10, 11, 12, 13, 14, 15), not the values loaded from
those locations, and zmm2 contains the constant 4000. So
vpmuludq zmm14, zmm10, zmm2
multiplies the index values by 4000, because the stride for array b is 4000.
zmm14=
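The index-times-stride arithmetic described in this message can be sketched in Python. This is only an illustrative model, not the code from the thread: the helper name `vpmuludq`, the variable names, and the choice of 8..15 as input indexes are assumptions made for the example. The one fact relied on is that vpmuludq multiplies the unsigned low 32 bits of each 64-bit lane and keeps the full 64-bit product.

```python
MASK32 = 0xFFFFFFFF  # vpmuludq only reads the low 32 bits of each 64-bit lane


def vpmuludq(a, b):
    """Model vpmuludq lane by lane (hypothetical helper, not an LLVM API).

    Each lane multiplies the unsigned low 32 bits of the two inputs and
    keeps the full 64-bit product.
    """
    return [(x & MASK32) * (y & MASK32) for x, y in zip(a, b)]


# Assumed inputs: eight index lanes and a broadcast stride of 4000.
indices = list(range(8, 16))  # 8, 9, ..., 15
stride = [4000] * 8           # the constant described as living in zmm2

# Element offsets into an array whose stride is 4000.
offsets = vpmuludq(indices, stride)
print(offsets)
```

Because only the low 32 bits of each lane take part, an index or stride that does not fit in 32 bits would be silently truncated in this model, as it would be by the instruction.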
2019 Sep 02
3
AVX2 codegen - question reg. FMA generation
Hello,
On the appended, reasonably simple test case that has an fmul/fadd
sequence on <8 x float> vector types, I don't see the x86-64 code
generator (with the CPU set to haswell or later) turning it into
AVX2 FMA instructions. Here's the snippet of the output it generates:
$ llc -O3 -mcpu=skylake
---------------------
.LBB0_2: # =>This Inner
2012 May 24
4
[LLVMdev] use AVX automatically if present
I wonder why AVX is not used automatically when it is available on the host
machine. In contrast, SSE41 instructions (like pmulld) are
used automatically if the host machine supports SSE41.
E.g.
$ cat avx.ll
define void @_fun1(<8 x float>*, <8 x float>*) {
_L1:
%x = load <8 x float>* %0
%y = load <8 x float>* %1
%z = fadd <8 x float> %x, %y
store
2012 May 24
0
[LLVMdev] use AVX automatically if present
On Thu, 24 May 2012, Pan, Wei wrote:
> Very likely AVX is not enabled in your llc. This feature was enabled
> only recently (in late April).
I forgot to mention that I am using recent LLVM-3.1 and in principle my
llc knows about avx as I have shown in the second example. But avx does
not seem to be used by default.
On Thu, 24 May 2012, Henning Thielemann wrote:
> $ llc -o - -mattr
2012 Jan 10
0
[LLVMdev] Calling conventions for YMM registers on AVX
This is the wrong code:
declare <16 x float> @foo(<16 x float>)
define <16 x float> @test(<16 x float> %x, <16 x float> %y) nounwind {
entry:
%x1 = fadd <16 x float> %x, %y
%call = call <16 x float> @foo(<16 x float> %x1) nounwind
%y1 = fsub <16 x float> %call, %y
ret <16 x float> %y1
}
./llc -mattr=+avx
2005 Feb 23
2
Creating extension groups
Hi
I want to create 2 groups of extensions. For example, group 1 can't make outgoing calls; they can only call other extensions in their own group and extensions of group 2. Group 2 can call any of the extensions, and they can also make outgoing calls using our SIP server.
Please let me know how to do this. I was going through the docs and I saw that I have to specify a group in zapata.conf; this is not clear, please
2017 Jun 06
3
Tables in R
Hi doblett,
I think the stargazer package can help you:
https://cran.r-project.org/web/packages/stargazer/index.html
http://jakeruss.com/cheatsheets/stargazer.html
Regards.
On 6 June 2017 at 15:17, Álvaro Hernández <alvarohv at um.es> wrote:
> Hi doblett,
>
> I normally use the 'tables' package, which has
> quite good documentation.
2009 Jun 12
1
Can't get F77_CALL(dgemm) to work [SEC=UNCLASSIFIED]
Hi
I am new to writing C code and am trying to write an R extension in C. I
have hit a wall with F77_CALL(dgemm) in that it produces wrong results.
The code below is a simplified example that multiplies the matrices Am and
Bm to give Cm. The results below clearly show that Cm is wrong.
Am=
1 2 3 4
5 6 7 8
9 10 11 12
13 14 15 16
17 18 19 20
Bm=
1 1 1
1 1
2017 Jan 13
3
input in markdown
Hi list,
A quick question (I hope).
Is it possible to do an "include" or an "input" in markdown?
Without using Rmarkdown or knitr, just pure markdown, so that, for
example, GitHub or a basic markdown viewer can render it.
The idea is for one file to show (contain) everything while keeping the
content spread across several files.
Thanks.
--
Antonio Maurandi López
Sec. Apoyo
2009 Dec 17
1
[LLVMdev] Merging AVX
I'd like to start moving some of our AVX work into trunk.
We've got quite a bit of it implemented already. I wanted to make sure
we got something that would work and remain relatively stable.
Here's how I'd like to do this. First, I have some more TableGen fixes
and enhancements that are prereqs for AVX templates. I'd like to get
those in first. Then there are a number of
2015 Jul 01
3
[LLVMdev] SLP vectorizer on AVX feature
Hi Frank,
What does --debug-only=vectorize say?
You may try to get the datalayout and the triple on the IR header,
just to make sure you got everything right. LLVM will honour those,
and front-ends should create them correctly.
--renato
On 1 July 2015 at 19:06, Frank Winter <fwinter at jlab.org> wrote:
> I realized that the function parameters had no alignment attributes on them.
2013 Dec 12
0
[LLVMdev] AVX code gen
It probably does not pick the right processor architecture.
You could try “clang -mavx” or “clang -march=corei7-avx” for Ivy Bridge and “clang -march=core-avx2” or “clang -mavx2” for Haswell.
$ clang -march=core-avx2 -O3 -S -o - test.c
.section __TEXT,__text,regular,pure_instructions
.globl _f
.align 4, 0x90
_f: ## @f
2015 Jul 01
3
[LLVMdev] SLP vectorizer on AVX feature
Frank,
It sounds like the SLP vectorizer thinks that it is more profitable to use 128-bit-wide operations (because 256-bit operations are double pumped on Sandy Bridge). Did you see a different result on Haswell?
Thanks,
Nadav
> On Jul 1, 2015, at 11:06 AM, Frank Winter <fwinter at jlab.org> wrote:
>
> I realized that the function parameters had no alignment attributes on them.
2016 Oct 26
2
remove text from a plot
Hi everyone,
I'm sending you a question that I think is simple, but that I'm finding impossible to solve. If you run the following code, you will get the plot I have attached:
library(ltm)
modelo <- rasch(LSAT)
plot(modelo, main="Curva probabilidad pregunta 1",legend = TRUE, cx = "bottomright", items=1,xlab="Conocimiento",ylab="Probabilidad")
It turns out
2017 Aug 06
2
VBROADCAST Implementation Issues
I want to implement gather for v64i32. I wrote the following code.
def GATHER_256B : I<0x68, MRMSrcMem, (outs VR_2048:$dst), (ins
i2048mem:$src),
"GATHER_256B\t{$src, $dst|$dst, $src}",
[(set VR_2048:$dst, (v64i32 (masked_gather
addr:$src)))],
IIC_MOV_MEM>, TA;
def: Pat<(v64f32 (masked_gather addr:$src)), (GATHER_256B
2013 Oct 02
0
Fwd: Seminar on estimation for small areas. "Small Area estimation"
Dear colleagues,
I am writing to announce to the list that next Tuesday (8
October) at *13:00*, at the *Instituto de Ciencias
Matemáticas del CSIC - ICMAT (Universidad Autónoma, Cantoblanco, see
location <http://www.icmat.es/facilities/howtoarrive>) (Aula Gris 1),
Madrid*, there will be a seminar on "Estimación para áreas pequeñas" ("Small
2017 Aug 07
2
VBROADCAST Implementation Issues
Hello,
I did as you said,
Please tell me whether the following is correct now.
def GATHER_256B : I<0x68, MRMSrcMem, (outs VR_2048:$dst, _.KRCWM:$mask_wb),
(VR_2048:$src1, _.KRCWM:$mask, ins i2048mem:$src2),
"GATHER_256B\t{$src2, {$dst}{${mask}}|${dst} {${mask}},
$src2}"),
[(set VR_2048:$dst, _.KRCWM:$mask_wb, (v64i32
(GatherNode
2017 Mar 07
0
Potential clue for Bug 16975 - lme fixed sigma - inconsistent REML estimation
Dear list,
I was trying to create a VarClass for nlme to work with Fay-Herriot
(FH) models. The idea was to create a modification of VarComb that
sums the variance functions instead of multiplying them (I called
it varSum). After some failed attempts, I found that I was not getting
the expected results because I needed to make sigma fixed. Trying to
find out how to make sigma fixed, I ran into