Displaying 20 results from an estimated 200 matches similar to: "[LLVMdev] IndVar widening in IndVarSimplify causing performance regression on GPU programs"
2017 Jun 13
1
[Mesa-dev] [RFC 0/9] Add precise/invariant semantics to TGSI
Am 13.06.2017 um 02:05 schrieb Ilia Mirkin:
> On Mon, Jun 12, 2017 at 7:57 PM, Roland Scheidegger <sroland at vmware.com> wrote:
>> FWIW surely on nv50 you could keep a single mad instruction for umad
>> (sad maybe too?). (I'm actually wondering if the hw really can't do
>> unfused float multiply+add as a single instruction but I know next to
>> nothing
2017 Jun 12
3
[Mesa-dev] [RFC 0/9] Add precise/invariant semantics to TGSI
This looks like the right idea to me too. It may sound a bit weird to do
that per instruction, but d3d11 does that as well. (Some d3d versions
just have a global flag basically forbidding or allowing any such fast
math optimizations in the assembly, but I'm not actually sure everybody
honors that without tesselation...)
For 1/9:
Reviewed-by: Roland Scheidegger <sroland at vmware.com>
2000 Apr 04
0
stochastic process transition probabilities estimation
Hi all,
I'm new with R (and S), and relatively new to statistics (I'm a
computer scientist), so I ask sorry in advance if my question is silly.
My problem is this: I have a (sample of a) discrete time stochastic
process {X_t} and I want to estimate
Pr{ X_t | X_{t-l_1}, X_{t-l_2}, ..., X_{t-l_k} }
where l_1, l_2, ..., l_k are some fixed time lags. It will be enough for
me to compute
2013 Feb 20
3
[LLVMdev] Is va_arg correct on Mips backend?
I didn't have Mips board. I compile as the commands and check the asm output as below.
1. Question:
The distance of caller arg[4] and arg[5] is 4 bytes. But the the callee get every
arg[] by 8 bytes offset (arg_ptr1+8 or arg_ptr2+8). I assume the #BB#4 and #BB#5 are the arg_ptr which is the pointer to access the stack arguments.
2. Question:
Stack memory 28($sp) has no initial value. If
2013 Feb 20
0
[LLVMdev] Is va_arg correct on Mips backend?
Does it make a difference if you give the "-target" option to clang?
$ clang -target mips-linux-gnu ch8_3.cpp -o ch8_3.bc -emit-llvm -c
The .s file generated this way looks quite different from the one in your
email.
On Tue, Feb 19, 2013 at 5:06 PM, Jonathan <gamma_chen at yahoo.com.tw> wrote:
> I didn't have Mips board. I compile as the commands and check the asm
>
2013 Feb 19
0
[LLVMdev] Is va_arg correct on Mips backend?
Which part of the generated code do you think is not correct? Could you be
more specific?
I compiled this program with clang and ran it on a mips board. It returns
the expected result (21).
On Tue, Feb 19, 2013 at 4:15 AM, Jonathan <gamma_chen at yahoo.com.tw> wrote:
> I check the Mips backend for the following C code fragment compile result.
> It seems not correct. Is it my
2013 Feb 19
2
[LLVMdev] Is va_arg correct on Mips backend?
I check the Mips backend for the following C code fragment compile result. It seems not correct. Is it my misunderstand or it's a bug.
//ch8_3.cpp
#include <stdarg.h>
int sum_i(int amount, ...)
{
int i = 0;
int val = 0;
int sum = 0;
va_list vl;
va_start(vl, amount);
for (i = 0; i < amount; i++)
{
val = va_arg(vl, int);
sum += val;
}
va_end(vl);
2018 Nov 06
4
Rather poor code optimisation of current clang/LLVM targeting Intel x86 (both -64 and -32)
Hi @ll,
while clang/LLVM recognizes common bit-twiddling idioms/expressions
like
unsigned int rotate(unsigned int x, unsigned int n)
{
return (x << n) | (x >> (32 - n));
}
and typically generates "rotate" machine instructions for this
expression, it fails to recognize other also common bit-twiddling
idioms/expressions.
The standard IEEE CRC-32 for "big
2008 Oct 30
0
[LLVMdev] Using patterns inside patterns
I am not sure what you are looking to do. Please provide a mark up
example.
Evan
On Oct 28, 2008, at 11:00 AM, Villmow, Micah wrote:
> Is there currently a way to use a pattern inside of another pattern?
>
> Micah Villmow
> Systems Engineer
> Advanced Technology & Performance
> Advanced Micro Devices Inc.
> 4555 Great America Pkwy,
> Santa Clara, CA. 95054
> P:
2014 Sep 02
3
[LLVMdev] LICM promoting memory to scalar
All,
If we can speculatively execute a load instruction, why isn’t it safe to hoist it out by promoting it to a scalar in LICM pass?
There is a comment in LICM pass that if a load/store is conditional then it is not safe because it would break the LLVM concurrency model (See commit 73bfa4a).
It has an IR test for checking this in test/Transforms/LICM/scalar-promote-memmodel.ll
However, I have
2010 Oct 01
1
Question about Reduce
Hello!
In the example below the Reduce() function by default assigns "a" to
be the last accumulated value and "b" to be the current value in "x".
I could not find this documented anywhere as the default settings for
the Reduce() function. Does any sort of documentation for this
behavior exist?
x <- c(0,0,0,0,0,1,0,0,0,5,0,0,0,7,0,0,0,8,5,10)
2013 Oct 03
1
[LLVMdev] Help with a Microblaze code generation problem.
Sorry if this is a duplicate: I tried to send it last night and it
didn't go through. I'm trimming some text to see if it helps.
I have a simple program that fails on the Microblaze:
int main()
{
unsigned long long x, y;
x = 100;
y = 0x8000000000000000ULL;
return !(x > y);
}
As you can see, the test case compares two unsigned long long values. To
try to track
2017 Jul 01
2
KNL Assembly Code for Matrix Multiplication
Thank You,
It means vmovdqa64 zmm22, zmmword ptr [rip + .LCPI0_0] # zmm22 =
[8,9,10,11,12,13,14,15] zmm22 will contain 64 bit constant values which are
indexes here zmm22=8, 9, 10, 11, 12,13,14,15. not the values loaded from
these locations. and zmm2 contains constant 4000. so,
vpmuludq zmm14, zmm10, zmm2 ; will multiply the indexes values with 4000,
as for array b the stride is 4000.
zmm14=
2016 Aug 13
2
A "hello world" coverage sanitizer
Thank you, kcc. I am unsure if I misunderstand your reply. It seems that
trace-bb, rather than trace-pc, fits better for my problem, given that my
instrumentation is to put before each conditional statement. Do I
misunderstand something here?
"
Tracing basic blocks
<http://clang.llvm.org/docs/SanitizerCoverage.html#id11>
With -fsanitize-coverage=trace-bb the compiler will insert
2009 Feb 16
2
[LLVMdev] Modeling GPU vector registers, again (with my implementation)
Evan Cheng-2 wrote:
>
> Well, how many possible permutations are there? Is it possible to
> model each case as a separate physical register?
>
> Evan
>
I don't think so. There are 4x4x4x4 = 256 permutations. For example:
* xyzw: default
* zxyw
* yyyy: splat
Even if can model each of these 256 cases as a separate physical register,
how can I model the use of r0.xyzw in
2014 Sep 02
2
[LLVMdev] LICM promoting memory to scalar
I think gcc is right.
It inserted a branch for n == 0 (the cbz at the top), so that's not a problem.
In all other regards, this is safe: if you examine the sequence of loads and stores, it eliminated all but the first load and all but the last store. How's that unsafe?
If I had to guess, the bug here is that LLVM doesn't want to hoist the load over the condition (which it is right
2018 Nov 27
2
Rather poor code optimisation of current clang/LLVM targeting Intel x86 (both -64 and -32)
"Sanjay Patel" <spatel at rotateright.com> wrote:
> IIUC, you want to use x86-specific bit-hacks (sbb masking) in cases like
> this:
> unsigned int foo(unsigned int crc) {
> if (crc & 0x80000000)
> crc <<= 1, crc ^= 0xEDB88320;
> else
> crc <<= 1;
> return crc;
> }
To document this for x86 too: rewrite the function
2008 Oct 28
4
[LLVMdev] Using patterns inside patterns
Is there currently a way to use a pattern inside of another pattern?
Micah Villmow
Systems Engineer
Advanced Technology & Performance
Advanced Micro Devices Inc.
4555 Great America Pkwy,
Santa Clara, CA. 95054
P: 408-572-6219
F: 408-572-6596
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
2001 Jul 26
1
ext3fs for kernel 2.4.2?
Hello,
does anybody know if there is an ext3fs patch for kernel 2.4.2-2 that is
shipped with redhat 7.1?
I was abel to fins patches only for 2.4.6 and 2.4.7.
Please try to help ASAP.
Best regards,
Imad Ossaily.
2011 Nov 18
2
Mail_quota plugin and LDAP on Dovecot 1.2
Hi, I'm new in this List, but I have 6 years using
Dovecot on my debian from etch,lenny and now squeeze
Package: dovecot-imapd
Version: 1:1.2.15-4
Tags: squeeze
-- System Information:
Debian Release: 6.0
APT prefers stable
APT policy: (500, 'stable')
Architecture: amd64 (x86_64)
Kernel: Linux 2.6.32-5-amd64 (SMP w/24 CPU cores)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8