Displaying 20 results from an estimated 5000 matches similar to: "[LLVMdev] Intrinsics and dead instruction/code elimination"
2010 May 19
0
[LLVMdev] Intrinsics and dead instruction/code elimination
On May 19, 2010, at 7:07 AM, o.j.sivart at gmail.com wrote:
> Hi all,
>
> I'm interested in the impact of representing code via intrinsic functions, in contrast to via an instruction, when it comes to performing dead instruction/code elimination. As a concrete example, let's consider the simple case of the llvm.*.with.overflow.* intrinsics.
>
> If I have some sequence (> 1)
2010 May 19
2
[LLVMdev] Intrinsics and dead instruction/code elimination
On 20/05/2010, at 3:01 AM, Chris Lattner wrote:
>
> On May 19, 2010, at 7:07 AM, o.j.sivart at gmail.com wrote:
>
>> Hi all,
>>
>> I'm interested in the impact of representing code via intrinsic functions, in contrast to via an instruction, when it comes to performing dead instruction/code elimination. As a concrete example, let's consider the simple case of the
2010 May 19
0
[LLVMdev] Intrinsics and dead instruction/code elimination
On May 19, 2010, at 3:13 PM, o.j.sivart at gmail.com wrote:
>>
>> Intrinsics should be optimized as well as instructions. In this specific case, these intrinsics should be marked readnone, which means that load/store optimization will ignore them. Dead code elimination will delete the intrinsic if it is dead etc.
>
> I understand that dead code elimination is able to delete
2010 May 19
1
[LLVMdev] Intrinsics and dead instruction/code elimination
On 20/05/2010, at 8:16 AM, Chris Lattner wrote:
>
> On May 19, 2010, at 3:13 PM, o.j.sivart at gmail.com wrote:
>
>>>
>>> Intrinsics should be optimized as well as instructions. In this specific case, these intrinsics should be marked readnone, which means that load/store optimization will ignore them. Dead code elimination will delete the intrinsic if it is dead
2007 Sep 01
2
Comparing "transform" to "with"
Hi All,
I've been successfully using the with function for analyses and the
transform function for multiple transformations. Then I thought, why not
use "with" for both? I ran into problems & couldn't figure them out from
help files or books. So I created a simplified version of what I'm
doing:
rm( list=ls() )
x1<-c(1,3,3)
x2<-c(3,2,1)
x3<-c(2,5,2)
2008 Apr 22
4
how to convert non numeric data into numeric?
I am having the following error in my function
function(theta,reqdIRR)
{
theta1<-theta[1]
theta2<-theta[2]
n<-length(reqdIRR)
constant<- n*(theta1+theta2)
sum1<-lapply(reqdIRR*exp(theta1),FUN = sum)
sum2<-lapply(exp(theta2 - reqdIRR*exp(theta1)),FUN = sum)
sum = sum1 + sum2
log.fcn = constant - as.numeric(sum)
result = - log.fcn
return(result)
}
*error :
2007 Feb 01
3
Help with efficient double sum of max (X_i, Y_i) (X & Y vectors)
Greetings.
For R gurus this may be a no brainer, but I could not find pointers to
efficient computation of this beast in past help files.
Background - I wish to implement a Cramer-von Mises type test statistic
which involves double sums of max(X_i,Y_j) where X and Y are vectors of
differing length.
I am currently using ifelse pointwise in a vector, but have a nagging
suspicion that there is a
2007 Apr 03
1
Speex ARM4 patch
The attached patch eliminates some warnings while compiling for ARM4
targets. It also simplifies the asm constraints a bit. Now we can use
the ARM4 optimisations when compiling for PortalPlayer targets in Rockbox.
Cheers,
Dan
2018 Jul 10
9
[PATCH 0/7] PowerPC64 performance improvements
The following series adds initial vector support for PowerPC64.
On POWER9, flac --best is about 3.3x faster.
Amitay Isaacs (2):
Add m4 macro to check for C __attribute__ features
Check if compiler supports target attribute on ppc64
Anton Blanchard (5):
configure.ac: Remove SPE detection code
configure.ac: Add VSX enable/disable
configure.ac: Fix FLAC__CPU_PPC on little endian, and add
2013 Mar 25
2
Faster way of summing values up based on expand.grid
Hello!
# I have 3 vectors of values:
values1<-rnorm(10)
values2<-rnorm(10)
values3<-rnorm(10)
# In real life, all 3 vectors have a length of 25
# I create all possible combinations of 4 based on 10 elements:
mycombos<-expand.grid(1:10,1:10,1:10,1:10)
dim(mycombos)
# Removing rows that contain pairs of identical values in any 2 of these columns:
mycombos<-mycombos[!(mycombos$Var1
2008 Nov 26
1
SSE2 code won't compile in VC
Jean-Marc,
At least VS2005 (what I'm using) won't compile resample_sse.h with
_USE_SSE2 defined because it refuses to cast __m128 to __m128d and vice
versa. While there are intrinsics to do the casts, I thought it would be
simpler to just use an intrinsic that accomplishes the same thing
without all the casting. Thanks,
--John
@@ -91,7 +91,7 @@ static inline double
2003 Oct 05
2
Possible security hole
Maybe security related mails should be sent elsewhere? I didn't notice
any so here it goes:
sender.c:receive_sums()
s->count = read_int(f);
..
s->sums = (struct sum_buf *)malloc(sizeof(s->sums[0])*s->count);
if (!s->sums) out_of_memory("receive_sums");
for (i=0; i < (int) s->count;i++) {
s->sums[i].sum1 = read_int(f);
2008 Apr 04
1
Resampler experimental speedups
Hello :)
The attached patch (which is not in any way finished) optimizes the
resampler. (For those following the discussions on IRC; this version
includes optimizations for both direct and interpolate cases).
Using GCC 4.3, x86_64, Valgrind to measure instruction counts,
resampling 10 frames of 320 floats at quality 3. Direct was measured
with a 16=>48 resampling, and interpolate with a
2003 Mar 30
1
[RFC][patch] dynamic rolling block and sum sizes II
Mark II of the patch set.
The first patch (dynsumlen2.patch) increments the protocol
version to support per-file dynamic block checksum sizes.
It is a prerequisite for varsumlen2.patch.
varsumlen2.patch implements per-file dynamic block and checksum
sizes.
The current block size calculation only applies to files
between 7MB and 160MB setting the block size to 1/10,000 of
the file length for a
2012 Jun 04
1
simulation of modified bartlett's test
Hi, I ran this code to get the power of the modified Bartlett's test,
but I'm not really sure that my code is right.
#normal distribution unequal variance
asim<-5000
pv<-rep(NA,asim)
for(i in 1:asim)
{print(i)
set.seed(i)
n1<-20
n2<-20
n3<-20
mu<-0
sd1<-sqrt(25)
sd2<-sqrt(50)
sd3<-sqrt(100)
g1<-rnorm(n1,mu,sd1)
g2<-rnorm(n2,mu,sd2)
2015 Mar 02
13
Patch cleaning up Opus x86 intrinsics configury
The attached patch cleans up Opus's x86 intrinsics configury.
It:
* Makes --enable-intrinsics work with clang and other non-GCC compilers
* Enables RTCD for the floating-point-mode SSE code in Celt.
* Disables use of RTCD in cases where the compiler targets an instruction set by default.
* Enables the SSE4.1 Silk optimizations that apply to the common parts of Silk when Opus is built in
2015 Mar 13
1
[RFC PATCH v3] Intrinsics/RTCD related fixes. Mostly x86.
From: Jonathan Lennox <jonathan at vidyo.com>
* Makes --enable-intrinsics work with clang and other non-GCC compilers
* Enables RTCD for the floating-point-mode SSE code in Celt.
* Disables use of RTCD in cases where the compiler targets an instruction set by default.
* Enables the SSE4.1 Silk optimizations that apply to the common parts of Silk when Opus is built in floating-point mode, not
2015 Mar 12
1
[RFC PATCHv2] Intrinsics/RTCD related fixes. Mostly x86.
From: Jonathan Lennox <jonathan at vidyo.com>
* Makes --enable-intrinsics work with clang and other non-GCC compilers
* Enables RTCD for the floating-point-mode SSE code in Celt.
* Disables use of RTCD in cases where the compiler targets an instruction set by default.
* Enables the SSE4.1 Silk optimizations that apply to the common parts of Silk when Opus is built in floating-point mode, not
2009 Oct 26
1
[PATCH] Fix miscompile of SSE resampler
From: Thorvald Natvig <slicer at users.sourceforge.net>
Some optimizing compilers miscompile the current SSE optimizations when
full optimizations are enabled. By using an output value pointer instead of
a return value, we can bypass this misbehaviour.
---
libspeex/resample.c | 8 ++++----
libspeex/resample_sse.h | 24 ++++++++----------------
2 files changed, 12 insertions(+), 20
2004 Jan 11
1
comparing 2 in rsync
Hi,
Is there a sure way to test with rsync whether 2 CDs are true copies?
I have tried this...
rsync -avv /mnt/cdrom/ /mnt/cdrom2
**
result....quite shortened...
misc/rpm2header is uptodate
pkg-9.2-FiveStar-download-i586.idx is uptodate
total: matches=0 tag_hits=0 false_alarms=0 data=0
wrote 65301 bytes read 20 bytes 1789.62 bytes/sec
total size is 679261427 speedup is 10398.82
**
Is