similar to: [LLVMdev] Intrinsics and dead instruction/code elimination

2010 May 19
0
[LLVMdev] Intrinsics and dead instruction/code elimination
On May 19, 2010, at 7:07 AM, o.j.sivart at gmail.com wrote: > Hi all, > > I'm interested in the impact of representing code via intrinsic functions, in contrast to via an instruction, when it comes to performing dead instruction/code elimination. As a concrete example, let's consider the simple case of the llvm.*.with.overflow.* intrinsics. > > If I have some sequence (> 1)
2010 May 19
2
[LLVMdev] Intrinsics and dead instruction/code elimination
On 20/05/2010, at 3:01 AM, Chris Lattner wrote: > > On May 19, 2010, at 7:07 AM, o.j.sivart at gmail.com wrote: > >> Hi all, >> >> I'm interested in the impact of representing code via intrinsic functions, in contrast to via an instruction, when it comes to performing dead instruction/code elimination. As a concrete example, let's consider the simple case of the
2010 May 19
0
[LLVMdev] Intrinsics and dead instruction/code elimination
On May 19, 2010, at 3:13 PM, o.j.sivart at gmail.com wrote: >> >> Intrinsics should be optimized as well as instructions. In this specific case, these intrinsics should be marked readnone, which means that load/store optimization will ignore them. Dead code elimination will delete the intrinsic if it is dead etc. > > I understand that dead code elimination is able to delete
2010 May 19
1
[LLVMdev] Intrinsics and dead instruction/code elimination
On 20/05/2010, at 8:16 AM, Chris Lattner wrote: > > On May 19, 2010, at 3:13 PM, o.j.sivart at gmail.com wrote: > >>> >>> Intrinsics should be optimized as well as instructions. In this specific case, these intrinsics should be marked readnone, which means that load/store optimization will ignore them. Dead code elimination will delete the intrinsic if it is dead
2007 Sep 01
2
Comparing "transform" to "with"
Hi All, I've been successfully using the with function for analyses and the transform function for multiple transformations. Then I thought, why not use "with" for both? I ran into problems & couldn't figure them out from help files or books. So I created a simplified version of what I'm doing: rm( list=ls() ) x1<-c(1,3,3) x2<-c(3,2,1) x3<-c(2,5,2)
2008 Apr 22
4
how to convert non numeric data into numeric?
I am having the following error in my function function(theta,reqdIRR) { theta1<-theta[1] theta2<-theta[2] n<-length(reqdIRR) constant<- n*(theta1+theta2) sum1<-lapply(reqdIRR*exp(theta1),FUN = sum) sum2<-lapply(exp(theta2 - reqdIRR*exp(theta1)),FUN = sum) sum = sum1 + sum2 log.fcn = constant - as.numeric(sum) result = - log.fcn return(result) } *error :
2007 Feb 01
3
Help with efficient double sum of max (X_i, Y_i) (X & Y vectors)
Greetings. For R gurus this may be a no brainer, but I could not find pointers to efficient computation of this beast in past help files. Background - I wish to implement a Cramer-von Mises type test statistic which involves double sums of max(X_i,Y_j) where X and Y are vectors of differing length. I am currently using ifelse pointwise in a vector, but have a nagging suspicion that there is a
2007 Apr 03
1
Speex ARM4 patch
The attached patch eliminates some warnings while compiling for ARM4 targets. It also simplifies the asm constraints a bit. Now we can use the ARM4 optimisations when compiling for PortalPlayer targets in Rockbox. Cheers, Dan -------------- next part -------------- A non-text attachment was scrubbed... Name: speex_arm4.patch Type: text/x-diff Size: 1550 bytes Desc: not available Url :
2018 Jul 10
9
[PATCH 0/7] PowerPC64 performance improvements
The following series adds initial vector support for PowerPC64. On POWER9, flac --best is about 3.3x faster. Amitay Isaacs (2): Add m4 macro to check for C __attribute__ features Check if compiler supports target attribute on ppc64 Anton Blanchard (5): configure.ac: Remove SPE detection code configure.ac: Add VSX enable/disable configure.ac: Fix FLAC__CPU_PPC on little endian, and add
2013 Mar 25
2
Faster way of summing values up based on expand.grid
Hello! # I have 3 vectors of values: values1<-rnorm(10) values2<-rnorm(10) values3<-rnorm(10) # In real life, all 3 vectors have a length of 25 # I create all possible combinations of 4 based on 10 elements: mycombos<-expand.grid(1:10,1:10,1:10,1:10) dim(mycombos) # Removing rows that contain pairs of identical values in any 2 of these columns: mycombos<-mycombos[!(mycombos$Var1
2008 Nov 26
1
SSE2 code won't compile in VC
Jean-Marc, At least VS2005 (what I'm using) won't compile resample_sse.h with _USE_SSE2 defined because it refuses to cast __m128 to __m128d and vice versa. While there are intrinsics to do the casts, I thought it would be simpler to just use an intrinsic that accomplishes the same thing without all the casting. Thanks, --John @@ -91,7 +91,7 @@ static inline double
2003 Oct 05
2
Possible security hole
Maybe security related mails should be sent elsewhere? I didn't notice any so here it goes: sender.c:receive_sums() s->count = read_int(f); .. s->sums = (struct sum_buf *)malloc(sizeof(s->sums[0])*s->count); if (!s->sums) out_of_memory("receive_sums"); for (i=0; i < (int) s->count;i++) { s->sums[i].sum1 = read_int(f);
2008 Apr 04
1
Resampler experimental speedups
Hello :) The attached patch (which is not in any way finished) optimizes the resampler. (For those following the discussions on IRC; this version includes optimizations for both direct and interpolate cases). Using GCC 4.3, x86_64, Valgrind to measure instruction counts, resampling 10 frames of 320 floats at quality 3. Direct was measured with a 16=>48 resampling, and interpolate with a
2003 Mar 30
1
[RFC][patch] dynamic rolling block and sum sizes II
Mark II of the patch set. The first patch (dynsumlen2.patch) increments the protocol version to support per-file dynamic block checksum sizes. It is a prerequisite for varsumlen2.patch. varsumlen2.patch implements per-file dynamic block and checksum sizes. The current block size calculation only applies to files between 7MB and 160MB setting the block size to 1/10,0000 of the file length for a
2012 Jun 04
1
simulation of modified bartlett's test
Hi, I run this code to get the power of the test for modified bartlett's test..but I'm not really sure that my coding is right.. #normal distribution unequal variance asim<-5000 pv<-rep(NA,asim) for(i in 1:asim) {print(i) set.seed(i) n1<-20 n2<-20 n3<-20 mu<-0 sd1<-sqrt(25) sd2<-sqrt(50) sd3<-sqrt(100) g1<-rnorm(n1,mu,sd1) g2<-rnorm(n2,mu,sd2)
2015 Mar 02
13
Patch cleaning up Opus x86 intrinsics configury
The attached patch cleans up Opus's x86 intrinsics configury. It: * Makes --enable-intrinsics work with clang and other non-GCC compilers * Enables RTCD for the floating-point-mode SSE code in Celt. * Disables use of RTCD in cases where the compiler targets an instruction set by default. * Enables the SSE4.1 Silk optimizations that apply to the common parts of Silk when Opus is built in
2015 Mar 13
1
[RFC PATCH v3] Intrinsics/RTCD related fixes. Mostly x86.
From: Jonathan Lennox <jonathan at vidyo.com> * Makes --enable-intrinsics work with clang and other non-GCC compilers * Enables RTCD for the floating-point-mode SSE code in Celt. * Disables use of RTCD in cases where the compiler targets an instruction set by default. * Enables the SSE4.1 Silk optimizations that apply to the common parts of Silk when Opus is built in floating-point mode, not
2015 Mar 12
1
[RFC PATCHv2] Intrinsics/RTCD related fixes. Mostly x86.
From: Jonathan Lennox <jonathan at vidyo.com> * Makes --enable-intrinsics work with clang and other non-GCC compilers * Enables RTCD for the floating-point-mode SSE code in Celt. * Disables use of RTCD in cases where the compiler targets an instruction set by default. * Enables the SSE4.1 Silk optimizations that apply to the common parts of Silk when Opus is built in floating-point mode, not
2009 Oct 26
1
[PATCH] Fix miscompile of SSE resampler
From: Thorvald Natvig <slicer at users.sourceforge.net> Some optimizing compilers miscompile the current SSE optimizations when full optimizations are enabled. By using output value pointer instead of a return value, we can bypass this misbehaviour. --- libspeex/resample.c | 8 ++++---- libspeex/resample_sse.h | 24 ++++++++---------------- 2 files changed, 12 insertions(+), 20
2004 Jan 11
1
comparing 2 in rsync
Hi, Is there a sure way to test 2 CDs if they are true copies in rsync. I have tried this... rsync -avv /mnt/cdrom/ /mnt/cdrom2 ** result....quite shortened... misc/rpm2header is uptodate pkg-9.2-FiveStar-download-i586.idx is uptodate total: matches=0 tag_hits=0 false_alarms=0 data=0 wrote 65301 bytes read 20 bytes 1789.62 bytes/sec total size is 679261427 speedup is 10398.82 ** Is