thr3ads.net - similar to: "[LLVMdev] Intrinsics and dead instruction/code elimination"

Displaying 20 results from an estimated 5000 matches similar to: "[LLVMdev] Intrinsics and dead instruction/code elimination"

[LLVMdev] Intrinsics and dead instruction/code elimination

2010 May 19

On May 19, 2010, at 7:07 AM, o.j.sivart at gmail.com wrote:
> Hi all,
>
> I'm interested in the impact of representing code via intrinsic functions, in contrast to via an instruction, when it comes to performing dead instruction/code elimination. As a concrete example, let's consider the simple case of the llvm.*.with.overflow.* intrinsics.
>
> If I have some sequence (> 1)

[LLVMdev] Intrinsics and dead instruction/code elimination

2010 May 19

On 20/05/2010, at 3:01 AM, Chris Lattner wrote:
>
> On May 19, 2010, at 7:07 AM, o.j.sivart at gmail.com wrote:
>
>> Hi all,
>>
>> I'm interested in the impact of representing code via intrinsic functions, in contrast to via an instruction, when it comes to performing dead instruction/code elimination. As a concrete example, let's consider the simple case of the

[LLVMdev] Intrinsics and dead instruction/code elimination

2010 May 19

On May 19, 2010, at 3:13 PM, o.j.sivart at gmail.com wrote:
>>
>> Intrinsics should be optimized as well as instructions. In this specific case, these intrinsics should be marked readnone, which means that load/store optimization will ignore them. Dead code elimination will delete the intrinsic if it is dead etc.
>
> I understand that dead code elimination is able to delete

[LLVMdev] Intrinsics and dead instruction/code elimination

2010 May 19

On 20/05/2010, at 8:16 AM, Chris Lattner wrote:
>
> On May 19, 2010, at 3:13 PM, o.j.sivart at gmail.com wrote:
>
>>>
>>> Intrinsics should be optimized as well as instructions. In this specific case, these intrinsics should be marked readnone, which means that load/store optimization will ignore them. Dead code elimination will delete the intrinsic if it is dead

Comparing "transform" to "with"

2007 Sep 01

Hi All,
I've been successfully using the with function for analyses and the transform function for multiple transformations. Then I thought, why not use "with" for both? I ran into problems & couldn't figure them out from help files or books. So I created a simplified version of what I'm doing:
    rm( list=ls() )
    x1<-c(1,3,3)
    x2<-c(3,2,1)
    x3<-c(2,5,2)

how to convert non numeric data into numeric?

2008 Apr 22

I am having the following error in my function
    function(theta,reqdIRR) {
      theta1<-theta[1]
      theta2<-theta[2]
      n<-length(reqdIRR)
      constant<- n*(theta1+theta2)
      sum1<-lapply(reqdIRR*exp(theta1),FUN = sum)
      sum2<-lapply(exp(theta2 - reqdIRR*exp(theta1)),FUN = sum)
      sum = sum1 + sum2
      log.fcn = constant - as.numeric(sum)
      result = - log.fcn
      return(result)
    }
*error :

Help with efficient double sum of max (X_i, Y_i) (X & Y vectors)

2007 Feb 01

Greetings. For R gurus this may be a no brainer, but I could not find pointers to efficient computation of this beast in past help files. Background - I wish to implement a Cramer-von Mises type test statistic which involves double sums of max(X_i,Y_j) where X and Y are vectors of differing length. I am currently using ifelse pointwise in a vector, but have a nagging suspicion that there is a

Speex ARM4 patch

2007 Apr 03

The attached patch eliminates some warnings while compiling for ARM4 targets. It also simplifies the asm constraints a bit. Now we can use the ARM4 optimisations when compiling for PortalPlayer targets in Rockbox.
Cheers,
Dan

[PATCH 0/7] PowerPC64 performance improvements

2018 Jul 10

The following series adds initial vector support for PowerPC64. On POWER9, flac --best is about 3.3x faster.
Amitay Isaacs (2):
    Add m4 macro to check for C __attribute__ features
    Check if compiler supports target attribute on ppc64
Anton Blanchard (5):
    configure.ac: Remove SPE detection code
    configure.ac: Add VSX enable/disable
    configure.ac: Fix FLAC__CPU_PPC on little endian, and add

Faster way of summing values up based on expand.grid

2013 Mar 25

Hello!
    # I have 3 vectors of values:
    values1<-rnorm(10)
    values2<-rnorm(10)
    values3<-rnorm(10)
    # In real life, all 3 vectors have a length of 25
    # I create all possible combinations of 4 based on 10 elements:
    mycombos<-expand.grid(1:10,1:10,1:10,1:10)
    dim(mycombos)
    # Removing rows that contain pairs of identical values in any 2 of these columns:
    mycombos<-mycombos[!(mycombos$Var1

SSE2 code won't compile in VC

2008 Nov 26

Jean-Marc,
At least VS2005 (what I'm using) won't compile resample_sse.h with _USE_SSE2 defined because it refuses to cast __m128 to __m128d and vice versa. While there are intrinsics to do the casts, I thought it would be simpler to just use an intrinsic that accomplishes the same thing without all the casting.
Thanks,
--John
@@ -91,7 +91,7 @@ static inline double

Possible security hole

2003 Oct 05

Maybe security related mails should be sent elsewhere? I didn't notice any so here it goes:
sender.c:receive_sums()
    s->count = read_int(f);
    ..
    s->sums = (struct sum_buf *)malloc(sizeof(s->sums[0])*s->count);
    if (!s->sums) out_of_memory("receive_sums");
    for (i=0; i < (int) s->count;i++) {
        s->sums[i].sum1 = read_int(f);

structure(<primitive function>, ...) is sticky: a bug, or should it be an error?

2025 Mar 19

Hello. I just (re-)discovered that structure(sum, init = 100) is "sticky", i.e. it stays with base::sum(). Here's a minimal example:
    $ R --vanilla --quiet
    > void <- structure(sum, some_attr = TRUE)
    > str(sum)
    function (..., na.rm = FALSE)
     - attr(*, "some_attr")= logi TRUE
From my very basic troubleshooting, it looks like this is happening for primitive

Resampler experimental speedups

2008 Apr 04

Hello :) The attached patch (which is not in any way finished) optimizes the resampler. (For those following the discussions on IRC; this version includes optimizations for both direct and interpolate cases). Using GCC 4.3, x86_64, Valgrind to measure instruction counts, resampling 10 frames of 320 floats at quality 3. Direct was measured with a 16=>48 resampling, and interpolate with a

[RFC][patch] dynamic rolling block and sum sizes II

2003 Mar 30

Mark II of the patch set. The first patch (dynsumlen2.patch) increments the protocol version to support per-file dynamic block checksum sizes. It is a prerequisite for varsumlen2.patch. varsumlen2.patch implements per-file dynamic block and checksum sizes. The current block size calculation only applies to files between 7MB and 160MB, setting the block size to 1/10,000 of the file length for a

simulation of modified bartlett's test

2012 Jun 04

Hi, I run this code to get the power of the test for modified bartlett's test..but I'm not really sure that my coding is right..
    #normal distribution unequal variance
    asim<-5000
    pv<-rep(NA,asim)
    for(i in 1:asim)
    {print(i)
    set.seed(i)
    n1<-20
    n2<-20
    n3<-20
    mu<-0
    sd1<-sqrt(25)
    sd2<-sqrt(50)
    sd3<-sqrt(100)
    g1<-rnorm(n1,mu,sd1)
    g2<-rnorm(n2,mu,sd2)

Patch cleaning up Opus x86 intrinsics configury

2015 Mar 02

The attached patch cleans up Opus's x86 intrinsics configury. It:
* Makes --enable-intrinsics work with clang and other non-GCC compilers
* Enables RTCD for the floating-point-mode SSE code in Celt.
* Disables use of RTCD in cases where the compiler targets an instruction set by default.
* Enables the SSE4.1 Silk optimizations that apply to the common parts of Silk when Opus is built in

[RFC PATCH v3] Intrinsics/RTCD related fixes. Mostly x86.

2015 Mar 13

From: Jonathan Lennox <jonathan at vidyo.com>
* Makes --enable-intrinsics work with clang and other non-GCC compilers
* Enables RTCD for the floating-point-mode SSE code in Celt.
* Disables use of RTCD in cases where the compiler targets an instruction set by default.
* Enables the SSE4.1 Silk optimizations that apply to the common parts of Silk when Opus is built in floating-point mode, not

[RFC PATCHv2] Intrinsics/RTCD related fixes. Mostly x86.

2015 Mar 12

[PATCH] Fix miscompile of SSE resampler

2009 Oct 26

From: Thorvald Natvig <slicer at users.sourceforge.net>
Some optimizing compilers miscompile the current SSE optimizations when full optimizations are enabled. By using output value pointer instead of a return value, we can bypass this misbehaviour.
---
 libspeex/resample.c     |  8 ++++----
 libspeex/resample_sse.h | 24 ++++++++----------------
 2 files changed, 12 insertions(+), 20
