thr3ads.net - similar to: "[LLVMdev] Disabling x87 instructions for a sub-target"

[LLVMdev] Disabling x87 instructions for a sub-target

2012 Apr 04

Hi Sriram, I'm not sure if I understand your question correctly: Do you need to generate code that contains no x87 floating-point instructions altogether, but uses calls into a soft-float library instead? That behaviour can be enabled using the "-soft-float" flag, as far as I know. Or is it only about the fcomi* instructions, which are not supported by pre-Pentium Pro chips? Then I

[LLVMdev] Disabling x87 instructions for a sub-target

2012 Apr 06

Thanks Chris, your response has been very helpful so far. I will try your solution, as opposed to the one that I have right now. (Disabling all the x87 instructions altogether). Yours, Ram -----Original Message----- From: Christoph Erhardt [mailto:christoph at sicherha.de] Sent: Thursday, April 05, 2012 7:08 PM To: Murali, Sriram Subject: Re: [LLVMdev] Disabling x87 instructions for a

[LLVMdev] Improving the usability of LNT

2013 Apr 30

Hi Daniel, I made some changes to the LNT perf reporting tool to make it more user friendly by adding some features: 1. Make the sidebar and the navigation bar stationary, so that it is easy to navigate the site 2. Have the pop-down menu for the items in the navigation bar, activate upon hovering the mouse, rather than clicking the item 3. Add a nav-link in the sidebar for the

[LLVMdev] Trip count and Loop Vectorizer

2013 Sep 27

Hi, I am trying to get a small loop to *not vectorize* for cases where it doesn't make sense. For instance, this loop: void foo(int a[4][8], int n) { int b[4][8]; for(int i = 0; i < 4; i++) { for(int j = 0; j < n; j++) { a[i][j] = b[i][j]; } } } * Has a maximum of 8 ints to copy. LLVM tries to use Memcpy for the inner loop. It is not helpful to perform

[LLVMdev] Improving the usability of LNT

2013 May 02

Wow, that sounds great! Thanks for working on this, and yes, please, send the patches! --renato On 30 April 2013 16:23, Murali, Sriram <sriram.murali at intel.com> wrote: > Hi Daniel, > > I made some changes to the LNT perf reporting tool to make it more user > friendly by adding some features: > > 1. Make the sidebar and the navigation bar stationary,

[LLVMdev] Trip count and Loop Vectorizer

2013 Sep 27

Hi Sriram, Thanks for performing this analysis. The problem here, both for memcpy and the vectorizer, is that we can’t predict the size of “n”, even though the only use of ’n’ is for the loop bound for the alloca [4 x [8 x i32]]. If you change the unroll condition to TC >= 0 then you will disable loop unrolling for all loops because getSmallConstantTripCount returns an unsigned number. You

[LLVMdev] Trip count and Loop Vectorizer

2013 Sep 27

Hi Nadav, Thanks for the response. I forgot to mention that there is an upper limit of 16 for the Trip Count check: TinyTripCountVectorThreshold = 16; if (TC > 0u && TC < TinyTripCountVectorThreshold). So right now, for any loop with a Trip Count of 0, or with a value >= 16, LV will unroll. With the change to the lower bound, it will also include the loop with 0 trip count. SCEV returns 0
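
The check discussed in this exchange can be looked at in isolation. The following is only a sketch under stated assumptions, not the vectorizer's actual code: the constant and the two helper functions are hypothetical names chosen for illustration, and the trip count is modeled as a plain `unsigned` where 0 stands for "unknown". It shows why relaxing the lower bound to `>= 0` on an unsigned value changes nothing except that the unknown case is now included.

```cpp
#include <cassert>

// Hypothetical stand-in for the threshold value of 16 quoted in the thread.
constexpr unsigned kTinyTripCountThreshold = 16;

// The check as quoted: a trip count of 0 (assumed here to mean "unknown")
// is NOT treated as a tiny loop, and neither is anything >= 16.
constexpr bool isTinyStrict(unsigned tc) {
  return tc > 0u && tc < kTinyTripCountThreshold;
}

// The variant with the lower bound relaxed to ">= 0". For an unsigned value
// that comparison is always true, so only the upper bound remains and a
// trip count of 0 now passes the check as well.
constexpr bool isTinyRelaxed(unsigned tc) {
  return tc < kTinyTripCountThreshold;  // tc >= 0u would always hold
}

// Small self-check covering the boundary values.
inline void checkTripCountExamples() {
  assert(isTinyStrict(4) && isTinyRelaxed(4));    // small known count: both agree
  assert(!isTinyStrict(0) && isTinyRelaxed(0));   // unknown count: they differ
  assert(!isTinyStrict(16) && !isTinyRelaxed(16)); // at the threshold: both reject
}
```

The two predicates differ only at 0, which is exactly the value that stands for every loop whose trip count could not be determined, so the relaxed form affects far more loops than the small ones it was meant to catch.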

[LLVMdev] State of Loop Unrolling and Vectorization in LLVM

2013 Apr 15

Hi, I have a test case (and a micro benchmark made out of the test case) to check whether loop unrolling and loop vectorization are done efficiently in LLVM. Here is the test case (credits: Tyler Nowicki) {code} extern float * array; extern int array_size; float g() { int i; float total = 0; for(i = 0; i < array_size; i++) { total += array[i]; } return total; } {code} When
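
To experiment with the test case quoted above, the two `extern` globals need definitions. The sketch below supplies them with illustrative values (the backing buffer `storage` and its contents are assumptions, not part of the original post) and leaves the reduction loop itself unchanged in behavior.

```cpp
#include <cassert>

// Definitions for the globals that the quoted test case only declares.
// The buffer name and its contents are illustrative assumptions.
static float storage[8] = {1, 2, 3, 4, 5, 6, 7, 8};
float *array = storage;
int array_size = 8;

// The reduction loop from the test case: a running floating-point sum
// over array_size elements, where array_size is only known at run time.
float g() {
  float total = 0;
  for (int i = 0; i < array_size; i++) {
    total += array[i];
  }
  return total;
}

// Small self-check: 1 + 2 + ... + 8 is exactly representable in float.
inline void checkSum() {
  assert(g() == 36.0f);
}
```

One point worth keeping in mind when reading results for a loop like this: vectorizing a floating-point sum reorders the additions, so compilers generally need relaxed floating-point semantics (for example `-ffast-math`) before they will vectorize such a reduction.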

[LLVMdev] [Patch][Review Requested][Compilation Time] Avoid frequent copy of elements in LoopStrengthReduce

2013 Jan 29

Hello, This patch aims to improve compile time performance by increasing the SCEV vector size in LoopStrengthReduce. It is observed that the BaseRegs vector size is 4 in most cases, and elements are frequently copied when it is initialized as SmallVector<const SCEV *, 2> BaseRegs. Our benchmark results show that the compilation time performance improved by ~0.5%. Patch by Wan Xiaofei.

[LLVMdev] [Patch][Review Requested][Compilation Time] Avoid frequent copy of elements in LoopStrengthReduce

2013 Jan 29

On Tue, Jan 29, 2013 at 3:59 PM, Murali, Sriram <sriram.murali at intel.com> wrote: > Our benchmark results show that the compilation time performance improved by > ~0.5%. That's fairly small; what was the standard deviation, confidence interval, etc? -- Sean Silva

[LLVMdev] Trip count and Loop Vectorizer

2013 Sep 27

On Sep 27, 2013, at 12:47 PM, Arnold Schwaighofer <aschwaighofer at apple.com> wrote: > so you could infer that n must be smaller than 8 (because you know the range of the other dimension). The question is how often does such an example occur, where this is possible, to make such an effort justifiable? smaller equal, of course ;)

[LLVMdev] Trip count and Loop Vectorizer

2013 Sep 27

Hey Arnold, I have run into this situation many times while benchmarking. I think it is best if this is addressed using a simple heuristic. For that, we need to identify the loop cost and decide if it makes sense to completely unroll the loop, or partially unroll. I am unsure of the optimal way to implement this though. I want to run it by the list to get any ideas floating around :) Thanks

[LLVMdev] [codegen] how to generate x87 instructions using LLVM

2009 Sep 16

Hi All, I am new to LLVM. I find that LLVM generates SSE instructions for floating-point computation. Is there some method or option to make it generate x87 instructions? Thanks Simon -- From: Simon.Zhou at PPI, Fudan University

[LLVMdev] Calling with register indirect reference instead of memory indirect reference.

2013 Feb 28

Hi, I am working on a small optimization feature to replace calls that use a memory indirect reference with calls that use a register indirect reference. The purpose of this feature is to improve the performance of calls to functions referred to by function pointers. The motivation behind this work is that gcc does this optimization. Here is a small test case, that will generate an indirect call

No sub-menus in complex.c32

2008 Jun 19

Hi, I have created my own custom menu using menu/complex.c as an example. This was way back in the syslinux 2.x days, so the menu system had not yet been converted to COM32. With a few minor changes, I now have this compiling and running under v3.63. The problem is that only the top level menu works. Trying to access any sub-menu

[LLVMdev] vector type legalization

2013 Aug 13

Hi Nadav, I believe the implementation to keep on widening the vector to the next power of two must be in TargetLowering.h because that is where we decide whether to Widen the vector or not, and the size to which we widen it. In this case, we stop at 4xi8 and do not check if it is legal or not. But the comment says ‘try to widen vector elements until a legal type is found’. Also, there is a

[LLVMdev] Calling with register indirect reference instead of memory indirect reference.

2013 Mar 01

Hello > I am wondering if the modification made to the DAG is causing a problem, and > can it be done at all? If I cannot do this, is there any other place I can > look at, to make this work. It's hard to tell w/o seeing the exact code / DAG. Note, however, that this assertion is seen on simple LLVM IR: http://llvm.org/bugs/show_bug.cgi?id=15053 So, it might be not your bug after

[LLVMdev] Trip count and Loop Vectorizer

2013 Sep 27

Sriram, The problem is that you want to unroll/vectorize many loops with non-constant loop count - it is a trade-off of which case you estimate as more likely. int foo(int *ptr, int n) { for ( .. i <n) ptr[i] = ... } The question is: is it more likely to have “n” such that unrolling is beneficial or not. Now, you could probably write an analysis that bounds the loop count (for the

Ocfs mount issues

2004 Feb 12

I am trying to mount the ocfs partitions using the following command: mount -t ocfs -o uid=oracle,gid=dba /dev/sda /ocfs01 as user oracle and group dba. However, it mounts the volume as root. But if I use the ocfstool for the first mount and mount it as oracle:dba, the subsequent mounts using the above command line mount the volume as oracle:dba. Is there something that I am missing or I will have

CDR 0.00 duration

2009 Jan 21

Hi, I am using Trixbox 2.4 and PRI lines. On the CDR I see many calls that have a duration of 0 seconds, but they are still shown as ANSWERED. How is that possible when the duration is 0.00? Are the callers billed for such calls? Rgds Sriram