similar to: [LLVMdev] YA Vectorization Benchmark

Displaying 20 results from an estimated 6000 matches similar to: "[LLVMdev] YA Vectorization Benchmark"

2012 Nov 05
0
[LLVMdev] YA Vectorization Benchmark
----- Original Message ----- > From: "Renato Golin" <rengolin at systemcall.org> > To: "LLVM Developers Mailing List" <llvmdev at cs.uiuc.edu> > Sent: Monday, November 5, 2012 2:57:35 AM > Subject: [LLVMdev] YA Vectorization Benchmark > > Folks, > > Has anyone tried this benchmark before? > >
2012 Nov 05
0
[LLVMdev] YA Vectorization Benchmark
Renato, Thanks for the link. At the moment we are unable to vectorize any of the loops in this benchmark. I found two main problems: 1. We do not allow reductions on floating-point types. We should allow them when unsafe-math is used. 2. All of the arrays are located in a struct. At the moment we don't detect that these arrays are disjoint, and this prevents vectorization. We should be
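The two problems named in this message can be illustrated with a small C sketch. It is not taken from the benchmark: the function names, the `struct kernels` layout, and the array length are hypothetical, and the two-lane sum is only a hand-written model of what a vectorized reduction does. Assuming IEEE 754 single-precision evaluation, the reordered sum can differ from the in-order sum, which is why such a reduction is only legal under unsafe-math.

```c
#include <assert.h>
#include <stddef.h>

/* In-order reduction: without unsafe-math the compiler must keep
 * exactly this sequence of floating-point additions. */
float sum_in_order(const float *a, size_t n) {
    float s = 0.0f;
    for (size_t i = 0; i < n; ++i)
        s += a[i];
    return s;
}

/* Hand-written model of a 2-lane vectorized reduction (illustrative
 * only). Each lane keeps a partial sum, so the additions are
 * reassociated relative to sum_in_order. */
float sum_two_lanes(const float *a, size_t n) {
    assert(n % 2 == 0); /* keep the sketch simple: no remainder loop */
    float lane0 = 0.0f, lane1 = 0.0f;
    for (size_t i = 0; i < n; i += 2) {
        lane0 += a[i];
        lane1 += a[i + 1];
    }
    return lane0 + lane1;
}

/* Hypothetical layout: both arrays are members of one struct.
 * Distinct members occupy distinct storage, so x and y cannot
 * overlap, but the vectorizer has to prove that before it can
 * reorder the loads and stores below. */
struct kernels {
    float x[4];
    float y[4];
};

void scale_add(struct kernels *k, float c) {
    for (size_t i = 0; i < 4; ++i)
        k->y[i] += c * k->x[i];
}
```

With the input `{1e8f, 1.0f, -1e8f, 1.0f}`, the in-order sum absorbs the first `1.0f` into `1e8f` and loses it, while the two-lane version adds the two small values together in their own lane, so the two functions return different results for the same data.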
2012 Nov 05
2
[LLVMdev] YA Vectorization Benchmark
On 5 November 2012 17:41, Nadav Rotem <nrotem at apple.com> wrote: > 1. We do not allow reductions on floating-point types. We should allow them when unsafe-math is used. > 2. All of the arrays are located in a struct. At the moment we don't detect that these arrays are disjoint, and this prevents vectorization. Indeed, they look like simple changes. If no one is dying to get them
2012 Nov 05
0
[LLVMdev] YA Vectorization Benchmark
That would be great! On Nov 5, 2012, at 2:11 PM, Renato Golin <rengolin at systemcall.org> wrote: > On 5 November 2012 17:41, Nadav Rotem <nrotem at apple.com> wrote: >> 1. We do not allow reductions on floating point types. We should allow them when unsafe-math is used. >> 2. All of the arrays are located in a struct. At the moment we don't detect that these
2012 Nov 06
2
[LLVMdev] YA Vectorization Benchmark
Ok, I got the benchmark to work on test-suite, but it's not printing details for each run (or execution wouldn't work). I had to comment out the printf lines, but nothing more than that. I'm not sure how individual timings would have to be extracted, but the program produces output via text file, which can be used for comparison. Also, it does check the results and does report if they
2012 Nov 16
4
[LLVMdev] YA Vectorization Benchmark
Daniel, Nadav, Hal, So, after some painstakingly boring re-formatting, I've split the 24 kernels into 24 files (and left a horrible header file with code in it, which I'll clean up later). Since we're taking times in the benchmark tool, and we're trying to assert the quality of the FP approximation by the vectorization, I'll try to come up with a reasonable watermark for each
2012 Nov 16
1
[LLVMdev] YA Vectorization Benchmark
On 16 November 2012 21:43, Nadav Rotem <nrotem at apple.com> wrote: > Once Michael Ilseman commits the fast math patch we will be able to implement floating point reductions. That's great news! Attached is the whole benchmark, divided into 24 kernels and running on LNT with FP comparison and timings. Unpack the file onto SingleSource/Benchmarks and change the Makefile to add
2012 Nov 16
0
[LLVMdev] YA Vectorization Benchmark
Hi Renato! Thanks for working on this! It's really important to have more array-ish benchmarks. On Nov 16, 2012, at 12:28 PM, Renato Golin <rengolin at systemcall.org> wrote: > Daniel, Nadav, Hal, > > So, after some painstakingly boring re-formatting, I've split the 24 > kernels into 24 files (and left a horrible header file with code in > it, which I'll
2012 Nov 06
0
[LLVMdev] YA Vectorization Benchmark
Hey Renato, Cool, glad you got it working. There is very primitive support for tests that generate multiple output results, but I would rather not use those facilities. Is it possible instead to refactor the tests so that each binary corresponds to one test? For example, look at how Hal went about integrating TSVC:
2012 Nov 07
2
[LLVMdev] YA Vectorization Benchmark
On 6 November 2012 22:28, Daniel Dunbar <daniel at zuster.org> wrote: > Is it possible instead to refactor the tests so that each binary corresponds > to one test? For example, look at how Hal went about integrating TSVC: It should be possible. I'll have to understand better what the preamble does to make sure I'm not stripping out important stuff, but also what to copy to
2012 Nov 16
0
[LLVMdev] YA Vectorization Benchmark
Seems fairly reasonable to me. I don't know what size of arrays you are dealing with; if they are reasonably small, it is probably also fine to just output each element in the result. It's fine to start by just setting FP_TOLERANCE to a small value, and if it breaks in the future because of an actual precision change we can tweak it. Thanks for the reformatting, it's great to see new
2011 Oct 29
4
[LLVMdev] [llvm-commits] [PATCH] BasicBlock Autovectorization Pass
Ralf, et al., Attached is the latest version of my autovectorization patch. llvmdev has been CC'd (as had been suggested to me); this e-mail contains additional benchmark results. First, these are preliminary results because I did not do the things necessary to make them real (explicitly quiet the machine, bind the processes to one cpu, etc.). But they should be good enough for discussion.
2012 Nov 07
0
[LLVMdev] YA Vectorization Benchmark
On Wed, Nov 7, 2012 at 12:39 AM, Renato Golin <rengolin at systemcall.org> wrote: > On 6 November 2012 22:28, Daniel Dunbar <daniel at zuster.org> wrote: > > Is it possible instead to refactor the tests so that each binary corresponds > > to one test? For example, look at how Hal went about integrating TSVC: > > It should be possible. I'll have to
2011 Oct 29
0
[LLVMdev] [llvm-commits] [PATCH] BasicBlock Autovectorization Pass
On Sat, 2011-10-29 at 12:30 -0500, Hal Finkel wrote: > Ralf, et al., > > Attached is the latest version of my autovectorization patch. llvmdev > has been CC'd (as had been suggested to me); this e-mail contains > additional benchmark results. > > First, these are preliminary results because I did not do the things > necessary to make them real (explicitly quiet the
2012 Nov 05
2
[LLVMdev] New benchmark in test-suite
Hi Daniel, I'm trying to add LivermoreLoops test to the benchmark suite (tar ball attached), but I'm getting the error below: --- Tested: 2 tests -- FAIL: SingleSource/Benchmarks/LivermoreLoops/lloops.compile_time (1 of 2) FAIL: SingleSource/Benchmarks/LivermoreLoops/lloops.execution_time (2 of 2) When I use the option to only run this test: --only-test
2009 Dec 17
2
[LLVMdev] Automatic Vectorization
Hi all, I sent this as a reply to another thread, but it was ill placed. Anyway, sorry about the duplication, but here it goes. I've been looking into the loop passes and noticed we do alias analysis and scalar evolution only, trying to clean up the loop as far as possible. I suppose that, if we were to define SCCs, split them into groups and rearrange them into multiple loops, we would
2012 Apr 18
5
[LLVMdev] Vectorization metadata
Hal, I'm opening a new discussion on vectorization metadata, since it has little to do with fp-math. ;) What kind of metadata would you annotate in the instructions? If I remember from your talk, you're not doing any loop or whole-function analysis, possibly leaving it for Polly to help you along the way. I remember discussing it with Tobias that Polly could have three main steps: 1.
2010 May 06
1
[LLVMdev] Auto-Vectorization in LLVM
On 6 May 2010 05:34, Chris Lattner <clattner at apple.com> wrote: > On May 5, 2010, at 1:01 PM, Rajkishore Barik wrote: >> I would also like to know if there is any progress/future plans to >> include this >> in the main trunk? > > Unfortunately, nothing came of this project AFAIK, maybe Devang knows more. I looked for it and couldn't find any, either. I found
2011 Oct 29
4
[LLVMdev] [llvm-commits] [PATCH] BasicBlock Autovectorization Pass
On Sat, 2011-10-29 at 14:02 -0500, Hal Finkel wrote: > On Sat, 2011-10-29 at 12:30 -0500, Hal Finkel wrote: > > Ralf, et al., > > > > Attached is the latest version of my autovectorization patch. llvmdev > > has been CC'd (as had been suggested to me); this e-mail contains > > additional benchmark results. > > > > First, these are preliminary
2012 Apr 18
2
[LLVMdev] Vectorization metadata
Hi Ether, On 18 April 2012 19:11, Hongbin Zheng <etherzhhb at gmail.com> wrote: > Instead of exporting the polyhedral model of the program with > metadata, another possible solution is designing a generic "Loop > Parallelism" analysis interface just like the AliasAnalysis group. > For a particular loop, the interface simply answers how many loop > iterations can run