similar to: Determination of statements that contain only matrix multiplication

Displaying 20 results from an estimated 100 matches similar to: "Determination of statements that contain only matrix multiplication"

2016 May 17
4
Determination of statements that contain only matrix multiplication
On 05/17/2016 01:47 PM, Michael Kruse wrote: > 2016-05-16 19:52 GMT+02:00 Roman Gareev <gareevroman at gmail.com>: >> Hi Tobias, >> >> could we use information about memory accesses of a SCoP statement and >> def-use chains to determine statements, which don’t contain matrix >> multiplication of the following form? > > Assuming s/don't/do you want
2008 Sep 26
3
rails -1.2.3 to 2.1.1 ? how ?
I am currently working on a Rails project at version 1.2.3, but I would like to work with Rails 2.1.1. I think freezing the 1.2.3 gem into vendor will solve the problem, but I have 35 plugins in my project (which is on version 1.2.3), so I am worried about freezing the old gem into vendor. Any help appreciated. Thanks -- Posted via http://www.ruby-forum.com/.
2018 Jan 08
2
Fwd: R/MKL Intel 2018 Compatibility
Dear all, I would like to report an issue that we are facing. In our environment, we optimize R code to speed up mathematical calculations such as matrix products using the Intel libraries (MKL) (https://software.intel.com/en-us/mkl). With the latest version of the MKL libraries, Intel 2018, we are facing an issue with *all INTERNAL commands* that are executed in R.
2013 Sep 02
2
[LLVMdev] [Polly] Compile-time of Polly's code generation
Hi all, It seems that Polly's code generation can lead to high compile-time overhead, especially for PolyBench applications such as 2mm, 3mm, gemm, syrk, etc. Some basic evaluation and analysis of Polly's code generation can be found at http://llvm.org/bugs/show_bug.cgi?id=16898. Currently, we can choose to run -polly-code-generator=cloog or -polly-code-generator=isl for code
2013 Aug 12
1
[LLVMdev] [FastPolly]: Update of Polly's performance on LLVM test-suite
At 2013-08-12 01:18:30,"Tobias Grosser" <tobias at grosser.es> wrote: >On 08/10/2013 06:59 PM, Star Tan wrote: >> Hi all, >> >> I have evaluated Polly's performance on LLVM test-suite with latest LLVM (r188054) and Polly (r187981).  Results can be viewed on: http://188.40.87.11:8000. > >Hi Star Tan, > >thanks for the update. >
2018 Jan 15
3
Inclusion of Polly and isl into core LLVM
[add subject] Dear LLVM community, hope all of you had a good start into 2018 and a quiet branching of LLVM 6.0. With the latest LLVM release out of the way and a longer development phase starting, we would like to restart the process of including Polly and isl into core LLVM to bring changes in early on before the next LLVM release. Short summary: * Today Polly is already part of each LLVM
2018 Jan 23
0
Inclusion of Polly and isl into core LLVM
On Mon, 15 Jan 2018 22:44:45 +0100, Tobias Grosser via llvm-dev wrote: <snip> > * How stable/fast/… is Polly today > * We build all of AOSP with rather restrictive compile-time limits > * Bootstrapping time of clang is regressed by 6% (at most) > * Removal of scalar dependences is today very generic and must be > sped up in the future > * Polly still
2013 Aug 11
0
[LLVMdev] [FastPolly]: Update of Polly's performance on LLVM test-suite
On 08/10/2013 06:59 PM, Star Tan wrote: > Hi all, > > I have evaluated Polly's performance on LLVM test-suite with latest LLVM (r188054) and Polly (r187981). Results can be viewed on: http://188.40.87.11:8000. Hi Star Tan, thanks for the update. > There are mainly five new tests and each test is run with 10 samples: > clang (run id = 27): clang -O3 > pollyBasic (run id =
2016 May 28
1
Determination of statements that contain only matrix multiplication
Sorry for not responding earlier. On 05/20/2016 03:05 PM, Roman Gareev wrote: > Thank you very much for the advice! I could probably try to avoid > using non-hardware prefetching in the project, if Tobias doesn’t > disagree with it. My understanding is that prefetching isn’t used > explicitly in [1] and, according to [2], in some cases 90% of the > turbo boost peak of the
2013 Sep 08
2
[LLVMdev] [Polly] Compile-time of Polly's code generation
At 2013-09-02 17:05:52,"Tobias Grosser" <tobias at grosser.es> wrote: >On 09/01/2013 08:02 PM, Star Tan wrote: >> Hi all, >> >> >> It seems that Polly's code generation can lead to high compile-time overhead, especially for PolyBench applications such as 2mm, 3mm, gemm, syrk, etc. Some basic evaluation and analysis of Polly's code generation
2016 May 20
0
Determination of statements that contain only matrix multiplication
2016-05-19 21:45 GMT+05:00 4lbert C0hen <4lbert.h.c0hen at gmail.com>: > One short note. I would advise against spending time on prefetching for x86. > Recent hardware prefetchers are amazingly good at strided accesses in > single-threaded code. Caution: this is not based on objective/published > data, but on personal experience. > > There are open challenges in
2013 Aug 11
2
[LLVMdev] [FastPolly]: Update of Polly's performance on LLVM test-suite
Hi all, I have evaluated Polly's performance on LLVM test-suite with latest LLVM (r188054) and Polly (r187981).  Results can be viewed on: http://188.40.87.11:8000. There are mainly five new tests and each test is run with 10 samples: clang (run id = 27):  clang -O3 pollyBasic (run id = 28):  clang -O3 -load LLVMPolly.so pollyNoGen (run id = 29):  pollycc -O3 -mllvm -polly-optimizer=none
2018 Jan 15
2
(no subject)
Dear LLVM community, hope all of you had a good start into 2018 and a quiet branching of LLVM 6.0. With the latest LLVM release out of the way and a longer development phase starting, we would like to restart the process of including Polly and isl into core LLVM to bring changes in early on before the next LLVM release. Short summary: * Today Polly is already part of each LLVM release (and
2013 Sep 02
0
[LLVMdev] [Polly] Compile-time of Polly's code generation
On 09/01/2013 08:02 PM, Star Tan wrote: > Hi all, > > > It seems that Polly's code generation can lead to high compile-time overhead, especially for PolyBench applications such as 2mm, 3mm, gemm, syrk, etc. Some basic evaluation and analysis of Polly's code generation can be found at http://llvm.org/bugs/show_bug.cgi?id=16898. > > > Currently, we can choose to
2015 Nov 17
12
3.7.1-rc1 has been tagged. Let's begin testing!
Hi, I have just tagged 3.7.1-rc1, so it is ready for testing. As a reminder, when doing regression testing, use the 3.7.0 release as your baseline. Thanks, Tom
2013 May 03
0
[LLVMdev] [Polly] GSoC Proposal: Reducing LLVM-Polly Compiling overhead
Dear Tobias, Thank you very much for your very helpful advice. Yes, -debug-pass and -time-passes are two very useful and powerful options when evaluating the compile-time of each compiler pass. They are exactly what I need! With these options, I can step into details of the compile-time overhead of each pass. I have finished some preliminary testing based on two randomly selected files from
2013 May 02
2
[LLVMdev] [Polly] GSoC Proposal: Reducing LLVM-Polly Compiling overhead
On 04/30/2013 04:13 PM, Star Tan wrote: > Hi all, [...] > How could I find out where the time is spent on between two adjacent Polly passes? Can anyone give me some advice? Hi Star Tan, I propose to do the performance analysis using the 'opt' tool and optimizing LLVM-IR, instead of running it from within clang. For the 'opt' tool there are two commands that should help
2013 Jul 05
0
[LLVMdev] Enabling vectorization with LLVM 3.3 for a DSL emitting LLVM IR
On 07/04/2013 01:39 PM, Stéphane Letz wrote: > Hi, > > Our DSL can generate C or directly generate LLVM IR. With LLVM 3.3, we can vectorize the generated C code using clang with -O3, or clang with -O1 then opt -O3 -vectorize-loops. But the LLVM IR version of the same program cannot be vectorized with opt -O3 -vectorize-loops. So our guess is that our generated LLVM IR lacks some
2013 Jul 04
3
[LLVMdev] Enabling vectorization with LLVM 3.3 for a DSL emitting LLVM IR
Hi, Our DSL can generate C or directly generate LLVM IR. With LLVM 3.3, we can vectorize the generated C code using clang with -O3, or clang with -O1 then opt -O3 -vectorize-loops. But the LLVM IR version of the same program cannot be vectorized with opt -O3 -vectorize-loops. So our guess is that our generated LLVM IR lacks some information that is needed by the vectorization passes to
2015 Feb 26
5
[LLVMdev] [RFC] AArch64: Should we disable GlobalMerge?
Hi all, I've started looking at the GlobalMerge pass, enabled by default on ARM and AArch64. I think we should reconsider that, at least for AArch64. As is, the pass just merges all globals together, in groups of 4KB (AArch64, 128B on ARM). At the time it was enabled, the general thinking was "it's almost free, it doesn't affect performance much, we might as well use it".