thr3ads.net - similar to: "Determination of statements that contain only matrix multiplication"

Displaying 20 results from an estimated 100 matches similar to: "Determination of statements that contain only matrix multiplication"

Determination of statements that contain only matrix multiplication

2016 May 17

On 05/17/2016 01:47 PM, Michael Kruse wrote: > 2016-05-16 19:52 GMT+02:00 Roman Gareev <gareevroman at gmail.com>: >> Hi Tobias, >> >> could we use information about memory accesses of a SCoP statement and >> def-use chains to determine statements, which don’t contain matrix >> multiplication of the following form? > > Assuming s/don't/do you want

rails -1.2.3 to 2.1.1 ? how ?

2008 Sep 26

I am currently working on a Rails project at version 1.2.3, but I would like to move to Rails 2.1.1. I think freezing the 1.2.3 gem into vendor would solve the problem, but I have 35 plugins in my project (which is on version 1.2.3), so I am worried about freezing the old gem into vendor. Any help appreciated. Thanks -- Posted via http://www.ruby-forum.com/.

Fwd: R/MKL Intel 2018 Compatibility

2018 Jan 08

Dear all, I would like to report an issue that we are facing. In our environment, we are optimizing R code to speed up some mathematical calculations, such as matrix products, using the Intel libraries (MKL) (https://software.intel.com/en-us/mkl). With the latest version of the MKL libraries, Intel 2018, we are facing an issue with *all INTERNAL commands* that are executed in R.

[LLVMdev] [Polly] Compile-time of Polly's code generation

2013 Sep 02

Hi all, It seems that Polly's code generation can lead to high compile-time overhead, especially for PolyBench applications such as 2mm, 3mm, gemm, syrk, etc. Some basic evaluation and analysis for Polly's code generation can be found at http://llvm.org/bugs/show_bug.cgi?id=16898. Currently, we can choose to run -polly-code-generator=cloog or -polly-code-generator=isl for code

[LLVMdev] [FastPolly]: Update of Polly's performance on LLVM test-suite

2013 Aug 12

At 2013-08-12 01:18:30,"Tobias Grosser" <tobias at grosser.es> wrote: >On 08/10/2013 06:59 PM, Star Tan wrote: >> Hi all, >> >> I have evaluated Polly's performance on LLVM test-suite with latest LLVM (r188054) and Polly (r187981). Results can be viewed on: http://188.40.87.11:8000. > >Hi Star Tan, > >thanks for the update. >

Inclusion of Polly and isl into core LLVM

2018 Jan 15

[add subject] Dear LLVM community, hope all of you had a good start into 2018 and a quiet branching of LLVM 6.0. With the latest LLVM release out of the way and a longer development phase starting, we would like to restart the process of including Polly and isl into core LLVM to bring changes in early on before the next LLVM release. Short summary: * Today Polly is already part of each LLVM

Inclusion of Polly and isl into core LLVM

2018 Jan 23

On Mon, 15 Jan 2018 22:44:45 +0100, Tobias Grosser via llvm-dev wrote: <snip> > * How stable/fast/… is Polly today > * We build all of AOSP with rather restrictive compile-time limits > * Bootstrapping time of clang is regressed by 6% (at most) > * Removal of scalar dependences is today very generic and must be > sped up in the future > * Polly still

[LLVMdev] [FastPolly]: Update of Polly's performance on LLVM test-suite

2013 Aug 11

On 08/10/2013 06:59 PM, Star Tan wrote: > Hi all, > > I have evaluated Polly's performance on LLVM test-suite with latest LLVM (r188054) and Polly (r187981). Results can be viewed on: http://188.40.87.11:8000. Hi Star Tan, thanks for the update. > There are mainly five new tests and each test is run with 10 samples: > clang (run id = 27): clang -O3 > pollyBasic (run id =

Determination of statements that contain only matrix multiplication

2016 May 28

Sorry for not responding earlier. On 05/20/2016 03:05 PM, Roman Gareev wrote: > Thank you very much for the advices! I could probably try to avoid > using of nonhardware prefetching in the project, if Tobias doesn’t > disagree with it. My understanding is that prefetching isn’t used > explicitly in [1] and, according to [2], in some cases 90% of the > turbo boost peak of the

[LLVMdev] [Polly] Compile-time of Polly's code generation

2013 Sep 08

At 2013-09-02 17:05:52,"Tobias Grosser" <tobias at grosser.es> wrote: >On 09/01/2013 08:02 PM, Star Tan wrote: >> Hi all, >> >> >> It seems that Polly's code generation can lead to high compile-time overhead, especially for PolyBench applications such as 2mm, 3mm, gemm, syrk, etc. Some basic evaluation and analysis for Polly's code generation

Determination of statements that contain only matrix multiplication

2016 May 20

2016-05-19 21:45 GMT+05:00 4lbert C0hen <4lbert.h.c0hen at gmail.com>: > One short note. I would advise against spending time on prefetching for x86. > Recent hardware prefetchers are amazingly good at strided accesses in > single-threaded code. Caution: this is not based on objective/published > data, but on personal experience. > > There are open challenges in

[LLVMdev] [FastPolly]: Update of Polly's performance on LLVM test-suite

2013 Aug 11

Hi all, I have evaluated Polly's performance on LLVM test-suite with latest LLVM (r188054) and Polly (r187981). Results can be viewed on: http://188.40.87.11:8000. There are mainly five new tests and each test is run with 10 samples: clang (run id = 27): clang -O3 pollyBasic (run id = 28): clang -O3 -load LLVMPolly.so pollyNoGen (run id = 29): pollycc -O3 -mllvm -polly-optimizer=none

(no subject)

2018 Jan 15

Dear LLVM community, hope all of you had a good start into 2018 and a quiet branching of LLVM 6.0. With the latest LLVM release out of the way and a longer development phase starting, we would like to restart the process of including Polly and isl into core LLVM to bring changes in early on before the next LLVM release. Short summary: * Today Polly is already part of each LLVM release (and

[LLVMdev] [Polly] Compile-time of Polly's code generation

2013 Sep 02

On 09/01/2013 08:02 PM, Star Tan wrote: > Hi all, > > > It seems that Polly's code generation can lead to high compile-time overhead, especially for PolyBench applications such as 2mm, 3mm, gemm, syrk, etc. Some basic evaluation and analysis for Polly's code generation can be found at http://llvm.org/bugs/show_bug.cgi?id=16898. > > > Currently, we can choose to

3.7.1-rc1 has been tagged. Let's begin testing!

2015 Nov 17

Hi, I have just tagged 3.7.1-rc1, so it is ready for testing. As a reminder, when doing regression testing, use the 3.7.0 release as your baseline. Thanks, Tom

[LLVMdev] [Polly] GSoC Proposal: Reducing LLVM-Polly Compiling overhead

2013 May 03

Dear Tobias, Thank you very much for your very helpful advice. Yes, -debug-pass and -time-passes are two very useful and powerful options when evaluating the compile-time of each compiler pass. They are exactly what I need! With these options, I can step into details of the compile-time overhead of each pass. I have finished some preliminary testing based on two randomly selected files from

[LLVMdev] [Polly] GSoC Proposal: Reducing LLVM-Polly Compiling overhead

2013 May 02

On 04/30/2013 04:13 PM, Star Tan wrote: > Hi all, [...] > How could I find out where the time is spent on between two adjacent Polly passes? Can anyone give me some advice? Hi Star Tan, I propose to do the performance analysis using the 'opt' tool and optimizing LLVM-IR, instead of running it from within clang. For the 'opt' tool there are two commands that should help

[LLVMdev] Enabling vectorization with LLVM 3.3 for a DSL emitting LLVM IR

2013 Jul 05

On 07/04/2013 01:39 PM, Stéphane Letz wrote: > Hi, > > Our DSL can generate C or directly generate LLVM IR. With LLVM 3.3, we can vectorize the C produced code using clang with -O3, or clang with -O1 then opt -O3 -vectorize-loops. But the same program generating LLVM IR version cannot be vectorized with opt -O3 -vectorize-loops. So our guess is that our generated LLVM IR lacks some

[LLVMdev] Enabling vectorization with LLVM 3.3 for a DSL emitting LLVM IR

2013 Jul 04

Hi, Our DSL can generate C or directly generate LLVM IR. With LLVM 3.3, we can vectorize the C produced code using clang with -O3, or clang with -O1 then opt -O3 -vectorize-loops. But the same program generating LLVM IR version cannot be vectorized with opt -O3 -vectorize-loops. So our guess is that our generated LLVM IR lacks some information that is needed by the vectorization passes to

[LLVMdev] [RFC] AArch64: Should we disable GlobalMerge?

2015 Feb 26

Hi all, I've started looking at the GlobalMerge pass, enabled by default on ARM and AArch64. I think we should reconsider that, at least for AArch64. As is, the pass just merges all globals together, in groups of 4KB (AArch64, 128B on ARM). At the time it was enabled, the general thinking was "it's almost free, it doesn't affect performance much, we might as well use it".
