search for: prefecthing

Displaying 6 results from an estimated 6 matches for "prefecthing".

2016 May 20
0
Determination of statements that contain only matrix multiplication
...d advise against spending time on prefetching for x86. > Recent hardware prefetchers are amazingly good at strided accesses in > single-threaded code. Caution: this is not based on objective/published > data, but on personal experience. > > There are open challenges in multiprocessor prefecthing, even for regularly > strided data, but these are probabably too ambitious to be tackled > effectively in the time frame of a SoC. There are lots of papers on this > however. > > Now, if you are targeting lower power processors, including most ARM v7a/v8 > implementations, prefetc...
2017 Feb 01
2
[RFC] IR-level Region Annotations
> On Jan 31, 2017, at 10:59 PM, Tian, Xinmin <xinmin.tian at intel.com> wrote: > > >   <> > From: mehdi.amini at apple.com <mailto:mehdi.amini at apple.com> [mailto:mehdi.amini at apple.com <mailto:mehdi.amini at apple.com>] > Sent: Tuesday, January 31, 2017 9:03 PM > To: Tian, Xinmin <xinmin.tian at intel.com <mailto:xinmin.tian at
2016 May 17
4
Determination of statements that contain only matrix multiplication
On 05/17/2016 01:47 PM, Michael Kruse wrote: > 2016-05-16 19:52 GMT+02:00 Roman Gareev <gareevroman at gmail.com>: >> Hi Tobias, >> >> could we use information about memory accesses of a SCoP statement and >> def-use chains to determine statements, which don’t contain matrix >> multiplication of the following form? > > Assuming s/don't/do you want
2017 Feb 01
2
[RFC] IR-level Region Annotations
From: mehdi.amini at apple.com [mailto:mehdi.amini at apple.com] Sent: Tuesday, January 31, 2017 9:03 PM To: Tian, Xinmin <xinmin.tian at intel.com> Cc: Sanjoy Das <sanjoy at playingwithpointers.com>; Adve, Vikram Sadanand <vadve at illinois.edu>; llvm-dev at lists.llvm.org; llvm-dev-request at lists.llvm.org Subject: Re: [llvm-dev] [RFC] IR-level Region Annotations On Jan 31,
2017 Feb 01
0
[RFC] IR-level Region Annotations
> On Jan 31, 2017, at 7:53 PM, Tian, Xinmin <xinmin.tian at intel.com> wrote: > > In this case, inliner is educated to add all local variables to the tag of enclosing parallel region, if there is enclosing parallel region. So isn’t it a good example that shows that your intrinsic *cannot* be opaque and that IR passes need to be modified to handle not only the IR-region
2017 Feb 01
2
[RFC] IR-level Region Annotations
In this case, inliner is educated to add all local variables to the tag of enclosing parallel region, if there is enclosing parallel region. In our icc implementation, it is even simple, as we have routine level symbol table, the inliner adds ”private” attribute to those local variables w/o checking enclosing scope, the parallelizer does check and use it. Xinmin From: mehdi.amini at apple.com