similar to: [LLVMdev] Adjusting Load Latencies

Displaying 20 results from an estimated 20000 matches similar to: "[LLVMdev] Adjusting Load Latencies"

2012 Mar 02
0
[LLVMdev] Adjusting Load Latencies
On Mar 2, 2012, at 9:01 AM, Hal Finkel <hfinkel at anl.gov> wrote: > Hello, > > I am interested in writing an analysis pass that looks at the stride > used for loads in a loop and passes that information down so that it > can be used by the instruction scheduler. The reason is that if the > load stride is greater than the cache line size, then I would expect > the load
2012 Mar 02
1
[LLVMdev] Adjusting Load Latencies
On Fri, 02 Mar 2012 13:49:48 -0800 Andrew Trick <atrick at apple.com> wrote: > On Mar 2, 2012, at 9:01 AM, Hal Finkel <hfinkel at anl.gov> wrote: > > > Hello, > > > > I am interested in writing an analysis pass that looks at the stride > > used for loads in a loop and passes that information down so that it > > can be used by the instruction
2018 Nov 07
3
RFC: System (cache, etc.) model for LLVM
On Mon, Nov 5, 2018 at 10:26, David Greene <dag at cray.com> wrote: > Yes, I agree the terminology is confusing. I used the term "stream" in > the sense of stream processing (https://en.wikipedia.org/wiki/Stream_processing). > The programming model is very different, of course, but the idea of a > stream of data that is acted upon and then essentially discarded
2018 Nov 01
2
RFC: System (cache, etc.) model for LLVM
Hi, thank you for sharing the system hierarchy model. IMHO it makes a lot of sense, although I don't know which of today's passes would make use of it. Here are my remarks. I am wondering how one could model the following features using this model, or whether they should be part of a performance model at all: * ARM's big.LITTLE * NUMA hierarchies (are the NUMA domains
2018 Nov 02
2
RFC: System (cache, etc.) model for LLVM
On Thu, Nov 1, 2018 at 16:56, David Greene <dag at cray.com> wrote: > Ok. I would like to start posting patches for review without > speculating too much on fancy/exotic things that may come later. We > shouldn't do anything that precludes extensions but I don't want to get > bogged down in a lot of details on things related to a small number of > targets.
2018 Nov 01
3
RFC: System (cache, etc.) model for LLVM
On Thu, Nov 1, 2018 at 15:21, David Greene <dag at cray.com> wrote: > > thank you for sharing the system hierarchy model. IMHO it makes a lot > > of sense, although I don't know which of today's passes would make use > > of it. Here are my remarks. > > LoopDataPrefetch would use it via the existing TTI interfaces, but I > think that's about it
2009 Feb 12
8
Xen 3.3.1 Windows HVM Disk I/O -> domU and dom0 hangs
Hi, we are currently working on getting windows working on your xen servers. but we are facing a severe problem where dom0 and all domus hang for 1-5 seconds from time to time. we think it is probably because of disk i/o, because top sometimes says 100% wa (waiting on io) during the hang. dom0 has cpu 0 for exclusive use and the windows vms use cpu 1 to 7. should we give dom0 more than once
2011 Jun 27
4
How many L1/L2 my cpu have ?
Hi Could anybody explain me how to check how many L1/L2 cache my cpu have. I'm using CentOS 5.6 *cat /proc/cpuinfo |grep CPU * model name : Intel(R) Core(TM)2 Duo CPU T9300 @ 2.50GHz model name : Intel(R) Core(TM)2 Duo CPU T9300 @ 2.50GHz Diagram of a generic dual-core processor, with CPU-local level 1 caches, and a shared, on-die level 2 cache.
2010 Jul 28
6
Read ahead / prefetching
Hi, I am trying to educate myself on prefetching/readahead algorithm for Lustre's read. For a starter I only have two simple questions. 1 - Does Lustre detect linear or random I/O pattern or it always triggers readahead? 2 - If readahead is triggered, how many pages are read in addition to what is necessary? Thanks, Arifa.
2015 Nov 18
2
[PATCH] virtio_ring: Shadow available ring flags & index
On Mon, Nov 16, 2015 at 7:46 PM, Xie, Huawei <huawei.xie at intel.com> wrote: > On 11/14/2015 7:41 AM, Venkatesh Srinivas wrote: > > On Wed, Nov 11, 2015 at 02:34:33PM +0200, Michael S. Tsirkin wrote: > >> On Tue, Nov 10, 2015 at 04:21:07PM -0800, Venkatesh Srinivas wrote: > >>> Improves cacheline transfer flow of available ring header. > >>> >
2015 Nov 13
2
[PATCH] virtio_ring: Shadow available ring flags & index
On Wed, Nov 11, 2015 at 02:34:33PM +0200, Michael S. Tsirkin wrote: > On Tue, Nov 10, 2015 at 04:21:07PM -0800, Venkatesh Srinivas wrote: > > Improves cacheline transfer flow of available ring header. > > > > Virtqueues are implemented as a pair of rings, one producer->consumer > > avail ring and one consumer->producer used ring; preceding the > > avail ring
2016 May 28
1
Determination of statements that contain only matrix multiplication
Sorry for not responding earlier. On 05/20/2016 03:05 PM, Roman Gareev wrote: > Thank you very much for the advices! I could probably try to avoid > using of nonhardware prefetching in the project, if Tobias doesn’t > disagree with it. My understanding is that prefetching isn’t used > explicitly in [1] and, according to [2], in some cases 90% of the > turbo boost peak of the
2007 Mar 02
3
3.0.4 ACPI support and Opteron 2210 ?
Hello, I originally posted this to xen-users, but someone suggested I post it here. I am having ACPI problems on a PenguinComputing Altus1600 system. It has 2x dual core Opteron 2210 processors. The system boots with a standard Debian or Ubuntu SMP kernel, with ACPI enabled. However the xen live cd, binary xen install, as well as my own custom compile of xen 3.0.4 from source will not boot.
2007 Dec 11
2
nut-2.2.1-pre2
Shamelessly reusing the announcement Arnaud sent about three months ago for nut-2.2.1: "We're preparing to release 2.2.1-pre2, so if you have some fixes to backport on Testing, consider announcing it and doing asap. As always, compatibilities update and bugfixes only!" Regards, Arjen -- Eindhoven - The Netherlands Key fingerprint - 66 4E 03 2C 9D B5 CB 9B 7A FE 7E C1
2009 May 01
1
integrate with large parameters
Dear R-users, i have to integrate the following function `fun1` <- function (a, l1, l2) { exp(log(l1) * (a - 1) - l2 * lgamma(a)) } but if l1 is large, i get the "non-finite function value" error, so my idea is to rescale with exp(-l1) `fun2` <- function (a, l1, l2) { exp(log(l1) * (a - 1) - l2 * lgamma(a) - l1) } but it seems this doesn't solve the problem, when
2016 May 20
0
Determination of statements that contain only matrix multiplication
2016-05-19 21:45 GMT+05:00 4lbert C0hen <4lbert.h.c0hen at gmail.com>: > One short note. I would advise against spending time on prefetching for x86. > Recent hardware prefetchers are amazingly good at strided accesses in > single-threaded code. Caution: this is not based on objective/published > data, but on personal experience. > > There are open challenges in
2016 May 17
4
Determination of statements that contain only matrix multiplication
On 05/17/2016 01:47 PM, Michael Kruse wrote: > 2016-05-16 19:52 GMT+02:00 Roman Gareev <gareevroman at gmail.com>: >> Hi Tobias, >> >> could we use information about memory accesses of a SCoP statement and >> def-use chains to determine statements, which don’t contain matrix >> multiplication of the following form? > > Assuming s/don't/do you want
2005 Feb 21
5
Compare rows of two matrices
Hello, #I have two matrices, eg.: y <- matrix( c(20, NA, NA, 45, 50, 19, 32, 101, 10, 22, NA, NA, 80, 49, 61, 190), ncol=4 ) x <- matrix( c(20, NA, NA, NA, 50, 19, 32, 101, 10, 22, NA, NA, 80, 49, 61, 190), ncol=4 ) #Whereas x contains all NA's from y plus some additional NA's. #I want to find the index of these additional NA's. I think, there must be a very