similar to: [LLVMdev] Floats as Doubles in Vectors

Displaying 20 results from an estimated 4000 matches similar to: "[LLVMdev] Floats as Doubles in Vectors"

2013 Jan 28
0
[LLVMdev] Floats as Doubles in Vectors
Hi Hal, I am not familiar with Blue Gene at all but I will do my best to answer. > is going to assume that the vectors can hold 256/32 == 8 single-precision values even though the real maximum is 4 values. How should we handle this? The loop vectorizer uses TTI to find the size of the vector register. It uses this number to calculate the maximum vectorization factor. The formula is MaxVF =
2013 Jan 28
2
[LLVMdev] Floats as Doubles in Vectors
----- Original Message ----- > From: "Nadav Rotem" <nrotem at apple.com> > To: "Hal Finkel" <hfinkel at anl.gov> > Cc: "LLVM Developers Mailing List" <llvmdev at cs.uiuc.edu> > Sent: Sunday, January 27, 2013 9:16:42 PM > Subject: Re: Floats as Doubles in Vectors > > Hi Hal, > > I am not familiar with Blue Gene at all but
2013 Jan 28
0
[LLVMdev] Floats as Doubles in Vectors
>> >> I think that the best approach would be to improve the cost model >> because on blue gene <8 x float> ops would be more expensive than <4 >> x float>. > > I agree. The cost model needs to reflect this fact, I am wondering only about things to be done on top of that. Why should the loop vectorizer (or the BB vectorizer for that matter) spend time
2018 Jun 01
2
[VPlan] about vectorization factor selection
Hi, Current loop vectorizer uses a range of vectorization factors computed by MaxVF. For each VF, it setups unform and scalar info before building VPlan and the final best VF selection. The best VF is also selected within the VF range. for (unsigned VF = 1; VF <= MaxVF; VF *= 2) { // Collect Uniform and Scalar instructions after vectorization with VF.
2009 Dec 01
0
[LLVMdev] Possible bug in ExpandShiftWithUnknownAmountBit
Hi, > I'm working in adding support for 64-bit integers to my target. I'm using > LLVM to decompose the 64-bit integer operations by using 32-bit registers > wherever possible and emulating support where not. When looking at the bit > shift decomposition I saw what seems to be a bug in the implementation. The > affected function is ExpandShiftWithUnknownAmountBit in >
2009 Dec 01
4
[LLVMdev] Possible bug in ExpandShiftWithUnknownAmountBit
Hello, I'm working in adding support for 64-bit integers to my target. I'm using LLVM to decompose the 64-bit integer operations by using 32-bit registers wherever possible and emulating support where not. When looking at the bit shift decomposition I saw what seems to be a bug in the implementation. The affected function is ExpandShiftWithUnknownAmountBit in LegalizeIntegerTypes.cpp.
2014 Dec 13
2
[LLVMdev] Vectorization factor limitation in Loop Vectorizer
So IMO, if we modify the VF calculation for targets/subtargets using TTI where higher VF is supported The vectorizer’s scope will become wider. Did/do you foresee any issue with this? Thanks, Shahid From: Nadav Rotem [mailto:nrotem at apple.com] Sent: Saturday, December 13, 2014 2:47 AM To: Shahid, Asghar-ahmad Cc: llvmdev at cs.uiuc.edu Subject: Re: [LLVMdev] Vectorization factor limitation in
2016 Jan 05
3
TargetTransformInfo getOperationCost uses
Hi, I'm trying to implement the TTI hooks for AMDGPU to avoid unrolling loops for operations with huge expansions (i.e. integer division). The values that are ultimately reported by opt -cost-model -analyze (the actual cost model tests) seem to not matter for this. The huge cost I've assigned division doesn't prevent the loop from being unrolled, because it isn't actually
2014 Dec 11
2
[LLVMdev] Vectorization factor limitation in Loop Vectorizer
Hi Nadav/Devs I am exploring Loop Vectorizer to vectorize i8 scalar operations into 8xi8 vector operation. I was expecting the Loop Vectorizer to analyze the profitability for vectorization factor(VF) of 8, However it is not doing so due to the widest type calculation done for the blocks inside the loop. May be I am missing something, however, I am curious to know why Loop Vectorizer limits the
2000 Aug 29
5
Optimization and doubles vs. floats
I saw some mail go by a bit ago about doubles-vs-floats, but I seem to have lost it. I'm interested in rewriting the mdct code using Altivec on MacOS X. Altivec doesn't support doubles, though -- the only floating point vector type is single precision floats. Vorbis currently has doubles everywhere -- is this really necessary? Doubles are supposedly faster than floats in the PPC
2010 Nov 03
1
[LLVMdev] LLVM x86 Code Generator discards Instruction-level Parallelism
Dear LLVMdev, I've noticed an unusual behavior of the LLVM x86 code generator (with default options) that results in nearly a 4x slow-down in floating-point throughput for my microbenchmark. I've written a compute-intensive microbenchmark to approach theoretical peak throughput of the target processor by issuing a large number of independent floating-point multiplies. The distance
2014 May 14
0
Bug#748052: Bug#748052: xen-hypervisor-4.3-amd64: No USB keyboard after booting into Dom0
Control: tag -1 +moreinfo On Tue, 2014-05-13 at 08:55 -0700, Mike Egglestone wrote: > After booting into kernel 3.13 dom0, the keyboard no longer works. > All other aspects work, ssh is needed to login to the system to > start working with the domU's. > They keyboard works at the grub menu just fine, and > works normal when booting into the non xen kernel. I think at a minimum
2011 Sep 17
2
[LLVMdev] Pre-Allocation Schedulers in LLVM
Hi, I am currently writing a paper documenting a research project that we have done on pre-allocation instruction scheduling to balance ILP and register pressure. In the paper we compare the pre-allocation scheduler that we have developed to LLVM's default schedulers for two targets: x86-64 and x86-32. We would like to include in our paper some brief descriptions of the two LLVM
2011 Sep 21
0
[LLVMdev] Pre-Allocation Schedulers in LLVM
On Sep 17, 2011, at 10:07 AM, Ghassan Shobaki wrote: > Hi, > > I am currently writing a paper documenting a research project that we have done on pre-allocation instruction scheduling to balance ILP and register pressure. In the paper we compare the pre-allocation scheduler that we have developed to LLVM's default schedulers for two targets: x86-64 and x86-32. We would like to
2011 Sep 23
2
[LLVMdev] Pre-Allocation Schedulers in LLVM
Hi Andrew, What we have is not a patch to any of LLVM's schedulers. We have implemented our own scheduler and integrated it into LLVM 2.9 as yet-another scheduler. Our scheduler uses a combinatorial optimization approach to balance ILP and register pressure. In one experiment, we added more precise latency information for most common x86 instructions to our scheduler and noticed a 10%
2012 Sep 29
7
[LLVMdev] LLVM's Pre-allocation Scheduler Tested against a Branch-and-Bound Scheduler
Hi, We are currently working on revising a journal article that describes our work on pre-allocation scheduling using LLVM and have some questions about LLVM's pre-allocation scheduler. The answers to these question will help us better document and analyze the results of our benchmark tests that compare our algorithm with LLVM's pre-allocation scheduling algorithm. First, here is a
2012 Sep 29
0
[LLVMdev] LLVM's Pre-allocation Scheduler Tested against a Branch-and-Bound Scheduler
On Sep 29, 2012, at 2:43 AM, Ghassan Shobaki <ghassan_shobaki at yahoo.com> wrote: > Hi, > > We are currently working on revising a journal article that describes our work on pre-allocation scheduling using LLVM and have some questions about LLVM's pre-allocation scheduler. The answers to these question will help us better document and analyze the results of our benchmark
2011 Sep 23
0
[LLVMdev] Pre-Allocation Schedulers in LLVM
On Sep 23, 2011, at 6:16 AM, Ghassan Shobaki wrote: > Hi Andrew, > > What we have is not a patch to any of LLVM's schedulers. We have implemented our own scheduler and integrated it into LLVM 2.9 as yet-another scheduler. Our scheduler uses a combinatorial optimization approach to balance ILP and register pressure. In one experiment, we added more precise latency information for
2011 Sep 26
1
[LLVMdev] Pre-Allocation Schedulers in LLVM
Hi Andy, Please see my in-line answers below. Regards -Ghassan ________________________________ From: Andrew Trick <atrick at apple.com> To: Ghassan Shobaki <ghassan_shobaki at yahoo.com> Cc: "llvmdev at cs.uiuc.edu" <llvmdev at cs.uiuc.edu> Sent: Friday, September 23, 2011 8:02 PM Subject: Re: [LLVMdev] Pre-Allocation Schedulers in LLVM On Sep 23, 2011, at
2011 Dec 19
2
[LLVMdev] specializing hybrid_ls_rr_sort (was: Re: Bottom-Up Scheduling?)
On Mon, 2011-12-19 at 07:41 -0800, Andrew Trick wrote: > On Dec 19, 2011, at 6:51 AM, Hal Finkel <hfinkel at anl.gov> wrote: > > > On Tue, 2011-10-25 at 21:00 -0700, Andrew Trick wrote: > > Now, to generate the best PPC schedules, there is one thing you may > >> want to override. The scheduler's priority function has a > >> HasReadyFilter attribute