thr3ads.net - similar to: "devid support for EFI partition improved zfs usibility"

Displaying 20 results from an estimated 2000 matches similar to: "devid support for EFI partition improved zfs usibility"

[LLVMdev] Documentation of fmuladd intrinsic

2013 Jan 11

[LLVMdev] Documentation of fmuladd intrinsic

On Fri, Jan 11, 2013 at 1:08 PM, Andrew Booker <andrew.booker at arm.com>wrote: > The fmuladd intrinsic is described as saying that a multiply and > addition sequence can be fused into an fma instruction "if the code > generator determines that the fused expression would be legal and > efficient". (http://llvm.org/docs/LangRef.html#llvm-fma-intrinsic) > >

[LLVMdev] Question about FMA formation

2012 Dec 12

[LLVMdev] Question about FMA formation

Hi, Dear All: I'm going implement FMA formation. On some architectures, "FMA a, b, c" is more precise than "a * b + c". I'm wondering if FMA could be less precise. In the former case, can we enable FMA formation despite restrictive FP mode? Thanks Shuxin

[LLVMdev] Documentation of fmuladd intrinsic

2013 Jan 11

[LLVMdev] Documentation of fmuladd intrinsic

----- Original Message ----- > From: "Cameron McInally" <cameron.mcinally at nyu.edu> > To: "Andrew Booker" <andrew.booker at arm.com> > Cc: llvmdev at cs.uiuc.edu > Sent: Friday, January 11, 2013 12:37:07 PM > Subject: Re: [LLVMdev] Documentation of fmuladd intrinsic > > > On Fri, Jan 11, 2013 at 1:08 PM, Andrew Booker < >

[LLVMdev] Clarifying FMA-related TargetOptions

2012 Feb 08

[LLVMdev] Clarifying FMA-related TargetOptions

Hello everyone, I'd like to propose the attached patch to form FMA intrinsics aggressively, but in order to do so I need some clarification on the intended semantics for the various FP precision-related TargetOptions. I've summarized the three relevant ones below: UnsafeFPMath - Defaults to off, enables "less precise" results than permitted by IEEE754. Comments specifically

[X86] FMA transformation restrictions

2016 Sep 12

[X86] FMA transformation restrictions

I noticed that the operand commuting code in X86InstrInfo.cpp treats scalar FMA intrinsics specially. It prevents operand commuting on these scalar instructions because the scalar FMA instructions preserve the upper bits of the vector. Presumably, the restrictions are there because commuting operands potentially changes the result upper bits. However, AFAIK the Intel and GNU FMA intrinsics

[LLVMdev] Question about FMA formation

2012 Dec 13

[LLVMdev] Question about FMA formation

A little background: The fmuladd intrinsic was introduced to support the FP_CONTRACT pragma in C. llvm.fmuladd.* is generated by clang when it sees an expression of the form 'a * b + c' within a single source statement. If you want to opportunistically form FMA target instructions my inclination would be to skip llvm.fmuladd.* and just form them from a*b+c expressions at isel time. I

what does -ffp-contract=fast allow?

2016 Nov 18

what does -ffp-contract=fast allow?

----- Original Message ----- > From: "Sanjay Patel" <spatel at rotateright.com> > To: "Hal J. Finkel" <hfinkel at anl.gov> > Cc: "Mehdi Amini" <mehdi.amini at apple.com>, "llvm-dev" > <llvm-dev at lists.llvm.org>, "cfe-dev" <cfe-dev at lists.llvm.org>, > "andrew kaylor" <andrew.kaylor at

[LLVMdev] Best way for JIT to query whether llvm.fma.* is fast?

2014 Dec 10

[LLVMdev] Best way for JIT to query whether llvm.fma.* is fast?

Thanks! That’s probably close enough for practical purposes. I looked at the overrides on various targets, and they all return true if the FMA hardware exists. - Arch From: Jingyue Wu [mailto:jingyue at google.com] Sent: Wednesday, December 10, 2014 2:56 PM To: Robison, Arch Cc: llvmdev at cs.uiuc.edu Subject: Re: [LLVMdev] Best way for JIT to query whether llvm.fma.* is fast? Does

what does -ffp-contract=fast allow?

2016 Nov 18

what does -ffp-contract=fast allow?

Sent from my Verizon Wireless 4G LTE DROID On Nov 17, 2016 5:53 PM, Mehdi Amini <mehdi.amini at apple.com<mailto:mehdi.amini at apple.com>> wrote: > > >> On Nov 17, 2016, at 4:33 PM, Hal Finkel <hfinkel at anl.gov<mailto:hfinkel at anl.gov>> wrote: >> >> >> ________________________________ >>> >>> From: "Warren

[LLVMdev] Documentation of fmuladd intrinsic

2013 Jan 11

[LLVMdev] Documentation of fmuladd intrinsic

Out of curiosity, what is the use-case for isFMAFasterThanMulAndAdd? If a target declares that FMA is actually slower for a given type, why not just declare it as illegal for that type? Wouldn't that accomplish the same thing without another target hook? I feel like I'm missing something here. On Fri, Jan 11, 2013 at 2:40 PM, Hal Finkel <hfinkel at anl.gov> wrote: > -----

Commit history duplicated, seeing weird diffusion activity (Was: [Diffusion] rG67c416dc9a5a: [DebugInfo] Allow spill slots in call site parameter descriptions)

2019 Nov 15

Commit history duplicated, seeing weird diffusion activity (Was: [Diffusion] rG67c416dc9a5a: [DebugInfo] Allow spill slots in call site parameter descriptions)

I just got a Diffusion notification about a change of mine being reverted by Fangrui, but I'm not sure that's actually what happened and am confused and concerned. My commit was "[DebugInfo] Allow spill slots in call site parameter descriptions", and it appears in the history under two hashes: 1ee84e and 67c416. The first commit contains the actual change. The second touches

[LLVMdev] Clarifying FMA-related TargetOptions

2012 Feb 08

[LLVMdev] Clarifying FMA-related TargetOptions

Hi Owen, Having looked into this due to Clang failing PlumHall with it recently I can give an opinion... I think !NoExcessFPPrecision covers FMA completely. There are indeed some algorithms which give incorrect results when FMA is enabled, examples being those that do floating point comparisons such as: a * b + c - d. If c == d, it is still possible for that result not to equal a*b, as "+c

[LLVMdev] Tablegen bug???

2012 Nov 30

[LLVMdev] Tablegen bug???

Should tablegen detect this as an error, or is it documented as a limitation somewhere that we've missed? In the tablegen-generated file AMDILGenIntrinsics.inc, we have a bunch of if statements comparing strings, many of which are dead, preventing correct recognition of some intrinsics in the their text form. I'm not quite sure what GET_FUNCTION_RECOGNIZER is used for, but if it's

FMA canonicalization in IR

2016 Nov 19

FMA canonicalization in IR

Sent from my Verizon Wireless 4G LTE DROID On Nov 19, 2016 10:26 AM, Sanjay Patel <spatel at rotateright.com<mailto:spatel at rotateright.com>> wrote: > > If I have my FMA intrinsics story straight now (thanks for the explanation, Hal!), I think it raises another question about IR canonicalization (and may affect the proposed revision to IR FMF): No, I think that we specifically

[LLVMdev] Question about FMA formation

2012 Dec 13

[LLVMdev] Question about FMA formation

On Dec 12, 2012, at 5:20 PM, Eric Christopher <echristo at gmail.com> wrote: > Why not just form them via a fast IR level pass and just have patterns match in fast isel instead of trying to form code? Or are we saying the same thing? (Your words of "fast isel spot"ting and "form better code" caused me to think you mean to do optimizations within the fast isel pass).

[LLVMdev] API break for out-of-tree targets implementing TargetLoweringBase::isFMAFasterThanMulAndAdd

2013 Jul 08

[LLVMdev] API break for out-of-tree targets implementing TargetLoweringBase::isFMAFasterThanMulAndAdd

Hello, To any out-of-tree targets, please be aware that I intend to commit a patch that will break the build of any target implementing TargetLoweringBase::isFMAFasterThanMulAndAdd, for the reasons described below. (Basically, the current interface definition is broken and not followed, and no in-tree target was doing the right thing with it, so it is unlikely any out-of-tree target is either...)

FMA canonicalization in IR

2016 Nov 20

FMA canonicalization in IR

The potential advantage I was considering would be more accurate cost modeling in the vectorizer, inliner, etc. Like min/max, this is another case where the sum of the IR parts is greater than the actual cost. Beyond that, it seems odd to me that we'd choose the longer IR expression of something that could be represented in a minimal form. I know we make practical concessions in IR based on

[LLVMdev] Tablegen bug???

2012 Nov 30

[LLVMdev] Tablegen bug???

If the source being scanned has "llvm.AMDIL.barrier.global, it will match the first barrier test and return AMDIL_barrier, not AMDIL_barrier_global. On Nov 29, 2012, at 7:19 PM, Chris Lattner <clattner at apple.com> wrote: > Out of curiosity, what is wrong about that? It looks ok to me. > > -Chris > > On Nov 29, 2012, at 6:52 PM, "Relph, Richard"

RFC: change -fp-contract=off to actually disable FMAs

2019 Jul 10

RFC: change -fp-contract=off to actually disable FMAs

> I think you have a different definition of fused then. Fused is a description of how the operation is computed/rounded, not an instruction count. "Only fuse FP ops when the result won't be affected" is what the existing comment says. So it can't be both a fused op and not a fused op if it's only meant to imply a difference in rounding. I'm just re-using the

[LLVMdev] Question about FMA formation

2012 Dec 13

[LLVMdev] Question about FMA formation

On Dec 12, 2012, at 3:40 PM, Shuxin Yang <shuxin.llvm at gmail.com> wrote: > Hi, Dear All: > > I'm going implement FMA formation. On some architectures, "FMA a, b, c" is more precise than > "a * b + c". I'm wondering if FMA could be less precise. In the former case, can we enable FMA > formation despite restrictive FP mode? > I believe

similar to: devid support for EFI partition improved zfs usibility