thr3ads.net - similar to: "[LLVMdev] MC-JIT Patches 1/3"

Displaying 20 results from an estimated 1000 matches similar to: "[LLVMdev] MC-JIT Patches 1/3"

2010 Aug 01

[LLVMdev] MC-JIT Patches 1/3

Hi Jan, Applied with edits in r109996, thanks! I wasn't happy about the change to add "Target" everywhere -- I just added a local hack in TargetSelect.h to work around this. We should really change the definition of LLVM_NATIVE_TARGET, but I am not in a configure hacking mood. - Daniel On Wed, Jul 28, 2010 at 10:39 AM, Jan Sjodin <jan_sjodin at yahoo.com> wrote: > I have

[LLVMdev] MC-JIT Patches 2/3

2010 Jul 28

This patch contains the initial implementation of MCJIT. - Jan [Attachment: 0019_mcjit.patch, text/x-diff, 42198 bytes, URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20100728/2eb6ac65/attachment.patch>]

[LLVMdev] MC-JIT Patches 2/3

2010 Aug 01

Hi Jan, I would rather not work with a patch this large. Can you pull out the addition of the MCJITStreamer into its own patch, and we can iterate on getting that in as a single commit? I realize it won't work or do anything useful, but I can't deal with reviewing patches this large. The main thing I am concerned about is getting the basic design of how the streamer and the assembler and

[LLVMdev] MC-JIT Streamer 1/3

2010 Aug 20

I was delayed creating the smaller patches, but I finally had some time to put the first set together. There are three small patches: the first two are classes the MCJITStreamer uses, and the last is the MCJITStreamer class itself. - Jan --- On Sun, 8/1/10, Daniel Dunbar <daniel at zuster.org> wrote: > From: Daniel Dunbar <daniel at zuster.org> > Subject: Re: [LLVMdev]

[LLVMdev] X86 FMA4

2012 Jul 25

We're migrating to LLVM 3.1 and trying to use the upstream FMA patterns. Why is VFMADDSD4 defined with vector types? Is this simply because the gcc intrinsic uses vector types? It's quite unnatural if you have a compiler that generates FMAs as opposed to requiring user intrinsics. -Dave

[LLVMdev] bdver1 cpu(bulldozer) support with dragonegg

2011 Dec 01

Better be quick! I am adding FMA4 and XOP now, and if you contribute code before I do, you can spare yourself some XOP merging. - Jan ----- Original Message ----- > From: David A. Greene <greened at obbligato.org> > To: Benjamin Kramer <benny.kra at googlemail.com> > Cc: llvmdev at cs.uiuc.edu > Sent: Thursday, December 1, 2011 12:19 PM > Subject: Re: [LLVMdev]

[LLVMdev] llvm Greater Toronto Area social

2012 May 02

8th will work for me. Can we pick a place that is not overly noisy? - Jan >________________________________ > From: Rafael Espíndola <rafael.espindola at gmail.com> >To: Ehsan Akhgari <ehsan.akhgari at gmail.com> >Cc: Jeff Muizelaar <jmuizelaar at mozilla.com>; clang-dev Developers <cfe-dev at cs.uiuc.edu>; "Minard, Brian" <brian.minard at

[LLVMdev] X86 FMA4

2012 Jul 26

Ah, bad example. This is a general problem for all (maybe most) SSE and AVX SS/SD patterns though, which is why I mentioned Sandybridge. You can swap out VFMADDSD in my example for VADDSD or whatever you like. I have a lion's share of such a change implemented already and performance is greatly affected. If the community is interested in this change, I would be happy to prepare a patch.

[LLVMdev] X86 FMA4

2012 Jul 27

Just looked up the numbers from Agner Fog for Sandy Bridge for vmovaps/etc. for loading/storing from memory. vmovaps (load) takes 1 load μop, with a latency of 3 and a reciprocal throughput of 0.5. vmovaps (store) takes 1 store μop plus 1 load μop for address calculation, with a latency of 3 and a reciprocal throughput of 1. He does not list vmovsd, but movsd has the same stats as vmovaps, so I feel it is a
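To make the quoted figures concrete, here is a rough back-of-the-envelope sketch in C++. It assumes a deliberately simple model that is not part of the original message: independent instructions are limited only by reciprocal throughput, and a dependency chain is limited only by latency. The function and constant names are hypothetical, and real hardware behaviour depends on many other factors.

```cpp
#include <cassert>

// Rough steady-state estimate for n independent instructions: they issue
// back to back, so the cost is n * reciprocal throughput (cycles/instruction).
inline double throughputBoundCycles(int n, double reciprocalThroughput) {
  assert(n >= 0);
  return n * reciprocalThroughput;
}

// Rough estimate for a chain of n dependent instructions: each one waits for
// the previous result, so the cost is n * latency.
inline double latencyBoundCycles(int n, double latencyCycles) {
  assert(n >= 0);
  return n * latencyCycles;
}

// Figures as quoted in the message above (Sandy Bridge, vmovaps).
const double kLoadReciprocalThroughput = 0.5;   // two loads per cycle
const double kStoreReciprocalThroughput = 1.0;  // one store per cycle
const double kLoadLatency = 3.0;
```

Under this model, 100 independent loads cost about `throughputBoundCycles(100, kLoadReciprocalThroughput)` cycles, while 100 loads that each depend on the previous one cost about `latencyBoundCycles(100, kLoadLatency)` cycles, which is why both numbers matter when judging the extra moves discussed in this thread.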

[LLVMdev] Problem building llvm after r109996 (Add InitializeNativeTargetAsmPrinter())

2010 Aug 02

Hello, After trying a clean build, I got the following error: llvm[3]: Compiling EDDisassembler.cpp for Release build In file included from /llvm/include/llvm/Target/TargetSelect.h:38, from /llvm/lib/MC/MCDisassembler/EDDisassembler.cpp:37: /llvm/stage1/include/llvm/Config/AsmPrinters.def: In function ‘void LLVMInitializeX86TargetAsmPrinter()’:

[LLVMdev] bdver1 cpu(bulldozer) support with dragonegg

2011 Dec 01

Jan Sjodin <jan_sjodin at yahoo.com> writes: > Better be quick! I am adding FMA4 and XOP now, and if you contribute > code before I do, you can spare yourself some XOP merging. Go ahead. We're not going to get there soon enough. :( -Dave

[LLVMdev] llvm Greater Toronto Area social

2012 May 02

On 2 May 2012 16:57, Jan Sjodin <jan_sjodin at yahoo.com> wrote: > 8th will work for me. Can we pick a place that is not overly noisy? http://www.harbordhouse.ca should be fine, but let us know if you have another suggestion. > - Jan > Cheers, Rafael

[LLVMdev] Problem building llvm after r109996 (Add InitializeNativeTargetAsmPrinter())

2010 Aug 02

Hi Jean-Daniel, My fault, I'm sure, but I don't see the problem yet. Is it possible your version of llvm/Config/AsmPrinters.def has X86 listed twice? - Daniel On Mon, Aug 2, 2010 at 12:43 AM, Jean-Daniel Dupas <devlists at shadowlab.org> wrote: > Hello, > > After trying a clean build, I got the following error: > > llvm[3]: Compiling EDDisassembler.cpp for
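For readers unfamiliar with `.def` files, the question about a duplicated entry makes more sense with the macro-list (X-macro) pattern in mind. The sketch below is self-contained C++ and does not use LLVM's actual headers: the list macro, the `Demo...` function names, and the two entries are all hypothetical stand-ins for a generated file such as llvm/Config/AsmPrinters.def. It only illustrates how each list entry is expanded into code, so a duplicated or misnamed entry shows up as a compile error in the expansion rather than in the list itself.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a generated target list. In a real build this
// would live in a separate .def file produced by the configure step.
#define DEMO_ASM_PRINTERS(X) \
  X(X86)                     \
  X(ARM)

// Expansion 1: define one initializer per listed target. If a target were
// listed twice, this expansion would emit two definitions of the same
// function and the compiler would report a redefinition.
#define DEMO_DEFINE_INIT(Name)                                                 \
  inline void DemoInitialize##Name##AsmPrinter(std::vector<std::string> &log) { \
    log.push_back(#Name);                                                      \
  }
DEMO_ASM_PRINTERS(DEMO_DEFINE_INIT)
#undef DEMO_DEFINE_INIT

// Expansion 2: call every initializer, in list order.
inline std::vector<std::string> demoInitializeAllAsmPrinters() {
  std::vector<std::string> log;
#define DEMO_CALL_INIT(Name) DemoInitialize##Name##AsmPrinter(log);
  DEMO_ASM_PRINTERS(DEMO_CALL_INIT)
#undef DEMO_CALL_INIT
  assert(!log.empty());  // sanity check: the list expanded to at least one target
  return log;
}
```

Calling `demoInitializeAllAsmPrinters()` returns the target names in the order they appear in the list. Whether a duplicate entry was the actual cause of the build failure reported in this thread is not established by the snippet.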

[LLVMdev] X86 FMA4

2012 Jul 26

Hey Jan and Dave, It's not obvious, but there is a significant scalar performance issue following the GCC intrinsics. Let's look at the VFMADDSD pattern. We're operating on scalars with undefineds as the remaining vector elements of the operands. This sounds okay, but when one looks closer...

    vmovsd  fp4_+1088(%rip), %xmm3    # fpppp.f:647
    vmovaps %xmm3, 18560(%rsp)

[LLVMdev] X86 FMA4

2012 Jul 26

Because the intrinsics use vector types (same as gcc). - Jan ----- Original Message ----- > From: "dag at cray.com" <dag at cray.com> > To: llvmdev at cs.uiuc.edu > Cc: > Sent: Wednesday, July 25, 2012 3:26 PM > Subject: [LLVMdev] X86 FMA4 > > We're migrating to LLVM 3.1 and trying to use the upstream FMA patterns. > > Why is VFMADDSD4

[LLVMdev] X86 FMA4

2012 Jul 27

Hey Michael, Thanks for the legwork! It appears that the stats you listed are for movaps [SSE], not vmovaps [AVX]. I would *assume* that vmovaps(m128) is closer to vmovaps(m256), since they are both AVX instructions. Although, yes, I agree that this is not clear from Agner's report. Please correct me if I am misunderstanding. As I am sure you are aware, we cannot use SSE (movaps)

[LLVMdev] bdver1 cpu(bulldozer) support with dragonegg

2011 Dec 01

That is too bad. :( You can always review the patches, and if you see something that can be done better let me know. - Jan ----- Original Message ----- > From: David A. Greene <greened at obbligato.org> > To: Jan Sjodin <jan_sjodin at yahoo.com> > Cc: David A. Greene <greened at obbligato.org>; Benjamin Kramer <benny.kra at googlemail.com>; "llvmdev at

[RFC] Adding thread group semantics to LangRef (motivated by GPUs)

2019 Feb 09

On Sat, Feb 9, 2019 at 4:44 PM Jan Sjodin <jan_sjodin at yahoo.com> wrote: > > The reason I'm looking for solutions that can work without "scanning the > > code" or "spooky action at a distance" is that we should have a solution > > that's easily digestible by folks who are not aware of GPU execution > models. > > > > The fallback

[RFC] Adding thread group semantics to LangRef (motivated by GPUs)

2019 Feb 01

On 31.01.19 15:59, Jan Sjodin wrote: >> > Any transform that re-arranges control flow would potentially have to >> > know about the properties of ballot(), and the rules with respect to >> > the CFG (and maybe consider the target) to know where to insert the >> > intrinsics. > >> But the same is true for basically any approach to handling this. In

[LLVMdev] Announcing: LLVM 2.9 Tentative Release Schedule

2011 Feb 25

On Feb 24, 2011, at 4:05 AM, Jan Sjodin wrote: > On Feb 19, 2011, at 8:05 PM, Yuri wrote: >> >>> On 02/19/2011 14:52, Yuri wrote: >>>> Will MC path for JNI be included in 2.9? >>>> >>> >>> Sorry. I meant: Will MC path for JIT be included in 2.9? >> >> While it would be nice, it doesn't seem like anyone is working on