thr3ads.net - similar to: "[LLVMdev] MC-JIT Streamer 1/3"

2010 Aug 20

[LLVMdev] MC-JIT Streamer 1/3

I was delayed creating the smaller patches, but finally I had some time to put the first set together. There are three small patches, the first two are classes the MCJITStreamer uses, and the last patch is the MCJITStreamer class itself.
- Jan
--- On Sun, 8/1/10, Daniel Dunbar <daniel at zuster.org> wrote:
> From: Daniel Dunbar <daniel at zuster.org>
> Subject: Re: [LLVMdev]

2010 Aug 01

[LLVMdev] MC-JIT Patches 2/3

Hi Jan, I would rather not work with a patch this large. Can you pull out the addition of the MCJITStreamer into its own patch, and we can iterate on getting that in as a single commit? I realize it won't work or do anything useful, but I can't deal with reviewing patches this large. The main thing I am concerned about is getting the basic design of how the streamer and the assembler and

2010 Sep 30

[LLVMdev] JIT with MC - structure

Hi LLVM folks,
Attached to this email you will find a patch that is not directly applicable (it doesn't compile as is) and that tries to improve the structure of a possible JIT built on the MC framework. Basically:
- MCJIT:
  - Main class implementing the ExecutionEngine interface
  - owns and creates:
    - a MemoryManager (will be a reuse of the JITMemoryManager mechanism)
    - a MCJITStreamer

2012 Jul 26

[LLVMdev] X86 FMA4

Ah, bad example. This is a general problem for all (maybe most) SSE and AVX SS/SD patterns though, which is why I mentioned Sandybridge. You can swap out VFMADDSD in my example for VADDSD or whatever you like. I have a lion's share of such a change implemented already and performance is greatly affected. If the community is interested in this change, I would be happy to prepare a patch.

2011 Dec 01

[LLVMdev] bdver1 cpu(bulldozer) support with dragonegg

That is too bad. :( You can always review the patches, and if you see something that can be done better let me know.
- Jan
----- Original Message -----
> From: David A. Greene <greened at obbligato.org>
> To: Jan Sjodin <jan_sjodin at yahoo.com>
> Cc: David A. Greene <greened at obbligato.org>; Benjamin Kramer <benny.kra at googlemail.com>; "llvmdev at

2010 Aug 01

[LLVMdev] MC-JIT Patches 1/3

Hi Jan,
Applied with edits in r109996, thanks! I wasn't happy about the change to add "Target" everywhere, so I just added a local hack in TargetSelect.h to work around this. We should really change the definition of LLVM_NATIVE_TARGET, but I am not in a configure hacking mood.
- Daniel
On Wed, Jul 28, 2010 at 10:39 AM, Jan Sjodin <jan_sjodin at yahoo.com> wrote:
> I have

2012 Jul 27

[LLVMdev] X86 FMA4

Hey Michael, Thanks for the legwork! It appears that the stats you listed are for movaps [SSE], not vmovaps [AVX]. I would *assume* that vmovaps(m128) is closer to vmovaps(m256), since they are both AVX instructions. Although, yes, I agree that this is not clear from Agner's report. Please correct me if I am misunderstanding. As I am sure you are aware, we cannot use SSE (movaps)

2012 Jul 27

[LLVMdev] X86 FMA4

Just looked up the numbers from Agner Fog for Sandy Bridge for vmovaps/etc for loading/storing from memory.
vmovaps - load takes 1 load µop, 3 latency, with a reciprocal throughput of 0.5.
vmovaps - store takes 1 store µop, 1 load µop for address calculation, 3 latency, with a reciprocal throughput of 1.
He does not list vmovsd, but movsd has the same stats as vmovaps, so I feel it is a

2010 Jul 20

[LLVMdev] MC-JIT

Some boring style comments:
- whack trailing whitespace
- spaces, not tabs
- the methods in MCJITStreamer.cpp should probably have blank lines between them
There seems to be an ownership problem with the MCJITObjectWriter. If I understand the code correctly, the assembler Finish method takes ownership of the Writer parameter, which presumably is needed to JIT two functions. +1 for separate

2010 Jul 28

[LLVMdev] MC-JIT Patches 2/3

This patch contains the initial implementation of MCJIT. - Jan -------------- next part -------------- A non-text attachment was scrubbed... Name: 0019_mcjit.patch Type: text/x-diff Size: 42198 bytes Desc: not available URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20100728/2eb6ac65/attachment.patch>

2011 Dec 01

[LLVMdev] bdver1 cpu(bulldozer) support with dragonegg

Jan Sjodin <jan_sjodin at yahoo.com> writes:
> Better be quick! I am adding FMA4 and XOP now, and if you contribute code before I do, you can spare yourself some XOP merging.
Go ahead. We're not going to get there soon enough. :(
-Dave

2012 May 02

[LLVMdev] llvm Greater Toronto Area social

On 2 May 2012 16:57, Jan Sjodin <jan_sjodin at yahoo.com> wrote:
> 8th will work for me. Can we pick a place that is not overly noisy?
http://www.harbordhouse.ca should be fine, but let us know if you have another suggestion.
> - Jan
Cheers, Rafael

2012 Jul 27

[LLVMdev] X86 FMA4

> It appears that the stats you listed are for movaps [SSE], not vmovaps [AVX]. I would *assume* that vmovaps(m128) is closer to vmovaps(m256), since they are both AVX instructions. Although, yes, I agree that this is not clear from Agner's report. Please correct me if I am misunderstanding.
You are misunderstanding [no worries, happens to everyone = )]. The timings I listed were for

2012 Jul 26

[LLVMdev] X86 FMA4

Hey Jan and Dave,
It's not obvious, but there is a significant scalar performance issue following the GCC intrinsics. Let's look at the VFMADDSD pattern. We're operating on scalars with undefineds as the remaining vector elements of the operands. This sounds okay, but when one looks closer...
    vmovsd fp4_+1088(%rip), %xmm3 # fpppp.f:647
    vmovaps %xmm3, 18560(%rsp)

2010 Jul 19

[LLVMdev] MC-JIT

Together with Jan Sjodin (copied on this email), we have begun an implementation of the JIT with MC. The idea, suggested by Jan, is to develop an MCJIT in parallel with the current JIT and to keep the two implementations until (at least) the new MC one is mature enough. Currently the code is kept on gitorious (http://gitorious.org/llvm-mc-jit/llvm-mc-jit). Following this, a boolean "bool MCJIT =

2019 Feb 09

[RFC] Adding thread group semantics to LangRef (motivated by GPUs)

On Sat, Feb 9, 2019 at 4:44 PM Jan Sjodin <jan_sjodin at yahoo.com> wrote:
> > The reason I'm looking for solutions that can work without "scanning the code" or "spooky action at a distance" is that we should have a solution that's easily digestible by folks who are not aware of GPU execution models.
> >
> > The fallback

2010 Jul 20

[LLVMdev] MC-JIT

> In the context of the JIT, there really is no such thing as a relocation, just fixups. I'm not completely sure what the right approach is, but the JIT should be able to fully resolve all of the symbols that are being used in the module. We may need some extra interfaces to allow the JIT to tell the MCAssembler about the address of some external symbols though.
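
The point quoted above, that a JIT knows every symbol address up front and can therefore patch fixups directly instead of emitting relocations, can be illustrated with a minimal sketch. This is not LLVM's API: `Fixup`, `resolveFixups`, and the symbol map are hypothetical names invented for illustration, and the sketch assumes each fixup is a plain 64-bit absolute address written in host byte order.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for a fixup (not an LLVM type): the 8 bytes at
// `offset` in the code buffer must receive the address of `symbol`.
struct Fixup {
  std::size_t offset;
  std::string symbol;
};

// Patch every fixup in place using addresses the JIT already knows.
// Assumption: all fixups are 64-bit absolute addresses in host byte order.
// A symbol missing from the map is an error, because nothing runs after
// the JIT that could resolve it later (there is no linker step).
void resolveFixups(std::vector<std::uint8_t> &code,
                   const std::vector<Fixup> &fixups,
                   const std::map<std::string, std::uint64_t> &symbolAddresses) {
  for (const Fixup &f : fixups) {
    auto it = symbolAddresses.find(f.symbol);
    if (it == symbolAddresses.end())
      throw std::runtime_error("unresolved symbol: " + f.symbol);
    if (f.offset > code.size() ||
        code.size() - f.offset < sizeof(std::uint64_t))
      throw std::out_of_range("fixup outside code buffer");
    std::memcpy(code.data() + f.offset, &it->second, sizeof(std::uint64_t));
  }
}
```

In this sketch the symbol map plays the role of the "extra interface" mentioned in the quote: it is how the JIT would tell the assembler side where external symbols live.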

2012 Jul 26

[LLVMdev] X86 FMA4

Jan Sjodin <jan_sjodin at yahoo.com> writes:
> You can't execute FMA4 instructions on Intel processors, so it doesn't really matter what the impact of the move instructions would be, since it would end up with an illegal instruction regardless. :)
Interlagos? All the world is not Intel.
> It does perhaps bring up an issue of tuning for different

2012 Nov 12

[LLVMdev] RFC: Code Ownership

On Mon, Nov 12, 2012 at 10:09 AM, Jan Sjodin <jan_sjodin at yahoo.com> wrote:
>> Not exactly - since this includes post commit review too. Since we don't have any positive acknowledgement of post-commit review (when there are no comments being provided) a code owner will generally end up having to perform any cases of post-commit review in their area

2019 Feb 01

[RFC] Adding thread group semantics to LangRef (motivated by GPUs)

On 31.01.19 15:59, Jan Sjodin wrote:
>> > Any transform that re-arranges control flow would potentially have to know about the properties of ballot(), and the rules with respect to the CFG (and maybe consider the target) to know where to insert the intrinsics.
>
>> But the same is true for basically any approach to handling this. In
