Displaying 20 results from an estimated 2000 matches similar to: "[LLVMdev] MC-JIT Streamer 1/3"
2010 Aug 20
1
[LLVMdev] MC-JIT Streamer 1/3
I was delayed creating the smaller patches, but I finally had some time to put the first set together. There are three small patches: the first two are classes the MCJITStreamer uses, and the last is the MCJITStreamer class itself.
- Jan
--- On Sun, 8/1/10, Daniel Dunbar <daniel at zuster.org> wrote:
> From: Daniel Dunbar <daniel at zuster.org>
> Subject: Re: [LLVMdev]
2010 Aug 01
0
[LLVMdev] MC-JIT Patches 2/3
Hi Jan,
I would rather not work with a patch this large. Can you pull out the
addition of the MCJITStreamer into its own patch, and we can iterate
on getting that in as a single commit? I realize it won't work or do
anything useful, but I can't deal with reviewing patches this large.
The main thing I am concerned about is getting the basic design of how
the streamer and the assembler and
2010 Sep 30
3
[LLVMdev] JIT with MC - structure
Hi LLVM folks,
Attached to this email, you will find a patch that is not directly applicable
(it doesn't compile as is) but which tries to improve the structure of a
possible JIT built on the MC framework.
Basically :
- MCJIT :
- Main class implementing ExecutionEngine interface
- owns and creates :
- a MemoryManager (will be a reuse of the JITMemoryManager mechanism)
- a MCJITStreamer
2012 Jul 26
0
[LLVMdev] X86 FMA4
Ah, bad example. This is a general problem for all (maybe most) SSE and AVX
SS/SD patterns though, which is why I mentioned Sandybridge. You can swap
out VFMADDSD in my example for VADDSD or whatever you like.
I have the lion's share of such a change implemented already and performance
is greatly affected. If the community is interested in this change, I would
be happy to prepare a patch.
2011 Dec 01
1
[LLVMdev] bdver1 cpu(bulldozer) support with dragonegg
That is too bad. :( You can always review the patches, and if you see something that can be done better let me know.
- Jan
----- Original Message -----
> From: David A. Greene <greened at obbligato.org>
> To: Jan Sjodin <jan_sjodin at yahoo.com>
> Cc: David A. Greene <greened at obbligato.org>; Benjamin Kramer <benny.kra at googlemail.com>; "llvmdev at
2010 Aug 01
0
[LLVMdev] MC-JIT Patches 1/3
Hi Jan,
Applied with edits in r109996, thanks!
I wasn't happy about the change to add "Target" everywhere -- I just
added a local hack in TargetSelect.h to work around this. We should
really change the definition of LLVM_NATIVE_TARGET, but I am not in a
configure hacking mood.
- Daniel
On Wed, Jul 28, 2010 at 10:39 AM, Jan Sjodin <jan_sjodin at yahoo.com> wrote:
> I have
2012 Jul 27
0
[LLVMdev] X86 FMA4
Hey Michael,
Thanks for the legwork!
It appears that the stats you listed are for movaps [SSE], not vmovaps
[AVX]. I would *assume* that vmovaps(m128) is closer to vmovaps(m256),
since they are both AVX instructions. Although, yes, I agree that this is
not clear from Agner's report. Please correct me if I am misunderstanding.
As I am sure you are aware, we cannot use SSE (movaps)
2012 Jul 27
2
[LLVMdev] X86 FMA4
Just looked up the numbers from Agner Fog for Sandy Bridge for vmovaps/etc for loading/storing from memory.
vmovaps - load takes 1 load µop, 3 latency, with a reciprocal throughput of 0.5.
vmovaps - store takes 1 store µop, 1 load µop for address calculation, 3 latency, with a reciprocal throughput of 1.
He does not list vmovsd, but movsd has the same stats as vmovaps, so I feel it is a
2010 Jul 20
0
[LLVMdev] MC-JIT
Some boring style comments:
- whack trailing whitespace
- spaces, not tabs
- the methods in MCJITStreamer.cpp should probably have blank lines between them
There seems to be an ownership problem of the MCJITObjectWriter. If I
understand the code correctly, the assembler Finish method takes
ownership of the Writer parameter, which presumably is needed to JIT
two functions.
+1 for separate
2010 Jul 28
2
[LLVMdev] MC-JIT Patches 2/3
This patch contains the initial implementation of MCJIT.
- Jan
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0019_mcjit.patch
Type: text/x-diff
Size: 42198 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20100728/2eb6ac65/attachment.patch>
2011 Dec 01
0
[LLVMdev] bdver1 cpu(bulldozer) support with dragonegg
Jan Sjodin <jan_sjodin at yahoo.com> writes:
> Better be quick! I am adding FMA4 and XOP now, and if you contribute
> code before I do, you can spare yourself some XOP merging.
Go ahead. We're not going to get there soon enough. :(
-Dave
2012 May 02
0
[LLVMdev] llvm Greater Toronto Area social
On 2 May 2012 16:57, Jan Sjodin <jan_sjodin at yahoo.com> wrote:
> 8th will work for me. Can we pick a place that is not overly noisy?
http://www.harbordhouse.ca should be fine, but let us know if you have
another suggestion.
> - Jan
>
Cheers,
Rafael
2012 Jul 27
3
[LLVMdev] X86 FMA4
> It appears that the stats you listed are for movaps [SSE], not vmovaps [AVX]. I would *assume* that vmovaps(m128) is closer to vmovaps(m256), since they are both AVX instructions. Although, yes, I agree that this is not clear from Agner's report. Please correct me if I am misunderstanding.
You are misunderstanding [no worries, happens to everyone = )]. The timings I listed were for
2012 Jul 26
1
[LLVMdev] X86 FMA4
Hey Jan and Dave,
It's not obvious, but there is a significant scalar performance issue
following the GCC intrinsics.
Let's look at the VFMADDSD pattern. We're operating on scalars with
undefineds as the remaining vector elements of the operands. This sounds
okay, but when one looks closer...
vmovsd fp4_+1088(%rip), %xmm3 # fpppp.f:647
vmovaps %xmm3, 18560(%rsp)
2010 Jul 19
7
[LLVMdev] MC-JIT
Together with Jan Sjodin (copied on this email), we have begun an
implementation of the JIT with MC. The idea, suggested by Jan, is to
develop an MCJIT in parallel with the current JIT and to keep the two
implementations until (at least) the new MC one is mature enough.
The code is currently kept on Gitorious
(http://gitorious.org/llvm-mc-jit/llvm-mc-jit).
Following this, a boolean "bool MCJIT =
2019 Feb 09
1
[RFC] Adding thread group semantics to LangRef (motivated by GPUs)
On Sat, Feb 9, 2019 at 4:44 PM Jan Sjodin <jan_sjodin at yahoo.com> wrote:
> > The reason I'm looking for solutions that can work without "scanning the
> > code" or "spooky action at a distance" is that we should have a solution
> > that's easily digestible by folks who are not aware of GPU execution
> > models.
> >
> > The fallback
2010 Jul 20
2
[LLVMdev] MC-JIT
> In the context of the JIT, there really is no such thing as a
> relocation, just fixups. I'm not completely sure what the right
> approach is, but the JIT should be able to fully resolve all of the
> symbols that are being used in the module. We may need some extra
> interfaces to allow the JIT to tell the MCAssembler about the address
> of some external symbols though.
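The point quoted above, that a JIT can resolve every fixup itself once it knows the address of each symbol, can be illustrated with a minimal C++ sketch. Everything in it is hypothetical and is not the MCAssembler or MCJIT interface: `Fixup`, `applyFixups`, and the symbol-address map are names invented for this example, and it assumes each fixup is an absolute 64-bit address written in host byte order.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical record (not an LLVM type): "write the address of Symbol plus
// Addend as an 8-byte value at byte Offset in the code buffer".
struct Fixup {
  std::size_t Offset;
  std::string Symbol;
  std::int64_t Addend;
};

// Hypothetical helper: patch every fixup in place using a table of known
// symbol addresses. Throws if a symbol cannot be resolved, since a JIT has
// no later link step that could resolve it.
void applyFixups(std::vector<std::uint8_t> &Code,
                 const std::vector<Fixup> &Fixups,
                 const std::map<std::string, std::uint64_t> &SymbolAddresses) {
  for (const Fixup &F : Fixups) {
    auto It = SymbolAddresses.find(F.Symbol);
    if (It == SymbolAddresses.end())
      throw std::runtime_error("unresolved symbol: " + F.Symbol);

    // The patched 8 bytes must lie entirely inside the buffer.
    assert(F.Offset <= Code.size() &&
           Code.size() - F.Offset >= sizeof(std::uint64_t) &&
           "fixup outside code buffer");

    std::uint64_t Value = It->second + static_cast<std::uint64_t>(F.Addend);
    std::memcpy(Code.data() + F.Offset, &Value, sizeof(Value));
  }
}
```

In this sketch the symbol-address map stands in for the "extra interfaces" mentioned in the quoted text: the JIT would fill it with the addresses of external symbols before the fixups are applied, so nothing is left over as a relocation.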
2012 Jul 26
0
[LLVMdev] X86 FMA4
Jan Sjodin <jan_sjodin at yahoo.com> writes:
> You can't execute FMA4 instructions on Intel processors, so it doesn't
> really matter what the impact of the move instructions would be, since
> it would end up with an illegal instruction regardless. :)
Interlagos? All the world is not Intel.
> It does perhaps bring up an issue of tuning for different
>
2012 Nov 12
0
[LLVMdev] RFC: Code Ownership
On Mon, Nov 12, 2012 at 10:09 AM, Jan Sjodin <jan_sjodin at yahoo.com> wrote:
>> Not exactly - since this includes post commit review too. Since we
>> don't have any positive acknowledgement of post-commit review (when
>> there are no comments being provided) a code owner will generally end
>> up having to perform any cases of post-commit review in their area
2019 Feb 01
2
[RFC] Adding thread group semantics to LangRef (motivated by GPUs)
On 31.01.19 15:59, Jan Sjodin wrote:
>> > Any transform that re-arranges control flow would potentially have to
>> > know about the properties of ballot(), and the rules with respect to
>> > the CFG (and maybe consider the target) to know where to insert the
>> > intrinsics.
>
>> But the same is true for basically any approach to handling this. In