Displaying 20 results from an estimated 57 matches for "mispredicted".
2008 Sep 29
4
[LLVMdev] Hi Cache Miss and Branch Misprediction
Hi Guys,
I am an absolute newbie to the compiler community. I am experimenting a little bit with LLVM.
I have a few small questions; it would be really great if someone could help me.
1. Can I find out (is there something already built) whether the previous instruction, or some particular instruction, caused a cache miss? Basically, I want to detect cache misses and the instructions that are causing them.
2. Can I find
2008 Sep 29
0
[LLVMdev] Hi Cache Miss and Branch Misprediction
Ketan Pundlik Umare wrote:
> Hi Guys,
> I am an absolute newbie to the compiler community. I am experimenting a little bit with LLVM.
> I have a few small questions; it would be really great if someone could help me.
It sounds like what you want is valgrind --tool=cachegrind (or
--tool=callgrind). See http://valgrind.org/
> 1. Can i find out (is there something already built), if the
2008 Sep 30
0
[LLVMdev] Hi Cache Miss and Branch Misprediction
On Sep 28, 2008, at 6:43 PM, Ketan Pundlik Umare wrote:
> I am an absolute newbie to the compiler community. I am
> experimenting a little bit with llvm.
> I have a few small questions, i would be really great if someone
> could help me.
>
> 1. Can i find out (is there something already built), if the
> previous instruction / or some instruction was a cache miss.
>
2008 Sep 30
0
[LLVMdev] Hi Cache Miss and Branch Misprediction
Thanks a lot, guys!
But I have to do this online and use the results to drive some kind of code transformation. It's for a different project.
Still, all this has taught me quite a bit. Wow!
Thank you
Best
Ketan
----- Original Message -----
From: "OvermindDL1" <overminddl1 at gmail.com>
To: "LLVM Developers Mailing List" <llvmdev at cs.uiuc.edu>
Sent: Monday,
2008 Sep 30
2
[LLVMdev] Hi Cache Miss and Branch Misprediction
On Mon, Sep 29, 2008 at 6:30 PM, Mike Stump <mrs at apple.com> wrote:
> /* snip */
AMD's CodeAnalyst is free and quite wonderful at this job. It shows
details about just about anything the CPU reports (and on newer AMD
CPUs there is an even more ridiculous amount of information): every
little function call, the time each took, multiple profiling modes,
etc.
2008 Sep 30
1
[LLVMdev] Hi Cache Miss and Branch Misprediction
Thanks a lot, John!
That was great. I wanted to read the performance counters, but was always afraid of how LLVM would react (being a virtual machine).
Thanks for the resource; I'll go through it.
I think I read somewhere that LLVM does not support assembly instructions?
Also, can you point me to material on run-time transforms for LLVM? I thought that the transform pass is done on the IR
2019 Sep 17
2
Spectre V1 Mitigation - Internals?
...t I understand that as soon as the condition
value is available, the processor can check its assumptions and
revert back.
That is,
if the branch prediction is correct during speculation, we mask with
all_ones, and the processor can follow the predicted branch to retirement.
But if the processor mispredicted the branch, it will revert as soon
as the condition becomes available. If this is the case, then we don't
speculatively execute the operations pointer1 &= predicate_state (if branch)
and *pointer2 & predicate_state (else branch), right? Or do
out-of-order processors allow such accesses?
P...
2019 Sep 17
2
Spectre V1 Mitigation - Internals?
...ut the store operation in speculative execution.
Am I wrong here?
On Tue, 17 Sep 2019 at 21:30, Craig Topper <craig.topper at gmail.com> wrote:
> The reverting of state doesn’t occur when the condition is available. The
> processor has to “execute” the branch uop and see that it was mispredicted.
> This occurs at least one cycle after the condition is available. The
> conditional move on the predicate state can execute at the same time as the
> branch. Or it can execute before it if the branch unit is busy. The load
> can do the same.
>
> On Tue, Sep 17, 2019 at 7:57 AM P...
2017 Dec 16
2
Replace call stack with an equivalent on the heap?
Hello,
I'm implementing a custom Haskell-to-LLVM compiler, and in my
experimentation, I noticed that GHC is much slower than clang on certain
examples, such as the Ackermann function. However, from reading their
respective IRs (Cmm for GHC and LLVM for clang), I don't really see much of
a difference. Here is a link to the numbers. (n, m) are the parameters to
the Ackermann function
2018 Aug 14
3
[RFC] Delaying phi-to-select transformation until later in the pass pipeline
Summary
=======
I'm planning on adjusting SimplifyCFG so that it doesn't turn two-entry phi
nodes into selects until later in the pass pipeline, to give passes which can
understand phis but not selects more opportunity to optimize. The thing I'm
trying to do which made me think of doing this is described below, but from the
benchmarking I've done it looks like this is overall a
2019 Sep 16
2
Spectre V1 Mitigation - Internals?
...redicate_state = all_ones_mask;
  if (condition) {
    predicate_state = !condition ? all_zeros_mask : predicate_state;
    pointer1 &= predicate_state;
    leak(*pointer1);
  } else {
    int value2 = *pointer2 & predicate_state;
    leak(value2);
  }
}
Let's assume that the branch is mispredicted and the if body is taken. The
value of the predicate_state mask depends on the "result of the condition",
which is not yet available, hence
speculative execution. My question is whether the value of
predicate_state is also guessed by the processor. If it is correct,
then the value of predicate_sta...
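The masking pattern discussed in this thread can be sketched in plain C. This is only an illustration of the data flow, not the actual mitigation: the real hardening is applied by the compiler backend on machine code, and an optimizing C compiler is free to fold the redundant re-check below away. The function name hardened_read and the choice to mask an index instead of a pointer are assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical sketch of predicate-state masking. The mask starts as
 * all-ones and is recomputed from the branch condition inside the taken
 * path. On a correctly predicted path it stays all-ones and the index is
 * unchanged. The intent of the pattern is that, on a mispredicted path,
 * the data-dependent select yields all-zeros, so the index collapses to 0
 * before the load can use an out-of-bounds value.
 */
static uint8_t hardened_read(const uint8_t *table, size_t len, size_t idx)
{
    size_t predicate_state = ~(size_t)0;            /* all_ones_mask */

    if (idx < len) {
        /* Re-derive the mask from the same condition the branch tested. */
        predicate_state = (idx < len) ? predicate_state : (size_t)0;
        idx &= predicate_state;                     /* mask before the load */
        return table[idx];
    }
    return 0;                                       /* out of bounds: no load */
}
```

Architecturally the function behaves like an ordinary bounds-checked read; the masking only matters for what a speculating processor could load before the branch resolves.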
2014 Feb 03
2
[LLVMdev] [RFC] BlockFrequency is the wrong metric; we need a new one
On Feb 2, 2014, at 6:18 PM, Andrew Trick <atrick at apple.com> wrote:
>> The result of such a system would produce weights for every block in the above CFG as '1.0', or equivalent to the entry block weight. This to me is a really useful metric -- it indicates that no block in the CFG is really more or less likely than any other. Only *biases* in a specific direction would
2016 Jul 29
0
XRay: Demo on x86_64/Linux almost done; some questions.
On 28 July 2016 at 16:14, Serge Rogatch via llvm-dev
<llvm-dev at lists.llvm.org> wrote:
> Can I ask you why you chose to patch both function entrances and exits,
> rather than just patching the entrances and (in the patches) pushing on the
> stack the address of __xray_FunctionExit , so that the user function returns
> normally (with RETQ or POP RIP or whatever else instruction)
2011 Jan 04
0
[LLVMdev] Is PIC code defeating the branch predictor?
On Jan 3, 2011, at 11:30 PM, Jakob Stoklund Olesen wrote:
> I noticed that we generate code like this for i386 PIC:
>
> calll L0$pb
> L0$pb:
> popl %eax
> movl %eax, -24(%ebp) ## 4-byte Spill
>
> I worry that this defeats the return address prediction for returns in the function because calls and returns no longer are matched.
Yes, this will defeat the
2018 Aug 15
2
[RFC] Delaying phi-to-select transformation until later in the pass pipeline
I'm concerned that we're focusing on one side of this. Let me point out
a few concerns w/changing the canonical form here:
1. LICM does not know how to hoist or sink regions. It does know how
to hoist and sink selects.
2. InstCombine has limited support for triangles/diamonds, but fairly
extensive support for selects.
3. EarlyCSE and GVN do not know how to eliminate fully
2007 Apr 18
1
[Bridge] [BRIDGE] Unaligned access on IA64 when comparing ethernet addresses
From: Evgeny Kravtsunov <emkravts@openvz.org>
compare_ether_addr() implicitly requires that the addresses
passed to it are 2-byte aligned in memory.
This is not true for br_stp_change_bridge_id() and
br_stp_recalculate_bridge_id(), in which one of the addresses
is an unsigned char * and thus may not be 2-byte aligned.
Signed-off-by: Evgeny Kravtsunov <emkravts@openvz.org>
Signed-off-by:
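To illustrate the alignment issue described in this entry, here is a minimal sketch of a comparison that makes no alignment assumption. It is not the patch from the thread (the snippet does not show the fix); the helper name ether_addr_differs is hypothetical, and ETH_ALEN is defined locally as the 6-byte Ethernet address length.

```c
#include <string.h>

#define ETH_ALEN 6  /* length of an Ethernet (MAC) address in bytes */

/*
 * Hypothetical alignment-agnostic comparison. memcmp() works on byte
 * pointers, so neither address has to be 2-byte aligned.
 * Returns 0 when the addresses are equal and 1 when they differ.
 */
static int ether_addr_differs(const unsigned char *a, const unsigned char *b)
{
    return memcmp(a, b, ETH_ALEN) != 0;
}
```

A byte-wise comparison like this trades the speed of 16-bit loads for safety on architectures such as IA64 that trap on unaligned accesses.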
2016 Jul 29
2
XRay: Demo on x86_64/Linux almost done; some questions.
Thanks for pointing this out, Tim. Then maybe this approach is not the best
choice for x86, though ideally it should be measured. It is just that on ARM
the current x86 approach is not applicable, because ARM doesn't have a
single return instruction (such as RETQ on x86_64); furthermore, the return
instructions on ARM can be conditional.
I have another question: what happens if the instrumented
2017 Apr 20
4
[cfe-dev] FE_INEXACT being set for an exact conversion from float to unsigned long long
> This seems like it was done for a perf reason (mispredicts). A conditional-to-cmov transformation should avoid
> introducing additional observable side effects, and it's clear that whatever did this did not account
> for floating-point exceptions.
That’s a very reasonable statement, but I’m not sure it corresponds to the way we have typically approached this sort of problem.
In
2018 Aug 17
2
[RFC] Delaying phi-to-select transformation until later in the pass pipeline
> On Aug 15, 2018, at 10:57 PM, Hal Finkel via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
>
> On 08/15/2018 02:38 PM, Philip Reames via llvm-dev wrote:
>> I'm concerned that we're focusing on one side of this. Let me point out a few concerns w/changing the canonical form here:
>>
>> LICM does not know how to hoist or sink regions. It does know
2011 Jan 04
4
[LLVMdev] Is PIC code defeating the branch predictor?
I noticed that we generate code like this for i386 PIC:
calll L0$pb
L0$pb:
popl %eax
movl %eax, -24(%ebp) ## 4-byte Spill
I worry that this defeats the return address prediction for returns in the function because calls and returns no longer are matched.
From Intel's Optimization Reference Manual:
"The return address stack mechanism augments the static and dynamic