thr3ads.net - search: "mispredicted"

Displaying 20 results from an estimated 57 matches for "mispredicted".

[LLVMdev] Hi Cache Miss and Branch Misprediction

2008 Sep 29

Hi Guys, I am an absolute newbie to the compiler community. I am experimenting a little bit with LLVM. I have a few small questions; it would be really great if someone could help me. 1. Can I find out (is there something already built) whether the previous instruction, or some other instruction, was a cache miss? Basically I want to detect cache misses and the instructions that are causing them. 2. Can I find

[LLVMdev] Hi Cache Miss and Branch Misprediction

2008 Sep 29

Ketan Pundlik Umare wrote: > Hi Guys, > I am an absolute newbie to the compiler community. I am experimenting a little bit with LLVM. > I have a few small questions; it would be really great if someone could help me. It sounds like what you want is valgrind --tool=cachegrind (or --tool=callgrind). See http://valgrind.org/ > 1. Can I find out (is there something already built), if the

[LLVMdev] Hi Cache Miss and Branch Misprediction

2008 Sep 30

On Sep 28, 2008, at 6:43 PM, Ketan Pundlik Umare wrote: > I am an absolute newbie to the compiler community. I am > experimenting a little bit with LLVM. > I have a few small questions; it would be really great if someone > could help me. > > 1. Can I find out (is there something already built), if the > previous instruction / or some instruction was a cache miss. >

[LLVMdev] Hi Cache Miss and Branch Misprediction

2008 Sep 30

Thanks a lot, guys!! But I have to do this online and use it to do some kind of code transformation. It's for a different project. But all this has given me quite a bit of knowledge. Wow!!! Thank you. Best, Ketan ----- Original Message ----- From: "OvermindDL1" <overminddl1 at gmail.com> To: "LLVM Developers Mailing List" <llvmdev at cs.uiuc.edu> Sent: Monday,

[LLVMdev] Hi Cache Miss and Branch Misprediction

2008 Sep 30

On Mon, Sep 29, 2008 at 6:30 PM, Mike Stump <mrs at apple.com> wrote: > /* snip */ AMD's CodeAnalyst is free and quite wonderful at this job. It shows details about nearly anything the CPU reports (and on newer AMD CPUs there is an even more ridiculous amount of information): every little function call, the time each took, multiple profiling modes, etc...

[LLVMdev] Hi Cache Miss and Branch Misprediction

2008 Sep 30

Thanks a lot, John!!!! That was great. I wanted to read the performance counter, but was always afraid of how LLVM would react (virtual machine). Thanks for the resource; I'll go through it. I think I read somewhere that LLVM does not support assembly instructions??? Also, can you point me to run-time transforms for LLVM? Where can I find the material? I thought that the transform pass is done on the IR

Spectre V1 Mitigation - Internals?

2019 Sep 17

...t I understand that as soon as the condition value is available, the processor can check its assumptions and revert. That is, if the branch prediction is correct during speculation, we mask with all_ones, and the processor can follow the predicted branch to retirement. But if the processor mispredicted the branch, it will revert as soon as the condition becomes available. If this is the case, then we don't speculatively execute the operations pointer1 &= predicate_state (if branch) and *pointer2 & predicted_state (else branch), right? Or do out-of-order processors allow such access? P...

Spectre V1 Mitigation - Internals?

2019 Sep 17

...ut the store operation in speculative execution. Am I wrong here? On Tue, 17 Sep 2019 at 21:30, Craig Topper <craig.topper at gmail.com> wrote: > The reverting of state doesn’t occur when the condition is available. The > processor has to “execute” the branch uop and see that it was mispredicted. > This occurs at least one cycle after the condition is available. The > conditional move on the predicate state can execute at the same time as the > branch. Or it can execute before it if the branch unit is busy. The load > can do the same. > > On Tue, Sep 17, 2019 at 7:57 AM P...

Replace call stack with an equivalent on the heap?

2017 Dec 16

Hello, I'm implementing a custom Haskell-to-LLVM compiler, and in my experimentation, noticed that GHC is much slower than clang on certain examples, such as the Ackermann function. However, from reading their respective IRs (Cmm for GHC and LLVM for clang), I don't really see much of a difference. Here is a link to the numbers. (n, m) are the parameters to the Ackermann function

[RFC] Delaying phi-to-select transformation until later in the pass pipeline

2018 Aug 14

Summary ======= I'm planning on adjusting SimplifyCFG so that it doesn't turn two-entry phi nodes into selects until later in the pass pipeline, to give passes which can understand phis but not selects more opportunity to optimize. The thing I'm trying to do which made me think of doing this is described below, but from the benchmarking I've done it looks like this is overall a

Spectre V1 Mitigation - Internals?

2019 Sep 16

...redicate_state = all_ones_mask;
      if (condition) {
        predicate_state = !condition ? all_zeros_mask : predicate_state;
        pointer1 &= predicate_state;
        leak(*pointer1);
      } else {
        int value2 = *pointer2 & predicate_state;
        leak(value2);
      }
    }

Let's assume that the branch is mispredicted and the if body is taken. The value of the predicate_state mask depends on the "result of the condition", which is not yet available, hence the speculative execution. My question is whether the value of predicate_state is also guessed by the processor. If it is correct, then the value of predicate_sta...
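The masking arithmetic discussed in this thread can be modeled in plain C as a rough sketch. The function names `update_state_in_if_body` and `mask_loaded_value` are hypothetical and not part of LLVM; the code only shows the data flow of the predicate state (all-ones while the path is architecturally correct, all-zeros once the real condition contradicts the branch that was entered). It cannot reproduce speculation, and a C compiler may well lower it back to branches, so it is not a mitigation in itself.

```c
#include <stdint.h>

/* Hypothetical model of the state update at the top of the `if` body.
 * `condition` is the real (eventually computed) branch condition.
 * If it is false, the body was only reached through a misprediction,
 * so the state collapses to all-zeros; otherwise it is left unchanged.
 * The mask is built arithmetically, mirroring a conditional move. */
static uintptr_t update_state_in_if_body(uintptr_t predicate_state, int condition)
{
    uintptr_t keep = (uintptr_t)0 - (uintptr_t)(condition != 0); /* ~0 or 0 */
    return predicate_state & keep;
}

/* Hypothetical model of `value & predicate_state`: the loaded value
 * survives only while the state is still all-ones. */
static uint32_t mask_loaded_value(uint32_t loaded, uintptr_t predicate_state)
{
    return loaded & (uint32_t)predicate_state;
}
```

With a state of all-ones and a true condition, a loaded value passes through unchanged; with a false condition the state becomes zero and every later masked value is zero, which is the property the thread is asking about.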

[LLVMdev] [RFC] BlockFrequency is the wrong metric; we need a new one

2014 Feb 03

On Feb 2, 2014, at 6:18 PM, Andrew Trick <atrick at apple.com> wrote: >> The result of such a system would produce weights for every block in the above CFG as '1.0', or equivalent to the entry block weight. This to me is a really useful metric -- it indicates that no block in the CFG is really more or less likely than any other. Only *biases* in a specific direction would

XRay: Demo on x86_64/Linux almost done; some questions.

2016 Jul 29

On 28 July 2016 at 16:14, Serge Rogatch via llvm-dev <llvm-dev at lists.llvm.org> wrote: > Can I ask you why you chose to patch both function entrances and exits, > rather than just patching the entrances and (in the patches) pushing on the > stack the address of __xray_FunctionExit , so that the user function returns > normally (with RETQ or POP RIP or whatever else instruction)

[LLVMdev] Is PIC code defeating the branch predictor?

2011 Jan 04

On Jan 3, 2011, at 11:30 PM, Jakob Stoklund Olesen wrote:
> I noticed that we generate code like this for i386 PIC:
>
> calll L0$pb
> L0$pb:
> popl %eax
> movl %eax, -24(%ebp) ## 4-byte Spill
>
> I worry that this defeats the return address prediction for returns in the function because calls and returns no longer are matched.

Yes, this will defeat the

[RFC] Delaying phi-to-select transformation until later in the pass pipeline

2018 Aug 15

I'm concerned that we're focusing on one side of this. Let me point out a few concerns w/changing the canonical form here: 1. LICM does not know how to hoist or sink regions. It does know how to hoist and sink selects. 2. InstCombine has limited support for triangles/diamonds, but fairly extensive support for selects. 3. EarlyCSE and GVN do not know how to eliminate fully

[Bridge] [BRIDGE] Unaligned access on IA64 when comparing ethernet addresses

2007 Apr 18

From: Evgeny Kravtsunov <emkravts@openvz.org> compare_ether_addr() implicitly requires that the addresses passed are 2-byte aligned in memory. This is not true for br_stp_change_bridge_id() and br_stp_recalculate_bridge_id(), in which one of the addresses is unsigned char *, and thus may not be 2-byte aligned. Signed-off-by: Evgeny Kravtsunov <emkravts@openvz.org> Signed-off-by:
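As an illustration of the alignment issue described in this patch summary, here is a small C sketch of a byte-wise comparison that makes no alignment assumption. The helper name `ether_addr_differs_unaligned` and the local `ETH_ALEN_BYTES` constant are hypothetical; this is not the code from the patch, whose body is not shown in the snippet.

```c
#include <string.h>

/* Assumed length of an Ethernet (MAC) address in bytes. */
#define ETH_ALEN_BYTES 6

/* Hypothetical helper: returns nonzero if the two addresses differ.
 * memcmp works on bytes, so neither pointer needs to be 2-byte
 * aligned, unlike a comparison done through 16-bit loads. */
static int ether_addr_differs_unaligned(const unsigned char *a,
                                        const unsigned char *b)
{
    return memcmp(a, b, ETH_ALEN_BYTES) != 0;
}
```

The trade-off is that a byte-wise comparison may be slower than 16-bit loads on hot paths, which is presumably why the aligned variant exists in the first place.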

XRay: Demo on x86_64/Linux almost done; some questions.

2016 Jul 29

Thanks for pointing this out, Tim. Then maybe this approach is not the best choice for x86, though ideally measuring is needed, it is just that on ARM the current x86 approach is not applicable because ARM doesn't have a single return instruction (such as RETQ on x86_64), furthermore, the return instructions on ARM can be conditional. I have another question: what happens if the instrumented

[cfe-dev] FE_INEXACT being set for an exact conversion from float to unsigned long long

2017 Apr 20

> This seems like it was done for perf reason (mispredict). Conditional-to-cmov transformation should keep > from introducing additional observable side-effects, and it's clear that whatever did this did not account > for floating point exception. That’s a very reasonable statement, but I’m not sure it corresponds to the way we have typically approached this sort of problem. In

[RFC] Delaying phi-to-select transformation until later in the pass pipeline

2018 Aug 17

> On Aug 15, 2018, at 10:57 PM, Hal Finkel via llvm-dev <llvm-dev at lists.llvm.org> wrote: > > > On 08/15/2018 02:38 PM, Philip Reames via llvm-dev wrote: >> I'm concerned that we're focusing on one side of this. Let me point out a few concerns w/changing the canonical form here: >> >> LICM does not know how to hoist or sink regions. It does know

[LLVMdev] Is PIC code defeating the branch predictor?

2011 Jan 04

I noticed that we generate code like this for i386 PIC:

    calll L0$pb
    L0$pb:
    popl %eax
    movl %eax, -24(%ebp) ## 4-byte Spill

I worry that this defeats the return address prediction for returns in the function because calls and returns no longer are matched. From Intel's Optimization Reference Manual: "The return address stack mechanism augments the static and dynamic
