Displaying 20 results from an estimated 2000 matches similar to: "Aarch64: unaligned access despite -mstrict-align"
2020 Jun 01
2
Aarch64: unaligned access despite -mstrict-align
Sorry, a quick message: please ignore what I wrote before. I got myself confused (and probably you too).
With a recent trunk build I get this:
f:
adrp x8, g
ldr x8, [x8, :lo12:g]
mov w2, #16
mov x1, x0
mov x0, x8
b memcmp
This looks more correct, and I need to look a bit more into this (and how clang 10.0.0 behaves).
2020 Jun 22
3
Hardware ASan Generating Unknown Instruction
Hi,
I am trying to execute a simple hello world program compiled like so:
path/to/compiled/clang -o test --target=aarch64-linux-gnu
-march=armv8.5-a -fsanitize=hwaddress
--sysroot=/usr/aarch64-linux-gnu/
-L/usr/lib/gcc/aarch64-linux-gnu/10.1.0/ -g test.c
However, when I look at the disassembly, there is an unknown
instruction listed at 0x2d51c:
000000000002d4c0 main:
2d4c0: ff c3 00 d1
2020 Jul 15
2
[MTE] Tagging Globals
Hello,
We're evaluating memory tagging (MTE) on some internal workloads.
We noticed that stack variables are tagged by an instrumentation pass and heap objects are handled by the allocator (Scudo).
How about global variables? We tried a simple case using -march=armv8a+memtag -fsanitize=memtag, but found no tagging:
Are we missing anything, or is tagging globals still in progress?
int
2020 Jul 15
2
[MTE] Tagging Globals
Thanks for the update, Phillips.
Yes, please add me, Stephen and Ana (CCed) to Phabricator reviews.
Zhaoshi
From: Mitch Phillips <mitchp at google.com>
Sent: Tuesday, July 14, 2020 19:10
To: Zhaoshi Zheng <zhaoshiz at quicinc.com>
Cc: llvm-dev at lists.llvm.org; Stephen Long <steplong at quicinc.com>
Subject: [EXT] Re: [llvm-dev] [MTE] Tagging Globals
Hi Zhaoshi,
Currently
2016 May 27
2
Handling post-inc users in LSR
Hello,
For a very simple loop where all IV users are post-inc users, I observed
redundant add instructions in AArch64.
From the LSR debug output, I can see that the initial formula for the icmp is the one that is
transformed to a post-inc form in OptimizeLoopTermCond() and later
expanded in post-inc mode. Based on the observation that the icmp is
already a post-inc user, I hacked LSR to prevent the icmp from being
2016 May 27
0
Handling post-inc users in LSR
> On May 27, 2016, at 2:50 PM, via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> Hello,
>
> For a very simple loop where all IV users are post-inc users, I observed redundant add instructions in AArch64.
>
> From the LSR debug output, I can see that the initial formula for the icmp is the one that is transformed to a post-inc form in OptimizeLoopTermCond() and later expanded in post-inc
2020 Jun 12
2
Issue with __attribute__((constructor)) and -Os -fno-common
On 6/11/20 11:25 PM, James Y Knight wrote:
> The global constructor was removed by setting the initial value of "val" to
> 1 instead of 0. So, the behavior of this program is preserved. Doesn't look
> like erroneous behavior.
OK, my example is too simplified indeed. Please consider the following
instead:
int val;
static void __attribute__((constructor)) init_fn(void)
{
2014 Sep 02
3
[LLVMdev] LICM promoting memory to scalar
All,
If we can speculatively execute a load instruction, why isn’t it safe to hoist it out by promoting it to a scalar in the LICM pass?
There is a comment in the LICM pass saying that if a load/store is conditional, promoting it is not safe because it would break the LLVM concurrency model (see commit 73bfa4a).
It has an IR test for checking this in test/Transforms/LICM/scalar-promote-memmodel.ll
However, I have
2019 Jun 30
6
[hexagon][PowerPC] code regression (sub-optimal code) on LLVM 9 when generating hardware loops, and the "llvm.uadd" intrinsic.
Hi All,
The following code :
void hexagon2( int *a, int *res )
{
    int i = 100;
    while ( i-- ) {
        *res++ = *a++;
    }
}
gets compiled as a sub-optimal Software loop on LLVM 9.0 instead of a Hardware loop, whereas it was compiled as a Hardware Loop in LLVM 7.0.
This is the final assembly code generated by LLVM 9.0 :
.text
.file "main.c"
.globl hexagon2 // --
2014 Sep 02
2
[LLVMdev] LICM promoting memory to scalar
I think gcc is right.
It inserted a branch for n == 0 (the cbz at the top), so that's not a problem.
In all other regards, this is safe: if you examine the sequence of loads and stores, it eliminated all but the first load and all but the last store. How's that unsafe?
If I had to guess, the bug here is that LLVM doesn't want to hoist the load over the condition (which it is right
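The transformation described in this reply (keep only the first load and the last store, behind a guard for n == 0) can be sketched in C. This is a minimal illustration under assumptions: the global `counter` and both function names are hypothetical and are not the test case from the thread.

```c
#include <stddef.h>

/* Hypothetical global standing in for the memory location being promoted. */
int counter;

/* Original form: the loop body loads and stores `counter` on every iteration. */
void bump_each_iteration(size_t n)
{
    for (size_t i = 0; i < n; i++)
        counter += 1;
}

/* Promoted form: one load before the loop, one store after it.
 * The early return plays the role of the guard branch (the cbz),
 * so no load or store of `counter` happens when the loop would not run. */
void bump_promoted(size_t n)
{
    if (n == 0)
        return;

    int tmp = counter;          /* the only remaining load */
    for (size_t i = 0; i < n; i++)
        tmp += 1;
    counter = tmp;              /* the only remaining store */
}
```

Both functions leave `counter` with the same final value for any n. The guard matters because without it the promoted form would write to `counter` even when the original loop never touched it.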
2020 Jun 11
2
Issue with __attribute__((constructor)) and -Os -fno-common
Hi,
I think that Clang erroneously discards a function annotated with
__attribute__((constructor)) when flags -Os -fno-common are given. Test
case below.
What do you think?
Thanks.
----8<--------8<--------8<--------8<--------8<--------8<--------
$ cat ctor.c
int val;
static void __attribute__((constructor)) init_fn(void)
{
val = 1;
}
int main(int argc, char *argv[])
2018 Sep 20
3
Comparing Clang and GCC: only clang stores updated value in each iteration.
Hi,
I have a benchmark (mcf) that is currently about 10% slower when compiled
with clang than with gcc 8. A hot loop shows a few differences. One
interesting difference is that clang stores an incremented value in each
iteration, while gcc waits and stores the final value just once, after
the loop. The value is a global variable.
I wonder if this is something clang does not do
2014 Sep 03
3
[LLVMdev] LICM promoting memory to scalar
Thanks for the background on the concurrent memory model.
So, is it sufficient that the loop entry is guarded by a condition (the cbz
at the top) to prevent the race?
The loop entry will be guarded by a condition if the loop has been rotated by
the loop rotate pass.
Since LICM runs after loop rotate, we can use
ScalarEvolution::isLoopEntryGuardedByCond to check if we can speculatively
execute load without
2014 Dec 09
4
[LLVMdev] dmb ishld in AArch64
Hi,
I have run into an optimization problem (at O1 and O2) regarding the memory barrier “dmb ishld”.
test/CodeGen/AArch64/intrinsics-memory-barrier.ll states that memory accesses around a DMB should not be reordered, but when compiling the Linux kernel, I found load/store in
static inline void hlist_add_before_rcu(struct hlist_node *n,
struct hlist_node *next)
{
n->pprev
2017 Dec 19
4
A code layout related side-effect introduced by rL318299
Hi,
Recently a 10% performance regression on an important benchmark showed up
after we integrated https://reviews.llvm.org/rL318299. The analysis showed
that rL318299 triggered loop rotation on a multi-exit loop, and the loop
rotation introduced a code layout issue. The performance regression is a
side effect of rL318299. I have attached two test cases, a.ll and b.ll, to
illustrate the problem. a.ll
2017 Dec 19
2
A code layout related side-effect introduced by rL318299
On Mon, Dec 18, 2017 at 5:46 PM Xinliang David Li <davidxl at google.com>
wrote:
> The introduction of cleanup.cond block in b.ll without loop-rotation
> already makes the layout worse than a.ll.
>
>
> Without introducing the cleanup.cond block, the layout is
>
> entry->while.cond -> while.body->ret
>
> All the arrows are hot fall through edges which is
2020 Jun 22
3
Hardware ASan Generating Unknown Instruction
I suspect that this is hitting the issue that I mentioned here:
https://reviews.llvm.org/D65857#1621335
We may need to do what I suggested there and restrict global tag entropy on
non-Android Linux to 7 bits. You can try working around this issue for now
by using lld as the linker (-fuse-ld=lld).
Peter
On Mon, Jun 22, 2020 at 1:37 PM Mitch Phillips via llvm-dev <
llvm-dev at
2014 Aug 22
5
[LLVMdev] Pseudo load and store instructions for AArch64
Hi Renato,
> > I'm trying to add pseudo 64-bit load and store instructions for AArch64, which
> > should have latencies set to "1" while being otherwise exactly the same as
> > normal load and store instructions.
>
> Can I ask why would you need that?
This is the only way I found to stop the Machine Instruction Scheduler from
reordering load and store
2020 Jun 22
2
Hardware ASan Generating Unknown Instruction
Thanks for the confirmation. From the assembly that was sent on the other
branch of the thread:
> .set .L.str, .L.str.hwasan-3458764513820540928
-3458764513820540928 = 0xd0 << 56
i.e. a "negative" tag.
So this appears to be the issue exactly.
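The arithmetic above can be checked with a short C sketch. The helper names are hypothetical and not part of HWASan; the code only assumes, as stated in the message, that the tag sits in the top byte (shifted left by 56), and it relies on the two's-complement conversion that GCC and Clang perform.

```c
#include <stdint.h>

/* Hypothetical helper: place an 8-bit tag in the top byte of a 64-bit value
 * and reinterpret the result as signed, the way the symbol suffix
 * ".hwasan-3458764513820540928" appears to have been printed.
 * The unsigned-to-signed conversion is implementation-defined before C23;
 * GCC and Clang wrap modulo 2^64. */
int64_t tag_as_signed_offset(uint8_t tag)
{
    return (int64_t)((uint64_t)tag << 56);
}

/* Hypothetical helper: a tag limited to 7 bits never sets bit 63,
 * so the shifted value can never come out negative. */
int tag_fits_in_7_bits(uint8_t tag)
{
    return (tag & 0x80) == 0;
}
```

For tag 0xd0 the first helper yields -3458764513820540928, matching the value quoted above, and the second reports that 0xd0 does not fit in 7 bits.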
Peter
On Mon, Jun 22, 2020 at 1:55 PM Derrick McKee <derrick.mckee at gmail.com>
wrote:
> Using lld fixes this issue.
>
2018 Feb 22
2
Sink redundant spill after RA
Hi All,
I found some cases where a spill of a live range in a block is reloaded only
in one of its successors, and there is no reload in other paths through
other successors. Since the spill is reloaded only in a certain path, it
must be okay to sink such a spill close to its reloads. In the AArch64 code
below, there is a spill(x2) in the entry, but this value is reloaded only
in %bb.1, not in