thr3ads.net - similar to: "optimization remarks"

Displaying 20 results from an estimated 400 matches similar to: "optimization remarks"

2019 Jan 08

distributed thinlto usage

I am trying to work through the usage of thinlto for distributed builds. Here is the simple thinlto usage, just add -flto=thin everywhere, easy:

    clang++ -flto=thin -O3 -c -o CreateWay_.o -DSPEC_CPU -DNDEBUG -DSPEC_CPU_LITTLE_ENDIAN -Wno-dangling-else CreateWay_.cpp
    clang++ -flto=thin -O3 -c -o Places_.o -DSPEC_CPU -DNDEBUG -DSPEC_CPU_LITTLE_ENDIAN

distributed thinlto usage

2019 Jan 09

Thanks, Teresa.

Yes, it is astar. Happy to send a tar of the sources, but they are just copies from the SPEC distribution. The ld command is: GNU ld (GNU Binutils) 2.29.1

Thanks for the guidance on path names. The prefix-replace just affects the string written to the object files, right? So we could post-process that file with other tools as well, correct?

Thanks again
--david

From: Teresa Johnson

distributed thinlto usage

2019 Jan 09

Fails with gold too:

    Library-native.o:Library.cpp:regway: error: undefined reference to 'vtable for regwayobj'
    /home/dcallahan/fbsource/fbcode/third-party-buck/platform007/tools/binutils/bin/gold/ld: the vtable symbol may be undefined because the class is missing its key function
    clang-8: error: linker command failed with exit code 1 (use -v to see invocation)

From: Teresa Johnson

[LLVMdev] LLVM 3.4 performance regressed?

2014 Apr 07

Hi,

It was suggested that I post my question regarding an LLVM 3.4 performance regression to this mailing list rather than Stack Overflow. So here is the link: https://stackoverflow.com/questions/22902034/llvm-3-4-performance-regressed

Thanks :)
Jens

--
Jens Tröger
http://savage.light-speed.de/

Stats for XDP actions

2019 Jun 20

David Ahern <dsahern at gmail.com> writes:

> On 4/18/19 8:24 AM, Toke Høiland-Jørgensen wrote:
>>>>
>>>
>>> Understood. Hopefully in March I will get some time to come back to this
>>> and propose an idea on what I would like to see - namely, the admin has
>>> a config option at load time to enable driver counters versus custom map

Stats for XDP actions

2019 Jun 20

Naming a random effect in lmer

2009 May 21

Dear guRus: I am using lmer for a mixed model that includes a random intercept for a set of effects that have the same distribution, Normal(0, sig2b). This set of effects is of variable size, so I am using an as.formula statement to create the formula for lmer. For example, if the set of random effects has dimension 8, then the lmer call is: Zs<-

Stats for XDP actions (was: Re: [PATCH net] virtio_net: Account for tx bytes and packets on sending xdp_frames)

2019 Apr 18

David Ahern <dsahern at gmail.com> writes:

> On 2/4/19 3:53 AM, Jesper Dangaard Brouer wrote:
>> On Sat, 2 Feb 2019 14:27:26 -0700
>> David Ahern <dsahern at gmail.com> wrote:
>>
>>> On 1/31/19 1:15 PM, Jesper Dangaard Brouer wrote:
>>>>>
>>>>> David, Jesper, care to chime in where we ended up in that last thread

llvm is illegally vectorizing with a recurrence on skylake

2019 May 02

Hi -- I have found a bug in an HPC code where llvm is vectorizing a loop on Skylake that has an obvious recurrence. I derived a small test case based on the original benchmark below:

    /*****************************************************************/
    static void
    __attribute__ ((always_inline))
    one(
        const int *restrict in,
        const int *const end,
        const unsigned shift,
        int *const restrict index,

How to get optimization remarks while testing with lnt in llvm

2018 Jun 05

Hi, I'm new to llvm and am trying to run benchmarks from the test-suite using lnt to check loop vectorization for various benchmarks. Tests are compiling and executing fine, but I am not getting optimization remarks when using flags like -Rpass-missed=loop-vectorize and -Rpass-analysis=loop-vectorize. I've tried running it like this:

    lnt runtest test-suite --sandbox SANDBOX --cc

Profile-based inlining status

2016 Mar 07

Hello, I'm learning how LLVM performs PGO (profile-guided optimization) by using the instrumentation-based profile build (-fprofile-instr-generate and -fprofile-instr-use). However, I found no difference in inlining behavior with and without PGO for a few SPEC benchmarks when checking the emitted optimization reports (-Rpass=inline -Rpass-missed=inline -Rpass-analysis=inline).

[LLVMdev] -gcolumn-info and PR 14106

2014 Jun 26

For -Rpass, and other related uses, I am looking at enabling column info by default. David pointed me at PR 14106, which seems to be the original motivation for introducing -gcolumn-info. However, I am finding no differences when using it on this test. I've tried building with/without -gcolumn-info and found almost no difference in compile time (+0.4%):

    $ /usr/bin/time clang -w -fno-builtin

Pattern not recognized as reduction

2018 Feb 12

Reduction Not Captured By LLVM

CODE_1

    #include <stdio.h>
    int main()
    {
        int sum[1000]={1,2,3,4};
        for (int i=1;i<1000;i++)
        {
            sum[0] +=sum[i-1];
        }
    }

On Loop Distribution pass

2016 Oct 09

Dear community,

Our team at IITH has been experimenting with the loop-distribution pass in LLVM. We see the following results on a few benchmarks.

    clang -O3 -mllvm -enable-loop-distribute -Rpass=loop-distribute file.c
    clang -O3 -mllvm -enable-loop-distribute -Rpass-analysis=loop-distribute file.c

TORCH

[PATCH] test_opus_decode: force integer constants unsigned

2013 Feb 27

This allows a clean build when using -Werror and the mentioned 'gcc -m32 -std=gnu90'.

AutoFDO sample profiles v. SelectInst,

2016 Aug 12

I am looking for advice on a problem observed with -fprofile-sample-use for samples built with the AutoFDO tool.

I took the "hmmer" benchmark out of SPEC2006. It is initially compiled with:

    clang++ -o hmmer -O3 -std=gnu89 -DSPEC_CPU -DNDEBUG -fno-strict-aliasing -w -g *.c

This baseline binary runs in about 164.2 seconds as reported by "perf stat". We build a sample file from this

Filter optimization remarks by the hotness of the code region

2016 May 11

> On May 11, 2016, at 3:37 AM, Hal Finkel <hfinkel at anl.gov> wrote:
>
> ----- Original Message -----
>> From: "Adam Nemet" <anemet at apple.com>
>> To: "Hal Finkel" <hfinkel at anl.gov>
>> Cc: "llvm-dev (llvm-dev at lists.llvm.org)" <llvm-dev at lists.llvm.org>
>> Sent: Wednesday, May 11, 2016 1:15:42 AM

Vectorization width not correct using #pragma clang loop vectorize_width

2018 Sep 20

Hello, I'm trying to set the vector width using #pragma clang loop vectorize_width(32), but I'm getting width 8 for the following kernel:

    #define M 128
    #define N 128
    #define SQRT_FUN(x) sqrtf(x)
    int main(int argc, char** argv)
    {
      /* Variable declaration/allocation. */
      double float_n = (double)N;
      double data[N*M];
      double corr[M*M];
      double mean[M];
      double stddev[M];
      uint32_t

Loop vectorization and unsafe floating point math

2020 Jun 24

Hi llvm-dev! We are doing some fuzz testing using C program generators, and one question that came up when generating a program with both floating-point arithmetic and loop pragmas was: is the loop vectorizer really allowed to vectorize a loop when it can't prove that it is safe to reorder FP math, even if there is a loop pragma that hints at a preferred width? When reading here

Filter optimization remarks by the hotness of the code region

2016 May 04

This idea came up a few times recently [1][2], so I’d like to start prototyping it. To summarize, we can emit optimization remarks using the -Rpass* options. These are currently emitted by optimizations like vectorization [3], unrolling, inlining and, since last week, loop distribution. For large programs, however, this can amount to a lot of diagnostic output to sift through. Filtering this by the
