similar to: optimization remarks

Displaying 20 results from an estimated 400 matches similar to: "optimization remarks"

2019 Jan 08
2
distributed thinlto usage
I am trying to work through the usage of thinlto for distributed builds. Here is the simple thinlto usage, just add -flto=thin everywhere, easy: clang++ -flto=thin -O3 -c -o CreateWay_.o -DSPEC_CPU -DNDEBUG -DSPEC_CPU_LITTLE_ENDIAN -Wno-dangling-else CreateWay_.cpp clang++ -flto=thin -O3 -c -o Places_.o -DSPEC_CPU -DNDEBUG -DSPEC_CPU_LITTLE_ENDIAN
2019 Jan 09
2
distributed thinlto usage
Thanks Teresa. Yes, it is astar; happy to send a tar of the sources, but they are just copies from the SPEC distribution. The ld command is: GNU ld (GNU Binutils) 2.29.1. Thanks for the guidance on path names. The prefix-replace just affects the string written to the object files, right? So we could post-process that file with other tools as well, correct? Thanks again --david From: Teresa Johnson
2019 Jan 09
2
distributed thinlto usage
Fails with gold too: Library-native.o:Library.cpp:regway: error: undefined reference to 'vtable for regwayobj' /home/dcallahan/fbsource/fbcode/third-party-buck/platform007/tools/binutils/bin/gold/ld: the vtable symbol may be undefined because the class is missing its key function clang-8: error: linker command failed with exit code 1 (use -v to see invocation) From: Teresa Johnson
2014 Apr 07
4
[LLVMdev] LLVM 3.4 performance regressed?
Hi, It was suggested that I post my question regarding a LLVM 3.4 performance regression to this mailing list, rather than stackoverflow. So here is the link: https://stackoverflow.com/questions/22902034/llvm-3-4-performance-regressed Thanks :) Jens -- Jens Tröger http://savage.light-speed.de/
2019 Jun 20
2
Stats for XDP actions
David Ahern <dsahern at gmail.com> writes: > On 4/18/19 8:24 AM, Toke Høiland-Jørgensen wrote: >>>> >>> >>> Understood. Hopefully in March I will get some time to come back to this >>> and propose an idea on what I would like to see - namely, the admin has >>> a config option at load time to enable driver counters versus custom map
2019 Jun 20
2
Stats for XDP actions
David Ahern <dsahern at gmail.com> writes: > On 4/18/19 8:24 AM, Toke Høiland-Jørgensen wrote: >>>> >>> >>> Understood. Hopefully in March I will get some time to come back to this >>> and propose an idea on what I would like to see - namely, the admin has >>> a config option at load time to enable driver counters versus custom map
2009 May 21
2
Naming a random effect in lmer
Dear guRus: I am using lmer for a mixed model that includes a random intercept for a set of effects that have the same distribution, Normal(0, sig2b). This set of effects is of variable size, so I am using an as.formula statement to create the formula for lmer. For example, if the set of random effects has dimension 8, then the lmer call is: Zs<-
2019 Apr 18
2
Stats for XDP actions (was: Re: [PATCH net] virtio_net: Account for tx bytes and packets on sending xdp_frames)
David Ahern <dsahern at gmail.com> writes: > On 2/4/19 3:53 AM, Jesper Dangaard Brouer wrote: >> On Sat, 2 Feb 2019 14:27:26 -0700 >> David Ahern <dsahern at gmail.com> wrote: >> >>> On 1/31/19 1:15 PM, Jesper Dangaard Brouer wrote: >>>>> >>>>> David, Jesper, care to chime in where we ended up in that last thread
2019 May 02
2
llvm is illegally vectorizing with a recurrence on skylake
Hi -- I have found a bug in an HPC code where llvm is vectorizing a loop on Skylake that has an obvious recurrence. I derived a small test case based on the original benchmark below: /*****************************************************************/ static void __attribute__ ((always_inline)) one( const int *restrict in, const int *const end, const unsigned shift, int *const restrict index,
2018 Jun 05
2
How to get optimization remarks while testing with lnt in llvm
Hi, I'm new to LLVM and am trying to run benchmarks from the test-suite using lnt to check loop vectorization for various benchmarks. Tests are compiling and executing fine, but I am not getting optimization remarks while using flags like -Rpass-missed=loop-vectorize and -Rpass-analysis=loop-vectorize. I've tried running it like this: lnt runtest test-suite --sandbox SANDBOX --cc
2016 Mar 07
3
Profile-based inlining status
Hello, I'm learning how LLVM performs PGO (profile-guided optimization) by using the instrumentation-based profile build (-fprofile-instr-generate and -fprofile-instr-use). However, I found there is no difference in inlining behavior with and without PGO for a few SPEC benchmarks when checking the emitted optimization reports (-Rpass=inline -Rpass-missed=inline -Rpass-analysis=inline).
2014 Jun 26
7
[LLVMdev] -gcolumn-info and PR 14106
For -Rpass, and other related uses, I am looking at enabling column info by default. David pointed me at PR 14106, which seems to be the original motivation for introducing -gcolumn-info. However, I am finding no differences when using it on this test. I've tried building with/without -gcolumn-info and found almost no difference in compile time (+0.4%): $ /usr/bin/time clang -w -fno-builtin
2018 Feb 12
1
Pattern not recognized as reduction
Reduction Not Captured By LLVM. CODE_1: #include <stdio.h> int main() { int sum[1000]={1,2,3,4}; for (int i=1;i<1000;i++) { sum[0] +=sum[i-1]; } }
2016 Oct 09
3
On Loop Distribution pass
Dear community, Our team at IITH has been experimenting with the loop-distribution pass in LLVM. We see the following results on a few benchmarks. clang -O3 -mllvm -enable-loop-distribute -Rpass=loop-distribute file.c clang -O3 -mllvm -enable-loop-distribute -Rpass-analysis=loop-distribute file.c TORCH
2013 Feb 27
1
[PATCH] test_opus_decode: force integer constants unsigned
This allows a clean build when using -Werror and the mentioned 'gcc -m32 -std=gnu90'.
2016 Aug 12
3
AutoFDO sample profiles v. SelectInst,
I am looking for advice on a problem observed with -fprofile-sample-use for samples built with the AutoFDO tool. I took the "hmmer" benchmark out of SPEC2006. It is initially compiled with: clang++ -o hmmer -O3 -std=gnu89 -DSPEC_CPU -DNDEBUG -fno-strict-aliasing -w -g *.c This baseline binary runs in about 164.2 seconds as reported by "perf stat". We build a sample file from this
2016 May 11
4
Filter optimization remarks by the hotness of the code region
> On May 11, 2016, at 3:37 AM, Hal Finkel <hfinkel at anl.gov> wrote: > > ----- Original Message ----- >> From: "Adam Nemet" <anemet at apple.com> >> To: "Hal Finkel" <hfinkel at anl.gov> >> Cc: "llvm-dev (llvm-dev at lists.llvm.org)" <llvm-dev at lists.llvm.org> >> Sent: Wednesday, May 11, 2016 1:15:42 AM
2018 Sep 20
2
Vectorization width not correct using #pragma clang loop vectorize_width
Hello, I'm trying to set the vector width using #pragma clang loop vectorize_width(32), but I'm getting width 8 for the following kernel: #define M 128 #define N 128 #define SQRT_FUN(x) sqrtf(x) int main(int argc, char** argv) { /* Variable declaration/allocation. */ double float_n = (double)N; double data[N*M]; double corr[M*M]; double mean[M]; double stddev[M]; uint32_t
2020 Jun 24
2
Loop vectorization and unsafe floating point math
Hi llvm-dev! We are doing some fuzzy testing using C program generators, and one question that came up when generating a program with both floating-point arithmetic and loop pragmas was: is the loop vectorizer really allowed to vectorize a loop when it can't prove that it is safe to reorder FP math, even if there is a loop pragma that hints at a preferred width? When reading here
2016 May 04
4
Filter optimization remarks by the hotness of the code region
This idea came up a few times recently [1][2], so I’d like to start prototyping it. To summarize, we can emit optimization remarks using the -Rpass* options. These are currently emitted by optimizations like vectorization[3], unrolling, inlining and, since last week, loop distribution. For large programs, however, this can amount to a lot of diagnostic output to sift through. Filtering this by the