Displaying 20 results from an estimated 400 matches similar to: "optimization remarks"
2019 Jan 08
2
distributed thinlto usage
I am trying to work through the usage of thinlto for distributed builds.
Here is the simple thinlto usage, just add -flto=thin everywhere, easy:
clang++ -flto=thin -O3 -c -o CreateWay_.o -DSPEC_CPU -DNDEBUG -DSPEC_CPU_LITTLE_ENDIAN -Wno-dangling-else CreateWay_.cpp
clang++ -flto=thin -O3 -c -o Places_.o -DSPEC_CPU -DNDEBUG -DSPEC_CPU_LITTLE_ENDIAN
2019 Jan 09
2
distributed thinlto usage
Thanks Teresa
Yes, it is astar. Happy to send a tar of the sources, but they are just copies from the SPEC distribution.
The ld command is:
GNU ld (GNU Binutils) 2.29.1
Thanks for the guidance on path names. The prefix-replace just affects the string written to the object files, right? So we could post-process that file with other tools as well, correct?
Thanks again
--david
From: Teresa Johnson
2019 Jan 09
2
distributed thinlto usage
Fails with gold too:
Library-native.o:Library.cpp:regway: error: undefined reference to 'vtable for regwayobj'
/home/dcallahan/fbsource/fbcode/third-party-buck/platform007/tools/binutils/bin/gold/ld: the vtable symbol may be undefined because the class is missing its key function
clang-8: error: linker command failed with exit code 1 (use -v to see invocation)
From: Teresa Johnson
2014 Apr 07
4
[LLVMdev] LLVM 3.4 performance regressed?
Hi,
It was suggested that I post my question regarding an LLVM 3.4 performance
regression to this mailing list, rather than stackoverflow. So here is
the link:
https://stackoverflow.com/questions/22902034/llvm-3-4-performance-regressed
Thanks :)
Jens
--
Jens Tröger
http://savage.light-speed.de/
2019 Jun 20
2
Stats for XDP actions
David Ahern <dsahern at gmail.com> writes:
> On 4/18/19 8:24 AM, Toke Høiland-Jørgensen wrote:
>>>>
>>>
>>> Understood. Hopefully in March I will get some time to come back to this
>>> and propose an idea on what I would like to see - namely, the admin has
>>> a config option at load time to enable driver counters versus custom map
2019 Jun 20
2
Stats for XDP actions
David Ahern <dsahern at gmail.com> writes:
> On 4/18/19 8:24 AM, Toke Høiland-Jørgensen wrote:
>>>>
>>>
>>> Understood. Hopefully in March I will get some time to come back to this
>>> and propose an idea on what I would like to see - namely, the admin has
>>> a config option at load time to enable driver counters versus custom map
2009 May 21
2
Naming a random effect in lmer
Dear guRus:
I am using lmer for a mixed model that includes a random intercept for a
set of effects that have the same distribution, Normal(0, sig2b). This set
of effects is of variable size, so I am using an as.formula statement to
create the formula for lmer. For example, if the set of random effects has
dimension 8, then the lmer call is:
Zs<-
2019 Apr 18
2
Stats for XDP actions (was: Re: [PATCH net] virtio_net: Account for tx bytes and packets on sending xdp_frames)
David Ahern <dsahern at gmail.com> writes:
> On 2/4/19 3:53 AM, Jesper Dangaard Brouer wrote:
>> On Sat, 2 Feb 2019 14:27:26 -0700
>> David Ahern <dsahern at gmail.com> wrote:
>>
>>> On 1/31/19 1:15 PM, Jesper Dangaard Brouer wrote:
>>>>>
>>>>> David, Jesper, care to chime in where we ended up in that last thread
2019 May 02
2
llvm is illegally vectorizing with a recurrence on skylake
Hi -- I have found a bug in an HPC code where llvm is vectorizing a loop on
Skylake that has an obvious recurrence. I derived a small test case based
on the original benchmark below:
/*****************************************************************/
static void __attribute__ ((always_inline)) one(
const int *restrict in, const int *const end,
const unsigned shift, int *const restrict index,
2018 Jun 05
2
How to get optimization remarks while testing with lnt in llvm
Hi, I'm new to llvm and am trying to run benchmarks from the test-suite
using lnt to check loop-vectorization for various benchmarks.
Tests are compiling and executing fine, but I am not getting optimization
remarks while using flags like -Rpass-missed=loop-vectorize and
-Rpass-analysis=loop-vectorize
I've tried running it like this:
lnt runtest test-suite --sandbox SANDBOX --cc
2016 Mar 07
3
Profile-based inlining status
Hello,
I'm learning how LLVM performs PGO (profile-guided optimizations) by using
the instrumentation-based profile build (-fprofile-instr-generate and
-fprofile-instr-use).
However, I found no difference in inlining behavior with and without PGO
for a few SPEC benchmarks when checking the emitted optimization
reports (-Rpass=inline -Rpass-missed=inline -Rpass-analysis=inline).
2014 Jun 26
7
[LLVMdev] -gcolumn-info and PR 14106
For -Rpass, and other related uses, I am looking at enabling column info by
default. David pointed me at PR 14106, which seems to be the original
motivation for introducing -gcolumn-info. However, I am finding no
differences when using it on this test. I've tried building with/without
-gcolumn-info and found almost no difference in compile time (+0.4%):
$ /usr/bin/time clang -w -fno-builtin
2018 Feb 12
1
Pattern not recognized as reduction
Reduction Not Captured By LLVM
CODE_1
------------------------------------------------------------
#include <stdio.h>
int main()
{
int sum[1000]={1,2,3,4};
for (int i=1;i<1000;i++)
{
sum[0] +=sum[i-1];
}
}
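For comparison, a minimal sketch of the same computation written with a scalar accumulator, a shape that reduction detection may find easier to match than a load and store of sum[0] on every iteration. The function names and the length parameter n are hypothetical and not from the original post, and whether a given LLVM version vectorizes either form is not claimed here.

```c
#include <stddef.h>

/* In-place form from the post: every iteration loads and stores sum[0].
 * Requires n >= 1. */
int reduce_in_place(int *sum, size_t n) {
    for (size_t i = 1; i < n; i++)
        sum[0] += sum[i - 1];
    return sum[0];
}

/* Hypothetical rewrite: carry the running total in a local scalar.
 * The i == 1 iteration of the original reads sum[0] itself, so it is
 * peeled off here as sum[0] + sum[0]. Requires n >= 1. */
int reduce_scalar(const int *sum, size_t n) {
    if (n < 2)
        return sum[0];
    int acc = sum[0] + sum[0];
    for (size_t i = 2; i < n; i++)
        acc += sum[i - 1];
    return acc;
}
```

Both functions return the same value for the same input; the scalar version leaves the array unmodified, so the caller has to store the result back into sum[0] if the in-place effect is needed.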
2016 Oct 09
3
On Loop Distribution pass
Dear community,
Our team at IITH has been experimenting with the loop-distribution pass in
LLVM. We see the following results on a few benchmarks.
clang -O3 -mllvm -enable-loop-distribute -Rpass=loop-distribute file.c
clang -O3 -mllvm -enable-loop-distribute -Rpass-analysis=loop-distribute file.c
TORCH
2013 Feb 27
1
[PATCH] test_opus_decode: force integer constants unsigned
This allows a clean build when using -Werror and the mentioned 'gcc
-m32 -std=gnu90'.
-------------- next part --------------
2016 Aug 12
3
AutoFDO sample profiles v. SelectInst,
I am looking for advice on a problem observed with
-fprofile-sample-use for samples built with the AutoFDO tool.
I took the "hmmer" benchmark out of SPEC2006.
It is initially compiled:
clang++ -o hmmer -O3 -std=gnu89 -DSPEC_CPU -DNDEBUG -fno-strict-aliasing -w -g *.c
This baseline binary runs in about 164.2 seconds as reported by "perf stat".
We build a sample file from this
2016 May 11
4
Filter optimization remarks by the hotness of the code region
> On May 11, 2016, at 3:37 AM, Hal Finkel <hfinkel at anl.gov> wrote:
>
> ----- Original Message -----
>> From: "Adam Nemet" <anemet at apple.com>
>> To: "Hal Finkel" <hfinkel at anl.gov>
>> Cc: "llvm-dev (llvm-dev at lists.llvm.org)" <llvm-dev at lists.llvm.org>
>> Sent: Wednesday, May 11, 2016 1:15:42 AM
2018 Sep 20
2
Vectorization width not correct using #pragma clang loop vectorize_width
Hello,
I'm trying to set the vector width using #pragma clang loop vectorize_width(32)
but I'm getting width 8 for the following kernel:
#define M 128
#define N 128
#define SQRT_FUN(x) sqrtf(x)
int main(int argc, char** argv)
{
/* Variable declaration/allocation. */
double float_n = (double)N;
double data[N*M];
double corr[M*M];
double mean[M];
double stddev[M];
uint32_t
2020 Jun 24
2
Loop vectorization and unsafe floating point math
Hi llvm-dev!
We are doing some fuzz testing using C program generators,
and one question that came up when generating a program with
both floating point arithmetic and loop pragmas was:
Is the loop vectorizer really allowed to vectorize a loop when
it can't prove that it is safe to reorder fp math, even if
there is a loop pragma that hints about a preferred width.
When reading here
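To make the reordering concern concrete, here is a small sketch of why floating point sums are order-sensitive. It is not taken from the thread: the function names are hypothetical, and the two-lane split only imitates by hand the kind of partial-sum reordering a width-2 vectorized reduction would perform. It assumes IEEE 754 double arithmetic with round-to-nearest and no fast-math flags.

```c
#include <stddef.h>

/* Strict source order: one accumulator, left to right. */
double sum_in_order(const double *v, size_t n) {
    double acc = 0.0;
    for (size_t i = 0; i < n; i++)
        acc += v[i];
    return acc;
}

/* Reordered: even and odd elements go into separate partial sums that
 * are combined at the end, as a 2-wide vector reduction would do. */
double sum_two_lanes(const double *v, size_t n) {
    double lane0 = 0.0, lane1 = 0.0;
    size_t i = 0;
    for (; i + 1 < n; i += 2) {
        lane0 += v[i];
        lane1 += v[i + 1];
    }
    if (i < n)          /* leftover element when n is odd */
        lane0 += v[i];
    return lane0 + lane1;
}
```

For the input {1e16, 1.0, -1e16, 1.0}, the in-order sum gives 1.0 because the first 1.0 is absorbed when added to 1e16, while the two-lane sum gives 2.0. A transformation that changes the association of the additions can therefore change the observable result.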
2016 May 04
4
Filter optimization remarks by the hotness of the code region
This idea came up a few times recently [1][2] so I’d like to start prototyping it. To summarize, we can emit optimization remarks using the -Rpass* options. These are currently emitted by optimizations like vectorization [3], unrolling, inlining and, since last week, loop distribution.
For large programs however this can amount to a lot of diagnostics output to sift through. Filtering this by the