Displaying 20 results from an estimated 7000 matches similar to: "[LLVMdev] addPassesToEmitFile and Intel syntax"
2016 Feb 22
2
raw_pwrite_stream to string or stdout?
TargetMachine::CGFT_AssemblyFile is exactly what I am trying to write out.
Frank
On 02/22/2016 11:06 AM, Rafael Espíndola wrote:
> On 19 February 2016 at 16:16, Frank Winter via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
>> TargetMachine::addPassesToEmitFile(..)
>> requires as its 2nd argument a raw_pwrite_stream.
>>
>> Is it possible to create such a
2013 May 22
1
[LLVMdev] How to write output of a backend to a memory buffer instead of into a file?
Right now, I am using
TargetMachine::addPassesToEmitFile
<http://llvm.org/docs/doxygen/html/classllvm_1_1LLVMTargetMachine.html#a356929c1f0d202e4a9d3202aff1dbb05>
to write the output of a backend to a file. How can I tell LLVM to write
into a memory buffer instead?
2013 Nov 15
3
[LLVMdev] Limit loop vectorizer to SSE
On 15 November 2013 20:05, Frank Winter <fwinter at jlab.org> wrote:
> Good catch! That was the problem in my case too. I totally
> overlooked the alignment requirement for AVX.
I wonder if the validation mechanism shouldn't have caught it earlier... Do
you guys run validate on the modules before JIT-ing?
--renato
2013 Nov 06
3
[LLVMdev] loop vectorizer
Good that you bring this up. I still have no solution to this
vectorization problem.
However, I can rewrite the code and insert a second loop which
eliminates the 'urem' and 'div' instructions in the index calculations.
In this case, the inner loop's trip count would be equal to the SIMD
length and the loop vectorizer ignores the loop. Unrolling the loop and
SLP is not an
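
A minimal sketch of the rewrite described in this message, not the poster's actual code. The index formula, the stride of 8, and the names kSimdLen, kStride, indices_flat and indices_nested are all assumptions made for illustration. The flat loop computes its index with a division and a remainder; the nested version replaces both with an outer block counter and an inner lane counter whose trip count equals the SIMD length.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

constexpr std::uint64_t kSimdLen = 4;  // assumed SIMD length (hypothetical)
constexpr std::uint64_t kStride  = 8;  // assumed distance between blocks (hypothetical)

// Flat loop: the index calculation needs a udiv and a urem on every iteration.
std::vector<std::uint64_t> indices_flat(std::uint64_t n) {
    std::vector<std::uint64_t> idx;
    idx.reserve(n);
    for (std::uint64_t i = 0; i < n; ++i)
        idx.push_back((i / kSimdLen) * kStride + (i % kSimdLen));
    return idx;
}

// Nested loops: no division or remainder. The inner trip count is kSimdLen.
// Assumes n is a multiple of kSimdLen.
std::vector<std::uint64_t> indices_nested(std::uint64_t n) {
    std::vector<std::uint64_t> idx;
    idx.reserve(n);
    for (std::uint64_t block = 0; block < n / kSimdLen; ++block)
        for (std::uint64_t lane = 0; lane < kSimdLen; ++lane)
            idx.push_back(block * kStride + lane);
    return idx;
}

// Both formulations visit the same indices in the same order.
void check_equivalence(std::uint64_t n) {
    assert(indices_flat(n) == indices_nested(n));
}
```

Calling check_equivalence(16) confirms that the two loop shapes produce identical index sequences when n is a multiple of the assumed SIMD length.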
2013 Oct 31
5
[LLVMdev] loop vectorizer
On 30 October 2013 18:40, Frank Winter <fwinter at jlab.org> wrote:
> const std::uint64_t ir0 = (i+0)%4; // not working
>
I thought this would be the case when I saw the original expression. Maybe
we need to teach modulo arithmetic to SCEV?
--renato
2009 Mar 08
2
[LLVMdev] addPassesToEmitFile
Hi,
A long time ago (llvm-svn, June 2008) I asked here about a way to output
the assembly code of my JIT-generated code to a string, so I could
read it on the screen. I came up with this solution:
std::string Err;
const llvm::TargetMachineRegistry::entry* _arch =
llvm::TargetMachineRegistry::getClosestTargetForJIT(Err);
std::string FeaturesStr;
llvm::TargetMachine*
2015 Jul 01
3
[LLVMdev] SLP vectorizer on AVX feature
On 1 July 2015 at 21:22, Frank Winter <fwinter at jlab.org> wrote:
> there were two follow-up emails.
I only got one... weird...
> The issue is solved. The SLP vectorizer has
> a magic number built into the code which determines the max. vector length
> to search for. That was set to 128 bits. Increasing it to 256 bits solved
> the issue.
That looks like a simple fix. Is
2009 Mar 08
0
[LLVMdev] addPassesToEmitFile
Well, I've been trying this for hours, but soon after I sent the
email I found something. However, it is quite intriguing.
I just changed the order of the static libraries that I was linking.
How can this be possible??
I'm using CMake for building my llvm projects, so I choose the order
and I pick the .a libraries I want to link by hand...
Thank you,
alvaro
2009/3/8 Álvaro
2009 Jun 16
2
[LLVMdev] x86 Intel Syntax and MASM 9.x
Hi Eli,
Thanks for the response. I have one question inline.
Regards,
Ben
[...]
> The main problem that I have hit is regarding the use of the CL register in the
> shift instructions. The problem is that ATT syntax states that it should be
> referenced as "%cl" while Intel says just "cl", but these references occur in
> X86InstrInfo.td and this means that it is shared
2009 Jun 16
0
[LLVMdev] x86 Intel Syntax and MASM 9.x
On Mon, Jun 15, 2009 at 11:21 PM, Gaster,
Benedict<Benedict.Gaster at amd.com> wrote:
> I can get this to work with additional changes to X86InstrInfo.cpp but
> the problem I have with this approach is that it introduces a lot of
> duplication, when all I really want to do is parameterize the final
> field in the string "shl{b}\t{%cl, $dst|$dst, %CL}". I was wondering
2013 Oct 31
3
[LLVMdev] loop vectorizer misses opportunity, exploit
Hi Frank,
This loop should be vectorized by the SLP-vectorizer. It has several scalars (C[0], C[1] … ) that can be merged into a vector. The SLP vectorizer can’t figure out that the stores are consecutive because SCEV can’t analyze the OR in the index calculation:
%2 = and i64 %i.04, 3
%3 = lshr i64 %i.04, 2
%4 = shl i64 %3, 3
%5 = or i64 %4, %2
%11 = getelementptr inbounds float*
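
A small sketch of why the OR in this index behaves like an ADD, which is the fact SCEV would need in order to see the stores as consecutive. The function names are hypothetical; only the bit operations come from the IR above. The shifted term always has its low three bits clear and the masked term only uses the low two, so no bits overlap and '|' gives the same result as '+'.

```cpp
#include <cassert>
#include <cstdint>

// Index as in the IR above: %5 = or (shl (lshr i, 2), 3), (and i, 3).
constexpr std::uint64_t index_with_or(std::uint64_t i) {
    return ((i >> 2) << 3) | (i & 3);
}

// Same index written with '+'. The operands share no set bits,
// so the addition never carries and matches the OR.
constexpr std::uint64_t index_with_add(std::uint64_t i) {
    return ((i >> 2) << 3) + (i & 3);
}

static_assert(index_with_or(5) == 9, "block 1, lane 1 maps to 8 + 1");
static_assert(index_with_or(5) == index_with_add(5), "or and add agree");

// Runtime check over a range: or == add, and the four lanes of each block
// map to four consecutive indices starting at the block's first index.
inline void check_indices(std::uint64_t limit) {
    for (std::uint64_t i = 0; i < limit; ++i) {
        assert(index_with_or(i) == index_with_add(i));
        assert(index_with_or(i) == index_with_or(i & ~std::uint64_t{3}) + (i & 3));
    }
}
```

For example, check_indices(1024) passes, which shows that within each group of four iterations the index advances by exactly one per iteration.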
2013 Nov 15
0
[LLVMdev] Limit loop vectorizer to SSE
Hmm.. I don't quite understand. How can a module validator
catch this, when it's the pointers, i.e. the payload, you pass
as function arguments that need to be aligned.. ?!
Frank
On 15/11/13 15:16, Renato Golin wrote:
> On 15 November 2013 20:05, Frank Winter <fwinter at jlab.org
> <mailto:fwinter at jlab.org>> wrote:
>
> Good catch! That was the problem in my
2016 Feb 22
2
raw_pwrite_stream to string or stdout?
Note that raw_fd_ostream is not seekable, and hence will not be suitable as
the addPassesToEmitFile output stream.
2016-02-22 18:27 GMT+02:00 Rafael Espíndola <llvm-dev at lists.llvm.org>:
> On 22 February 2016 at 11:16, Frank Winter <fwinter at jlab.org> wrote:
> > TargetMachine::CGFT_AssemblyFile is exactly what I am trying to write
> out.
>
> I see.
>
> For
2013 Nov 06
0
[LLVMdev] loop vectorizer
Sent from my iPhone
> On Nov 5, 2013, at 7:39 PM, Frank Winter <fwinter at jlab.org> wrote:
>
> Good that you bring this up. I still have no solution to this vectorization problem.
>
> However, I can rewrite the code and insert a second loop which eliminates the 'urem' and 'div' instructions in the index calculations. In this case, the inner loop's trip
2013 Nov 12
2
[LLVMdev] Limit loop vectorizer to SSE
On 12 November 2013 15:14, Frank Winter <fwinter at jlab.org> wrote:
> I am asking because the option 'force-vector-width' is too restrictive.
> I would like to leave open the possibility to use vector width 2.
I was about to say that, and you saved us both one cycle. ;)
What you could do is to force an architecture that doesn't have AVX, only
SSE. I'm not sure how
2013 Nov 12
2
[LLVMdev] Limit loop vectorizer to SSE
On 12 November 2013 15:53, Frank Winter <fwinter at jlab.org> wrote:
> .. forcing the vector size to 4 does not prevent using AVX.
>
Sure. That's more for tests than anything else.
So, there are ways of disabling stuff in Clang, for instance "-mattr=-avx"
or "-target-feature -avx", but I'm not sure how you're doing it in the JIT.
I'm also not sure
2009 Mar 09
1
[LLVMdev] addPassesToEmitFile
When you say 'static libraries' do you mean static libraries or shared
objects (.so)... Because if you mean shared objects, then it could
very well explain your crash.
On Mar 9, 12:16 am, Álvaro Castro Castilla
<alvaro.castro.casti... at gmail.com> wrote:
> Well, I've been trying this for hours, but soon after I sent the
> email I found something. However, it is quite
2014 Aug 07
3
[LLVMdev] MCJIT generates MOVAPS on unaligned address
It's not reproducible with 'opt'. I call the SLP pass from my
application and only then the wrong IR gets generated.
On the attached module I call via the function pass manager:
1) TargetLibraryInfo with the target triple
2) Set the data layout
3) Basic Alias Analysis
4) SLP vectorizer
This produces the wrong IR. On the other hand running the attached
module through 'opt
2015 Jul 01
3
[LLVMdev] SLP vectorizer on AVX feature
Frank,
It sounds like the SLP vectorizer thinks that it is more profitable to use 128bit wide operations (because 256bit operations are double pumped on Sandybridge). Did you see a different result on Haswell?
Thanks,
Nadav
> On Jul 1, 2015, at 11:06 AM, Frank Winter <fwinter at jlab.org> wrote:
>
> I realized that the function parameters had no alignment attributes on them.
2014 Aug 08
2
[LLVMdev] How to broaden the SLP vectorizer's search
Hi Frank,
Thanks for working on this. Please look at vectorizeStoreChains. In this function we process all of the stores in the function in buckets of 16 elements because constructing consecutive stores is implemented using an O(n^2) algorithm. You can try to increase this threshold to 128 and see if it helps.
I also agree with Renato and Chad that adding a flag to tell the SLP-vectorizer to