
Displaying 20 results from an estimated 10000 matches similar to: "[LLVMdev] -s-"

2007 Jan 08
1
[LLVMdev] Stack switching, Active Objects and LLVM
[Apologies. This is a repost because the earlier post didn't have a subject heading and might have been missed by members] Hello, I wish to have lots of little stacks and be able to switch rapidly between them. I could do CPS transformation but don't like the overhead of creating gc'able continuation thunks and the copying from stack to heap. I'd like to explore a no-copy
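As a point of reference for the "lots of little stacks" pattern (a user-level sketch of my own using the POSIX ucontext API, not the LLVM-level mechanism the poster is asking about), the following switches onto a small, separately allocated stack and back without copying anything:

  #include <ucontext.h>
  #include <cstdio>

  static ucontext_t main_ctx, task_ctx;
  static char task_stack[16 * 1024];   // one small, separately allocated stack

  static void task() {
    std::puts("running on the small stack");
    // falling off the end resumes uc_link, i.e. main_ctx
  }

  int main() {
    getcontext(&task_ctx);
    task_ctx.uc_stack.ss_sp = task_stack;
    task_ctx.uc_stack.ss_size = sizeof(task_stack);
    task_ctx.uc_link = &main_ctx;        // where to go when task() returns
    makecontext(&task_ctx, task, 0);
    swapcontext(&main_ctx, &task_ctx);   // switch stacks; nothing is copied
    std::puts("back on the original stack");
    return 0;
  }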
2015 Jan 18
2
[LLVMdev] Marking *some* pointers for gc
Hi, I just found out that it's not practical to mark only some pointers for GC. Consider: %a = i8 addrspace(1)* malloc(...) %b = i8* alloca(...) The issue then becomes that routine functions declared: declare i1 foo(i8 addrspace(1)*) have a choice of accepting either gc'able or non-gc'able pointers. Is there no way to have a reasonable mix of both? Ram
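To illustrate the distinction being drawn (the use of addrspace(1) for GC-managed pointers is carried over from the mail; the C++ below is only a sketch against the LLVM C++ API): the two pointer types are distinct, so a declaration commits to exactly one kind of parameter.

  #include "llvm/IR/Function.h"
  #include "llvm/IR/LLVMContext.h"
  #include "llvm/IR/Module.h"
  #include "llvm/Support/raw_ostream.h"
  using namespace llvm;

  int main() {
    LLVMContext Ctx;
    Module M("gc_mix", Ctx);

    // i8* in the default address space: an ordinary, non-GC pointer.
    PointerType *RawPtr = PointerType::get(Type::getInt8Ty(Ctx), 0);
    // i8 addrspace(1)*: the address space marks the pointer as GC-managed.
    PointerType *GCPtr = PointerType::get(Type::getInt8Ty(Ctx), 1);

    // declare i1 @foo(i8 addrspace(1)*) -- accepts only GC-managed pointers;
    // passing a default-address-space pointer is a type mismatch, which is
    // the mixing problem the mail describes.
    FunctionType *FT = FunctionType::get(Type::getInt1Ty(Ctx), {GCPtr}, false);
    Function::Create(FT, Function::ExternalLinkage, "foo", &M);

    (void)RawPtr;
    M.print(outs(), nullptr);
    return 0;
  }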
2017 Apr 17
2
[RFC] Adding CPS call support
> Is there a reason you can't use the algorithm from the paper "A Correspondence between Continuation Passing Style and Static Single Assignment Form" to convert your IR to LLVM's SSA IR? Yes, there are a few reasons. Undoing the CPS transformation earlier in the pipeline would mean that we are using LLVM's built-in stack. The special layout and usage of the stack in
2008 Sep 30
1
[LLVMdev] Integer handling
WRT: Message-based concurrency, you might want to check out the papers here: http://www.malhar.net/sriram/
2017 Apr 19
3
[RFC] Adding CPS call support
> The semantics around inlining alone are problematic enough to warrant serious hesitation. There are nicer ways to embed CPS call/return into LLVM; I just figured that there would not be much support for adding a new terminator because it would change a lot of code. Ideally we would have a block terminator like: cps call ghccc @bar (.. args ..) returnsto label %retpt Where the
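For contrast with the proposed terminator, here is a sketch (mine, not from the thread) of how such a transfer has to be written with today's IR: a guaranteed musttail call in the GHC calling convention immediately followed by ret, with no way to name a return-point block.

  #include "llvm/IR/IRBuilder.h"
  #include "llvm/IR/Module.h"
  #include "llvm/Support/raw_ostream.h"
  using namespace llvm;

  int main() {
    LLVMContext Ctx;
    Module M("cps_sketch", Ctx);
    IRBuilder<> B(Ctx);

    FunctionType *FT =
        FunctionType::get(Type::getVoidTy(Ctx), {Type::getInt64Ty(Ctx)}, false);
    Function *Bar = Function::Create(FT, Function::ExternalLinkage, "bar", &M);
    Bar->setCallingConv(CallingConv::GHC);
    Function *Caller = Function::Create(FT, Function::ExternalLinkage, "caller", &M);
    Caller->setCallingConv(CallingConv::GHC);

    B.SetInsertPoint(BasicBlock::Create(Ctx, "entry", Caller));
    // Today's encoding: musttail call ghccc @bar(...) ; ret void
    CallInst *CI = B.CreateCall(FT, Bar, {Caller->getArg(0)});
    CI->setCallingConv(CallingConv::GHC);
    CI->setTailCallKind(CallInst::TCK_MustTail);
    B.CreateRetVoid();

    M.print(outs(), nullptr);
    return 0;
  }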
2017 Apr 17
2
[RFC] Adding CPS call support
(Sorry for the 2nd email Eli, I forgot to reply-all). > I'm not following how explicitly representing the return address of a call in the IR before isel actually solves any relevant issue. We already pass the return address implicitly as an argument to every call; you can retrieve it with llvm.returnaddress if you need it. Unfortunately the @llvm.returnaddress intrinsic does not solve
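A small aside on the return-address point: at the source level, Clang lowers __builtin_return_address(0) to the @llvm.returnaddress intrinsic mentioned above, and it only yields the return site of the current frame.

  #include <cstdio>

  void callee() {
    // Lowered to @llvm.returnaddress(i32 0): the address in the caller that
    // this function will return to.
    std::printf("returning to %p\n", __builtin_return_address(0));
  }

  int main() {
    callee();
    return 0;
  }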
2017 Apr 17
9
[RFC] Adding CPS call support
Summary ======= There is a need for dedicated continuation-passing style (CPS) calls in LLVM to support functional languages. Herein I describe the problem and propose a solution. Feedback and/or tips are greatly appreciated, as our goal is to implement these changes so they can be merged into LLVM trunk. Problem ======= Implementations of functional languages like Haskell and ML (e.g., GHC
2013 Sep 27
0
[LLVMdev] Trip count and Loop Vectorizer
Sriram, The problem is that you want to unroll/vectorize many loops with non-constant loop count - it is a trade-off of which case you estimate as more likely. int foo(int *ptr, int n) { for ( .. i <n) ptr[i] = ... } The question is: is it more likely to have “n” such that unrolling is beneficial or not. Now, you could probably write an analysis that bounds the loop count (for the
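A fleshed-out version of the loop sketched above (the body and the void return type are mine; the original elides them): the trip count depends on the runtime value of n, so it is not a compile-time constant.

  void foo(int *ptr, int n) {
    for (int i = 0; i < n; ++i)   // trip count is n, unknown at compile time
      ptr[i] = i;                 // any per-element store
  }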
2006 Jan 11
2
[LLVMdev] Explicitly Managed Stack Frames
I was wondering what the current state of this (explicitly managed stack frames) is. Is it being worked on? If not, how hard do you think it would be for me to add it? I am more than willing to work on it, but don't have any experience with LLVM, so it might take a while. The reason I ask is that I am starting work on a project to make a language similar to ML with which to experiment,
2006 Jan 12
0
[LLVMdev] Explicitly Managed Stack Frames
On Wed, 11 Jan 2006, Ben Chambers wrote: > I was wondering what the current state of this (explicitly managed stack frames) > is. Is it being worked on? If not, how hard do you think it would be for me to > add it? I'm not sure what you mean. It depends on correct tail calls, but no other LLVM-level support. > I am more than willing to work on it, but don't have any
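The "correct tail calls" referred to here are guaranteed tail calls. As a modern illustration only (Clang's musttail statement attribute postdates this 2006 thread), a call marked this way must reuse the caller's frame, so the stack never grows with the recursion:

  int finish(int acc) { return acc; }

  int step(int acc) {
    if (acc >= 1000000)
      __attribute__((musttail)) return finish(acc);
    __attribute__((musttail)) return step(acc + 1);   // guaranteed tail call
  }

  int main() { return step(0) == 1000000 ? 0 : 1; }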
2013 Sep 27
2
[LLVMdev] Trip count and Loop Vectorizer
Hi Nadav, Thanks for the response. I forgot to mention that there is an upper limit of 16 for the Trip Count check: TinyTripCountVectorThreshold = 16; if (TC > 0u && TC < TinyTripCountVectorThreshold). So right now, for any loop with a Trip Count of 0, or with a value >= 16, LV will unroll. With the change to the lower bound, it will also include the loop with 0 trip count. SCEV returns 0
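To spell the check out (the names and the value 16 are as quoted in the thread; the rest is a standalone paraphrase of my own, not the actual vectorizer code): getSmallConstantTripCount returns an unsigned value and uses 0 to mean "unknown", so relaxing the lower bound to TC >= 0u, which is vacuously true for an unsigned, also sweeps unknown-count loops into the "tiny loop" bucket.

  #include <cstdio>

  constexpr unsigned TinyTripCountVectorThreshold = 16;

  // Current check quoted in the thread: TC == 0 ("unknown") is excluded.
  bool isTinyLoop(unsigned TC) {
    return TC > 0u && TC < TinyTripCountVectorThreshold;
  }

  // Proposed lower bound: TC >= 0u is always true for an unsigned value, so
  // this also classifies unknown trip counts as tiny.
  bool isTinyLoopProposed(unsigned TC) {
    return TC >= 0u && TC < TinyTripCountVectorThreshold;
  }

  int main() {
    for (unsigned TC : {0u, 4u, 16u, 100u})
      std::printf("TC=%u current=%d proposed=%d\n", TC,
                  isTinyLoop(TC), isTinyLoopProposed(TC));
    return 0;
  }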
2013 Sep 27
0
[LLVMdev] Trip count and Loop Vectorizer
Hi Sriram, Thanks for performing this analysis. The problem here, both for memcpy and the vectorizer, is that we can’t predict the size of “n”, even though the only use of ’n’ is for the loop bound for the alloca [4 x [8 x i32]]. If you change the unroll condition to TC >= 0 then you will disable loop unrolling for all loops because getSmallConstantTripCount returns an unsigned number. You
2013 May 02
0
[LLVMdev] Improving the usability of LNT
Wow, that sounds great! Thanks for working on this, and yes, please, send the patches! --renato On 30 April 2013 16:23, Murali, Sriram <sriram.murali at intel.com> wrote: > Hi Daniel, > > I made some changes to the LNT perf reporting tool to make it more user > friendly by adding some features: > > 1. Make the sidebar and the navigation bar stationary,
2013 Feb 01
0
[LLVMdev] [Patch][Review Requested][Compilation Time] Avoid frequent copy of elements in LoopStrengthReduce
Sriram, This patch looks good. Please commit. ...and thanks for the data. -Andy On Jan 29, 2013, at 12:59 PM, "Murali, Sriram" <sriram.murali at intel.com> wrote: > Hello, > This patch aims to improve compile time performance by increasing the SCEV vector size in LoopStrengthReduce. It is observed that the BaseRegs vector size is 4 in most cases, and elements are
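For context on why the inline size matters (a sketch under the assumption that BaseRegs is an llvm::SmallVector, as the LSR code of that era used; this is not the actual patch): a SmallVector with inline capacity N stores up to N elements in place, and pushing past N heap-allocates and copies the existing elements, which is the repeated cost the patch aims to avoid when the common size is 4.

  #include "llvm/ADT/SmallVector.h"

  int main() {
    llvm::SmallVector<int, 2> Tight;   // reallocates and copies at the 3rd push
    llvm::SmallVector<int, 4> Sized;   // the common case of 4 elements stays inline
    for (int i = 0; i < 4; ++i) {
      Tight.push_back(i);
      Sized.push_back(i);
    }
    return Tight.size() == Sized.size() ? 0 : 1;
  }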
2013 Jan 30
1
[LLVMdev] [Patch][Review Requested][Compilation Time] Avoid frequent copy of elements in LoopStrengthReduce
The compilation time is measured for different benchmarks while compiling a .bc file into a shared object. The improvement across the range of benchmarks is listed in the following table. If the reason behind the need for other performance metrics is to identify possible measurement errors, then I think this table would be of some help. However, we do not have the standard deviation and confidence
2017 Jan 05
2
RFC: LLD range extension thunks
Hello Rui, Thanks for the comments. Synthetic sections and rewriting relocations: I think that this would definitely be worth trying. It should remove the need for thunks to be represented in the core data structures, and would allow . It would also mean that we wouldn't have to associate symbols with thunks, as the relocations would directly target the thunks. ARM interworking makes reusing
2017 Apr 04
2
[LLD] RFC Range Thunks Implementation review for ARM and Mips
This RFC is primarily to support the review of the range extension thunks implementation in lld. It concerns ARM and Mips, as the whole thunk creation step is skipped if the target doesn't need thunks. Mips LA25 thunks are not range extension thunks, but they are generated using the same code; I've kept the behaviour the same as it is now, although the implementation is obviously
2012 Apr 04
0
[LLVMdev] Disabling x87 instructions for a sub-target
Hi Sriram, I'm not sure if I understand your question correctly: Do you need to generate code that contains no x87 floating-point instructions altogether, but uses calls into a soft-float library instead? That behaviour can be enabled using the "-soft-float" flag, as far as I know. Or is it only about the fcomi* instructions, which are not supported by pre-Pentium Pro chips? Then I
2017 Jan 04
5
RFC: LLD range extension thunks
I'm about to start working on range extension thunks in lld. This is an attempt to summarize the approach I'd like to take and what the impact will be on lld outside of thunks. I'm interested to hear whether anyone has constraints the approach would break, alternative suggestions, or is working on something I'll need to take account of. I expect range extension thunks to be important for
2017 Jan 06
2
RFC: LLD range extension thunks
After looking at this for a while, I do not think that this problem is NP-hard. With a finite "short branch" displacement k, I was not able to come up with a gadget that could create global constraints as would be needed to e.g. model an instance of 3SAT or vertex cover in terms of this problem. The problem is hard though. I believe that it is likely to be exponential in the "short