Displaying 20 results from an estimated 7000 matches similar to: "[LLVMdev] Segmented Stacks (re-roll)"
2011 Aug 22
0
[LLVMdev] Segmented Stacks (re-roll)
Hi Sanjoy,
The patch generally looks fine except for this part:
diff --git a/lib/CodeGen/StackSegmenter.cpp b/lib/CodeGen/StackSegmenter.cpp
new file mode 100644
index 0000000..5ffb8f2
--- /dev/null
+++ b/lib/CodeGen/StackSegmenter.cpp
@@ -0,0 +1,48 @@
+//===-- StackSegmenter.h - Prolog/Epilog code insertion -------*- C++ -* --===//
The comment is obviously incorrect.
diff --git
2011 Aug 23
2
[LLVMdev] Segmented Stacks (re-roll)
Hi!
> diff --git a/lib/CodeGen/StackSegmenter.cpp b/lib/CodeGen/StackSegmenter.cpp
> new file mode 100644
> index 0000000..5ffb8f2
> --- /dev/null
> +++ b/lib/CodeGen/StackSegmenter.cpp
> @@ -0,0 +1,48 @@
> +//===-- StackSegmenter.h - Prolog/Epilog code insertion -------*- C++ -* --===//
>
> The comment is obviously incorrect.
Thanks. So much for lifting file
2011 Aug 23
0
[LLVMdev] Segmented Stacks (re-roll)
On Aug 23, 2011, at 9:24 AM, Sanjoy Das wrote:
> Hi!
>
>> diff --git a/lib/CodeGen/StackSegmenter.cpp b/lib/CodeGen/StackSegmenter.cpp
>> new file mode 100644
>> index 0000000..5ffb8f2
>> --- /dev/null
>> +++ b/lib/CodeGen/StackSegmenter.cpp
>> @@ -0,0 +1,48 @@
>> +//===-- StackSegmenter.h - Prolog/Epilog code insertion -------*- C++ -* --===//
2011 Aug 24
1
[LLVMdev] Segmented Stacks (re-roll)
Hi!
> According to the patch you sent, the pass is not doing anything:
>
> +bool StackSegmenter::runOnMachineFunction(MachineFunction &MF) {
> + return false;
> +}
> +
It is, in the next patch.
diff --git a/lib/CodeGen/StackSegmenter.cpp b/lib/CodeGen/StackSegmenter.cpp
index 5ffb8f2..cc2ca87 100644
--- a/lib/CodeGen/StackSegmenter.cpp
+++ b/lib/CodeGen/StackSegmenter.cpp
2011 Aug 10
2
[LLVMdev] Segmented Stacks: Pre-midterm work
Hi!
Attached is my pre-midterm GSoC work for segmented stacks for review (with
the required fixes).
Thanks!
--
Sanjoy Das
http://playingwithpointers.com
Attachment: 0001-New-command-line-option-to-enable-segmented-stacks.patch (text/x-diff, 1699 bytes)
2011 May 30
2
[LLVMdev] [Segmented Stacks] Week 1
Hi!
I've attached my first week of work as a patchset for review. This is
also available on Github [1].
By next Monday I intend to (more or less) finish up the preliminary
parts concerning the codegen; and start working on the runtime (so that
I can do a basic sanity check).
[1] https://github.com/sanjoy/llvm/tree/segmented-stacks
--
Sanjoy Das
http://playingwithpointers.com
2011 Jul 14
3
[LLVMdev] [PATCH] Segmented Stacks
Hi llvm-dev!
I have attached the current state of my GSoC work in patches [1] for
review; this currently allows LLVM to correctly handle functions running
out of stack space and variable sized stack objects.
Firstly, since I think it is better to get things merged in small
chunks, I'd like to have some specific feedback on where my work stands
in terms of mergeability.
Secondly, I had been
2015 Aug 16
2
[LLVMdev] Adding a stack probe function attribute
I started to implement inlining of the stack probe function based on
Microsoft's inlined stack probes in
https://github.com/Microsoft/llvm/tree/MS.
Do we know why the stack pointer cannot be updated in a loop (which
would result in ideal code)? I noticed a comment to that effect in
Microsoft's code.
I suspect this is due to debug or unwinding information, since it is
allowed on Windows x86-32.
I
2011 Aug 25
2
[LLVMdev] Segmented Stacks (Re-Roll 2)
Hi all!
I've attached a corrected set of patches (based on the input I
received here). Please let me know if this work looks mergeable.
The documentation is only partially filled in; I'll add more details
once support for Go is also merged (the current co-routine work I'm
doing).
Thanks!
--
Sanjoy Das
http://playingwithpointers.com
2017 Feb 06
2
[PATCH] Optimize silk_warped_autocorrelation_FIX() for ARM NEON
Hi Jean-Marc,
Thanks a lot for reviewing this huge assembly function!
silk_warped_autocorrelation_FIX_c()'s kernel part is
for( n = 0; n < length; n++ ) {
tmp1_QS = silk_LSHIFT32( (opus_int32)input[ n ], QS );
/* Loop over allpass sections */
for( i = 0; i < order; i++ ) {
/* Output of allpass section */
tmp2_QS = silk_SMLAWB(
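For orientation, here is a minimal floating-point sketch of the kind of warped autocorrelation loop quoted above. It is not the Opus fixed-point kernel: the function name `warpedAutocorrelation`, the use of `double`, and the parameter names are assumptions made for illustration, and the Q-format arithmetic done by `silk_LSHIFT32` and `silk_SMLAWB` is replaced by plain multiplication.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical floating-point sketch of a warped autocorrelation.
// Each sample is pushed through a chain of `order` first-order allpass
// sections, and the output of every section is correlated with the
// newest input sample.
std::vector<double> warpedAutocorrelation(const std::vector<double> &input,
                                          double warping, std::size_t order) {
  // Assumption: the warping coefficient lies strictly inside (-1, 1).
  assert(warping > -1.0 && warping < 1.0);

  std::vector<double> state(order + 1, 0.0); // allpass section states
  std::vector<double> corr(order + 1, 0.0);  // accumulated correlations

  for (double sample : input) {
    double tmp1 = sample;
    // Loop over allpass sections.
    for (std::size_t i = 0; i < order; ++i) {
      // Output of allpass section i.
      const double tmp2 = state[i] + warping * (state[i + 1] - tmp1);
      state[i] = tmp1;
      corr[i] += state[0] * tmp1;
      tmp1 = tmp2;
    }
    state[order] = tmp1;
    corr[order] += state[0] * tmp1;
  }
  return corr;
}
```

With `warping` set to 0 the allpass chain degenerates into a plain delay line, so the result is the ordinary autocorrelation of the input, which gives a convenient sanity check for the sketch.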
2017 Jan 31
6
[PATCH] Optimize silk_warped_autocorrelation_FIX() for ARM NEON
Hi,
Attached is a patch with arm neon optimizations for
silk_warped_autocorrelation_FIX(). Please review.
Thanks,
Felicia
2017 Feb 07
2
[PATCH] Optimize silk_warped_autocorrelation_FIX() for ARM NEON
This is a great idea, but the order (psEncC->shapingLPCOrder) can be
configured to 12, 14, 16, 20 or 24 according to the complexity parameter.
It's hard to write a universal function that handles all these orders
efficiently. Any suggestions?
Thanks,
Linfeng
On Mon, Feb 6, 2017 at 12:40 PM, Jean-Marc Valin <jmvalin at jmvalin.ca> wrote:
> Hi Linfeng,
>
> On 06/02/17 02:51 PM,
2017 Feb 07
3
[PATCH] Optimize silk_warped_autocorrelation_FIX() for ARM NEON
Hi Jean-Marc,
Thanks for your suggestions. Will get back to you once we have some updates.
Linfeng
On Mon, Feb 6, 2017 at 5:47 PM, Jean-Marc Valin <jmvalin at jmvalin.ca> wrote:
> Hi Linfeng,
>
> On 06/02/17 07:18 PM, Linfeng Zhang wrote:
> > This is a great idea. But the order (psEncC->shapingLPCOrder) can be
> > configured to 12, 14, 16, 20 and 24 according to
2010 Apr 07
3
[LLVMdev] Injecting code before function prolog
I'm trying to implement something similar to this:
http://gcc.gnu.org/wiki/SplitStacks in LLVM. The reason I want this
is so that I can have dynamically growing and shrinking stacks in my
programming language. In order to do this, I need to be able to check
for overflow of a stack frame. The methods of doing this are outlined
in the link above, but my intention is to pass the current stack
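The overflow check described in this message can be modeled in a few lines. The following is a toy simulation only, not LLVM's or GCC's implementation: the class `SegmentedStack`, its methods, and the sizes used are all hypothetical names chosen for illustration. It shows the basic idea of a prologue comparing the frame size against the space left in the current segment and switching to a fresh segment when there is not enough room.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model of a split (segmented) stack. Real implementations
// do this check in the function prologue in machine code; this class only
// simulates the bookkeeping.
class SegmentedStack {
public:
  explicit SegmentedStack(std::size_t segmentSize)
      : segmentSize_(segmentSize) {
    segments_.push_back({segmentSize, 0});
  }

  // Models the prologue check. Returns true if a new segment had to be
  // allocated because the frame did not fit in the current one.
  bool enterFrame(std::size_t frameSize) {
    Segment &top = segments_.back();
    if (top.capacity - top.used >= frameSize) {
      top.used += frameSize;
      frames_.push_back({frameSize, false});
      return false;
    }
    // Not enough room: allocate a segment large enough for this frame.
    const std::size_t cap = frameSize > segmentSize_ ? frameSize : segmentSize_;
    segments_.push_back({cap, frameSize});
    frames_.push_back({frameSize, true});
    return true;
  }

  // Models the epilogue: release the frame, and release the segment too
  // if this frame was the one that caused it to be allocated.
  void leaveFrame() {
    assert(!frames_.empty());
    const Frame f = frames_.back();
    frames_.pop_back();
    if (f.openedSegment)
      segments_.pop_back();
    else
      segments_.back().used -= f.size;
  }

  std::size_t segmentCount() const { return segments_.size(); }

private:
  struct Segment { std::size_t capacity; std::size_t used; };
  struct Frame { std::size_t size; bool openedSegment; };

  std::size_t segmentSize_;
  std::vector<Segment> segments_;
  std::vector<Frame> frames_;
};
```

In this model a stack built with 100-byte segments accepts a 60-byte frame in place, allocates a second segment for the next 60-byte frame, and drops back to one segment when that frame returns, which is the grow-and-shrink behavior the message asks for.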
2020 Mar 24
2
[RFC][AArch64] Homogeneous Prolog and Epilog for Size Optimization
Hello,
I'd like to upstream work we have done over time, which I think the
community would benefit from.
This is part of an effort to minimize code size, presented here
<https://llvm.org/devmtg/2020-02-23/slides/Kyungwoo-GlobalMachineOutlinerForThinLTO.pdf>.
In particular, this RFC is about optimizing prolog and epilog for size.
*Homogeneous Prolog and Epilog for Size Optimization, D76570
2020 Mar 24
2
[RFC][AArch64] Homogeneous Prolog and Epilog for Size Optimization
Hi Vedant,
Thanks for your interest and comment.
Size optimization reduces page faults and start-up time for a large
application, and enabling this showed the same effect.
Even though I didn't see a large regression or complaint in CPU-bound cases,
which are not typical of mobile workloads, I wanted to be cautious
about enabling it by default.
However, as with default outlining case, I
2017 Apr 05
2
[PATCH] Optimize silk_warped_autocorrelation_FIX() for ARM NEON
I attached a new patch with a small cleanup (the disassembly is identical to
that of the last patch). We have done the same internal testing as usual.
Also attached are 2 failed temporary versions that try to reduce code size
(for code review reference only).
The new patch of silk_warped_autocorrelation_FIX_neon() has a code size of
3,228 bytes (with gcc).
smaller_slower.c has a code size of 2,304
2019 Jul 16
2
MachinePipeliner refactoring
Hi James,
I also think that refactoring the code generation part is a great idea. That code is very complicated and difficult to maintain. I’ve wanted to rewrite that code for a long time, but just have never got to it. There are quite a few edge cases to handle (at least in the current code). I’ll take a deeper look at your patch. The abstractions that you mention, Stage and Block, are good
2019 Jul 15
1
MachinePipeliner refactoring
Hi James:
Personally, I like the idea of refactoring and more abstraction,
but unfortunately, I don't know enough about the edge cases either.
BTW: the prototype is still causing quite a few assertions on PowerPC;
some nodes are not generated in the correct order.
Best,
Jinsong Ji (纪金松), PhD.
XL/LLVM on Power Compiler Development
E-mail: jji at us.ibm.com
From: James Molloy <james at
2017 Apr 05
4
[PATCH] Optimize silk_warped_autocorrelation_FIX() for ARM NEON
Thanks, Jean-Marc!
The speedup percentages are all relative to the entire encoder.
Compared to master, this optimization patch speeds up the fixed-point SILK
encoder on NEON as follows:
  Complexity 5: 6.1%
  Complexity 6: 5.8%
  Complexity 8: 5.5%
  Complexity 10: 4.0%
These were measured on an Acer Chromebook, ARMv7 Processor rev 3 (v7l), CPU max
MHz: 2116.5
Thanks,
Linfeng
On Wed, Apr 5, 2017 at 11:02 AM,