Displaying 20 results from an estimated 307 matches for "prolog".

2020 Mar 24
2
[RFC][AArch64] Homogeneous Prolog and Epilog for Size Optimization
...I'd like to upstream work we have done over time that the community would benefit from. This is part of an effort toward minimizing code size presented here <https://llvm.org/devmtg/2020-02-23/slides/Kyungwoo-GlobalMachineOutlinerForThinLTO.pdf>. In particular, this RFC is about optimizing prologs and epilogs for size. *Homogeneous Prolog and Epilog for Size Optimization, D76570 <https://reviews.llvm.org/D76570>:* Prologs and epilogs that handle callee-save registers tend to be irregular, with different immediate offsets, so they are not often outlined (by the machine outliner) when optimi...
2001 Apr 14
1
Postscript font bugs (and a suggestion) (PR#914)
...c; on line 236, it checks that the font number is in the range 1..32. Later this crashes PostScriptStringWidth in devPS.c, because only fonts numbered 1..5 are actually defined. I don't know how to fix it. 2. The ?postscript help topic refers to .ps.profile; the actual variable name is .ps.prolog. The C source in main/devPS.c has this on lines 606-608, with the incorrect name in the error message: prolog = findVar(install(".ps.prolog"), R_GlobalEnv); if(!isString(prolog)) error("Object .ps.profile is not a character vector"); 3. And this is a suggestion, no...
2020 Mar 24
2
[RFC][AArch64] Homogeneous Prolog and Epilog for Size Optimization
...) with an opt-out option. Regards, Kyungwoo On Tue, Mar 24, 2020 at 12:01 PM Vedant Kumar <vedant_kumar at apple.com> wrote: > This looks really interesting. In the slides, it’s mentioned that the > combination of tuning the MachineOutliner for ThinLTO and of optimizing > function prolog/epilogs improved measured run-time performance. > > What kind of performance impact do you see from simply homogenizing > prolog/epilogs? (If, say across LNT/aarch64/-Oz the performance impact is > not large, it may make sense to have homogenization enabled by default.) > > best,...
2010 Apr 07
3
[LLVMdev] Injecting code before function prolog
...mit as the first argument to the function. What I'm hoping to do is to be able to inject the following code (in x86 asm, c calling convention) on entry to each function: _foo: lea -frame_size(%esp), %eax cmpl %eax, 4(%esp) jb function_entry // handle overflow function_entry: function prolog ... The problem I'm encountering is how to force this before the prolog. I'm attempting to add a machine function pass after the emit prolog/epilog pass that injects this code, but directly injecting x86 code seems to be very messy as I have to figure out how LLVM encodes the addressing...
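The entry check quoted in the snippet above can be modelled roughly as follows. This is a minimal sketch, not the poster's implementation: it assumes the value at 4(%esp) (the first argument) is a stack limit, which the truncated snippet does not confirm, and the names `frame_fits` and `stack_limit` are hypothetical. It mirrors the unsigned 32-bit comparison made by `cmpl`/`jb` in AT&T syntax.

```python
MASK32 = 0xFFFFFFFF  # x86-32 registers wrap modulo 2**32


def frame_fits(esp: int, frame_size: int, stack_limit: int) -> bool:
    """Return True when the sketch would jump straight to function_entry.

    stack_limit stands in for the value read from 4(%esp); treating it as a
    stack limit is an assumption made for this illustration.
    """
    # lea -frame_size(%esp), %eax : prospective stack pointer after the frame
    new_sp = (esp - frame_size) & MASK32
    # cmpl %eax, 4(%esp) ; jb function_entry : taken if limit < new_sp (unsigned)
    return (stack_limit & MASK32) < new_sp


# A frame that stays above the limit proceeds to the normal prolog.
ok = frame_fits(esp=0x8000, frame_size=0x100, stack_limit=0x7000)
# A frame that would dip below the limit falls through to the overflow handler.
overflow = not frame_fits(esp=0x7080, frame_size=0x100, stack_limit=0x7000)
```

Because the comparison is unsigned, a frame size larger than the current stack pointer wraps around to a large address; the sketch reproduces that behaviour and does not guard against it.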
2017 Jun 09
2
Question about Prolog/Epilog Code Insertion
Hi All, When seeing the title "Prolog/Epilog Code Insertion", I'd expect something about XXXFrameLowering.cpp (in particular emitPrologue/emitEpilogue). But the document [1] is about unwind. Is it in the right place/section? Thanks. [1] http://llvm.org/docs/CodeGenerator.html#prolog-epilog-code-insertion Regards...
2019 Jul 16
2
MachinePipeliner refactoring
...lock, are good ones. I think you’ll need to account for the cycle, or position within a Stage as well. It is a complex problem with a lot of different edge cases when dealing with Phis, though I think they can be dealt with much better than in the existing code. Generating code for the 3 main parts, the prolog, kernel, and epilog, can be challenging because each has a unique difference that makes it hard to generalize the code. For example, the prologs will not have any Phis, but the code generation needs to be aware of when the Phis occur in the sequence in order to get the correct remapped name. In the...
2014 Mar 27
2
[LLVMdev] PR19267 - Add a feature to clobber non-calle-save regs in the prolog.
This is a feature I’m considering for the LLVM backend. Feel free to provide input in the following PR. llvm.org/pr19267 - Add a feature to clobber non-callee-save regs in the prolog. I’m copying llvm-dev because it seems like something that others must have already done or at least thought about at some point. -Andy
2017 Feb 06
2
[PATCH] Optimize silk_warped_autocorrelation_FIX() for ARM NEON
...(sy)) denote processing input[x] at stage y. If there is no dependency between inx(sy) and in(x+1)(sy), then we can do this FOR in=0 TO N WITH in+=8 FOR y=0 TO order-1 WITH y++ PROC(in0(sy) in1(sy) in2(sy) in3(sy) in4(sy) in5(sy) in6(sy) in7(sy)) END FOR END FOR In that case no prolog or epilog is needed at all. However, the critical thing is that all the states in each stage when processing input[i] are reused by the next input[i+1]. That is, input[i+1] must wait for input[i] for 1 stage, and input[i+2] must wait for input[i+1] for 1 stage, etc. Then it becomes this FOR in=0 to N WITH in+=8...
2009 Apr 09
3
[LLVMdev] Calling Conventions, function prologs and epilogs.
How/where are function prologs and epilogs generated, is it bespoke C++ code or TableGen generated ? If someone could point me in the right direction please. Many thanks in advance, Aaron
2010 Apr 10
0
[LLVMdev] Injecting code before function prolog
...> > What I'm hoping to do is to be able to inject the following code (in > x86 asm, c calling convention) on entry to each function: > _foo: >  lea -frame_size(%esp), %eax >  cmpl %eax, 4(%esp) >  jb function_entry >  // handle overflow > function_entry: >  function prolog >  ... > > The problem I'm encountering is how to force this before the prolog. > I'm attempting to add a machine function pass after the emit > prolog/epilog pass that injects this code, but directly injecting x86 > code seems to be very messy as I have to figure out how...
2017 Jan 31
6
[PATCH] Optimize silk_warped_autocorrelation_FIX() for ARM NEON
Hi, Attached is a patch with arm neon optimizations for silk_warped_autocorrelation_FIX(). Please review. Thanks, Felicia
2019 Jul 15
1
MachinePipeliner refactoring
...n-first-out order (queues), the currently generated code cannot be used. 2) For architectures (like Hexagon) that have dedicated predicate register files, we can generate a compact representation of the loop by predicating stages of the loop kernel independently. In this case we can either have a prolog, epilog, or neither (wrapping the prolog and epilog inside the kernel by using PHIs of predicates). At the moment, a lot of the code generation helper code in MachinePipeliner is tightly fit to its current code generation strategy ("If we're in the epilog, do this, else do this"). I...
2009 Apr 09
0
[LLVMdev] Calling Conventions, function prologs and epilogs.
Hello, Aaron > How/where are function prologs and epilogs generated, is it bespoke C++ code > or TableGen generated ? > > If someone could point me in the right direction please. Calling convention is really-really far from prologue/epilogue emission :) So: 1. Calling conventions Partly tablegen / partly C++ code. Look for CodeGen/S...
2017 Feb 07
2
[PATCH] Optimize silk_warped_autocorrelation_FIX() for ARM NEON
...n4(s7) in5(s6) in6(s5) in7(s4) in8(s3) in9(s2)in10(s1)in11(s0)) > ...and so on until the end of the input vector > > The difference is that it's now the input vector that "slides" and the > "state" values sy that remain in the same place. There's still a > prologue, but you can easily get rid of it by (implicitly) zero-padding > the in vector during the initialization phase (start with a zero vector > and real one value at a time). Getting rid of the epilogue is a little > trickier, but I think it can be done. > > Cheers, > > Je...
2009 Apr 09
2
[LLVMdev] Calling Conventions, function prologs and epilogs.
On Thu, Apr 9, 2009 at 4:34 PM, Anton Korobeynikov <anton at korobeynikov.info> wrote: > Hello, Aaron > > > How/where are function prologs and epilogs generated, is it bespoke C++ > code > > or TableGen generated ? > > > > If someone could point me in the right direction please. > Calling convention is really-really far from prologue/epilogue emission :) > So: > > 1. Calling conventions > Partly ta...
2009 Apr 09
0
[LLVMdev] Calling Conventions, function prologs and epilogs.
On Apr 9, 2009, at 11:11 AM PDT, Aaron Gray wrote: > On Thu, Apr 9, 2009 at 4:34 PM, Anton Korobeynikov <anton at korobeynikov.info> wrote: > Hello, Aaron > > > How/where are function prologs and epilogs generated, is it > bespoke C++ code > > or TableGen generated ? > > > > If someone could point me in the right direction please. > Calling convention is really-really far from prologue/epilogue > emission :) So: > > 1. Calling conventions > Partl...
2010 Apr 12
1
[LLVMdev] Question. about Machinefunction pass, funtion Prolog/Epilog code, stack frame
I am new to LLVM, and need some help with these points. 1. How can we add special code to the Prolog/Epilog for certain functions? This should be done with a machine function pass, right? 2. Basically, I want to get the function's stack frame, that is, its size and initial position. I found int64_t llvm::MachineFrameInfo::getObjectSize(int ObjectIdx) const [inline] This method is done bef...
2004 Jun 09
2
[LLVMdev] X86 Frame info question
..."4" as local area offset. Based on prior discussion this should mean that the local area starts at address ESP+4. Is this really true? On X86 the stack grows down, so I'd expect the local area to start below ESP, e.g. at ESP - 4, and ESP + 4 would contain function arguments. It looks like the prolog/epilog generator (PEI::calculateFrameObjectOffsets) assumes the local area offset is an offset in the stack growth direction. For example, if there are two 4-byte objects, it will start with an "Offset" of 4 and then go to an "Offset" of 8... the actual offsets set to stack objects will be...
2004 Jun 09
0
[LLVMdev] X86 Frame info question
...would contain function arguments. Yup, the magic number 4 is due to the 'call' instruction pushing the return address on the stack. The TargetFrameInfo class is all about keeping the stack aligned at some boundary (8 bytes in this case). In particular, on entry to a function, before the prolog, the stack pointer (on X86) is aligned to an 8 byte boundary, but is actually 4 bytes from that alignment. Put another way, immediately before the call, the stack pointer was aligned to 8 bytes. > It looks like the prolog/epilog generator (PEI::calculateFrameObjectOffsets) > assumes local area o...
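The alignment arithmetic described in the reply above can be checked with a small worked example. This is only an illustrative sketch: the constants come from the snippet (8-byte alignment, 4-byte return address on 32-bit X86), while the function name `esp_on_entry` and the sample address are hypothetical.

```python
STACK_ALIGNMENT = 8      # alignment boundary mentioned in the reply
RETURN_ADDRESS_SIZE = 4  # bytes pushed by 'call' on 32-bit X86


def esp_on_entry(esp_before_call: int) -> int:
    """Stack pointer seen by the callee before its prolog runs."""
    if esp_before_call % STACK_ALIGNMENT != 0:
        raise ValueError("caller's stack pointer is expected to be 8-byte aligned")
    # The stack grows down, so pushing the return address subtracts 4.
    return esp_before_call - RETURN_ADDRESS_SIZE


entry_esp = esp_on_entry(0x1000)               # sample aligned address
misalignment = entry_esp % STACK_ALIGNMENT     # distance from the 8-byte boundary
```

With any 8-byte-aligned starting value, `misalignment` comes out as 4, which matches the "magic number 4" used as the local area offset in the thread.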
2017 Feb 07
3
[PATCH] Optimize silk_warped_autocorrelation_FIX() for ARM NEON
...p[MAX_SIZE]; > int *in = orig; > for (i=0;i<order;i+=4) { > autocorr_kernel4(corr+i, orig, in, tmp, len); > /* Make subsequent calls use the filtered signal as input. */ > in = tmp; > } > } > > I think this should not only reduce/eliminate the prologue/epilogue > problem, but it should also be more efficient since almost all vectors > processed would use the full size. > > Maybe a third option (not sure it's a good idea, but still mentioning > it) would be to have a function that hardcodes order=24 and discards the > large...