thr3ads.net - llvm dev - [llvm-dev] TableGen operand question [Jun 2016]

If this information is useful, please help other people find it:
Share via:

Dawid Retkiewicz via llvm-dev

2016-Jun-09 09:41 UTC

[llvm-dev] TableGen operand question

Hi,

As a way to learn LLVM, I'm trying to write a backend for the Microchip
PIC18 8-bit microcontroller.
On this device, the hardware stack is very small and is used only for
storing function return addresses.
A "real" software stack implementation is not very efficient because
of the
limited instruction set, so for all the non-reentrant functions I've
decided to use a similar solution to the one used by Microchip's own XC8
compiler, which is to simulate the stack by assigning constant memory
locations to every function.
Local variables/parameters could then be accessed by using constant memory
addresses. Since this microcontroller can perform most instruction directly
on such addresses, this seems the most efficient way to handle the stack.

The issue I'm facing right now is how to resolve the stack addresses in the
LLVM backend.
For instance, when calling a function, I'd like to store its parameters
directly in the memory area assigned to this function.
If I understand correctly, the exact stack size (and thus the memory needed
by each function) cannot be determined before the epilogue/prologue
insertion pass.This means that the target memory location is not known
before this step.

My idea is to create an operand that would hold an identifier of the target
function and the memory offset, and replace this identifier with a real
memory address in a pass after emit prologue/epilogue.
However, I'm not sure how to achieve such thing in TableGen, and whether
it's at all possible.
I would be grateful for any tips/directions

Thanks,
Dawid Retkiewicz
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20160609/ebad74a6/attachment.html>

Bruce Hoult via llvm-dev

2016-Jun-09 12:21 UTC

head link

[llvm-dev] TableGen operand question

That's going to be a really challenging architecture to target from LLVM,
and you'll be fighting LLVM assumptions all the way, It would be a
challenge for an LLVM expert, let alone an LLVM beginner.

Of course, if you succeed then you will be an expert :-)

Converting functions to use statically-allocated memory instead of stack
memory would be a cool feature to have anywhere, and would also be almost
essential on other stack-unfriendly ISAs such as 6502.

To do it properly, you'd want to have functions that can never be part of
same call chain (i.e. never active at the same time) share memory
locations. I guess the way to do that (once you've worked out the frame
size for each function) would be to walk the complete program call tree and
give each function's frame the next addresses after its deepest caller.

And then there's recursion. If a function calls itself, or a function up
towards the root of the call tree, then you'll have to save and restore all
the function frames between to somewhere else.

Re-entrancy (e.g. things called from interrupt handlers) is even nastier.

On Thu, Jun 9, 2016 at 12:41 PM, Dawid Retkiewicz via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> Hi,
>
> As a way to learn LLVM, I'm trying to write a backend for the Microchip
> PIC18 8-bit microcontroller.
> On this device, the hardware stack is very small and is used only for
> storing function return addresses.
> A "real" software stack implementation is not very efficient
because of
> the limited instruction set, so for all the non-reentrant functions
I've
> decided to use a similar solution to the one used by Microchip's own
XC8
> compiler, which is to simulate the stack by assigning constant memory
> locations to every function.
> Local variables/parameters could then be accessed by using constant memory
> addresses. Since this microcontroller can perform most instruction directly
> on such addresses, this seems the most efficient way to handle the stack.
>
> The issue I'm facing right now is how to resolve the stack addresses in
> the LLVM backend.
> For instance, when calling a function, I'd like to store its parameters
> directly in the memory area assigned to this function.
> If I understand correctly, the exact stack size (and thus the memory
> needed by each function) cannot be determined before the epilogue/prologue
> insertion pass.This means that the target memory location is not known
> before this step.
>
> My idea is to create an operand that would hold an identifier of the
> target function and the memory offset, and replace this identifier with a
> real memory address in a pass after emit prologue/epilogue.
> However, I'm not sure how to achieve such thing in TableGen, and
whether
> it's at all possible.
> I would be grateful for any tips/directions
>
> Thanks,
> Dawid Retkiewicz
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20160609/a5250420/attachment.html>

Jim Grosbach via llvm-dev

2016-Jun-09 21:19 UTC

head link

[llvm-dev] TableGen operand question

Hi Dawid,

Oh, wow. Flashback time.

PIC18 is unfortunately not a good candidate for an LLVM backend. The separation
of program and data memory in the ISA, accumulator based design, and banked data
memory are the heavy hitters, but there’s plenty more once you get further.
Efficient use of access memory vs. banked memory can get “fun,” for example.

The approach you’re describing for static activation records is a good start.
You don’t want to try to assign explicit memory addresses at compile time,
though. Do that in the linker. Have the compiler create a symbol for the
activation record, including function arguments, that callers know how to
reference (no need to get fancy. use something like __Afoo for function foo or
something along those lines). Then have a metadata section that records for the
linker which other functions get referenced by that function (or have the linker
reconstruct via relocations). Use that to build a call graph at link time and
allocate the locations for those activation records so that the memory for
functions which can ever be live at the same time don’t overlap. Caveats: that
means no recursion and no function pointers and no variable length argument
lists. Getting bank selection done efficiently for referencing the activation
records is hard (for globals in general, really, but extra important for
locals). Also means that no function called from the main program can be called,
directly or indirectly, from an interrupt function. All of that is solvable.
None of it is trivial, and very tricky corner cases abound. All that being said,
if you’re going to push forwards, I strongly recommend just doing a software
stack to start with. That’s what the various INC/DEC addressing modes on the
INDF<n> registers were added for (well, that and better array walking for
stuff like memcpy()).

Now, as an alternative, the dsPIC, PIC24 and PIC30 micros are much more amenable
to an LLVM backend. There’s still some really crazy stuff to deal with (24-bit
wide program memory is happy times), but it’s a much more compiler friendly
instruction set and deliberately so.

-Jim

> On Jun 9, 2016, at 2:41 AM, Dawid Retkiewicz via llvm-dev <llvm-dev at
lists.llvm.org> wrote:
> 
> Hi,
> 
> As a way to learn LLVM, I'm trying to write a backend for the Microchip
PIC18 8-bit microcontroller.
> On this device, the hardware stack is very small and is used only for
storing function return addresses.
> A "real" software stack implementation is not very efficient
because of the limited instruction set, so for all the non-reentrant functions
I've decided to use a similar solution to the one used by Microchip's
own XC8 compiler, which is to simulate the stack by assigning constant memory
locations to every function.
> Local variables/parameters could then be accessed by using constant memory
addresses. Since this microcontroller can perform most instruction directly on
such addresses, this seems the most efficient way to handle the stack.
> 
> The issue I'm facing right now is how to resolve the stack addresses in
the LLVM backend.
> For instance, when calling a function, I'd like to store its parameters
directly in the memory area assigned to this function.
> If I understand correctly, the exact stack size (and thus the memory needed
by each function) cannot be determined before the epilogue/prologue insertion
pass.This means that the target memory location is not known before this step.
> 
> My idea is to create an operand that would hold an identifier of the target
function and the memory offset, and replace this identifier with a real memory
address in a pass after emit prologue/epilogue.
> However, I'm not sure how to achieve such thing in TableGen, and
whether it's at all possible.
> I would be grateful for any tips/directions 
> 
> Thanks,
> Dawid Retkiewicz
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

Simon Cook via llvm-dev

2016-Jun-10 13:13 UTC

head link

[llvm-dev] TableGen operand question

Hi Dawid,

I've had to do something similar for a different backend, whereby I use
memory locations as a substitute for registers in something rather
register-contrained, and I hope this may be relevant to the issue you're
trying to solve.

How I would tackle this is to create a register class full of pseudo
registers where each one represents one memory location, and do
instruction selection and register allocation on that. Once that is
complete, I would have a pass which goes through each instruction in a
MachineFunction and replace the operands with the reference to memory
and emit that. In my case these pseudo registers lived on a stack, so
the replacement wasn't too difficult, I don't think replacing registers
with symbols would be any harder than that. Once complete it's just a
matter of having your assembler and linker handle these references
(though you should be able to generate some assembly directives to help
here also). Note may have to store some data in your targets
MachineFunctionInfo class to help keep track of the real size of a
frame, as asking a MachineFunction will now give the wrong result.

One consideration would be the size of this register class (that is, the
number of pseudo registers you create). If you make it too large and you
may end up wasting memory, though I'm not sure exactly what the register
allocations will do in this case if given far too many registers for a
given problem. If you make it too small, you'll have to start thinking
about how to spill memory locations to other memory locations (or
rewriting generated code post-regalloc to avoid this).

Hope this helps.

Thanks,
Simon

On 09/06/16 10:41, Dawid Retkiewicz via llvm-dev wrote:> Hi,
> 
> As a way to learn LLVM, I'm trying to write a backend for the Microchip
> PIC18 8-bit microcontroller.
> On this device, the hardware stack is very small and is used only for
> storing function return addresses.
> A "real" software stack implementation is not very efficient
because of
> the limited instruction set, so for all the non-reentrant functions
I've
> decided to use a similar solution to the one used by Microchip's own
XC8
> compiler, which is to simulate the stack by assigning constant memory
> locations to every function.
> Local variables/parameters could then be accessed by using constant
> memory addresses. Since this microcontroller can perform most
> instruction directly on such addresses, this seems the most efficient
> way to handle the stack.
> 
> The issue I'm facing right now is how to resolve the stack addresses in
> the LLVM backend.
> For instance, when calling a function, I'd like to store its parameters
> directly in the memory area assigned to this function.
> If I understand correctly, the exact stack size (and thus the memory
> needed by each function) cannot be determined before the
> epilogue/prologue insertion pass.This means that the target memory
> location is not known before this step.
> 
> My idea is to create an operand that would hold an identifier of the
> target function and the memory offset, and replace this identifier with
> a real memory address in a pass after emit prologue/epilogue.
> However, I'm not sure how to achieve such thing in TableGen, and
whether
> it's at all possible.
> I would be grateful for any tips/directions 
> 
> Thanks,
> Dawid Retkiewicz
> 
> 
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>

Alex Bradbury via llvm-dev

2016-Jun-12 13:01 UTC

head link

[llvm-dev] TableGen operand question

On 10 June 2016 at 14:13, Simon Cook via llvm-dev
<llvm-dev at lists.llvm.org> wrote:> Hi Dawid,
>
> I've had to do something similar for a different backend, whereby I use
> memory locations as a substitute for registers in something rather
> register-contrained, and I hope this may be relevant to the issue
you're
> trying to solve.
>
> How I would tackle this is to create a register class full of pseudo
> registers where each one represents one memory location, and do
> instruction selection and register allocation on that. Once that is
> complete, I would have a pass which goes through each instruction in a
> MachineFunction and replace the operands with the reference to memory
> and emit that. In my case these pseudo registers lived on a stack, so
> the replacement wasn't too difficult, I don't think replacing
registers
> with symbols would be any harder than that. Once complete it's just a
> matter of having your assembler and linker handle these references
> (though you should be able to generate some assembly directives to help
> here also). Note may have to store some data in your targets
> MachineFunctionInfo class to help keep track of the real size of a
> frame, as asking a MachineFunction will now give the wrong result.
>
> One consideration would be the size of this register class (that is, the
> number of pseudo registers you create). If you make it too large and you
> may end up wasting memory, though I'm not sure exactly what the
register
> allocations will do in this case if given far too many registers for a
> given problem. If you make it too small, you'll have to start thinking
> about how to spill memory locations to other memory locations (or
> rewriting generated code post-regalloc to avoid this).
In your scheme, do you reduce the required stack storage to match the
number of pseudo-registers actually used?

The register allocator currently tries to use the minimum number of
registers, which in this case is exactly what you want. There was some
recent discussion on adding the ability to try to use the maximum
number of architectural registers in order to reduce false
dependencies http://lists.llvm.org/pipermail/llvm-dev/2016-May/099494.html

Alex

llvm dev - Jun 2016 - TableGen operand question

[llvm-dev] TableGen operand question

[llvm-dev] TableGen operand question

[llvm-dev] TableGen operand question

[llvm-dev] TableGen operand question

[llvm-dev] TableGen operand question