On 19/09/2020 17:26, james faure via llvm-dev wrote:
> Ideally, I want a guarantee that successive alloca's will be laid out
consecutive in memory.
>
> The context is that I need to buffer streams in a stack-machine; I
don't know the buffer size in advance and may need to grow it. It's of
little consequence to me that the buffer needs to be written backwards
to match the stack's growing direction.
>
> I noticed that conceptually, llvm.stacksave + gep + llvm.stackrestore
is a possible solution, but that is probably undefined behavior.?
Do you actually need a contiguous buffer? You can place alloca
instructions in a loop to allocate more stack space and chain together
allocations together in a linked list - they won't be freed until either
an llvm.stackrestore call or until the function returns. Note that, in
general, allocating things of unbounded size on the stack is not a good
idea because the stack is of finite size and much smaller than the heap.
You might be better off implementing a bump allocator.
An IR-level feature for growing a single stack allocation would be
tricky to implement. You'd need to guarantee a single growable alloca
per function (it would have to be the last allocated object) and that,
in turn, would prevent inlining of a function that contained a growable
allocation into another that provided one.
You'd also be placing a restriction on the back end that this alloca
must be the last thing allocated on the stack. Depending on the
addressing modes used for the stack, this may be a problem for spills,
so it would need to be carefully enabled for every architecture.
You already point out the problem with stack growth direction: it's not
a very generally useful feature because it depends on consumers that
want the end pointer to be stable and to grow a buffer at the beginning.
Growing a buffer at the beginning would require updating all GEPs
derived from PHI nodes that could point to the old or new beginning, so
you'd need to ensure a canonical form where the instruction following
the alloca / realloca was a GEP to the end and every subsequent GEP came
from the latter, not the former.
LLVM's alias analysis knowns that two allocas can't alias. If
you're
not suing inbounds GEPs, you may be able to get away with the stacksave
GEP stackrestore trick, but it's undefined behaviour to refer to an
alloca after it has been deallocated and so you're relying on the
optimiser not figuring out that this is what you're doing. You're also
relying on the back end lowering in that order (there are no guarantees
made on stack frame layout by the back end and vulnerability mitigation
techniques may break your expectations in fun ways).
David