thr3ads.net - llvm dev - [llvm-dev] The best way of generating a good representation for an array with header? [Nov 2019]

Christoffer Lernö via llvm-dev

2019-Nov-12 20:43 UTC

[llvm-dev] The best way of generating a good representation for an array with header?

Yes, we’re actually viewing the struct at an offset.

So basically it’s a struct like this:

typedef struct {
  uint32_t size;
  uint32_t capacity;
  int array[0];
} Foo;

The whole thing is malloc'ed with extra bytes at the end, and capacity is set to
that same number of extra bytes.

What’s then passed around is actually the int pointer at an offset:
&(foo->array)

Using that pointer we can obviously recover the pointer to the struct in a
simple way, but can it be done so that LLVM and DWARF can identify the pointer
as a pointer to a member of a particular struct?
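
A minimal sketch of the layout described above, in C. The helper names
foo_new and foo_header are hypothetical and only serve the illustration; the
struct uses the C99 flexible array member spelling (int array[]) in place of
the GNU int array[0], and capacity is counted in bytes as described in the
message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct {
  uint32_t size;      /* number of elements in use */
  uint32_t capacity;  /* extra bytes allocated after the header */
  int array[];        /* C99 flexible array member (GNU: int array[0]) */
} Foo;

/* Hypothetical helper: allocate the header plus `extra_bytes` of element
   storage and return the pointer that gets passed around, i.e. the
   address of the first element. */
static int *foo_new(uint32_t extra_bytes) {
  Foo *foo = malloc(sizeof(Foo) + extra_bytes);
  if (foo == NULL) return NULL;
  foo->size = 0;
  foo->capacity = extra_bytes;
  return foo->array;
}

/* Hypothetical helper: step back from the element pointer to the header. */
static Foo *foo_header(int *elems) {
  assert(elems != NULL);
  return (Foo *)((char *)elems - offsetof(Foo, array));
}
```

Note that the allocation has to be released through the header pointer,
e.g. free(foo_header(p)), not through the element pointer itself.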

std::vector is as far as I know wrapping a pointer or two.

The advantage of a stretchy buffer is that its length is recoverable even if
stored as a pointer. It’s also incredibly thin, only taking up the same size as
a pointer – as opposed to std::vector which is likely 2 pointers long.


Best Regards,

Christoffer
> Date: Tue, 12 Nov 2019 11:34:42 -0800
> From: David Blaikie via llvm-dev <llvm-dev at lists.llvm.org>
> 
> the pointer points to the first element, and you walk backwards from there
> to find the header details about the bounds/etc?
> 
> In any case - I'd look at something like C++'s std::vector, which is a
> variable length array, and model your situation similarly. I doubt there's
> anything in particular you'll want to/be able to teach the optimizations
> about your situation (nothing especially special that they know about
> std::vector-like things either, that I know of - they maybe can deduce
> certain things about how the bounds relate, and they certainly can optimize
> a lot of std::vector usage) & debug info would probably look like
> std::vector, in that it'd be a custom type, etc. Though if my guess above
> was right about using prefix data to describe the bounds - that might be
> hard to model in DWARF & you might be better off not being "tricky" like
> that & modelling this closer to something that you could have written in C
> or C++ more naturally.
> 
> On Tue, Nov 12, 2019 at 4:14 AM Christoffer Lernö via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
> 
>> I’m considering building in variable arrays by implementing them as a
>> stretchy buffer, that is a single allocation with header + elements with
>> the pointer passed around pointing to the first element. (Example:
>> https://www.gamasutra.com/blogs/NiklasGray/20180109/312683/Minimalist_container_library_in_C_part_1.php
>> )
>>
>> Is there a good way to represent this in LLVM? I mean both in terms of
>> helping the optimizer passes understand how the layout works and to make
>> sure the debug info looks ok.
>> 
>> 
>> Best regards,
>> Christoffer
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

David Blaikie via llvm-dev

2019-Nov-12 20:49 UTC

[llvm-dev] The best way of generating a good representation for an array with header?

On Tue, Nov 12, 2019 at 12:44 PM Christoffer Lernö via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> Yes, we’re actually viewing the struct at an offset.
>
> So basically it’s a struct like this:
>
> typedef struct {
>   uint32_t size;
>   uint32_t capacity;
>   int array[0];
> } Foo;
>
> The whole thing is malloc'ed with extra bytes at the end, and capacity is
> set to that same number of extra bytes.
>
> What’s then passed around is actually the int pointer at an offset:
> &(foo->array)
>
> Using that pointer we can obviously recover the pointer to the struct in a
> simple way, but can it be done so that LLVM and DWARF can identify the
> pointer as a pointer to a member of a particular struct?
>
> std::vector is as far as I know wrapping a pointer or two.
>
> The advantage of a stretchy buffer is that its length is recoverable even
> if stored as a pointer.

What's the advantage compared to a pointer to the struct, rather than a
pointer to the array? (A pointer to the first element of the array would
still have to be tagged differently from a pointer to an arbitrary int
(either a singular int or an int somewhere in the array) to indicate that
you can backtrack to find the length, so it's not like you get to
generalize all int pointers.) I wouldn't expect (but don't know that much)
that the extra constant offset on array indexing would be particularly
expensive/observable?
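
To make the comparison concrete, here is a minimal sketch of the two access
styles in C. The function names are hypothetical. Indexing through the struct
pointer adds a constant header offset to every element address; on common
targets that constant is typically folded into the load's addressing mode,
but that is an assumption about the backend and not something measured in
this thread.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct {
  uint32_t size;
  uint32_t capacity;
  int array[];  /* C99 flexible array member */
} Foo;

/* Element address = foo + offsetof(Foo, array) + i * sizeof(int). */
static int get_via_struct(const Foo *foo, uint32_t i) {
  return foo->array[i];
}

/* Element address = elems + i * sizeof(int); no header offset involved. */
static int get_via_elems(const int *elems, uint32_t i) {
  return elems[i];
}

/* Hypothetical demo: returns 1 if both access styles read the same
   element, 0 if they differ, -1 if the allocation failed. */
static int demo(void) {
  Foo *foo = malloc(sizeof(Foo) + 3 * sizeof(int));
  if (foo == NULL) return -1;
  foo->size = 3;
  foo->capacity = 3 * sizeof(int);
  for (uint32_t i = 0; i < 3; i++) foo->array[i] = (int)(10 * i);
  int same = get_via_struct(foo, 2) == get_via_elems(foo->array, 2);
  free(foo);
  return same;
}
```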

But yeah, I think you'd probably have some trouble getting DWARF consumers
to handle the idea that the parameter type to a function is more than the
type itself, or that pointers to that type actually point into the middle
of the object instead of the start.

Not insurmountable, but seems a bit expensive/complicated to try to make
that work - but don't know what your other constraints/data are.

> It’s also incredibly thin, only taking up the same size as a pointer – as
> opposed to std::vector which is likely 2 pointers long.

Christoffer Lernö via llvm-dev

2019-Nov-12 22:35 UTC

[llvm-dev] The best way of generating a good representation for an array with header?

The advantages:

1. The offset pointer into the struct can be converted to a plain element
pointer at no cost.
2. A null pointer to a stretchy buffer can be treated as a zero-length array.
Consequently no actual struct allocation is needed to represent a zero-length
array.
3. A reference to the array is the same size as a pointer.
4. It can be converted to a plain pointer and back without losing any
information about the size & capacity.
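
A minimal sketch of how points 2 and 4 could look in C. The sb_* names are
hypothetical, the growth policy (start at four ints, then double) is an
assumption, capacity is kept in bytes as in the struct described earlier in
the thread, and overflow checks are left out for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct {
  uint32_t size;      /* elements in use */
  uint32_t capacity;  /* bytes of element storage after the header */
  int array[];        /* C99 flexible array member */
} Foo;

static Foo *sb_header(int *elems) {
  return (Foo *)((char *)elems - offsetof(Foo, array));
}

/* Point 2: a null pointer is simply an empty array. */
static uint32_t sb_size(int *elems) {
  return elems ? sb_header(elems)->size : 0;
}

/* Append one value; returns the (possibly moved) element pointer. */
static int *sb_push(int *elems, int value) {
  Foo *foo = elems ? sb_header(elems) : NULL;
  uint32_t size = foo ? foo->size : 0;
  uint32_t cap = foo ? foo->capacity : 0;
  if (((size_t)size + 1) * sizeof(int) > cap) {
    /* Assumed growth policy: room for 4 ints at first, then double. */
    uint32_t new_cap = cap ? cap * 2 : 4 * (uint32_t)sizeof(int);
    foo = realloc(foo, sizeof(Foo) + new_cap);  /* realloc(NULL, n) allocates */
    if (foo == NULL) abort();                   /* sketch: no recovery path */
    foo->size = size;
    foo->capacity = new_cap;
  }
  foo->array[foo->size++] = value;
  return foo->array;
}

static void sb_free(int *elems) {
  if (elems) free(sb_header(elems));
}
```

Typical use starts from int *a = NULL and reassigns on every push, as in
a = sb_push(a, 1). Since a is an ordinary int *, it can be stored or passed
as one and sb_size(a) still recovers the length afterwards.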

The downsides are what we're discussing. But it looks like I have to accept
that, in DWARF, I can only represent it as a pointer with unknown length?

Best Regards,

Christoffer
