thr3ads.net - llvm dev - [llvm-dev] Why are GEPs type based? [Jul 2020]

Nikita Popov via llvm-dev

2020-Jul-13 20:08 UTC

[llvm-dev] Why are GEPs type based?

Hi,

I've been wondering why LLVM's GEP instructions are based on types, rather
than encoding the raw address calculation as a base pointer plus some
scaled offsets (still in the form of a GEP, to retain provenance).

The type information does not seem particularly useful (it shouldn't be used
as a basis for optimization, because struct layouts lie), but it increases the
non-canonical IR space (there are many ways to encode the same GEP) and
increases compile time (optimizations need to constantly decompose GEPs,
e.g. to get constant offsets).

What am I missing here?

Regards,
Nikita
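
As an illustration of the two encodings contrasted above, here is a minimal C sketch. The types `Inner` and `Outer` and both function names are hypothetical, chosen only for this example; they do not come from LLVM. The first function names the field through the types, the second spells out the same address as a base pointer plus a constant offset plus one scaled index, which is roughly what an optimization has to recover when it decomposes a type-based GEP.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical types, used only to illustrate the address arithmetic. */
struct Inner { int a; int b; };
struct Outer { long tag; struct Inner items[4]; };

/* Type-based form: the compiler derives every offset from the types. */
int *field_by_type(struct Outer *o, size_t i) {
    assert(o != NULL && i < 4);
    return &o->items[i].b;
}

/* Byte-based form: base pointer + constant offset + scaled index. */
int *field_by_bytes(struct Outer *o, size_t i) {
    assert(o != NULL && i < 4);
    /* Constant part: where items[] starts, plus where b sits inside Inner. */
    size_t constant = offsetof(struct Outer, items) + offsetof(struct Inner, b);
    /* Variable part: the index scaled by the element size in bytes. */
    return (int *)((char *)o + constant + i * sizeof(struct Inner));
}
```

Both functions return the same pointer for every valid `i`; only the way the calculation is written down differs.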

Stefanos Baziotis via llvm-dev

2020-Jul-13 20:30 UTC

Hi,

Although I'm not an expert on the topic, there are at least two reasons:
1) It looks more like C/C++ than computing offsets does. This goes hand in
hand with the fact that GEP abstracts away target-specific information. For
example, a pointer is 4 bytes on a typical 32-bit system but 8 bytes on a
64-bit system.
If you have a struct like:
struct {
  int *p;
  int v;
};

To get `v` with a GEP, you just say "give me the second member". If you
were to code this with offsets, you would need to know the target, which is
something front-ends should generally not depend on (Clang and other
front-ends actually do, and that's another big discussion).

2) It's very important for alias analysis. Again, I'm not an expert on that,
but see e.g. the first rule on when a pointer is based on another pointer
here: https://llvm.org/docs/LangRef.html#pointeraliasing

Best regards,
Stefanos
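
The struct example in point 1 can be sketched in C as follows. The struct tag `S` and the two function names are assumptions added for illustration (the struct in the message is anonymous). The first function asks for the second member by name; the second computes the same address from a byte offset, which `offsetof` resolves per target. On typical ABIs that offset is 4 on a 32-bit target and 8 on a 64-bit one, but that is a property of the target, not of this code.

```c
#include <assert.h>
#include <stddef.h>

/* The struct from the message; the tag S is added for this sketch. */
struct S {
    int *p;
    int v;
};

/* "Give me the second member": no target knowledge appears in the source. */
int *second_member(struct S *s) {
    assert(s != NULL);
    return &s->v;
}

/* Same address via a byte offset; the offset value depends on the target. */
int *second_member_by_offset(struct S *s) {
    assert(s != NULL);
    return (int *)((char *)s + offsetof(struct S, v));
}
```

A front-end that emitted the byte offset directly would have to know the pointer size and alignment rules of the target up front, which is the dependency the message describes.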


Nuno Lopes via llvm-dev

2020-Jul-13 21:35 UTC

You are right that it’s mostly a convenience for the front-ends, so they don’t
have to deal with boring things like padding and sizes.

Otherwise it adds no semantic value. Object aliasing is not field-sensitive in
LLVM, so it doesn’t matter, though someone may want to add support for that in
the future for languages where it’s OK to do so.

FWIW, Alive2’s GEP instruction works over bytes only (pairs of constant * %reg),
though I’m not sure I would advocate changing LLVM’s representation.

Nuno
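
A byte-only address calculation built from "constant * %reg" pairs, as mentioned above, might look like the following C sketch. This is not Alive2's actual implementation; the `Term` struct and the `byte_gep` function are hypothetical names used only to show the shape of the computation: a base pointer plus a sum of constant scales multiplied by runtime indices.

```c
#include <assert.h>
#include <stddef.h>

/* One "constant * %reg" pair: a compile-time scale and a runtime index. */
struct Term {
    ptrdiff_t scale;  /* constant, in bytes */
    ptrdiff_t index;  /* runtime value */
};

/* Computes base + sum(scale_k * index_k), with everything in bytes. */
char *byte_gep(char *base, const struct Term *terms, size_t n) {
    assert(base != NULL);
    ptrdiff_t offset = 0;
    for (size_t k = 0; k < n; k++) {
        offset += terms[k].scale * terms[k].index;
    }
    return base + offset;
}
```

For an `int grid[3][5]`, the address of `grid[2][4]` would be expressed with two terms: one with scale `sizeof(grid[0])` and index 2, and one with scale `sizeof(int)` and index 4. No type information beyond the byte scales is needed.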

 

 


Stefanos Baziotis via llvm-dev

2020-Jul-13 22:39 UTC

Good to know, thanks for the info.

- Stefanos

