Hello community members,

I was experimenting to see whether LLVM is able to devirtualize calls made through a vtable. I have this particular example:

=======
class Foo {
public:
  virtual int foo() const = 0;
  int baz() const { return foo(); }
};

class Bar : public Foo {
public:
  int foo() const override final { return 0xdeadbeef; }
};

int fred(Bar &x) {
  return x.baz();
}
=======

As we can see, there is a call to Bar::foo(), via Foo::baz(), which returns a constant.
Bar::foo() has the final specifier, so this implementation cannot be overridden by any child class. In this case, the compiler should be able to call Bar::foo() directly instead of calling through the vtable, and should then be able to inline the constant value.

When I compile with the LLVM main branch, I see the code below being generated. It calls the function through the vtable entry.

_Z4fredR3Bar:
        movq    (%rdi), %rax
        jmpq    *(%rax)

When I compile with GCC, I see that it is able to identify that it should call Bar::foo() directly, and it successfully inlines the constant value.

_Z4fredR3Bar:
        movq    (%rdi), %rax
        movq    (%rax), %rax
        cmpq    $_ZNK3Bar3fooEv, %rax
        jne     .L5
        movl    $-559038737, %eax
        ret

Should LLVM be optimizing this call, or am I missing something?

- Santanu Das
FWIW, if I add a main() that calls fred(), I do see the constant being inlined as expected at O1 and above: <https://godbolt.org/z/WPvbed8f6>

Which optimization level are you using? I think the code generated at O0 may vary significantly between the two compilers because of their different pass configurations.

On Mon, Jul 12, 2021 at 11:30 AM Santanu Das via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> Should LLVM be optimizing this call, or am I missing something?

--
*Disclaimer: Views, concerns, thoughts, questions, ideas expressed in this mail are of my own and my employer has no take in it.*
Thank You.
Madhur D. Amilkanthwar
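A minimal sketch of the experiment described in the reply above, assuming the linked example adds a caller roughly like this one (the link's exact contents are not reproduced here, and `call_fred_on_local` is a hypothetical stand-in for main()). Because the caller constructs the Bar object itself, the optimizer can likely see the object's dynamic type once fred() and baz() are inlined into it, which would explain a folded constant in the caller without saying anything about the code emitted for fred() on its own.

```cpp
// Same classes as in the original post.
class Foo {
public:
  virtual int foo() const = 0;
  int baz() const { return foo(); }
};

class Bar : public Foo {
public:
  // Same bit pattern as the thread's 0xdeadbeef; the explicit cast only
  // avoids an implicit-conversion warning.
  int foo() const override final { return static_cast<int>(0xdeadbeefu); }
};

int fred(Bar &x) { return x.baz(); }

// Hypothetical caller standing in for main(): the object is created here,
// so its dynamic type (Bar) is visible to the optimizer after inlining.
int call_fred_on_local() {
  Bar b;
  return fred(b);
}
```

To compare like with like, it is the assembly for `_Z4fredR3Bar` that should be inspected at O1 or above, not the caller.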
On Sun, 11 Jul 2021 at 23:00, Santanu Das via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> Should LLVM be optimizing this call, or am I missing something?

LLVM knows nothing about whether functions are "final"; most facts about the C++ type system are not representable in LLVM IR. The optimizations based on "final" are all done by Clang as part of lowering to LLVM IR.
Clang would happily devirtualize a call to x.foo(), because it can locally see that this would be a virtual call to a final function and so must call Bar::foo(). In this example, however, Clang can't do anything because it only reasons about the program before optimization, and LLVM can't do anything because the "final" information is not represented in LLVM IR.

The information we'd need to propagate into the optimizer here would be moderately complex: we don't know what x's vptr is at the point of the call to baz() (it could be that of Bar or of a class derived from Bar), but we do know what a certain slot in the vtable it points to contains (or at least, we know a function that is equivalent to that slot's value; I don't think there's a guarantee it's actually the same value). It'd probably be feasible to add support to LLVM to model that kind of thing, but it would probably require a fair bit of work.
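To make the distinction in this reply concrete, here is a small sketch contrasting the call from the original post with two source-level variants in which the callee is visible to the front end. The function names `fred_via_base`, `fred_direct`, and `fred_qualified` are hypothetical and not from the thread; the comments describe what one would expect from the explanation above, not verified compiler output.

```cpp
class Foo {
public:
  virtual int foo() const = 0;
  // Inside baz(), `this` is a const Foo*, so nothing "final" is visible at
  // this call site and the call is emitted as an ordinary virtual call.
  int baz() const { return foo(); }
};

class Bar : public Foo {
public:
  // Same bit pattern as the thread's 0xdeadbeef; the explicit cast only
  // avoids an implicit-conversion warning.
  int foo() const override final { return static_cast<int>(0xdeadbeefu); }
};

// The case from the original post: the virtual call is made inside Foo::baz().
int fred_via_base(Bar &x) { return x.baz(); }

// The call is made on a Bar&, and Bar::foo() is final, so the front end can
// see locally which function must be called.
int fred_direct(Bar &x) { return x.foo(); }

// A qualified call is never dispatched virtually, by the language rules.
int fred_qualified(Bar &x) { return x.Bar::foo(); }
```

All three functions return the same value; they differ only in how much the front end knows at the call site, which is the information the reply says is lost once the program has been lowered to LLVM IR.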