thr3ads.net - llvm dev - [llvm-dev] Musings on the TableGen -emit-dag-isel backend [Nov 2020]

If this information is useful, please help other people find it:
Share via:

Paul C. Anagnostopoulos via llvm-dev

2020-Nov-13 12:22 UTC

[llvm-dev] Musings on the TableGen -emit-dag-isel backend

Your suggestion for two passes is indeed my plan if simply using 3-byte sizes is
not acceptable. I don't want to duplicate all the logic in a second
length-calculating function, so I would just have special logic for the three
matching operators with children and use the existing function for the rest,
passing a null output stream. Or I could conditionalize all the output on
another function parameter so it isn't done at all.

I'm not convinced that anyone would notice a 4% increase in the size of the
matching table, but I don't think that's my call. I have plenty to do
while waiting for more comments. ;-)

Nicolai Hähnle via llvm-dev

2020-Nov-13 12:53 UTC

head link

[llvm-dev] Musings on the TableGen -emit-dag-isel backend

On 13.11.20 13:22, Paul C. Anagnostopoulos via llvm-dev
wrote:> Your suggestion for two passes is indeed my plan if simply using 3-byte
sizes is not acceptable. I don't want to duplicate all the logic in a second
length-calculating function, so I would just have special logic for the three
matching operators with children and use the existing function for the rest,
passing a null output stream. Or I could conditionalize all the output on
another function parameter so it isn't done at all.
> 
> I'm not convinced that anyone would notice a 4% increase in the size of
the matching table, but I don't think that's my call.
It's not necessarily about the total size, but cache pressure is an issue.

That said, if we are seriously thinking about the performance of the 
byte code, perhaps some of these opcodes should be reconsidered at a 
higher level anyway.

For example: The overall bytecode always begins with an OPC_SwitchOpcode 
implemented as a linear list of cases, often hundreds of them (depending 
on the target). A binary search over a jump table would be *much* better 
for those.

Cheers,
Nicolai



> I have plenty to do while waiting for more comments. ;-)
> 
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> 
-- 
Lerne, wie die Welt wirklich ist,
Aber vergiss niemals, wie sie sein sollte.

Paul C. Anagnostopoulos via llvm-dev

2020-Nov-13 13:43 UTC

head link

[llvm-dev] Musings on the TableGen -emit-dag-isel backend

I've only just begun looking at the compiler's use of the matching
table. You are correct about the outer SwitchOpcode: The AMDGPU target has 212
cases. The X86 has 452. However, there is a cache for the case offsets in the
SelectionDAGIsel class.

At 11/13/2020 07:53 AM, Nicolai HÃ¤hnle wrote:>On 13.11.20 13:22, Paul C. Anagnostopoulos via llvm-dev wrote:
>>Your suggestion for two passes is indeed my plan if simply using 3-byte
sizes is not acceptable. I don't want to duplicate all the logic in a second
length-calculating function, so I would just have special logic for the three
matching operators with children and use the existing function for the rest,
passing a null output stream. Or I could conditionalize all the output on
another function parameter so it isn't done at all.
>>I'm not convinced that anyone would notice a 4% increase in the size
of the matching table, but I don't think that's my call.
>
>It's not necessarily about the total size, but cache pressure is an
issue.
>
>That said, if we are seriously thinking about the performance of the byte
code, perhaps some of these opcodes should be reconsidered at a higher level
anyway.
>
>For example: The overall bytecode always begins with an OPC_SwitchOpcode
implemented as a linear list of cases, often hundreds of them (depending on the
target). A binary search over a jump table would be *much* better for those.
>
>Cheers,
>Nicolai

Paul C. Anagnostopoulos via llvm-dev

2020-Nov-13 14:55 UTC

head link

[llvm-dev] Musings on the TableGen -emit-dag-isel backend

Would it make sense for TableGen to generate the outer OPC_SwitchOpcode offset
table?


At 11/13/2020 07:53 AM, Nicolai HÃ¤hnle wrote:>That said, if we are seriously thinking about the performance of the byte
code, perhaps some of these opcodes should be reconsidered at a higher level
anyway.
>
>For example: The overall bytecode always begins with an OPC_SwitchOpcode
implemented as a linear list of cases, often hundreds of them (depending on the
target). A binary search over a jump table would be *much* better for those.
>
>Cheers,
>Nicolai

Reasonably Related Threads

Search for more maybe matching threads

llvm dev - Nov 2020 - Musings on the TableGen -emit-dag-isel backend

[llvm-dev] Musings on the TableGen -emit-dag-isel backend

[llvm-dev] Musings on the TableGen -emit-dag-isel backend

[llvm-dev] Musings on the TableGen -emit-dag-isel backend

[llvm-dev] Musings on the TableGen -emit-dag-isel backend

Reasonably Related Threads