Hi all, Thanks to everyone who participated in the code size birds-of-a-feather today! Here are the minutes: https://docs.google.com/document/d/1kDsbFDtkWLceR-Y63ez04CBfvhbDN8RxWV3xXwaNUC0/edit?usp=sharing The minutes are also copied below for convenience. LLVM 2020 Code Size BoF Minutes See also: Aditya Kumar’s talk on “Code Size Compiler Optimizations and Techniques” Minutes * What use cases for code size are important to the LLVM community? * Deeply embedded devices: IoT, RISC-V * Memory is a large part of the cost of the chip * Gaming consoles * WebAssembly, for faster page loads * Mobile apps * Cache-limited programs on desktops, supercomputers * Real-time systems (both size & speed are important) * Find the code that isn’t on the critical path and optimize it for size * Or find thresholds for heuristics that are good enough for both * Hyperscale systems with lots of processors but only a little memory for each one * EMU architecture<https://www.emutechnology.com/> * Epiphany/Adapteva<https://www.adapteva.com/introduction/> architecture * Why doesn’t -Oz do much better than -O2? * There are optimizations that make dramatic speed improvements, but not as many that make dramatic size improvements * It doesn’t disable things like loop unrolling * It doesn’t disable inlining * It doesn’t enable loop rerolling * -Oz affects heuristics for e.g. inlining, but they can still make the wrong decision * TargetTransformInfo has options for code size, but they often aren’t implemented * If -Oz uses a poorly tuned pass sequence, we can autotune it with OpenTuner etc. * Can get ~5% extra size reduction this way. * LTO is effective for code size; was recently fixed to support -Oz * Do we need different levels like -Os1, -Os2, -Os3? * We have profilers for code speed; can we have something similar for code size? * There is something like this for WebAssembly: Twiggy🌱<https://github.com/rustwasm/twiggy> * Remarks<https://llvm.org/docs/Remarks.html> help understand what optimizations are being missed * What code size benchmarks can we use to measure patches? * Embench.org<https://embench.org/> * CSiBE<http://szeged.github.io/csibe/> * Technique: outlining * How is outlining affected by the size of the instruction set? * Machine outliner might be affected, but IR-level outlining shouldn’t be affected much * How do machine outliner and IR-level outliner compare? * Technique: function merging * Research paper: “Function Merging by Sequence Alignment” * Look into string alignment techniques from biology * Technique: deleting unused code * What’s the best resource? * Technique: compressors like gzip * UPX, LZEXE<https://bellard.org/lzexe.html> save disk space * To save RAM, you need a compression algorithm that supports random access, like something based on Huffman codes * Has been implemented in hardware (IBM CodePack) * Technique: using different optimizations for hot & cold code * For best results, people currently have to move their cold code to a separate file so they can use different flags, which is painful * Could we do code size optimizations in the linker? * Needed to take advantage of RISC-V’s LUI instruction * Requires copy propagation and dead code elimination * Instead of inlining functions into each caller, can we keep one copy of the function and specialize it for all its call sites in the same translation unit? * Attributor already shows improvements when you have recursion and non-trivial but dead code (the level stuff in Olden/bisort/bitonic) Topics 1. Situations where code size matters 2. What do other compilers do better? 3. Techniques for code size optimization 4. How can we improve LLVM? Techniques * Program design, library design, etc. * General optimizations * Loop idiom recognition (memset, memcpy) * Partial inlining * Deduplication * Outlining * Function merging * Functionality changes * Partial evaluation * Delete unused code Thanks, Sean Bartell -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20201006/b8cbbfff/attachment.html>