Edmund Grimley-Evans
2010-Dec-14 10:55 UTC
[LLVMdev] Which is more compact, .bc or .ll.gz? And what might be even more compact?
According to the few tests I did, .ll.gz is more compact:

  1.00  LLVM bitcode (.bc)
  0.80  Gzipped LLVM bitcode (.bc.gz)
  4.13  LLVM assembly (.ll)
  0.68  Gzipped LLVM assembly (.ll.gz)

However, there's not much in it, considering that a stripped native binary is about 0.40 on the same scale.

So, seeing as projects such as PNaCl want to send LLVM bitcode over the network, are there any proposed solutions for making LLVM bitcode more compact?

Removing or simplifying the names of local variables would be an obvious thing to do. Is there anything else that could be done without changing the bitcode format? (There's an obvious analogy with JavaScript compression techniques.) Does anyone have any idea how much it would help?

Edmund

-- IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.
Török Edwin
2010-Dec-14 11:22 UTC
[LLVMdev] Which is more compact, .bc or .ll.gz? And what might be even more compact?
On Tue, 14 Dec 2010 10:55:09 -0000 "Edmund Grimley-Evans" <Edmund.Grimley-Evans at arm.com> wrote:

> According to the few tests I did, .ll.gz is more compact:
>
>   1.00  LLVM bitcode (.bc)
>   0.80  Gzipped LLVM bitcode (.bc.gz)
>   4.13  LLVM assembly (.ll)
>   0.68  Gzipped LLVM assembly (.ll.gz)
>
> However, there's not much in it, considering that a stripped native
> binary is about 0.40 on the same scale.
>
> So, seeing as projects such as PNaCl want to send LLVM bitcode over
> the network, are there any proposed solutions for making LLVM bitcode
> more compact?
>
> Removing or simplifying the names of local variables would be an
> obvious thing to do.

opt -globaldce -strip -strip-dead-prototypes -deadtypeelim

-strip removes names of local vars (and more).

> Is there anything else that could be done
> without changing the bitcode format? (There's an obvious analogy with
> JavaScript compression techniques.) Does anyone have any idea how
> much it would help?

You might try some other compression techniques, .xz seems to be popular these days.

Best regards,
--Edwin
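For anyone wanting to repeat this kind of measurement, here is a minimal sketch in Python of how the relative sizes could be computed, normalised to one baseline file as in the table quoted in this thread. The function name `relative_sizes` and the labels are illustrative assumptions, not part of any LLVM tool, and the standard-library `gzip` and `lzma` modules stand in for the command-line gzip and xz tools, so exact numbers may differ slightly from theirs.

```python
import gzip
import lzma


def relative_sizes(blobs, baseline):
    """Return {label: size / baseline_size} for each blob, raw, gzipped and xz-compressed.

    `blobs` maps a label (for example a file name) to the file's bytes.
    `baseline` is the label whose uncompressed size counts as 1.00.
    """
    base = len(blobs[baseline])
    table = {}
    for name, data in blobs.items():
        table[name] = len(data) / base
        table[name + ".gz"] = len(gzip.compress(data, compresslevel=9)) / base
        table[name + ".xz"] = len(
            lzma.compress(data, format=lzma.FORMAT_XZ, preset=9)
        ) / base
    return table
```

To use it, read each file with `open(path, "rb").read()`, put the results in a dict keyed by file name, and pass the name of the .bc file as `baseline`; printing each ratio with `f"{ratio:5.2f}"` gives a table in the same shape as the one above.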
Edmund Grimley-Evans
2010-Dec-15 12:26 UTC
[LLVMdev] Which is more compact, .bc or .ll.gz? And what might be even more compact?
Thanks for the advice.

I tried comparing LLVM bitcode, LLVM assembly, and x86 binary with all the files stripped and LZMA-compressed. The compressed LLVM assembly was very slightly smaller than the compressed LLVM bitcode. Both were about 1.45 times the size of the compressed native binary.

That's not a very exciting ratio. However, perhaps it's interesting that the bitpacking and other ad hoc compression techniques used in LLVM bitcode seem to get in the way of standard compression algorithms. Of course the compression techniques used in LLVM bitcode have the advantage that they allow the data to be selectively parsed.

Edmund
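The remark about bit-packing getting in the way of standard compressors can be illustrated with a toy experiment. The sketch below is an assumption-laden model, not the LLVM bitcode format: it builds a stream of 5-bit symbols from a small dictionary of repeated "words", stores it once with one symbol per byte and once packed into fixed 5-bit fields, and compresses both. Because packed repeats usually land at different bit offsets, a byte-oriented matcher sees fewer identical byte runs. All names here (`pack_bits`, `make_stream`, `compare`) are hypothetical helpers.

```python
import lzma
import random
import zlib


def pack_bits(symbols, width):
    """Pack integers into a byte string using `width` bits each, LSB first."""
    acc = 0      # bit accumulator
    nbits = 0    # number of valid bits currently held in acc
    out = bytearray()
    for s in symbols:
        acc |= s << nbits
        nbits += width
        while nbits >= 8:
            out.append(acc & 0xFF)
            acc >>= 8
            nbits -= 8
    if nbits:
        out.append(acc & 0xFF)  # flush the final partial byte
    return bytes(out)


def make_stream(seed=0, vocab=64, words=5000):
    """Build a repetitive stream of 5-bit symbols from a random dictionary of words."""
    rng = random.Random(seed)
    dictionary = [
        [rng.randrange(32) for _ in range(rng.randint(5, 20))]
        for _ in range(vocab)
    ]
    stream = []
    for _ in range(words):
        stream.extend(rng.choice(dictionary))
    return stream


def compare(stream):
    """Return raw and compressed sizes for the byte-aligned and bit-packed forms."""
    aligned = bytes(stream)          # one symbol per byte
    packed = pack_bits(stream, 5)    # five bits per symbol
    return {
        "aligned": len(aligned),
        "packed": len(packed),
        "aligned.deflate": len(zlib.compress(aligned, 9)),
        "packed.deflate": len(zlib.compress(packed, 9)),
        "aligned.xz": len(lzma.compress(aligned)),
        "packed.xz": len(lzma.compress(packed)),
    }
```

Calling `compare(make_stream())` returns the six sizes. On this synthetic data the packed form starts out smaller but should end up larger after compression, which is the same direction as the effect described above; it says nothing quantitative about real bitcode, which uses abbreviations and variable-width fields rather than a fixed 5-bit packing.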