OK. Let's try a specific example: At least for ELF files, GNU
objdump prints operand values in hex. AFAIK, hex is not just the
default, but the only choice. On the other hand, llvm-objdump prints
operand values in decimal and ignores the --print-imm-hex option for
ELF.

How about a patch to print operands in hex for ELF? Good place to start?

On Mon, Dec 1, 2014 at 5:49 PM, Kevin Enderby <enderby at apple.com> wrote:
> There currently is a -macho option to llvm-objdump to "Use MachO specific object file parser", which I'm hiding the Mach-O-specific disassembly stuff behind. Currently it is only used with the -disassemble option, but one could see it being used for other stuff. As Jim points out, the output today for some things is controlled by the container, which is what is done for things like -private-headers. There are flags like -exports-trie, -rebase, -bind, etc. that are really Mach-O options.
>
> As far as the symbolizing work goes, it can be relevant for ELF files, and the code I did can be used as a model for hooking it up for ELF files. But the real work of the callbacks is very specific to each type of object file.
>
> Kev
>
> On Dec 1, 2014, at 5:24 PM, Jim Grosbach <grosbach at apple.com> wrote:
>
>> At least for now, I don't expect it to become all that unwieldy. Any behavioral differences should be easily separable into different classes and source files. If, as things progress, it becomes obvious that there's really not much of anything in common other than the general nature of the tools, it's easy to split them apart.
>>
>> -Jim
>>
>>> On Dec 1, 2014, at 5:20 PM, Steve King <steve at metrokings.com> wrote:
>>>
>>> Hi guys, thanks for responding. Will mimicking both otool and objdump
>>> in one binary become unwieldy? Maybe a disassembler library would be
>>> a better way to factor out common code? For example, will Kevin's
>>> symbolizing work be relevant for ELF files?
>>> Regards,
>>> -steve
>>>
>>>
>>> On Mon, Dec 1, 2014 at 4:50 PM, Jim Grosbach <grosbach at apple.com> wrote:
>>>> Hey folks,
>>>>
>>>> It's great to see more interest in the supporting tools like objdump and such. I very much agree that bringing llvm-objdump up to feature parity (to start with) compared to both otool(1) and objdump(1) is a great goal. The default output formatting is easy enough to get right by having it be controlled by the container format (otool style for macho, objdump style for ELF). Kevin's right that where this gets a bit interesting is command line option handling. The prevailing wisdom from clang and lld so far seems to be the alternatives Kevin mentions of sniffing argv[0] and/or having a --flavor or --format option. IMO, for now we can just do the latter, which is the simpler thing, while we get the real functionality in place. Then when we're ready to, optionally as packagers decide to opt-in, use llvm-objdump to replace the system version, we can figure out the right way to make that transition nice and clean.
>>>>
>>>> -jim
>>>>
>>>>
>>>>> On Dec 1, 2014, at 4:40 PM, Kevin Enderby <enderby at apple.com> wrote:
>>>>>
>>>>> Hi Steve,
>>>>>
>>>>> I've been trying to get the functionality of llvm-objdump to match that of darwin's otool(1). In adding the support for symbolic disassembly, and to allow testing of it on very large files in a way that lets the disassembly diff cleanly, I added a few options to llvm-objdump and to otool(1). For example, these would be the two command lines I would use for testing:
>>>>>
>>>>> llvm-objdump -d -m -no-show-raw-insn -full-leading-addr -print-imm-hex …
>>>>> otool -tV -U -no-show-raw-insn …
>>>>>
>>>>> Long term I hope to see llvm-objdump take over all of darwin's otool(1) functionality. Not sure of the best way of going about this for command line options, as the trick of parsing them differently based on argv[0] may not work. There may need to be some wrapper to do that. And also there may need to be some option like llvm-nm's "-format XXX" to get the output to match so scripts can use the output.
>>>>>
>>>>> I've Cc'ed Jim Grosbach as he may have some guidance on this.
>>>>>
>>>>> My thoughts,
>>>>> Kev
>>>>>
>>>>> On Dec 1, 2014, at 4:20 PM, Steve King <steve at metrokings.com> wrote:
>>>>>
>>>>>> Hello LLVM,
>>>>>>
>>>>>> Previously, some folks wanted llvm-objdump to behave more like GNU
>>>>>> objdump. This could encompass both command line options and output
>>>>>> format. Such a change helps developers already familiar with GNU
>>>>>> tools and allows re-use of Perl scripts or other automation expecting
>>>>>> to see GNU style dumps.
>>>>>>
>>>>>> Is moving llvm-objdump toward GNU objdump the general preference? And
>>>>>> what about otool-style output?
>>>>>>
>>>>>> Regards,
>>>>>> -steve
>>>>>> _______________________________________________
>>>>>> LLVM Developers mailing list
>>>>>> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
>>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
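To make the two alternatives from the quoted thread concrete (sniffing argv[0] versus an explicit flavor option), here is a minimal C++ sketch. Everything in it is hypothetical: the `Flavor` enum, `baseName`, `pickFlavor`, and the "gnu"/"otool" option values are illustration only and are not taken from the llvm-objdump sources. It assumes an explicit option should win over the invoked name.

```cpp
#include <cassert>
#include <string>

// Hypothetical output flavors; not part of llvm-objdump.
enum class Flavor { GNU, Otool };

// Strip any directory components from the path the tool was invoked as.
std::string baseName(const std::string &path) {
  const std::string::size_type slash = path.find_last_of("/\\");
  return slash == std::string::npos ? path : path.substr(slash + 1);
}

// Pick a flavor. An explicit option value ("gnu" or "otool") wins;
// an empty or unrecognized value falls back to sniffing argv[0].
Flavor pickFlavor(const std::string &argv0, const std::string &flavorOpt) {
  assert(!argv0.empty() && "argv[0] is expected to be non-empty");
  if (flavorOpt == "otool")
    return Flavor::Otool;
  if (flavorOpt == "gnu")
    return Flavor::GNU;
  // Assumption: a binary or symlink whose name contains "otool" wants
  // otool-style behavior; anything else gets the GNU-style default.
  return baseName(argv0).find("otool") != std::string::npos ? Flavor::Otool
                                                            : Flavor::GNU;
}
```

With this shape, a packager could install a symlink named after the system tool, while developers who call llvm-objdump directly would pass the option instead. Whether argv[0] sniffing is reliable enough is exactly the concern Kevin raises.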
On Dec 3, 2014, at 3:12 PM, Steve King <steve at metrokings.com> wrote:

> OK. Let's try a specific example: At least for ELF files, GNU
> objdump prints operand values in hex. AFAIK, hex is not just the
> default, but the only choice. On the other hand, llvm-objdump prints
> operand values in decimal and ignores the --print-imm-hex option for
> ELF.
>
> How about a patch to print operands in hex for ELF? Good place to start?

Seems like a good place to start if you want to create a patch that honors the --print-imm-hex option for ELF files.

At one point I had to hook up the existing -no-show-raw-insn option to the Mach-O parser code in llvm-objdump to allow me to test its output against darwin's otool(1). Later I even had to add the -no-show-raw-insn option to darwin's otool(1) so that arm64 code could also be diffed.

In talking to Jim Grosbach today, the idea is to first get all the functionality implemented, and only later worry about the packaging stuff, like getting the defaults for all the options to match the native tool we are trying to replace.

>
> On Mon, Dec 1, 2014 at 5:49 PM, Kevin Enderby <enderby at apple.com> wrote:
>> There currently is a -macho option to llvm-objdump to "Use MachO specific object file parser", which I'm hiding the Mach-O-specific disassembly stuff behind. Currently it is only used with the -disassemble option, but one could see it being used for other stuff. As Jim points out, the output today for some things is controlled by the container, which is what is done for things like -private-headers. There are flags like -exports-trie, -rebase, -bind, etc. that are really Mach-O options.
>>
>> As far as the symbolizing work goes, it can be relevant for ELF files, and the code I did can be used as a model for hooking it up for ELF files. But the real work of the callbacks is very specific to each type of object file.
On Wed, Dec 3, 2014 at 5:09 PM, Kevin Enderby <enderby at apple.com> wrote:
> Seems like a good place to start if you want to create a patch
> that honors the --print-imm-hex option for ELF files.

Let's skip that. Piecemeal format controls like --print-imm-hex are too problematic since they pile up quickly and require nitpick checks in the target's InstPrinter code. I see you've already got at least three on your command line:

  -no-show-raw-insn -full-leading-addr -print-imm-hex ...

Since each target controls its own operand format, each target decides how closely to conform to whatever style matters most to them. This will involve more fine-grained formatting issues than we'll want to control on the command line.

In keeping with the main idea bouncing around, how about a global style enum with "GNU", "OTOOL", etc. available in MCInstPrinter()? llvm-objdump can set the global style automatically based on the binary's container. The user can override the default on the command line. Targets can check the hint and then do what they know to be best.

Regards,
-steve
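As a rough illustration of the style-hint proposal above, the following C++ sketch shows one way the pieces could fit together: a default derived from the container, an optional user override, and a printer that consults the hint. All names here (`DumpStyle`, `Container`, `resolveStyle`, `ToyInstPrinter`) are hypothetical and do not exist in LLVM; the extra `LLVM` style and the rule "hex for GNU and Otool, decimal otherwise" are assumptions made only to have something to switch on.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical style hint and container kinds; not LLVM APIs.
enum class DumpStyle { LLVM, GNU, Otool };
enum class Container { ELF, MachO, Other };

// The default style follows the container format of the input binary.
DumpStyle defaultStyleFor(Container c) {
  switch (c) {
  case Container::ELF:
    return DumpStyle::GNU;
  case Container::MachO:
    return DumpStyle::Otool;
  case Container::Other:
    break;
  }
  return DumpStyle::LLVM;
}

// A user override ("gnu", "otool" or "llvm") beats the container default;
// an empty or unrecognized string means "no override".
DumpStyle resolveStyle(Container c, const std::string &userOverride) {
  if (userOverride == "gnu")
    return DumpStyle::GNU;
  if (userOverride == "otool")
    return DumpStyle::Otool;
  if (userOverride == "llvm")
    return DumpStyle::LLVM;
  return defaultStyleFor(c);
}

// Stand-in for a target's instruction printer. It only checks the hint;
// how closely to follow a given style stays the target's decision.
class ToyInstPrinter {
public:
  explicit ToyInstPrinter(DumpStyle style) : Style(style) {}

  // Assumption for this sketch: decimal for the LLVM style, hex otherwise.
  std::string formatImm(unsigned long long value) const {
    char buf[32];
    const int n = (Style == DumpStyle::LLVM)
                      ? std::snprintf(buf, sizeof buf, "%llu", value)
                      : std::snprintf(buf, sizeof buf, "0x%llx", value);
    assert(n > 0 && n < static_cast<int>(sizeof buf));
    (void)n; // silence unused-variable warnings when NDEBUG is defined
    return buf;
  }

private:
  DumpStyle Style;
};
```

The point of the design is that the command line carries one coarse choice instead of a growing list of per-detail switches, and each target maps that choice onto its own formatting rules.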
Another wrinkle is that hex values are parsed as unsigned. For example, take this instruction on x86:

  83 c0 9c          addl $-100, %eax

Keeping to imm8, -100 is 0x9C in hex. Suppose llvm-objdump disassembled the instruction this way:

  addl $0x9C,%eax

Re-assembling results in a different and wrong instruction:

  05 9c 00 00 00    addl $0x9C, %eax

I assume we have a golden rule that reassembling our disassembly should get back to the same binary.

For reference, GNU objdump prints hex but knows the width of the operand and extends appropriately:

  83 c0 9c          add $0xffffff9c,%eax
  05 9c 00 00 00    add $0x9c,%eax

Unfortunately, the logical operand width doesn't seem to be handy in the target InstPrinter code. Any ideas how best to find the logical operand width?

Regards,
-steve

On Wed, Dec 3, 2014 at 5:09 PM, Kevin Enderby <enderby at apple.com> wrote:
>
> On Dec 3, 2014, at 3:12 PM, Steve King <steve at metrokings.com> wrote:
>
>> OK. Let's try a specific example: At least for ELF files, GNU
>> objdump prints operand values in hex. AFAIK, hex is not just the
>> default, but the only choice. On the other hand, llvm-objdump prints
>> operand values in decimal and ignores the --print-imm-hex option for
>> ELF.
>>
>> How about a patch to print operands in hex for ELF? Good place to start?
>
> Seems like a good place to start if you want to create a patch that honors the --print-imm-hex option for ELF files.
>
> At one point I had to hook up the existing -no-show-raw-insn option to the Mach-O parser code in llvm-objdump to allow me to test its output against darwin's otool(1). Later I even had to add the -no-show-raw-insn option to darwin's otool(1) so that arm64 code could also be diffed.
>
> In talking to Jim Grosbach today, the idea is to first get all the functionality implemented, and only later worry about the packaging stuff, like getting the defaults for all the options to match the native tool we are trying to replace.
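The sign-extension wrinkle in Steve's x86 example can be sketched in a few lines of C++. This is only an illustration of the arithmetic, assuming the printer somehow knows both the encoded immediate width and the logical operand width, which is precisely the information Steve says is not handy in the InstPrinter. The helper names `signExtend` and `formatHexImm` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Sign-extend the low `bits` bits of `raw` to a 64-bit signed value.
std::int64_t signExtend(std::uint64_t raw, unsigned bits) {
  assert(bits >= 1 && bits <= 64);
  if (bits == 64)
    return static_cast<std::int64_t>(raw);
  const std::uint64_t mask = (std::uint64_t{1} << bits) - 1;
  const std::uint64_t sign = std::uint64_t{1} << (bits - 1);
  const std::uint64_t low = raw & mask;
  // Flip the sign bit, then subtract it: the classic branch-free extension.
  return static_cast<std::int64_t>((low ^ sign) - sign);
}

// Format an immediate as hex the way the GNU output in the example does:
// extend from the encoded width (encBits) to the logical operand width
// (opBits), then print the result as an unsigned value of that width.
std::string formatHexImm(std::uint64_t encoded, unsigned encBits,
                         unsigned opBits) {
  assert(encBits <= opBits && opBits <= 64);
  const std::uint64_t extended =
      static_cast<std::uint64_t>(signExtend(encoded, encBits));
  const std::uint64_t shown =
      opBits == 64 ? extended
                   : extended & ((std::uint64_t{1} << opBits) - 1);
  char buf[32];
  std::snprintf(buf, sizeof buf, "0x%llx",
                static_cast<unsigned long long>(shown));
  return buf;
}
```

For the two encodings in the example, `formatHexImm(0x9c, 8, 32)` yields `0xffffff9c` (the imm8 form, `83 c0 9c`) and `formatHexImm(0x9c, 32, 32)` yields `0x9c` (the imm32 form, `05 9c 00 00 00`), so the two instructions print differently and can be told apart when re-assembled.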