I suspect that the MCSymbol for ‘myvar’ simply represents a position in the
output stream, and certainly it is not directly tied to whatever instructions or
data directives might chance to follow it. You’re capturing the labels and data
directives as they’re emitted, but you’ll have to make the association between
them yourself. The assembler does not consider ‘myvar’ to be a variable name;
it’s the name of an offset within a section, which is (ultimately) an address in
the final object. How it gets used semantically by the program is not the
assembler’s concern.
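
A minimal sketch of making that association by hand. This is not LLVM code: the hypothetical LabelValueCollector below only stands in for a streamer subclass, and onLabel, onValue and onOther are assumed names for whatever label, data and instruction callbacks you end up overriding. In a real MCStreamer override the constant would presumably have to be pulled out of the MCExpr first (for example when it is an MCConstantExpr); that wiring is left out here.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a streamer subclass. The methods mirror the order
// in which label and data callbacks would arrive while parsing.
class LabelValueCollector {
public:
  // A label such as "myvar:" only marks a position; remember it as current.
  void onLabel(const std::string &Name) {
    Current = Name;
    Values[Name]; // create an empty entry so labels without data are recorded
  }

  // One constant data directive, e.g. ".long 3" (Size is in bytes).
  void onValue(std::uint64_t Value, unsigned Size) {
    assert(Size > 0 && Size <= 8 && "unexpected directive size");
    (void)Size;
    if (Current.empty())
      return; // data with no preceding label: nothing to attribute it to
    Values[Current].push_back(Value);
  }

  // Anything that ends the run of constants (instruction, symbolic data, ...).
  void onOther() { Current.clear(); }

  // Constants collected for Name, or an empty vector if none were seen.
  const std::vector<std::uint64_t> &valuesFor(const std::string &Name) const {
    static const std::vector<std::uint64_t> Empty;
    auto It = Values.find(Name);
    return It == Values.end() ? Empty : It->second;
  }

private:
  std::string Current;
  std::map<std::string, std::vector<std::uint64_t>> Values;
};
```

The policy of "attribute data to the most recent label until something else appears" is a choice made by the collector, not something the assembler guarantees.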
HTH,
--paulr
From: llvm-dev <llvm-dev-bounces at lists.llvm.org> On Behalf Of Pietro
D'Ettole via llvm-dev
Sent: Thursday, February 18, 2021 12:13 PM
To: Brian Cain <brian.cain at gmail.com>; llvm-dev at lists.llvm.org
Subject: Re: [llvm-dev] Asm Parser vars extraction
Hi,
Even if that would be a straightforward and good way to proceed, I cannot take
the way you proposed with objdump as the files I'll be dealing with are .s
only, no obj files at all.
So far, I've been able to intercept the emission of the Label and Value
while parsing; in particular, this has been done by overriding the
virtual void MCStreamer::emitValueImpl(const MCExpr *Value, unsigned Size, SMLoc
Loc = SMLoc()) method.
From what I've been able to read in the doxygen documentation, MCExpr
should have an attribute of MCSymbol * type which points to the symbol it refers
to (in this way it should be easy to match the "myvar" label).
Unfortunately I can't find a way to extract the actual value from it (i.e.
.long directives followed by the values, as shown in the asm file in the last
reply).
Now, is it possible to extract the value the way I've just described? If
yes, how?
I'm open to other suggestions and ways to achieve this, anyway (if any).
Thanks!
On Thu, Feb 18, 2021 at 05:09, Brian Cain <brian.cain at gmail.com> wrote:
Well, one good way to get this is to let the assembler make the object file and
dump the contents of the object file. I've illustrated w/objdump
disassembly but a more robust way might be to use the location and read the data
directly from the object file. See below for an example w/objdump.
But if you can't make the object file for some reason, I think ELFWriter
might have what you need, or at least be a starting point. There is no
arch-independent / object-file-independent way to determine where
"myvar" will end up. So the way I think it makes sense to get this
information is to wait until it's being emitted into the object file. But
this is so late in assembly that it doesn't seem to make as much sense to
modify the assembler so much as you might want to dissect its output. And
there's lots of great tools and libraries for dissecting object files.
However, another approach might be to modify the assembler to look for labels
matching "myvar" and set some kind of mode that lets you accumulate
the subsequent ".word" directives.
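
A rough sketch of that "accumulate mode", done as a plain text scan rather than inside the assembler. The function name collectWords is hypothetical, and the sketch assumes one statement per line, '@' as the comment character, and non-negative numeric operands only; a symbolic operand or any other statement ends the run.

```cpp
#include <cassert>
#include <cctype>
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Collect the numeric .long/.word operands that directly follow "Label:".
std::vector<std::uint32_t> collectWords(const std::string &Asm,
                                        const std::string &Label) {
  assert(!Label.empty() && "label name must not be empty");
  std::vector<std::uint32_t> Out;
  std::istringstream In(Asm);
  std::string Line;
  bool Collecting = false;
  while (std::getline(In, Line)) {
    Line = Line.substr(0, Line.find('@')); // drop a trailing comment, if any
    std::istringstream Tokens(Line);
    std::string First, Operand;
    if (!(Tokens >> First))
      continue; // blank or comment-only line
    if (!Collecting) {
      Collecting = (First == Label + ":"); // enter accumulate mode
      continue;
    }
    bool IsData = (First == ".long" || First == ".word");
    if (!IsData || !(Tokens >> Operand) ||
        !std::isdigit(static_cast<unsigned char>(Operand[0])))
      break; // the run of numeric data has ended
    // Base 0 lets stoul accept both decimal and 0x-prefixed operands.
    Out.push_back(static_cast<std::uint32_t>(std::stoul(Operand, nullptr, 0)));
  }
  return Out;
}
```

Calling collectWords on the listing from this thread with the label "myvar" would collect the four .long operands; a scan like this is only as robust as the assumptions listed above.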
Also: you described "myvar" as being a variable in a C program. So
this is compiler-emitted assembly? Maybe it makes more sense to intercept this
value in the compiler instead of in the assembler. It might not be very robust
to try and scoop up .word's - this output could vary and still be legitimate
compiler output.
$ /opt/clang-latest/bin/llvm-mc -filetype=obj -triple=armv7 pietro.S -o pietro.o
$ /opt/clang-latest/bin/llvm-objdump --triple=armv7 -d pietro.o
pietro.o: file format elf32-littlearm
Disassembly of section .text:
...
00000034 <myvar>:
34: 00 00 00 00 .word 0x00000000
38: 01 00 00 00 .word 0x00000001
3c: 02 00 00 00 .word 0x00000002
40: 03 00 00 00 .word 0x00000003
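
For the "read the data directly" variant, a small sketch of decoding the words once the section bytes are in memory. How the bytes are obtained from the object file is out of scope here; readWordsLE is a hypothetical helper, and little-endian 32-bit words are assumed because the dump above reports elf32-littlearm.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Decode Count little-endian 32-bit words starting at byte Offset in Section.
std::vector<std::uint32_t> readWordsLE(const std::vector<std::uint8_t> &Section,
                                       std::size_t Offset, std::size_t Count) {
  assert(Offset <= Section.size() &&
         Count <= (Section.size() - Offset) / 4 && "range exceeds section");
  std::vector<std::uint32_t> Words;
  for (std::size_t I = 0; I < Count; ++I) {
    const std::uint8_t *P = &Section[Offset + 4 * I];
    // The least significant byte comes first in memory.
    Words.push_back(std::uint32_t(P[0]) | std::uint32_t(P[1]) << 8 |
                    std::uint32_t(P[2]) << 16 | std::uint32_t(P[3]) << 24);
  }
  return Words;
}
```

With the section contents shown in the dump, readWordsLE(Section, 0x34, 4) would cover the four words listed under myvar.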
On Wed, Feb 17, 2021 at 10:48 AM Pietro D'Ettole <progettoiotpolimi2019
at gmail.com> wrote:
Hi,
I'll try to give you more context through an example.
What I need is to extract the value of a variable (aka a label) inside an
assembly file; for example given the following assembly (compiled with armv7-a
clang 10.0.0):
---------------------------------------------
main:
sub sp, sp, #12
mov r2, #0
str r2, [sp, #8]
str r0, [sp, #4]
str r1, [sp]
ldr r0, .LCPI0_0
ldr r1, [r0, #4]
ldr r0, [r0, #8]
mul r2, r1, r0
mov r0, r2
add sp, sp, #12
bx lr
.LCPI0_0:
.long myvar
myvar:
.long 0 @ 0x0
.long 1 @ 0x1
.long 2 @ 0x2
.long 3 @ 0x3
---------------------------------------------
As you can see there's a label, namely "myvar", which is a
variable in the C program. What I'm trying to achieve is, while parsing with
the Asm Parser, to get the value of "myvar" out of the asm file
(i.e. get the values 0x0, 0x1, 0x2, 0x3). Is this possible?
Thanks a lot!
On Tue, Feb 9, 2021 at 03:01, Brian Cain <brian.cain at gmail.com> wrote:
I don't quite know exactly, but I suppose one way would be to modify
ELFWriter::writeSymbol() to emit something when a symbol appears that matches
your criteria. I'm taking some liberty here assuming you can use an ELF
object file. I imagine there's something similar for macho/coff.
Then again, that information is present in the object file too. You could use
llvm-readelf or obj2yaml to extract what you want.
Maybe you could give a little more context about how you plan to use the info
and the community could offer a better answer.
On Mon, Feb 8, 2021 at 5:03 PM Pietro D'Ettole <progettoiotpolimi2019 at
gmail.com> wrote:
Hi Brian, thanks for your reply.
My goal is to be able to extract from an assembly file (i.e. source compiled
directly to assembly) the static global vars declared in the source. So far I
haven't found any API in the llvm asm parser that serves my purpose. Do you
know if/how I can accomplish that?
Thank you.
On Saturday, February 6, 2021, Brian Cain <brian.cain at gmail.com> wrote:
You need to extract it from the source? Is it possible to use the resulting
object file instead?
Note that there's no way to get an 'address' but you can get a
section offset. The section offset for both code and data is available in the
object file and from the assembler as it writes the object file.
The contents of the assembly file - instructions and directives - contribute to
the resulting layout. The AsmParser can find tokens and build instructions but
shouldn't know how it will get laid out.
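
To illustrate how the contents determine the layout, here is a toy sketch that derives section offsets for labels by summing the sizes of everything before them. The Item struct and labelOffsets are hypothetical names, and the sketch assumes every size is already known (for example fixed 4-byte ARM instructions) with no alignment padding; the real assembler only settles this when it lays out the section.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// One parsed statement: a label (Size == 0) or something that occupies bytes.
struct Item {
  std::string Label;  // non-empty only for labels
  std::uint32_t Size; // number of bytes the statement contributes
};

// Walk the statements in order and record the running offset at each label.
std::map<std::string, std::uint32_t>
labelOffsets(const std::vector<Item> &Items) {
  std::map<std::string, std::uint32_t> Offsets;
  std::uint32_t Pos = 0;
  for (const Item &I : Items) {
    if (!I.Label.empty()) {
      assert(I.Size == 0 && "labels occupy no bytes");
      Offsets[I.Label] = Pos;
    }
    Pos += I.Size;
  }
  return Offsets;
}
```

Under those assumptions, a function of twelve 4-byte instructions followed by one 4-byte literal puts the next label at offset 0x34, which is the offset objdump reports for myvar in the reply above.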
On Sat, Feb 6, 2021, 3:16 AM Pietro D'Ettole via llvm-dev <llvm-dev at
lists.llvm.org> wrote:
Hi all,
I'm a little bit stuck in code reading right now. Maybe some of you can help
me quickly understand whether what I need is feasible.
I am trying to find out whether the asm parser in llvm currently makes it easy
to extract variable names from an asm file, along with their values and their
addresses.
Thanks for the help!
_______________________________________________
LLVM Developers mailing list
llvm-dev at lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
--
-Brian