Dipanjan Das via llvm-dev
2017-May-04 23:47 UTC
[llvm-dev] Computing unique ID of IR instructions that can be mapped back
I am writing an analysis pass on LLVM which needs to:

[1] generate a unique, positive ID for each instruction
[2] keep that ID stable across runs
[3] map a given ID back to the corresponding instruction

For [1], the general suggestion is to use the Value* instr_ptr associated with each instruction. The instr_ptr points to a specific instruction in memory, hence it is unique. However, it changes at every run, which violates [2]. My hacky workaround is to take the first instruction of the 'main' function and compute each instruction's offset from it. The problem is that this gives a negative offset for instructions that lie lower in memory than 'main'. Even if I work around that by adding a large enough positive bias, I have no idea how to map the instr_ptr back to the corresponding IR instruction. Can anyone suggest an elegant workaround?

--
Thanks & Regards,
Dipanjan
Daniel Berlin via llvm-dev
2017-May-05 16:20 UTC
[llvm-dev] Computing unique ID of IR instructions that can be mapped back
When you say the ID must survive across runs, do you mean that instructions that look the same must have the same ID? Or that the instructions from the first run must have the same ID?

Because producing a unique positive ID is easy. Things are always constructed in the order they appear in the file. So using a single counter, walking every function in the module and then every instruction in the function, and giving each an ID, should be unique, positive, and consistent across runs for the same file.
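The counter walk described above can be sketched as follows. This is a minimal illustration, not LLVM code: `Instr`, `Func`, `Mod` and `assignIds` are hypothetical stand-ins for `llvm::Instruction`, `llvm::Function`, `llvm::Module` and a pass body. In a real pass the same walk would be the usual nested range-for over the module's functions, each function's basic blocks, and each block's instructions.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical toy stand-ins for llvm::Instruction / Function / Module.
struct Instr { std::string text; };
struct Func  { std::string name; std::vector<Instr> body; };
struct Mod   { std::vector<Func> funcs; };

// Number every instruction in file order. The counter starts at 1 so that
// every ID is strictly positive. The pointer keys differ between runs, but
// the ID attached to the N-th instruction does not.
std::unordered_map<const Instr*, std::uint64_t> assignIds(const Mod& m) {
    std::unordered_map<const Instr*, std::uint64_t> ids;
    std::uint64_t next = 1;
    for (const Func& f : m.funcs) {
        for (const Instr& i : f.body) {
            bool inserted = ids.emplace(&i, next++).second;
            assert(inserted && "each instruction is visited exactly once");
            (void)inserted;  // silence unused-variable warnings under NDEBUG
        }
    }
    return ids;
}
```

The IDs depend only on traversal order, so they stay the same as long as the input file is unchanged. Any transformation that inserts or removes instructions before the walk shifts every later ID.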
Dipanjan Das via llvm-dev
2017-May-05 16:37 UTC
[llvm-dev] Computing unique ID of IR instructions that can be mapped back
On 5 May 2017 at 09:20, Daniel Berlin <dberlin at dberlin.org> wrote:

> When you say the ID must survive across runs, you mean that instructions
> that look the same must have the same ID?
> Or that the instructions from the first run must have the same ID?

If I don't change the bitcode, each instruction should receive the same ID at every run.

> Because doing a unique positive ID is easy.
> Things are always constructed in the order they appear in the file.
> So using a single counter, walking every function in the module, and then
> every instruction in the function, and giving each an ID, should be unique,
> positive, and consistent across runs for the same file.

Suppose I walk the file and increment the counter: how do I then attach an ID to an instruction? I need these IDs during an instrumentation phase. In other words, how will the instrumentation phase know which ID has been assigned to a particular instruction? The instrumentation only knows the instr_ptr (the memory address of an instruction), which changes at every run.
Robinson, Paul via llvm-dev
2017-May-09 00:40 UTC
[llvm-dev] Computing unique ID of IR instructions that can be mapped back
(adding back llvm-dev)

> Is there any standard means to add an extra field to LLVM IR instructions?

Your description implies that you are not intending to change the on-disk format, so it's simple: It is a class. Change the source to add a field to it. Use it as you wish.

--paulr

From: its.dipanjan.das at gmail.com On Behalf Of Dipanjan Das
Sent: Monday, May 08, 2017 1:45 PM
To: Robinson, Paul
Subject: Re: [llvm-dev] Computing unique ID of IR instructions that can be mapped back

Hi Paul,

On 8 May 2017 at 10:50, Robinson, Paul <paul.robinson at sony.com> wrote:

>> Let's say, I walk the file, increment the counter but how'll I assign an
>> ID to an instruction. Because, I need these IDs during an instrumentation
>> phase. In other words, how will the instrumentation phase know what ID has
>> been assigned to a particular instruction? Because, the instrumentation
>> knows about the instr_ptr (memory address of an instruction) which changes
>> at every run.
>
> Two tactics come to mind. First, you can store the ID directly in the
> instruction (adding a field to the class) when you assign the IDs.

Is there any standard means to add an extra field to LLVM IR instructions?

> Second, you can build a map of instr_ptr to ID which you store somewhere
> that your instrumentation phase can use to look up the IDs as needed.

As instr_ptr keeps on changing, the mapping is not going to work.

> Presumably you are not looking to modify the bitcode format itself, as you
> were worried about consistent IDs across runs.
>
> --paulr
Jonathan Roelofs via llvm-dev
2017-May-09 01:14 UTC
[llvm-dev] Computing unique ID of IR instructions that can be mapped back
On 5/8/17 6:40 PM, Robinson, Paul via llvm-dev wrote:

> (adding back llvm-dev)
>
> Is there any standard means to add an extra field to LLVM IR instructions?
>
> Your description implies that you are not intending to change the
> on-disk format, so it's simple:
>
> It is a class. Change the source to add a field to it. Use it as you wish.

An alternative way to do it, which would be less invasive and more upstreaming friendly, would be to implement this as an analysis pass. In the pass, calculate a map<Instruction*,ID>, and use that as the analysis result. Then query that result for the ID where you need it. If you've written the analysis correctly, then you'll always get the same IDs for the same instructions given the same input IR.

Jon

--
Jon Roelofs
jonathan at codesourcery.com
CodeSourcery / Mentor Embedded / Siemens
James Courtier-Dutton via llvm-dev
2017-May-10 12:38 UTC
[llvm-dev] Computing unique ID of IR instructions that can be mapped back
Why not have your analysis pass build its own map? The map would map between the Instruction and an integer index or offset, i.e.

int index, Value *, Node *, Mod *

where "Value *" is the reference to the instruction itself, "Node *" is the BB that contains the instruction, and "Mod *" is the module containing all the Nodes. The index will be the same every time you build the table if the IR has not changed. The analysis code then uses the index to refer back to the instruction's context.
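A table of this shape might look like the sketch below. It is only an illustration under stated assumptions: `Instr`, `Block`, `Func`, `Mod`, `Row` and `InstrTable` are hypothetical toy types standing in for the LLVM classes, and the pointer-to-index lookup is an addition so that code holding an instruction pointer can find its index within the current run.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <vector>

// Hypothetical toy stand-ins for the LLVM IR hierarchy.
struct Instr { int opcode; };                  // llvm::Instruction
struct Block { std::vector<Instr> instrs; };   // llvm::BasicBlock ("Node")
struct Func  { std::vector<Block> blocks; };   // llvm::Function
struct Mod   { std::vector<Func> funcs; };     // llvm::Module

// One table row: the instruction plus the context it lives in.
struct Row {
    const Instr* instr;
    const Block* block;
    const Mod*   mod;
};

class InstrTable {
public:
    // Walk the module in file order; row k describes the k-th instruction.
    explicit InstrTable(const Mod& m) {
        for (const Func& f : m.funcs)
            for (const Block& b : f.blocks)
                for (const Instr& i : b.instrs) {
                    indexOf_[&i] = rows_.size();
                    rows_.push_back(Row{&i, &b, &m});
                }
    }

    // Index -> instruction and context (requirement [3] of the thread).
    const Row& row(std::size_t index) const {
        assert(index < rows_.size());
        return rows_[index];
    }

    // Pointer -> index, valid only within this run.
    // Returns size() when the instruction is not in the table.
    std::size_t indexOf(const Instr* i) const {
        auto it = indexOf_.find(i);
        return it == indexOf_.end() ? rows_.size() : it->second;
    }

    std::size_t size() const { return rows_.size(); }

private:
    std::vector<Row> rows_;
    std::unordered_map<const Instr*, std::size_t> indexOf_;
};
```

The indices here start at 0. If strictly positive IDs are required, as in the original question, using index + 1 as the ID is enough. The table stores raw pointers, so it has to be rebuilt whenever the IR is modified.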
Dipanjan Das via llvm-dev
2017-May-10 17:22 UTC
[llvm-dev] Computing unique ID of IR instructions that can be mapped back
On 10 May 2017 at 05:38, James Courtier-Dutton <james.dutton at gmail.com> wrote:

> Why not have your analysis pass build its own map.
> The map would map between the Instruction and an integer index or offset.
> I.e.
> int index, Value *, Node *, Mod *
> Where "Value *" is the reference to the instruction itself, "Node *"
> is the BB that the instruction is contained in. "Mod *" is the module
> containing all the Nodes.
> The index will be the same every time you build the table if the IR
> has not changed.
> The analysis code then uses the index to refer back to the
> instruction's context.

Based on earlier replies in this thread, I plan to proceed in exactly this way. However, it would be more elegant (implementation-wise) to hook into the IR generation process and make the instructions carry the ID as metadata in the IR itself. If I implement it the way you suggested (and I am planning to), I need to write the mapping to disk in a separate file to be able to refer back to it later on.
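Since pointers cannot be written to disk in any useful way, a side file of this kind would need to record a location that is stable across runs. The sketch below assumes one possible format, a line of "id, function name, position within the function" per instruction. The types and the `saveIds` / `resolveId` helpers are hypothetical, not part of LLVM, and the format assumes function names contain no whitespace.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical toy stand-ins for llvm::Instruction / Function / Module.
struct Instr { std::string text; };
struct Func  { std::string name; std::vector<Instr> body; };
struct Mod   { std::vector<Func> funcs; };

// Write one line per instruction: "<id> <function name> <position>".
// IDs are assigned in file order, starting at 1.
void saveIds(const Mod& m, std::ostream& out) {
    std::size_t id = 1;
    for (const Func& f : m.funcs) {
        // Assumed restriction of this toy format: names are one token.
        assert(f.name.find(' ') == std::string::npos);
        for (std::size_t pos = 0; pos < f.body.size(); ++pos)
            out << id++ << ' ' << f.name << ' ' << pos << '\n';
    }
}

// Resolve a saved ID against a freshly loaded module.
// Returns nullptr if the ID is absent or its location no longer exists.
const Instr* resolveId(const Mod& m, std::istream& in, std::size_t wanted) {
    std::size_t id = 0, pos = 0;
    std::string fn;
    while (in >> id >> fn >> pos) {
        if (id != wanted) continue;
        for (const Func& f : m.funcs)
            if (f.name == fn && pos < f.body.size())
                return &f.body[pos];
        return nullptr;
    }
    return nullptr;
}
```

A later run can then parse the file and resolve an ID without relying on any address from the earlier run. The metadata route mentioned above would avoid the side file altogether: LLVM instructions can carry metadata attachments (for example through `Instruction::setMetadata`), which are written out with the IR, though how well such an attachment survives later transformations is something to verify for the pipeline in question.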