Nicolas Agostini via llvm-dev
2017-Aug-29 22:10 UTC
[llvm-dev] Is the flow "llvm-extract -> llvm-link -> clang++ " supposed to be used in this way? To Extract and Re-insert functions?
Hi all,

First post to the list; I hope you can help or guide me on this task.

I am involved in a project that requires re-linking extracted and edited IR code, so I want to know whether these tools can be used in this way:

    clang++-4.0 code03.cpp -emit-llvm -S -o code03.ll
    llvm-extract-4.0 code03.ll -func main -S -o extracted_main.ll
    llvm-link-4.0 code03.ll -only-needed -override extracted_main.ll -S -o linked_main.ll
    clang++-4.0 linked_main.ll -o main.out

where code03.cpp is:

    #include <iostream>
    using namespace std;
    int main()
    {
      cout << "First Message\n ";
      cout << "Second Message\n ";
      cout << "Third Message\n ";
      return 0;
    }

I have been trying to extract a function's LLVM IR, modify it (preserving its signature or not), and re-insert the function into the original IR file. However, I am getting an error during the compilation step (clang++-4.0 linked_main.ll -o main.out):

    main.ll:(.text+0x14): undefined reference to `.str'
    main.ll:(.text+0x34): undefined reference to `.str.1'
    main.ll:(.text+0x51): undefined reference to `.str.2'

and the linked_main.ll file has this section:

    @.str.4 = private unnamed_addr constant [16 x i8] c"First Message\0A \00", align 1
    @.str.1.6 = private unnamed_addr constant [17 x i8] c"Second Message\0A \00", align 1
    @.str.2.8 = private unnamed_addr constant [16 x i8] c"Third Message\0A \00", align 1
    @.str = external hidden unnamed_addr constant [16 x i8], align 1
    @.str.1 = external hidden unnamed_addr constant [17 x i8], align 1
    @.str.2 = external hidden unnamed_addr constant [16 x i8], align 1

But the function does not use the correct versions of the strings: the linked "extracted_main" keeps referring to .str, .str.1, and .str.2. Am I not supposed to do it this way?

Thank you in advance,
- nico
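A note for readers following along: the three linker errors above line up with the three `external hidden` declarations in linked_main.ll, none of which has a definition under the same name (the private definitions carry the names .str.4, .str.1.6, and .str.2.8). Below is a minimal C++ sketch that lists such declared-but-undefined globals. It assumes the simple one-global-per-line `@name = ...` form shown in this thread and is not a real IR parser; `undefinedGlobals` is a hypothetical helper, not part of any LLVM tool.

```cpp
#include <cstddef>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Return the names of globals that are declared `external` in the given IR
// text but never defined in it. Hypothetical helper for illustration only:
// it understands just the one-line "@name = ..." form quoted in this thread.
std::vector<std::string> undefinedGlobals(const std::string &ir) {
  std::set<std::string> defined;      // globals that have a definition
  std::vector<std::string> declared;  // `external` declarations, in file order

  std::istringstream in(ir);
  std::string line;
  while (std::getline(in, line)) {
    if (line.empty() || line[0] != '@')
      continue;  // not a global variable line
    const std::size_t eq = line.find(" = ");
    if (eq == std::string::npos)
      continue;
    const std::string name = line.substr(1, eq - 1);
    const std::string rest = line.substr(eq + 3);
    // A line whose right-hand side starts with "external" is a declaration.
    if (rest.rfind("external ", 0) == 0)
      declared.push_back(name);
    else
      defined.insert(name);
  }

  // Keep only the declarations that have no definition of the same name.
  std::vector<std::string> missing;
  for (const std::string &name : declared)
    if (defined.count(name) == 0)
      missing.push_back(name);
  return missing;
}
```

Feeding the six-line section of linked_main.ll quoted above to `undefinedGlobals` should report .str, .str.1, and .str.2, which are the symbols the system linker complains about.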
Nicolas Agostini via llvm-dev
2017-Aug-30 18:53 UTC
[llvm-dev] Is the flow "llvm-extract -> llvm-link -> clang++ " supposed to be used in this way? To Extract and Re-insert functions?
After trying different things, I realized that I should change the visibility of the conflicting variables in the target linked.ll file to hidden before calling the linker.

This can be done easily by calling llvm-extract with the -delete option to prepare a file that will receive the linked function:

    llvm-extract-4.0 code03.ll -func main -S -o extracted_main.ll
    llvm-extract-4.0 code03.ll -func main -delete -S -o linked.ll
    llvm-link-4.0 linked.ll -only-needed -override extracted_main.ll -S -o linked_main.ll

This works great for a single-module compilation. But what are the effects if I have several modules?

Thanks,
- nico
Davide Italiano via llvm-dev
2017-Aug-30 19:26 UTC
[llvm-dev] Is the flow "llvm-extract -> llvm-link -> clang++ " supposed to be used in this way? To Extract and Re-insert functions?
On Tue, Aug 29, 2017 at 3:10 PM, Nicolas Agostini via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> But the function does not use the correct versions of the strings as the
> linked "extracted_main" keeps making calls to .str, .str.1, .str.2? Am I not
> supposed to do it this way?

llvm-extract changes the semantics, since it gives every GlobalValue external linkage for simplicity. Therefore, if you have GVs with internal linkage when you run llvm-extract, that information is lost.

At the least, you may want to fix this; the relevant code is around here (Transforms/IPO/ExtractGV.cpp):

```
// For simplicity, just give all GlobalValues ExternalLinkage. A trickier
// implementation could figure out which GlobalValues are actually
// referenced by the Named set, and which GlobalValues in the rest of
// the module are referenced by the NamedSet, and get away with leaving
// more internal and private things internal and private. But for now,
// be conservative and simple.

// Visit the GlobalVariables.
for (Module::global_iterator I = M.global_begin(), E = M.global_end();
     I != E; ++I) {
  bool Delete =
      deleteStuff == (bool)Named.count(&*I) && !I->isDeclaration();
  if (!Delete) {
    if (I->hasAvailableExternallyLinkage())
      continue;
    if (I->getName() == "llvm.global_ctors")
      continue;
  }
  // ... (the rest of the loop body is omitted in this excerpt)
```

Thanks,

--
Davide

"There are no solved problems; there are only problems that are more or less solved" -- Henri Poincare
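To make the "trickier implementation" mentioned in that source comment more concrete, here is a minimal, self-contained C++ sketch of the idea: only globals referenced across the boundary between the extracted set and the rest of the module would need externally resolvable linkage. This is a toy model under stated assumptions, not the LLVM API and not how ExtractGV.cpp is written; `RefGraph`, `mustExternalize`, and the `helper`/`.str.3` names used in the note below are hypothetical.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy stand-in for a module: each global maps to the globals it references.
// Hypothetical type for illustration; this is not an LLVM data structure.
using RefGraph = std::map<std::string, std::vector<std::string>>;

// Return the globals that are referenced across the split between the
// extracted set `named` and the rest of the module. In this model, only
// these would need a linkage that another module can resolve; everything
// else could keep internal or private linkage.
std::set<std::string> mustExternalize(const RefGraph &module,
                                      const std::set<std::string> &named) {
  std::set<std::string> result;
  for (const auto &entry : module) {
    const bool userNamed = named.count(entry.first) != 0;
    for (const std::string &ref : entry.second) {
      // Precondition of this sketch: every referenced global is in the graph.
      assert(module.count(ref) && "reference to an unknown global");
      const bool refNamed = named.count(ref) != 0;
      // The user and the referenced global end up in different modules,
      // so the referenced global must be visible to the other side.
      if (userNamed != refNamed)
        result.insert(ref);
    }
  }
  return result;
}
```

In this model, a module where `main` references `.str`, `.str.1`, and `.str.2` and a hypothetical `helper` references a hypothetical `.str.3` yields `{.str, .str.1, .str.2}` when only `main` is extracted, so `.str.3` could stay private. If the three strings were placed in the extracted set together with `main`, the result would be empty.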
UE US via llvm-dev
2017-Aug-30 19:27 UTC
[llvm-dev] Is the flow "llvm-extract -> llvm-link -> clang++ " supposed to be used in this way? To Extract and Re-insert functions?
I'm a bit rusty so forgive me if I'm not making sense, but wouldn't you want to extract the string literals along with the function in this case and re-link both later?

Gordon Keiser,
Software Delousing Engineer
Davide Italiano via llvm-dev
2017-Aug-30 19:29 UTC
[llvm-dev] Is the flow "llvm-extract -> llvm-link -> clang++ " supposed to be used in this way? To Extract and Re-insert functions?
I forgot, but apparently I had a bug open about this a while ago:
https://bugs.llvm.org/show_bug.cgi?id=31674

--
Davide
Nicolas Agostini via llvm-dev
2017-Aug-30 20:20 UTC
[llvm-dev] Is the flow "llvm-extract -> llvm-link -> clang++ " supposed to be used in this way? To Extract and Re-insert functions?
Hi Davide, thank you for the answer.

> Therefore, if you have GVs with internal linkage when you run
> llvm-extract that information is lost.

Does that mean that if I try to link another module I may have overlapping variables? My apologies, but my knowledge of the linker is not as solid as I would like.

> At least, you may want to fix this, the relevant code is around here
> (Transforms/IPO/ExtractGV.cpp)

I will take a close look and do some experimentation.

> I forgot, but apparently I had a bug open about this a while ago
> https://bugs.llvm.org/show_bug.cgi?id=31674

I believe this falls under Gordon's comment, right? Do we really want to extract this variable with the function?

Quick question: what does internalization mean in this context?

- nico