Wehrli Johan via llvm-dev
2016-Feb-22 16:17 UTC
[llvm-dev] Transfer information IR to binary
Hi Mats,

> On 22 Feb 2016, at 16:48, mats petersson <mats at planetcatfish.com> wrote:
>
> And you want this for only SOME bits of code, and that's why you need to have the IR report what sections are "sensitive"?

Exactly, and I want to choose this once the compilation is over.

> It would be fairly easy if the code you want to check is a normal function: just store the start address of the function, and the length should be doable too at machine code level, but not IR level. If you want to check only the middle of the function, it's a bit harder.

Sadly, I need to be able to check arbitrary parts of the code. One of the problems I have is that even if I get the addresses at the IR level, I still need the CRC value. So normally, I still have to pass these values to the post-processing step in order to compute the hash.

> How are you dealing with the fact that code gets relocated during loading?

Do you mean the loading phase during the link? If so, this is why I do the post-processing after the link.

If you mean something like the -fpie parameter, I use a small trick. The function that calls isModified calculates the offset dynamically. To do this, I get the address in IR (in C++ this is like (uint64_t)std::addressof(main);) and subtract a constant value. During the post-processing (again), I update this constant value with the address in the binary. This gives an offset, which is then used to update the addresses.

> [I'm always curious as to how these type of designs cope with someone modifying the checksumming code itself, but that's another problem - or is this one of these things where the checksum is stored in special hardware-protected memory?].

For now, I store it in a special place. My first solution to my problem was to use a temporary file, but this is highly impractical. Maybe I can create a temporary section in the binary, but I didn’t find a lot of information about it.

Thanks,

Johan

> --
> Mats
>
> On 22 February 2016 at 14:45, Wehrli Johan <johan.wehrli at heig-vd.ch> wrote:
> I will try to explain better what I do.
>
> The main goal behind this is to verify that a part of the code has not been modified by someone else (it is an integrity check).
>
> To do this, I create in IR a function that takes 2 parameters, a begin and an end value.
>
> This function performs a hash over the code area (from begin to end) and returns it.
>
> At first, I don’t know the addresses and the hash value, so I put in random values (64-bit integers).
>
> The function looks like uint32_t isModified(uint64_t* begin, uint64_t* end).
>
> Once the compilation is over, I need to update the begin address, end address and the hash value.
>
> When I say the compilation is over, I mean the clang driver has finished all of its actions (compiling, linking, etc.).
>
> Greetings,
>
> Johan
>
>> On 22 Feb 2016, at 15:04, mats petersson <mats at planetcatfish.com> wrote:
>>
>> What kind of constant: type, value and how is it created?
>>
>> You can make public symbols that you can extract in a linker script to a special section.
>>
>> Or perhaps you want some metadata that a special late stage (machine instr) pass is extracting and adding.
>>
>> The "best" solution really depends on what you are trying to achieve overall and what kind of data you are working with.
>>
>> --
>> Mats
>>
>> On 22 February 2016 at 13:04, Wehrli Johan via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>> Hi,
>>
>> I want to know if it is possible to pass information from IR to the final binary (like a constant value)?
>>
>> I have a module pass in IR that makes some transformations and, once the compilation is finished, I need to apply a post-processing step.
>>
>> The post-processing needs information from the IR part.
>>
>> Greetings,
>>
>> Johan
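The integrity check described in this message can be sketched roughly as follows. This is only an illustration under stated assumptions: the thread does not say which hash is used, so FNV-1a stands in for it; `IntegrityRecord`, `g_integrityRecord`, `hashRange` and the section name `.integrity` are hypothetical names; and `isModified` here takes byte pointers plus the expected hash, which differs from the signature quoted above. The `section` attribute is one way a GCC- or Clang-compiled ELF binary might carry placeholder values that a post-link tool later overwrites.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical record that a post-link tool could find and overwrite.
// The field layout and the section name ".integrity" are assumptions.
struct IntegrityRecord {
    std::uint64_t begin;     // first byte of the protected range
    std::uint64_t end;       // one past the last protected byte
    std::uint32_t expected;  // hash of [begin, end) in the final binary
};

#if defined(__ELF__)
// Placeholder values. `volatile` keeps the compiler from folding them
// into the code, so patching the bytes in the file takes effect.
__attribute__((section(".integrity"), used))
volatile IntegrityRecord g_integrityRecord = {0x1111111111111111ull,
                                              0x2222222222222222ull,
                                              0x33333333u};
#endif

// 32-bit FNV-1a over [begin, end). A stand-in for whatever hash or CRC
// the real pass emits.
std::uint32_t hashRange(const unsigned char* begin, const unsigned char* end) {
    std::uint32_t h = 2166136261u;          // FNV offset basis
    for (const unsigned char* p = begin; p != end; ++p) {
        h ^= *p;
        h *= 16777619u;                     // FNV prime
    }
    return h;
}

// Returns true when the bytes in [begin, end) no longer match `expected`.
bool isModified(const unsigned char* begin, const unsigned char* end,
                std::uint32_t expected) {
    return hashRange(begin, end) != expected;
}
```

With a layout like this, the post-processing step would look up the named section in the linked file (for example with `readelf -S`), compute the hash over the chosen code bytes, and write the three fields back at that file offset.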
mats petersson via llvm-dev
2016-Feb-22 17:23 UTC
[llvm-dev] Transfer information IR to binary
On 22 February 2016 at 16:17, Wehrli Johan <johan.wehrli at heig-vd.ch> wrote:

> Sadly, I need to be able to check arbitrary parts of the code. One of the
> problems I have is that even if I get the addresses at the IR level, I
> still need the CRC value.
>
> So normally, I still have to pass these values to the post-processing
> step in order to compute the hash.

I don't think there is a simple way to achieve this - even less so if you want it to be portable across multiple processor architectures.

For a given processor architecture, you could add a pseudo-instruction that is some unusual form of no-op (e.g. one of the "does nothing" instructions in x86, with an unusual combination of (redundant) prefix bytes, or some such) and then scan the generated code for that and store the relevant information. But this is highly dependent on architecture [and may be sensitive to false positives]. (Alternatively, emit some illegal instruction and replace it with a no-op during the post-processing.)

But I don't think that's a particularly good solution long term.

Maybe someone else has a better idea...

--
Mats
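The marker-scanning idea in this reply could look something like the sketch below. It assumes the code bytes have already been read out of the binary into a buffer; `findMarkers` is a hypothetical helper, and the marker byte pattern is whatever unusual no-op encoding the pass chose to emit. A plain byte search knows nothing about instruction boundaries, which is where the false positives mentioned above come from.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns the offset of every occurrence of `marker` inside `code`.
// Overlapping occurrences are reported too. An empty marker yields
// no hits.
std::vector<std::size_t> findMarkers(const std::vector<std::uint8_t>& code,
                                     const std::vector<std::uint8_t>& marker) {
    std::vector<std::size_t> offsets;
    if (marker.empty()) {
        return offsets;
    }
    auto it = code.begin();
    while (true) {
        it = std::search(it, code.end(), marker.begin(), marker.end());
        if (it == code.end()) {
            break;
        }
        offsets.push_back(static_cast<std::size_t>(it - code.begin()));
        ++it;  // resume one byte past the start of this hit
    }
    return offsets;
}
```

A post-processing tool might treat the first hit as the begin of the protected range and the second as the end, then hash the bytes in between. A longer and rarer marker lowers the chance of an accidental match in ordinary code or embedded data.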
Wehrli Johan via llvm-dev
2016-Feb-23 09:05 UTC
[llvm-dev] Transfer information IR to binary
Thank you for the input.

> But I don't think that's a particularly good solution long term.
>
> Maybe someone else has a better idea...

Yeah, maybe someone else will have an idea :-)

Johan
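The -fpie offset trick described earlier in the thread comes down to a small piece of address arithmetic, sketched here under assumptions. `loadBias` and `rebase` are hypothetical names. The runtime address would come from taking a function's address in the running program, and the link-time address is the constant that the post-processing step patches in after the link.

```cpp
#include <cstdint>

// Difference between where a reference symbol actually sits at run time
// and where the linked file says it sits. For a position-independent
// executable this is the load offset chosen by the loader.
std::uint64_t loadBias(std::uint64_t runtimeAddr, std::uint64_t linkTimeAddr) {
    return runtimeAddr - linkTimeAddr;
}

// Converts an address taken from the linked file into the address the
// same byte has in the running process.
std::uint64_t rebase(std::uint64_t linkTimeAddr, std::uint64_t bias) {
    return linkTimeAddr + bias;
}
```

For example, if the reference function has link-time address 0x1139 and is observed at 0x555555555139 at run time, the bias is 0x555555554000, and a protected range recorded as starting at 0x1200 in the file starts at 0x555555555200 in memory.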