Smith, Kevin B via llvm-dev
2016-Feb-06 22:57 UTC
[llvm-dev] [RFC] Embedding Bitcode in Object Files
Hal,

No, it is not more of a problem than with DWARF info. DWARF info definitely contains personally identifiable information. However, people usually realize that is the case, and will turn off or strip debug info if they are worried about such issues, or make a specific plan to cleanse that information.

You really just want to attempt to eliminate such information to the greatest extent possible. The desirability of using embedded Bitcode in libraries (which is a very natural use model, that I'm pretty sure this is intended to support), will be improved by taking into consideration this aspect of the implementation.

Kevin Smith

From: Hal Finkel [mailto:hfinkel at anl.gov]
Sent: Saturday, February 06, 2016 2:37 PM
To: Smith, Kevin B <kevin.b.smith at intel.com>
Cc: llvm-dev at lists.llvm.org; Clang Dev <cfe-dev at lists.llvm.org>; James Y Knight <jyknight at google.com>; Steven Wu <stevenwu at apple.com>
Subject: Re: [llvm-dev] [RFC] Embedding Bitcode in Object Files

________________________________
From: "Kevin B via llvm-dev Smith" <llvm-dev at lists.llvm.org>
To: "James Y Knight" <jyknight at google.com>, "Steven Wu" <stevenwu at apple.com>
Cc: llvm-dev at lists.llvm.org, "Clang Dev" <cfe-dev at lists.llvm.org>
Sent: Saturday, February 6, 2016 4:30:20 PM
Subject: Re: [llvm-dev] [RFC] Embedding Bitcode in Object Files

I don't know whether this is an issue in the current implementation, but I wanted to bring up a potential privacy issue.

In embedding the information, care should be taken to avoid embedding any information that contains personally identifiable information. This can certainly occur if paths need to be embedded, as user names, or other private/confidential information may be present in the naming of directories and paths.

Is this any more of a problem than the information that gets included in the DWARF sections?

-Hal

Generally, I suspect that it would be desirable to have an opt-in strategy for designating in the compiler which pieces of information/options need to be saved, and for all options marked as needed, determine whether there is the possibility/likelihood that they may contain personally identifiable information.

Kevin B. Smith

From: llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org] On Behalf Of James Y Knight via llvm-dev
Sent: Friday, February 05, 2016 3:13 PM
To: Steven Wu <stevenwu at apple.com>
Cc: LLVM Developers Mailing List <llvm-dev at lists.llvm.org>; Clang Dev <cfe-dev at lists.llvm.org>
Subject: Re: [llvm-dev] [RFC] Embedding Bitcode in Object Files

On Fri, Feb 5, 2016 at 6:06 PM, Steven Wu <stevenwu at apple.com> wrote:
> I don't think we need any path in the command line section. We only record the command-line options that will affect CodeGen. See my example in one of the previous replies:
>
> $ clang -fembed-bitcode -O0 test.c -c -###
> "clang" "-cc1" (...lots of options...) "-o" "test.bc" "-x" "c" "test.c" <--- First stage
> "clang" "-cc1" "-triple" "x86_64-apple-macosx10.11.0" "-emit-obj" "-fembed-bitcode" "-O0" "-disable-llvm-optzns" "-o" "test.o" "-x" "ir" "test.bc" <--- Second stage
>
> I can't think of any source path that can affect CodeGen. There should not be any paths in the second stage other than the bitcode input path and the binary output path, and they are excluded from the command line section as well. -fdebug-prefix-map is consumed by the front-end and prefixed paths are a part of the debug info in the metadata. You don't need to encode -fdebug-prefix-map in the bitcode section to reproduce the object file with the same debug info. Did that answer your concern?

Great -- it wasn't clear from the first message if you were just embedding the whole command-line as is. If the plan is instead to embed only a few relevant options, I agree there should be no issue as far as paths go.

_______________________________________________
LLVM Developers mailing list
llvm-dev at lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

--
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory
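As an illustration of the point Steven Wu makes in the quoted text (record only the options that affect CodeGen, and leave out the bitcode input path and the binary output path), here is a minimal Python sketch. It is not clang's implementation: the function name `codegen_relevant_args` is hypothetical, and the rules it applies (drop "-o" with its operand, treat the last argument as the input file) are assumptions taken only from the shape of the second-stage command shown above. The thread does not say exactly which of the remaining options clang records.

```python
def codegen_relevant_args(cc1_args):
    """Return cc1_args without the output path and the trailing input file.

    Assumptions (hypothetical, based only on the example in this thread):
    the output path is given as "-o <path>" and the input file is the
    last argument.
    """
    body = cc1_args[:-1]  # drop the trailing input file, e.g. "test.bc"
    kept = []
    i = 0
    while i < len(body):
        if body[i] == "-o":
            i += 2  # skip "-o" and its path operand
            continue
        kept.append(body[i])
        i += 1
    return kept


# Second-stage arguments copied from the example in the thread.
second_stage = [
    "-cc1", "-triple", "x86_64-apple-macosx10.11.0", "-emit-obj",
    "-fembed-bitcode", "-O0", "-disable-llvm-optzns",
    "-o", "test.o", "-x", "ir", "test.bc",
]

recorded = codegen_relevant_args(second_stage)
print(recorded)
```

With these assumptions neither "test.o" nor "test.bc" survives the filter, which is the property the privacy discussion cares about: no path reaches the recorded command line.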
Mehdi Amini via llvm-dev
2016-Feb-07 01:46 UTC
[llvm-dev] [RFC] Embedding Bitcode in Object Files
Hi,

There is not only DWARF but any use of the macro __FILE__ (so any assertions for instance).
I wouldn't expect the bitcode to contain any more (or less) information than the binary.
The options for the optimizer/codegen shouldn't need any "sensitive" information.

--
Mehdi
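To make the concern about paths concrete (DWARF and __FILE__ strings can both carry directory names, and directory names can carry user names), here is a minimal Python sketch of an audit over the raw bytes of an object file or bitcode blob. Everything in it is an assumption for illustration: the helper name `find_user_names` is hypothetical, and the heuristic that home directories live under /home/<user> or /Users/<user> will miss other layouts.

```python
import re

# Assumed heuristic: home directories look like /home/<user> or /Users/<user>.
HOME_PATH = re.compile(rb"/(?:home|Users)/([A-Za-z0-9._-]+)")


def find_user_names(blob):
    """Return the sorted, de-duplicated user names found in home-directory paths."""
    return sorted({m.group(1).decode("ascii") for m in HOME_PATH.finditer(blob)})


# Made-up bytes standing in for the string data of a compiled file.
sample = b"\x00assert failed\x00/Users/alice/src/test.c\x00/home/bob/lib/util.h\x00"
names = find_user_names(sample)
print(names)
```

A check like this only reports what is already present; it applies equally to the binary and to embedded bitcode, which matches the expectation above that the bitcode should not contain more information than the binary.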
Steven Wu via llvm-dev
2016-Feb-07 02:03 UTC
[llvm-dev] [RFC] Embedding Bitcode in Object Files
Hi Kevin,

That is a very good concern, and we have ways to address the issue in our bitcode implementation to achieve something similar to ‘strip’ (hiding unnecessary symbols and debug info). It wasn’t in the proposal because we would like to get the basics in before diving into something more detailed and controversial.

Here is a short description of how we deal with the issue. Our implementation requires linker support, which runs a ‘Linkage-Unit’ pass that consistently renames all the symbols and metadata that are not exported. This has to be done after resolving all the symbols.

We would be happy to upstream our implementation if it is beneficial.

Thanks

Steven
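The renaming idea Steven describes (consistently rename every symbol that is not exported, after all symbols are resolved) can be sketched in a few lines of Python. This is not Apple's implementation, which the thread does not show; the function names `build_rename_map` and `apply_rename` and the `__hidden_N` naming scheme are hypothetical, and the sketch works on plain symbol-name strings rather than on bitcode.

```python
def build_rename_map(symbols, exported):
    """Map each symbol to itself if exported, else to an opaque name.

    The same input name always maps to the same output name, which is
    what keeps the renaming consistent across the whole linkage unit.
    """
    mapping = {}
    counter = 0
    for name in symbols:
        if name in mapping:
            continue  # already assigned; reuse the earlier choice
        if name in exported:
            mapping[name] = name
        else:
            mapping[name] = "__hidden_%d" % counter  # hypothetical scheme
            counter += 1
    return mapping


def apply_rename(references, mapping):
    """Rewrite a list of symbol references using the map."""
    return [mapping[r] for r in references]


# Made-up symbols; assumed to be fully resolved before renaming starts.
symbols = ["main", "parse_secret_config", "helper", "parse_secret_config"]
mapping = build_rename_map(symbols, exported={"main"})
print(mapping)
print(apply_rename(["helper", "main", "helper"], mapping))
```

Building the map only after symbol resolution matters because the set of exported names, and which references point at the same definition, is not known before that point.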
Joerg Sonnenberger via llvm-dev
2016-Feb-07 02:36 UTC
[llvm-dev] [cfe-dev] [RFC] Embedding Bitcode in Object Files
On Sat, Feb 06, 2016 at 05:46:50PM -0800, Mehdi Amini via cfe-dev wrote:
> There is not only DWARF but any use of the macro __FILE__ (so any assertions for instance).
> I wouldn't expect the bitcode to contain any more (or less) information than the binary.
> The options for the optimizer/codegen shouldn't need any "sensitive" information.

__FILE__ is a frontend issue, I still have to add some equivalent to my remap patches for that into clang...

Joerg
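For readers unfamiliar with path remapping, the following Python sketch shows the general idea behind an old-prefix-to-new-prefix mapping of the kind that -fdebug-prefix-map expresses. It is not Joerg's patch and not clang's code: the function name `remap_path` is hypothetical, and the first-match-wins rule is an assumption made for the sketch.

```python
def remap_path(path, prefix_map):
    """Replace the first matching old prefix of path with its new prefix.

    prefix_map is a list of (old, new) pairs. First match wins, which is
    an assumption of this sketch rather than documented compiler behavior.
    """
    for old, new in prefix_map:
        if path.startswith(old):
            return new + path[len(old):]
    return path  # no prefix matched; leave the path unchanged


# Made-up build directory containing a user name.
prefix_map = [("/home/alice/project", "/src")]
remapped = remap_path("/home/alice/project/lib/test.c", prefix_map)
print(remapped)
```

Applied to the string that __FILE__ expands to, a mapping like this would keep the user-specific part of the build directory out of the output, which is the effect an equivalent of the remap patches would have for __FILE__.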
Smith, Kevin B via llvm-dev
2016-Feb-07 03:31 UTC
[llvm-dev] [RFC] Embedding Bitcode in Object Files
I don't know what is involved in upstreaming that, but yes, it seems very useful/necessary to me.
Sergei Larin via llvm-dev
2016-Feb-08 16:07 UTC
[llvm-dev] [RFC] Embedding Bitcode in Object Files
My 2c… Benefits of the feature clearly outweigh any potential privacy concerns from my point of view… and yes, there are multiple ways to deal with privacy even if it is an issue.

Sergei

---
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation