thr3ads.net - llvm dev - [LLVMdev] Adding diversity for security (and testing) [Sep 2013]

If this information is useful, please help other people find it:
Share via:

Stephen Crane

2013-Sep-19 17:49 UTC

[LLVMdev] Adding diversity for security (and testing)

Thanks for all the feedback! It seems there is some interest, so I thought
I'd try to summarize discussions so far, and provide patches for closer
inspection. I'm not sure if patches should end up here or on a different
list in this instance, so if I should instead send this to a different list,
I'm happy to do so.

- Is diversity needed, or are existing protections sufficient? As several people
have mentioned, industry adopters need very low overhead in many situations, and
diversity might be a good fit here. Ultimately diversity is just one piece of
the whole security puzzle, and could be combined with other techniques for
increased assurance. There did not seem to be a strong consensus in this
discussion about whether diversity is critical, although it appears that
existing static analysis tools are insufficient to cover all cases.

- Distribution. Distributing large numbers of randomized variants to end-users
could be difficult. Prelink diversification was mentioned, which may be a good
direction to go with this for the future. However, a basic implementation in
LLVM would provide a good starting point for future improvements. This
implementation would also provide a useful security measure now for users who
are already compiling their own software.

- Security. As was pointed out, diversity is not a perfect solution, and could
be bypassed through the use of memory disclosures and scripting. However, this
is certainly much harder for the attacker to pull off. In addition, existing
security techniques should be used alongside diversity for defense-in-depth.

- Random number generation. Discussions established that the use of some sort of
an internal RNG is required for reproducibility. The use of a cryptographically
secure RNG is considered best-practice for any randomness used for security
purposes. Since the RNG we are proposing is still very fast, there doesn't
appear to be a significant disadvantage to using it. We do need a fallback RNG
when OpenSSL is unavailable, and have therefore included a simple LCG as well.

I have attached our current patches against r190882, as well as a bit of
documentation. Not sure what the next steps here are, but feel free to test out
the code and let us know how it goes.

- stephen

-------------- next part --------------
A non-text attachment was scrubbed...
Name: USAGE
Type: application/octet-stream
Size: 995 bytes
Desc: not available
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20130919/65136070/attachment.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: llvm_diversity.diff
Type: application/octet-stream
Size: 58397 bytes
Desc: not available
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20130919/65136070/attachment-0001.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: clang_diversity.diff
Type: application/octet-stream
Size: 7036 bytes
Desc: not available
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20130919/65136070/attachment-0002.obj>

Nick Lewycky

2013-Sep-20 07:52 UTC

head link

[LLVMdev] Adding diversity for security (and testing)

Stephen Crane wrote:> Thanks for all the feedback! It seems there is some interest, so I thought
I'd try to summarize discussions so far, and provide patches for closer
inspection. I'm not sure if patches should end up here or on a different
list in this instance, so if I should instead send this to a different list,
I'm happy to do so.
>
> - Is diversity needed, or are existing protections sufficient? As several
people have mentioned, industry adopters need very low overhead in many
situations, and diversity might be a good fit here. Ultimately diversity is just
one piece of the whole security puzzle, and could be combined with other
techniques for increased assurance. There did not seem to be a strong consensus
in this discussion about whether diversity is critical, although it appears that
existing static analysis tools are insufficient to cover all cases.
>
> - Distribution. Distributing large numbers of randomized variants to
end-users could be difficult. Prelink diversification was mentioned, which may
be a good direction to go with this for the future. However, a basic
implementation in LLVM would provide a good starting point for future
improvements. This implementation would also provide a useful security measure
now for users who are already compiling their own software.
>
> - Security. As was pointed out, diversity is not a perfect solution, and
could be bypassed through the use of memory disclosures and scripting. However,
this is certainly much harder for the attacker to pull off. In addition,
existing security techniques should be used alongside diversity for
defense-in-depth.
>
> - Random number generation. Discussions established that the use of some
sort of an internal RNG is required for reproducibility. The use of a
cryptographically secure RNG is considered best-practice for any randomness used
for security purposes. Since the RNG we are proposing is still very fast, there
doesn't appear to be a significant disadvantage to using it. We do need a
fallback RNG when OpenSSL is unavailable, and have therefore included a simple
LCG as well.
>
> I have attached our current patches against r190882, as well as a bit of
documentation. Not sure what the next steps here are, but feel free to test out
the code and let us know how it goes.
Awesome! I'll jump right in to reviewing. Note that my reviews often 
focus on low-level stuff (formatting, typos). I understand that these 
patches may not be ready for that, and rather expect a higher-level 
review, but I'll do what I do anyways.

You have separate LLVM_WITH_OPENSSL and LLVM_ENABLE_RNG settings. Why? 
Either LLVM_ENABLE_RNG should always be on and then people can choose to 
link in openssl or not. Or maybe you're worried that this is insecure 
and you should remove LLVM_ENABLE_RNG and make it driven solely on 
LLVM_ENABLE_RNG? If you think we can put a good enough RNG into LLVM, we 
should go with turning on LLVM_ENABLE_RNG permanently, but if not then 
we should make the whole thing conditional on whether LLVM_WITH_OPENSSL 
is set.

+  //// isInsertedNOP - Return true if the instruction is an
+  //// artificially inserted NOP

Three slashes for a doxygen comment, not four.

+  /**
+   * Shuffles an iplist of type T
+   */

In LLVM, the style is:

   // Shuffles an iplist of type T

+    void shuffle(iplist<T>& list){

Missing space before '{'.

+    if(list.empty()) return;

Missing space before '('.

+    for(typename iplist<T>::iterator i = list.begin(); i != list.end();
){

See 
http://llvm.org/docs/CodingStandards.html#don-t-evaluate-end-every-time-through-a-loop
. The llvm way is:

     for (typename iplist<T>::iterator i = list.begin(), e = list.end();
          i != e; ++i) {

Also spaces before '(' and '{'.

+    for(typename SmallVector<T*, 10>::size_type i = 0; i < sv.size();
i++){

Similarly.

Are you familiar with std::random_device and the random number generator 
C++ 11 standard libary components? Yep, I'm going to ask you to expose a 
compatible API, or reasons why that doesn't work (or is silly). 
std::random_device is defined to be non-deterministic, and you needn't 
provide methods that you don't call (entropy()) but it would be 
convenient for everyone reading the API if it were a subset of an 
existing standard, or with clearly documented deviations from the standard.

--- /dev/null
+++ b/lib/Support/RandomNumberGenerator.cpp
@@ -0,0 +1,271 @@
+
+//
+//                     The LLVM Compiler Infrastructure

Whoops! Missed the ruler comment at the top.

+namespace {
+  static cl::opt<unsigned long long>
+  RandomSeed("rng-seed", cl::value_desc("seed"),
+             cl::desc("Seed for the random number generator"));

namespace and static are redundant. Just use static. 
http://llvm.org/docs/CodingStandards.html#static

+  PKCS5_PBKDF2_HMAC_SHA1(Password.data(), Password.size(), (unsigned 
char*)&Salt, sizeof(Salt), PBKDF_ITERATIONS, KeyLen, RandomBytes);

80-column violation. 
http://llvm.org/docs/CodingStandards.html#source-code-width

+void RandomNumberGenerator::ReadStateFile(StringRef StateFilename) {

There's nothing wrong with this function per se. It isn't buggy. I'm
just wondering how this API fits with the general "llvm as a library" 
approach. LLVM doesn't generally bake in the assumption that it's 
running on a system with a filesystem. For example, the bitcode reader 
has a convenience method which reads a .bc file, sure, but that just 
does the action of loading it into a MemoryBuffer, and forwards it along 
to the API using MemoryBuffer that does the work. I think you should 
follow that pattern here too.

+void RandomNumberGenerator::WriteStateFile(StringRef StateFilename) {

llvm::raw_ostream, for similar reasons?

+//===- NOPInsertion.cpp - Insert NOPs between instructions    ---*- C++ 
-*-===//

The emacs mode marker isn't necessary on .cpp files, it's only .h files 
which are ambiguous (C vs. C++). Also, some of those spaces should be 
hyphens.

+bool NOPInsertionPass::runOnMachineFunction(MachineFunction &Fn) {
+  const TargetInstrInfo *TII = Fn.getTarget().getInstrInfo();
+  PreNOPFunctionCount++;
+  static int nopsInserted = 0;

Hmmm. We don't currently run machine function passes in parallel, but in 
theory LLVM is designed to allow it (function passes aren't allowed to 
look outside their functions, for example). Fortunately, it looks like 
the only thing you do with this variable is increment it, but never read 
it for anything? Nuke it?

Overall this looks like a great start. However, I would like some other 
people to review things:
   - I don't actually approve lib/CodeGen and lib/Target changes. 
Somebody else is going to have to think about whether "InsertedNOP" 
belongs as an MIFlag.
   - I'm concerned about seeding the RNG. I'm especially concerned about
seeding it in the LTO case, I hadn't thought of that until I saw it in 
the patch. I'm appreciate if somebody with a security background could 
ponder that one.
   - There's also a clang patch which should be reviewed by cfe-commits.

Nick

Stephen Crane

2013-Sep-20 21:36 UTC

head link

[LLVMdev] Adding diversity for security (and testing)

Nick,

Thanks so much for such a detailed review. I definitely missed a few of the
details of the LLVM standards. Sorry. Here's a new patch that should resolve
the issues you pointed out. I've also included a few comments below --
anything I haven't replied to has been fixed.

In particular, I'd like to discuss RNG seeding with the list. I currently
use a static singleton to make sure there is only a single RNG ever
instantiated, but this is obviously undesirable. Is there a better place where
we can initialize the RNG in a thread-safe manner? We really need an RNG for
each thread in that case, so that the results are consistent, regardless of
thread ordering. I'm not familiar enough with LLVM to know if such a place
even exists.


On Sep 20, 2013, at 00:52 , Nick Lewycky <nicholas at mxc.ca>
wrote:> Awesome! I'll jump right in to reviewing. Note that my reviews often
focus on low-level stuff (formatting, typos). I understand that these patches
may not be ready for that, and rather expect a higher-level review, but I'll
do what I do anyways.
Works for me! You're completely right that I was initially expecting some
higher-level feedback, but this is great.
> You have separate LLVM_WITH_OPENSSL and LLVM_ENABLE_RNG settings. Why?
Either LLVM_ENABLE_RNG should always be on and then people can choose to link in
openssl or not. Or maybe you're worried that this is insecure and you should
remove LLVM_ENABLE_RNG and make it driven solely on LLVM_ENABLE_RNG? If you
think we can put a good enough RNG into LLVM, we should go with turning on
LLVM_ENABLE_RNG permanently, but if not then we should make the whole thing
conditional on whether LLVM_WITH_OPENSSL is set.
I had initially included this because I expected resistance to adding an RNG at
all, but it now appears that we're all on the same page that some RNG is
required. I'll go ahead and remove the ENABLE_RNG defines.
> Are you familiar with std::random_device and the random number generator
C++ 11 standard libary components? Yep, I'm going to ask you to expose a
compatible API, or reasons why that doesn't work (or is silly).
std::random_device is defined to be non-deterministic, and you needn't
provide methods that you don't call (entropy()) but it would be convenient
for everyone reading the API if it were a subset of an existing standard, or
with clearly documented deviations from the standard.
I had not actually thought about using the std::random_device interface. If we
were able to use C++11 features, this would be quite nice, since that includes
builtin things like shuffle over RandomAccessIterators and such niceties.
However, since this is not the case, I think complying to the <random> API
(which is quite odd, there's not even an overarching interface) might be
more complication than is necessary. We would have to emulate not only the
generator, but also the distribution classes and the generic shuffle method.
Even using operator() makes the client code particularly unreadable, since it
currently uses a static Generator() getter to get the instance of the RNG.
I'd like to resolve this first, then we can look at whether using the
operator() interface makes sense.
> +void RandomNumberGenerator::ReadStateFile(StringRef StateFilename) {
> 
> There's nothing wrong with this function per se. It isn't buggy.
I'm just wondering how this API fits with the general "llvm as a
library" approach. LLVM doesn't generally bake in the assumption that
it's running on a system with a filesystem. For example, the bitcode reader
has a convenience method which reads a .bc file, sure, but that just does the
action of loading it into a MemoryBuffer, and forwards it along to the API using
MemoryBuffer that does the work. I think you should follow that pattern here
too.
On thinking this over further, I think this feature is simply not needed any
more. We had some trouble with getting 64-bit command-line options working, but
since this works now, the state file really shouldn't be needed. Nuked.
> Overall this looks like a great start. However, I would like some other
people to review things:
>  - I don't actually approve lib/CodeGen and lib/Target changes.
Somebody else is going to have to think about whether "InsertedNOP"
belongs as an MIFlag.
The reasoning behind making this an MIFlag was to be able to skip over NOPs in
the terminators of a block. For uniform security we want to be able to place
NOPs in the terminators, but still need to skip over them when finding the first
terminator.
>  - I'm concerned about seeding the RNG. I'm especially concerned
about seeding it in the LTO case, I hadn't thought of that until I saw it in
the patch. I'm appreciate if somebody with a security background could
ponder that one.
>  - There's also a clang patch which should be reviewed by cfe-commits.
> 
> Nick
-------------- next part --------------
A non-text attachment was scrubbed...
Name: llvm_diversity-0.2.diff
Type: application/octet-stream
Size: 49872 bytes
Desc: not available
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20130920/dd455f79/attachment.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: clang_diversity-0.2.diff
Type: application/octet-stream
Size: 7007 bytes
Desc: not available
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20130920/dd455f79/attachment-0001.obj>

Seemingly Similar Threads

Search for more possibly parallel threads

llvm dev - Sep 2013 - [LLVMdev] Adding diversity for security (and testing)

[LLVMdev] Adding diversity for security (and testing)

[LLVMdev] Adding diversity for security (and testing)

[LLVMdev] Adding diversity for security (and testing)

Seemingly Similar Threads