thr3ads.net - llvm dev - [LLVMdev] Type-based analysis for LLVM IR [Apr 2013]

If this information is useful, please help other people find it:
Share via:

Marcelo Sousa

2013-Apr-03 13:21 UTC

[LLVMdev] Type-based analysis for LLVM IR

I am interested in type-based verification of LLVM IR in the areas of
certified compilation and software verification. It seems to me that the
LLVM IR type system is rather informal in the sense that there is no paper
with a proper formalization of the type rules, and for example, a proof of
soundness for well-formedness of the code.
I would like to know if you are aware of any work in this direction for
LLVM IR so that I can compare with my own implementation.
Moreover, it would be valuable to get some feedback on suitable type-based
analysis that the LLVM community would like to have in the infrastructure.

Regards,
Marcelo
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20130403/b3ac0fdd/attachment.html>

Roel Jordans

2013-Apr-03 13:59 UTC

head link

[LLVMdev] Type-based analysis for LLVM IR

Hi Marcello,

It sounds like Vellvm could be interesting for you to consider

http://www.cis.upenn.edu/~jianzhou/Vellvm/

Regards,
  Roel

On 03/04/13 15:21, Marcelo Sousa wrote:> I am interested in type-based verification of LLVM IR in the areas of
> certified compilation and software verification. It seems to me that the
> LLVM IR type system is rather informal in the sense that there is no
> paper with a proper formalization of the type rules, and for example, a
> proof of soundness for well-formedness of the code.
> I would like to know if you are aware of any work in this direction for
> LLVM IR so that I can compare with my own implementation.
> Moreover, it would be valuable to get some feedback on suitable
> type-based analysis that the LLVM community would like to have in the
> infrastructure.
>
> Regards,
> Marcelo

John Criswell

2013-Apr-03 15:31 UTC

head link

[LLVMdev] Type-based analysis for LLVM IR

On 4/3/13 8:21 AM, Marcelo Sousa wrote:> I am interested in type-based verification of LLVM IR in the areas of 
> certified compilation and software
Can you explain what you mean by type-based verification?  Do you mean 
that you want to use a set of typing rules to perform verification, or 
do you mean something else?
> verification. It seems to me that the LLVM IR type system is rather 
> informal in the sense that there is no paper with a proper 
> formalization of the type rules, and for example, a proof of soundness 
> for well-formedness of the code.
Another important feature of the LLVM type system is that it is not a 
"safe" type system.  It is possible to do all sorts of type unsafe 
things, such as casting integers to pointers and dereferencing them.  
One of the standard LLVM transforms (load widening) will generate loads 
that access memory past the end of a memory object. LLVM IR is capable 
of representing all the unsafe computations that C can do.
> I would like to know if you are aware of any work in this direction 
> for LLVM IR so that I can compare with my own implementation.
The SAFECode/SVA work uses a points-to analysis called DSA which infers 
the types of memory objects.  A key result of the work is that the 
points-to analysis/type inference results of DSA can be made sound at 
run-time.  The original version of DSA, working on older versions of 
LLVM, was able to find a fairly decent number of type-safe memory 
objects.  The SAFECode/SVA work developed a simple type checker that 
verified that the type-inference results returned by DSA are consistent 
(i.e., if DSA says that a memory object is used in a type-safe manner, 
then it is actually used in a type-safe manner).  There is also some 
work by Greg Morrisett's group at Harvard at creating a verified 
type-checker for the SAFECode/SVA system.  SAFECode/SVA papers can be 
found at http://sva.cs.illinois.edu/pubs.html.

As far as formalisms for the LLVM IR, there is one from University of 
Pennsylvania (Santosh Nagarakatte, now at Rutgers, was involved with 
that work), the one used by Greg Morrisett's group for the verified type 
checker, and there might be one from Grigore Rosu's group here at the 
University of Illinois.  There is also an LLVM translation validator 
from Greg Morrisett's group (http://llvm-md.org).

I think another person wrote asking about translation validation a few 
months ago.  You might want to search the llvmdev archives for that 
conversation.
> Moreover, it would be valuable to get some feedback on suitable 
> type-based analysis that the LLVM community would like to have in the 
> infrastructure.
LLVM already as a type-based alias analysis (called TBAA) which is used 
for optimization.  There's a proposal floating around on the list to 
improve it; you might want to take a look at that.

SAFECode can use type-safety to optimize away load/store checks 
(provided that it uses garbage collection or automatic pool allocation 
to make the optimization sound).  Using TBAA instead of DSA might 
generate better results.  Address Sanitizer (ASan) might benefit from 
that optimization as well.

Automatic pool allocation could also benefit from something like TBAA.

I think you'll have to be more specific, though, on what you mean by 
type-based analysis and provide details on how it could be applied.

I apologize if my response is a bit of a ramble; I'm basically providing 
a bit of information hoping that some of it is relevant to what you are 
asking.  I hope you find some of it useful.

-- John T.
>
> Regards,
> Marcelo
>
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20130403/c8c88793/attachment.html>

Marcelo Sousa

2013-Apr-04 13:20 UTC

head link

[LLVMdev] Type-based analysis for LLVM IR

Hello,


On Wed, Apr 3, 2013 at 4:31 PM, John Criswell <criswell at illinois.edu>
wrote:
>  On 4/3/13 8:21 AM, Marcelo Sousa wrote:
>
>  I am interested in type-based verification of LLVM IR in the areas of
> certified compilation and software
>
>
> Can you explain what you mean by type-based verification?  Do you mean
> that you want to use a set of typing rules to perform verification, or do
> you mean something else?
>
> By type-based verification, I mean to use a set of typing rules thatenforce certain properties of interest, namely: memory safety,
well-formedness of the SSA code, concurrency related problems, etc. I think
that even a type checker for LLVM IR could be useful. Part of the
motivation was that it seems to me that the set of typing rules hasn't been
properly formalized anywhere and I think it's quite useful as documentation
and also for algorithmic type-checking.

>
>  verification. It seems to me that the LLVM IR type system is rather
> informal in the sense that there is no paper with a proper formalization of
> the type rules, and for example, a proof of soundness for well-formedness
> of the code.
>
>
> Another important feature of the LLVM type system is that it is not a
> "safe" type system.  It is possible to do all sorts of type
unsafe things,
> such as casting integers to pointers and dereferencing them.  One of the
> standard LLVM transforms (load widening) will generate loads that access
> memory past the end of a memory object.  LLVM IR is capable of representing
> all the unsafe computations that C can do.
>
> I understand, although this is not a problem if you consider a refinedtype system that uses type qualifiers. You could then have an integer
annotated with a pointer type that was casted from.

>
>  I would like to know if you are aware of any work in this direction for
> LLVM IR so that I can compare with my own implementation.
>
>
> The SAFECode/SVA work uses a points-to analysis called DSA which infers
> the types of memory objects.  A key result of the work is that the
> points-to analysis/type inference results of DSA can be made sound at
> run-time.  The original version of DSA, working on older versions of LLVM,
> was able to find a fairly decent number of type-safe memory objects.  The
> SAFECode/SVA work developed a simple type checker that verified that the
> type-inference results returned by DSA are consistent (i.e., if DSA says
> that a memory object is used in a type-safe manner, then it is actually
> used in a type-safe manner).  There is also some work by Greg
Morrisett's
> group at Harvard at creating a verified type-checker for the SAFECode/SVA
> system.  SAFECode/SVA papers can be found at
> http://sva.cs.illinois.edu/pubs.html.
>
Thanks, this is useful information. Actually, your paper on Memory Safety
for Low-Level Software/Hardware interactions is in the direction of what
I'm interested. I devised a refined type system for a subset of LLVM IR
that enforces separation of IO and regular memory, and identifies use after
free. My goal is to apply this type system to verification of the Linux
Kernel.
Your work on SVA does not seem to focus on a type-based approach since it's
based on run-time monitoring and guaranteeing a safe executing environment.


>
> As far as formalisms for the LLVM IR, there is one from University of
> Pennsylvania (Santosh Nagarakatte, now at Rutgers, was involved with that
> work), the one used by Greg Morrisett's group for the verified type
> checker, and there might be one from Grigore Rosu's group here at the
> University of Illinois.  There is also an LLVM translation validator from
> Greg Morrisett's group (http://llvm-md.org).
>
> I think another person wrote asking about translation validation a few
> months ago.  You might want to search the llvmdev archives for that
> conversation.
>
I am aware of these formalisms, but not sure what you mean by the verified
type checker. I may need to read again their POPL paper but I was not aware
of a verified type checker for LLVM IR in Coq.

>
>  Moreover, it would be valuable to get some feedback on suitable
> type-based analysis that the LLVM community would like to have in the
> infrastructure.
>
>
> LLVM already as a type-based alias analysis (called TBAA) which is used
> for optimization.  There's a proposal floating around on the list to
> improve it; you might want to take a look at that.
>
> SAFECode can use type-safety to optimize away load/store checks (provided
> that it uses garbage collection or automatic pool allocation to make the
> optimization sound).  Using TBAA instead of DSA might generate better
> results.  Address Sanitizer (ASan) might benefit from that optimization as
> well.
>
> Automatic pool allocation could also benefit from something like TBAA.
>
> This is of absolute interest to me. I will look for the proposal.
Regards,
Marcelo

I think you'll have to be more specific, though, on what you mean
by> type-based analysis and provide details on how it could be applied.
>
> I apologize if my response is a bit of a ramble; I'm basically
providing a
> bit of information hoping that some of it is relevant to what you are
> asking.  I hope you find some of it useful.
>
> -- John T.
>
>
>  Regards,
> Marcelo
>
>
> _______________________________________________
> LLVM Developers mailing listLLVMdev at cs.uiuc.edu        
http://llvm.cs.uiuc.eduhttp://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>
>
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20130404/54d2e660/attachment.html>

Maybe Matching Threads

Search for more seemingly similar threads

llvm dev - Apr 2013 - [LLVMdev] Type-based analysis for LLVM IR

[LLVMdev] Type-based analysis for LLVM IR

[LLVMdev] Type-based analysis for LLVM IR

[LLVMdev] Type-based analysis for LLVM IR

[LLVMdev] Type-based analysis for LLVM IR

Maybe Matching Threads