changze cui via llvm-dev
2018-Nov-08 03:27 UTC
[llvm-dev] LLVM Call Graph may not cover all calls
Hi there,

I am working with opt-6.0 and trying to generate a call graph of libsndfile, but the call graph doesn't seem to cover all call relationships.

Specifically, I am doing static analysis on CVE-2014-8130, a division by zero in TIFFWriteScanline in libtiff/tif_write.c (see https://security-tracker.debian.org/tracker/CVE-2014-8130). In theory, main in tiffdither.c calls fsdither, and fsdither calls TIFFWriteScanline:

main (tiffdither.c) -> fsdither (tiffdither.c) -> TIFFWriteScanline (tif_write.c)

I want a call graph of the buggy program tiffdither, but the generated call graph doesn't contain the edge fsdither -> TIFFWriteScanline. In short, the call graph shows TIFFWriteScanline as being called only by an external node.

I have already compiled tiffdither and attached the resulting bitcode. I also wrote a small Python script to help analyze the dot file. To generate the dot file I run

opt-6.0 -analyze -dot-callgraph tiffdither.bc

and then set dotPath in dotHandle.py to point at the result. Feel free to modify the Python code for your own analysis.

I can't figure out why this happens, and I would very much appreciate any help!

Thanks & Regards,
Chaz

Attachments:
tiffdither.bc (application/octet-stream, 2081632 bytes): <http://lists.llvm.org/pipermail/llvm-dev/attachments/20181108/ae3808dd/attachment-0001.obj>
dotHandle.py (text/x-python, 2086 bytes): <http://lists.llvm.org/pipermail/llvm-dev/attachments/20181108/ae3808dd/attachment-0001.py>
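The dot-file analysis described above can be sketched roughly as follows. This is not the attached dotHandle.py, whose contents are not shown in the archive; it is a minimal stand-alone sketch that assumes the dot file uses lines of the form `Node0x... [shape=record,label="{name}"];` for nodes and `Node0x... -> Node0x...;` for edges. The sample graph, the `parse_callgraph` helper, and the `<unlabeled>` placeholder are all made up for illustration, so check the patterns against your own opt output.

```python
import re
from collections import defaultdict

# Assumed line shapes in the dot file; verify against your own output.
NODE_RE = re.compile(r'^\s*(Node0x[0-9a-fA-F]+)\s*\[.*label="\{([^}"]*)\}"')
EDGE_RE = re.compile(r'^\s*(Node0x[0-9a-fA-F]+)(?::\w+)?\s*->\s*(Node0x[0-9a-fA-F]+)')


def parse_callgraph(dot_text):
    """Return a map callee -> set of callers, using node labels as names."""
    labels = {}
    edges = []
    for line in dot_text.splitlines():
        edge = EDGE_RE.match(line)
        if edge:
            edges.append((edge.group(1), edge.group(2)))
            continue
        node = NODE_RE.match(line)
        if node:
            labels[node.group(1)] = node.group(2)

    callers = defaultdict(set)
    for src, dst in edges:
        # An edge endpoint without a node line gets a placeholder name.
        callers[labels.get(dst, "<unlabeled>")].add(labels.get(src, "<unlabeled>"))
    return callers


# Hypothetical sample reproducing the symptom: TIFFWriteScanline is reached
# only from the external node, and fsdither points at a node with no label.
SAMPLE = """digraph "Call graph" {
    label="Call graph";
    Node0x1 [shape=record,label="{external node}"];
    Node0x1 -> Node0x2;
    Node0x1 -> Node0x4;
    Node0x2 [shape=record,label="{main}"];
    Node0x2 -> Node0x3;
    Node0x3 [shape=record,label="{fsdither}"];
    Node0x3 -> Node0x5;
    Node0x4 [shape=record,label="{TIFFWriteScanline}"];
}
"""

callers = parse_callgraph(SAMPLE)
for callee in sorted(callers):
    print(callee, "<-", sorted(callers[callee]))
```

Listing callers per callee makes the symptom easy to spot: a function whose only caller is the external node has no direct incoming edge from any function in the module.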
David Blaikie via llvm-dev
2018-Nov-09 17:44 UTC
[llvm-dev] LLVM Call Graph may not cover all calls
How are you generating the call graph? Generally a compiler only acts on a per-translation-unit basis, so it can't form the complete program call graph across multiple files (which would explain why it's missing the edge that crosses the boundary between .c files).
David Blaikie via llvm-dev
2018-Nov-15 18:47 UTC
[llvm-dev] LLVM Call Graph may not cover all calls
Looks like this is old LLVM IR - I don't have an old version of LLVM that can read this (I only have LLVM from subversion). Not sure what version it is, but if it's not past the upgrade horizon, distributing/sharing bitcode (as LLVM has backwards compatibility guarantees for that) rather than textual IR might be more effective.

In any case, I'd suggest you try to create a reduced test case that demonstrates the problem. Strip out unrelated instructions, functions, etc., while keeping the IR valid, and see if it still reproduces the problem. When you can't strip out anything else without losing the problem, you have something someone else can more easily look at and explain/understand (equally, by doing this, maybe you'll find something critical that helps you understand what's going on).

On Wed, Nov 14, 2018 at 7:22 PM changze cui <changzecui at gmail.com> wrote:
> Hi Dave,
>
> As you guessed, I do use wllvm for the compilation and the extract-bc step.
>
> For now, the call graph works fine on CVE-2014-8130 after I recompiled the
> program. I don't know why. It is weird.
>
> However, the call graph still has a problem on CVE-2017-16942: some edges
> are simply missing. I followed your advice and checked the IR, and
> everything is in there. I also tried recompiling the program, but that
> didn't help.
>
> According to the code, the call chain in CVE-2017-16942 is:
>
> psf_open_file -> wav_open -> wav_read_header -> wav_w64_read_fmt_chunk
> (this is the buggy function!)
>
> The IR shows the same call relationships (see the attached file 16942.ll).
> But the call graph generated by opt is missing psf_open_file -> wav_open
> and wav_read_header -> wav_w64_read_fmt_chunk.
>
> I also noticed something interesting. In the generated call graph, some
> nodes that appear in edges don't show up in the node list, so an edge may
> look like psf_open_file -> "" (I am currently using pydot to handle the
> dot file generated by opt). Maybe this is related to the missing call
> relationships? I have no idea.
>
> I put the dot file and the analysis result in the attached file. The dot
> file is generated by opt, and the analysis result shows the caller-to-callee
> map (map[caller] = [callee1 callee2 callee3 ...]).
>
> Do you have any other ideas?
>
> Thanks a lot!
>
> Regards,
> Chaz
>
> On Sun, Nov 11, 2018 at 5:01 AM David Blaikie <dblaikie at gmail.com> wrote:
>> On Fri, Nov 9, 2018 at 10:39 PM changze cui <changzecui at gmail.com> wrote:
>>> Hi David,
>>>
>>> Thanks for your reply!
>>>
>>> Actually, I compile the program into an executable, and I use
>>> extract-bc to get the .bc file from the executable.
>>
>> I can't say I'd heard of extract-bc - googling around I came across this?
>> https://github.com/travitch/whole-program-llvm - is that what you're
>> using? & you built the program with 'wllvm'?
>>
>>> Also, I can trigger the CVE from the executable, which means the
>>> function TIFFWriteScanLine is inside the executable. So I think it is
>>> one translation unit, and -dot-callgraph is supposed to handle this.
>>
>> Did you take a look at the LLVM IR (llvm-dis will give you a textual
>> representation of a bitcode file) to make sure everything's in there that
>> you expect to be? Are there function definitions (not only declarations) of
>> all the entities you want in the CFG? Are they calling each other directly,
>> etc?
>>
>> - Dave
>>
>>> Do you have other ideas?
>>>
>>> Thanks & Regards,
>>> Chaz
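The check suggested above (are the functions defined rather than only declared, and do they call each other directly?) can be approximated with a small script over the textual IR that llvm-dis produces. The sketch below is only a regex heuristic under stated assumptions, not a real IR parser: it assumes `define`/`declare` lines start at column 0, that a function body ends with a `}` at column 0, and that a direct call looks like `call <type> @name(`. It also separates out calls whose callee is wrapped in a `bitcast` constant expression; whether such calls explain the missing edges in this thread is a guess, not something the thread confirms. The sample IR and the `scan_ir` helper are hypothetical and are not taken from the attached files.

```python
import re

# Heuristic patterns for textual LLVM IR; assumptions, not a full grammar.
NAME = r'@"?([\w.$-]+)"?'
DEFINE_RE = re.compile(r'^define\b[^@]*' + NAME + r'\s*\(')
DECLARE_RE = re.compile(r'^declare\b[^@]*' + NAME + r'\s*\(')
# Skip value names such as %call; match only the call/invoke keyword.
CALL_RE = re.compile(r'(?<![%@\w.$-])(?:call|invoke)\s')
DIRECT_RE = re.compile(NAME + r'\s*\(')
CAST_RE = re.compile(r'bitcast\s*\([^@]*' + NAME + r'\s+to\b')


def scan_ir(ir_text):
    """Classify functions and call sites found in textual IR."""
    defined, declared = set(), set()
    direct, via_cast = set(), set()
    indirect = {}  # caller -> number of calls with no named callee
    current = None
    for line in ir_text.splitlines():
        m = DEFINE_RE.match(line)
        if m:
            current = m.group(1)
            defined.add(current)
            continue
        m = DECLARE_RE.match(line)
        if m:
            declared.add(m.group(1))
            continue
        if line.startswith("}"):
            current = None
            continue
        call = CALL_RE.search(line) if current else None
        if not call:
            continue
        rest = line[call.end():]
        d, c = DIRECT_RE.search(rest), CAST_RE.search(rest)
        # The callee precedes the arguments, so the earliest match wins.
        if c and (d is None or c.start() < d.start()):
            via_cast.add((current, c.group(1)))
        elif d:
            direct.add((current, d.group(1)))
        else:
            indirect[current] = indirect.get(current, 0) + 1
    return {"defined": defined, "declared": declared, "direct": direct,
            "via_cast": via_cast, "indirect": indirect}


# Hypothetical IR: one direct call and one call through a bitcast.
SAMPLE_IR = """declare i32 @TIFFWriteScanline(i8*, i8*, i32, i16)

define void @fsdither(i8* %in, i8* %out) {
entry:
  %call = call i32 bitcast (i32 (i8*, i8*, i32, i16)* @TIFFWriteScanline to i32 (i8*, i8*, i32)*)(i8* %out, i8* %in, i32 0)
  ret void
}

define i32 @main() {
entry:
  call void @fsdither(i8* null, i8* null)
  ret i32 0
}
"""

result = scan_ir(SAMPLE_IR)
for key in ("defined", "declared", "direct", "via_cast"):
    print(key, sorted(result[key]))
```

Comparing the `direct` and `via_cast` sets against the edges in the dot file is one way to narrow down which call sites the generated graph leaves out, and a short list of such call sites is a good starting point for the reduced test case.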
cszide via llvm-dev
2018-Nov-17 01:38 UTC
[llvm-dev] LLVM Call Graph may not cover all calls
Hi,

I ran into the same problem and wrote some code to mitigate it. You can find it on GitHub: https://github.com/coffezhou/OverCG. I tried it on the IR you provided, and it recovers the call relationship fsdither -> TIFFWriteScanline. I hope it helps.

Best,
Zhide
changze cui via llvm-dev
2018-Nov-19 02:49 UTC
[llvm-dev] LLVM Call Graph may not cover all calls
Hi Zhide,

Cool, bro! Your tool solves my problem perfectly! It seems the original call graph has some problems. Why not submit your solution to llvm-dev so that your code can be integrated into the next version of opt? It could help lots of people like me who are not very familiar with LLVM.

BTW, the CFG probably has the same problem, since I think the call graph is generated from the CFG. Have you checked?

Anyway, thanks a lot!

Regards,
Chaz