On 26 November 2013 19:05, Kaylor, Andrew <andrew.kaylor at intel.com> wrote:

> I would also note that the failure isn’t actually in anything
> MCJIT-specific. Aside from the fact that it seems to be clang-specific,
> the code that is failing is specific to the lli remote implementation.
> It’s not clear to me why it would fail under aggressive optimization with
> clang, but I wouldn’t characterize that code as particularly robust.

I agree. I think this is more likely a codegen fault on Clang's side that crashes the client, not a fault in the remote implementation, which, crude as it is, has very little room for failure of that magnitude. I just updated the bugzilla report with a few comments about the failure.

> The short of it is that there’s nothing MCJIT-specific about this failure.
> It’s most likely a pipe I/O problem. I think it’s possible that the clang
> optimizations are just exposing a timing-related vulnerability in the pipe
> handling.

Ok, I'll disable those tests for ARM for now and will look into the bug. I don't know much about how MCJIT works, so creating the reduced test case will prove difficult. But I'll progress, because I do want MCJIT to work well on ARM, and disabling tests is the wrong way to head. ;)

cheers,
--renato
Looking at the code, one obvious source of intermittent failure is that the Linux implementations of ReadBytes and WriteBytes don't check for EINTR. I doubt that's the failure you're seeing, because it would be more randomly distributed, but it's something that should be fixed.

A more likely cause of failure in your case is that read is returning fewer bytes than requested. In theory, this can happen if we read one end of the pipe while the other end is being written, but the current code doesn't check for it. A race condition like this seems more likely than a code generation problem.

I'm attaching a patch (which I haven't even tried to compile) that I think addresses these issues. Can you try it out and see if it fixes this problem for you?

If this doesn't do the trick, stepping through the remote case in the debugger will show you what the communication is leading up to the failure. From there it should be relatively simple to use just the RemoteTargetExternal class to create a test driver that communicates with the child process in the same way. This ought to give you a failing test case completely independent of any significant part of LLVM (unless the failure is entirely timing dependent).

Thanks,
Andy

From: Renato Golin [mailto:renato.golin at linaro.org]
Sent: Tuesday, November 26, 2013 2:44 PM
To: Kaylor, Andrew
Cc: NAKAMURA Takumi; LLVM Dev
Subject: Re: MCJIT RemoteMemoryManager Failures on ARM

On 26 November 2013 19:05, Kaylor, Andrew <andrew.kaylor at intel.com> wrote:

> I would also note that the failure isn't actually in anything
> MCJIT-specific. Aside from the fact that it seems to be clang-specific,
> the code that is failing is specific to the lli remote implementation.
> It's not clear to me why it would fail under aggressive optimization with
> clang, but I wouldn't characterize that code as particularly robust.

I agree. I think this is more likely a codegen fault on Clang's side that crashes the client, not even the remote implementation, that even being crude, has very little room for failure of that magnitude. I just updated the bugzilla report with a few comments about the failure.

> The short of it is that there's nothing MCJIT-specific about this failure.
> It's most likely a pipe I/O problem. I think it's possible that the clang
> optimizations are just exposing a timing-related vulnerability in the pipe
> handling.

Ok, I'll disable those tests for ARM for now and will look into the bug. I don't know much about how MCJIT works, so creating the reduced test case will prove difficult. But I'll progress, because I do want MCJIT to work well on ARM, and disabling tests is the wrong way to head. ;)

cheers,
--renato

-------------- next part --------------
A non-text attachment was scrubbed...
Name: lli-remote-comm.patch
Type: application/octet-stream
Size: 2997 bytes
Desc: lli-remote-comm.patch
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20131126/77118bd3/attachment.obj>
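
The two pipe-handling issues raised in the message above (no EINTR check, and read returning fewer bytes than requested) are usually handled with a retry loop. The following is only a minimal sketch of that general technique, not the contents of lli-remote-comm.patch and not the actual ReadBytes/WriteBytes code in lli. The helper names readFully and writeFully are hypothetical, and the sketch assumes blocking POSIX file descriptors.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <unistd.h>

// Hypothetical helper (not from the LLVM tree): read exactly Size bytes
// from FD. Retries when read() is interrupted by a signal (EINTR) and
// keeps reading after a short read. Returns false on end-of-file before
// Size bytes have arrived, or on any other error.
static bool readFully(int FD, void *Data, size_t Size) {
  assert(Data != nullptr || Size == 0);
  char *Ptr = static_cast<char *>(Data);
  size_t Done = 0;
  while (Done < Size) {
    ssize_t N = ::read(FD, Ptr + Done, Size - Done);
    if (N > 0)
      Done += static_cast<size_t>(N); // possibly a short read; keep going
    else if (N == 0)
      return false;                   // writer closed the pipe early
    else if (errno != EINTR)
      return false;                   // genuine error
    // N < 0 with errno == EINTR: interrupted, simply retry
  }
  return true;
}

// Hypothetical helper: write exactly Size bytes to FD, with the same
// EINTR and partial-transfer handling as readFully.
static bool writeFully(int FD, const void *Data, size_t Size) {
  assert(Data != nullptr || Size == 0);
  const char *Ptr = static_cast<const char *>(Data);
  size_t Done = 0;
  while (Done < Size) {
    ssize_t N = ::write(FD, Ptr + Done, Size - Done);
    if (N > 0)
      Done += static_cast<size_t>(N); // possibly a partial write
    else if (N < 0 && errno != EINTR)
      return false;                   // genuine error
  }
  return true;
}
```

With helpers like these, a caller treats a false return as a broken connection instead of silently continuing with a partially filled buffer, which is the failure mode suspected above.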
On 26 November 2013 23:29, Kaylor, Andrew <andrew.kaylor at intel.com> wrote:

> Looking at the code, one obvious source of intermittent failure is that
> the Linux implementations of ReadBytes and WriteBytes don’t check for
> EINTR. I doubt that’s the failure you’re seeing because it would be more
> randomly distributed but it’s something that should be fixed.

Agreed.

> More likely as the cause of failure in your case is that read is returning
> less than the number of bytes requested. In theory, this can happen if we
> read one end of the pipe while the other end is being written, but the
> current code doesn’t check for it. A race condition like this seems more
> likely than a code generation problem.

Right. What I meant by a codegen problem was not *just* a crash in the client, but code movement that would induce instability, like moving things beyond memory barriers, etc. However, I agree that the code, as it is, is not robust enough, and that a more aggressive compiler can upset the lucky balance it has now.

> I’m attaching a patch (which I haven’t even tried to compile) that I think
> addresses these issues. Can you try it out and see if it fixes this
> problem for you?

The patch does fix the problem, but it introduces lock-ups in other (random) tests when they run simultaneously. The lock-ups don't occur when I run the tests independently.

cheers,
--renato