> On Dec 7, 2016, at 10:27 AM, Viacheslav Nikolaev <viacheslav.nikolaev at gmail.com> wrote:
>
> > I believe it’ll always forward directly to raw_fd_ostream::write_impl(), which is calling the libc ::write().
>
> Do you mean raw_fd_ostream somehow overrides the operator<< whose code I extracted?
> I cannot see that it does. And I verified that it didn’t, albeit in a very old master.

No, I meant I didn’t see the race in the code you showed for operator<<(). It is very possible I missed something, but you’ll need to help me figure out where the race is.

—
Mehdi

> > I agree. Acquiring a lock on each write to a buffered raw_ostream is too expensive. You can always explicitly use std::mutex if you have shared raw_ostreams.
>
> Of course you can do it in your code. But there's a lot of "dbgs() <<" in the backends.
> And if, e.g., you want multithreaded invocation of a debug version of the lib without changing the backend, you might get the race condition.
>
> That's why I'm asking if there's a good way to avoid a problem like that.
Viacheslav Nikolaev via llvm-dev
2016-Dec-07 18:52 UTC
[llvm-dev] Race condition in raw_ostream
Two threads may both pass this check (so neither takes the write() path):

    if (Size > (size_t)(OutBufEnd - OutBufCur))
      return write(Str.data(), Size);

Then they both reach this point, but one of them might update OutBufCur in such a way that the other one's subsequent memcpy overruns the buffer:

    if (Size) {
      memcpy(OutBufCur, Str.data(), Size);
      OutBufCur += Size;
    }

On Wed, Dec 7, 2016 at 9:47 PM, Mehdi Amini <mehdi.amini at apple.com> wrote:

> On Dec 7, 2016, at 10:27 AM, Viacheslav Nikolaev <viacheslav.nikolaev at gmail.com> wrote:
>
> > I believe it’ll always forward directly to raw_fd_ostream::write_impl(), which is calling the libc ::write().
>
> Do you mean raw_fd_ostream somehow overrides the operator<< whose code I extracted?
> I cannot see that it does. And I verified that it didn’t, albeit in a very old master.
>
> No, I meant I didn’t see the race in the code you showed for operator<<(). It is very possible I missed something, but you’ll need to help me figure out where the race is.
>
> —
> Mehdi
>
> > I agree. Acquiring a lock on each write to a buffered raw_ostream is too expensive. You can always explicitly use std::mutex if you have shared raw_ostreams.
>
> Of course you can do it in your code. But there's a lot of "dbgs() <<" in the backends.
> And if, e.g., you want multithreaded invocation of a debug version of the lib without changing the backend, you might get the race condition.
>
> That's why I'm asking if there's a good way to avoid a problem like that.
>
> On Wed, Dec 7, 2016 at 9:12 PM, Rui Ueyama <ruiu at google.com> wrote:
>
>> On Wed, Dec 7, 2016 at 10:02 AM, Mehdi Amini via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>
>>> > On Dec 7, 2016, at 1:52 AM, Viacheslav Nikolaev via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>> >
>>> > This code from raw_ostream.h is really racy:
>>> >
>>> >   raw_ostream &operator<<(StringRef Str) {
>>> >     // Inline fast path, particularly for strings with a known length.
>>> >     size_t Size = Str.size();
>>> >
>>> >     // Make sure we can use the fast path.
>>> >     if (Size > (size_t)(OutBufEnd - OutBufCur))
>>> >       return write(Str.data(), Size);
>>> >
>>> >     if (Size) {
>>> >       memcpy(OutBufCur, Str.data(), Size);
>>> >       OutBufCur += Size;
>>> >     }
>>> >     return *this;
>>> >   }
>>>
>>> I don’t believe "this is racy" is an appropriate qualification; "buffered raw_ostreams do not provide a thread-safe API" seems more accurate to me.
>>
>> I agree. Acquiring a lock on each write to a buffered raw_ostream is too expensive. You can always explicitly use std::mutex if you have shared raw_ostreams.
>>
>>> > Of course, one might wonder why someone would need to output to a stream from multiple threads at the same time.
>>> >
>>> > But imagine someone might get logs to dbgs() or errs() running the backend for a target in multiple threads.
>>>
>>> These are unbuffered; I wouldn’t expect a race in the code you list above. I believe it’ll always forward directly to raw_fd_ostream::write_impl(), which is calling the libc ::write().
>>>
>>> —
>>> Mehdi
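As a concrete illustration of Rui's suggestion above, explicitly guarding one shared stream with a std::mutex might look roughly like the sketch below. The LockedStream wrapper and its names are purely illustrative, not an LLVM API; the point is only that the entire operator<< fast path (the size check, the memcpy, and the OutBufCur bump) executes under the lock.

    #include "llvm/ADT/StringRef.h"
    #include "llvm/Support/raw_ostream.h"
    #include <mutex>
    #include <string>
    #include <thread>
    #include <vector>

    // Illustrative wrapper (not an LLVM API): serialize every insertion into
    // one shared raw_ostream so two threads never race on OutBufCur/OutBufEnd.
    class LockedStream {
      llvm::raw_ostream &OS;
      std::mutex Mu;

    public:
      explicit LockedStream(llvm::raw_ostream &OS) : OS(OS) {}

      void write(llvm::StringRef Str) {
        std::lock_guard<std::mutex> Lock(Mu);
        OS << Str; // check + memcpy + pointer bump all happen under the lock
      }
    };

    int main() {
      std::string Out;
      llvm::raw_string_ostream Base(Out); // stand-in for any shared stream
      LockedStream Shared(Base);

      std::vector<std::thread> Workers;
      for (int I = 0; I < 4; ++I)
        Workers.emplace_back([&Shared] {
          for (int J = 0; J < 10000; ++J)
            Shared.write("a line of log output\n");
        });
      for (std::thread &T : Workers)
        T.join();

      llvm::errs() << "bytes written: " << Base.str().size() << "\n";
      return 0;
    }

Whether the locking cost is acceptable depends on how hot the output path is, which is exactly the trade-off Rui points out.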
> On Dec 7, 2016, at 10:52 AM, Viacheslav Nikolaev <viacheslav.nikolaev at gmail.com> wrote:
>
> Two threads may both pass this check (so neither takes the write() path):
>
>     if (Size > (size_t)(OutBufEnd - OutBufCur))
>       return write(Str.data(), Size);
>
> Then they both reach this point, but one of them might update OutBufCur in such a way that the other one's subsequent memcpy overruns the buffer:
>
>     if (Size) {
>       memcpy(OutBufCur, Str.data(), Size);
>       OutBufCur += Size;
>     }

I mentioned in my original email that I don’t believe this can happen for an unbuffered stream; it will always call write().

—
Mehdi
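To illustrate that last point: on an unbuffered stream there is no buffer, so OutBufEnd - OutBufCur is zero, the fast-path check fails for any non-empty string, and operator<< falls through to write(), which ends up in write_impl() (the libc ::write() for a raw_fd_ostream). A rough sketch of forcing a shared stream onto that path follows; the forceUnbuffered helper is my own illustrative name, while SetUnbuffered() itself is a public raw_ostream method.

    #include "llvm/Support/raw_ostream.h"

    // Illustrative helper: force a stream onto the unbuffered path so
    // operator<< never touches the internal buffer pointers and every
    // insertion goes through write() -> write_impl().
    void forceUnbuffered(llvm::raw_ostream &OS) {
      OS.SetUnbuffered();
    }

    int main() {
      // outs() is buffered by default; errs() (and, per the thread, dbgs())
      // are already unbuffered.
      forceUnbuffered(llvm::outs());
      llvm::outs() << "this goes straight to write_impl()\n";
      return 0;
    }

Note that this only avoids the race on the buffer pointers; output from different threads can still interleave arbitrarily at the ::write() level.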