Rafael Avila de Espindola via llvm-dev
2017-Nov-09  21:13 UTC
[llvm-dev] Experiment on how to improve our temporary file handing.
Currently a power failure or other hard crash can cause lld leave a temporary
file around. The same is true for other llvm tools.
As an example, put a breakpoint in Writer.cpp:236 ("writeBuildId()")
and
restart the run a few times. You will get
t.tmp43a735a  t.tmp4deeabb  t.tmp9bacdd3  t.tmpe4115c4  t.tmpeb01fff
The same would happen if there was a fatal error between the
FileOutputBuffer creation and commit. I don't think that is a code path
where that is possible right now, but it would be an easy thing to miss
in a code review.
I was hopping the OS could help us manage the temporary file so that
there was no way to accidentally leave it behind.
On linux there is O_TMPFILE, which allows us to create a file with no
name in the file system. A name can be given with linkat. Unfortunately
we can't use
linkat(fd, "", AT_FDCWD, "destination", AT_EMPTY_PATH)
Without special permissions and have instead to depend on proc:
linkat(AT_FDCWD, "/proc/self/fd/<num>", AT_FDCWD,
"destination",
       AT_SYMLINK_FOLLOW)
Another annoyance is that linkat will not replace the destination and
renameat2 doesn't support AT_EMPTY_PATH. The result is that we have to
use unlink+linkat and loop.
On windows there is FILE_FLAG_DELETE_ON_CLOSE, but there seems to be no
way to cancel it. If a file is created with it I can rename it, but it
is still deleted in the end.
I couldn't find any support for this on FreeBSD.
This suggest that we cannot just have a createUniqueEntity that returs
just an FD. We will need a class that stores a temporary name too for
the systems where we cannot give a filename to a FD.
I have attached the patch I got so far, but given the existing
restrictions I would say it is not worth it.
It will try to just make it more obvious that lld's FileOutputBuffer is
deleted on all code paths.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: t.diff
Type: text/x-patch
Size: 7721 bytes
Desc: not available
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20171109/ef7a7c2e/attachment.bin>
-------------- next part --------------
Cheers,
Rafael
Reid Kleckner via llvm-dev
2017-Nov-09  21:31 UTC
[llvm-dev] Experiment on how to improve our temporary file handing.
I agree, having a nameless FD would be the most elegant way to solve this problem on Unix, but when I investigated it, I came to the same conclusion that you did: it can't be done portably. Nameless files are just too weird, so I think we're stuck with RemoveFileOnSignal. =/ For Windows, I was able to use SetFileInformationByHandle to get the OS to clean up files for us on exit. I attached my proof-of-concept program that does this. On Thu, Nov 9, 2017 at 1:13 PM, Rafael Avila de Espindola via llvm-dev < llvm-dev at lists.llvm.org> wrote:> Currently a power failure or other hard crash can cause lld leave a > temporary > file around. The same is true for other llvm tools. > > As an example, put a breakpoint in Writer.cpp:236 ("writeBuildId()") and > restart the run a few times. You will get > > t.tmp43a735a t.tmp4deeabb t.tmp9bacdd3 t.tmpe4115c4 t.tmpeb01fff > > The same would happen if there was a fatal error between the > FileOutputBuffer creation and commit. I don't think that is a code path > where that is possible right now, but it would be an easy thing to miss > in a code review. > > I was hopping the OS could help us manage the temporary file so that > there was no way to accidentally leave it behind. > > On linux there is O_TMPFILE, which allows us to create a file with no > name in the file system. A name can be given with linkat. Unfortunately > we can't use > > linkat(fd, "", AT_FDCWD, "destination", AT_EMPTY_PATH) > > Without special permissions and have instead to depend on proc: > > linkat(AT_FDCWD, "/proc/self/fd/<num>", AT_FDCWD, "destination", > AT_SYMLINK_FOLLOW) > > Another annoyance is that linkat will not replace the destination and > renameat2 doesn't support AT_EMPTY_PATH. The result is that we have to > use unlink+linkat and loop. > > On windows there is FILE_FLAG_DELETE_ON_CLOSE, but there seems to be no > way to cancel it. If a file is created with it I can rename it, but it > is still deleted in the end. > > I couldn't find any support for this on FreeBSD. > > This suggest that we cannot just have a createUniqueEntity that returs > just an FD. We will need a class that stores a temporary name too for > the systems where we cannot give a filename to a FD. > > I have attached the patch I got so far, but given the existing > restrictions I would say it is not worth it. > > It will try to just make it more obvious that lld's FileOutputBuffer is > deleted on all code paths. > > > diff --git a/include/llvm/Support/FileSystem.h b/include/llvm/Support/ > FileSystem.h > index 03015a0ca3b..5d80db1d5a2 100644 > --- a/include/llvm/Support/FileSystem.h > +++ b/include/llvm/Support/FileSystem.h > @@ -31,6 +31,7 @@ > #include "llvm/ADT/StringRef.h" > #include "llvm/ADT/Twine.h" > #include "llvm/Support/Chrono.h" > +#include "llvm/Support/Error.h" > #include "llvm/Support/ErrorHandling.h" > #include "llvm/Support/ErrorOr.h" > #include "llvm/Support/MD5.h" > @@ -694,6 +695,26 @@ std::error_code createUniqueFile(const Twine &Model, > int &ResultFD, > std::error_code createUniqueFile(const Twine &Model, > SmallVectorImpl<char> &ResultPath); > > +// Represents a temporary file and hides the name. This allows not even > having > +// no name when O_TMPFILE is supported. > +class TempFile { > + // Temporary name used when O_TMPFILE is not avilable. > + std::string TmpName; > + > +public: > + TempFile(StringRef Name, int FD); > + TempFile(TempFile &&Other); > + > + // The open file descriptor. > + int FD; > + > + Error keep(const Twine &Name); > + ~TempFile(); > +}; > + > +Expected<TempFile> createUniqueFile(const Twine &Directory, > + unsigned Mode = all_read | all_write); > + > /// @brief Create a file in the system temporary directory. > /// > /// The filename is of the form prefix-random_chars.suffix. Since the > directory > diff --git a/lib/Support/FileOutputBuffer.cpp b/lib/Support/ > FileOutputBuffer.cpp > index db8ff38e5b5..af61af3c32e 100644 > --- a/lib/Support/FileOutputBuffer.cpp > +++ b/lib/Support/FileOutputBuffer.cpp > @@ -34,9 +34,9 @@ using namespace llvm::sys; > // with the temporary file on commit(). > class OnDiskBuffer : public FileOutputBuffer { > public: > - OnDiskBuffer(StringRef Path, StringRef TempPath, > + OnDiskBuffer(StringRef Path, fs::TempFile Temp, > std::unique_ptr<fs::mapped_file_region> Buf) > - : FileOutputBuffer(Path), Buffer(std::move(Buf)), > TempPath(TempPath) {} > + : FileOutputBuffer(Path), Buffer(std::move(Buf)), > Temp(std::move(Temp)) {} > > uint8_t *getBufferStart() const override { return (uint8_t > *)Buffer->data(); } > > @@ -51,21 +51,18 @@ public: > Buffer.reset(); > > // Atomically replace the existing file with the new one. > - auto EC = fs::rename(TempPath, FinalPath); > - sys::DontRemoveFileOnSignal(TempPath); > - return errorCodeToError(EC); > + return Temp.keep(FinalPath); > } > > ~OnDiskBuffer() override { > // Close the mapping before deleting the temp file, so that the > removal > // succeeds. > Buffer.reset(); > - fs::remove(TempPath); > } > > private: > std::unique_ptr<fs::mapped_file_region> Buffer; > - std::string TempPath; > + fs::TempFile Temp; > }; > > // A FileOutputBuffer which keeps data in memory and writes to the final > @@ -110,13 +107,13 @@ createInMemoryBuffer(StringRef Path, size_t Size, > unsigned Mode) { > > static Expected<std::unique_ptr<OnDiskBuffer>> > createOnDiskBuffer(StringRef Path, size_t Size, unsigned Mode) { > - // Create new file in same directory but with random name. > - SmallString<128> TempPath; > - int FD; > - if (auto EC = fs::createUniqueFile(Path + ".tmp%%%%%%%", FD, TempPath, > Mode)) > - return errorCodeToError(EC); > - > - sys::RemoveFileOnSignal(TempPath); > + StringRef Directory = path::parent_path(Path); > + if (Directory.empty()) > + Directory = "."; > + Expected<fs::TempFile> FileOrErr = fs::createUniqueFile(Directory, > Mode); > + if (!FileOrErr) > + return FileOrErr.takeError(); > + fs::TempFile File = std::move(*FileOrErr); > > #ifndef LLVM_ON_WIN32 > // On Windows, CreateFileMapping (the mmap function on Windows) > @@ -124,18 +121,18 @@ createOnDiskBuffer(StringRef Path, size_t Size, > unsigned Mode) { > // extend the file beforehand. _chsize (ftruncate on Windows) is > // pretty slow just like it writes specified amount of bytes, > // so we should avoid calling that function. > - if (auto EC = fs::resize_file(FD, Size)) > + if (auto EC = fs::resize_file(File.FD, Size)) > return errorCodeToError(EC); > #endif > > // Mmap it. > std::error_code EC; > auto MappedFile = llvm::make_unique<fs::mapped_file_region>( > - FD, fs::mapped_file_region::readwrite, Size, 0, EC); > - close(FD); > + File.FD, fs::mapped_file_region::readwrite, Size, 0, EC); > if (EC) > return errorCodeToError(EC); > - return llvm::make_unique<OnDiskBuffer>(Path, TempPath, > std::move(MappedFile)); > + return llvm::make_unique<OnDiskBuffer>(Path, std::move(File), > + std::move(MappedFile)); > } > > // Create an instance of FileOutputBuffer. > diff --git a/lib/Support/Path.cpp b/lib/Support/Path.cpp > index 9692acb5283..d3a356cecde 100644 > --- a/lib/Support/Path.cpp > +++ b/lib/Support/Path.cpp > @@ -18,8 +18,12 @@ > #include "llvm/Support/ErrorHandling.h" > #include "llvm/Support/FileSystem.h" > #include "llvm/Support/Process.h" > +#include "llvm/Support/Signals.h" > #include <cctype> > #include <cstring> > +#include <fcntl.h> > +#include <sys/stat.h> > +#include <sys/types.h> > > #if !defined(_MSC_VER) && !defined(__MINGW32__) > #include <unistd.h> > @@ -759,6 +763,73 @@ std::error_code createUniqueFile(const Twine &Model, > return createUniqueEntity(Model, Dummy, ResultPath, false, 0, FS_Name); > } > > +TempFile::TempFile(StringRef Name, int FD) : TmpName(Name), FD(FD) {} > +TempFile::TempFile(TempFile &&Other) { > + TmpName = std::move(Other.TmpName); > + FD = Other.FD; > + Other.FD = -1; > +} > + > +TempFile::~TempFile() { > + // FIXME: error handling. Move this to a discard method? > + if (FD != -1) > + close(FD); > + if (!TmpName.empty()) > + fs::remove(TmpName); > + sys::DontRemoveFileOnSignal(TmpName); > +} > + > +Error TempFile::keep(const Twine &Name) { > + if (TmpName.empty()) { > + // We can't do it atomically. > + // FIXME: should we loop? > + if (std::error_code EC = fs::remove(Name)) > + return errorCodeToError(EC); > + SmallString<128> Storage; > + StringRef NameP = Name.toNullTerminatedStringRef(Storage); > + std::string ProcFD = "/proc/self/fd/" + std::to_string(FD); > + if (linkat(AT_FDCWD, ProcFD.c_str(), AT_FDCWD, NameP.begin(), > + AT_SYMLINK_FOLLOW) == -1) { > + std::error_code EC(errno, std::generic_category()); > + return errorCodeToError(EC); > + } > + } else { > + if (std::error_code EC = fs::rename(TmpName, Name)) > + return errorCodeToError(EC); > + sys::DontRemoveFileOnSignal(TmpName); > + TmpName = ""; > + } > + > + if (close(FD) == -1) { > + std::error_code EC(errno, std::generic_category()); > + return errorCodeToError(EC); > + } > + FD = -1; > + > + return Error::success(); > +} > + > +Expected<TempFile> createUniqueFile(const Twine &Directory, unsigned > Mode) { > +#if 1 > + SmallString<128> Storage; > + StringRef P = Directory.toNullTerminatedStringRef(Storage); > + int FD = open(P.begin(), O_RDWR | O_TMPFILE | O_CLOEXEC, Mode); > + if (FD == -1) { > + std::error_code EC(errno, std::generic_category()); > + return errorCodeToError(EC); > + } > + return TempFile("", FD); > +#else > + int FD; > + SmallString<128> ResultPath; > + if (std::error_code EC > + createUniqueFile(Directory + "/tmp%%%%%%%", FD, ResultPath, > Mode)) > + return errorCodeToError(EC); > + sys::RemoveFileOnSignal(ResultPath); > + return TempFile(ResultPath, FD); > +#endif > +} > + > static std::error_code > createTemporaryFile(const Twine &Model, int &ResultFD, > llvm::SmallVectorImpl<char> &ResultPath, FSEntity > Type) { > diff --git a/lib/Support/Unix/Path.inc b/lib/Support/Unix/Path.inc > index 2ecb97316c8..3099b75a1b8 100644 > --- a/lib/Support/Unix/Path.inc > +++ b/lib/Support/Unix/Path.inc > @@ -779,6 +779,8 @@ std::error_code openFileForWrite(const Twine &Name, > int &ResultFD, > > int OpenFlags = O_CREAT; > > + // add O_TMPFILE? > + > #ifdef O_CLOEXEC > OpenFlags |= O_CLOEXEC; > #endif > > > Cheers, > Rafael > > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev > >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20171109/48f36447/attachment.html> -------------- next part -------------- #include <windows.h> #include <string> #include <stdio.h> #include <stdlib.h> #include <string.h> // Create a string with last error message static std::string GetLastErrorStdStr(DWORD error) { if (error == 0) return ""; LPVOID lpMsgBuf; DWORD bufLen = FormatMessage( FORMAT_MESSAGE_ALLOCATE_BUFFER | FORMAT_MESSAGE_FROM_SYSTEM | FORMAT_MESSAGE_IGNORE_INSERTS, NULL, error, MAKELANGID(LANG_NEUTRAL, SUBLANG_DEFAULT), (LPTSTR)&lpMsgBuf, 0, NULL); if (bufLen == 0) return ""; LPCSTR lpMsgStr = (LPCSTR)lpMsgBuf; std::string result(lpMsgStr, lpMsgStr + bufLen); LocalFree(lpMsgBuf); return result; } static void error(const char *func) { DWORD error = GetLastError(); std::string err_str = GetLastErrorStdStr(error); fprintf(stderr, "call to %s failed: %lu %s\n", func, error, err_str.c_str()); fflush(stderr); exit(1); } int main(int argc, char **argv) { printf("starting\n"); fflush(stdout); HANDLE h = CreateFileW( // L"outfile.tmp.txt", // lpFileName GENERIC_READ | GENERIC_WRITE | DELETE, // dwDesiredAccess 0, // dwShareMode NULL, // lpSecurityAttributes CREATE_ALWAYS, // dwCreationDisposition FILE_ATTRIBUTE_NORMAL, // dwFlagsAndAttributes NULL // hTemplateFile ); if (h == INVALID_HANDLE_VALUE) error("CreateFileW"); FILE_DISPOSITION_INFO disposition; disposition.DeleteFile = TRUE; if (!SetFileInformationByHandle(h, FileDispositionInfo, &disposition, sizeof(disposition))) error("SetFileInformationByHandle"); static const char buf[] = "output data\n"; DWORD written; if (!WriteFile(h, buf, sizeof(buf), &written, NULL)) error("WriteFile"); if (written != sizeof(buf)) { fprintf(stderr, "short write?\n"); fflush(stderr); exit(1); } if (argc > 1) { if (strcmp("abort", argv[1]) == 0) { fprintf(stderr, "aborting, file should be deleted\n"); fflush(stderr); abort(); } else if (strcmp("fastfail", argv[1]) == 0) { fprintf(stderr, "__fastfail, file should be deleted\n"); fflush(stderr); __fastfail(FAST_FAIL_FATAL_APP_EXIT); } else if (strcmp("derefnull", argv[1]) == 0) { fprintf(stderr, "null deref, file should be deleted\n"); fflush(stderr); *(volatile int *)0 = 42; } } // OK, we want to keep this file. disposition.DeleteFile = FALSE; if (!SetFileInformationByHandle(h, FileDispositionInfo, &disposition, sizeof(disposition))) error("SetFileInformationByHandle"); char rename_info_buf[sizeof(FILE_RENAME_INFO) + 128 * sizeof(wchar_t)]; FILE_RENAME_INFO *rename_info reinterpret_cast<FILE_RENAME_INFO *>(rename_info_buf); const wchar_t *outfile = L"C:\\src\\llvm-project\\build\\outfile.txt"; rename_info->ReplaceIfExists = TRUE; rename_info->RootDirectory = NULL; rename_info->FileNameLength = wcslen(outfile) * sizeof(wchar_t); wcsncpy_s(&rename_info->FileName[0], 128, outfile, wcslen(outfile)); printf("before rename\n"); fflush(stdout); if (!SetFileInformationByHandle(h, FileRenameInfo, rename_info, sizeof(rename_info_buf))) error("SetFileInformationByHandle"); printf("after rename\n"); fflush(stdout); CloseHandle(h); printf("post close\n"); fflush(stdout); }
Rafael Avila de Espindola via llvm-dev
2017-Nov-09  21:53 UTC
[llvm-dev] Experiment on how to improve our temporary file handing.
Reid Kleckner <rnk at google.com> writes:> // OK, we want to keep this file. > disposition.DeleteFile = FALSE; > if (!SetFileInformationByHandle(h, FileDispositionInfo, &disposition, > sizeof(disposition))) > error("SetFileInformationByHandle");Nice! That seems to be the part I was missing. That motivates me to finish the implementation. Having perfect output file handling on windows is going to be useful for us and might hopefully put some pressure on posix vendors to make it possible :-) Cheers, Rafael
Mark Kettenis via llvm-dev
2017-Nov-10  16:21 UTC
[llvm-dev] Experiment on how to improve our temporary file handing.
> Date: Thu, 09 Nov 2017 13:13:52 -0800 > From: Rafael Avila de Espindola via llvm-dev <llvm-dev at lists.llvm.org> > > Currently a power failure or other hard crash can cause lld leave a temporary > file around. The same is true for other llvm tools. > > As an example, put a breakpoint in Writer.cpp:236 ("writeBuildId()") and > restart the run a few times. You will get > > t.tmp43a735a t.tmp4deeabb t.tmp9bacdd3 t.tmpe4115c4 t.tmpeb01fff > > The same would happen if there was a fatal error between the > FileOutputBuffer creation and commit. I don't think that is a code path > where that is possible right now, but it would be an easy thing to miss > in a code review. > > I was hopping the OS could help us manage the temporary file so that > there was no way to accidentally leave it behind. > > On linux there is O_TMPFILE, which allows us to create a file with no > name in the file system.For the LLVM use case creatinmg the file and immediately unlinking it would probably be good enough. Howver...> A name can be given with linkat. Unfortunately we can't use > > linkat(fd, "", AT_FDCWD, "destination", AT_EMPTY_PATH) > > Without special permissions and have instead to depend on proc: > > linkat(AT_FDCWD, "/proc/self/fd/<num>", AT_FDCWD, "destination", > AT_SYMLINK_FOLLOW) > > Another annoyance is that linkat will not replace the destination and > renameat2 doesn't support AT_EMPTY_PATH. The result is that we have to > use unlink+linkat and loop.I'd strongly recommend avoiding such unportable cleverness. I'm actually surprised Linux allows this as there are some serious security implications as this allows programs to create an entry in the filesystem for file descriptors passed over a socket.
Nikodemus Siivola via llvm-dev
2017-Nov-10  16:39 UTC
[llvm-dev] Experiment on how to improve our temporary file handing.
In a very large class of cases the correct thing is to cleanup on next run. Not sure clang is in that class by any means, just tossing this out there... On 10 Nov 2017 18.21, "Mark Kettenis via llvm-dev" <llvm-dev at lists.llvm.org> wrote:> > Date: Thu, 09 Nov 2017 13:13:52 -0800 > > From: Rafael Avila de Espindola via llvm-dev <llvm-dev at lists.llvm.org> > > > > Currently a power failure or other hard crash can cause lld leave a > temporary > > file around. The same is true for other llvm tools. > > > > As an example, put a breakpoint in Writer.cpp:236 ("writeBuildId()") and > > restart the run a few times. You will get > > > > t.tmp43a735a t.tmp4deeabb t.tmp9bacdd3 t.tmpe4115c4 t.tmpeb01fff > > > > The same would happen if there was a fatal error between the > > FileOutputBuffer creation and commit. I don't think that is a code path > > where that is possible right now, but it would be an easy thing to miss > > in a code review. > > > > I was hopping the OS could help us manage the temporary file so that > > there was no way to accidentally leave it behind. > > > > On linux there is O_TMPFILE, which allows us to create a file with no > > name in the file system. > > For the LLVM use case creatinmg the file and immediately unlinking it > would probably be good enough. Howver... > > > A name can be given with linkat. Unfortunately we can't use > > > > linkat(fd, "", AT_FDCWD, "destination", AT_EMPTY_PATH) > > > > Without special permissions and have instead to depend on proc: > > > > linkat(AT_FDCWD, "/proc/self/fd/<num>", AT_FDCWD, "destination", > > AT_SYMLINK_FOLLOW) > > > > Another annoyance is that linkat will not replace the destination and > > renameat2 doesn't support AT_EMPTY_PATH. The result is that we have to > > use unlink+linkat and loop. > > I'd strongly recommend avoiding such unportable cleverness. > > I'm actually surprised Linux allows this as there are some serious > security implications as this allows programs to create an entry in > the filesystem for file descriptors passed over a socket. > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20171110/62a31f44/attachment.html>
Rafael Avila de Espindola via llvm-dev
2017-Nov-10  21:34 UTC
[llvm-dev] Experiment on how to improve our temporary file handing.
Mark Kettenis <mark.kettenis at xs4all.nl> writes:>> The same would happen if there was a fatal error between the >> FileOutputBuffer creation and commit. I don't think that is a code path >> where that is possible right now, but it would be an easy thing to miss >> in a code review. >> >> I was hopping the OS could help us manage the temporary file so that >> there was no way to accidentally leave it behind. >> >> On linux there is O_TMPFILE, which allows us to create a file with no >> name in the file system. > > For the LLVM use case creatinmg the file and immediately unlinking it > would probably be good enough. Howver...One is not normally allowed to create a link once the count goes down to zero. O_TMPFILE is a special case. Cheers, Rafael
Rafael Avila de Espindola via llvm-dev
2017-Nov-10  21:35 UTC
[llvm-dev] Experiment on how to improve our temporary file handing.
Mark Kettenis <mark.kettenis at xs4all.nl> writes:> I'm actually surprised Linux allows this as there are some serious > security implications as this allows programs to create an entry in > the filesystem for file descriptors passed over a socket.BTW, would you mind expanding on what is the security problem of that? Thanks, Rafael
Rui Ueyama via llvm-dev
2017-Nov-13  09:06 UTC
[llvm-dev] Experiment on how to improve our temporary file handing.
On Fri, Nov 10, 2017 at 6:13 AM, Rafael Avila de Espindola < rafael.espindola at gmail.com> wrote:> Currently a power failure or other hard crash can cause lld leave a > temporary > file around. The same is true for other llvm tools. > > As an example, put a breakpoint in Writer.cpp:236 ("writeBuildId()") and > restart the run a few times. You will get > > t.tmp43a735a t.tmp4deeabb t.tmp9bacdd3 t.tmpe4115c4 t.tmpeb01fff > > The same would happen if there was a fatal error between the > FileOutputBuffer creation and commit. I don't think that is a code path > where that is possible right now, but it would be an easy thing to miss > in a code review. > > I was hopping the OS could help us manage the temporary file so that > there was no way to accidentally leave it behind. > > On linux there is O_TMPFILE, which allows us to create a file with no > name in the file system. A name can be given with linkat. Unfortunately > we can't use > > linkat(fd, "", AT_FDCWD, "destination", AT_EMPTY_PATH) > > Without special permissions and have instead to depend on proc: > > linkat(AT_FDCWD, "/proc/self/fd/<num>", AT_FDCWD, "destination", > AT_SYMLINK_FOLLOW) >I often execute lld in a chroot environment in which /proc is not mounted, and I expect lld would work just fine in that environment. So the presence of /proc should be considered optional even on Linux. Another annoyance is that linkat will not replace the destination and> renameat2 doesn't support AT_EMPTY_PATH. The result is that we have to > use unlink+linkat and loop. >I think you can avoid unlink+linkat. You can instead give a temporary file name to an unnamed file and then rename the temporary file the destination file. There's a small chance between linkat and rename, but that race condition is probably better than unlink+linkat. On windows there is FILE_FLAG_DELETE_ON_CLOSE, but there seems to be no> way to cancel it. If a file is created with it I can rename it, but it > is still deleted in the end. > > I couldn't find any support for this on FreeBSD. > > This suggest that we cannot just have a createUniqueEntity that returs > just an FD. We will need a class that stores a temporary name too for > the systems where we cannot give a filename to a FD. > > I have attached the patch I got so far, but given the existing > restrictions I would say it is not worth it. > > It will try to just make it more obvious that lld's FileOutputBuffer is > deleted on all code paths. > > > diff --git a/include/llvm/Support/FileSystem.h b/include/llvm/Support/ > FileSystem.h > index 03015a0ca3b..5d80db1d5a2 100644 > --- a/include/llvm/Support/FileSystem.h > +++ b/include/llvm/Support/FileSystem.h > @@ -31,6 +31,7 @@ > #include "llvm/ADT/StringRef.h" > #include "llvm/ADT/Twine.h" > #include "llvm/Support/Chrono.h" > +#include "llvm/Support/Error.h" > #include "llvm/Support/ErrorHandling.h" > #include "llvm/Support/ErrorOr.h" > #include "llvm/Support/MD5.h" > @@ -694,6 +695,26 @@ std::error_code createUniqueFile(const Twine &Model, > int &ResultFD, > std::error_code createUniqueFile(const Twine &Model, > SmallVectorImpl<char> &ResultPath); > > +// Represents a temporary file and hides the name. This allows not even > having > +// no name when O_TMPFILE is supported. > +class TempFile { > + // Temporary name used when O_TMPFILE is not avilable. > + std::string TmpName; > + > +public: > + TempFile(StringRef Name, int FD); > + TempFile(TempFile &&Other); > + > + // The open file descriptor. > + int FD; > + > + Error keep(const Twine &Name); > + ~TempFile(); > +}; > + > +Expected<TempFile> createUniqueFile(const Twine &Directory, > + unsigned Mode = all_read | all_write); > + > /// @brief Create a file in the system temporary directory. > /// > /// The filename is of the form prefix-random_chars.suffix. Since the > directory > diff --git a/lib/Support/FileOutputBuffer.cpp b/lib/Support/ > FileOutputBuffer.cpp > index db8ff38e5b5..af61af3c32e 100644 > --- a/lib/Support/FileOutputBuffer.cpp > +++ b/lib/Support/FileOutputBuffer.cpp > @@ -34,9 +34,9 @@ using namespace llvm::sys; > // with the temporary file on commit(). > class OnDiskBuffer : public FileOutputBuffer { > public: > - OnDiskBuffer(StringRef Path, StringRef TempPath, > + OnDiskBuffer(StringRef Path, fs::TempFile Temp, > std::unique_ptr<fs::mapped_file_region> Buf) > - : FileOutputBuffer(Path), Buffer(std::move(Buf)), > TempPath(TempPath) {} > + : FileOutputBuffer(Path), Buffer(std::move(Buf)), > Temp(std::move(Temp)) {} > > uint8_t *getBufferStart() const override { return (uint8_t > *)Buffer->data(); } > > @@ -51,21 +51,18 @@ public: > Buffer.reset(); > > // Atomically replace the existing file with the new one. > - auto EC = fs::rename(TempPath, FinalPath); > - sys::DontRemoveFileOnSignal(TempPath); > - return errorCodeToError(EC); > + return Temp.keep(FinalPath); > } > > ~OnDiskBuffer() override { > // Close the mapping before deleting the temp file, so that the > removal > // succeeds. > Buffer.reset(); > - fs::remove(TempPath); > } > > private: > std::unique_ptr<fs::mapped_file_region> Buffer; > - std::string TempPath; > + fs::TempFile Temp; > }; > > // A FileOutputBuffer which keeps data in memory and writes to the final > @@ -110,13 +107,13 @@ createInMemoryBuffer(StringRef Path, size_t Size, > unsigned Mode) { > > static Expected<std::unique_ptr<OnDiskBuffer>> > createOnDiskBuffer(StringRef Path, size_t Size, unsigned Mode) { > - // Create new file in same directory but with random name. > - SmallString<128> TempPath; > - int FD; > - if (auto EC = fs::createUniqueFile(Path + ".tmp%%%%%%%", FD, TempPath, > Mode)) > - return errorCodeToError(EC); > - > - sys::RemoveFileOnSignal(TempPath); > + StringRef Directory = path::parent_path(Path); > + if (Directory.empty()) > + Directory = "."; > + Expected<fs::TempFile> FileOrErr = fs::createUniqueFile(Directory, > Mode); > + if (!FileOrErr) > + return FileOrErr.takeError(); > + fs::TempFile File = std::move(*FileOrErr); > > #ifndef LLVM_ON_WIN32 > // On Windows, CreateFileMapping (the mmap function on Windows) > @@ -124,18 +121,18 @@ createOnDiskBuffer(StringRef Path, size_t Size, > unsigned Mode) { > // extend the file beforehand. _chsize (ftruncate on Windows) is > // pretty slow just like it writes specified amount of bytes, > // so we should avoid calling that function. > - if (auto EC = fs::resize_file(FD, Size)) > + if (auto EC = fs::resize_file(File.FD, Size)) > return errorCodeToError(EC); > #endif > > // Mmap it. > std::error_code EC; > auto MappedFile = llvm::make_unique<fs::mapped_file_region>( > - FD, fs::mapped_file_region::readwrite, Size, 0, EC); > - close(FD); > + File.FD, fs::mapped_file_region::readwrite, Size, 0, EC); > if (EC) > return errorCodeToError(EC); > - return llvm::make_unique<OnDiskBuffer>(Path, TempPath, > std::move(MappedFile)); > + return llvm::make_unique<OnDiskBuffer>(Path, std::move(File), > + std::move(MappedFile)); > } > > // Create an instance of FileOutputBuffer. > diff --git a/lib/Support/Path.cpp b/lib/Support/Path.cpp > index 9692acb5283..d3a356cecde 100644 > --- a/lib/Support/Path.cpp > +++ b/lib/Support/Path.cpp > @@ -18,8 +18,12 @@ > #include "llvm/Support/ErrorHandling.h" > #include "llvm/Support/FileSystem.h" > #include "llvm/Support/Process.h" > +#include "llvm/Support/Signals.h" > #include <cctype> > #include <cstring> > +#include <fcntl.h> > +#include <sys/stat.h> > +#include <sys/types.h> > > #if !defined(_MSC_VER) && !defined(__MINGW32__) > #include <unistd.h> > @@ -759,6 +763,73 @@ std::error_code createUniqueFile(const Twine &Model, > return createUniqueEntity(Model, Dummy, ResultPath, false, 0, FS_Name); > } > > +TempFile::TempFile(StringRef Name, int FD) : TmpName(Name), FD(FD) {} > +TempFile::TempFile(TempFile &&Other) { > + TmpName = std::move(Other.TmpName); > + FD = Other.FD; > + Other.FD = -1; > +} > + > +TempFile::~TempFile() { > + // FIXME: error handling. Move this to a discard method? > + if (FD != -1) > + close(FD); > + if (!TmpName.empty()) > + fs::remove(TmpName); > + sys::DontRemoveFileOnSignal(TmpName); > +} > + > +Error TempFile::keep(const Twine &Name) { > + if (TmpName.empty()) { > + // We can't do it atomically. > + // FIXME: should we loop? > + if (std::error_code EC = fs::remove(Name)) > + return errorCodeToError(EC); > + SmallString<128> Storage; > + StringRef NameP = Name.toNullTerminatedStringRef(Storage); > + std::string ProcFD = "/proc/self/fd/" + std::to_string(FD); > + if (linkat(AT_FDCWD, ProcFD.c_str(), AT_FDCWD, NameP.begin(), > + AT_SYMLINK_FOLLOW) == -1) { > + std::error_code EC(errno, std::generic_category()); > + return errorCodeToError(EC); > + } > + } else { > + if (std::error_code EC = fs::rename(TmpName, Name)) > + return errorCodeToError(EC); > + sys::DontRemoveFileOnSignal(TmpName); > + TmpName = ""; > + } > + > + if (close(FD) == -1) { > + std::error_code EC(errno, std::generic_category()); > + return errorCodeToError(EC); > + } > + FD = -1; > + > + return Error::success(); > +} > + > +Expected<TempFile> createUniqueFile(const Twine &Directory, unsigned > Mode) { > +#if 1 > + SmallString<128> Storage; > + StringRef P = Directory.toNullTerminatedStringRef(Storage); > + int FD = open(P.begin(), O_RDWR | O_TMPFILE | O_CLOEXEC, Mode); > + if (FD == -1) { > + std::error_code EC(errno, std::generic_category()); > + return errorCodeToError(EC); > + } > + return TempFile("", FD); > +#else > + int FD; > + SmallString<128> ResultPath; > + if (std::error_code EC > + createUniqueFile(Directory + "/tmp%%%%%%%", FD, ResultPath, > Mode)) > + return errorCodeToError(EC); > + sys::RemoveFileOnSignal(ResultPath); > + return TempFile(ResultPath, FD); > +#endif > +} > + > static std::error_code > createTemporaryFile(const Twine &Model, int &ResultFD, > llvm::SmallVectorImpl<char> &ResultPath, FSEntity > Type) { > diff --git a/lib/Support/Unix/Path.inc b/lib/Support/Unix/Path.inc > index 2ecb97316c8..3099b75a1b8 100644 > --- a/lib/Support/Unix/Path.inc > +++ b/lib/Support/Unix/Path.inc > @@ -779,6 +779,8 @@ std::error_code openFileForWrite(const Twine &Name, > int &ResultFD, > > int OpenFlags = O_CREAT; > > + // add O_TMPFILE? > + > #ifdef O_CLOEXEC > OpenFlags |= O_CLOEXEC; > #endif > > > Cheers, > Rafael > >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20171113/3540f2c1/attachment.html>
Rafael Avila de Espindola via llvm-dev
2017-Nov-13  17:42 UTC
[llvm-dev] Experiment on how to improve our temporary file handing.
>> Without special permissions and have instead to depend on proc: >> >> linkat(AT_FDCWD, "/proc/self/fd/<num>", AT_FDCWD, "destination", >> AT_SYMLINK_FOLLOW) >> > > I often execute lld in a chroot environment in which /proc is not mounted, > and I expect lld would work just fine in that environment. So the presence > of /proc should be considered optional even on Linux.I agree. We would have to check if proc is present before using O_TMPFILE.> Another annoyance is that linkat will not replace the destination and >> renameat2 doesn't support AT_EMPTY_PATH. The result is that we have to >> use unlink+linkat and loop. >> > > I think you can avoid unlink+linkat. You can instead give a temporary file > name to an unnamed file and then rename the temporary file the destination > file. There's a small chance between linkat and rename, but that race > condition is probably better than unlink+linkat.True. I feels silly that we still need to create a temporary name with O_TMPFILE, but that is a possibility. In any case. I will focus on the windows case first as it seems to be the more generally available API. Cheers, Rafael
Davide Italiano via llvm-dev
2017-Nov-13  19:03 UTC
[llvm-dev] Experiment on how to improve our temporary file handing.
On Thu, Nov 9, 2017 at 1:13 PM, Rafael Avila de Espindola <rafael.espindola at gmail.com> wrote:> Currently a power failure or other hard crash can cause lld leave a temporary > file around. The same is true for other llvm tools. > > As an example, put a breakpoint in Writer.cpp:236 ("writeBuildId()") and > restart the run a few times. You will get > > t.tmp43a735a t.tmp4deeabb t.tmp9bacdd3 t.tmpe4115c4 t.tmpeb01fff > > The same would happen if there was a fatal error between the > FileOutputBuffer creation and commit. I don't think that is a code path > where that is possible right now, but it would be an easy thing to miss > in a code review. > > I was hopping the OS could help us manage the temporary file so that > there was no way to accidentally leave it behind. > > On linux there is O_TMPFILE, which allows us to create a file with no > name in the file system. A name can be given with linkat. Unfortunately > we can't use > > linkat(fd, "", AT_FDCWD, "destination", AT_EMPTY_PATH) > > Without special permissions and have instead to depend on proc: > > linkat(AT_FDCWD, "/proc/self/fd/<num>", AT_FDCWD, "destination", > AT_SYMLINK_FOLLOW) > > Another annoyance is that linkat will not replace the destination and > renameat2 doesn't support AT_EMPTY_PATH. The result is that we have to > use unlink+linkat and loop. > > On windows there is FILE_FLAG_DELETE_ON_CLOSE, but there seems to be no > way to cancel it. If a file is created with it I can rename it, but it > is still deleted in the end. > > I couldn't find any support for this on FreeBSD. >AFAIK FreeBSD supports some variant of /proc that should map Linux (although the mapping isn't 1:1). Does it lack support for this? Thanks for looking into this, BTW! -- Davide
Rafael Avila de Espindola via llvm-dev
2017-Nov-13  19:48 UTC
[llvm-dev] Experiment on how to improve our temporary file handing.
Davide Italiano <davide.italiano at gmail.com> writes:>> I couldn't find any support for this on FreeBSD. >> > > AFAIK FreeBSD supports some variant of /proc that should map Linux > (although the mapping isn't 1:1). > Does it lack support for this? Thanks for looking into this, BTW!O_TMPFILE is the main thing that seems to be missing. Cheers, Rafael