David Blaikie via llvm-dev
2017-May-29 17:35 UTC
[llvm-dev] Should we split llvm Support and ADT?
On Mon, May 29, 2017 at 9:25 AM Zachary Turner via llvm-dev <llvm-dev at lists.llvm.org> wrote:

> On Sun, May 28, 2017 at 8:54 PM Mehdi AMINI via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
>> 2017-05-26 17:47 GMT-07:00 Zachary Turner via llvm-dev <llvm-dev at lists.llvm.org>:
>>
>>> Changing a header file somewhere and having to spend 10 minutes waiting
>>> for a build leads to a lot of wasted developer time.
>>>
>>> The real culprit here is tablegen. Can we split support and ADT into
>>> two - the parts that tablegen depends on and the parts that it doesn't?
>>
>> Splitting ADT just based on tablegen usage seems dubious to me. If we
>> need to go this route, I'd replace as many uses of ADT data structures with
>> STL ones to begin with to reduce the surface.
>
> Tablegen would not need to determine WHERE to split, it would just
> motivate the why. It's obvious just from looking at Support's include
> directory though that a lot of stuff in there doesn't belong together. A
> quick look over the include directory already suggests a split into
> "broadly useful stuff" and "narrowly useful stuff"
>
> Broadly useful stuff:
> AlignOf
> Allocator
> ArrayRecycler
> Atomic
> AtomicOrdering
> Capacity
> Casting
> Chrono
> circular_raw_ostream
> COM.h
> CommandLine.h
> Compiler.h
> ConvertUTF.h
> CrashRecoveryContext.h
> DataExtractor.h
> Debug.h
> Endian.h
> EndianStream.h
> Errc.h
> Errno.h
> Error.h
> ErrorHandling.h
> ErrorOr.h
> FileOutputBuffer.h
> FileSystem.h
> FileUtilities.h
> Format*.h
> GlobPattern.h
> Host.h
> JamCRC.h
> KnownBits.h
> LineIterator.h
> Locale.h
> ManagedStatic.h
> MathExtras.h
> MD5.h
> Memory.h
> MemoryBuffer.h
> Mutex.h
> MutexGuard.h
> NativeFormatting.h
> Options.h
> Parallel.h
> Path.h
> PointerLikeTypeTraits.h
> PrettyStackTrace.h
> Printable.h
> Process.h
> Program.h
> RandomNumberGenerator.h
> raw_os_ostream.h
> raw_ostream.h
> raw_sha1_ostream.h
> Recycler.h
> RecyclingAllocator.h
> Regex.h
> RWMutex.h
> SaveAndRestore.h
> ScaledNumber.h
> SHA1.h
> Signals.h
> StringPool.h
> StringSaver.h
> SwapByteOrder.h
> SystemUtils.h
> thread.h
> Threading.h
> ThreadLocal.h
> ThreadPool.h
> Timer.h
> TrailingObjects.h
> Unicode.h
> UnicodeCharRanges.h
> UniqueLock.h
> Watchdog.h
> Win64EH.h
> WindowsError.h
> xxhash.h
>
> Narrowly useful stuff:
> AArch64TargetParser.def
> ARMAttributeParser.h
> ARMBuildAttributes.h
> ARMEHABI.h
> ARMTargetParser.def
> ARMWinEH.h
> Binary*Stream*.h
> BlockFrequency.h
> BranchProbability.h
> CachePruning.h
> CBindingWrapping.h
> CodeGen.h
> CodeGenCWrappers.h
> COFF.h
> Compression.h
> DebugCounter.h
> DotGraphTraits.h
> Dwarf.def
> Dwarf.h
> DynamicLibrary.h
> ELF.h
> GCOV.h
> GenericDomTree.h
> GenericDomTreeConstruction.h
> GraphWriter.h
> LEB128.h
> LockFileManager.h
> LowLevelTypeImpl.h
> MachO.def
> MachO.h
> MipsABIFlags.h
> OnDiskHashTable.h
> PluginLoader.h
> Registry.h
> ScopedPrinter.h
> SMLoc.h
> Solaris.h
> SourceMgr.h
> SpecialCaseList.h
> TargetParser.h
> TargetRegistry.h
> TargetSelect.h
> TarWriter.h
> ToolOutputFile.h
> TrigramIndex.h
> TypeName.h
> Valgrind.h
> Wasm.h
> YAMLParser.h
> YAMLTraits.h
>
> So, as a very crude first attempt, you call the first group of stuff
> "Base", the second group "Support", and add a dependency from Support to
> Base. This has nothing to do with tablegen, btw, and tablegen would still
> probably depend on Support even after this separation, but it makes sense
> even from a high level layering perspective (IMO)

To what end, though? If most things will depend on both anyway - it doesn't sound like the split would help/improve the situation re: changes to base or support leading to full rebuilds of LLVM... no?

>> 2017-05-28 8:25 GMT-07:00 Krzysztof Parzyszek via llvm-dev <llvm-dev at lists.llvm.org>:
>>
>>> On 5/26/2017 7:59 PM, Zachary Turner via llvm-dev wrote:
>>>
>>>> It's that TableGen depends on Support, so if you change one file in
>>>> support, support gets recompiled into a new static archive, which
>>>> triggers a rerun of tablegen on all the tablegen inputs, which is
>>>> extremely slow.
>>>
>>> What exactly is extremely slow? In my experience TableGen itself takes a
>>> negligible amount of time compared to the rest of the build. This is
>>> particularly true in cases when something in Support or ADT is modified, as
>>> this usually triggers recompilation of large parts of LLVM.
>>
>> Tablegen built in debug is really slow though, I remember an out-of-tree
>> backend where running llvm-tblgen was taking up to 5 min per file!
>> The CMake option LLVM_OPTIMIZED_TABLEGEN helps a lot with this. Otherwise
>> LLVM_TABLEGEN is even more efficient, but it's a double-edged sword.
>>
>> But we could also use the diff-and-copy approach not on the tablegen
>> output but on the llvm-tblgen binary itself, that way we wouldn't re-run it
>> when it does not change itself (I'm not sure why CMake does not use this
>> strategy by default for any file including .o and .a?).
>>
>> --
>> Mehdi
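The diff-and-copy idea Mehdi mentions can be sketched roughly as below. This is only an illustration under stated assumptions, not what LLVM's build or CMake actually does internally: `copyIfDifferent`, `readAll`, and the file paths are hypothetical names invented here. The freshly built file is installed over the old one only when the bytes differ, so an unchanged binary keeps its old timestamp and a timestamp-based build system has no reason to rerun the steps that depend on it. CMake exposes a comparable command-line tool as `cmake -E copy_if_different`.

```cpp
#include <fstream>
#include <iterator>
#include <string>

// Read a whole file in binary mode; `ok` reports whether it could be opened.
static std::string readAll(const std::string &path, bool &ok) {
  std::ifstream in(path, std::ios::binary);
  ok = in.is_open();
  return std::string(std::istreambuf_iterator<char>(in),
                     std::istreambuf_iterator<char>());
}

// Hypothetical helper: install `fresh` over `installed` only if the bytes
// differ. Returns true when `installed` was (re)written, false when it was
// left alone.
bool copyIfDifferent(const std::string &fresh, const std::string &installed) {
  bool haveFresh = false, haveInstalled = false;
  const std::string newBytes = readAll(fresh, haveFresh);
  const std::string oldBytes = readAll(installed, haveInstalled);
  if (!haveFresh)
    return false; // nothing to install
  if (haveInstalled && newBytes == oldBytes)
    return false; // identical: keep the old file and its timestamp
  std::ofstream out(installed, std::ios::binary | std::ios::trunc);
  out.write(newBytes.data(), static_cast<std::streamsize>(newBytes.size()));
  return out.good();
}
```

Reading both files fully into memory is the simplest way to compare them; a real tool would more likely compare sizes first and stream the contents.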
Zachary Turner via llvm-dev
2017-May-29 17:52 UTC
[llvm-dev] Should we split llvm Support and ADT?
Well, you have to start somewhere, right? Even if it doesn't completely solve the issue with TableGen, it is a step towards that, and it does help with the issue of Support being a mess with everything but the kitchen sink in it. After doing that split, I'm going to wager that everything in ADT would depend only on Base (or be very close to depending only on Base), at which point LLVMBase.a can be a combination of ADT and Base instead of a combination of ADT and Support, and then you only have to look at the remaining headers in Support to see what tablegen depends on.

The original list I posted that tablegen depended on from Support was

#include "llvm/Support/Casting.h"
#include "llvm/Support/CommandLine.h"
#include "llvm/Support/Compiler.h"
#include "llvm/Support/DataTypes.h"
#include "llvm/Support/Debug.h"
#include "llvm/Support/Error.h"
#include "llvm/Support/ErrorHandling.h"
#include "llvm/Support/Format.h"
#include "llvm/Support/FormattedStream.h"
#include "llvm/Support/LEB128.h"
#include "llvm/Support/LowLevelTypeImpl.h"
#include "llvm/Support/ManagedStatic.h"
#include "llvm/Support/MathExtras.h"
#include "llvm/Support/MemoryBuffer.h"
#include "llvm/Support/PrettyStackTrace.h"
#include "llvm/Support/Regex.h"
#include "llvm/Support/SMLoc.h"
#include "llvm/Support/ScopedPrinter.h"
#include "llvm/Support/Signals.h"
#include "llvm/Support/SourceMgr.h"
#include "llvm/Support/raw_ostream.h"

And there's only 2 or 3 of these that I put on the "narrowly useful" list, so it doesn't seem that difficult to address from here. (And I came up with that separation without even looking at TableGen's dependencies, so it seems TableGen already depends mostly on the actually broadly useful stuff.)

Of course, TableGen is only part of the build process (albeit the slowest part). I would be rather surprised if a split such as the one proposed (with some inevitable alterations based on user feedback) did not allow various other projects to also remove a dependency on Support in favor of one on Base, which could have a positive trickle-down effect on build times for other projects not related to tablegen.
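The overlap between TableGen's includes and the "narrowly useful" group can be checked mechanically. The sketch below is an illustration only: the two tables are transcribed from the lists posted in this thread (the wildcard and `.def` entries of the narrow group are left out, since none of them match a TableGen include), and `narrowDependencies` is a hypothetical helper name. With the lists as posted, the overlap comes out as five headers (LEB128.h, LowLevelTypeImpl.h, SMLoc.h, ScopedPrinter.h, SourceMgr.h), a little more than the "2 or 3" estimated from memory. DataTypes.h appears in neither group.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Headers TableGen includes from llvm/Support, as listed in the thread.
const std::vector<std::string> TableGenIncludes = {
    "Casting.h",       "CommandLine.h",      "Compiler.h",
    "DataTypes.h",     "Debug.h",            "Error.h",
    "ErrorHandling.h", "Format.h",           "FormattedStream.h",
    "LEB128.h",        "LowLevelTypeImpl.h", "ManagedStatic.h",
    "MathExtras.h",    "MemoryBuffer.h",     "PrettyStackTrace.h",
    "Regex.h",         "SMLoc.h",            "ScopedPrinter.h",
    "Signals.h",       "SourceMgr.h",        "raw_ostream.h"};

// The "narrowly useful" group from the thread (plain .h entries only).
const std::vector<std::string> NarrowHeaders = {
    "ARMAttributeParser.h", "ARMBuildAttributes.h", "ARMEHABI.h",
    "ARMWinEH.h", "BlockFrequency.h", "BranchProbability.h",
    "CachePruning.h", "CBindingWrapping.h", "CodeGen.h",
    "CodeGenCWrappers.h", "COFF.h", "Compression.h", "DebugCounter.h",
    "DotGraphTraits.h", "Dwarf.h", "DynamicLibrary.h", "ELF.h", "GCOV.h",
    "GenericDomTree.h", "GenericDomTreeConstruction.h", "GraphWriter.h",
    "LEB128.h", "LockFileManager.h", "LowLevelTypeImpl.h", "MachO.h",
    "MipsABIFlags.h", "OnDiskHashTable.h", "PluginLoader.h", "Registry.h",
    "ScopedPrinter.h", "SMLoc.h", "Solaris.h", "SourceMgr.h",
    "SpecialCaseList.h", "TargetParser.h", "TargetRegistry.h",
    "TargetSelect.h", "TarWriter.h", "ToolOutputFile.h", "TrigramIndex.h",
    "TypeName.h", "Valgrind.h", "Wasm.h", "YAMLParser.h", "YAMLTraits.h"};

// Return the TableGen includes that fall in the narrow group, i.e. the ones
// that would still tie TableGen to "Support" under the proposed split.
std::vector<std::string> narrowDependencies() {
  std::vector<std::string> result;
  for (const std::string &header : TableGenIncludes)
    if (std::find(NarrowHeaders.begin(), NarrowHeaders.end(), header) !=
        NarrowHeaders.end())
      result.push_back(header);
  return result;
}
```

This only looks at direct includes; headers pulled in transitively would need a real include scan.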
Zachary Turner via llvm-dev
2017-May-29 17:55 UTC
[llvm-dev] Should we split llvm Support and ADT?
Also, like Mehdi said, it would be nice if there were a way for non-LLVM projects to use Support. I once tried to make a program unrelated to LLVM or compilers, just for a hobbyist project, and I wanted to link against Support to use things like StringRef, DenseMap, etc., but because of all the added cruft in Support, I could never even get it building. The less stuff there is, the easier it would be for external projects to benefit from some of LLVM's common libraries. Obviously this is not an explicitly stated goal of the project, but if it can fall out naturally from otherwise good design principles, why not?
Matthias Braun via llvm-dev
2017-May-30 18:19 UTC
[llvm-dev] Should we split llvm Support and ADT?
If the end result is "Base", "Misc", "Broad", "Narrow", "Util", "Support" or other non-descriptive names, then I'm against it! If you can find pieces that can be broken out as logical units, such as "TargetTriple" (for Triple.h and all the ARM parsers) or "object file formats", then I don't think anyone would object to patches that do so.

Motivating this with a better project structure sounds sensible to me, but I am not convinced your build times will change all that much: the next change to something like ADT/StringRef.h will come, and you will have to do the tablegen rebuild anyway.

- Matthias
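Matthias's point about rebuild propagation can be illustrated with a toy dependency graph. Everything in this sketch is an assumption made for illustration: the node names and edges in `toyGraph` are invented and are not LLVM's real build graph, and `dirtySet` is a hypothetical helper. An edge from X to Y means "Y must be rebuilt when X changes", and the function collects everything downstream of a change.

```cpp
#include <map>
#include <queue>
#include <set>
#include <string>
#include <vector>

// Edge X -> Y means "Y must be rebuilt when X changes".
using Graph = std::map<std::string, std::vector<std::string>>;

// Illustrative, made-up graph; not LLVM's actual build dependencies.
Graph toyGraph() {
  Graph g;
  g["ADT/StringRef.h"] = {"llvm-tblgen", "LLVMSupport"};
  g["Support/ELF.h"] = {"LLVMObject"};
  g["llvm-tblgen"] = {"generated .inc files"};
  g["generated .inc files"] = {"target backends"};
  return g;
}

// Breadth-first walk collecting everything downstream of `changed`.
std::set<std::string> dirtySet(const Graph &g, const std::string &changed) {
  std::set<std::string> dirty;
  std::queue<std::string> work;
  work.push(changed);
  while (!work.empty()) {
    const std::string node = work.front();
    work.pop();
    const auto it = g.find(node);
    if (it == g.end())
      continue; // leaf: nothing depends on it
    for (const std::string &next : it->second)
      if (dirty.insert(next).second) // visit each node only once
        work.push(next);
  }
  return dirty;
}
```

In this toy graph, a change to `Support/ELF.h` dirties only `LLVMObject`, while a change to `ADT/StringRef.h` still reaches `llvm-tblgen`, the generated files, and the backends, however Support is split.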