thr3ads.net - llvm dev - [llvm-dev] Should we split llvm Support and ADT? [Jun 2017]

Mehdi AMINI via llvm-dev

2017-May-29 17:22 UTC

[llvm-dev] Should we split llvm Support and ADT?

2017-05-29 9:25 GMT-07:00 Zachary Turner <zturner at google.com>:
> On Sun, May 28, 2017 at 8:54 PM Mehdi AMINI via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>> 2017-05-26 17:47 GMT-07:00 Zachary Turner via llvm-dev <
>> llvm-dev at lists.llvm.org>:
>>
>>> Changing a header file somewhere and having to spend 10 minutes waiting
>>> for a build leads to a lot of wasted developer time.
>>>
>>> The real culprit here is tablegen.  Can we split support and ADT into
>>> two - the parts that tablegen depends on and the parts that it doesn't?
>>>
>>
>> Splitting ADT just based on tablegen usage seems dubious to me. If we
>> need to go this route, I'd replace as many uses of ADT data structure with
>> STL ones to begin with to reduce the surface.
>>
>
> Tablegen would not need to determine WHERE to split, it would just
> motivate the why.
>
Well even the why :)
(note I was mentioning ADT and not Support above).


>   It's obvious just from looking at Support's include directory though
> that a lot of stuff in there doesn't belong together.  A quick look over
> the include directory already suggests a split into "broadly useful stuff"
> and "narrowly useful stuff"
>
I agree, Support is a mess IMO (we have target specific stuff here just for
the sake of sharing code with clang AFAIK) and I'm not sure anyone would
oppose splitting it. Ideally the way I would split it would be such that it
could (at some point) be useful outside of LLVM (just like ADT), so one
main criterion could be "could this component of Support be useful outside
of LLVM (and its subprojects)".


> Broadly useful stuff:
> AlignOf
> Allocator
> ArrayRecycler
> Atomic
> AtomicOrdering
> Capacity
> Casting
> Chrono
> circular_raw_ostream
> COM.h
> CommandLine.h
> Compiler.h
> ConvertUTF.h
> CrashRecoveryContext.h
> DataExtractor.h
> Debug.h
> Endian.h
> EndianStream.h
> Errc.h
> Errno.h
> Error.h
> ErrorHandling.h
> ErrorOr.h
> FileOutputBuffer.h
> FileSystem.h
> FileUtilities.h
> Format*.h
> GlobPattern.h
> Host.h
> JamCRC.h
> KnownBits.h
> LineIterator.h
> Locale.h
> ManagedStatic.h
> MathExtras.h
> MD5.h
> Memory.h
> MemoryBuffer.h
> Mutex.h
> MutexGuard.h
> NativeFormatting.h
> Options.h
> Parallel.h
> Path.h
> PointerLikeTypeTraits.h
> PrettyStackTrace.h
> Printable.h
> Process.h
> Program.h
> RandomNumberGenerator.h
> raw_os_ostream.h
> raw_ostream.h
> raw_sha1_ostream.h
> Recycler.h
> RecyclingAllocator.h
> Regex.h
> RWMutex.h
> SaveAndRestore.h
> ScaledNumber.h
> SHA1.h
> Signals.h
> StringPool.h
> StringSaver.h
> SwapByteOrder.h
> SystemUtils.h
> thread.h
> Threading.h
> ThreadLocal.h
> ThreadPool.h
> Timer.h
> TrailingObjects.h
> Unicode.h
> UnicodeCharRanges.h
> UniqueLock.h
> Watchdog.h
> Win64EH.h
> WindowsError.h
> xxhash.h
>
>
> Narrowly useful stuff:
> AArch64TargetParser.def
> ARMAttributeParser.h
> ARMBuildAttributes.h
> ARMEHABI.h
> ARMTargetParser.def
> ARMWinEH.h
> Binary*Stream*.h
> BlockFrequency.h
> BranchProbability.h
> CachePruning.h
> CBindingWrapping.h
> CodeGen.h
> CodeGenCWrappers.h
> COFF.h
> Compression.h
> DebugCounter.h
> DotGraphTraits.h
> Dwarf.def
> Dwarf.h
> DynamicLibrary.h
> ELF.h
> GCOV.h
> GenericDomTree.h
> GenericDomTreeConstruction.h
> GraphWriter.h
> LEB128.h
>
LEB128.h seems quite generic.


> LockFileManager.h
> LowLevelTypeImpl.h
> MachO.def
> MachO.h
> MipsABIFlags.h
> OnDiskHashTable.h
> PluginLoader.h
> Registry.h
> ScopedPrinter.h
> SMLoc.h
> Solaris.h
> SourceMgr.h
> SpecialCaseList.h
> TargetParser.h
> TargetRegistry.h
> TargetSelect.h
> TarWriter.h
> ToolOutputFile.h
> TrigramIndex.h
> TypeName.h
> Valgrind.h
> Wasm.h
> YAMLParser.h
> YAMLTraits.h
>
YAML Parser as well.

-- 
Mehdi



>
>
> So, as a very crude first attempt, you call the first group of stuff
> "Base", the second group "Support", and add a dependency from Support to
> Base.  This has nothing to do with tablegen, btw, and tablegen would still
> probably depend on Support even after this separation, but it makes sense
> even from a high level layering perspective (IMO)
>
>
>>
>> 2017-05-28 8:25 GMT-07:00 Krzysztof Parzyszek via llvm-dev <
>> llvm-dev at lists.llvm.org>:
>>
>>> On 5/26/2017 7:59 PM, Zachary Turner via llvm-dev wrote:
>>>
>>>> It's that TableGen depends on Support, so if you change one file in
>>>> support, support gets recompiled into a new static archive, which
>>>> triggers a rerun of tablegen on all the tablegen inputs, which is
>>>> extremely slow.
>>>>
>>>
>>> What exactly is extremely slow? In my experience TableGen itself takes a
>>> negligible amount of time compared to the rest of the build. This is
>>> particularly true in cases when something in Support or ADT is modified, as
>>> this usually triggers recompilation of large parts of LLVM.
>>>
>>
>> Tablegen built in debug is really slow though, I remember an out-of-tree
>> backend where running llvm-tblgen was taking up to 5 min per file!
>> The CMake option LLVM_OPTIMIZED_TABLEGEN helps a lot with this. Otherwise
>> LLVM_TABLEGEN is even more efficient, but it's a double-edged sword.
>>
>> But we could also use the diff-and-copy approach not on the tablegen
>> output but on the llvm-tblgen binary itself, that way we wouldn't re-run it
>> when it does not change itself (I'm not sure why CMake does not use this
>> strategy by default for any file including .o and .a?).
>>
>> --
>> Mehdi
>>
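
The diff-and-copy idea quoted above can be sketched roughly as follows. This is a minimal illustration, not LLVM's build logic: the link step writes a fresh file, and dependents only ever look at a second, stable copy that is overwritten when the bytes actually change. The helper names and the "tool.fresh" / "tool.stable" paths are hypothetical and chosen just for the example.

```python
import os
import shutil
import tempfile


def same_bytes(a: str, b: str) -> bool:
    """Compare two files by content (whole-file read; fine for a sketch)."""
    with open(a, "rb") as fa, open(b, "rb") as fb:
        return fa.read() == fb.read()


def copy_if_different(src: str, dst: str) -> bool:
    """Copy src over dst only when the contents differ.

    Returns True if dst was (re)written. When the bytes are identical,
    dst is left untouched, so its modification time does not change and
    a timestamp-based build tool has no reason to re-run dependents.
    """
    if os.path.exists(dst) and same_bytes(src, dst):
        return False
    shutil.copyfile(src, dst)
    return True


def demo() -> list:
    """Simulate three relinks of a tool and report which ones propagate."""
    results = []
    with tempfile.TemporaryDirectory() as tmp:
        fresh = os.path.join(tmp, "tool.fresh")    # hypothetical: output of the link step
        stable = os.path.join(tmp, "tool.stable")  # hypothetical: what build rules depend on
        for contents in (b"v1", b"v1", b"v2"):
            with open(fresh, "wb") as f:
                f.write(contents)
            results.append(copy_if_different(fresh, stable))
    return results
```

In this sketch the second "relink" produces identical bytes, so the stable copy is not touched and nothing downstream would need to re-run. The trade-off is an extra read and compare of the binary on every build.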

Zachary Turner via llvm-dev

2017-May-29 17:28 UTC

[llvm-dev] Should we split llvm Support and ADT?

YAML Parser is only used by things that need to parse YAML, which, while it
might seem generic in principle, is actually only used by a small number of
tools.  Besides, if you're bringing in YAML to parse something, then you're
probably linking against some library like libObject, which will in turn
link against Support (as I'm referring to it in this thread, not as it
exists today).

It might turn out though that in practice it would be hard to keep it out
of Base, because for example even Statistic uses it for dumping.  I did a
quick search and sadly it seems like a lot of libraries include YAML
support directly in the library rather than in the tool that dumps the
library.

That said, it would be nice if at all possible the barrier to entry for
getting something into Base were higher and stricter than the barrier to
entry for getting something into Support.  So whenever possible we should
err on the side of keeping stuff out of base.
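
One way to make that stricter barrier concrete is a mechanical layering check: Support may include Base, but Base must never include Support. The following is only a sketch under stated assumptions. The header-to-layer mapping and the include edges are invented for illustration (including the made-up "HypotheticalDump.h"); they are not the real LLVM include graph.

```python
# Hypothetical assignment of a few headers to the two proposed libraries.
LAYER_OF = {
    "Error.h": "Base",
    "raw_ostream.h": "Base",
    "HypotheticalDump.h": "Base",  # made-up header, for illustration only
    "YAMLParser.h": "Support",
    "SourceMgr.h": "Support",
}

# Invented include edges: header -> headers it includes.
INCLUDES = {
    "Error.h": ["raw_ostream.h"],
    "HypotheticalDump.h": ["raw_ostream.h", "YAMLParser.h"],
    "YAMLParser.h": ["SourceMgr.h", "Error.h"],
}


def layering_violations(layer_of, includes):
    """Return (header, included) pairs where a Base header includes a Support header."""
    violations = []
    for header, deps in sorted(includes.items()):
        if layer_of.get(header) != "Base":
            continue  # Support is allowed to depend on Base
        for dep in deps:
            if layer_of.get(dep) == "Support":
                violations.append((header, dep))
    return violations
```

With the toy data above, the only violation reported is the made-up Base header pulling in YAMLParser.h, which is the kind of dependency that would make YAML hard to keep out of Base.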


Michael Spencer via llvm-dev

2017-Jun-01 00:11 UTC

[llvm-dev] Should we split llvm Support and ADT?

On Mon, May 29, 2017 at 10:22 AM, Mehdi AMINI via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> I agree, Support is a mess IMO (we have target specific stuff here just
> for the sake of sharing code with clang AFAIK) and I'm not sure anyone
> would oppose splitting it. Ideally the way I would split it would be such
> that it could (at some point) be useful outside of LLVM (just like ADT), so
> one main criterion could be "could this component of Support be useful
> outside of LLVM (and its subprojects)".
>
While it would be nice to easily use parts of support outside of llvm (as
I've done on many occasions), I'm not sure the llvm project wants to
maintain and support a generic c++ library. Given that, I don't think use
outside of llvm is a great line to split on.

- Michael Spencer

Mehdi AMINI via llvm-dev

2017-Jun-01 04:36 UTC

[llvm-dev] Should we split llvm Support and ADT?

2017-05-31 17:11 GMT-07:00 Michael Spencer <bigcheesegs at gmail.com>:
> On Mon, May 29, 2017 at 10:22 AM, Mehdi AMINI via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>> I agree, Support is a mess IMO (we have target specific stuff here just
>> for the sake of sharing code with clang AFAIK) and I'm not sure anyone
>> would oppose splitting it. Ideally the way I would split it would be such
>> that it could (at some point) be useful outside of LLVM (just like ADT), so
>> one main criterion could be "could this component of Support be useful
>> outside of LLVM (and its subprojects)".
>>
>
> While it would be nice to easily use parts of support outside of llvm (as
> I've done on many occasions), I'm not sure the llvm project wants to
> maintain and support a generic c++ library.
>
Sorry, I'm not sure I follow you, because my impression is that we're
*already* maintaining a generic C++ library in-tree. The fact that the mess
that is libSupport today does not make it easy to split ADT (or other
generic utilities in libSupport) out-of-tree does not change that.

> Given that I don't think use outside of llvm is a great line to split on.
>
On the contrary, I think it is a valuable line because it matches a
conceptual layer.

-- 
Mehdi
