thr3ads.net - llvm dev - [llvm-dev] Contributing Bazel BUILD files similar to gn [Oct 2020]

Johannes Doerfert via llvm-dev

2020-Oct-30 03:10 UTC

[llvm-dev] Contributing Bazel BUILD files similar to gn

On 10/29/20 9:48 PM, Mehdi AMINI wrote:
> On Thu, Oct 29, 2020 at 7:30 PM Johannes Doerfert <johannesdoerfert at gmail.com> wrote:
>
>> I replied only selectively.
>>
>>
>> On 10/29/20 5:47 PM, Mehdi AMINI via llvm-dev wrote:
>>> On Thu, Oct 29, 2020 at 2:35 PM Chris Tetreault <ctetreau at quicinc.com>
>>> wrote:
>>>
>>>> Honestly, I’m hearing that some people would like the Bazel build system
>>>> to be in community master, and the argument basically boils down to “It’ll
>>>> be fine. It’ll just sit there and mind its own business and you don’t have
>>>> to care about it.”
>>>>
>>> Not really: this argument is only the answer to why it does not bear any
>>> weight on non-Bazel users, just like `gn` does already today.
>>>
>>> I think I explained the motivation to do it, but I can restate it: many
>>> LLVM contributors need to collaborate on this piece of infrastructure that
>>> is very specific to LLVM and enabling some users of LLVM: the natural place
>>> of collaboration is the monorepo.
>>>
>>>
>>>>> So why are we doing it? I mentioned this in another answer: this is
>>>> mainly to provide a collaboration space for the support of OSS projects
>>>> using Bazel interested to use LLVM (and some subprojects). …
>>>>
>>>>
>>>>
>>>> Which could be handled by having it in an external public repo.
>>>>
>>> Sure, just like almost every new code could be handled in an external repo.
>>> However when many LLVM contributors are interested to collaborate on
>>> something highly coupled to LLVM it seems like the natural place to do it.
>>> Also I don't know for Qualcomm, but most companies will want you to sign a
>>> CLA if they provide this "external repo" where we can collaborate, and
>>> other parties won't be able to collaborate. The LLVM project is in general
>>> seen as quite "neutral" for collaborating.
>>>
>>>
>>>>> Having them in-tree means that we can publish every day (or more) a git
>>>> hash that we validate with Bazel on private bots (like `gn`) and every
>>>> project can use to clone the LLVM monorepo and integrate in their build
>>>> flow easily.
>>>>
>>>>
>>>>
>>>> You could still publish this info: “Today, the head of llvm-bazel is
>>>> confirmed to work with LLVM monorepo sha [foo]”. I don’t think two git
>>>> clones is significantly harder than one.
>>>>
>>> For a developer at their desk, you could say it is just an inconvenience
>>> that can be worked around (scripting, etc.).
>>> For the project on the other hand, Bazel has native support to clone a repo
>>> and build it itself as dependency.  For example TensorFlow has many
>>> dependencies, and it just points to a commit in the source repo:
>>>
>>> https://github.com/tensorflow/tensorflow/blob/master/tensorflow/workspace.bzl#L689-L697
>>> You can see how it is convenient to update the SHA1 there and have it just
>>> work for any Bazel user.
>>>
>>>
>>>
>>>> I submit that in a way this is simpler because you can always advertise
>>>> the head of the bazel repo. If the Bazel build system were in the community
>>>> repo, then you might have to tell users to use an older version of the
>>>> bazel build if a fix went into the monorepo in the afternoon, but the next
>>>> morning’s nightly finds that the most recent sha that passes the tests is
>>>> prior to that fix.
>>>>
>>> This is not different from "a commit broke the ARM bootstrap and a user who
>>> checked out the repo at the time will be broken". From this point of view
>>> this configuration is no different than any other, except that we don't
>>> revert or notify the author of a breaking change, a set of volunteers
>>> monitor a silent bot and fix-forward as needed, like `gn`.
>>> It is just much easier to have a bot publishing the "known good" revision
>>> of the monorepo.
>>>
>>>
>>>> I guess my concern is that I’m not really hearing a compelling (to my ear)
>>>> argument for this inclusion.
>>>>
>>> Sure, but if other contributors have a strong interest, and you don't
>>> really have a strong objection here that we need to address, we should be
>>> able to get past that?
>> Wouldn't your argument hold for anything that "just lives" in the mono
>> repo but doesn't impact people? I mean, where is the line for stuff that
>> some contributors have "strong interest" in and others can't really
>> "hear a compelling argument for inclusion"? People raise concerns here
>> and from where I am sitting they are brushed over easily and more
>> aggressively as the thread progresses (up to the email I respond to).
>>
> Sorry, I invite you to reread the thread again and revisit your impression:
> Tom and Renato expressed clear concerns, and I believe I really tried to
> listen and address these with concrete proposals to mitigate:
> http://lists.llvm.org/pipermail/llvm-dev/2020-October/146182.html
> However there is not much I can do to address folks who object because
> "they don't see the interest" in it, this isn't a productive way of moving
> forward with such proposal IMO.
>
>
>
>>>> I guess it would make the lives of google employees easier?
>>>>
>>> I explained before that Google internal integration flow is likely better
>>> without this at the moment, TensorFlow itself is also in a reasonably good
>>> spot at the moment. But Google is also not a monolithic place, some people
>>> are working on small independent projects that they are open-sourcing, and
>>> would like to be able to use LLVM.
>>>
>>>>    Then what’s to stop every large org from committing their internal stuff
>>> to master?
>>>
>>>
>>>
>>> If their "internal stuff" is highly-coupled to LLVM, has zero-cost
>>> maintenance on the community, and is something that multiple other parties
>>> can benefit and established members of the community want to maintain and
>>> collaborate on, why not?
>> Let's be honest, nothing has "zero-cost".
>
> I hope you're not implying I'd be dishonest here right?
Long story short, I did not try to imply you were dishonest.

I'm saying that the sentence "has zero-cost maintenance on the community"
cannot be true in a general sense but only in a narrow one. I believe that
everything has cost. I added, "let's be honest", because the cost is not
obvious and one can easily overlook it. However, I assumed we all know
there has to be one as it would otherwise conflict with some universal
law or something. The way I see it you acknowledge the existence in a few
other places.


>
>> It seems unhelpful to pretend it does. (FWIW, I explained a simple
>> scenario that would make the bazel
>> inclusion "costly" in my previous mail.)
>>
> "zero-cost" is well defined: it is "as a community member: feel free to
> ignore, no one will bother you about it", and a subset of the community
> signed up for the maintenance.
> I think it is also helpful to be concrete here: we have existing data and
> history with `gn`, it isn't hypothetical.
>
> To be sure I address your previous email, that was about user expectations
> right? i.e. was it this part:
>
>> people will assume we (=the LLVM community) maintain(s) a bazel build,
> which can certainly be a benefit but also a cost", e.g., when the build is
> not properly maintained, support is scarce, etc. and emails come in
> complaining about it (not thinking of prior examples here.)
>
> Isn't this similar to the concerns from Renato here:
> http://lists.llvm.org/pipermail/llvm-dev/2020-October/146179.html ?
> I acknowledge this as very valid concerns and offered some possibility to
> mitigate: http://lists.llvm.org/pipermail/llvm-dev/2020-October/146188.html
>
>
>
>>
>>> I mentioned it before, but Bazel is not something internal or specific to
>>> Google: it isn't (actually there are many incompatibilities between Bazel
>>> and the internal system), 400 people attended the Bazel conference last
>>> year. I attended this conference 3 years ago when I was at Tesla trying to
>>> deploy Bazel internally. Many other companies are using Bazel, open-source
>>> projects as well. Feel free to watch the talks online about SpaceX
>>> <https://www.youtube.com/watch?v=t_3bckhV_YI> or Two Sigma and Uber
>>> <https://www.youtube.com/watch?v=_bPyEbAyC0s> for example
>> Let's not conflate "using bazel" and "benefit for LLVM", the former
>> is not up for debate here. (I mean, a lot of people use autoconf but
>> we got rid of it anyway).
>>
> I doubt we wouldn't have got rid of Autoconf if a chunk of the community
> offered to maintain it at "no cost" (again see definition).
It broke, ppl complained, and nobody wanted to fix it. That is the
kind of technical debt (aka. cost) you can accumulate.

>
>> That said, I think the original question is highly relevant. As I also
>> mentioned somewhere above, where do we draw the line is the key to this
>> RFC at the end of the day. A lot of the arguments I hear pro integration
>> apply to various other things that currently live out-of-tree, some of
>> which were proposed and not integrated.
>
> Can you provide more concrete reference to these things that could have
> been integrated in similar "zero cost" fashion?
> I'm all for consistency, and the only point of comparison here is `gn`.
Let's say RV, in a subfolder not built by default. Or any other
project that was proposed for inclusion without being built by
default. (I also remember the discussion of whether we can/should add
isl to llvm, pre-monorepo.)

>
>
>
>> I think we should not dismiss
>> this easily, no matter on which side of the argument you are this time.
>>
>> ~ Johannes
>>
>>
>>
>>>
>>> I'm not trying to convince anyone to use Bazel, it has
drawbacks, but the
>>> point here is to recognize that this is about OpenSource
communities that
>>> Bazel is serving: these are users, some of us in the LLVM community
are
>>> trying to provide these users with a reasonably good integration
story,
>> and
>>> we're ready to pay the cost for everyone.
>>>
>>>
>>>
>>>> *From:* Mehdi AMINI <joker.eph at gmail.com>
>>>> *Sent:* Thursday, October 29, 2020 2:00 PM
>>>> *To:* Chris Tetreault <ctetreau at quicinc.com>
>>>> *Cc:* Sterling Augustine <saugustine at google.com>;
Mehdi Amini <
>>>> aminim at google.com>; LLVM Dev <llvm-dev at
lists.llvm.org>; Stella
>> Laurenzo <
>>>> laurenzo at google.com>; Tres Popp <tpopp at
google.com>; Geoffrey
>> Martin-Noble
>>>> <gcmn at google.com>; Thomas Joerg <tjoerg at
google.com>
>>>> *Subject:* [EXT] Re: [llvm-dev] Contributing Bazel BUILD files
similar
>> to
>>>> gn
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On Thu, Oct 29, 2020 at 1:24 PM Chris Tetreault via llvm-dev
<
>>>> llvm-dev at lists.llvm.org> wrote:
>>>>
>>>> The problem is that once it’s in community LLVM, it becomes the
>>>> community’s problem.  The expectation is that individual
contributors do
>>>> not break anything in upstream.
>>>>
>>>>
>>>>
>>>> I would expect that the community by now has concrete
experience with
>> `gn`
>>>> gained over a few years demonstrating that this hasn't been
a problem to
>>>> have this in-tree, without a burden of support on the
community.
>>>>
>>>> In particular, I think that a salient point is the guarantee
that no
>>>> public bot would be testing it (I mean here by "no public
bot" that no
>> bot
>>>> would email you when you break it).
>>>>
>>>>
>>>>
>>>> Why else would you contribute it to the LLVM monorepo? If the
goal is
>> just
>>>> to enable external-to-google orgs to collaborate on it, why not
>> contribute
>>>> it as a new repo separate from LLVM? You wouldn’t need to ask
anybody’s
>>>> permission to do this.
>>>>
>>>>
>>>>
>>>> Yes, we could do this, and you are correct that in many cases a
>> motivation
>>>> to upstream a component is to make sure it is maintained by the
>> community
>>>> and works out of the box.
>>>>
>>>> In this case it is slightly different: we are OK with people to
break
>>>> this. We are already maintaining these files out-of-tree for
our own
>>>> purposes, and this has been the case for years as Sterling
mentions. I
>>>> would even suspect that for Google internal build integration,
it is
>>>> actually easier to have these files internal only rather than
>> unsupported
>>>> upstream.
>>>>
>>>> So why are we doing it? I mentioned this in another answer:
this is
>> mainly
>>>> to provide a collaboration space for the support of OSS
projects using
>>>> Bazel interested to use LLVM (and some subprojects).
>>>>
>>>> Having them in-tree means that we can publish every day (or
more) a git
>>>> hash that we validate with Bazel on private bots (like `gn`)
and every
>>>> project can use to clone the LLVM monorepo and integrate in
their build
>>>> flow easily. Another repo, submodules, etc. are not making this
>> possible /
>>>> practical.
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> *From:* Sterling Augustine <saugustine at google.com>
>>>> *Sent:* Thursday, October 29, 2020 1:14 PM
>>>> *To:* Chris Tetreault <ctetreau at quicinc.com>
>>>> *Cc:* Renato Golin <rengolin at gmail.com>; tstellar at
redhat.com; Mehdi
>> Amini
>>>> <aminim at google.com>; LLVM Dev <llvm-dev at
lists.llvm.org>; Stella
>> Laurenzo <
>>>> laurenzo at google.com>; Tres Popp <tpopp at
google.com>; Geoffrey
>> Martin-Noble
>>>> <gcmn at google.com>; Thomas Joerg <tjoerg at
google.com>
>>>> *Subject:* [EXT] Re: [llvm-dev] Contributing Bazel BUILD files
similar
>> to
>>>> gn
>>>>
>>>>
>>>>
>>>> On Thu, Oct 29, 2020 at 12:29 PM Chris Tetreault via llvm-dev
<
>>>> llvm-dev at lists.llvm.org> wrote:
>>>>
>>>> I think Renato has articulated quite well some concerns I have
about
>> this
>>>> but was unable to express. I would very much prefer if we just
focus on
>>>> using CMake effectively.
>>>>
>>>> ...
>>>>
>>>> For example, when trying to implement the same logic on both
will not be
>>>> trivial. So, whenever we want to add some functionality or
improve how
>> we
>>>> build LLVM with one system, we'll have to do so in multiple
build
>> systems
>>>> that do not easily match each other.
>>>>
>>>>
>>>>
>>>> Google already does all of this work, and has for years. I
think it is
>>>> fair to say that it hasn't been a burden on the community.
>>>>
>>>>
>>>>
>>>> If we don't try to match functionality, we'll segregate
the community,
>>>> because people will be able to do X on build system A but not
B, and the
>>>> similar features cluster together and then we have essentially
two
>> projects
>>>> built from the same source code.
>>>>
>>>>
>>>>
>>>> As long as we keep CMake as the canonical system everything
will be
>> fine.
>>>> It works perfectly well today, except that not everyone gets to
see or
>> use
>>>> the bazel files. They exist right now; they work right now; and
it
>> hasn't
>>>> been a burden on anyone but the people who care about bazel.
>>>>
>>>>
>>>>
>>>> Testing this, or worse, trying to fix a buildbot that is built
with
>> Bazel
>>>> (and having to install Java JDK and all its dependencies) on
>> potentially a
>>>> hardware that you do not have access to, will be a nightmare to
debug.
>> The
>>>> nature of post-commit testing, revert and review of LLVM will
not make
>> that
>>>> simpler. Unless we treat the Bazel build as "not our
problem" (which
>>>> defeats the point of having it?).
>>>>
>>>>
>>>>
>>>> Google makes it work like this today, with the rest of the
project
>>>> treating it as "not our problem" because they
don't even see that they
>>>> exist. The build bot issues would be real, but I think
surmountable,
>> given
>>>> that Google already cleans up the bazel files, it just
doesn't push
>> them.
>>>> Perhaps an explicit policy that cmake folks don't have to
update the
>> bazel
>>>> files would be helpful.
>>>>
>>>>
>>>>
>>>> To make matters worse, our CMake files are not simple, and do
not do all
>>>> of the things we want them to do in the way we understand
completely.
>> There
>>>> is a lot of kludge that we carry and with that comes in two
categories:
>> the
>>>> things that we hate and would love to fix, and the things that
are fixes
>>>> that we have no idea are there. The former are the reasons why
people
>> want
>>>> to start a new build system, the latter is why they soon
realise that
>> was a
>>>> mistake (insert XKCD joke here).
>>>>
>>>>
>>>>
>>>> It wouldn't be starting a new build system, it would be
making a
>>>> pre-existing, already extremely well functioning one, available
to more
>>>> people.
>>>>
>>>>
>>>>
>>>> I can definitely see folks who use cmake not wanting more
hassle--that
>> may
>>>> be a valid reason not to do it. But "it won't
work" or "it's hard to
>> keep
>>>> up" or "it's too complicated" seem well
refuted by a multi-year
>> existence
>>>> proof.
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> LLVM Developers mailing list
>>>> llvm-dev at lists.llvm.org
>>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>
>>>>

Mehdi AMINI via llvm-dev

2020-Oct-30 03:38 UTC

[llvm-dev] Contributing Bazel BUILD files similar to gn

On Thu, Oct 29, 2020 at 8:10 PM Johannes Doerfert <johannesdoerfert at gmail.com> wrote:
>
> On 10/29/20 9:48 PM, Mehdi AMINI wrote:
> > On Thu, Oct 29, 2020 at 7:30 PM Johannes Doerfert <
> > johannesdoerfert at gmail.com> wrote:
> >
> >>>>    Then what’s to stop every large org from committing their internal stuff
> >>> to master?
> >>>
> >>>
> >>>
> >>> If their "internal stuff" is highly-coupled to LLVM, has zero-cost
> >>> maintenance on the community, and is something that multiple other parties
> >>> can benefit and established members of the community want to maintain and
> >>> collaborate on, why not?
> >> Let's be honest, nothing has "zero-cost".
> >
> > I hope you're not implying I'd be dishonest here right?
>
> Long story short, I did not try to imply you were dishonest.
>
Yes, I know you :)
(actually I thought I included a wink smiley above, but apparently not,
sorry about that)

>
> I'm saying that the sentence "has zero-cost maintenance on the community"
> cannot be true in a general sense but only in a narrow one. I believe that
> everything has cost. I added, "let's be honest", because the cost is not
> obvious and one can easily overlook it. However, I assumed we all know
> there has to be one as it would otherwise conflict with some universal
> law or something. The way I see it you acknowledge the existence in a few
> other places.
>
>
>
> >
> >> It seems unhelpful to pretend it does. (FWIW, I explained a simple
> >> scenario that would make the bazel
> >> inclusion "costly" in my previous mail.)
> >>
> > "zero-cost" is well defined: it is "as a community member: feel free to
> > ignore, no one will bother you about it", and a subset of the community
> > signed up for the maintenance.
> > I think it is also helpful to be concrete here: we have existing data and
> > history with `gn`, it isn't hypothetical.
> >
> > To be sure I address your previous email, that was about user expectations
> > right? i.e. was it this part:
> >
> >> people will assume we (=the LLVM community) maintain(s) a bazel build,
> > which can certainly be a benefit but also a cost", e.g., when the build is
> > not properly maintained, support is scarce, etc. and emails come in
> > complaining about it (not thinking of prior examples here.)
> >
> > Isn't this similar to the concerns from Renato here:
> > http://lists.llvm.org/pipermail/llvm-dev/2020-October/146179.html ?
> > I acknowledge this as very valid concerns and offered some possibility to
> > mitigate: http://lists.llvm.org/pipermail/llvm-dev/2020-October/146188.html
> >
> >
> >
> >>
> >>> I mentioned it before, but Bazel is not something internal or specific to
> >>> Google: it isn't (actually there are many incompatibilities between Bazel
> >>> and the internal system), 400 people attended the Bazel conference last
> >>> year. I attended this conference 3 years ago when I was at Tesla trying to
> >>> deploy Bazel internally. Many other companies are using Bazel, open-source
> >>> projects as well. Feel free to watch the talks online about SpaceX
> >>> <https://www.youtube.com/watch?v=t_3bckhV_YI> or Two Sigma and Uber
> >>> <https://www.youtube.com/watch?v=_bPyEbAyC0s> for example
> >> Let's not conflate "using bazel" and "benefit for LLVM", the former
> >> is not up for debate here. (I mean, a lot of people use autoconf but
> >> we got rid of it anyway).
> >>
> > I doubt we wouldn't have got rid of Autoconf if a chunk of the community
> > offered to maintain it at "no cost" (again see definition).
>
> It broke, ppl complained, and nobody wanted to fix it. That is the
> kind of technical debt (aka. cost) you can accumulate.
>
>
> >
> >> That said, I think the original question is highly relevant. As I also
> >> mentioned somewhere above, where do we draw the line is the key to this
> >> RFC at the end of the day. A lot of the arguments I hear pro integration
> >> apply to various other things that currently live out-of-tree, some of
> >> which were proposed and not integrated.
> >
> > Can you provide more concrete reference to these things that could have
> > been integrated in similar "zero cost" fashion?
> > I'm all for consistency, and the only point of comparison here is `gn`.
>
> Let's say RV, in a subfolder not build by default.

I don't know what RV is?

> Or any other
> project that was proposed for inclusion without being build by
> default. (I remember also the discussion if we can/should add
> isl to llvm, pre-mono repo.)
>
I am not sure I agree that we can compare new "projects" (or something like
ISL) with "utilities for LLVM users".
To me, a more comparable situation would be:
- the gdb scripts in llvm/utils/gdb-scripts/prettyprinters.py
- the IDE visualizers in llvm/utils/LLVMVisualizers
- the Visual Studio Code syntax highlighting for LLVM IR and TableGen in
  llvm/utils/vscode; and similar for kate, jedit, vim, textmate, ...
- the gn files in llvm/utils/gn

The general theme here is that these are not "new projects" in themselves:
they are highly coupled to LLVM itself and only allow a specific subset of
users to plug their tool/workflow into LLVM at a given revision.
Also all of these are "zero cost" in that they may be "broken" and are
maintained on a best-effort basis (I don't think we revert someone breaking
any of the visualizers or syntax highlighters?). And none of these are really
core to LLVM, and each could be in a separate repo where the interested
parties could maintain it.

Best,

-- 
Mehdi


>
>
> >
> >
> >
> >> I think we should not dismiss
> >> this easily, no matter on which side of the argument you are this
time.
> >>
> >> ~ Johannes
> >>
> >>
> >>
> >>>
> >>> I'm not trying to convince anyone to use Bazel, it has
drawbacks, but
> the
> >>> point here is to recognize that this is about OpenSource
communities
> that
> >>> Bazel is serving: these are users, some of us in the LLVM
community are
> >>> trying to provide these users with a reasonably good
integration story,
> >> and
> >>> we're ready to pay the cost for everyone.
> >>>
> >>>
> >>>
> >>>> *From:* Mehdi AMINI <joker.eph at gmail.com>
> >>>> *Sent:* Thursday, October 29, 2020 2:00 PM
> >>>> *To:* Chris Tetreault <ctetreau at quicinc.com>
> >>>> *Cc:* Sterling Augustine <saugustine at google.com>;
Mehdi Amini <
> >>>> aminim at google.com>; LLVM Dev <llvm-dev at
lists.llvm.org>; Stella
> >> Laurenzo <
> >>>> laurenzo at google.com>; Tres Popp <tpopp at
google.com>; Geoffrey
> >> Martin-Noble
> >>>> <gcmn at google.com>; Thomas Joerg <tjoerg at
google.com>
> >>>> *Subject:* [EXT] Re: [llvm-dev] Contributing Bazel BUILD
files similar
> >> to
> >>>> gn
> >>>>
> >>>>
> >>>>
> >>>>
> >>>>
> >>>>
> >>>>
> >>>> On Thu, Oct 29, 2020 at 1:24 PM Chris Tetreault via
llvm-dev <
> >>>> llvm-dev at lists.llvm.org> wrote:
> >>>>
> >>>> The problem is that once it’s in community LLVM, it
becomes the
> >>>> community’s problem.  The expectation is that individual
contributors
> do
> >>>> not break anything in upstream.
> >>>>
> >>>>
> >>>>
> >>>> I would expect that the community by now has concrete
experience with
> >> `gn`
> >>>> gained over a few years demonstrating that this hasn't
been a problem
> to
> >>>> have this in-tree, without a burden of support on the
community.
> >>>>
> >>>> In particular, I think that a salient point is the
guarantee that no
> >>>> public bot would be testing it (I mean here by "no
public bot" that no
> >> bot
> >>>> would email you when you break it).
> >>>>
> >>>>
> >>>>
> >>>> Why else would you contribute it to the LLVM monorepo? If
the goal is
> >> just
> >>>> to enable external-to-google orgs to collaborate on it,
why not
> >> contribute
> >>>> it as a new repo separate from LLVM? You wouldn’t need to
ask
> anybody’s
> >>>> permission to do this.
> >>>>
> >>>>
> >>>>
> >>>> Yes, we could do this, and you are correct that in many
cases a
> >> motivation
> >>>> to upstream a component is to make sure it is maintained
by the
> >> community
> >>>> and works out of the box.
> >>>>
> >>>> In this case it is slightly different: we are OK with
people to break
> >>>> this. We are already maintaining these files out-of-tree
for our own
> >>>> purposes, and this has been the case for years as Sterling
mentions. I
> >>>> would even suspect that for Google internal build
integration, it is
> >>>> actually easier to have these files internal only rather
than
> >> unsupported
> >>>> upstream.
> >>>>
> >>>> So why are we doing it? I mentioned this in another
answer: this is
> >> mainly
> >>>> to provide a collaboration space for the support of OSS
projects using
> >>>> Bazel interested to use LLVM (and some subprojects).
> >>>>
> >>>> Having them in-tree means that we can publish every day (or more) a git
> >>>> hash that we validate with Bazel on private bots (like `gn`) and every
> >>>> project can use to clone the LLVM monorepo and integrate in their build
> >>>> flow easily. Another repo, submodules, etc. are not making this possible /
> >>>> practical.
> >>>>
> >>>>
> >>>>
> >>>>
> >>>>
> >>>>
> >>>>
> >>>>
> >>>>
> >>>> *From:* Sterling Augustine <saugustine at google.com>
> >>>> *Sent:* Thursday, October 29, 2020 1:14 PM
> >>>> *To:* Chris Tetreault <ctetreau at quicinc.com>
> >>>> *Cc:* Renato Golin <rengolin at gmail.com>; tstellar at redhat.com;
> >>>> Mehdi Amini <aminim at google.com>; LLVM Dev <llvm-dev at lists.llvm.org>;
> >>>> Stella Laurenzo <laurenzo at google.com>; Tres Popp <tpopp at google.com>;
> >>>> Geoffrey Martin-Noble <gcmn at google.com>; Thomas Joerg <tjoerg at google.com>
> >>>> *Subject:* [EXT] Re: [llvm-dev] Contributing Bazel BUILD files similar to gn
> >>>>
> >>>>
> >>>>
> >>>> On Thu, Oct 29, 2020 at 12:29 PM Chris Tetreault via
llvm-dev <
> >>>> llvm-dev at lists.llvm.org> wrote:
> >>>>
> >>>> I think Renato has articulated quite well some concerns I
have about
> >> this
> >>>> but was unable to express. I would very much prefer if we
just focus
> on
> >>>> using CMake effectively.
> >>>>
> >>>> ...
> >>>>
> >>>> For example, when trying to implement the same logic on
both will not
> be
> >>>> trivial. So, whenever we want to add some functionality or
improve how
> >> we
> >>>> build LLVM with one system, we'll have to do so in
multiple build
> >> systems
> >>>> that do not easily match each other.
> >>>>
> >>>>
> >>>>
> >>>> Google already does all of this work, and has for years. I
think it is
> >>>> fair to say that it hasn't been a burden on the
community.
> >>>>
> >>>>
> >>>>
> >>>> If we don't try to match functionality, we'll
segregate the community,
> >>>> because people will be able to do X on build system A but
not B, and
> the
> >>>> similar features cluster together and then we have
essentially two
> >> projects
> >>>> built from the same source code.
> >>>>
> >>>>
> >>>>
> >>>> As long as we keep CMake as the canonical system
everything will be
> >> fine.
> >>>> It works perfectly well today, except that not everyone
gets to see or
> >> use
> >>>> the bazel files. They exist right now; they work right
now; and it
> >> hasn't
> >>>> been a burden on anyone but the people who care about
bazel.
> >>>>
> >>>>
> >>>>
> >>>> Testing this, or worse, trying to fix a buildbot that is
built with
> >> Bazel
> >>>> (and having to install Java JDK and all its dependencies)
on
> >> potentially a
> >>>> hardware that you do not have access to, will be a
nightmare to debug.
> >> The
> >>>> nature of post-commit testing, revert and review of LLVM
will not make
> >> that
> >>>> simpler. Unless we treat the Bazel build as "not our
problem" (which
> >>>> defeats the point of having it?).
> >>>>
> >>>>
> >>>>
> >>>> Google makes it work like this today, with the rest of the
project
> >>>> treating it as "not our problem" because they
don't even see that they
> >>>> exist. The build bot issues would be real, but I think
surmountable,
> >> given
> >>>> that Google already cleans up the bazel files, it just
doesn't push
> >> them.
> >>>> Perhaps an explicit policy that cmake folks don't have
to update the
> >> bazel
> >>>> files would be helpful.
> >>>>
> >>>>
> >>>>
> >>>> To make matters worse, our CMake files are not simple, and
do not do
> all
> >>>> of the things we want them to do in the way we understand
completely.
> >> There
> >>>> is a lot of kludge that we carry and with that comes in
two
> categories:
> >> the
> >>>> things that we hate and would love to fix, and the things
that are
> fixes
> >>>> that we have no idea are there. The former are the reasons
why people
> >> want
> >>>> to start a new build system, the latter is why they soon
realise that
> >> was a
> >>>> mistake (insert XKCD joke here).
> >>>>
> >>>>
> >>>>
> >>>> It wouldn't be starting a new build system, it would
be making a
> >>>> pre-existing, already extremely well functioning one,
available to
> more
> >>>> people.
> >>>>
> >>>>
> >>>>
> >>>> I can definitely see folks who use cmake not wanting more
hassle--that
> >> may
> >>>> be a valid reason not to do it. But "it won't
work" or "it's hard to
> >> keep
> >>>> up" or "it's too complicated" seem well
refuted by a multi-year
> >> existence
> >>>> proof.
> >>>>
> >>>>
> >>>>
> >>>> _______________________________________________
> >>>> LLVM Developers mailing list
> >>>> llvm-dev at lists.llvm.org
> >>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> >>>>
> >>>>

Eric Christopher via llvm-dev

2020-Oct-30 03:48 UTC


[llvm-dev] Contributing Bazel BUILD files similar to gn

On Thu, Oct 29, 2020 at 11:39 PM Mehdi AMINI via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
>
>
> On Thu, Oct 29, 2020 at 8:10 PM Johannes Doerfert <
> johannesdoerfert at gmail.com> wrote:
>
>>
>> On 10/29/20 9:48 PM, Mehdi AMINI wrote:
>> > On Thu, Oct 29, 2020 at 7:30 PM Johannes Doerfert <
>> > johannesdoerfert at gmail.com> wrote:
>> >
>> >> I replied only selectively.
>> >>
>> >>
>> >> On 10/29/20 5:47 PM, Mehdi AMINI via llvm-dev wrote:
>> >>> On Thu, Oct 29, 2020 at 2:35 PM Chris Tetreault <ctetreau at quicinc.com>
>> >>> wrote:
>> >>>
>> >>>> Honestly, I’m hearing that some people would like the
Bazel build
>> system
>> >>>> to be in community master, and the argument basically
boils down to
>> >> “It’ll
>> >>>> be fine. It’ll just sit there and mind its own
business and you don’t
>> >> have
>> >>>> to care about it.”
>> >>>>
>> >>> Not really: this argument is only the answer to why it
does not bear
>> any
>> >>> weight on non-Bazel users, just like `gn` does already
today.
>> >>>
>> >>> I think I explained the motivation to do it, but I can
restate it:
>> many
>> >>> LLVM contributors need to collaborate on this piece of
infrastructure
>> >> that
>> >>> is very specific to LLVM and enabling some users of LLVM:
the natural
>> >> place
>> >>> of collaboration is the monorepo.
>> >>>
>> >>>
>> >>>>> So why are we doing it? I mentioned this in
another answer: this is
>> >>>> mainly to provide a collaboration space for the
support of OSS
>> projects
>> >>>> using Bazel interested to use LLVM (and some
subprojects). …
>> >>>>
>> >>>>
>> >>>>
>> >>>> Which could be handled by having it in an external
public repo.
>> >>>>
>> >>> Sure, just like almost every new code could be handled in
an external
>> >> repo.
>> >>> However when many LLVM contributors are interested to
collaborate on
>> >>> something highly coupled to LLVM it seems like the natural
place to do
>> >> it.
>> >>> Also I don't know for Qualcomm, but most companies
will want you to
>> sign
>> >> a
>> >>> CLA if they provide this "external repo" where
we can collaborate, and
>> >>> other parties won't be able to collaborate. The LLVM
project is in
>> >> general
>> >>> seen as quite "neutral" for collaborating.
>> >>>
>> >>>
>> >>>>> Having them in-tree means that we can publish
every day (or more) a
>> git
>> >>>> hash that we validate with Bazel on private bots (like
`gn`) and
>> every
>> >>>> project can use to clone the LLVM monorepo and
integrate in their
>> build
>> >>>> flow easily.
>> >>>>
>> >>>>
>> >>>>
>> >>>> You could still publish this info: “Today, the head of
llvm-bazel is
>> >>>> confirmed to work with LLVM monorepo sha [foo]”. I
don’t think two
>> git
>> >>>> clones is significantly harder than one.
>> >>>>
>> >>> For a developer at their desk, you could say it is just an inconvenience
>> >>> that can be worked around (scripting, etc.).
>> >>> For the project on the other hand, Bazel has native support to clone a repo
>> >>> and build it itself as a dependency. For example TensorFlow has many
>> >>> dependencies, and it just points to a commit in the source repo:
>> >>>
>> >>> https://github.com/tensorflow/tensorflow/blob/master/tensorflow/workspace.bzl#L689-L697
>> >>>
>> >>> You can see how it is convenient to update the SHA1 there and have it just
>> >>> work for any Bazel user.
>> >>>
>> >>>
>> >>>
>> >>>> I submit that in a way this is simpler because you can
always
>> advertise
>> >>>> the head of the bazel repo. If the Bazel build system
were in the
>> >> community
>> >>>> repo, then you might have to tell users to use an
older version of
>> the
>> >>>> bazel build if a fix went into the monorepo in the
afternoon, but the
>> >> next
>> >>>> morning’s nightly finds that the most recent sha that
passes the
>> tests
>> >> is
>> >>>> prior to that fix.
>> >>>>
>> >>> This is not different from "a commit broke the ARM
bootstrap and a
>> user
>> >> who
>> >>> checked out the repo at the time will be broken".
From this point of
>> view
>> >>> this configuration is no different than any other, except
that we
>> don't
>> >>> revert or notify the author of a breaking change, a set of
volunteers
>> >>> monitor a silent bot and fix-forward as needed, like `gn`.
>> >>> It is just much easier to have a bot publishing the
"known good"
>> revision
>> >>> of the monorepo.
>> >>>
>> >>>
>> >>>> I guess my concern is that I’m not really hearing a
compelling (to my
>> >> ear)
>> >>>> argument for this inclusion.
>> >>>>
>> >>> Sure, but if other contributors have a strong interest,
and you don't
>> >>> really have a strong objection here that we need to
address, we
>> should be
>> >>> able to get past that?
>> >> Wouldn't your argument hold for anything that "just lives" in the mono
>> >> repo but doesn't impact people? I mean, where is the line for stuff that
>> >> some contributors have "strong interest" in and others can't really
>> >> "hear a compelling argument for inclusion"? People raise concerns here
>> >> and from where I am sitting they are brushed over easily and more
>> >> aggressively as the thread progresses (up to the email I respond to).
>> >>
>> > Sorry, I invite you to reread the thread again and revisit your impression:
>> > Tom and Renato expressed clear concerns, and I believe I really tried to
>> > listen and address these with concrete proposals to mitigate:
>> > http://lists.llvm.org/pipermail/llvm-dev/2020-October/146182.html
>> > However there is not much I can do to address folks who object because
>> > "they don't see the interest" in it, this isn't a productive way of moving
>> > forward with such proposal IMO.
>> >
>> >
>> >
>> >>>> I guess it would make the lives of google employees
easier?
>> >>>>
>> >>> I explained before that Google internal integration flow
is likely
>> better
>> >>> without this at the moment, TensorFlow itself is also in a
reasonably
>> >> good
>> >>> spot at the moment. But Google is also not a monolithic
place, some
>> >> people
>> >>> are working on small independent projects that they are
open-sourcing,
>> >> and
>> >>> would like to be able to use LLVM.
>> >>>
>> >>>>    Then what’s to stop every large org from committing
their internal
>> >> stuff
>> >>> to master?
>> >>>
>> >>>
>> >>>
>> >>> If their "internal stuff" is highly-coupled to
LLVM, has zero-cost
>> >>> maintenance on the community, and is something that
multiple other
>> >> parties
>> >>> can benefit and established members of the community want
to maintain
>> and
>> >>> collaborate on, why not?
>> >> Let's be honest, nothing has "zero-cost".
>> >
>> > I hope you're not implying I'd be dishonest here right?
>>
>> Long story short, I did not try to imply you were dishonest.
>>
>
> Yes, I know you :)
> (actually I thought I included a wink smiley above, but apparently not,
> sorry about that)
>
>
>>
>> I'm saying that the sentence "has zero-cost maintenance on the community"
>> cannot be true in a general sense but only in a narrow one. I believe that
>> everything has cost. I added, "let's be honest", because the cost is not
>> obvious and one can easily overlook it. However, I assumed we all know
>> there has to be one as it would otherwise conflict with some universal
>> law or something. The way I see it you acknowledge the existence in a few
>> other places.
>>
>>
>>
>> >
>> >> It seems unhelpful to pretend it does. (FWIW, I explained a simple
>> >> scenario that would make the bazel
>> >> inclusion "costly" in my previous mail.)
>> >>
>> > "zero-cost" is well defined: it is "as a community member: feel free to
>> > ignore, no one will bother you about it", and a subset of the community
>> > signed up for the maintenance.
>> > I think it is also helpful to be concrete here: we have existing data and
>> > history with `gn`, it isn't hypothetical.
>> >
>> > To be sure I address your previous email, that was about user expectations
>> > right? i.e. was it this part:
>> >
>> >> people will assume we (=the LLVM community) maintain(s) a bazel build,
>> > which can certainly be a benefit but also a cost", e.g., when the build is
>> > not properly maintained, support is scarce, etc. and emails come in
>> > complaining about it (not thinking of prior examples here.)
>> >
>> > Isn't this similar to the concerns from Renato here:
>> > http://lists.llvm.org/pipermail/llvm-dev/2020-October/146179.html ?
>> > I acknowledge these as very valid concerns and offered some possibility to
>> > mitigate: http://lists.llvm.org/pipermail/llvm-dev/2020-October/146188.html
>> >
>> >
>> >
>> >>
>> >>> I mentioned it before, but Bazel is not something internal or specific to
>> >>> Google: it isn't (actually there are many incompatibilities between Bazel
>> >>> and the internal system), 400 people attended the Bazel conference last
>> >>> year. I attended this conference 3 years ago when I was at Tesla trying to
>> >>> deploy Bazel internally. Many other companies are using Bazel,
>> >>> open-source projects as well. Feel free to watch the talks online about
>> >>> SpaceX <https://www.youtube.com/watch?v=t_3bckhV_YI> or Two Sigma and Uber
>> >>> <https://www.youtube.com/watch?v=_bPyEbAyC0s> for example
>> >> Let's not conflate "using bazel" and "benefit for LLVM", the former
>> >> is not up for debate here. (I mean, a lot of people use autoconf but
>> >> we got rid of it anyway).
>> >>
>> > I doubt we would have got rid of Autoconf if a chunk of the community
>> > had offered to maintain it at "no cost" (again see definition).
>>
>> It broke, ppl complained, and nobody wanted to fix it. That is the
>> kind of technical debt (aka. cost) you can accumulate.
>>
>>
>> >
>> >> That said, I think the original question is highly relevant. As I also
>> >> mentioned somewhere above, where do we draw the line is the key to this
>> >> RFC at the end of the day. A lot of the arguments I hear pro integration
>> >> apply to various other things that currently live out-of-tree, some of
>> >> which were proposed and not integrated.
>> >
>> > Can you provide more concrete reference to these things that could have
>> > been integrated in similar "zero cost" fashion?
>> > I'm all for consistency, and the only point of comparison here is `gn`.
>>
>> Let's say RV, in a subfolder not built by default.
>
>
> I don't know what RV is?
>
>
>> Or any other
>> project that was proposed for inclusion without being built by
>> default. (I remember also the discussion if we can/should add
>> isl to llvm, pre-mono repo.)
>>
>
> I am not sure I agree that we can compare new "projects" (or something
> like ISL) with "utilities for LLVM users".
> I would expect a more comparable situation to me to be:
> - the gdb scripts in llvm/utils/gdb-scripts/prettyprinters.py
> - IDE visualizer in llvm/utils/LLVMVisualizers
> - The Visual Studio Code syntax highlighting for LLVM IR and TableGen in
>   llvm/utils/vscode ; and similar for kate, jedit, vim, textmate, ...
> - the gn files in llvm/utils/gn
>
> The general theme here is that these are not "new projects" in themselves:
> they are highly coupled to LLVM itself and only allow a specific subset of
> users to plug their tool/workflow into LLVM at a given revision.
> Also all of these are "zero cost" in that they may be "broken" and
> maintained with best effort (I don't think we revert someone breaking any
> of the visualizer or syntax highlighter?). And none of these are really
> core to LLVM, and each could be in a separate repo where the interested
> parties could maintain it.
>

FWIW this is the compelling argument to me and why I've also relaxed
significantly over the years on this sort of thing. It's cheap to remove
code when it's not useful anymore and we can do our best with documentation
to handle a lot of the new user problems (as Eric Astor mentioned).
Probably even a README in the various directories making sure that people
know that cmake is the supported system "just in case they didn't read the
documentation" :)

-eric

> Best,
>
> --
> Mehdi
>
>
>
>>
>>
>> >
>> >
>> >
>> >> I think we should not dismiss
>> >> this easily, no matter on which side of the argument you are
this time.
>> >>
>> >> ~ Johannes
>> >>
>> >>
>> >>
>> >>>
>> >>> I'm not trying to convince anyone to use Bazel, it has
drawbacks, but
>> the
>> >>> point here is to recognize that this is about OpenSource
communities
>> that
>> >>> Bazel is serving: these are users, some of us in the LLVM
community
>> are
>> >>> trying to provide these users with a reasonably good
integration
>> story,
>> >> and
>> >>> we're ready to pay the cost for everyone.
>> >>>
>> >>>
>> >>>
>> >>>> *From:* Mehdi AMINI <joker.eph at gmail.com>
>> >>>> *Sent:* Thursday, October 29, 2020 2:00 PM
>> >>>> *To:* Chris Tetreault <ctetreau at quicinc.com>
>> >>>> *Cc:* Sterling Augustine <saugustine at
google.com>; Mehdi Amini <
>> >>>> aminim at google.com>; LLVM Dev <llvm-dev at
lists.llvm.org>; Stella
>> >> Laurenzo <
>> >>>> laurenzo at google.com>; Tres Popp <tpopp at
google.com>; Geoffrey
>> >> Martin-Noble
>> >>>> <gcmn at google.com>; Thomas Joerg <tjoerg at
google.com>
>> >>>> *Subject:* [EXT] Re: [llvm-dev] Contributing Bazel
BUILD files
>> similar
>> >> to
>> >>>> gn
>> >>>>
>> >>>>
>> >>>>
>> >>>>
>> >>>>
>> >>>>
>> >>>>
>> >>>> On Thu, Oct 29, 2020 at 1:24 PM Chris Tetreault via
llvm-dev <
>> >>>> llvm-dev at lists.llvm.org> wrote:
>> >>>>
>> >>>> The problem is that once it’s in community LLVM, it
becomes the
>> >>>> community’s problem.  The expectation is that
individual
>> contributors do
>> >>>> not break anything in upstream.
>> >>>>
>> >>>>
>> >>>>
>> >>>> I would expect that the community by now has concrete
experience with
>> >> `gn`
>> >>>> gained over a few years demonstrating that this
hasn't been a
>> problem to
>> >>>> have this in-tree, without a burden of support on the
community.
>> >>>>
>> >>>> In particular, I think that a salient point is the
guarantee that no
>> >>>> public bot would be testing it (I mean here by
"no public bot" that
>> no
>> >> bot
>> >>>> would email you when you break it).
>> >>>>
>> >>>>
>> >>>>
>> >>>> Why else would you contribute it to the LLVM monorepo?
If the goal is
>> >> just
>> >>>> to enable external-to-google orgs to collaborate on
it, why not
>> >> contribute
>> >>>> it as a new repo separate from LLVM? You wouldn’t need
to ask
>> anybody’s
>> >>>> permission to do this.
>> >>>>
>> >>>>
>> >>>>
>> >>>> Yes, we could do this, and you are correct that in
many cases a
>> >> motivation
>> >>>> to upstream a component is to make sure it is
maintained by the
>> >> community
>> >>>> and works out of the box.
>> >>>>
>> >>>> In this case it is slightly different: we are OK with
people to break
>> >>>> this. We are already maintaining these files
out-of-tree for our own
>> >>>> purposes, and this has been the case for years as
Sterling mentions.
>> I
>> >>>> would even suspect that for Google internal build
integration, it is
>> >>>> actually easier to have these files internal only
rather than
>> >> unsupported
>> >>>> upstream.
>> >>>>
>> >>>> So why are we doing it? I mentioned this in another
answer: this is
>> >> mainly
>> >>>> to provide a collaboration space for the support of
OSS projects
>> using
>> >>>> Bazel interested to use LLVM (and some subprojects).
>> >>>>
>> >>>> Having them in-tree means that we can publish every
day (or more) a
>> git
>> >>>> hash that we validate with Bazel on private bots (like
`gn`) and
>> every
>> >>>> project can use to clone the LLVM monorepo and
integrate in their
>> build
>> >>>> flow easily. Another repo, submodules, etc. are not
making this
>> >> possible /
>> >>>> practical.
>> >>>>
>> >>>>
>> >>>>
>> >>>>
>> >>>>
>> >>>>
>> >>>>
>> >>>>
>> >>>>
>> >>>> *From:* Sterling Augustine <saugustine at
google.com>
>> >>>> *Sent:* Thursday, October 29, 2020 1:14 PM
>> >>>> *To:* Chris Tetreault <ctetreau at quicinc.com>
>> >>>> *Cc:* Renato Golin <rengolin at gmail.com>;
tstellar at redhat.com; Mehdi
>> >> Amini
>> >>>> <aminim at google.com>; LLVM Dev <llvm-dev at
lists.llvm.org>; Stella
>> >> Laurenzo <
>> >>>> laurenzo at google.com>; Tres Popp <tpopp at
google.com>; Geoffrey
>> >> Martin-Noble
>> >>>> <gcmn at google.com>; Thomas Joerg <tjoerg at
google.com>
>> >>>> *Subject:* [EXT] Re: [llvm-dev] Contributing Bazel
BUILD files
>> similar
>> >> to
>> >>>> gn
>> >>>>
>> >>>>
>> >>>>
>> >>>> On Thu, Oct 29, 2020 at 12:29 PM Chris Tetreault via
llvm-dev <
>> >>>> llvm-dev at lists.llvm.org> wrote:
>> >>>>
>> >>>> I think Renato has articulated quite well some
concerns I have about
>> >> this
>> >>>> but was unable to express. I would very much prefer if
we just focus
>> on
>> >>>> using CMake effectively.
>> >>>>
>> >>>> ...
>> >>>>
>> >>>> For example, when trying to implement the same logic
on both will
>> not be
>> >>>> trivial. So, whenever we want to add some
functionality or improve
>> how
>> >> we
>> >>>> build LLVM with one system, we'll have to do so in
multiple build
>> >> systems
>> >>>> that do not easily match each other.
>> >>>>
>> >>>>
>> >>>>
>> >>>> Google already does all of this work, and has for
years. I think it
>> is
>> >>>> fair to say that it hasn't been a burden on the
community.
>> >>>>
>> >>>>
>> >>>>
>> >>>> If we don't try to match functionality, we'll
segregate the
>> community,
>> >>>> because people will be able to do X on build system A
but not B, and
>> the
>> >>>> similar features cluster together and then we have
essentially two
>> >> projects
>> >>>> built from the same source code.
>> >>>>
>> >>>>
>> >>>>
>> >>>> As long as we keep CMake as the canonical system
everything will be
>> >> fine.
>> >>>> It works perfectly well today, except that not
everyone gets to see
>> or
>> >> use
>> >>>> the bazel files. They exist right now; they work right
now; and it
>> >> hasn't
>> >>>> been a burden on anyone but the people who care about
bazel.
>> >>>>
>> >>>>
>> >>>>
>> >>>> Testing this, or worse, trying to fix a buildbot that
is built with
>> >> Bazel
>> >>>> (and having to install Java JDK and all its
dependencies) on
>> >> potentially a
>> >>>> hardware that you do not have access to, will be a
nightmare to
>> debug.
>> >> The
>> >>>> nature of post-commit testing, revert and review of
LLVM will not
>> make
>> >> that
>> >>>> simpler. Unless we treat the Bazel build as "not
our problem" (which
>> >>>> defeats the point of having it?).
>> >>>>
>> >>>>
>> >>>>
>> >>>> Google makes it work like this today, with the rest of
the project
>> >>>> treating it as "not our problem" because
they don't even see that
>> they
>> >>>> exist. The build bot issues would be real, but I think
surmountable,
>> >> given
>> >>>> that Google already cleans up the bazel files, it just
doesn't push
>> >> them.
>> >>>> Perhaps an explicit policy that cmake folks don't
have to update the
>> >> bazel
>> >>>> files would be helpful.
>> >>>>
>> >>>>
>> >>>>
>> >>>> To make matters worse, our CMake files are not simple,
and do not do
>> all
>> >>>> of the things we want them to do in the way we
understand completely.
>> >> There
>> >>>> is a lot of kludge that we carry and with that comes
in two
>> categories:
>> >> the
>> >>>> things that we hate and would love to fix, and the
things that are
>> fixes
>> >>>> that we have no idea are there. The former are the
reasons why people
>> >> want
>> >>>> to start a new build system, the latter is why they
soon realise that
>> >> was a
>> >>>> mistake (insert XKCD joke here).
>> >>>>
>> >>>>
>> >>>>
>> >>>> It wouldn't be starting a new build system, it
would be making a
>> >>>> pre-existing, already extremely well functioning one,
available to
>> more
>> >>>> people.
>> >>>>
>> >>>>
>> >>>>
>> >>>> I can definitely see folks who use cmake not wanting
more
>> hassle--that
>> >> may
>> >>>> be a valid reason not to do it. But "it won't
work" or "it's hard to
>> >> keep
>> >>>> up" or "it's too complicated" seem
well refuted by a multi-year
>> >> existence
>> >>>> proof.
>> >>>>
>> >>>>
>> >>>>

Johannes Doerfert via llvm-dev

2020-Oct-30 03:58 UTC


[llvm-dev] Contributing Bazel BUILD files similar to gn

On 10/29/20 10:38 PM, Mehdi AMINI wrote:
 > On Thu, Oct 29, 2020 at 8:10 PM Johannes Doerfert <
 > johannesdoerfert at gmail.com> wrote:
 >
 >>
 >> On 10/29/20 9:48 PM, Mehdi AMINI wrote:
 >>> On Thu, Oct 29, 2020 at 7:30 PM Johannes Doerfert <
 >>> johannesdoerfert at gmail.com> wrote:
 >>>
 >>>> I replied only selectively.
 >>>>
 >>>>
 >>>> On 10/29/20 5:47 PM, Mehdi AMINI via llvm-dev wrote:
 >>>>> On Thu, Oct 29, 2020 at 2:35 PM Chris Tetreault <ctetreau at quicinc.com>
 >>>>> wrote:
 >>>>>
 >>>>>> Honestly, I’m hearing that some people would like the
Bazel build
 >> system
 >>>>>> to be in community master, and the argument basically
boils down to
 >>>> “It’ll
 >>>>>> be fine. It’ll just sit there and mind its own
business and you
don’t
 >>>> have
 >>>>>> to care about it.”
 >>>>>>
 >>>>> Not really: this argument is only the answer to why it
does not bear
 >> any
 >>>>> weight on non-Bazel users, just like `gn` does already
today.
 >>>>>
 >>>>> I think I explained the motivation to do it, but I can
restate
it: many
 >>>>> LLVM contributors need to collaborate on this piece of
infrastructure
 >>>> that
 >>>>> is very specific to LLVM and enabling some users of LLVM:
the natural
 >>>> place
 >>>>> of collaboration is the monorepo.
 >>>>>
 >>>>>
 >>>>>>> So why are we doing it? I mentioned this in
another answer: this is
 >>>>>> mainly to provide a collaboration space for the
support of OSS
 >> projects
 >>>>>> using Bazel interested to use LLVM (and some
subprojects). …
 >>>>>>
 >>>>>>
 >>>>>>
 >>>>>> Which could be handled by having it in an external
public repo.
 >>>>>>
 >>>>> Sure, just like almost every new code could be handled in
an external
 >>>> repo.
 >>>>> However when many LLVM contributors are interested to
collaborate on
 >>>>> something highly coupled to LLVM it seems like the natural
place
to do
 >>>> it.
 >>>>> Also I don't know for Qualcomm, but most companies
will want you to
 >> sign
 >>>> a
 >>>>> CLA if they provide this "external repo" where
we can
collaborate, and
 >>>>> other parties won't be able to collaborate. The LLVM
project is in
 >>>> general
 >>>>> seen as quite "neutral" for collaborating.
 >>>>>
 >>>>>
 >>>>>>> Having them in-tree means that we can publish
every day (or more) a
 >> git
 >>>>>> hash that we validate with Bazel on private bots (like
`gn`) and
every
 >>>>>> project can use to clone the LLVM monorepo and
integrate in their
 >> build
 >>>>>> flow easily.
 >>>>>>
 >>>>>>
 >>>>>>
 >>>>>> You could still publish this info: “Today, the head of
llvm-bazel is
 >>>>>> confirmed to work with LLVM monorepo sha [foo]”. I
don’t think
two git
 >>>>>> clones is significantly harder than one.
 >>>>>>
 >>>>> For a developer at their desk, you could say it is just an inconvenience
 >>>>> that can be worked around (scripting, etc.).
 >>>>> For the project on the other hand, Bazel has native support to clone a repo
 >>>>> and build it itself as a dependency. For example TensorFlow has many
 >>>>> dependencies, and it just points to a commit in the source repo:
 >>>>>
 >>>>> https://github.com/tensorflow/tensorflow/blob/master/tensorflow/workspace.bzl#L689-L697
 >>>>>
 >>>>> You can see how it is convenient to update the SHA1 there and have it just
 >>>>> work for any Bazel user.
 >>>>>
 >>>>>
 >>>>>
 >>>>>> I submit that in a way this is simpler because you can
always
 >> advertise
 >>>>>> the head of the bazel repo. If the Bazel build system
were in the
 >>>> community
 >>>>>> repo, then you might have to tell users to use an
older version
of the
 >>>>>> bazel build if a fix went into the monorepo in the
afternoon,
but the
 >>>> next
 >>>>>> morning’s nightly finds that the most recent sha that
passes the
tests
 >>>> is
 >>>>>> prior to that fix.
 >>>>>>
 >>>>> This is not different from "a commit broke the ARM
bootstrap and
a user
 >>>> who
 >>>>> checked out the repo at the time will be broken".
From this point of
 >> view
 >>>>> this configuration is no different than any other, except
that we
don't
 >>>>> revert or notify the author of a breaking change, a set of
volunteers
 >>>>> monitor a silent bot and fix-forward as needed, like `gn`.
 >>>>> It is just much easier to have a bot publishing the
"known good"
 >> revision
 >>>>> of the monorepo.
 >>>>>
 >>>>>
 >>>>>> I guess my concern is that I’m not really hearing a
compelling
(to my
 >>>> ear)
 >>>>>> argument for this inclusion.
 >>>>>>
 >>>>> Sure, but if other contributors have a strong interest,
and you don't
 >>>>> really have a strong objection here that we need to
address, we
should
 >> be
 >>>>> able to get past that?
 >>>> Wouldn't your argument hold for anything that "just lives" in the mono
 >>>> repo but doesn't impact people? I mean, where is the line for stuff that
 >>>> some contributors have "strong interest" in and others can't really
 >>>> "hear a compelling argument for inclusion"? People raise concerns here
 >>>> and from where I am sitting they are brushed over easily and more
 >>>> aggressively as the thread progresses (up to the email I respond to).
 >>>>
 >>> Sorry, I invite you to reread the thread again and revisit your impression:
 >>> Tom and Renato expressed clear concerns, and I believe I really tried to
 >>> listen and address these with concrete proposals to mitigate:
 >>> http://lists.llvm.org/pipermail/llvm-dev/2020-October/146182.html
 >>> However there is not much I can do to address folks who object because
 >>> "they don't see the interest" in it, this isn't a productive way of moving
 >>> forward with such proposal IMO.
 >>>
 >>>
 >>>
 >>>>>> I guess it would make the lives of google employees
easier?
 >>>>>>
 >>>>> I explained before that Google internal integration flow
is likely
 >> better
 >>>>> without this at the moment, TensorFlow itself is also in a
reasonably
 >>>> good
 >>>>> spot at the moment. But Google is also not a monolithic
place, some
 >>>> people
 >>>>> are working on small independent projects that they are 
open-sourcing,
 >>>> and
 >>>>> would like to be able to use LLVM.
 >>>>>
 >>>>>>    Then what’s to stop every large org from committing
their
internal
 >>>> stuff
 >>>>> to master?
 >>>>>
 >>>>>
 >>>>>
 >>>>> If their "internal stuff" is highly-coupled to LLVM, has zero-cost
 >>>>> maintenance on the community, and is something that multiple other parties
 >>>>> can benefit and established members of the community want to maintain and
 >>>>> collaborate on, why not?
 >>>> Let's be honest, nothing has "zero-cost".
 >>>
 >>> I hope you're not implying I'd be dishonest here right?
 >>
 >> Long story short, I did not try to imply you were dishonest.
 >>
 >
 > Yes, I know you :)
 > (actually I thought I included a wink smiley above, but apparently not,
 > sorry about that)
 >
 >
 >>
 >> I'm saying that the sentence "has zero-cost maintenance on the community"
 >> cannot be true in a general sense but only in a narrow one. I believe that
 >> everything has cost. I added, "let's be honest", because the cost is not
 >> obvious and one can easily overlook it. However, I assumed we all know
 >> there has to be one as it would otherwise conflict with some universal
 >> law or something. The way I see it you acknowledge the existence in a few
 >> other places.
 >>
 >>
 >>
 >>>
 >>>> It seems unhelpful to pretend it does. (FWIW, I explained a simple
 >>>> scenario that would make the bazel inclusion "costly" in my previous mail.)
 >>>>
 >>> "zero-cost" is well defined: it is "as a community member: feel free to
 >>> ignore, no one will bother you about it", and a subset of the community
 >>> signed up for the maintenance.
 >>> I think it is also helpful to be concrete here: we have existing data and
 >>> history with `gn`, it isn't hypothetical.
 >>>
 >>> To be sure I address your previous email, that was about user expectations
 >>> right? i.e. was it this part:
 >>>
 >>>> people will assume we (=the LLVM community) maintain(s) a bazel build,
 >>> which can certainly be a benefit but also a cost", e.g., when the build is
 >>> not properly maintained, support is scarce, etc. and emails come in
 >>> complaining about it (not thinking of prior examples here.)
 >>>
 >>> Isn't this similar to the concerns from Renato here:
 >>> http://lists.llvm.org/pipermail/llvm-dev/2020-October/146179.html ?
 >>> I acknowledge this as very valid concerns and offered some possibility to
 >>> mitigate: http://lists.llvm.org/pipermail/llvm-dev/2020-October/146188.html
 >>>
 >>>
 >>>
 >>>>
 >>>>> I mentioned it before, but Bazel is not something internal or specific to
 >>>>> Google: it isn't (actually there are many incompatibilities between Bazel
 >>>>> and the internal system), 400 people attended the Bazel conference last
 >>>>> year. I attended this conference 3 years ago when I was at Tesla trying to
 >>>>> deploy Bazel internally. Many other companies are using Bazel, open-source
 >>>>> projects as well. Feel free to watch the talks online about SpaceX
 >>>>> <https://www.youtube.com/watch?v=t_3bckhV_YI> or Two Sigma and Uber
 >>>>> <https://www.youtube.com/watch?v=_bPyEbAyC0s> for example
 >>>> Let's not conflate "using bazel" and "benefit for LLVM", the former
 >>>> is not up for debate here. (I mean, a lot of people use autoconf but
 >>>> we got rid of it anyway).
 >>>>
 >>> I doubt we would have got rid of Autoconf if a chunk of the community
 >>> offered to maintain it at "no cost" (again see definition).
 >>
 >> It broke, ppl complained, and nobody wanted to fix it. That is the
 >> kind of technical debt (aka. cost) you can accumulate.
 >>
 >>
 >>>
 >>>> That said, I think the original question is highly relevant. As I also
 >>>> mentioned somewhere above, where do we draw the line is the key to this
 >>>> RFC at the end of the day. A lot of the arguments I hear pro integration
 >>>> apply to various other things that currently live out-of-tree, some of
 >>>> which were proposed and not integrated.
 >>>
 >>> Can you provide more concrete reference to these things that could have
 >>> been integrated in similar "zero cost" fashion?
 >>> I'm all for consistency, and the only point of comparison here is `gn`.
 >>
 >> Let's say RV, in a subfolder not built by default.
 >
 >
 > I don't know what RV is?

Sorry, the Region Vectorizer [0,1]. It came to mind because it is the last
thing I wished we had upstream, so that I could use it under a cmake flag
without forking.

[0] https://github.com/cdl-saarland/rv
[1] http://llvm.org/devmtg/2016-11/Slides/Moll-RV.pdf


 >
 >
 >> Or any other
 >> project that was proposed for inclusion without being built by
 >> default. (I also remember the discussion about whether we can/should
 >> add isl to llvm, pre-mono repo.)
 >>
 >
 > I am not sure I agree that we can compare new "projects" (or something like
 > ISL) with "utilities for LLVM users".
 > I would expect a more comparable situation to me to be:
 > - the gdb scripts in llvm/utils/gdb-scripts/prettyprinters.py
 > - IDE visualizer in llvm/utils/LLVMVisualizers
 > - The Visual Studio Code syntax highlighting for LLVM IR and TableGen in
 > llvm/utils/vscode ; and similar for kate, jedit, vim, textmate, ...
 > - the gn files in llvm/utils/gn
 >
 > The general theme here is that these are not "new projects" in themselves:
 > they are highly coupled to LLVM itself and only allow a specific subset of
 > users to plug their tool/workflow into LLVM at a given revision.
 > Also all of these are "zero cost" in that they may be "broken" and
 > maintained with best effort (I don't think we revert someone breaking any
 > of the visualizer or syntax highlighter?). And none of these are really
 > core to LLVM, and each could be in a separate repo where the interested
 > parties could maintain it.

If I want to use isl, RV, or project XYZ from an in-tree pass, I cannot
upstream that pass unless the dependencies are upstream or properly hooked up.
Both things have been very hard to get into upstream LLVM in the past.
I'm aware this is a build system we are talking about, so it's a bit
different, but conceptually we should have better guidelines for integrating
code that is not built by default, especially code that is not planned to be
enabled by default any time soon.

Eric mentioned in a follow-up that he is more inclined to accept such code, at
least that is what I read. I actually am as well, and probably always was ;)
I have no problem with gn, bazel, ... but I want us to be similarly open to
other projects that are used by the community and benefit from integration
without burdening everyone.

~ Johannes


 >
 > Best,
 >
