thr3ads.net - llvm dev - [llvm-dev] git strategy for handling the llvm-project repo together with ours [May 2021]

Mehdi AMINI via llvm-dev

2021-May-07 19:04 UTC

[llvm-dev] git strategy for handling the llvm-project repo together with ours

Another aspect is that rebasing a long-lived branch leads to a history
that does not make sense: you would likely just fix the API uses at the
top of the branch after rebasing, which leaves most of the history
unbuildable. This kills bisection, since you can't build previous
revisions of your own project (unless you actually fix every individual
commit during the rebase, but that's not scalable).

On Fri, May 7, 2021 at 10:55 AM Geoffrey Martin-Noble via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> To avoid the issue of constantly having to rewrite history, you could
> merge from your main branch instead of rebasing. If I was contributing to a
> project and I constantly had to drop my local history, I'd probably be a
> bit grumpy. You lose the benefit of your entire stream of patches being
> based on HEAD, but I'm guessing that eventual upstreaming wouldn't be
> merging things in the historical order of patches anyway? (but maybe I'm
> wrong, I've never upstreamed a target). Merge commits get a bad rap, but I
> think they're actually quite a useful tool in git when used judiciously,
> and if you know about `git log --first-parent` most of people's complaints
> about them are handled.
>
> On Fri, May 7, 2021 at 9:42 AM Min-Yih Hsu via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>> Hi,
>>
>> On Fri, May 7, 2021 at 5:52 AM <paul.robinson at sony.com> wrote:
>>
>>> > Dear all,
>>> >
>>> > we are integrating support in Clang + LLVM for a target architecture (an
>>> > accelerator device) we are developing. Currently, we have a git
>>> > repository, which contains LLVM 9 together with the added/changed files
>>> > from us.
>>> >
>>> > Now, we want to have a sustainable strategy for handling the
>>> > llvm-project repository together with our changes. To summarize, we want
>>> > to migrate to LLVM 12, start with 12.0.0, add our target architecture
>>> > support based on this version, continue development of our target, and
>>> > from time to time get the latest changes from the official llvm-project
>>> > repo so we keep up to date over time.
>>> >
>>> > Hence, we somehow have to "merge" the llvm-project and our repository,
>>> > and keep history of both. As far as I know, there are several ways we
>>> > could do this, e.g. using submodules or subtree merge.
>>
>>
>>> I haven't personally had to deal with a new target, so someone who
>>> has done it might have different suggestions.  I've cc'd Min who
>>> seems to be the one driving the recent addition of the M68K target,
>>> which is the most recent new target added to LLVM.
>>>
>>
>> Yes, we actually had the exact same problem when we tried to migrate from
>> LLVM 8 or so to mono repo. We used a script back then:
>> https://github.com/M680x0/M680x0-llvm/issues/58
>> But note that this script only does ~70% of the work. I still spent quite
>> some time doing some migrations manually and cleaning up.
>>
>>
>>>
>>> I do have extensive experience with managing downstream changes and
>>> handling merges, so I have suggestions based on that experience.
>>>
>>> In your situation, I think the simplest way to manage your changes
>>> is to start with a clone of upstream LLVM, which has its 'main'
>>> branch.  Then create a 'target' branch off of 'main' and do all your
>>> work on 'target'.  When you decide to update to a new revision of
>>> LLVM, you use 'git pull' to update 'main', and then *rebase* your
>>> 'target' branch.  This tactic keeps all your changes at HEAD, with
>>> history intact, which vastly simplifies figuring out where bugs
>>> have come from.  It would also be straightforward to move from a
>>> rare pull/rebase to doing this more frequently, which will reduce
>>> the "merge pain" of each pull/rebase.
>>>
>>
>> Second this approach. This is also roughly what I did before M68k got
>> upstreamed. It's also easier for you to collect patches if you plan to
>> upstream your target in the future (just pick commits on the tip of your
>> target branch). The only (little) downside was that there were some users
>> who wanted to try our tree and using rebase meant that they needed to
>> `git pull --force` every time they updated.
>>
>>
>>>
>>> I do *not* recommend what we (Sony) did, which is to keep our
>>> changes intermixed with upstream changes.  That decision was made
>>> too long ago to do anything about it with practical cost.  We have
>>> an automated merge system to keep ourselves continually updated to
>>> upstream HEAD, which is an ongoing maintenance cost but much much
>>> better than trying to do the same thing once every six months or so.
>>>
>>> More about how we operate can be found here:
>>> https://llvm.org/devmtg/2015-10/#tutorial4
>>>
>>> > In the future, if support for our target architecture is mature, and the
>>> > hardware is publicly available, for us it would be interesting to have
>>> > our target support in the official llvm-project repository, in which
>>> > case our additions potentially would have to go into the official repo.
>>>
>>>
>> Upstreaming the target is a whole other story. I can provide some tips and
>> suggestions on this matter when you decide to do so.
>>
>> Best
>> -Min
>>
>>
>>> This is another reason to try to keep your downstream changes at HEAD.
>>> It will be much easier to post your new target's patches if you are
>>> already based on top of 'main'.
>>
>>
>>> Best Regards,
>>> --paulr
>>>
>>>
>>
>> --
>> Min-Yih Hsu
>> Ph.D Student in ICS Department, University of California, Irvine (UCI).
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>

Kai Plociennik via llvm-dev

2021-May-10 07:22 UTC


[llvm-dev] git strategy for handling the llvm-project repo together with ours

Thank you very much for your detailed suggestions on my questions; this
helped me a lot!

Best regards,

Kai Plociennik

-- 
Dr. Kai Plociennik
Fraunhofer-Institut für Techno- und Wirtschaftsmathematik ITWM
Competence Center High Performance Computing
Fraunhofer-Platz 1
67663 Kaiserslautern
Tel: +49 (0)631 31600 4081
mail: kai.plociennik at itwm.fraunhofer.de
www.itwm.fraunhofer.de

via llvm-dev

2021-May-10 21:58 UTC


[llvm-dev] git strategy for handling the llvm-project repo together with ours

Geoffrey, Mehdi,

Excellent observations, however I think it's worth remembering
the stated use-case.

Geoffrey Martin-Noble wrote:
> To avoid the issue of constantly having to rewrite history,
Note that the OP said:
> from time to time get the latest changes from the official
> llvm-project repo so we keep up to date over time.
I think "from time to time" is far from "constantly."  If they
were going to do continual updates, I wouldn't suggest rebasing
at all; but when updates are rare, I think it's an extremely
viable choice.  Also they said,
> In the future, if support for our target architecture is mature,
> and the hardware is publicly available,
This (especially the not-public part) implies to me that the cadre 
of developers is small, and imposing a rare (2x/year?) requirement 
to do a force-pull or just re-clone is not a harsh burden.
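
For readers who have not had to follow a rebased branch, here is a minimal sketch of what the "force-pull" step could look like on the user's side. It runs in throwaway repositories; the directory names `publisher` and `user` and the file `file.txt` are hypothetical, and `git commit --amend` merely stands in for a real rebase that rewrites the published branch. Fetching and then hard-resetting is one common way to do this, not necessarily what the M68k users did.

```shell
#!/bin/sh
# Sketch: how a user of a rebased 'target' branch can catch up after
# the publisher has rewritten history. All names here are illustrative.
set -eu

work=$(mktemp -d)
cd "$work"

# The publisher's repository, with a single 'target' branch.
git init -q publisher
cd publisher
git symbolic-ref HEAD refs/heads/target
git config user.name "Example Dev"
git config user.email "dev@example.invalid"
echo "v1" > file.txt
git add file.txt
git commit -q -m "target: first version"
cd ..

# A user clones the tree while it is in that state.
git clone -q publisher user

# The publisher rewrites history (amend is a stand-in for a rebase).
cd publisher
echo "v2" > file.txt
git commit -q -a --amend -m "target: rewritten version"
cd ..

# The user's update: fetch the rewritten branch, then move the local
# branch to it. WARNING: 'reset --hard' discards local commits and
# uncommitted changes on this branch.
cd user
git fetch -q origin
git reset -q --hard origin/target
user_head=$(git rev-parse HEAD)
cd ..

publisher_head=$(git -C publisher rev-parse HEAD)
echo "user:      $user_head"
echo "publisher: $publisher_head"
```

After the reset, the user's branch points at the same commit as the publisher's rewritten branch. Anyone carrying local work on top of the old history would need to rebase that work instead of resetting.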

Mehdi AMINI wrote:
> Another aspect is that rebasing a long-lived branch leads to a
> history that does not make sense: you would likely just fix the
> API uses for the top of the branch after rebasing, which will
> lead to most of the history that can't be built
I find rebasing is effectively a commit-by-commit merge-to-HEAD.
Normally when I've done this, conflicts are quite likely for API 
changes, which would have to be fixed up in the middle of the 
rebase before you could do the --continue; not a fix-at-the-end 
situation.

It's true that for a new target, most of the work would be in
target-specific files and git wouldn't notice any conflicts.
If I were doing this semi-annual rebase, I'd probably want to do
an incremental build after each commit just to catch that kind
of thing.  Tedious but not super expensive compute-wise, for a
new target, and very scriptable.
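
As a rough illustration of how scriptable this is, `git rebase --exec` runs a command after each replayed commit and stops the rebase at the first commit where the command fails. The sketch below uses a throwaway repository with hypothetical file names, and a trivial `test -f` check stands in for the incremental build; in a real tree the check would be whatever build command the project uses (something like `ninja -C build` is an assumption here, not something stated in this thread).

```shell
#!/bin/sh
# Sketch: rebase a downstream 'target' branch onto an updated 'main',
# running a check after every replayed commit.
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example Dev"
git config user.email "dev@example.invalid"

# "Upstream" history on main.
echo "api v1" > api.txt
git add api.txt
git commit -q -m "upstream: initial API"

# Downstream work: two commits that only touch target-specific files.
git checkout -q -b target
echo "backend part 1" > target.txt
git add target.txt
git commit -q -m "target: add backend part 1"
echo "backend part 2" >> target.txt
git commit -q -am "target: add backend part 2"

# Upstream moves on.
git checkout -q main
echo "api v2" > api.txt
git commit -q -am "upstream: change API"

# Stand-in for an incremental build; replace with the real build command.
check_cmd="test -f api.txt && test -f target.txt"

# Replay 'target' on top of the new 'main'. If the check fails for some
# commit, the rebase stops there so that commit can be fixed and the
# rebase resumed with 'git rebase --continue'.
git checkout -q target
git rebase -q --exec "$check_cmd" main

commits_on_target=$(git rev-list --count main..target)
echo "commits replayed on top of main: $commits_on_target"
```

Because the rebase halts at the first failing commit, each downstream commit can be repaired where it breaks, which is what keeps the rebased history buildable.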

Rebasing instead of merging would also *improve* bisection, if
you pay attention to keeping the rebased commits buildable.
I promise you, having done it, bisecting the current problem to
a 6-months-of-upstream-changes merge commit *really* isn't helpful.
Eliminating those headaches was a significant benefit of our 
conversion to continual integration.

Basically, for bisection to work in a reasonable way, you have
to have either a linear history of small merges like we get now
with our continual integration, or you want a linear history
that is pure upstream with local changes at the very tip.
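
To make the bisection point concrete, the sketch below builds a small linear history in a throwaway repository and lets `git bisect run` find the commit that introduced a problem. The file `state.txt`, the `BUG` marker, and the `grep` check are hypothetical stand-ins for a real build-and-test script.

```shell
#!/bin/sh
# Sketch: automated bisection over a linear history.
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example Dev"
git config user.email "dev@example.invalid"

# Six commits; the fourth one introduces the "bug".
for i in 1 2 3 4 5 6; do
    if [ "$i" -eq 4 ]; then
        echo "BUG" >> state.txt
    else
        echo "ok $i" >> state.txt
    fi
    git add state.txt
    git commit -q -m "commit $i"
done

# The root commit is known good, the tip is known bad.
good=$(git rev-list --max-parents=0 HEAD)
git bisect start HEAD "$good" >/dev/null

# The check exits 0 (good) when the marker is absent and 1 (bad) when
# it is present; 'git bisect run' drives the binary search with it.
git bisect run sh -c '! grep -q BUG state.txt' >/dev/null

# refs/bisect/bad now names the first bad commit.
first_bad=$(git log -1 --format=%s refs/bisect/bad)
git bisect reset >/dev/null 2>&1
echo "first bad commit: $first_bad"
```

This only works commit by commit when every revision in the range can be built and tested. For a history that merges upstream instead, recent Git versions also accept `git bisect start --first-parent`, which follows only the first parent of merge commits, but the answer can then be a large merge commit, which is the situation described above.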

Anyway, best of luck to the original poster, and back to doing
real work!

--paulr
