I've pushed 2.10pre to
git@gitorious.org:reposurgeon/reposurgeon.git
and there's a new version of the conversion metadata at
git@gitorious.org:reposurgeon/nut-conversion.git
I'm not entirely happy with the branch link deduction, but I think
this is the best we're going to do without manual intervention to
break some harmless but spurious parent links.
The underlying problem is twofold:
(1) The repo history is cluttered with odd little single-file copies,
probably mostly generated by cvs2svn. These show up in gitk as merge
bubbles with no commits on one side. The smallest such glitch looks
like this in the commit graph:
|
o
|\
| \
|  o
| /
|/
o
|
(2) There are a couple of branch creations that were done wrong, as a
conventional (non-Subversion) directory copy followed by an add.
These check out OK but are missing the branch link information
provided by a directory copy operation.
Because (1) is true, I cannot assume that every file copy implies a
branch link. Because (2) is true, I cannot use only directory copies
to impute branch structure - sometimes, I have to notice file copies
by comparing hashes and deduce that a branch creation was intended but
fumbled.
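To make "noticing file copies by comparing hashes" concrete, here is a
minimal Python sketch. It is not reposurgeon's actual code: the function
names, the use of SHA-1, and the representation of a tree as a dict
mapping paths to file contents are all assumptions made for illustration.

```python
import hashlib


def blob_hash(data):
    """Content hash of one file (SHA-1 is an arbitrary choice here)."""
    return hashlib.sha1(data).hexdigest()


def find_file_copies(existing, added):
    """Return (source_path, new_path) pairs whose contents are identical.

    existing: dict of path -> bytes for files already in the repository
    added:    dict of path -> bytes for files added by one changeset
    Both structures are hypothetical stand-ins for real repository data.
    """
    by_hash = {}
    for path, data in sorted(existing.items()):
        # Keep the first path seen for each distinct content hash.
        by_hash.setdefault(blob_hash(data), path)
    copies = []
    for path, data in sorted(added.items()):
        source = by_hash.get(blob_hash(data))
        if source is not None and source != path:
            copies.append((source, path))
    return copies
```

A file added with content identical to an existing file is reported as a
copy even though no copy operation was recorded for it.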
The compromise I'm using is to ignore *single* file copies in
computing branch structure - that is, if I detect more than one per
changeset I treat that changeset as a branch link. This rule pops
almost all spurious merge bubbles, but not all.
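The rule itself fits in a few lines. This is a sketch under stated
assumptions, not the code in reposurgeon: the function name and both
parameters are hypothetical, and treating a real directory copy as always
implying a branch link is an assumption drawn from the description above.

```python
def is_branch_link(file_copy_count, has_directory_copy=False):
    """Decide whether one changeset should be treated as a branch link.

    file_copy_count:    number of file copies detected in the changeset
    has_directory_copy: True if it contains a real directory copy
                        (assumed here to always imply a branch link)
    """
    if has_directory_copy:
        return True
    # A lone file copy is ignored as probable clutter; more than one
    # suggests a branch creation that was done as copy-plus-add.
    return file_copy_count > 1
```

With this threshold a changeset containing exactly one file copy is never
treated as a branch link, which is why single-file clutter stops creating
links while a fumbled branch creation that copied several files still
does.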
Charles, unless you spot a place where my branch analysis has gone
wrong, I think that part is done. All that's left to do is make the
code for manually creating clean branch merges work.
--
<a href="http://www.catb.org/~esr/">Eric S. Raymond</a>
The kind of charity you can force out of people nourishes about as much as
the kind of love you can buy --- and spreads even nastier diseases.