Arthur O'Dwyer via llvm-dev
2021-Dec-07 15:54 UTC
[llvm-dev] [cfe-dev] Bugzilla migration is stopped again
On Tue, Dec 7, 2021 at 7:33 AM Anton Korobeynikov via cfe-dev <cfe-dev at lists.llvm.org> wrote:

> > 6) We can edit the comments by hand (can you only edit your own comments
> > or can we edit someone else's comments, I'm thinking its only our own based
> > on testing I've done with other repos)
> > - isn't this a requirement in order to fix up the "code-blocks"?
>
> Yes, only admins can edit everything.

I noticed this yesterday with the existing test migration: compare
https://bugs.llvm.org/show_bug.cgi?id=52598
versus
https://github.com/llvm/llvm-bugzilla-archive/issues/52598

The current script seems to be forgetting that GitHub issues use Markdown, and so every existing Bugzilla comment needs to be wrapped in triple-backticks to preserve its semantics. (You could do *cleverer* things, like "don't wrap comments that are only one line long," but doing anything *less-clever* will be a non-starter.)

> > Assuming there is no obvious/immediate fix, Do we have any choice but to
> > move ahead with the existing import and fix the comments by hand
> > retrospectively (assuming 6)
>
> This is what I asked GitHub engineers. They essentially asked for yet
> another day to figure out the possible options. My rough estimate that
> at least 5k issues will have broken links.

Anton: I see about 35,000 issues in
https://github.com/llvm/llvm-bugzilla-archive/issues
but only 228 (i.e. essentially none, presumably just historical noise from newbie GitHub users) in
https://github.com/llvm/llvm-project/issues
Where are the 13,000 issues you are saying have already been migrated?

IIUC, it's *very fortunate* that there aren't yet 13,000 issues in https://github.com/llvm/llvm-project/issues . That means that it is still an option to do a "practice" migration into a test repo, e.g., https://github.com/llvm/llvm-bugzilla-archive2 (and then if it works as intended, you can either "blow away https://github.com/llvm/llvm-bugzilla-archive and rename https://github.com/llvm/llvm-bugzilla-archive2 to https://github.com/llvm/llvm-bugzilla-archive", or "blow away https://github.com/llvm/llvm-bugzilla-archive and repeat the migration just to prove it works *reproducibly*"). Only once the whole migration has been tested end-to-end on a test repo, would I recommend starting the migration into the production repo https://github.com/llvm/llvm-project.

Thanks for the links to
https://github.com/llvm/bugzilla2gitlab/tree/llvm
and
https://docs.google.com/document/d/1G6DZ6AxzSaOlrtTxoxtqYKnD4Myv40QfKK4wj54y8ms/edit .
Those make it clear that someone's done a little bit of work to script this stuff; but the Google Doc also makes it clear that there is a long way to go to accomplish a "deploy plan": someone needs to take that English description and turn it into code (Python or even Bash or whatever) that can be
(A) reviewed for correctness, without running it
(B) run multiple times with guaranteed same behavior, with no risk that some human will accidentally forget a step in the middle

Step 1, getting the XML files from Bugzilla, turns out to be super easy because there's a public API for that:
https://github.com/Quuxplusone/BugzillaToGithub

Step 3, transforming XML to GitHub's JSON schema, requires knowing what GitHub's schema looks like. I've found
https://gist.github.com/jonmagic/5282384165e0f86ef105#start-an-issue-import
although it's not real clear what the schema is or if that even still works (I haven't tried yet).

Also, there seems to be no way for one GitHub user to create a comment or issue putatively authored by some other GitHub user. (Which certainly makes sense.) So this would result in issues and comments filed by "LLVM Import Bot" or whatever... but I think that's fine, and might even avoid some issues that you'd have otherwise, with scenarios like "Joe User created his GitHub account in 2015, but was making comments on LLVM issues back in 2012."

Vice versa, btw, you've currently got some issues being incorrectly imported with the reporter listed *in the issue summary itself* as "LLVM Bugzilla Contributor"; e.g. this one from Chris Burel.
https://github.com/llvm/llvm-bugzilla-archive/issues/52567
It certainly makes sense that you won't have a GitHub *username* for some people, but you still shouldn't throw away the information about their human name just because we're migrating from one platform to another.

–Arthur
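The triple-backtick suggestion in the message above can be sketched in a few lines of Python. This is only an illustration of the idea, not the migration script actually in use: the function name `wrap_comment` and the `skip_single_line` flag are hypothetical, and the fence is lengthened whenever the comment already contains a run of backticks so that the wrapper cannot be closed early by the comment's own text.

```python
import re


def wrap_comment(text: str, skip_single_line: bool = True) -> str:
    """Wrap a plain-text Bugzilla comment in a Markdown code fence.

    Hypothetical helper: the name and the flag are illustrative only.
    """
    body = text.strip("\n")

    # The "cleverer" variant from the thread: leave one-line comments alone.
    if skip_single_line and "\n" not in body:
        return body

    # Pick a fence that is longer than any backtick run inside the comment,
    # with a minimum length of three backticks.
    runs = re.findall(r"`+", body)
    longest = max((len(run) for run in runs), default=0)
    fence = "`" * max(3, longest + 1)

    return f"{fence}\n{body}\n{fence}"
```

A comment that itself contains a triple-backtick line ends up inside a four-backtick fence, so its text still renders verbatim.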
Anton Korobeynikov via llvm-dev
2021-Dec-07 16:03 UTC
[llvm-dev] [cfe-dev] Bugzilla migration is stopped again
> I noticed this yesterday with the existing test migration: compare
> https://bugs.llvm.org/show_bug.cgi?id=52598
> versus
> https://github.com/llvm/llvm-bugzilla-archive/issues/52598
>
> The current script seems to be forgetting that GitHub issues use Markdown, and so every existing Bugzilla comment needs to be wrapped in triple-backticks to preserve its semantics.

No, it is not. This was discussed at one of the roundtables and it was decided that the conversion will be done verbatim. If necessary, individual issues can be converted to proper Markdown by the reporters.

> Anton: I see about 35,000 issues in
> https://github.com/llvm/llvm-bugzilla-archive/issues
> but only 228 (i.e. essentially none, presumably just historical noise from newbie GitHub users) in
> https://github.com/llvm/llvm-project/issues
> Where are the 13,000 issues you are saying have already been migrated?

You cannot see them because issues are currently disabled in the llvm-project repo, to keep things intact while we wait for suggestions from GitHub engineers. What you're seeing are pull requests (note the header).

> IIUC, it's very fortunate that there aren't yet 13,000 issues in https://github.com/llvm/llvm-project/issues

They are, see above.

> Only once the whole migration has been tested end-to-end on a test repo, would I recommend starting the migration into the production repo https://github.com/llvm/llvm-project.

> Those make it clear that someone's done a little bit of work to script this stuff; but the Google Doc also makes it clear that there is a long way to go to accomplish a "deploy plan": someone needs to take that English description and turn it into code (Python or even Bash or whatever) that

Do you want me to bash script the work which is done by GitHub engineers?

> Step 1, getting the XML files from Bugzilla, turns out to be super easy because there's a public API for that:
> https://github.com/Quuxplusone/BugzillaToGithub
> Step 3, transforming XML to GitHub's JSON schema, requires knowing what GitHub's schema looks like. I've found
> https://gist.github.com/jonmagic/5282384165e0f86ef105#start-an-issue-import
> although it's not real clear what the schema is or if that even still works (I haven't tried yet). Also, there seems to be no way for one GitHub user to create a comment or issue putatively authored by some other GitHub user. (Which certainly makes sense.)

Well, the current approach we're using certainly handles this well. Though, I would certainly like to see the migrated 10k issues at https://github.com/Quuxplusone/ at the end of the week as you promised and compare with what we already have in the llvm-bugzilla-archive.

> So this would result in issues and comments filed by "LLVM Import Bot" or whatever... but I think that's fine, and might even avoid some issues that you'd have otherwise, with scenarios like "Joe User created his GitHub account in 2015, but was making comments on LLVM issues back in 2012."

> Vice versa, btw, you've currently got some issues being incorrectly imported with the reporter listed in the issue summary itself as "LLVM Bugzilla Contributor"; e.g. this one from Chris Burel.

Chris Burel did not fill in the survey, therefore the data is anonymised.

-- 
With best regards, Anton Korobeynikov
Department of Statistical Modelling, Saint Petersburg State University
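For readers following the "Step 3" part of this exchange, here is a rough Python sketch of reading a Bugzilla XML export into a plain dictionary as an intermediate form. Several things are assumptions: the element names (`bug_id`, `short_desc`, `reporter`, `long_desc`, `who`, `bug_when`, `thetext`) reflect what Bugzilla XML exports commonly look like and should be checked against a real export; the sample data is made up; and the output keys are hypothetical and are not GitHub's import schema, which the thread itself describes as unclear.

```python
import xml.etree.ElementTree as ET

# Made-up sample; element names are assumed, not taken from the thread.
SAMPLE = """<bugzilla>
  <bug>
    <bug_id>12345</bug_id>
    <short_desc>Example summary</short_desc>
    <reporter name="Jane Doe">jane@example.com</reporter>
    <long_desc>
      <who name="Jane Doe">jane@example.com</who>
      <bug_when>2012-01-02 03:04:05 +0000</bug_when>
      <thetext>First comment</thetext>
    </long_desc>
  </bug>
</bugzilla>"""


def bug_to_dict(bug):
    """Convert one <bug> element into a plain dict (hypothetical keys)."""
    comments = []
    for entry in bug.findall("long_desc"):
        who = entry.find("who")
        comments.append({
            "author": who.get("name") if who is not None else None,
            "created": entry.findtext("bug_when"),
            "body": entry.findtext("thetext", default=""),
        })
    reporter = bug.find("reporter")
    return {
        "number": int(bug.findtext("bug_id")),
        "title": bug.findtext("short_desc", default=""),
        "reporter": reporter.get("name") if reporter is not None else None,
        "comments": comments,
    }


def parse_export(xml_text):
    """Parse an export containing one or more <bug> elements."""
    root = ET.fromstring(xml_text)
    return [bug_to_dict(bug) for bug in root.findall("bug")]
```

Calling `parse_export(SAMPLE)` yields a one-element list whose entry carries the bug number, title, reporter display name, and comment list. A separate step would then map that dictionary onto whatever schema the import endpoint turns out to require.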
Arthur O'Dwyer via llvm-dev
2021-Dec-08 22:47 UTC
[llvm-dev] [cfe-dev] Bugzilla migration is stopped again
On Tue, Dec 7, 2021 at 10:54 AM Arthur O'Dwyer <arthur.j.odwyer at gmail.com> wrote:

> The current script seems to be forgetting that GitHub issues use Markdown,
> and so every existing Bugzilla comment needs to be wrapped in
> triple-backticks to preserve its semantics. (You could do *cleverer*
> things, like "don't wrap comments that are only one line long," but doing
> anything *less-clever* will be a non-starter.)
>
> [...] btw, you've currently got some issues being incorrectly imported with
> the reporter listed *in the issue summary itself* as "LLVM Bugzilla
> Contributor"; e.g. this one from Chris Burel.
> https://github.com/llvm/llvm-bugzilla-archive/issues/52567
> It certainly makes sense that you won't have a GitHub *username* for some
> people, but you still shouldn't throw away the information about their
> human name just because we're migrating from one platform to another.

Two more things I've noticed while spot-checking:
https://github.com/llvm/llvm-bugzilla-archive/issues/36617

- Bugzilla lets you attach file attachments; GitHub doesn't. Attachments are not preserved by the migration.
- Bugzilla comments are numbered, so people sometimes say e.g. "see comment 16"; GitHub comments are not numbered. The migration script might consider automagically turning these references into hyperlinks similar to how Bugzilla does it.

–Arthur
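The second bullet above, turning "see comment 16" into a hyperlink, could look roughly like the following Python sketch. Everything named here is hypothetical: `link_comment_refs` is an illustrative helper, and `anchors` stands for a mapping from Bugzilla comment numbers to the URLs of the corresponding migrated comments, which the migration would have to record as it creates them. References with no known target are left untouched.

```python
import re

# Matches "comment 16", "Comment #16", and similar spellings.
COMMENT_REF = re.compile(r"\bcomment\s+#?(\d+)\b", re.IGNORECASE)


def link_comment_refs(text, anchors):
    """Rewrite 'comment N' references as Markdown links.

    `anchors` maps a Bugzilla comment number (int) to the URL of the
    migrated comment; it is a hypothetical input, not an existing API.
    """
    def replace(match):
        url = anchors.get(int(match.group(1)))
        if url is None:
            return match.group(0)  # unknown target: keep the text as-is
        return f"[{match.group(0)}]({url})"

    return COMMENT_REF.sub(replace, text)
```

Note that Markdown links render only outside code fences, so a migration that wraps each comment verbatim in a fence would have to choose between the two treatments for any given comment.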