Anton Korobeynikov via llvm-dev
2020-Jul-10 08:10 UTC
[llvm-dev] RFC: Bugzilla migration plan
Dear all,

Over the last few weeks, with the help of GH folks, I've been exploring the options for the Bugzilla migration. I believe we have finally arrived at a viable solution, which is detailed below.

It turned out that GitHub has an internal project rehydration tool that can populate an empty repo from a simple serialized format. The big advantage of this approach over the GH API is that we are not bound by various thresholds and throttling limits (remember that we need to import 35k+ bz issues). The downside is that rehydration requires an empty repo, and we cannot delete the current llvm-project: we would lose releases, fork connections, stars and watches. Unfortunately, there is no way to recreate releases while keeping their original dates, so this is a no-go for us. Losing fork connections would strongly affect downstream users as well. This led us to the following scheme:

1. Migrate Bugzilla to a new repo, say, llvm-bugzilla-import, using the internal storage format.
2. Install redirects llvm.org/PR1234 => gh/llvm/llvm-bugzilla-import/issues/1234
3. Wipe existing issues and pull requests
4. Migrate all issues from llvm-bugzilla-import to llvm-project using the GH API. GitHub will take care of the llvm-bugzilla-import/issues/1234 => llvm-project/issues/5678 redirects

The only downside of this approach is that we will be seeing 30k events like "llvm-bugzilla-import/issues/1234 migrated to llvm-project/issues/5678".

Here is the tentative timeline / list of action points:

1. Collect the mapping email (used by Bugzilla) => GH account name (used by issues). We are going to collect it from different sources:
   - Auto-populating the mapping from the list of known committers
   - Asking the GH API (works only if a person made their email public and only when allowed by local law)
   - Emailing everyone who submitted to Bugzilla over the last year or maybe two, asking them to fill in a form with their GH username
   - We would likely allow a month or so to let everyone respond.
2. While 1. is in progress, we will work on various format issues for the migration. For this we will probably use the first 1k issues or so. It would be nice to include some meta-bugs here to ensure we can recreate such issues. Things to consider:
   - Comment migration (GH uses markdown everywhere, so we'd need to carefully escape Bugzilla contents)
   - Components => labels mapping and migration
   - Linking between the issues. Maybe automatically replace PR1234 in the text with #1234 to enable auto linking.
   - Authorship: reporter / commenter
   - Attachments
3. After we are sure everyone is ready, we will do a test migration of the whole Bugzilla.
   - Estimate the time required to make such a transition.
   - Fix remaining issues, if any
4. Put Bugzilla into read-only mode and perform the final migration to llvm-bugzilla-archive
5. Wipe issues / PRs in the llvm-project repo and perform the migration from llvm-bugzilla-archive to llvm-project
6. Migration done. Bugzilla will probably be kept in read-only mode for some time, just for the sake of consistency and in case any issues are found.

Any comments & ideas?

--
With best regards, Anton Korobeynikov
Department of Statistical Modelling, Saint Petersburg State University
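The comment-migration and issue-linking items in step 2 could be sketched roughly as follows. This is only an illustration, not the actual migration script: the function name `convert_comment` and the set of escaped characters are assumptions, and a real import would likely need a more careful treatment of code snippets and quoted text.

```python
import re

# Assumption: a conservative set of characters that Markdown may treat as markup.
_MARKDOWN_SPECIALS = re.compile(r"([\\`*_\[\]<>#~|])")
# Bugzilla-style references such as "PR1234".
_PR_REFERENCE = re.compile(r"\bPR(\d+)\b")


def convert_comment(text: str) -> str:
    """Escape Markdown markup, then turn PR1234 references into #1234."""
    # Escape first, so that the "#" added in the next step is not escaped.
    escaped = _MARKDOWN_SPECIALS.sub(r"\\\1", text)
    return _PR_REFERENCE.sub(r"#\1", escaped)


if __name__ == "__main__":
    print(convert_comment("foo_bar crashes, see PR1234"))
```

Escaping before rewriting matters here: done in the opposite order, the freshly inserted `#` would be backslash-escaped along with everything else.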
Kristof Beyls via llvm-dev
2020-Jul-10 12:26 UTC
[llvm-dev] [cfe-dev] RFC: Bugzilla migration plan
Thank you very much for all the work on this, Anton!

The steps outlined seem like they should work.

If I remember correctly, one of the main concerns here is making sure that one can still easily find issues based on existing Bugzilla IDs in existing comments, commit messages, etc.
Do I understand correctly that after the migration, for an existing reference to "PR1234", you'll need to go to https://github.com/llvm/llvm-bugzilla-import/issues/1234 to find it (after our Bugzilla server has been shut down)?
That seems workable if we document that well.

One other area where I thought there was quite a bit of debate was how components will map to labels, mainly because current bug triagers and watchers want to know how they will be able to set up filters to see the bug updates they are interested in.
I wonder what the most recent thinking is on that?

Thanks,

Kristof
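To make the component-to-label question concrete, here is a minimal sketch of one possible mapping and of a watcher-side filter. The naming scheme (`product` and `product:component` labels) is purely hypothetical and is not the label set proposed for the migration; `slug`, `labels_for` and `matches_filter` are illustrative names, and the product/component pair below is only an example input.

```python
import re
from typing import List, Set


def slug(name: str) -> str:
    """Lower-case a Bugzilla name and collapse other characters to hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")


def labels_for(product: str, component: str) -> List[str]:
    """Hypothetical scheme: one label per product, one per product:component."""
    product_label = slug(product)
    return [product_label, f"{product_label}:{slug(component)}"]


def matches_filter(issue_labels: List[str], watched: Set[str]) -> bool:
    """A watcher is interested if any of the issue's labels is watched."""
    return any(label in watched for label in issue_labels)


if __name__ == "__main__":
    labels = labels_for("libraries", "Backend: X86")
    print(labels, matches_filter(labels, {"libraries:backend-x86"}))
```

Under such a scheme, a triager who previously watched one Bugzilla component would filter on the single `product:component` label.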
Anton Korobeynikov via llvm-dev
2020-Jul-10 14:50 UTC
[llvm-dev] [cfe-dev] RFC: Bugzilla migration plan
Hi Kristof,

> If I remember correctly, one of the main concerns here is making sure that one can still easily find issues based on existing bugzilla IDs in existing comments, commit messages etc.
> Do I understand correctly that after the migration, for an existing reference to "PR1234", you'll need to go to https://github.com/llvm/llvm-bugzilla-import/issues/1234 to find it (after our bugzilla server has been shut down)?
> That seems workable if we document that well.

Oh, maybe I was not clear:
- We will install an llvm.org-side redirect. So, links like llvm.org/PR1234 will redirect to https://github.com/llvm/llvm-bugzilla-import/issues/1234
- Since we will migrate the issues to llvm-project, GitHub will redirect by itself from https://github.com/llvm/llvm-bugzilla-import/issues/1234 to https://github.com/llvm/llvm-project/issues/XYZ, for whatever value XYZ turns out to be.

During the Bugzilla import we will also replace PR1234 in the comment text with #1234, and during the migration GitHub will properly rewrite these references to llvm-project/issues/XYZ ones.

> One other area where I thought there was quite a bit of debate was about how components will map to labels; mainly triggered because current bug triagers and watchers are looking for how they will be able to set up filters to see the bug updates they are interested in.
> I wonder what the most recent thinking is on that?

I made the first set of labels (they are outlined in the Google doc I made previously) based on the list of components / products we have in Bugzilla.

--
With best regards, Anton Korobeynikov
Department of Statistical Modelling, Saint Petersburg State University
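The llvm.org-side redirect described above amounts to a simple path-to-URL rule. The sketch below only computes the redirect target; how it would be wired into the actual web server is not covered here, and `redirect_target` is an illustrative name rather than existing llvm.org code.

```python
import re
from typing import Optional

# Target repository as named in the thread.
IMPORT_REPO_ISSUES = "https://github.com/llvm/llvm-bugzilla-import/issues"
_PR_PATH = re.compile(r"/PR(\d+)")


def redirect_target(path: str) -> Optional[str]:
    """Map a request path such as "/PR1234" to the imported issue URL."""
    match = _PR_PATH.fullmatch(path)
    if match is None:
        return None  # not a bug link; let the server handle it normally
    return f"{IMPORT_REPO_ISSUES}/{int(match.group(1))}"


if __name__ == "__main__":
    print(redirect_target("/PR1234"))
```

Note that the rule needs no knowledge of the final llvm-project issue number, since the second hop is left to GitHub.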
On Fri, 10 Jul 2020 at 09:11, Anton Korobeynikov via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> 3. Wipe existing issues and pull requests

Does this really wipe the "auto-increment" IDs used by PRs and issues and start from zero again?

> 4. Migrate all issues from llvm-bugzilla-import to llvm-project using
> GH API. Github will take care of llvm-bugzilla-import/issues/1234 =>
> llvm-project/issues/5678 redirects

If we're setting a redirect, PR1234 wouldn't hit #5678. We either guarantee that the IDs will be identical or we'll need a smart redirect that knows the delta (or the 1:1 relationship).
Anton Korobeynikov via llvm-dev
2020-Jul-10 15:19 UTC
[llvm-dev] RFC: Bugzilla migration plan
> On Fri, 10 Jul 2020 at 09:11, Anton Korobeynikov via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
> > 3. Wipe existing issues and pull requests
> Does this really wipes the "auto-increment" IDs used by PRs and issues
> and starts from zero again?

I will need to clarify whether we will be able to reset the counter or not.

> > 4. Migrate all issues from llvm-bugzilla-import to llvm-project using
> > GH API. Github will take care of llvm-bugzilla-import/issues/1234 =>
> > llvm-project/issues/5678 redirects
> If we're setting a redirect, PR1234 wouldn't hit #5678. We either
> guarantee that the IDs will be identical or we'll need a smart
> redirect that will know the delta (or 1:1 relationship).

Why? If you migrate the issue inside GH, then GH does the necessary redirects on its side. So, we will take care of the PR1234 => llvm-bugzilla-import/issues/1234 redirect, and GitHub will further redirect from llvm-bugzilla-import/issues/1234 to llvm-project/issues/5678.

--
With best regards, Anton Korobeynikov
Department of Statistical Modelling, Saint Petersburg State University
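The two-hop argument can be illustrated with a toy model. In the sketch below, the `migrated` dictionary merely stands in for the redirects GitHub is expected to maintain after an issue is moved; it is not something the project would implement, and the issue numbers are the example values used in the thread.

```python
from typing import Dict


def first_hop(bug_id: int) -> str:
    """llvm.org side: the Bugzilla ID maps to the same number in the import repo."""
    return f"llvm-bugzilla-import/issues/{bug_id}"


def resolve(bug_id: int, migrated: Dict[str, str]) -> str:
    """Follow the llvm.org redirect, then the (simulated) GitHub redirect."""
    location = first_hop(bug_id)
    # Second hop: only applies once the issue has been migrated.
    return migrated.get(location, location)


if __name__ == "__main__":
    # Stand-in for GitHub's own bookkeeping of migrated issues.
    migrated = {"llvm-bugzilla-import/issues/1234": "llvm-project/issues/5678"}
    print(resolve(1234, migrated))
    print(resolve(42, migrated))
```

In this model the llvm.org side never needs a delta or a lookup table of new numbers, which is the point being made in the reply.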
David Blaikie via llvm-dev
2020-Jul-10 17:12 UTC
[llvm-dev] [cfe-dev] RFC: Bugzilla migration plan
If I recall correctly, the previous discussion had a fair bit of pushback on having two numbering systems. In part because the old bug numbers are littered throughout the codebase, commit messages, etc., and aren't always prefixed with "PR", sometimes just "bug XXXX". Having two numberings is likely to mean some amount of confusion/friction/ongoing cost from looking in multiple places for bugs.

(But I'm not personally holding this process up on that issue. I think I can live with it, if that's what those with the time/inclination to make this migration happen decide is the right tradeoff.)
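To give a sense of what "not always prefixed with PR" means for tooling, here is a small sketch that collects bug numbers written either as "PR1234" or as "bug 1234" / "bug #1234". The pattern is an assumption about the common spellings, not an exhaustive survey of the codebase, and `find_bug_ids` is an illustrative name.

```python
import re
from typing import List

# Assumed spellings: "PR1234", "bug 1234", "bug #1234" (case-insensitive).
_BUG_REF = re.compile(r"\b(?:PR|bug\s+#?)(\d+)\b", re.IGNORECASE)


def find_bug_ids(text: str) -> List[int]:
    """Return the distinct bug numbers referenced in text, in ascending order."""
    return sorted({int(number) for number in _BUG_REF.findall(text)})


if __name__ == "__main__":
    print(find_bug_ids("Fixes PR1234; see also bug 4321 and Bug #77."))
```

Any bare number matched this way is ambiguous once a second numbering exists, which is the friction described above.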
Tom Stellard via llvm-dev
2020-Jul-10 17:53 UTC
[llvm-dev] [cfe-dev] RFC: Bugzilla migration plan
On 7/10/20 1:12 PM, David Blaikie via cfe-dev wrote:
> If I recall correctly, the previous discussion had a fair bit of
> pushback on having two numbering systems. In part because the old bug
> numbers are littered throughout the codebase, commit messages, etc,
> and aren't always prefixed with "PR", sometimes just "bug XXXX" -
> having two numberings is likely to mean some amount of
> confusion/friction/ongoing cost to looking in multiple places for
> bugs.
>

This was my takeaway from the discussions too. I think it's important that we try to preserve the single numbering system.

-Tom

> (but I'm not personally holding this process up on that issue - I
> think I can live with it, if that's what those with the
> time/inclination to make this migration happen decide is the right
> tradeoff)
Mehdi AMINI via llvm-dev
2021-Jun-24 06:00 UTC
[llvm-dev] [cfe-dev] RFC: Bugzilla migration plan
Hi,

I don't know where we're at with the migration from Bugzilla to GitHub issues, but there are improvements from GitHub in this area these days: https://github.blog/changelog/2021-06-23-issues-forms-beta-for-public-repositories/

This seems interesting!

--
Mehdi