Tom Woolman
2022-Mar-03 21:04 UTC
[R] Looking for package for data generation for classification and regression
Hi Paul.

Have you considered just going onto Kaggle and GitHub and searching for some of the many freely available real datasets that are posted there? I'm seeing a lot of productivity these days in research focused on data generation, and not just on creating algorithms and predictive models. Which is a good thing for us ;)

One of the research papers I'm currently working on is based on mining a dataset I discovered on Kaggle a few months back and trying to create a novel solution for it. Proper credit will of course be given to the data provider in the citation references.

Thanks,
Tom

On 2022-03-03 16:00, Paul Smith wrote:
> Dear All,
>
> I am in need of generating artificial data for machine learning
> classification and regression analysis. What I am looking for is
> something similar to Python sklearn.datasets.make_classification and
> sklearn.datasets.make_regression:
>
> https://scikit-learn.org/stable/modules/generated/sklearn.datasets.make_classification.html
>
> https://scikit-learn.org/stable/modules/generated/sklearn.datasets.make_regression.html
>
> I have searched CRAN for something similar, but found nothing. Could
> someone please help me with this?
>
> Thanks in advance,
>
> Paul
Paul Smith
2022-Mar-03 21:11 UTC
[R] Looking for package for data generation for classification and regression
Sounds interesting, Tom! Thanks!

I am trying to find datasets for creating assignments for students of a machine learning course.

Paul
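For readers wondering what this kind of artificial data generation involves, here is a minimal sketch in plain Python (standard library only). It is not the scikit-learn implementation and not a CRAN package: the helper names `make_regression_sketch` and `make_classification_sketch`, their parameters, and the coefficient and center ranges are all hypothetical, chosen only for illustration. The regression helper draws Gaussian features and builds the target as a linear combination plus Gaussian noise; the classification helper draws one Gaussian cluster per class.

```python
import random


def make_regression_sketch(n_samples=100, n_features=3, noise=1.0, seed=0):
    """Hypothetical helper: y = X . coef + Gaussian noise (not sklearn's code)."""
    rng = random.Random(seed)  # seeded generator so results are reproducible
    # True coefficients, drawn once; the range is an arbitrary assumption.
    coef = [rng.uniform(-10, 10) for _ in range(n_features)]
    # Standard-normal features.
    X = [[rng.gauss(0, 1) for _ in range(n_features)] for _ in range(n_samples)]
    # Linear signal plus noise with standard deviation `noise`.
    y = [sum(c * x for c, x in zip(coef, row)) + rng.gauss(0, noise) for row in X]
    return X, y, coef


def make_classification_sketch(n_samples=100, n_features=2, n_classes=2,
                               class_sep=3.0, seed=0):
    """Hypothetical helper: one unit-variance Gaussian cluster per class."""
    rng = random.Random(seed)
    # One random cluster center per class, inside [-class_sep, class_sep].
    centers = [[rng.uniform(-class_sep, class_sep) for _ in range(n_features)]
               for _ in range(n_classes)]
    X, y = [], []
    for i in range(n_samples):
        label = i % n_classes  # cycle through the labels to keep classes balanced
        X.append([rng.gauss(mu, 1.0) for mu in centers[label]])
        y.append(label)
    return X, y


X_reg, y_reg, coef = make_regression_sketch(n_samples=5, n_features=2, noise=0.5)
X_cls, y_cls = make_classification_sketch(n_samples=6, n_features=2, n_classes=3)
print(len(X_reg), len(y_reg), y_cls)
```

Because the ingredients are only a seeded random number generator, normal draws, and a weighted sum, the same idea should carry over to other languages. The sketch does not try to reproduce the richer options of the sklearn functions mentioned in the thread.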