My data frame looks like below

dat = structure(list(a = c(66, 100, 100, 100, 100, 100, 100, 66, 100, 66),
                     b = c(100, 50, 100, 100, 100, 100, 100, 100, 100, 100),
                     c = c(75, 25, 75, 50, 50, 50, 50, 50, 75, 25)),
                class = "data.frame", row.names = c(NA, -10L))

The values are basically categories for each column; however, there may be missing values present, which are typically represented as NA.

My question is: can I directly use as.matrix(na.omit(dat)) to convert this to a matrix?

On Thu, 26 Jun 2025 at 05:27, Rolf Turner <rolfturner at posteo.net> wrote:
>
> On Thu, 26 Jun 2025 03:45:50 +0530
> Daniel Lobo <danielobo9976 at gmail.com> wrote:
>
> > Hi,
> >
> > I have a dataframe for which all columns are numeric but categorical.
>
> I don't understand what that means. Perhaps an example?
>
> > There are some missing values as well
> >
> > Typically, I have CSV file saved in drive, and then read it using
> > read.csv command
>
> Is that relevant?
>
> > Can I use as.matrix(na.omit(<<my dataframe loaded using read.csv>>))
> > to convert such dataframe to matrix?
> >
> > Is there any data loss or change that may occur if I use as.matrix
> > command?
>
> I think your question is too vague for anyone to be able to answer this.
>
> > I remember that some experts recommend not to use as.matrix()
> > command to convert a dataframe to matrix.
>
> My guess is that the problem is that as.matrix() will coerce all of the
> columns of a data frame to a common class, which might yield unexpected
> results.
>
> > Any guidance will be very helpful.
>
> It's possible that data.matrix() might be useful.
>
> But basically you should think carefully about what the nature of the
> entries of your data frame could possibly be, and then design your code
> to accommodate all of these possibilities, throwing an error if any
> entry does not conform to any of the possibilities that you envisage.
>
> cheers,
>
> Rolf Turner
>
> --
> Honorary Research Fellow
> Department of Statistics
> University of Auckland
> Stats. Dep't. (secretaries) phone:
>          +64-9-373-7599 ext. 89622
> Home phone: +64-9-480-4619
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide https://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
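Rolf's point about coercion to a common class can be illustrated with a small sketch. The data frame `mixed` and its columns are hypothetical and not from the thread; the sketch only assumes base R's as.matrix() and data.matrix().

```r
# Hypothetical mixed-type data frame (not from the thread).
mixed <- data.frame(score = c(66, 100, NA),
                    grade = c("low", "high", "high"),
                    stringsAsFactors = FALSE)

# One non-numeric column is enough to turn the whole matrix into character.
m_chr <- as.matrix(mixed)

# data.matrix() keeps numbers as numbers; character columns are converted
# to factors and then to their integer codes.
m_num <- data.matrix(mixed)

typeof(m_chr)
typeof(m_num)
m_num
```

If every column is already numeric, as in the posted example, neither function has anything to coerce, so the choice between them matters mostly for mixed or factor columns.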
With all due respect, why can't you try to convert the df to a matrix and see what happens? Am I missing something?

John

John David Sorkin M.D., Ph.D.
Professor of Medicine, University of Maryland School of Medicine;
Associate Director for Biostatistics and Informatics, Baltimore VA Medical Center Geriatrics Research, Education, and Clinical Center;
PI Biostatistics and Informatics Core, University of Maryland School of Medicine Claude D. Pepper Older Americans Independence Center;
Senior Statistician University of Maryland Center for Vascular Research;
Division of Gerontology and Palliative Care,
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
Cell phone 443-418-5382

________________________________________
From: R-help <r-help-bounces at r-project.org> on behalf of Daniel Lobo <danielobo9976 at gmail.com>
Sent: Wednesday, June 25, 2025 8:23 PM
To: Rolf Turner
Cc: r-help at r-project.org
Subject: Re: [R] Converting dataframe to matrix

My question is: can I directly use as.matrix(na.omit(dat)) to convert this to a matrix?
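The experiment suggested here might look like the sketch below. It uses the data frame posted in the thread (with `class = "data.frame"` written out) and a few base R checks; the variable name `m` is only for illustration.

```r
# The data frame posted in the thread.
dat <- structure(list(a = c(66, 100, 100, 100, 100, 100, 100, 66, 100, 66),
                      b = c(100, 50, 100, 100, 100, 100, 100, 100, 100, 100),
                      c = c(75, 25, 75, 50, 50, 50, 50, 50, 75, 25)),
                 class = "data.frame", row.names = c(NA, -10L))

m <- as.matrix(na.omit(dat))

str(m)                    # inspect storage type and dimensions
is.numeric(m)             # all columns were numeric, so nothing became character
dim(m)                    # no NA in this example, so no rows were dropped
all(m == as.matrix(dat))  # element-by-element comparison with the plain conversion
```

For this particular data frame the values come through unchanged; the checks are the part worth keeping when the real CSV is loaded.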
@vi@e@gross m@iii@g oii gm@ii@com
2025-Jun-26 02:15 UTC
[R] Converting dataframe to matrix
Daniel,

Do an experiment and see what happens; that works a lot of the time.

I note your question may be what happens if you use factors, as in categorical data. Factors have two sides, with one being a numerical index of sorts and the other one being a character string.

If I make two vectors for illustration:

greek <- factor(c("alpha", "beta", "gamma"))
hebrew <- factor(c("aleph", "bet", "gimmel"))

You can see the two sides in various ways like this:

> greek
[1] alpha beta  gamma
Levels: alpha beta gamma
> as.integer(greek)
[1] 1 2 3
> as.integer(hebrew)
[1] 1 2 3

Now make a data.frame like so:

mydf <- data.frame(Greek=greek, Hebrew=hebrew)

> mydf
  Greek Hebrew
1 alpha  aleph
2  beta    bet
3 gamma gimmel

Note the numbers are not visible, just text. So converting it to a matrix will show the text/characters:

> as.matrix(mydf)
     Greek   Hebrew
[1,] "alpha" "aleph"
[2,] "beta"  "bet"
[3,] "gamma" "gimmel"

If any columns happened to be numeric, a number like 666 becomes a string like "666".

If you want the factors as integers, for some unknown reason, you might want to make another data.frame from mydf like so:

mydf.int <- data.frame(Greek=as.integer(greek), Hebrew=as.integer(hebrew))

> mydf.int
  Greek Hebrew
1     1      1
2     2      2
3     3      3
> as.matrix(mydf.int)
     Greek Hebrew
[1,]     1      1
[2,]     2      2
[3,]     3      3
> typeof(as.matrix(mydf.int))
[1] "integer"

As others are saying, you can use as.matrix safely enough as long as you first guarantee everything is of a compatible type (such as numeric) so the result is uniform.

Or, are we missing something about your real question? Matrices are not always a great choice and data.frames can do many of the same things for a 2-D object especially if you use some packages that ...
-----Original Message-----
From: R-help <r-help-bounces at r-project.org> On Behalf Of Daniel Lobo
Sent: Wednesday, June 25, 2025 8:24 PM
To: Rolf Turner <rolfturner at posteo.net>
Cc: r-help at r-project.org
Subject: Re: [R] Converting dataframe to matrix

My question is: can I directly use as.matrix(na.omit(dat)) to convert this to a matrix?
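One case where the two sides of a factor matter for "numeric but categorical" data is when the category labels themselves look like numbers. The sketch below is an illustration under that assumption; the factor `f` and data frame `df` are hypothetical and not from the thread.

```r
# Hypothetical column: category labels that look like numbers, stored as a factor.
f <- factor(c("66", "100", "100", "66"))

codes  <- as.integer(f)                # the level codes, not the labels
values <- as.numeric(as.character(f))  # the labels converted back to numbers

levels(f)
codes
values

df <- data.frame(a = f)
data.matrix(df)  # factor columns become their integer codes
as.matrix(df)    # factor columns become character labels
```

Which of the two results counts as "no data change" depends on whether the downstream code expects the codes or the original labels, so it is worth deciding that before converting.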
Yes you can. Whether that will yield useful or misleading results depends on what analytical tools you intend to apply to the resulting matrix. Categorical data in matrices tends to be kind of a dead end in my experience... but ymmv.

On June 25, 2025 5:23:47 PM PDT, Daniel Lobo <danielobo9976 at gmail.com> wrote:
>My question is: can I directly use as.matrix(na.omit(dat)) to convert
>this to a matrix?

--
Sent from my phone. Please excuse my brevity.
Thanks for the confirmation. My only fear was whether applying as.matrix(na.omit(dat)) to a data frame like the one I shared would cause any data loss or change.

On Thu, 26 Jun 2025 at 07:53, Jeff Newmiller <jdnewmil at dcn.davis.ca.us> wrote:
>
> Yes you can. Whether that will yield useful or misleading results depends on what analytical tools you intend to apply to the resulting matrix. Categorical data in matrices tends to be kind of a dead end in my experience... but ymmv.
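On the question of data loss: the values that remain are not altered for an all-numeric data frame, but na.omit() removes every row that contains at least one NA, which is a loss of rows. A minimal sketch with a hypothetical data frame `dat_na` (not from the thread) shows how to see what was dropped.

```r
# Hypothetical data frame with missing values (not from the thread).
dat_na <- data.frame(a = c(66, 100, NA, 100),
                     b = c(100, 50, 100, NA),
                     c = c(75, 25, 75, 50))

kept <- na.omit(dat_na)   # drops every row containing at least one NA
m <- as.matrix(kept)

nrow(dat_na) - nrow(m)    # how many rows were dropped
attr(kept, "na.action")   # which rows were dropped
anyNA(m)                  # the resulting matrix has no missing values
```

Checking the "na.action" attribute before converting is one way to confirm that the number of discarded rows is acceptable.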
CAN you use as.matrix(dat) to convert your data frame to a matrix? Yes, certainly. Try it!

SHOULD you do this? Well, why do you WANT a matrix? Data frames act a lot like two-dimensional arrays as they are. If "the values are basically categories", then perhaps they should BE factors and processed as factors. For example, what meaning do you expect colMeans(dat) to have?

On Thu, 26 Jun 2025 at 12:25, Daniel Lobo <danielobo9976 at gmail.com> wrote:
> My question is: can I directly use as.matrix(na.omit(dat)) to convert
> this to a matrix?
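The suggestion to treat the columns as factors could be sketched as below, using the data frame from the thread. The name `dat_f` is hypothetical, and this is only one possible way to do the conversion in base R.

```r
# The data frame posted in the thread.
dat <- structure(list(a = c(66, 100, 100, 100, 100, 100, 100, 66, 100, 66),
                      b = c(100, 50, 100, 100, 100, 100, 100, 100, 100, 100),
                      c = c(75, 25, 75, 50, 50, 50, 50, 50, 75, 25)),
                 class = "data.frame", row.names = c(NA, -10L))

# Convert every column to a factor while keeping the data frame structure.
dat_f <- dat
dat_f[] <- lapply(dat, factor)

str(dat_f)
lapply(dat_f, table)  # counts per category, a summary that suits categorical data

colMeans(dat)         # runs without error, but averages the category labels
```

The per-category counts have a clear interpretation, whereas the column means treat labels such as 66 and 100 as quantities, which is the concern raised above.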