DynV Montrealer
2024-Jul-14 07:16 UTC
[R] Reinterpret data without saving it to a file 1st? Check for integer stopping at 1st decimal?
A small number of columns in the data I need to work with are strings; the rest are numbers. I'm using read_excel() from the readxl package to get the data; right after reading, the string columns are of type chr and the rest are num. I'm tasked with finding out which columns are integers. Following some advice, I tried saving the spreadsheet content to a CSV and then loading that, which works like a charm: the chr columns are the same, but a large portion of the num columns are now int. Is there a way to skip writing and reading a CSV and get the same transformation? Perhaps some way to break up the spreadsheet data (e.g. XLdata <- read_excel(...)), then put it back together without writing to a file (e.g. XLdataReformed <- reform(XLdata))?

In addition, from the is.integer() documentation I ran

    is.wholenumber <- function(x, tol = .Machine$double.eps^0.5) abs(x - round(x)) < tol

and I'm now trying to have it stop at the first value in a column that has a decimal part. Someone advised me to use break, and I scripted

    is_integer = TRUE
    for (current_row in seq_along(data$column)) {
      if (!is.wholenumber(data$column[current_row])) {
        is_integer = FALSE
        break
      }
    }

but I'm wondering if there's something better to check whether a column is entirely made of integers.

Thank you kindly for your help
Ivan Krylov
2024-Jul-14 11:08 UTC
[R] Reinterpret data without saving it to a file 1st? Check for integer stopping at 1st decimal?
On Sun, 14 Jul 2024 03:16:56 -0400, DynV Montrealer <dynvec at gmail.com> wrote:

> Perhaps some way to break the spreadsheet data (eg XLdata <-
> read_excel(...)), then put it back together without any writing to a
> file (eg XLdataReformed <- reform(XLdata)) ?

read_excel() is documented to return objects of class tibble:

https://cran.r-project.org/package=tibble/vignettes/tibble.html

Long story short, tibbles are named lists of columns, so it should be possible for you to access and replace the individual parts of them using the standard list subset syntax XLdata[[columnname]]. Lists are described in R Intro chapter 6 and many other books on R:

https://cran.r-project.org/doc/manuals/R-intro.html#Lists-and-data-frames
http://web.archive.org/web/20230415001551if_/http://ashipunov.info/shipunov/school/biol_240/en/visual_statistics.pdf
(see section 3.8.2 on page 93 and following)

> In addition, from is.integer() documentation I ran
>
> > is.wholenumber <- function(x, tol = .Machine$double.eps^0.5) abs(x
> > - round (x)) < tol
>
> and I'm now trying to have it stop at the 1st decimal content of a
> column.

If you'd like to write idiomatic R code, consider the fact that is.wholenumber is vectorised:

    is.wholenumber(c(1,2,3,pi))
    # [1]  TRUE  TRUE  TRUE FALSE

Given a vector of numbers, it will return a vector of the same length specifying whether each element can be considered a whole number. Combine it with all() and you can test the whole column in two function calls.

R also has a type.convert function that may be useful in this case:

https://search.r-project.org/R/refmans/utils/html/type.convert.html

-- 
Best regards,
Ivan
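The two suggestions above (all() over the vectorised is.wholenumber(), plus list-style column replacement) could be combined roughly as follows. This is only a sketch under assumptions: XLdata here is a made-up data frame standing in for the read_excel() result, its column names are invented, and is_whole_column is a hypothetical helper name that appears nowhere in the thread.

```r
# Tolerance-based whole-number test from the is.integer() help page.
is.wholenumber <- function(x, tol = .Machine$double.eps^0.5) {
  abs(x - round(x)) < tol
}

# Hypothetical stand-in for read_excel() output; column names are made up.
XLdata <- data.frame(
  id    = c(1, 2, 3),
  score = c(1.5, 2, 3),
  label = c("a", "b", "c"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: TRUE only for numeric columns whose non-missing
# values are all whole numbers. Character columns are rejected up front.
is_whole_column <- function(col) {
  is.numeric(col) && all(is.wholenumber(col), na.rm = TRUE)
}

# One logical value per column, named after the columns.
whole_cols <- vapply(XLdata, is_whole_column, logical(1))

# Replace the qualifying columns in place via list-style subsetting.
for (nm in names(whole_cols)[whole_cols]) {
  XLdata[[nm]] <- as.integer(round(XLdata[[nm]]))
}

str(XLdata)
```

Unlike the loop with break, the vectorised call examines every element rather than stopping at the first non-whole value, but it avoids an R-level loop and is usually fast enough for spreadsheet-sized data. Note that with na.rm = TRUE a column consisting only of NA would also count as whole; drop that argument if that is not wanted.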
DynV Montrealer
2024-Jul-15 05:36 UTC
[R] Ps: Reinterpret data without saving it to a file 1st? Check for integer stopping at 1st decimal?
The answer: https://statisticsglobe.com/change-classes-data-frame-columns-automatically-r
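For readers who would rather not follow the link, automatic class conversion without a CSV round trip can be sketched with type.convert(), the function mentioned earlier in the thread. Whether this matches the linked page line for line is an assumption. XLdata is again a made-up data frame standing in for the read_excel() result, and XLdataReformed simply reuses the name from the original question.

```r
# Hypothetical stand-in for read_excel() output: every number arrives as double.
XLdata <- data.frame(
  id    = c(1, 2, 3),
  score = c(1.5, 2, 3),
  label = c("a", "b", "c"),
  stringsAsFactors = FALSE
)

# Re-guess each column's type, much as reading a CSV would.
# as.is = TRUE keeps character columns as character instead of factor.
XLdataReformed <- type.convert(XLdata, as.is = TRUE)

# Inspect the resulting column classes.
sapply(XLdataReformed, class)
```

type.convert() works from the text representation of each value, so large round numbers that R prints in scientific notation (100000 becomes "1e+05") may be left as double. If that matters, testing with all(is.wholenumber(...)) and converting with as.integer() is the more explicit route.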