Bock, Michael
2003-Sep-12 17:01 UTC
[R] converting dataframe columns to vector and missing values
I am relatively new to R, but very pleased with what I can do with it so far. I am embarrassed to ask what seems like a simple question, but I am at my wits' end.

Basically, I have written a function to calculate a bootstrapped statistic on a list of values. The function works perfectly if I can feed it the right data. I am importing data into R as a dataframe, then assigning each column to the list and running the function using a for loop. The problem is: what is the best way to convert the columns to a list? The column names and the number of columns will vary depending on the dataset. I am currently converting the dataframe to a matrix and then assigning each column of the matrix to the list in turn:

    #InputData is the dataframe
    RunTests <- function (InputData)
    {
      n <- length(InputData)
      Chem <- colnames(InputData)
      for (i in 1:n){
        print (Chem[i])
        Data <- data.matrix(InputData)
        x <- Data[,n]
        na.omit(x)
        #print(x)
        UCL <- HallBoot(x)
        print (UCL)
      }
    }

Although this works some of the time, missing values are not removed. This is a huge problem, as the number of observations in each column is quite variable. Obviously na.omit is not working the way I expect. Any help would be appreciated, including a whole new approach to sending the data to the HallBoot function.

Michael J. Bock, PhD.
ARCADIS
24 Preble St. Suite 100
Portland, ME 04101
207.828.0046
fax 207.828.0062
Spencer Graves
2003-Sep-12 17:19 UTC
[R] converting dataframe columns to vector and missing values
Have you considered:

    x <- Data[!is.na(Data[,n]), n]

Does this do what you want? Vectors, arrays, and data frames can be indexed by number, by a logical vector, or by name if names are supplied. In this case, "!is.na(Data[,n])" is a logical vector whose length equals the number of rows of Data, so only the rows where column n is not missing are kept.

Hope this helps.
Spencer Graves
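
Two details in the posted loop are worth noting alongside the suggestion above. First, na.omit(x) returns a new vector and leaves x unchanged, so its result has to be assigned (or replaced by logical indexing as suggested). Second, Data[,n] selects the last column on every pass, because n is the number of columns and i is the loop index. The following is only a sketch of one possible "whole new approach": it applies the statistic to each column with sapply() after dropping that column's NAs. HallBoot is not shown in the thread, so a hypothetical stand-in (the mean) is used, and the example data and column names are made up for illustration.

```r
# Hypothetical stand-in for the poster's HallBoot(); the real function is
# not shown in the thread. Any function of a numeric vector would do.
HallBoot <- function(x) mean(x)

# Apply a statistic to every column of a data frame, dropping the
# missing values of each column separately.
RunTests <- function(InputData, stat = HallBoot) {
  sapply(InputData, function(col) {
    x <- col[!is.na(col)]   # logical indexing keeps only non-missing values
    stat(x)
  })
}

# Made-up example data with a different number of NAs in each column
InputData <- data.frame(
  Arsenic = c(1.2, NA, 3.4, 2.1),
  Lead    = c(NA, NA, 5.0, 7.0)
)

UCL <- RunTests(InputData)
print(UCL)   # named numeric vector, one entry per column
```

Because sapply() iterates over the columns of the data frame directly, no conversion to a matrix is needed, and the result carries the column names, so a separate print(Chem[i]) is unnecessary. If the existing loop is kept instead, the equivalent changes would be x <- Data[,i] followed by x <- x[!is.na(x)].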