Bock, Michael
2003-Sep-12 17:01 UTC
[R] converting dataframe columns to vector and missing values
I am relatively new to R, but very pleased with what I can do with it so
far.
I am embarrassed to ask what seems like a simple question, but I am at my
wits' end. Basically I have written a function to calculate a bootstrapped
statistic on a list of values. The function works perfectly if I can feed it
the right data. I am exporting data into R as a dataframe and then assigning
each column to the list and running the function using a for loop. The problem
is finding the best way to convert the columns to a list. The column names
and the number of columns will vary depending on the dataset. I am currently
converting the dataframe to a matrix and then assigning each column of the
matrix to the list in turn:
# InputData is the dataframe
RunTests <- function (InputData)
{
  n <- length(InputData)
  Chem <- colnames(InputData)
  for (i in 1:n){
    print (Chem[i])
    Data <- data.matrix(InputData)
    x <- Data[,n]
    na.omit(x)
    #print(x)
    UCL <- HallBoot(x)
    print (UCL)
  }
}
Although this works some of the time, missing values are not removed. This
is a huge problem as the number of observations in each column is quite
variable. Obviously the na.omit is not working the way I expect. Any help
would be appreciated, including a whole new approach to sending the data to
the HallBoot function.
Michael J. Bock, PhD.
ARCADIS
24 Preble St. Suite 100
Portland, ME 04101
207.828.0046
fax 207.828.0062
Spencer Graves
2003-Sep-12 17:19 UTC
[R] converting dataframe columns to vector and missing values
Have you considered:

    x <- Data[!is.na(Data[,n]), n]

Does this do what you want? Vectors, arrays, and data.frame can be indexed by number or by a logical vector -- and by names if such are supplied. In this case, "!is.na(Data[,n])" is a logical vector of length = number of rows of Data.

hope this helps.
spencer graves
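
For readers following the thread, here is a minimal sketch of the logical-indexing idea applied to every column. Two things in the posted loop are worth noting: `na.omit(x)` returns a new object rather than changing `x`, so its result has to be assigned, and `Data[,n]` always picks the last column because `n` is the column count (the loop index is `i`). HallBoot is not shown in the thread, so `hall_boot_stub` below is a hypothetical stand-in that just takes the mean; `run_tests` and `example_data` are likewise made up for illustration.

```r
# na.omit() does not modify its argument; the result must be assigned.
x <- c(1, NA, 3)
na.omit(x)                      # result is discarded here
still_has_na <- any(is.na(x))   # TRUE: x itself is unchanged

# Hypothetical stand-in for the poster's HallBoot(), which is not shown.
hall_boot_stub <- function(x) mean(x)

# Apply a statistic to each column of a data frame after dropping NAs.
# A data frame is a list of columns, so sapply() visits each one in turn
# and no conversion to a matrix is needed.
run_tests <- function(input_data, stat_fun = hall_boot_stub) {
  sapply(input_data, function(col) {
    clean <- col[!is.na(col)]   # logical indexing, as suggested above
    stat_fun(clean)
  })
}

# Made-up data with a different number of missing values per column.
example_data <- data.frame(
  arsenic = c(1, 2, NA, 3),
  lead    = c(NA, 10, NA, 20)
)

result <- run_tests(example_data)
print(result)
```

The result is a vector named by column, so the separate `colnames()` bookkeeping is not required. If the for loop is preferred, the same fix is `x <- Data[!is.na(Data[, i]), i]` or `x <- na.omit(Data[, i])` inside the loop.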