Ben Fairbank
2005-Jun-28 14:39 UTC
[R] Using data frames for EDA: Insert, Change name, delete columns? (Newcomer's question)
I am finding complex analyses easier than some elementary operations in R. In particular, I want to do some low-level exploratory data analysis with data in a data frame, but I cannot find commands to easily insert, remove (delete), rename, and re-order (arbitrarily, not sort) columns. I see that the micEcon package has an insertCol command, but that is for matrices, not data frames. I have looked through several introductory texts, but all seem to stop short of explaining the commands needed for the above.

Could a reader provide a reference where such commands are documented?

Thank you,
Ben Fairbank
Earl F. Glynn
2005-Jun-28 15:08 UTC
[R] Using data frames for EDA: Insert, Change name, delete columns? (Newcomer's question)
"Ben Fairbank" <BEN at SSANET.COM> wrote in message
news:CA612484A337C6479EA341DF9EEE14AC03AC1BB5 at hercules.ssainfo...

> I ... cannot find commands to easily insert,
> remove (delete), rename, and re-order (arbitrarily, not sort) columns....
> Could a reader provide a reference where such commands are
> documented?

There's a lot of info in the old R-Help postings, but searching for and finding an answer to a particular problem can be a bit of a pain.

Here's some info from some old R-Help postings that may help with your question:

DELETE TWO COLUMNS
-------------------------------------------------------

I have a data frame 'd2004' and I want to remove two columns: 'd2004$concentration' and 'd2004$stade'. I could do it as follows:

    > names(d2004)
     [1] "Localite"       "Date"           "parcelle"       "maille"         "presence.plant" "concentration"  "stade.culture"
     [8] "stade"          "Trou"           "Horizon"        "Profondeur"

    > d2004 <- d2004[, -c(6, 8)]

but I'd like to use column names (to avoid finding column numbers each time). I cannot find an easy way to do this...
I wonder why this works:

    > d2004[, "concentration"]

and this doesn't:

    > d2004 <- d2004[, -c("concentration", "stade")]

SOLUTIONS:

    d2004$concentration <- NULL
    d2004$stade <- NULL

or

    Newdata <- subset(d2004, select=-c(concentration,stade))

RENAMING COLUMNS
-------------------------------------------------------

This is a sample data frame:

    > myData <- data.frame( col1 = 1:3, col2 = 2:4, col3 = 3:5 )
    > myData
      col1 col2 col3
    1    1    2    3
    2    2    3    4
    3    3    4    5

You can change all names by:

    > names( myData ) <- c( "newcol1", "newcol2", "newcol3" )
    > myData
      newcol1 newcol2 newcol3
    1       1       2       3
    2       2       3       4
    3       3       4       5

Or a single name by (starting again from the original sample data frame):

    > names( myData )[ 2 ] <- "newcol2"
    > myData
      col1 newcol2 col3
    1    1       2    3
    2    2       3    4
    3    3       4    5

Or, if you know the name but not the column number:

    > names( myData )[ which( names( myData ) == "newcol2" ) ] <- "verynewcol2"
    > myData
      col1 verynewcol2 col3
    1    1           2    3
    2    2           3    4
    3    3           4    5

REORDERING COLUMNS
-------------------------------------------------------

I don't have a clipping for this one, but here's what I'd try:

    > myData <- data.frame( col1 = 1:3, col2 = 2:4, col3 = 3:5 )
    > myData
      col1 col2 col3
    1    1    2    3
    2    2    3    4
    3    3    4    5

    > MyData <- myData[,c(3,1,2)]
    > MyData
      col3 col1 col2
    1    3    1    2
    2    4    2    3
    3    5    3    4

-- 
efg

Earl F. Glynn
Bioinformatics
Stowers Institute for Medical Research
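The original question also asked about inserting a column and about working with names rather than positions, which the clippings above only partly cover. What follows is a minimal base-R sketch, assuming the same sample data frame myData. The helper insert_col is hypothetical, written only for this illustration; it is not part of base R or of any package. As for why -c("concentration", "stade") fails: unary minus is not defined for character vectors, so R stops with an error before any subsetting happens. Selecting the names to keep avoids the problem.

```r
# Sample data frame, same shape as the one used in the reply above.
myData <- data.frame(col1 = 1:3, col2 = 2:4, col3 = 3:5)

# Drop columns by name: build a logical mask of the columns to keep.
unwanted <- c("col2")
kept <- myData[, !(names(myData) %in% unwanted), drop = FALSE]

# Reorder columns by name instead of by position.
reordered <- myData[, c("col3", "col1", "col2")]

# Hypothetical helper (an assumption for this sketch, not a library function):
# insert `values` as a new column called `name` after column position `after`.
# Use after = 0 to put the new column first.
insert_col <- function(df, values, name, after) {
  stopifnot(length(values) == nrow(df),
            !(name %in% names(df)),
            after >= 0, after <= ncol(df))
  df[[name]] <- values                            # append as the last column
  n <- ncol(df)
  ord <- append(seq_len(n - 1), n, after = after) # move last index into place
  df[, ord, drop = FALSE]
}

inserted <- insert_col(myData, c(10, 20, 30), "newcol", after = 1)
```

Here kept holds col1 and col3, reordered holds the columns in the order col3, col1, col2, and inserted places newcol between col1 and col2. The drop = FALSE argument keeps the result a data frame even when only one column remains.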