drflxms
2008-Sep-06 19:00 UTC
[R] how to address last and all but last column in dataframe
Dear R-colleagues,

another question from a newbie: I am creating a lot of simple pivot-charts from my raw data using the reshape-package. In these charts we have medical doctors judging videos in the columns and the videos they judge in the rows. Simple example of chart/data.frame "input" with two categories 1/0:

   video 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21
1      1 0 0 0 0 0 0 0 0 0  0  0  0  0  0  0  0  0  0  0  0  0
2      2 0 0 0 0 0 0 0 0 0  0  0  0  0  0  0  0  0  0  0  0  1
3      3 0 0 0 0 0 0 0 0 0  0  0  0  0  0  0  0  0  0  0  0  0
4      4 0 0 0 0 0 0 0 0 0  0  0  0  0  0  0  0  0  0  0  0  0
5      5 0 0 0 0 0 0 0 0 0  0  0  0  0  0  0  0  0  0  0  1  0
6      6 0 0 0 0 0 0 0 0 0  0  0  0  0  1  0  0  0  0  0  0  0
7      7 0 0 0 0 0 0 0 0 0  0  0  0  0  0  0  0  0  0  0  0  0
8      8 0 0 0 0 0 0 0 0 0  0  0  0  0  0  1  0  0  0  0  0  0
9      9 0 0 0 0 0 0 0 0 0  1  0  1  1  0  1  1  0  0  0  1  0
10    10 0 0 0 0 0 0 0 0 0  0  0  0  0  0  0  0  0  0  0  0  0

I recently learned that I can easily create a confusion matrix out of this data using the following commands:

pairs<-data.frame(pred=factor(unlist(input[2:21])),ref=factor(input[,22]))
pred<-pairs$pred
ref <- pairs$ref
library (caret)
confusionMatrix(pred, ref, positive=1)

- where the column named "21" (the 22nd and last column of the data.frame) is the reference/gold standard.

My problem now is that I analyse data.frames with an unknown number of columns. So to get rid of the first and last column for the "pred" variable and to select the last column for the "ref" variable, I have to look at the data.frame before running the above commands in order to set the proper column numbers.

It would be very convenient if I could address the last column not by number (where I have to count beforehand) but by a variable "last column".

There is probably an easier solution for this problem using the names of the columns as well: the reference is always named "21" and the first column is always called "video". So I tried:

attach(input)
pairs<-data.frame(pred=factor(unlist(input[[,-c(video,21)]])),ref=factor(input[[21]]))

which unfortunately does not work :-(.

I'd be very happy if someone could help me out, because I am really tired of counting - there are a lot of tables to analyse...

Cheers and greetings from Munich,
Felix
Mark Difford
2008-Sep-06 19:50 UTC
[R] how to address last and all but last column in dataframe
Hi Felix,

>> My problem is now, that I analyse data.frames with an unknown count of
>> columns. So to get rid of the first and last column for the "pred"
>> variable and to select the last column for the "ref" variable, ...

Doubtless there are other routes. Generally I use ?length to get the number of columns. Then do your arithmetic within the indexing operator ?"[" to select what you want.

## Dummy ex. to select first and last column of any data frame ( = DF )
DF[ , c(1, length( names( DF ) ) ) ]

## Dummy ex. to select first and penultimate column of any data frame
DF[ , c(1, length( names( DF ) ) -1 ) ]

HTH, Mark.
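Note: the indexing idiom from the reply above can be tried on a small made-up data frame. This is only a minimal sketch; the data frame DF and its column names (video, r1, r2, ref) are invented for illustration and are not the poster's data.

```r
# Hypothetical toy data frame standing in for DF; the column names are made up.
DF <- data.frame(video = 1:3,
                 r1    = c(0, 1, 0),
                 r2    = c(0, 0, 1),
                 ref   = c(0, 1, 1))

# Number of columns; ncol(DF) and length(DF) give the same value.
n <- length(names(DF))

first_and_last         <- DF[, c(1, n)]      # first and last column
first_and_penultimate  <- DF[, c(1, n - 1)]  # first and penultimate column
all_but_first_and_last <- DF[, -c(1, n)]     # negative indices drop columns
last_only              <- DF[, n]            # a single column comes back as a plain vector
```

Because n is computed from the data frame itself, the same lines work unchanged for tables with any number of columns.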
David Winsemius
2008-Sep-06 19:52 UTC
[R] how to address last and all but last column in dataframe
Not sure where your "input" came from. It's not in a format I would have expected of an R object, and the first line is not in a form that would be particularly easy to read into a valid R object. Numbers are not syntactically valid object names. It's also not clear what you want to do with the duplicated line numbers at the beginning. Your question implies that you do not consider them part of the data.

In the future, a worked example along the lines of the one constructed by Jorge Ivan Velez in a recent answer to another question might increase the chances of a prompt reply with tested code:

# Data set
DF=read.table(textConnection("V1 V2 V3
a b 0:1:12
d f 1:2:1
c d 1:0:9
b e 2:2:6
f c 5:5:0"),header=TRUE)
closeAllConnections()

The "length" of a dataframe is the number of columns.

?length

Dataframes can be referenced using the extract operation, e.g. df[<row>, <col>]

?Extract  # for additional information on indexing using column vectors.

So:

input[ , length(input)]  # should return the last column as a vector, although it will no longer be named.

The rest of the dataframe, with intact column names, could be obtained with:

input[ , -length(input)]

--
David Winsemius
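Note: two points from the reply above, numeric column names and the last column losing its name, can be seen on a tiny made-up table. This is a sketch under assumptions: the text in txt is invented, and it uses the text argument of read.table, which is available in current versions of R but was not yet available when this thread was written.

```r
# Invented three-row table whose header contains numeric column names.
txt <- "video 1 2 3
1 0 0 1
2 0 1 1
3 0 0 0"

# By default read.table repairs non-syntactic names: "1" becomes "X1", etc.
input <- read.table(text = txt, header = TRUE)

# With check.names = FALSE the names stay as typed and must be quoted on access.
input_raw <- read.table(text = txt, header = TRUE, check.names = FALSE)
same_column <- input_raw[["3"]]

last_vec <- input[, length(input)]                # plain vector, column name is lost
last_df  <- input[, length(input), drop = FALSE]  # one-column data frame, name kept
rest     <- input[, -length(input)]               # everything except the last column
```

Using drop = FALSE is one way to keep the column name when a single column is selected; whether a vector or a one-column data frame is more convenient depends on what the next step expects.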
drflxms
2008-Sep-06 20:24 UTC
[R] how to address last and all but last column in dataframe
Hello Mr. Burns, hello Mr. Winsemius,

thank you very much for your incredibly quick and efficient replies! I was completely successful with the following command:

pairs<-data.frame(pred=factor(unlist(input[,-c(1,ncol(input))])),ref=factor(input[,ncol(input)]))

For the "input" example data.frame I sent with my question, the above code is equivalent to:

pairs<-data.frame(pred=factor(unlist(input[2:21])),ref=factor(input[,22]))

Great! That is exactly what I was looking for! This simple code will save me hours!

Patrick, your book does in fact look very interesting and will be my perfect reading material for the following nights :-) (probably not only the first chapter ;-). Thanks for the hint - and the free book of course.

David, the "input" data.frame is the result of the reshape command I performed. I just copied it from the R console into the e-mail. In fact the first column "video" is not part of the data, but it is needed for analysis with the kappam.fleiss function of the irr package. Sorry, you are absolutely correct, I should have mentioned this in my question. I will do better when I ask my next question :-).

Again, I would like to thank you for your help and wish you a pleasant Sunday.

Greetings from Munich,
Felix

Patrick Burns wrote:
> If I understand properly, you want
>
> input[, -c(1, ncol(input))]
>
> rather than
>
> input[[, -c(video, 21)]]
>
> Chapter 1 of S Poetry might be of interest to you.
>
> Patrick Burns
> patrick at burns-stat.com
> +44 (0)20 8525 0696
> http://www.burns-stat.com
> (home of S Poetry and "A Guide for the Unwilling S User")
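Note: since the original question mentions many tables, the accepted one-liner could be wrapped in a small function and applied to a list of data frames. The following is only a sketch: the function name make_pairs and the two toy tables are hypothetical, and base R's table() stands in for caret's confusionMatrix so that the example runs without extra packages.

```r
# Hypothetical helper: build the pred/ref pairs for one table, assuming the
# first column is an id ("video") and the last column is the reference.
make_pairs <- function(input) {
  n <- ncol(input)
  stopifnot(n >= 3)  # need an id column, at least one rater, and a reference
  raters <- input[, -c(1, n), drop = FALSE]  # all but first and last column
  data.frame(
    # unlist() stacks the rater columns one after another
    pred = factor(unlist(raters, use.names = FALSE)),
    # repeat the reference once per rater column so the rows line up
    ref  = factor(rep(input[, n], times = ncol(raters)))
  )
}

# Two invented tables with different numbers of rater columns.
tables <- list(
  a = data.frame(video = 1:3, r1 = c(0, 1, 0), r2 = c(0, 1, 1), ref = c(0, 1, 1)),
  b = data.frame(video = 1:2, r1 = c(1, 0), r2 = c(1, 0), r3 = c(0, 0), ref = c(1, 0))
)

all_pairs <- lapply(tables, make_pairs)
confusion <- lapply(all_pairs, function(p) table(pred = p$pred, ref = p$ref))
```

The original command relies on data.frame() recycling the reference column; the sketch spells that recycling out with rep(). The resulting pred and ref factors could be passed to confusionMatrix from the caret package in the same way as in the thread, assuming that package is installed.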