Paul Miller
2012-Mar-20 13:50 UTC
[R] Reshaping data from long to wide without a "timevar"
Hello All,

I was wondering if it's possible to reshape data from long to wide in R without using a "timevar". I've pasted some sample data below along with some code. The data are sorted by Subject and Drug. I want to transpose the Drug variable into multiple columns in alphabetical order.

My data have a variable called "RowNo" that functions almost like a "timevar" but not quite. In Subject 6, Erlotinib has a RowNo value of 3 whereas Paclitaxel has a RowNo value of 2. So if I use reshape as in the first bit of code below, the columns for drug don't transpose in alphabetical order. That is, Paclitaxel appears in Drug.2 and Erlotinib appears in Drug.3 when it should be the other way around.

The next two bits of code represent a couple of other things I've tried. The cast function almost works but unfortunately makes a separate column for each drug (at least the way I'm using it). The unstack function works almost perfectly but, to my surprise, creates a list instead of a dataframe (which I understand is a different kind of list). I thought it might take a single line of code to convert the former structure to the latter, but this appears not to be the case.

So can I get what I want without adding a timevar to my data? And if I do need a timevar, what's the best way to add it?
Thanks,

Paul

connection <- textConnection("
005 1 Gemcitabine
005 2 Erlotinib
006 1 Gemcitabine
006 3 Erlotinib
006 2 Paclitaxel
009 1 Gemcitabine
009 2 Erlotinib
010 1 Gemcitabine
010 2 Erlotinib
010 3 Herceptin
")

TestData <- data.frame(scan(connection, list(Subject = 0, RowNo = 0, Drug = "")))
TestData$Subject <- as.integer(TestData$Subject)
TestData$RowNo <- as.integer(TestData$RowNo)
TestData$Drug <- as.character(TestData$Drug)

require(reshape)

Transpose <- reshape(TestData, direction="wide", idvar="Subject", timevar="RowNo", v.names="Drug")
Transpose

Transpose <- melt(TestData, id.var="Subject", measure.var="Drug")
Transpose <- cast(Transpose, Subject ~ value)
Transpose

Transpose <- unstack(TestData, Drug ~ Subject)
Transpose
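On the unstack() attempt: it returns a list because the subjects have different numbers of drugs. One possible way to turn such a list into a data frame is to pad each subject's vector with NA to a common length and bind the rows. The following is only a sketch under assumptions: it uses split(), which yields much the same per-subject list, and the names drug_list, padded and wide are illustrative, not part of the original post.

```r
# Sample data mirroring the TestData layout in the post
TestData <- data.frame(
  Subject = c(5L, 5L, 6L, 6L, 6L, 9L, 9L, 10L, 10L, 10L),
  RowNo   = c(1L, 2L, 1L, 3L, 2L, 1L, 2L, 1L, 2L, 3L),
  Drug    = c("Gemcitabine", "Erlotinib",
              "Gemcitabine", "Erlotinib", "Paclitaxel",
              "Gemcitabine", "Erlotinib",
              "Gemcitabine", "Erlotinib", "Herceptin"),
  stringsAsFactors = FALSE
)

# One character vector per subject, kept in the order the rows occur
drug_list <- split(TestData$Drug, TestData$Subject)

# Pad every vector with NA up to the length of the longest one
max_len <- max(lengths(drug_list))
padded <- lapply(drug_list, function(x) {
  c(x, rep(NA_character_, max_len - length(x)))
})

# Bind the padded vectors row-wise and attach the subject IDs
wide <- data.frame(
  Subject = as.integer(names(padded)),
  do.call(rbind, unname(padded)),
  stringsAsFactors = FALSE
)
names(wide)[-1] <- paste0("Drug.", seq_len(max_len))
wide
```

Because the columns follow row order within each subject, the data would need to be sorted the desired way before the split.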
R. Michael Weylandt
2012-Mar-20 14:01 UTC
[R] Reshaping data from long to wide without a "timevar"
If I understand you right,

library(reshape2)
dcast(melt(TestData, id.var = "Subject", measure.var = "Drug"), Subject ~ value)

Michael
Paul Miller
2012-Mar-21 13:20 UTC
[R] Reshaping data from long to wide without a "timevar"
Hello All,

I tried some more Internet searches and came to the conclusion that one probably does need to create a "timevar" before reshaping from long to wide. Below is some code that creates the "timevar" and transposes the data.

connection <- textConnection("
005 1 Gemcitabine
005 2 Erlotinib
006 1 Gemcitabine
006 3 Erlotinib
006 2 Paclitaxel
009 1 Gemcitabine
009 2 Erlotinib
010 1 Gemcitabine
010 2 Erlotinib
010 3 Herceptin
")

TestData <- data.frame(scan(connection, list(Subject = 0, RowNo = 0, Drug = "")))
TestData$Drug <- as.character(TestData$Drug)
# Two leading spaces put Gemcitabine first in the sort order; one leading space puts Erlotinib second
TestData$Drug <- with(TestData, ifelse(Drug == "Gemcitabine", "  Gemcitabine", Drug))
TestData$Drug <- with(TestData, ifelse(Drug == "Erlotinib", " Erlotinib", Drug))
TestData <- with(TestData, TestData[order(Subject,Drug), c("Subject", "Drug")])

require(reshape)

Qualifying_Regimen <- TestData
# Ordvar counts rows within each Subject and serves as the "timevar"
Qualifying_Regimen$Ordvar <- with(TestData, ave(1:nrow(Qualifying_Regimen), Subject, FUN = seq_along))
Qualifying_Regimen <- reshape(Qualifying_Regimen, direction="wide", idvar="Subject", timevar="Ordvar", v.names="Drug")

TestData
Qualifying_Regimen

The code for creating the timevar came from a response to an earlier post by Joshua Wiley, which can be found at:

http://tolstoy.newcastle.edu.au/R/e15/help/11/07/0387.html

All in all this works pretty well, and making the "timevar" is easy once you know how. If the gods are listening though, it would be nice if reshape could transpose from long to wide based solely on the order in which observations occur in the data. (Assuming of course that it can't do so already.)

Time for a small confession. I've referred in my posts to alphabetical sorting of my Drug column and to wanting the transposed columns to be in alphabetical order. In my actual data, I added two spaces before the drug "Gemcitabine" and one space before the drug "Erlotinib" so that these drugs would always come first and second in the sort order. Unfortunately, I neglected to mention I had done this and as a result must have caused people some confusion. My apologies for this oversight.

Thanks,

Paul
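As a possible alternative to padding drug names with spaces, the sort order could be controlled with factor levels, leaving the drug strings untouched. This is only a sketch under assumptions: the priority vector (Gemcitabine first, Erlotinib second, everything else alphabetical) is inferred from the description above, and the names priority, others, DrugOrd and wide are illustrative.

```r
# Sample data in the same (unsorted) order as in the post
TestData <- data.frame(
  Subject = c(5L, 5L, 6L, 6L, 6L, 9L, 9L, 10L, 10L, 10L),
  Drug    = c("Gemcitabine", "Erlotinib",
              "Gemcitabine", "Erlotinib", "Paclitaxel",
              "Gemcitabine", "Erlotinib",
              "Gemcitabine", "Erlotinib", "Herceptin"),
  stringsAsFactors = FALSE
)

# Assumed preference: these two drugs first, all others alphabetically after
priority <- c("Gemcitabine", "Erlotinib")
others   <- sort(setdiff(unique(TestData$Drug), priority))
TestData$DrugOrd <- factor(TestData$Drug, levels = c(priority, others))

# Sort by subject, then by factor level order (not by the strings themselves)
TestData <- TestData[order(TestData$Subject, TestData$DrugOrd),
                     c("Subject", "Drug")]

# Within-subject counter that plays the role of the "timevar"
TestData$Ordvar <- ave(seq_len(nrow(TestData)), TestData$Subject,
                       FUN = seq_along)

# Base R reshape() from long to wide
wide <- reshape(TestData, direction = "wide", idvar = "Subject",
                timevar = "Ordvar", v.names = "Drug")
rownames(wide) <- NULL
wide
```

The drug names stay unmodified in the output, and changing the priority vector is enough to change which drugs lead each row.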