Paul Miller
2012-Mar-20 13:50 UTC
[R] Reshaping data from long to wide without a "timevar"
Hello All,
I was wondering if it's possible to reshape data from long to wide in R
without using a "timevar". I've pasted some sample data below
along with some code. The data are sorted by Subject and Drug. I want to
transpose the Drug variable into multiple columns in alphabetical order.
My data have a variable called "RowNo" that functions almost like a
"timevar" but not quite. In Subject 6, Erlotinib has a RowNo value of
3 whereas Paclitaxel has a RowNo value of 2. So if I use reshape as in the first
bit of code below, the columns for drug don't transpose in alphabetical
order. That is, Paclitaxel appears in Drug.2 and Erlotinib appears in Drug.3
when it should be the other way around.
The next two bits of code represent a couple of other things I've tried. The
cast function almost works but unfortunately makes a separate column for each
drug (at least the way I'm using it). The unstack function works almost
perfectly but to my surprise creates a list instead of a dataframe (which I
understand is a different kind of list). Thought it might take a single line of
code to convert the former structure to the latter but this appears not to be
the case.
So can I get what I want without adding a timevar to my data? And if I do need a
timevar, what's the best way to add it?
Thanks,
Paul
connection <- textConnection("
005 1 Gemcitabine
005 2 Erlotinib
006 1 Gemcitabine
006 3 Erlotinib
006 2 Paclitaxel
009 1 Gemcitabine
009 2 Erlotinib
010 1 Gemcitabine
010 2 Erlotinib
010 3 Herceptin
")
TestData <- data.frame(scan(connection, list(Subject = 0, RowNo = 0, Drug = "")))
TestData$Subject <- as.integer(TestData$Subject)
TestData$RowNo <- as.integer(TestData$RowNo)
TestData$Drug <- as.character(TestData$Drug)
require(reshape)
Transpose <- reshape(TestData, direction="wide",
idvar="Subject", timevar="RowNo", v.names="Drug")
Transpose
Transpose <- melt(TestData, id.var="Subject",
measure.var="Drug")
Transpose <- cast(Transpose, Subject ~ value)
Transpose
Transpose <- unstack(TestData, Drug ~ Subject)
Transpose
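The list returned by unstack, mentioned above, can be turned into a data frame by padding each subject's drugs to a common length. The following is only a sketch under a few assumptions: the sample data are rebuilt inline rather than read with scan, the drugs are sorted in plain alphabetical order, and the names `drugs_by_subject`, `padded`, and `wide` are illustrative choices, not anything from the reshape package.

```r
# Sample data rebuilt inline (same values as the scan() input above).
TestData <- data.frame(
  Subject = c(5L, 5L, 6L, 6L, 6L, 9L, 9L, 10L, 10L, 10L),
  RowNo   = c(1L, 2L, 1L, 3L, 2L, 1L, 2L, 1L, 2L, 3L),
  Drug    = c("Gemcitabine", "Erlotinib", "Gemcitabine", "Erlotinib",
              "Paclitaxel", "Gemcitabine", "Erlotinib", "Gemcitabine",
              "Erlotinib", "Herceptin"),
  stringsAsFactors = FALSE
)

# unstack() returns a list here because subjects have different numbers of drugs.
drugs_by_subject <- unstack(TestData, Drug ~ Subject)

# Sort each subject's drugs and pad with NA up to the longest regimen.
width <- max(lengths(drugs_by_subject))
padded <- lapply(drugs_by_subject, function(d) {
  d <- sort(d)
  length(d) <- width  # extending a vector fills the new slots with NA
  d
})

# Bind the padded vectors row-wise into one data frame, one row per subject.
wide <- data.frame(Subject = as.integer(names(padded)),
                   do.call(rbind, padded),
                   stringsAsFactors = FALSE)
names(wide)[-1] <- paste0("Drug.", seq_len(width))
rownames(wide) <- NULL
wide
```

This avoids a timevar entirely, at the cost of doing the padding by hand.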
R. Michael Weylandt
2012-Mar-20 14:01 UTC
[R] Reshaping data from long to wide without a "timevar"
If I understand you right,
library(reshape2)
dcast(melt(TestData, id.var = "Subject", measure.var = "Drug"), Subject ~ value)
Michael
Paul Miller
2012-Mar-21 13:20 UTC
[R] Reshaping data from long to wide without a "timevar"
Hello All,
Tried some more Internet searches and came to the conclusion that one probably
does need to create a "timevar" before reshaping from long to wide.
Below is some code that creates the "timevar" and transposes the data.
connection <- textConnection("
005 1 Gemcitabine
005 2 Erlotinib
006 1 Gemcitabine
006 3 Erlotinib
006 2 Paclitaxel
009 1 Gemcitabine
009 2 Erlotinib
010 1 Gemcitabine
010 2 Erlotinib
010 3 Herceptin
")
TestData <- data.frame(scan(connection, list(Subject = 0, RowNo = 0, Drug = "")))
TestData$Drug <- as.character(TestData$Drug)
# Pad with leading spaces (two, then one) so these drugs sort first and second
TestData$Drug <- with(TestData, ifelse(Drug == "Gemcitabine",
"  Gemcitabine", Drug))
TestData$Drug <- with(TestData, ifelse(Drug == "Erlotinib",
" Erlotinib", Drug))
TestData <- with(TestData, TestData[order(Subject,Drug),
c("Subject", "Drug")])
require(reshape)
Qualifying_Regimen <- TestData
Qualifying_Regimen$Ordvar <- with(TestData, ave(1:nrow(Qualifying_Regimen),
Subject, FUN = seq_along))
Qualifying_Regimen <- reshape(Qualifying_Regimen, direction="wide",
idvar="Subject", timevar="Ordvar", v.names="Drug")
TestData
Qualifying_Regimen
The code for creating the timevar came from a response to an earlier post by
Joshua Wiley which can be found at:
http://tolstoy.newcastle.edu.au/R/e15/help/11/07/0387.html
All in all this works pretty well, and making the "timevar" is easy
once you know how. If the gods are listening though, it would be nice if reshape
could transpose from long to wide based solely on the order in which
observations occur in the data. (Assuming of course that it can't do so
already.)
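For what it's worth, the "transpose by order of appearance" behaviour wished for above can be approximated with a small wrapper around ave and reshape. This is only a sketch: `widen_by_order` is a hypothetical helper name, not a function in base R or the reshape package, and it assumes the data are already sorted the way the columns should come out.

```r
# Hypothetical helper (not part of base R or reshape): builds the timevar
# from the order of rows within each id, then reshapes to wide.
widen_by_order <- function(data, idvar, v.name) {
  data <- data[, c(idvar, v.name)]
  # Running counter 1, 2, 3, ... within each level of idvar
  data$Ordvar <- ave(seq_len(nrow(data)), data[[idvar]], FUN = seq_along)
  wide <- reshape(data, direction = "wide", idvar = idvar,
                  timevar = "Ordvar", v.names = v.name)
  rownames(wide) <- NULL
  wide
}

# Small illustrative input; rows are used in the order given.
long <- data.frame(
  Subject = c(5L, 5L, 6L, 6L, 6L),
  Drug    = c("Gemcitabine", "Erlotinib", "Gemcitabine", "Paclitaxel",
              "Erlotinib"),
  stringsAsFactors = FALSE
)
wide <- widen_by_order(long, "Subject", "Drug")
wide
```

The wrapper does nothing that the code above does not already do; it just hides the temporary "timevar" so the call site reads as a single step.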
Time for a small confession. I've referred in my posts to alphabetical
sorting of my Drug column and to wanting the transposed columns to be in
alphabetical order. In my actual data, I added two spaces before the drug
"Gemcitabine" and one space before the drug "Erlotinib" so
that these drugs would always come first and second in the sort order.
Unfortunately, I neglected to mention I had done this and as a result must have
caused people some confusion. My apologies for this oversight.
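A possible alternative to padding the drug names with spaces is to sort on a factor whose levels spell out the desired order, which leaves the names themselves untouched. The sketch below assumes the intended order is Gemcitabine first, Erlotinib second, and everything else alphabetical; `priority`, `others`, `drug_rank`, and `Sorted` are illustrative names. How leading spaces sort may also depend on the collation locale, which an explicit factor sidesteps.

```r
# Small inline sample (a subset of the data above, without RowNo).
TestData <- data.frame(
  Subject = c(5L, 5L, 6L, 6L, 6L, 10L, 10L, 10L),
  Drug    = c("Gemcitabine", "Erlotinib", "Gemcitabine", "Erlotinib",
              "Paclitaxel", "Gemcitabine", "Erlotinib", "Herceptin"),
  stringsAsFactors = FALSE
)

# Assumed ordering rule: the two priority drugs first, the rest alphabetical.
priority <- c("Gemcitabine", "Erlotinib")
others <- sort(setdiff(unique(TestData$Drug), priority))

# order() sorts a factor by its level order, not by the label text.
drug_rank <- factor(TestData$Drug, levels = c(priority, others))
Sorted <- TestData[order(TestData$Subject, drug_rank), ]
rownames(Sorted) <- NULL
Sorted
```

The sorted data can then be given an order-based "timevar" and reshaped exactly as in the code above.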
Thanks,
Paul