Ajay Ohri
2010-Nov-01 14:39 UTC
[R] transforming a dataset for association analysis RESHAPE2
I get the following message when using the reshape2 package:

> tDat.m <- melt(Dataset)
Using Item, Subject as id variables
> tDatCast <- acast(tDat.m, Subject ~ Item)
Aggregation function missing: defaulting to length

Problem statement: convert the data frame

    Subject   Item Score
1 Subject 1 Item 1     1
2 Subject 1 Item 2     0
3 Subject 1 Item 3     1
4 Subject 2 Item 1     1
5 Subject 2 Item 2     1
6 Subject 2 Item 3     0

to

    Subject Item 1 Item 2 Item 3 Item 4
1 Subject 1      1      0      1      1
5 Subject 2      1      1      0      0

Note: when I tried the "wide" method, the resulting vector ran out of memory. It is a dataset of approx. 100,000 lines.

Websites- http://decisionstats.com http://dudeofdata.com
Linkedin- www.linkedin.com/in/ajayohri

On Sat, Oct 30, 2010 at 5:41 PM, Rainer Hurling <rhurlin@gwdg.de> wrote:

> On 30.10.2010 13:50 (UTC+1), Santosh Srinivas wrote:
>
>> A more usable problem input would definitely help ... use dput to send a
>> reproducible sample to the group.
>>
>> I think the below should solve your problem:
>>
>> > tDat <- read.csv("Book1.csv")
>> > tDat
>>     Subject   Item Score
>> 1 Subject 1 Item 1     1
>> 2 Subject 1 Item 2     0
>> 3 Subject 1 Item 3     1
>> 4 Subject 2 Item 1     1
>> 5 Subject 2 Item 2     1
>> 6 Subject 2 Item 3     0
>>
>> > library("reshape2")
>> > tDat.m <- melt(tDat)
>> > tDatCast <- acast(tDat.m, Subject ~ Item)
>> > tDatCast
>>           Item 1 Item 2 Item 3
>> Subject 1      1      0      1
>> Subject 2      1      1      0
>
> # Or without using package reshape2, only function reshape from stats:
>
> df <- data.frame(Subject =
>                    c("Subject 1","Subject 1","Subject 1","Subject 1",
>                      "Subject 2","Subject 2","Subject 2","Subject 2"),
>                  Item =
>                    c("Item 1","Item 2","Item 3","Item 4",
>                      "Item 1","Item 2","Item 3","Item 4"),
>                  Score = c(1,0,1,1,1,1,0,0))
>
> df.wide <- reshape(df, idvar="Subject", timevar="Item", direction="wide")
> names(df.wide) <- c("Subject",unique(as.character(df$Item)))
>
> df.wide
>     Subject Item 1 Item 2 Item 3 Item 4
> 1 Subject 1      1      0      1      1
> 5 Subject 2      1      1      0      0
>
>> -----Original Message-----
>> From: r-help-bounces@r-project.org [mailto:r-help-bounces@r-project.org]
>> On Behalf Of Ajay Ohri
>> Sent: 30 October 2010 16:27
>> To: Rhelp
>> Subject: [R] transforming a dataset for association analysis
>>
>> Hi
>>
>> I would like to transform a data frame like
>>
>> Subject    Item    Score
>> Subject 1  Item 1  1
>> Subject 1  Item 2  0
>> Subject 1  Item 3  1
>> Subject 2  Item 1  1
>> Subject 2  Item 2  1
>> Subject 2  Item 3  0
>> ....
>>
>> to
>>
>> Subject    Item1  Item2  Item3 ..... Item N
>> Subject1   1      0      1
>> Subject2   1      1      0
>> ........
>> SubjectP..
>>
>> I apologize for the simple nature of my query, but I am stuck. How can I do
>> this transformation?
>>
>> Regards
>>
>> Ajay
>>
>> On Sat, Oct 30, 2010 at 2:39 PM, Alaios <alaios@yahoo.com> wrote:
>>
>>> Hello everyone.
>>> I have written quite a big function that at the end correctly returns the
>>> values I want. I found a rare exception that I want to cover as well. The
>>> easiest for me would be to write something like this:
>>>
>>> function(){
>>>
>>>   if (rare exception happened)
>>>     return that value
>>>
>>>   # Then comes the code for normal execution
>>>   # ...
>>>   # ...
>>>   return value # Normal values to return
>>>
>>> }
>>>
>>> Would that be feasible in R, or are two return statements not accepted?
>>>
>>> Regards
>>> Alex
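A note on the message reported in this post: as far as I know, reshape2 prints "Aggregation function missing: defaulting to length" when at least one combination of the casting variables (here Subject and Item) occurs more than once, so it counts rows instead of returning Score. The following is a minimal base-R sketch of how one might check for that. The toy data frame, including its deliberately repeated row, is hypothetical, since the original 100,000-line dataset was not posted.

```r
# Hypothetical toy data: the last row repeats the pair (Subject 1, Item 1).
dat <- data.frame(
  Subject = c("Subject 1", "Subject 1", "Subject 1",
              "Subject 2", "Subject 2", "Subject 1"),
  Item    = c("Item 1", "Item 2", "Item 3",
              "Item 1", "Item 2", "Item 1"),
  Score   = c(1, 0, 1, 1, 1, 1),
  stringsAsFactors = FALSE
)

# Count rows per Subject/Item cell; any count above 1 means the pair repeats.
cell_counts <- table(dat$Subject, dat$Item)
has_duplicates <- any(cell_counts > 1)

# Flag every row that belongs to a repeated pair (first and later copies).
pairs <- dat[c("Subject", "Item")]
is_dup <- duplicated(pairs) | duplicated(pairs, fromLast = TRUE)
dup_rows <- dat[is_dup, ]
print(dup_rows)
```

If has_duplicates is TRUE on the real data, the repeated rows in dup_rows would be the first thing to inspect before casting.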
Ista Zahn
2010-Nov-01 15:14 UTC
[R] transforming a dataset for association analysis RESHAPE2
Hi Ajay,

I'm not sure what the problem is, and I don't think your description is enough to reproduce it. This works fine for me:

library(reshape2)
dat <- read.table(textConnection('Subject Item Score
"Subject 1" "Item 1" 1
"Subject 1" "Item 2" 0
"Subject 1" "Item 3" 1
"Subject 2" "Item 1" 1
"Subject 2" "Item 2" 1
"Subject 2" "Item 3" 0'), header=TRUE)
closeAllConnections()
acast(dat, Subject~Item)

sessionInfo()
R version 2.12.0 (2010-10-15)
Platform: i686-pc-linux-gnu (32-bit)

locale:
 [1] LC_CTYPE=en_US.utf8       LC_NUMERIC=C
 [3] LC_TIME=en_US.utf8        LC_COLLATE=en_US.utf8
 [5] LC_MONETARY=C             LC_MESSAGES=en_US.utf8
 [7] LC_PAPER=en_US.utf8       LC_NAME=C
 [9] LC_ADDRESS=C              LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.utf8 LC_IDENTIFICATION=C

attached base packages:
[1] grid      stats     graphics  grDevices utils     datasets  methods
[8] base

other attached packages:
[1] reshape2_1.0  ggplot2_0.8.8 proto_0.3-8   reshape_0.8.3 plyr_1.2.1

loaded via a namespace (and not attached):
[1] stringr_0.4  tools_2.12.0

-Ista

--
Ista Zahn
Graduate student
University of Rochester
Department of Clinical and Social Psychology
http://yourpsyche.org
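The six-row example in this reply has exactly one row per Subject/Item pair, which may be why the message does not appear there. If the real data does contain repeated pairs, one option is to state explicitly how repeats should be combined. The sketch below does this in base R with tapply() and max(); it is a stand-in written for illustration, not the code used in the thread, and the toy data with its repeated row is hypothetical. reshape2's acast() accepts a fun.aggregate argument for the same purpose, but that call is not exercised here.

```r
# Hypothetical toy data: (Subject 1, Item 2) appears twice with scores 0 and 1,
# and (Subject 2, Item 3) does not appear at all.
dat <- data.frame(
  Subject = c("Subject 1", "Subject 1", "Subject 1",
              "Subject 2", "Subject 2", "Subject 1"),
  Item    = c("Item 1", "Item 2", "Item 3",
              "Item 1", "Item 2", "Item 2"),
  Score   = c(1, 0, 1, 1, 1, 1),
  stringsAsFactors = FALSE
)

# One cell per Subject/Item pair. max() collapses repeated 0/1 scores into
# "1 if the pair was ever scored 1"; this choice of rule is an assumption.
wide <- with(dat, tapply(Score, list(Subject = Subject, Item = Item), max))

# Pairs that never occur come back as NA; treat them as 0 here.
wide[is.na(wide)] <- 0
print(wide)
```

Whether max, sum, or some other rule is right depends on what a repeated Subject/Item row means in the data, which the thread does not say.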
Dennis Murphy
2010-Nov-01 18:55 UTC
[R] transforming a dataset for association analysis RESHAPE2
Hi:

xtabs() also works in this case:

> dat <- read.table(textConnection('Subject Item Score
+ "Subject 1" "Item 1" 1
+ "Subject 1" "Item 2" 0
+ "Subject 1" "Item 3" 1
+ "Subject 2" "Item 1" 1
+ "Subject 2" "Item 2" 1
+ "Subject 2" "Item 3" 0'), header=TRUE)
> closeAllConnections()
>
> acast(dat, Subject~Item)
Using Score as value column: use value_var to override.
          Item 1 Item 2 Item 3
Subject 1      1      0      1
Subject 2      1      1      0
> xtabs(Score ~ Subject + Item, data = dat)
           Item
Subject     Item 1 Item 2 Item 3
  Subject 1      1      0      1
  Subject 2      1      1      0
> df <- data.frame(Subject =
+                    c("Subject 1","Subject 1","Subject 1","Subject 1",
+                      "Subject 2","Subject 2","Subject 2","Subject 2"),
+                  Item =
+                    c("Item 1","Item 2","Item 3","Item 4",
+                      "Item 1","Item 2","Item 3","Item 4"),
+                  Score = c(1,0,1,1,1,1,0,0))
> xtabs(Score ~ Subject + Item, data = df)
           Item
Subject     Item 1 Item 2 Item 3 Item 4
  Subject 1      1      0      1      1
  Subject 2      1      1      0      0

HTH,
Dennis
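One caveat worth keeping in mind with the xtabs() approach: xtabs() sums Score within each Subject/Item cell, so a repeated pair can produce a value above 1. If the goal is a 0/1 table for association analysis, the sums can be thresholded afterwards. This is a minimal sketch under that assumption; the toy data with its repeated row is hypothetical.

```r
# Hypothetical toy data: (Subject 1, Item 1) appears twice, both with Score 1,
# and (Subject 2, Item 3) does not appear at all.
dat <- data.frame(
  Subject = c("Subject 1", "Subject 1", "Subject 1",
              "Subject 2", "Subject 2", "Subject 1"),
  Item    = c("Item 1", "Item 2", "Item 3",
              "Item 1", "Item 2", "Item 1"),
  Score   = c(1, 0, 1, 1, 1, 1),
  stringsAsFactors = FALSE
)

# xtabs() adds up Score per cell, so the repeated pair sums to 2.
counts <- xtabs(Score ~ Subject + Item, data = dat)

# Threshold the sums to get a plain 0/1 integer matrix with the same labels.
incidence <- matrix(as.integer(counts > 0),
                    nrow = nrow(counts),
                    dimnames = dimnames(counts))
print(incidence)
```

Pairs that are absent from the data come out as 0 in both counts and incidence, which matches the layout requested at the start of the thread.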