Ajay Ohri
2010-Nov-01 14:39 UTC
[R] transforming a dataset for association analysis RESHAPE2
I get the following message when using the reshape2 package:

> tDat.m<- melt(Dataset)
Using Item, Subject as id variables
> tDatCast<- acast(tDat.m,Subject~Item)
Aggregation function missing: defaulting to length

Note Problem Statement-

convert dataframe

    Subject   Item Score
1 Subject 1 Item 1     1
2 Subject 1 Item 2     0
3 Subject 1 Item 3     1
4 Subject 2 Item 1     1
5 Subject 2 Item 2     1
6 Subject 2 Item 3     0

to

    Subject Item 1 Item 2 Item 3 Item 4
1 Subject 1      1      0      1      1
5 Subject 2      1      1      0      0

Note- when I tried using the "wide" method the resultant vector went out of
memory- it's a dataset of approx 100,000 lines

Websites-
http://decisionstats.com
http://dudeofdata.com

Linkedin- www.linkedin.com/in/ajayohri

On Sat, Oct 30, 2010 at 5:41 PM, Rainer Hurling <rhurlin@gwdg.de> wrote:

> On 30.10.2010 13:50 (UTC+1), Santosh Srinivas wrote:
>
>> A more usable problem input would definitely help ... use dput to send a
>> reproducible sample to the group
>>
>> Think the below should solve your problem
>>
>> > read.csv("Book1.csv")
>>     Subject   Item Score
>> 1 Subject 1 Item 1     1
>> 2 Subject 1 Item 2     0
>> 3 Subject 1 Item 3     1
>> 4 Subject 2 Item 1     1
>> 5 Subject 2 Item 2     1
>> 6 Subject 2 Item 3     0
>>
>> > library("reshape2")
>> > tDat.m<- melt(tDat)
>> > tDatCast<- acast(tDat.m,Subject~Item)
>> > tDatCast
>>           Item 1 Item 2 Item 3
>> Subject 1      1      0      1
>> Subject 2      1      1      0
>
> # Or without using package reshape2, only function reshape from stats:
>
> df <- data.frame(Subject =
>                    c("Subject 1","Subject 1","Subject 1","Subject 1",
>                      "Subject 2","Subject 2","Subject 2","Subject 2"),
>                  Item =
>                    c("Item 1","Item 2","Item 3","Item 4",
>                      "Item 1","Item 2","Item 3","Item 4"),
>                  Score = c(1,0,1,1,1,1,0,0))
>
> df.wide <- reshape(df, idvar="Subject", timevar="Item", direction="wide")
> names(df.wide) <- c("Subject",unique(as.character(df$Item)))
>
> df.wide
>     Subject Item 1 Item 2 Item 3 Item 4
> 1 Subject 1      1      0      1      1
> 5 Subject 2      1      1      0      0
>
>> -----Original Message-----
>> From: r-help-bounces@r-project.org [mailto:r-help-bounces@r-project.org]
>> On Behalf Of Ajay Ohri
>> Sent: 30 October 2010 16:27
>> To: Rhelp
>> Subject: [R] transforming a dataset for association analysis
>>
>> Hi
>>
>> I would like to transform a data frame like
>>
>> Subject    Item    Score
>> Subject 1  Item 1  1
>> Subject 1  Item 2  0
>> Subject 1  Item 3  1
>> Subject 2  Item 1  1
>> Subject 2  Item 2  1
>> Subject 2  Item 3  0
>> ....
>> *to *
>>
>> Subject   Item1  Item2  Item3 .....Item N
>> Subject1  1      0      1
>> Subject2  1      1      0
>> ........
>> SubjectP..
>>
>> Apologize for the simple nature of my query but I am stuck. How can I do
>> this transformation?
>>
>> Regards
>>
>> Ajay
>>
>> On Sat, Oct 30, 2010 at 2:39 PM, Alaios<alaios@yahoo.com> wrote:
>>
>>> Hello everyone.
>>> I have written quite a big function that at the end correctly returns the
>>> values I want. I found a rare exception that I want to cover also. The
>>> easiest for me would be to write something like that
>>>
>>> function(){
>>>
>>>   if (rare exception happened)
>>>       return that value
>>>
>>>   # Then comes the code for normal execution
>>>   # ...
>>>   # ...
>>>   return value # Normal values to return
>>>
>>> }
>>>
>>> Would that be feasible in R, or are two return statements not accepted?
>>>
>>> Regards
>>> Alex
Ista Zahn
2010-Nov-01 15:14 UTC
[R] transforming a dataset for association analysis RESHAPE2
Hi Ajay,
I'm not sure what the problem is, and I don't think your description
is enough to reproduce it. This works fine for me:
library(reshape2)
dat <- read.table(textConnection('Subject Item Score
"Subject 1" "Item 1" 1
"Subject 1" "Item 2" 0
"Subject 1" "Item 3" 1
"Subject 2" "Item 1" 1
"Subject 2" "Item 2" 1
"Subject 2" "Item 3" 0'), header=TRUE)
closeAllConnections()
acast(dat, Subject~Item)
sessionInfo()
R version 2.12.0 (2010-10-15)
Platform: i686-pc-linux-gnu (32-bit)
locale:
[1] LC_CTYPE=en_US.utf8 LC_NUMERIC=C
[3] LC_TIME=en_US.utf8 LC_COLLATE=en_US.utf8
[5] LC_MONETARY=C LC_MESSAGES=en_US.utf8
[7] LC_PAPER=en_US.utf8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.utf8 LC_IDENTIFICATION=C
attached base packages:
[1] grid stats graphics grDevices utils datasets methods
[8] base
other attached packages:
[1] reshape2_1.0 ggplot2_0.8.8 proto_0.3-8 reshape_0.8.3 plyr_1.2.1
loaded via a namespace (and not attached):
[1] stringr_0.4 tools_2.12.0
-Ista
On Mon, Nov 1, 2010 at 10:39 AM, Ajay Ohri <ohri2007 at gmail.com> wrote:
> I get the following message when using the reshape2 package:
>
> > tDat.m<- melt(Dataset)
> Using Item, Subject as id variables
> > tDatCast<- acast(tDat.m,Subject~Item)
> Aggregation function missing: defaulting to length
>
> Note Problem Statement-
>
> convert dataframe
>
>     Subject   Item Score
> 1 Subject 1 Item 1     1
> 2 Subject 1 Item 2     0
> 3 Subject 1 Item 3     1
> 4 Subject 2 Item 1     1
> 5 Subject 2 Item 2     1
> 6 Subject 2 Item 3     0
>
> to
>
>     Subject Item 1 Item 2 Item 3 Item 4
> 1 Subject 1      1      0      1      1
> 5 Subject 2      1      1      0      0
>
> Note- when I tried using the "wide" method the resultant vector went out of
> memory- it's a dataset of approx 100,000 lines
--
Ista Zahn
Graduate student
University of Rochester
Department of Clinical and Social Psychology
http://yourpsyche.org
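
A possible explanation for the "Aggregation function missing: defaulting to length" message, offered as a sketch rather than a diagnosis of the real data: acast() prints it when at least one Subject/Item combination occurs more than once in the input, which is plausible in a file of about 100,000 lines. The data frame `dup` below is made up for illustration. `value.var` and `fun.aggregate` are the argument names in current reshape2 releases; the messages in this thread show the older spelling `value_var`.

```r
library(reshape2)  # assumption: a current reshape2 release is installed

# Hypothetical data: the pair "Subject 1" / "Item 1" appears twice
dup <- data.frame(
  Subject = c("Subject 1", "Subject 1", "Subject 1", "Subject 2"),
  Item    = c("Item 1", "Item 1", "Item 2", "Item 1"),
  Score   = c(1, 1, 0, 1)
)

# Count rows per Subject/Item pair; any count above 1 means duplicates
counts <- table(dup$Subject, dup$Item)
any(counts > 1)

# With duplicates and no aggregation function, acast() falls back to
# length() and returns the number of rows per cell, not the scores
n_per_cell <- acast(dup, Subject ~ Item, value.var = "Score")

# Naming an aggregation function states the intent explicitly;
# fill = 0 is used for Subject/Item pairs that never occur
wide <- acast(dup, Subject ~ Item, value.var = "Score",
              fun.aggregate = max, fill = 0)
wide
```

Whether max, sum, or something else is the right aggregation depends on what a repeated Subject/Item row means in the real data.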
Dennis Murphy
2010-Nov-01 18:55 UTC
[R] transforming a dataset for association analysis RESHAPE2
Hi:

xtabs() also works in this case:

> dat <- read.table(textConnection('Subject Item Score
+ "Subject 1" "Item 1" 1
+ "Subject 1" "Item 2" 0
+ "Subject 1" "Item 3" 1
+ "Subject 2" "Item 1" 1
+ "Subject 2" "Item 2" 1
+ "Subject 2" "Item 3" 0'), header=TRUE)
> closeAllConnections()
>
> acast(dat, Subject~Item)
Using Score as value column: use value_var to override.
          Item 1 Item 2 Item 3
Subject 1      1      0      1
Subject 2      1      1      0
> xtabs(Score ~ Subject + Item, data = dat)
           Item
Subject     Item 1 Item 2 Item 3
  Subject 1      1      0      1
  Subject 2      1      1      0
> df <- data.frame(Subject =
+                    c("Subject 1","Subject 1","Subject 1","Subject 1",
+                      "Subject 2","Subject 2","Subject 2","Subject 2"),
+                  Item =
+                    c("Item 1","Item 2","Item 3","Item 4",
+                      "Item 1","Item 2","Item 3","Item 4"),
+                  Score = c(1,0,1,1,1,1,0,0))
> xtabs(Score ~ Subject + Item, data = df)
           Item
Subject     Item 1 Item 2 Item 3 Item 4
  Subject 1      1      0      1      1
  Subject 2      1      1      0      0

HTH,
Dennis

On Mon, Nov 1, 2010 at 7:39 AM, Ajay Ohri <ohri2007@gmail.com> wrote:

> I get the following message when using the reshape2 package:
>
> > tDat.m<- melt(Dataset)
> Using Item, Subject as id variables
> > tDatCast<- acast(tDat.m,Subject~Item)
> Aggregation function missing: defaulting to length
>
> Note Problem Statement-
>
> convert dataframe
>
>     Subject   Item Score
> 1 Subject 1 Item 1     1
> 2 Subject 1 Item 2     0
> 3 Subject 1 Item 3     1
> 4 Subject 2 Item 1     1
> 5 Subject 2 Item 2     1
> 6 Subject 2 Item 3     0
>
> to
>
>     Subject Item 1 Item 2 Item 3 Item 4
> 1 Subject 1      1      0      1      1
> 5 Subject 2      1      1      0      0
>
> Note- when I tried using the "wide" method the resultant vector went out of
> memory- it's a dataset of approx 100,000 lines
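
On the out-of-memory note in the quoted message: xtabs() accepts sparse = TRUE, which returns a sparse matrix from the Matrix package instead of a dense table. The following is a minimal sketch, assuming Matrix is installed (it is normally shipped with R as a recommended package) and reusing the toy data from this thread. Whether this is enough for the real 100,000-line data set is untested here.

```r
library(Matrix)  # assumption: the Matrix package is available

# Toy data from the thread
dat <- data.frame(
  Subject = c("Subject 1", "Subject 1", "Subject 1",
              "Subject 2", "Subject 2", "Subject 2"),
  Item    = c("Item 1", "Item 2", "Item 3",
              "Item 1", "Item 2", "Item 3"),
  Score   = c(1, 0, 1, 1, 1, 0)
)

# sparse = TRUE requires exactly two factors on the right-hand side;
# scores of repeated Subject/Item pairs are summed, as with plain xtabs()
sp <- xtabs(Score ~ Subject + Item, data = dat, sparse = TRUE)

dim(sp)   # subjects by items
sp
```

If a dense matrix is needed later, as.matrix(sp) converts the result, at the cost of the memory the sparse form was meant to save.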