thr3ads.net - R help - [R] transforming a dataset for association analysis RESHAPE2 [Nov 2010]

Ajay Ohri

2010-Nov-01 14:39 UTC

[R] transforming a dataset for association analysis RESHAPE2

I get the following messages when using the reshape2 package:

> tDat.m<- melt(Dataset)
Using Item, Subject as id variables
> tDatCast<- acast(tDat.m,Subject~Item)
Aggregation function missing: defaulting to length


Note Problem Statement-

convert dataframe


Subject   Item Score
1 Subject 1 Item 1     1
2 Subject 1 Item 2     0
3 Subject 1 Item 3     1
4 Subject 2 Item 1     1
5 Subject 2 Item 2     1
6 Subject 2 Item 3     0

to


  Subject Item 1 Item 2 Item 3 Item 4
1 Subject 1      1      0      1      1
5 Subject 2      1      1      0      0

Note- when I tried using the "wide" method, the resulting vector ran out of
memory; it's a dataset of approx. 100,000 lines.



Websites-
http://decisionstats.com
http://dudeofdata.com


Linkedin- www.linkedin.com/in/ajayohri




On Sat, Oct 30, 2010 at 5:41 PM, Rainer Hurling <rhurlin@gwdg.de> wrote:
> On 30.10.2010 13:50 (UTC+1), Santosh Srinivas wrote:
>
>> A more usable problem input would definitely help ... use dput to send a
>> reproducible sample to the group
>>
>> Think the below should solve your problem
>>
>> > read.csv("Book1.csv")
>>     Subject   Item Score
>> 1 Subject 1 Item 1     1
>> 2 Subject 1 Item 2     0
>> 3 Subject 1 Item 3     1
>> 4 Subject 2 Item 1     1
>> 5 Subject 2 Item 2     1
>> 6 Subject 2 Item 3     0
>>
>> > library("reshape2")
>> > tDat.m<- melt(tDat)
>> > tDatCast<- acast(tDat.m,Subject~Item)
>> > tDatCast
>>           Item 1 Item 2 Item 3
>> Subject 1      1      0      1
>> Subject 2      1      1      0
>>
>
>
> # Or without using package reshape2, only function reshape from stats:
>
> df <- data.frame(Subject = c("Subject 1","Subject 1","Subject 1","Subject 1",
>                              "Subject 2","Subject 2","Subject 2","Subject 2"),
>                  Item    = c("Item 1","Item 2","Item 3","Item 4",
>                              "Item 1","Item 2","Item 3","Item 4"),
>                  Score   = c(1,0,1,1,1,1,0,0))
>
> df.wide <- reshape(df, idvar="Subject", timevar="Item", direction="wide")
> names(df.wide) <- c("Subject",unique(as.character(df$Item)))
>
> df.wide
>    Subject Item 1 Item 2 Item 3 Item 4
> 1 Subject 1      1      0      1      1
> 5 Subject 2      1      1      0      0
>
>
>
>> -----Original Message-----
>> From: r-help-bounces@r-project.org [mailto:r-help-bounces@r-project.org]
>> On Behalf Of Ajay Ohri
>> Sent: 30 October 2010 16:27
>> To: Rhelp
>> Subject: [R] transforming a dataset for association analysis
>>
>> Hi
>>
>> I would like to transform  a data frame like
>>
>> Subject    Item   Score
>> Subject 1 Item 1 1
>> Subject 1 Item 2 0
>> Subject 1 Item 3 1
>> Subject 2 Item 1 1
>> Subject 2 Item 2 1
>> Subject 2 Item 3 0
>> ....
>> *to *
>>
>> Subject      Item1   Item2   Item3 .....Item N
>> Subject1       1          0       1
>> Subject2       1          1        0
>> ........
>> SubjectP..
>>
>> Apologize for the simple nature of my query but I am stuck. How can I do
>> this transformation?
>>
>> Regards
>>
>> Ajay
>>
>>
>>
>> Websites-
>> http://decisionstats.com
>> http://dudeofdata.com
>>
>>
>> Linkedin- www.linkedin.com/in/ajayohri
>>
>>
>>
>>
>> On Sat, Oct 30, 2010 at 2:39 PM, Alaios<alaios@yahoo.com>  wrote:
>>
>>  Hello everyone.
>>> I have written quite a big function that at the end correctly returns the
>>> values I want. I found a rare exception that I want to cover also. The
>>> easier for me would be to write something like that
>>>
>>>
>>> function(){
>>>
>>>  if (rare exception happened)
>>>      return that value
>>>
>>>  # Then comes the code for normal execution
>>>  # ...
>>>  # ...
>>>  return value # Normal values to return
>>>
>>> }
>>>
>>>
>>> Would that be feasible with R, or are two return statements not accepted?
>>>
>>> Regards
>>> Alex
>>>
>>

Ista Zahn

2010-Nov-01 15:14 UTC


Hi Ajay,
I'm not sure what the problem is, and I don't think your description
is enough to reproduce it. This works fine for me


library(reshape2)

dat <- read.table(textConnection('Subject   Item Score
"Subject 1" "Item 1"     1
"Subject 1" "Item 2"     0
"Subject 1" "Item 3"     1
"Subject 2" "Item 1"     1
"Subject 2" "Item 2"     1
"Subject 2" "Item 3"     0'), header=TRUE)
closeAllConnections()

acast(dat, Subject~Item)

sessionInfo()
R version 2.12.0 (2010-10-15)
Platform: i686-pc-linux-gnu (32-bit)

locale:
 [1] LC_CTYPE=en_US.utf8       LC_NUMERIC=C
 [3] LC_TIME=en_US.utf8        LC_COLLATE=en_US.utf8
 [5] LC_MONETARY=C             LC_MESSAGES=en_US.utf8
 [7] LC_PAPER=en_US.utf8       LC_NAME=C
 [9] LC_ADDRESS=C              LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.utf8 LC_IDENTIFICATION=C

attached base packages:
[1] grid      stats     graphics  grDevices utils     datasets  methods
[8] base

other attached packages:
[1] reshape2_1.0  ggplot2_0.8.8 proto_0.3-8   reshape_0.8.3 plyr_1.2.1

loaded via a namespace (and not attached):
[1] stringr_0.4  tools_2.12.0

-Ista


-- 
Ista Zahn
Graduate student
University of Rochester
Department of Clinical and Social Psychology
http://yourpsyche.org
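
One plausible cause of the "Aggregation function missing: defaulting to length" message, which this thread does not confirm, is that some Subject/Item pairs occur more than once in the full dataset, so a cast on Subject~Item has several values per cell and falls back to counting them. A minimal sketch in base R for checking this; the small `dat` below is made-up illustration data, not the poster's file:

```r
# Hypothetical example data: "Subject 1" / "Item 1" appears twice.
dat <- data.frame(
  Subject = c("Subject 1", "Subject 1", "Subject 1", "Subject 2"),
  Item    = c("Item 1", "Item 1", "Item 2", "Item 1"),
  Score   = c(1, 1, 0, 1),
  stringsAsFactors = FALSE
)

# Number of rows falling into each Subject x Item cell.
n_per_cell <- table(dat$Subject, dat$Item)

# TRUE if any cell holds more than one row, i.e. a cast would need to aggregate.
has_duplicates <- any(n_per_cell > 1)

# The repeated rows themselves (second and later occurrences of a pair).
dup_rows <- dat[duplicated(dat[c("Subject", "Item")]), ]

print(n_per_cell)
print(has_duplicates)
print(dup_rows)
```

If duplicates do turn up, either remove them before casting or pass an explicit `fun.aggregate` to acast() so the choice of aggregation is deliberate rather than the default count.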

Dennis Murphy

2010-Nov-01 18:55 UTC


Hi:

xtabs() also works in this case:

> dat <- read.table(textConnection('Subject   Item Score
+ "Subject 1" "Item 1"     1
+ "Subject 1" "Item 2"     0
+ "Subject 1" "Item 3"     1
+ "Subject 2" "Item 1"     1
+ "Subject 2" "Item 2"     1
+ "Subject 2" "Item 3"     0'), header=TRUE)
> closeAllConnections()
>
> acast(dat, Subject~Item)
Using Score as value column: use value_var to override.
          Item 1 Item 2 Item 3
Subject 1      1      0      1
Subject 2      1      1      0
> xtabs(Score ~ Subject + Item, data = dat)
           Item
Subject     Item 1 Item 2 Item 3
  Subject 1      1      0      1
  Subject 2      1      1      0
> df <- data.frame(Subject = c("Subject 1","Subject 1","Subject 1","Subject 1",
+                              "Subject 2","Subject 2","Subject 2","Subject 2"),
+                  Item    = c("Item 1","Item 2","Item 3","Item 4",
+                              "Item 1","Item 2","Item 3","Item 4"),
+                  Score   = c(1,0,1,1,1,1,0,0))
> xtabs(Score ~ Subject + Item, data = df)
           Item
Subject     Item 1 Item 2 Item 3 Item 4
  Subject 1      1      0      1      1
  Subject 2      1      1      0      0

HTH,
Dennis
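
A caveat worth noting with the xtabs() approach: it sums Score within each cell, so a Subject/Item pair that appears twice yields 2 rather than 1. If a 0/1 indicator matrix is what the association analysis needs, a rough base R alternative is tapply() with max. The data below are invented for illustration (one duplicated pair, one missing pair), and filling absent pairs with 0 is an assumption about what a missing row means:

```r
# Hypothetical data: "Subject 2" / "Item 1" is duplicated,
# and "Subject 2" / "Item 3" is absent.
dat <- data.frame(
  Subject = c("Subject 1", "Subject 1", "Subject 1",
              "Subject 2", "Subject 2", "Subject 2"),
  Item    = c("Item 1", "Item 2", "Item 3",
              "Item 1", "Item 1", "Item 2"),
  Score   = c(1, 0, 1, 1, 1, 1)
)

# xtabs() adds up Score per cell, so the duplicated pair gives 2.
xt <- xtabs(Score ~ Subject + Item, data = dat)

# tapply() with max keeps the result in {0, 1}; empty cells come back as NA.
wide <- tapply(dat$Score, list(Subject = dat$Subject, Item = dat$Item), max)

# Assumption: a pair with no row means "not present", so code it as 0.
wide[is.na(wide)] <- 0

print(xt)
print(wide)
```

Both calls return a dense table, so for very wide data the memory concern raised earlier in the thread may still apply.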

