thr3ads.net - R help - [R] Beginner question: select cases [Sep 2006]

Peter Wolkerstorfer - CURE

2006-Sep-25 11:51 UTC

[R] Beginner question: select cases

Hello all,

I hope I chose the right list, as my question is a beginner question.

I have a data set with three columns, "London", "Rome" and "Vienna". The
location is indicated by a 1, like this:
London  Rome  Vienna  q1
0       0     1       4
0       1     0       2
1       0     0       3
....

I just want to calculate the mean of the variable q1 for each location.

I tried the following script:

# calculate the mean of all locations
results <- subset(results, subset== 1 )
mean(results$q1)
# calculate the mean of London
results <- subset(results, subset== 1 , select=c(London))
mean(results$q1)
# calculate the mean of Rome
results <- subset(results, subset== 1 , select=c(Rome))
mean(results$q1)
# calculate the mean of Vienna
results <- subset(results, subset== 1 , select=c(Vienna))
mean(results$q1)

As all results are 1.68 and there is definitely a difference between the
three locations, I wonder what's going on.
I get confused because Rcmdr asks me to overwrite things and there is no
"just filter" option.

Any help would be appreciated. Thank you in advance.

Regards
Peter



___CURE - Center for Usability Research & Engineering___
 
Peter Wolkerstorfer
Usability Engineer
Hauffgasse 3-5, 1110 Wien, Austria
 
[Tel]  +43.1.743 54 51.46
[Fax]  +43.1.743 54 51.30
 
[Mail] wolkerstorfer at cure.at
[Web]  http://www.cure.at

Doran, Harold

2006-Sep-25 12:11 UTC

Peter,

There is a much easier way to do this. First, you should consider
organizing your data as follows:

set.seed(1) # for replication only

# Here is a sample dataframe
tmp <- data.frame(city = gl(3, 10, labels = c("London", "Rome", "Vienna")),
                  q1 = rnorm(30))

# Compute the means
with(tmp, tapply(q1,city, mean))
  London       Rome     Vienna 
 0.1322028  0.2488450 -0.1336732 
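
If the data already exist in the three-indicator layout from the question, one possible way to get to a single city column is sketched below. The data values are made up for illustration, and the sketch assumes every row has exactly one 1 among the three city columns.

```r
# Hypothetical data in the layout described in the question
results <- data.frame(
  London = c(0, 0, 1, 1, 0, 0),
  Rome   = c(0, 1, 0, 0, 1, 0),
  Vienna = c(1, 0, 0, 0, 0, 1),
  q1     = c(4, 2, 3, 5, 4, 2)
)

cities <- c("London", "Rome", "Vienna")

# For each row, find which city column holds the 1
# (assumes exactly one 1 per row)
idx <- max.col(as.matrix(results[cities]), ties.method = "first")

# Store the location as a single factor column
results$city <- factor(cities[idx], levels = cities)

# Mean of q1 per city, same call as in the example above
city_means <- with(results, tapply(q1, city, mean))
print(city_means)
```

The original indicator columns are left in place, so nothing is lost by adding the factor.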

I hope this helps

ONKELINX, Thierry

2006-Sep-25 12:11 UTC

Your problem would be a lot easier if you coded the location in one
variable instead of three variables. Then you could calculate the means
with one line of code:

by(results$q1, results$location, mean)

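With your current dataset, you could instead split q1 by each indicator
column in turn. Each call reports the mean of q1 where the indicator is 0
and where it is 1:
by(results$q1, results$London, mean)
by(results$q1, results$Rome, mean)
by(results$q1, results$Vienna, mean)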

see ?by for more information
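
If the three 0/1 columns are left as they are, the per-city means can also be collected in one step by indexing q1 with each indicator in turn. This is only a sketch with made-up data; the column names follow the question, the values do not.

```r
# Hypothetical data in the layout described in the question
results <- data.frame(
  London = c(0, 0, 1, 1, 0, 0),
  Rome   = c(0, 1, 0, 0, 1, 0),
  Vienna = c(1, 0, 0, 0, 0, 1),
  q1     = c(4, 2, 3, 5, 4, 2)
)

cities <- c("London", "Rome", "Vienna")

# For each city, average q1 over the rows where that city's indicator is 1
city_means <- sapply(cities, function(city) {
  mean(results$q1[results[[city]] == 1])
})
print(city_means)
```

Nothing is assigned back to results here, so the original data frame stays complete.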

And take a good look at your code. You take a subset of results and
then assign it to results. This means that you replace the original
results dataframe with a subset of it. When you take the subset for the
next city, you are no longer subsetting the original dataset but the
previous subset!
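
A small sketch of that effect, using made-up data in the layout from the question (the object names shrinking, london and rome are only for illustration):

```r
# Hypothetical data in the layout described in the question
results <- data.frame(
  London = c(0, 0, 1, 1, 0, 0),
  Rome   = c(0, 1, 0, 0, 1, 0),
  Vienna = c(1, 0, 0, 0, 0, 1),
  q1     = c(4, 2, 3, 5, 4, 2)
)

# Overwriting: every step filters what the previous step left behind
shrinking <- results
shrinking <- subset(shrinking, London == 1)  # only the London rows remain
shrinking <- subset(shrinking, Rome == 1)    # no London row is also a Rome row
rows_left <- nrow(shrinking)
rome_mean_wrong <- mean(shrinking$q1)        # mean of an empty vector

# Separate names: every subset is taken from the untouched original
london <- subset(results, London == 1)
rome   <- subset(results, Rome == 1)
london_mean <- mean(london$q1)
rome_mean   <- mean(rome$q1)
```

Giving each subset its own name keeps results intact for the next city.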

Cheers,

Thierry
------------------------------------------------------------------------
ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest
Cel biometrie, methodologie en kwaliteitszorg / Section biometrics, methodology and quality assurance
Gaverstraat 4
9500 Geraardsbergen
Belgium
tel. + 32 54/436 185
Thierry.Onkelinx at inbo.be
www.inbo.be


John Kane

2006-Sep-25 13:20 UTC

I'm new to R as well. However, I don't recognize your
syntax; I have not seen select used this way.

Try
london <- subset(results, London == 1)
mean(london$q1)
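
For what it's worth, in subset() the second argument picks rows and select picks columns, so select = c(London) on its own drops q1 from the result. A sketch with made-up data (the names london_only and london are just for illustration):

```r
# Hypothetical data in the layout described in the question
results <- data.frame(
  London = c(0, 0, 1, 1, 0, 0),
  Rome   = c(0, 1, 0, 0, 1, 0),
  Vienna = c(1, 0, 0, 0, 0, 1),
  q1     = c(4, 2, 3, 5, 4, 2)
)

# Rows where London is 1, but only the London column is kept
london_only <- subset(results, London == 1, select = c(London))
kept_columns <- names(london_only)
q1_missing <- is.null(london_only$q1)  # q1 is no longer in the data frame

# Keep q1 as well (or leave out select entirely to keep all columns)
london <- subset(results, London == 1, select = c(London, q1))
london_mean <- mean(london$q1)
```

Leaving select out is probably the simplest choice when the goal is just the mean of q1.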

justin bem

2006-Sep-25 14:37 UTC

[R] RE : Beginner question: select cases

A text attachment encoded in an unknown character set was scrubbed...
Name: not available
URL:
https://stat.ethz.ch/pipermail/r-help/attachments/20060925/1fc20328/attachment.pl
