similar to: Comparison of aggregate in R and group by in mysql

Displaying 20 results from an estimated 3000 matches similar to: "Comparison of aggregate in R and group by in mysql"

2008 Jan 26
3
An R clause to bind dataframes under certain conditions
Hi netters, Suppose I have two data frames X and Y. X has three colnames A, B and C. Y has three colnames A, B and D. I want to combine them into one data frame, joining the rows having the same A and B values (X$A==Y$A and X$B==Y$B). So the resulting dataframe has four variables/columns: A, B, C and D. I was wondering what's the best way to do it in R. Could anyone give me some advice? Thanks!
2008 Mar 07
4
locate the rows in a dataframe with some criteria
Hi, netters, This is probably a rookie question but I couldn't find the answer after hours of searching and trying. Suppose there's a dataframe M: x y 10 A 13 B 8 A 11 A I want to locate the rows where x>=10 and y=="A". I know how to do it with vectors by using which, but how to do it with the dataframe? Thank you very much! Zhihua Li
2008 Apr 17
1
how to use a function in aggregate which accepts matrix and outputs matrix?
Dear netters, suppose I have a matrix X: [1,] 'c1' 'r6' '150' [2,] 'c1' 'r4' '70' [3,] 'c1' 'r2' '20' [4,] 'c1' 'r5' '90' [5,] 'c2' 'r2' '20' [6,] 'c3' 'r1' '10' I want to apply some function to groups of rows by the first column. If the function is just to
2008 Sep 22
4
sort a data matrix by all the values and keep the names
Dear all, If I have a data frame x<-data.frame(x1=c(1,7),x2=c(4,6),x3=c(8,2)): x1 x2 x3 1 4 8 7 6 2 I want to sort the whole data and get this: x1 1 x3 2 x2 4 x2 6 x1 7 x3 8 If I do sort(x), R reports: Error in order(list(x1 = c(1, 7), x2 = c(4, 6), x3 = c(8, 2)), decreasing = FALSE) : unimplemented type 'list' in 'orderVector1' The only way
2007 Jul 19
3
Error: evaluation nested too deeply when doing heatmap with binary distfunction
Hi netters, I have a matrix X of size (1000,100). The values are from -3 to +3. When I tried heatmap(X, distfun=function(c) dist(c,method="bin"), hclustfun=function(m) hclust(m,method="average")) I got the error message: Error: evaluation nested too deeply: infinite recursion / options(expressions=)? However, if I used default parameters for distfunction:
2005 Jun 23
2
quotient and remainder
hi netters Is there a function in R that can compute the quotient and remainder of a division calculation? such that when 11 is given as the dividend and 5 the divisor, the function returns 2 (quotient) and 1 (remainder). Thanks a lot!
2005 Dec 03
2
how to subset rows using regular expression patterns
hi netters, i have a dataframe A with several columns (variables). the elements of column M are character strings. so A$M=c("ab","abc","bcd","ac","abcd","fg",....."fl"). i wanna extract all the rows where A$M matches some regular expression pattern. for a simple example, let the pattern be just "ab", i wanna subset
2005 Dec 08
2
how to change a dataframe with characters to a numeric matrix?
hi netters, i have a dataframe TEST like this: Y1 Y2 Y3 X1 4 7 8 X2 6 2 Z X3 8 0 1 i would like to change it to a numeric matrix, replacing "Z" with NA Y1 Y2 Y3 X1 4 7 8 X2 6 2 NA X3 8 0 1 i've tried the function data.matrix but it didn't work. is there any easy way to do this? thanks a lot!
2007 Jul 02
2
working with R graphics remotely
Hi netters, Now I'm connecting from my local windows machine to a remote linux machine and launching R there using SSH. When I tried to create graphics, like using plot or heatmap, I cannot see the output. Maybe a new R window displaying the graphics has popped up on the remote machine? Or I need to change some settings for the graphics to display? I don't know. I googled it and
2005 Jun 21
2
how to count "associated" factors?
hi netters Suppose I have a factor X, with 10 elements and 3 levels: A B B C A C B A C C . It is easy to count the number of elements for each level: tapply(X,X,length). Now I have another factor Y, which formed a matrix with X: X| A B B C A C B A C C Y| B B C C C A A A B B I wanna count the number of elements for each of these conditions: when X=A and Y=A; when X=A and Y=B; when X=A and
2005 May 13
1
manipulating dataframe according to the values of some columns
hi netters, I'm a newbie to R and there are some very simple problems that have puzzled me for two days. I've a dataframe here with several columns of different modes. Two of the columns are special for me: column 1 has the mode "factor" and column 2 has the mode "numeric vectors". The values for column 1 are either "T" or "F". I wanna do two things:
2007 Jul 18
2
memory error with 64-bit R in linux
Hi netters, I'm using 64-bit R-2.5.0 on an x86-64 CPU with 2 GB of RAM. The operating system is SUSE 10. The system information is: -uname -a Linux someone 2.6.13-15.15-smp #1 SMP Mon Feb 26 14:11:33 UTC 2007 x86_64 x86_64 x86_64 GNU/Linux I used heatmap to process a matrix of the dim [16000,100]. After 3 hours of desperate waiting, R told me: cannot allocate vector of size
2007 Jun 05
1
rJava installation under linux: configuration failed
Hi netters, Recently I was trying to install rJava. The operating system is SUSE 10.0, and the R version is 2.5.0. Following the instructions of the R Wiki for rJava, I did the configuration first: R CMD javareconf and then it showed a series of messages, from which it seems that java is in the system and the configuration succeeded. Then I tried to install rJava:
2008 Sep 24
2
keep the row indexes/names when do aggregate
Hi, R-users, If I have a data frame like this: >x<-data.frame(g=c("g1","g2","g1","g1","g2"),v=c(1,7,3,2,8)) g v 1 g1 1 2 g2 7 3 g1 3 4 g1 2 5 g2 8 It contains two groups, g1 and g2. Now for each group I want the max v: > aggregate(x$v,list(g=x$g),max) g x 1 g1 3 2 g2 8 Beautiful. But what if I want to keep the row index of (g1
2005 May 30
3
how to "singlify" entries
hi netters I have a rather simple question. I have a data frame with two variables X and Y, both of which are factors. X has 100 levels while Y has 10 levels only. The data frame has 100 rows in all, so for X the values are unique, and Y has many replicate values. Now I wanna reduce the data frame into 10 rows only, according to the 10 levels of Y. I don't care which value of X is in
2005 Dec 12
2
store and retrieve object names in a vector
hi netters, suppose i have a series of objects X1, X2, B1,C1........... they all have the same dimensions. i want to combine into one by using cbind: y<-cbind(X1,X2,B1,C1.....) but i don't want to type the names of these objects one by one. instead, i've put their names into a vector: x<-c("X1","X2","B1","C1",....) i used y<-cbind(x).
2005 Jul 12
2
how to generate argument from a vector automatically
hi netters i have a vector NAMES containing a series of variable names: NAMES=c(x,r,z,m,st,qr,.....nn). i wanna fit a regression tree by using the code: my.tree<-tree(y~x+r+z+m+....nn,my.dataframe) but i don't want to type out "x+r+z+m+....+nn" one by one, as there are so many variables. besides, sometimes i wanna put the code in a function. so i need to have the
2005 Aug 26
2
learning decision trees with one's own scoring functions
Hi netters, I want to learn a decision tree from a series of instances (learning data). The packages tree or rpart can do this quite well, but the scoring functions (splitting criteria) are fixed in these packages, like gini or something. However, I'm going to use another scoring function. At first I wanna modify the R code of tree or rpart and put my own scoring function in. But it
2009 Dec 03
2
Formatting of numbers on y axis
Hello all. I have the following: plot(salaries$yearID, salaries$salary, type='n', xaxt='n', xlab='', yaxt='n', ylab='') axis(1, at=unique(salaries$yearID), labels=unique(salaries$yearID), lwd=.25, tck=-0.05) axis(2, axTicks(2), format(axTicks(2), scientific = F)) Which nicely creates the Y axis with the raw numbers, which are in the range of .5 - 7
2012 Jun 01
3
Add rank column to data frame as in SQL...
Hopefully this is an easy problem... I'm trying to add a partitioned rank column to a data frame where the rank is calculated separately across a partition by categories, the way you could easily do in SQL. I found this solution in the archives that looked like it might work: http://tolstoy.newcastle.edu.au/R/e11/help/10/09/8675.html The example has a data frame with several car companies,