similar to: Grouping variables in a data frame

2016 Apr 14
Bug in by() function which works for some FUN argument and does not work for others
Dear Sirs, I am a Professor at Indira Gandhi Krishi Vishwavidyalaya, Raipur, Chhattisgarh, India. While taking classes, I found that the by() function produces the following error when I use FUN=mean or median and some other functions; however, FUN=summary works. Given below is the output of the example I used on the built-in dataset "mtcars", along with the error message: >
2016 Apr 14
Bug in by() function which works for some FUN argument and does not work for others
I think you are not using the best function for what you intend. Try:
> by(data=mtcars, INDICES=list(as.factor(mtcars$am)), FUN=colMeans)
: 0
        mpg         cyl        disp          hp        drat          wt        qsec          vs
 17.1473684   6.9473684 290.3789474 160.2631579   3.2863158   3.7688947  18.1831579   0.3684211
         am        gear        carb
  0.0000000
2016 Apr 15
Bug in by() function which works for some FUN argument and does not work for others
Dear All, Thanks for your help. However, I would like to draw your attention to the following: Actually, I was replicating Example 2.3, using the dataset "brainsize.txt" given in Section 2.3.3 ("Summarize by group") at page 55, of a famous book "R by Example" written by Jim Albert and Maria Rizzo, published by Springer (2012) in the Use R! series. The
2016 Apr 16
Bug in by() function which works for some FUN argument and does not work for others
Dear All, I have got your core message, that it is my responsibility to determine whether any particular function in my version of R satisfies the language requirements at the time of your use. Jim Albert and Maria Rizzo must have used their code, which was permitted in the R-code of their time (2012). Therefore, I have now modified my R-code, as per R-3.2.4 version, according to my requirement
2016 Apr 15
Bug in by() function which works for some FUN argument and does not work for others
> On Apr 15, 2016, at 1:16 AM, Akhilesh Singh <akhileshsingh.igkv at> wrote: > > Dear All, > > Thanks for your help. However, I would like to draw your attention to the > following: > > Actually, I was replicating the Example 2.3, using the dataset > "brainsize.txt" given in Section 2.3.3 ("Summarize by group") at page 55, > of a
2016 Apr 17
Bug in by() function which works for some FUN argument and does not work for others
> On Apr 16, 2016, at 2:03 AM, Akhilesh Singh <akhileshsingh.igkv at> wrote: > > Dear All, > > I have got your core message, that it is my responsibility to determine whether any particular function in my version of R satisfies the language requirements at the time of your use. Jim Albert and Maria Rizzo must have used their code, which was permitted in the R-code
2011 Aug 23
GLM question
Hi All, I am trying to fit my data with a glm model. My data is a matrix of size n*100, so I have n rows and 100 columns, and my vector y is of size n and contains the labels (0 or 1). My question is: instead of manually typing the model as = glm(y~ x[,1]+x[,2]+...+x[,100], family=binomial()) I have a for loop that concatenates the x variables as follows: final_str=NULL for
2013 Aug 30
Memory usage bar plot
Hi, I haven't tried the code yet. Is there a way to parse this data using R and create bar plots so that each program's 'RAM used' figures are grouped together? So the 'uuidd' bars will be together. The data will have about 50 sets, so if there are 100 processes each will have about 50 bars. What is the recommended way to graph these big barplots? I am looking
2011 Aug 26
How to find the accuracy of the predicted glm model with family = binomial (link = logit)
Hi All, When modeling with glm and family = binomial (link = logit) and response values of 0 and 1, I get the predicted probabilities of assigning to my class one, then I would like to compare it with my vector y which does have the original labels. How should I change the probabilities into values of zero and 1 and then compare it with my vector y to find out about the accuracy of my
2012 Jan 06
How to fit my data with a distribution?
Dear All, I have a bunch of data points as follows: x  100 y  200 z  300 ... where 100, 200, 300 are the values. I would like to know the distribution of my data. How can I fit my data to a distribution? Thanks a lot, Andra
2010 Nov 30
pca analysis: extract rotated scores?
Dear all, I'm unable to find an example of extracting the rotated scores of a principal components analysis. I can do this easily for the un-rotated version.
data(mtcars)
.PC <- princomp(~am+carb+cyl+disp+drat+gear+hp+mpg, cor=TRUE, data=mtcars)
unclass(loadings(.PC)) # component loadings
summary(.PC) # proportions of variance
mtcars$PC1 <- .PC$scores[,1] # extract un-rotated scores of
2010 May 27
How to combine data columns to a single column
Dear users, I have several columns of data (each column containing monthly data for a particular year, from January to December). I wish to combine the columns to get a single column of continuous data, as shown in (b) below. I have read this data as a table in R.
(a) Data example
Year    1903   1904   1905   1906
Jan    125.0   30.0  113.0    5.0
Feb    128.0  100.0   70.0  388.0
2011 Aug 20
a Question regarding glm for linear regression
Hello All, I have a question about glm in R. I would like to fit a model with the glm function. I have a vector y (size n), which is my response variable, and a matrix X of size (n*f), where f is the number of features or columns. I have about 80 features, and when I fit a model using the following formula, glmfit = glm(y ~ x1 + x2 + x3 + x4 + x5 + x6 + x7 + x8 + x9 + x10 + x11 + x12 + x13
2010 Sep 20
invalid 'row.names' length error when running scatterplots or plot in R Commander
Hello, I teach statistics and use R Commander for teaching. I have 2 students out of 169 who can't get scatterplots or plot to work. I have had them update packages, restart R/R Commander/their computers and even reinstall R/R Commander. One is using Windows 7 on a new PC and the other is a PC user (not sure of the OS). They are both using R 2.11.1 and R Commander 1.6-0. The data look like
2011 Sep 01
Question about BIC of two different regression models? how should we compare two regression models?
Hi All, In order to compare two different logistic regressions, I think I need to compare them based on their BIC values, but I am not sure whether a smaller BIC means a better model or the reverse is true. Thanks a lot, Andra
2011 Feb 03
interpret significance from the contr.poly() function
Hello R-help, I don’t know how to interpret significance from the contr.poly() function. From the example below: how can I tell if data has a significant linear/quadratic/cubic trend?
> contr.poly(4, c(1,2,4,8))
              .L         .Q          .C
[1,] -0.51287764  0.5296271 -0.45436947
[2,] -0.32637668 -0.1059254  0.79514657
[3,]  0.04662524 -0.7679594 -0.39757328
[4,]  0.79262909
2005 Apr 02
An exercise in the use of 'substitute'
I would like to create a method for the generic function "with" applied to a class of fitted models. The method should do two things:
1. Substitute the name of the first argument for '.' throughout the expression
2. Evaluate the modified expression using the data argument to the fitted model as the first element of the search list.
The second part is relatively easy. The
2002 Jun 26
Bug? (PR#1710)
Hi, I tried to fit a multiple linear model to the example dataset Formaldehyde. However, the function lm() did not estimate the coefficient of the term carb^2. The same problem occurred with the (nlme) dataset Pixel with both functions lme() and lm(). I am using the Windows version of R 1.5.1. Lauri Mehtatalo The Formaldehyde example: > data(Formaldehyde) >
2002 Dec 13
how to get Residual Standard Error
Hi, I use lm or loess for smoothing. After smoothing I need the "Residual Standard Error" in my script. Could you please tell me how I can get this information? Thanks,
2013 Jun 25
F statistic in add1.lm vs add1.glm
Should the F statistic be the same when using add1() on models created by lm and glm(family=gaussian)? They are in the single-degree-of-freedom case but not in the multiple-degree-of-freedom case. MASS::addterm shows the same discrepancy. It looks like the deviance (==residual sum of squares) gets divided by the number of degrees of freedom for the term twice in add1.glm. Using anova() on the