similar to: summary per group

Displaying 20 results from an estimated 10000 matches similar to: "summary per group"

2011 Nov 24
2
dataframe indexing by number of cases per group
Hello, assume we have the following dataframe: group <- c(rep("A",5),rep("B",6),rep("C",4)) x <- c(runif(5,1,5),runif(6,1,10),runif(4,2,15)) df <- data.frame(group,x) Now I want to select all cases (rows) for those groups which have 5 or more cases (so I want to select all cases of groups A and B). How can I use indexing for such questions? df[??]...
2011 Dec 06
5
Argument validation within functions
Hi, I just started writing functions in R and some questions popped up. I provide some values as arguments to my function, such as: function(a,b,c){} Now I want the function to first check whether the arguments are valid. E.g. argument "a" has to be a number in the range 0-1. How can that easily be done? So far I have: a <- as.numeric(a) if(0 <= a &&
2012 Feb 03
2
Assigning objects to variable and variable to list in a for loop
Hello, I want to use a for loop for repeatedly calculating a maxent model (package dismo, function maxent()) which creates an object of the class maxent (S4). I want to collect all the resulting objects in a list. I tried to simplify my for loop to explain what I want. There are two problems/questions: 1) How can I create the new variables in the loop (using paste) and assign the objects 2) How
2011 Aug 15
2
Extracting information from lm results (multiple model runs)
Just to inform: I posted that before in R-sig-ecology but as it might be interesting also for other useRs, I post it also to the general r-user list: Hello Alexandre, thank you very much. I also found another way to extract summarizing information from lm results over e.g. 1000 repeated model runs: results2 <- t(as.data.frame(results)) summary(results2) Although some questions popped up in
2011 Nov 17
1
How to resample one per group
Hello, I have got a dataframe which looks like: y <- c(1,5,6,2,5,10) # response x <- c(2,12,8,1,16,17) # predictor group <- factor(c(1,2,2,3,4,4)) # group df <- data.frame(y,x,group) Now I'd like to resample that dataset. I want to get one dataset (row) per group. So per total sample I get 4 rows into a new data frame. How can I do that? Is there any simple approach using an
2012 May 31
3
Remove columns from dataframe based on their statistics
Hi, I have a dataframe and want to remove columns from it that are populated with a similar value (for the total column) (the variation of that column is 0). Is there an easier way than to calculate the statistics and then remove them by hand? A <- runif(100) B <- rep(1,100) C <- rep(2.42,100) D <- runif(100) df <- data.frame(A,B,C,D) # if want to conditionally remove column B and
2011 Mar 19
2
persuade tabulate function to count NAs in a data frame
Hi, I'd like to ask you a question again. It is basically about data frames, NAs and tabulate function. I have this data frame. I already used this in one of the previous questions of mine. It intentionally looks this simple, my real 'df' dataframe is much bigger actually and again, I am not willing to annoy anyone with huge databases... So, my database: id
2011 Aug 15
2
MCMC regress, using runif()
Hello, just to follow up on a question from last week. Here is what I've done so far (an example): library(MCMCpack) Y=c(15,14,23,18,19,9,19,13) X1=c(0.2,0.6,0.45,0.27,0.6,0.14,0.1,0.52) X2a=c(17,22,21,18,19,25,8,19) X2b=c(22,22,29,34,19,26,17,22) X2 <- function()runif(length(X2a), X2a, X2b) model1 <- MCMCregress(Y~X1+X2()) summary(model1) but I am not sure if my X2-function is
2008 Apr 28
1
error in summary.Design
Dear list, after fitting an lrm with the Design package (stored as "mymodel") I try running a summary, but I get the following error: dim(mydata) [1] 235 9 names(mydata) [1] "id" "VAR1" "VAR2" "VAR3" "VAR4" "VAR5" "VAR6" "VAR7" "VAR8" summary(mymodel) Error in `contrasts<-`(`*tmp*`,
2012 Jun 08
4
Sort 1-column dataframe with rownames
Hi, I have a 1-column dataframe with rownames and I want to sort it based on the single column. The typical procedure that is recommended in diverse posts is to use order in the index. But that "destroys" my dataframe structure. Probably there is a very simple solution. Here is a short reproducible example: x <- c(1,3,51,2,34,44,12,33,2,8) df <- data.frame(x) rownames(df) <-
2012 Feb 14
3
Wildcard for indexing?
Hi, I'd like to know if it is possible to use wildcards * for indexing... E.g. I have a vector of strings. Now I'd like to select all elements which start with A_*? I'd also need to combine that with logical operators: "Select all elements of a vector that start with A (A*) OR that start with B (B*)" Probably that is quite easy. I looked into grep() which I think might
2012 May 11
2
text(): combine expression and line break
Hi, I would like to plot some extra text in my plot. This should be a two-line text including a special character (sigma). So far I have tried to use expression in combination with paste and "\n"... but I can't get the line break... Here is what I've done so far: plot(1,type="n", xaxt='n', yaxt='n', ann=FALSE) text(1,1,labels=expression(paste(sigma,"\n
2011 Nov 03
2
variable transformation for lm
Hello, I am doing a simple regression using lm(Y~X). My response and my predictor seem to be skewed and I can't meet the model assumptions. Therefore I need to transform my variables. I wanted to ask what is the preferred way to find out if predictor and/or response needs to be transformed and if yes how (log-transform?). I found a procedure in "A modern approach to Regression in
2008 Nov 20
5
summary statistics into table/data base, many factors to analyse
Dear list, I reduced my data to the following: x <- c(1,4,2,6,8,3,4,2,4,5,1,3) y <- as.factor(c(2,2,1,1,1,2,2,1,1,2,1,2)) z <- as.factor(c(1,2,2,1,1,2,2,3,3,3,3,3)) I can produce the statistical summary just fine. s1 <- tapply(x, y, summary) d1 <- tapply(x, y, sd) s2 <- tapply(x, z, summary) d2 <- tapply(x, z, sd) First thing: I have 100 plus factors to analyse. Theirs
2007 Aug 01
1
Problem to remove loops in a routine
Dear R-users, I have written the following code to generate some trellis plots. It works perfectly fine except that it is quite slow when it is applied to my typical datasets (over several thousands of lines). I believe the problem comes from the loops I am using to subset my data.frame. I read in the archives that the tapply function is often more efficient than a loop in R. Unfortunately,
2012 Jul 25
2
reshape -> reshape 2: function cast changed?
Hi, I used to use reshape and moved to reshape2 (R 2.15.1). Now I tried some of my older scripts and was surprised that my cast function wasn't working like before. What I did/want to do: 1) Melt a dataframe based on a vector specifying column names as measure.vars. That's working so far: dfm <- melt(df, measure.vars=n, variable_name = "species", na.rm = FALSE) 2) Recast the
2009 Jul 29
4
- counting factor occurrences within a group: tapply()
Dear List, I'm an [R] novice starting analysis of an ecological dataset containing the basal areas of different tree species in a number of research plots. Example data follow: > Trees<-data.frame(SppID=as.factor(c(rep('QUEELL',2), rep('QUEALB',3), 'CORAME', 'ACENEG', 'TILAME')), BA=c(907.9, 1104.4, 113.0, 143.1, 452.3, 638.7, 791.7, 804.3),
2008 Apr 15
1
by inconsistently strips class - with fix
summary: The function 'by' inconsistently strips class from the data to which it is applied. quick reason: tapply strips class when simplify is set to TRUE (the default) due to the class stripping behaviour of unlist. quick answer: This can be fixed by invoking tapply with simplify=FALSE, or changing tapply to use do.call(c instead of unlist executable example:
2012 Feb 08
2
Split dataframe into new dataframes
Hi, I want to split a dataframe based on a grouping variable (in one column). The resulting new dataframes should be stored in a new variable. I tried to split the dataframe using split() and to store it using a for loop, but that's not working so far: df <- data.frame(A=c("A1","A1","A2","A2"),B=seq(1:4)) Fsplit <- function(x,y){ ls <-
2012 Jan 13
2
multidimensional array calculation
Hello, probably it is quite easy but I can't get it: I have multiple numeric vectors and a function using all of them to calculate a new value: L <- c(200,400,600) AR <- c(1.5) SO <- c(1,3,5) T <- c(30,365) fun <- function(L,AR,SO,T){ exp(L*AR+sqrt(SO)*log(T)) } How can I get an array or dataframe where all possible combinations of the factors are listed and the new value is