thr3ads.net - similar to: "summary per group"

Displaying 20 results from an estimated 10000 matches similar to: "summary per group"

dataframe indexing by number of cases per group

2011 Nov 24

Hello, assume we have the following dataframe:

    group <- c(rep("A",5),rep("B",6),rep("C",4))
    x <- c(runif(5,1,5),runif(6,1,10),runif(4,2,15))
    df <- data.frame(group,x)

Now I want to select all cases (rows) for those groups which have 5 or more cases (so I want to select all cases of groups A and B). How can I use indexing for such questions? df[??]...
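
One possible way to index by group size, as a minimal base-R sketch (not taken from the thread): compute the size of each row's group with ave() and use it as a logical filter. The seed and the names n_per_group and df_sub are assumptions added for illustration.

```r
set.seed(1)  # arbitrary seed, only to make the example reproducible
group <- c(rep("A", 5), rep("B", 6), rep("C", 4))
x <- c(runif(5, 1, 5), runif(6, 1, 10), runif(4, 2, 15))
df <- data.frame(group, x)

# ave() returns, for every row, the size of the group that row belongs to
n_per_group <- ave(seq_along(df$group), df$group, FUN = length)

# keep only the rows whose group has at least 5 cases (groups A and B here)
df_sub <- df[n_per_group >= 5, ]
print(table(df_sub$group))
```

An equivalent filter could probably be built from table(df$group) and %in%; ave() just avoids the intermediate lookup.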

Argument validation within functions

2011 Dec 06

Hi, I just started writing functions in R and some questions popped up. I pass some values as arguments to my function, such as: function(a,b,c){} Now I want the function to first check whether the arguments are valid. E.g. argument "a" has to be a number in the range 0-1. How can that be done easily? So far I have: a <- as.numeric(a) if(0 <= a &&
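
A minimal sketch of one way to validate such an argument in base R, assuming "a" must be a single non-missing number in [0, 1]. The helper name check_unit_interval and the placeholder function body are hypothetical, not from the thread.

```r
# check_unit_interval is a hypothetical helper name used for illustration
check_unit_interval <- function(a) {
  # stopifnot() checks its conditions in order and stops at the first failure
  stopifnot(is.numeric(a), length(a) == 1, !is.na(a), a >= 0, a <= 1)
  invisible(a)
}

f <- function(a, b, c) {
  check_unit_interval(a)  # validate before doing any work
  a                       # placeholder body; the real computation goes here
}

f(0.5, 1, 2)                           # passes validation
res <- try(f(2, 1, 2), silent = TRUE)  # fails validation, caught by try()
```

If more informative messages are wanted, the checks could instead be written as explicit if (...) stop("...") statements.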

Assigning objects to variable and variable to list in a for loop

2012 Feb 03

Hello, I want to use a for loop for repeatedly calculating a maxent model (package dismo, function maxent()) which creates an object of the class maxent (S4). I want to collect all the resulting objects in a list. I tried to simplify my for loop to explain what I want. There are two problems/questions: 1) How can I create the new variables in the loop (using paste) and assign the objects 2) How
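
A hedged sketch of the usual pattern: rather than creating new variables with paste() and assign(), pre-allocate a named list and fill it inside the loop. lm() stands in for maxent() here so the example runs without dismo; the data and the names n_runs and models are assumptions.

```r
n_runs <- 3
models <- vector("list", n_runs)                    # pre-allocated list
names(models) <- paste0("model_", seq_len(n_runs))  # names built with paste0()

for (i in seq_len(n_runs)) {
  # toy data for run i; lm() is only a stand-in for the real model call
  d <- data.frame(x = 1:10, y = (1:10) * i + c(0.1, -0.1))
  models[[i]] <- lm(y ~ x, data = d)
}

# each fitted object can then be retrieved by position or by name
class(models[["model_2"]])
```

Storing objects in a list this way should work for S4 objects as well, since a list can hold elements of any class.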

Extracting information from lm results (multiple model runs)

2011 Aug 15

Just to inform: I posted that before in R-sig-ecology but as it might be interesting also for other useRs, I post it also to the general r-user list: Hello Alexandre, thank you very much. I also found another way to extract summarizing information from lm results over e.g. 1000 repeated model runs: results2 <- t(as.data.frame(results)) summary(results2) Although some questions popped up in
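
A small sketch of the kind of workflow the post seems to describe, under assumed simulated data: collect the coefficients from repeated lm() fits in a matrix, transpose so that each run is a row, and summarise the columns. The simulation set-up is hypothetical.

```r
set.seed(1)  # arbitrary seed for reproducibility

# 100 repeated model runs on simulated data; each run returns two coefficients
results <- replicate(100, {
  x <- rnorm(20)
  y <- 2 * x + rnorm(20)
  coef(lm(y ~ x))
})

results2 <- t(results)  # one row per run, one column per coefficient
print(summary(results2))
```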

How to resample one per group

2011 Nov 17

Hello, I have got a dataframe which looks like:

    y <- c(1,5,6,2,5,10) # response
    x <- c(2,12,8,1,16,17) # predictor
    group <- factor(c(1,2,2,3,4,4)) # group
    df <- data.frame(y,x,group)

Now I'd like to resample that dataset. I want to get one dataset (row) per group. So per total sample I get 4 rows into a new data frame. How can I do that? Is there any simple approach using an
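
One way this could be done in base R, as a sketch: split the row indices by group and draw one index from each group. The names rows_by_group, picked and df_sample are assumptions.

```r
set.seed(123)  # arbitrary seed for reproducibility
y <- c(1, 5, 6, 2, 5, 10)        # response
x <- c(2, 12, 8, 1, 16, 17)      # predictor
group <- factor(c(1, 2, 2, 3, 4, 4))
df <- data.frame(y, x, group)

# row numbers belonging to each group
rows_by_group <- split(seq_len(nrow(df)), df$group)

# draw one row number per group; indexing with sample.int() avoids the
# pitfall that sample(n, 1) samples from 1:n when given a single number
picked <- vapply(rows_by_group,
                 function(idx) idx[sample.int(length(idx), 1)],
                 integer(1))

df_sample <- df[unname(picked), ]
print(df_sample)
```

Wrapping the last three statements in a function and calling it repeatedly would give one resample per call.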

Remove columns from dataframe based on their statistics

2012 May 31

Hi, I have a dataframe and want to remove columns from it that contain the same value throughout (the variation of that column is 0). Is there an easier way than to calculate the statistics and then remove them by hand?

    A <- runif(100)
    B <- rep(1,100)
    C <- rep(2.42,100)
    D <- runif(100)
    df <- data.frame(A,B,C,D)
    # if want to conditionally remove column B and
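
A minimal sketch of one approach: flag the columns that hold a single distinct value and drop them by logical indexing. The seed and the names is_constant and df_reduced are assumptions.

```r
set.seed(42)  # arbitrary seed for reproducibility
A <- runif(100)
B <- rep(1, 100)
C <- rep(2.42, 100)
D <- runif(100)
df <- data.frame(A, B, C, D)

# TRUE for every column that contains exactly one distinct value
is_constant <- vapply(df, function(col) length(unique(col)) == 1, logical(1))

# drop = FALSE keeps a data frame even if only one column survives
df_reduced <- df[, !is_constant, drop = FALSE]
names(df_reduced)
```

Counting distinct values rather than testing sd(col) == 0 also works for non-numeric columns; how NAs should be treated is a separate decision that this sketch does not address.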

persuade tabulate function to count NAs in a data frame

2011 Mar 19

Hi, I'd like to ask you a question again. It is basically about data frames, NAs and tabulate function. I have this data frame. I already used this in one of the previous questions of mine. It intentionally looks this simple, my real 'df' dataframe is much bigger actually and again, I am not willing to annoy anyone with huge databases... So, my database: id
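
The data in this post are cut off, so the following is only a sketch on a made-up vector: table() can report NAs as their own category through its useNA argument, whereas tabulate() ignores them. The vector v is hypothetical.

```r
v <- c(1, 2, 2, NA, 3, NA)  # hypothetical data, not from the post

# useNA = "ifany" adds an NA category whenever missing values are present
counts <- table(v, useNA = "ifany")
print(counts)
```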

MCMC regress, using runif()

2011 Aug 15

Hello, just to follow up on a question from last week. Here is what I've done so far (an example):

    library(MCMCpack)
    Y=c(15,14,23,18,19,9,19,13)
    X1=c(0.2,0.6,0.45,0.27,0.6,0.14,0.1,0.52)
    X2a=c(17,22,21,18,19,25,8,19)
    X2b=c(22,22,29,34,19,26,17,22)
    X2 <- function()runif(length(X2a), X2a, X2b)
    model1 <- MCMCregress(Y~X1+X2())
    summary(model1)

but I am not sure if my X2-function is

error in summary.Design

2008 Apr 28

Dear list, after fitting an lrm with the Design package (stored as "mymodel") I try running a summary, but I get the following error:

    dim(mydata)
    [1] 235 9
    names(mydata)
    [1] "id" "VAR1" "VAR2" "VAR3" "VAR4" "VAR5" "VAR6" "VAR7" "VAR8"
    summary(mymodel)
    Error in `contrasts<-`(`*tmp*`,

Sort 1-column dataframe with rownames

2012 Jun 08

Hi, I have a 1-column dataframe with rownames and I want to sort it based on the single column. The typical procedure that is recommended in diverse posts is to use order in the index. But that "destroys" my dataframe structure. Probably there is a very simple solution. Here is a short reproducible example:

    x <- c(1,3,51,2,34,44,12,33,2,8)
    df <- data.frame(x)
    rownames(df) <-
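
A likely cause of the "destroyed" structure is that indexing a 1-column data frame simplifies the result to a plain vector. A minimal sketch with drop = FALSE follows; the row names are cut off in the post, so letters[1:10] is an assumption.

```r
x <- c(1, 3, 51, 2, 34, 44, 12, 33, 2, 8)
df <- data.frame(x)
rownames(df) <- letters[1:10]  # assumed row names; the original is truncated

# drop = FALSE stops R from simplifying the single column to a vector,
# so the result stays a data frame and keeps its row names
df_sorted <- df[order(df$x), , drop = FALSE]
print(df_sorted)
```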

Wildcard for indexing?

2012 Feb 14

Hi, I'd like to know if it is possible to use wildcards * for indexing... E.g. I have a vector of strings. Now I'd like to select all elements which start with A_*? I'd also need to combine that with logical operators: "Select all elements of a vector that start with A (A*) OR that start with B (B*)" Probably that is quite easy. I looked into grep() which I think might
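
A short sketch of how this might look with regular expressions instead of shell-style wildcards: grepl() returns a logical vector, so results can be combined with | and & before indexing. The example vector is made up.

```r
v <- c("A_one", "B_two", "C_three", "A_four", "xA_five")  # made-up strings

# "^A" anchors the match at the start of the string (roughly "A*")
starts_A_or_B <- grepl("^A", v) | grepl("^B", v)
v[starts_A_or_B]

# the same selection written as a single pattern with a character class
v[grepl("^[AB]", v)]
```

In R 3.3.0 and later, startsWith(v, "A") is another option when no regular expression features are needed.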

text(): combine expression and line break

2012 May 11

Hi, I would like to plot some extra text in my plot. This should be a two-line text including a special character (sigma). So far I tried to use expression in combination with paste and "\n"... but I can't get the line break... Here is what I've done so far: plot(1,type="n", xaxt='n', yaxt='n', ann=FALSE) text(1,1,labels=expression(paste(sigma,"\n

variable transformation for lm

2011 Nov 03

Hello, I am doing a simple regression using lm(Y~X). As my response and my predictor seem to be skewed, I can't meet the model assumptions. Therefore I need to transform my variables. I wanted to ask what the preferred way is to find out whether predictor and/or response need to be transformed and, if yes, how (log-transform?). I found a procedure in "A modern approach to Regression in

summary statistics into table/data base, many factors to analyse

2008 Nov 20

Dear list, I reduced my data to the following:

    x <- c(1,4,2,6,8,3,4,2,4,5,1,3)
    y <- as.factor(c(2,2,1,1,1,2,2,1,1,2,1,2))
    z <- as.factor(c(1,2,2,1,1,2,2,3,3,3,3,3))

I can produce the statistical summary just fine.

    s1 <- tapply(x, y, summary)
    d1 <- tapply(x, y, sd)
    s2 <- tapply(x, z, summary)
    d2 <- tapply(x, z, sd)

First thing: I have 100 plus factors to analyse. Theirs
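
With many factors, one option is to keep them as columns of a data frame and loop over the columns, collecting the per-level statistics in one long table. This is only a sketch; the names factors, stats and result are assumptions, and mean and sd stand in for whatever statistics are needed.

```r
x <- c(1, 4, 2, 6, 8, 3, 4, 2, 4, 5, 1, 3)
y <- as.factor(c(2, 2, 1, 1, 1, 2, 2, 1, 1, 2, 1, 2))
z <- as.factor(c(1, 2, 2, 1, 1, 2, 2, 3, 3, 3, 3, 3))

factors <- data.frame(y, z)  # with 100+ factors, all of them would go here

# one small table of statistics per factor
stats <- lapply(factors, function(f) {
  data.frame(level = levels(f),
             mean  = as.vector(tapply(x, f, mean)),
             sd    = as.vector(tapply(x, f, sd)))
})

# stack the per-factor tables, adding a column with the factor's name
result <- do.call(rbind, Map(function(name, tab) cbind(factor = name, tab),
                             names(stats), stats))
print(result)
```

The resulting data frame has one row per factor level and could be written out with write.csv() or to a database.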

Problem to remove loops in a routine

2007 Aug 01

Dear R-users, I have written the following code to generate some trellis plots. It works perfectly fine except that it is quite slow when it is applied to my typical datasets (over several thousands of lines). I believe the problem comes from the loops I am using to subset my data.frame. I read in the archives that the tapply function is often more efficient than a loop in R. Unfortunately,

reshape -> reshape 2: function cast changed?

2012 Jul 25

Hi, I used to use reshape and moved to reshape2 (R 2.15.1). Now I tried some of my older scripts and was surprised that my cast function wasn't working like before. What I did/want to do: 1) Melt a dataframe based on a vector specifying column names as measure.vars. That's working so far: dfm <- melt(df, measure.vars=n, variable_name = "species", na.rm = FALSE) 2) Recast the

- counting factor occurrences within a group: tapply()

2009 Jul 29

Dear List, I'm an [R] novice starting analysis of an ecological dataset containing the basal areas of different tree species in a number of research plots. Example data follow: > Trees<-data.frame(SppID=as.factor(c(rep('QUEELL',2), rep('QUEALB',3), 'CORAME', 'ACENEG', 'TILAME')), BA=c(907.9, 1104.4, 113.0, 143.1, 452.3, 638.7, 791.7, 804.3),

by inconsistently strips class - with fix

2008 Apr 15

summary: The function 'by' inconsistently strips class from the data to which it is applied.
quick reason: tapply strips class when simplify is set to TRUE (the default) due to the class stripping behaviour of unlist.
quick answer: This can be fixed by invoking tapply with simplify=FALSE, or changing tapply to use do.call(c instead of unlist
executable example:
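
The post's own example is cut off, so here is a small stand-in sketch of the simplify=FALSE workaround using Date values (hypothetical data). Whether the default simplified result keeps the class may depend on the R version, so only the non-simplified path is shown.

```r
# hypothetical data: four dates in two groups
dates <- as.Date(c("2008-01-05", "2008-03-01", "2008-02-10", "2008-04-15"))
g <- c("a", "a", "b", "b")

# simplify = FALSE returns a list, so each element keeps its Date class
latest <- tapply(dates, g, max, simplify = FALSE)
class(latest[["a"]])

# combining with do.call(c, ...) instead of unlist() preserves the class
combined <- do.call(c, latest)
class(combined)
```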

Split dataframe into new dataframes

2012 Feb 08

Hi, I want to split a dataframe based on a grouping variable (in one column). The resulting new dataframes should be stored in a new variable. I tried to split the dataframe using split() and to store it using a FOR loop, but that's not working so far:

    df <- data.frame(A=c("A1","A1","A2","A2"),B=seq(1:4))
    Fsplit <- function(x,y){ ls <-
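
A minimal sketch, assuming a named list of data frames is an acceptable result: split() already returns one data frame per group, so no loop or assign() call should be needed. The name parts is an assumption.

```r
df <- data.frame(A = c("A1", "A1", "A2", "A2"), B = 1:4)

# split() returns a named list with one data frame per level of df$A
parts <- split(df, df$A)

names(parts)  # the group labels become the list names
parts$A1      # access a single group by name
```

Keeping the pieces in a list makes it easy to process them all later with lapply().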

multidimensional array calculation

2012 Jan 13

Hello, probably it is quite easy but I can't get it: I have multiple numeric vectors and a function using all of them to calculate a new value:

    L <- c(200,400,600)
    AR <- c(1.5)
    SO <- c(1,3,5)
    T <- c(30,365)
    fun <- function(L,AR,SO,T){ exp(L*AR+sqrt(SO)*log(T)) }

How can I get an array or dataframe where all possible combinations of the factors are listed and the new value is
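
One way to list every combination is expand.grid(), which builds a data frame with one row per combination; the function can then be evaluated row-wise because it is vectorised. A sketch follows; the names grid and value are assumptions.

```r
L  <- c(200, 400, 600)
AR <- 1.5
SO <- c(1, 3, 5)
T  <- c(30, 365)  # note: this masks the T shorthand for TRUE

fun <- function(L, AR, SO, T) {
  exp(L * AR + sqrt(SO) * log(T))
}

# one row per combination: 3 * 1 * 3 * 2 = 18 rows
grid <- expand.grid(L = L, AR = AR, SO = SO, T = T)

# evaluate the function on the columns of the grid
grid$value <- with(grid, fun(L, AR, SO, T))
print(head(grid))
```

With these inputs the exponent exceeds what a double can represent for L = 600, so those rows come out as Inf.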
