thr3ads.net - similar to: "dataframe indexing by number of cases per group"

Displaying 20 results from an estimated 90 matches similar to: "dataframe indexing by number of cases per group"

2005 Dec 05

apply() and dropped dimensions

Hi, I am having difficulty with apply(). I want apply() to return a matrix, but sometimes a vector is returned. A toy example follows. Function jj() takes a couple of matrices m1 and m2 as arguments and returns a matrix with r rows and c columns, where r=nrow(m2) and c=nrow(m1). jj <- function(m1,m2,f,...){ apply(m1, 1, function(y) { apply(m2, 1, function(x) { f(x, y, ...)

2011 Nov 22

Removing rows in dataframe w'o duplicated values

Hi, is there an easy way to remove dataframe rows without duplicated values of a specified column ('id')? e.g.,
dat <- data.frame(id = c(1,1,1,2,3,3), value = c(5,6,7,4,5,4), value2 = c(1,4,3,3,4,3))
dat
  id value value2
1  1     5      1
2  1     6      4
3  1     7      3
4  2     4      3
5  3     5      4
6  3     4      3
This is sample data and the real data has hundreds of

2011 Dec 06

Argument validation within functions

Hi, I just started writing functions in R, and some questions popped up. I provide some values as arguments to my function, such as: function(a,b,c){} Now I want the function to first check whether the arguments are valid. E.g., argument "a" has to be a number in the range 0-1. How can that easily be done? So far I have: a <- as.numeric(a) if(0 <= a &&

2011 Aug 15

Extracting information from lm results (multiple model runs)

Just to inform: I posted that before in R-sig-ecology but as it might be interesting also for other useRs, I post it also to the general r-user list: Hello Alexandre, thank you very much. I also found another way to extract summarizing information from lm results over e.g. 1000 repeated model runs: results2 <- t(as.data.frame(results)) summary(results2) Although some questions popped up in

2012 Feb 03

Assigning objects to variable and variable to list in a for loop

Hello, I want to use a for loop to repeatedly calculate a maxent model (package dismo, function maxent()), which creates an object of the class maxent (S4). I want to collect all the resulting objects in a list. I tried to simplify my for loop to explain what I want. There are two problems/questions: 1) How can I create the new variables in the loop (using paste) and assign the objects 2) How

2012 May 31

Remove columns from dataframe based on their statistics

Hi, I have a dataframe and want to remove the columns that contain the same value throughout (i.e., the variance of the column is 0). Is there an easier way than to calculate the statistics and then remove them by hand? A <- runif(100) B <- rep(1,100) C <- rep(2.42,100) D <- runif(100) df <- data.frame(A,B,C,D) # if want to conditionally remove column B and

2011 Aug 15

MCMC regress, using runif()

Hello, just to follow up on a question from last week. Here is what I've done so far (an example): library(MCMCpack) Y=c(15,14,23,18,19,9,19,13) X1=c(0.2,0.6,0.45,0.27,0.6,0.14,0.1,0.52) X2a=c(17,22,21,18,19,25,8,19) X2b=c(22,22,29,34,19,26,17,22) X2 <- function()runif(length(X2a), X2a, X2b) model1 <- MCMCregress(Y~X1+X2()) summary(model1) but I am not sure if my X2-function is

2012 May 11

text(): combine expression and line break

Hi, I would like to add some extra text to my plot. This should be a two-line text including a special character (sigma). So far I have tried to use expression in combination with paste and "\n"... but I can't get the line break... Here is what I've done so far: plot(1,type="n", xaxt='n', yaxt='n', ann=FALSE) text(1,1,labels=expression(paste(sigma,"\n

2011 Nov 03

variable transformation for lm

Hello, I am doing a simple regression using lm(Y~X). My response and my predictor seem to be skewed, and I can't meet the model assumptions, so I need to transform my variables. I wanted to ask what the preferred way is to find out whether the predictor and/or the response needs to be transformed and, if so, how (log-transform?). I found a procedure in "A modern approach to Regression in

2012 Feb 14

Wildcard for indexing?

Hi, I'd like to know if it is possible to use wildcards * for indexing... E.g. I have a vector of strings. Now I'd like to select all elements which start with A_*? I'd also need to combine that with logical operators: "Select all elements of a vector that start with A (A*) OR that start with B (B*)" Probably that is quite easy. I looked into grep() which I think might

2008 Nov 20

More list to vector puzzle

Many thanks for the answers to my previous question; it got me started. Indeed, stack() was the function I was vaguely remembering. However, I didn't get very far because my data set is way more complicated than I expected. In fact I have a mixture of levels and lists within a list. Basically, it resembles the following list (named data) made of the levels H and the list of lists A and T. for each

2012 Jun 08

Sort 1-column dataframe with rownames

Hi, I have a 1-column dataframe with rownames and I want to sort it based on the single column. The typical procedure recommended in various posts is to use order in the index. But that "destroys" my dataframe structure. Probably there is a very simple solution. Here is a short reproducible example: x <- c(1,3,51,2,34,44,12,33,2,8) df <- data.frame(x) rownames(df) <-

2012 Jul 25

reshape -> reshape 2: function cast changed?

Hi, I used to use reshape and moved to reshape2 (R 2.15.1). Now I tried some of my older scripts and was surprised that my cast function wasn't working like before. What I did/want to do: 1) Melt a dataframe based on a vector specifying column names as measure.vars. That's working so far: dfm <- melt(df, measure.vars=n, variable_name = "species", na.rm = FALSE) 2) Recast the

2001 Nov 14

lme: how to extract the variance components?

Dear all, Here is the question: For example, using the "petrol" data offered with R. pet3.lme<-lme(Y~SG+VP+V10+EP,random=~1|No,data=petrol) pet3.lme$sigma gives the residual StdDev. But I can't figure out how to extract the "(intercept) StdDev", although it is in the print out if I do "summary(pet3.lme)". In

2011 Nov 01

Combine variables of different length

Hi, I have got a dataset with the variables Y,X1,X2,X3. Some of these variables contain NAs. Therefore incomplete cases are dropped when I run a regression like: model <- lm(Y~X1+X2+X3) so the resulting vector of resid(model) is obviously shorter than the original variables. How can I combine the residuals vector with the original dataset (Y,Xi,...). I recognize that the

2012 Mar 08

Save/Load function()-result to file in a loop

Hi, I am looking for a way to save the result of a function, e.g. the lm()-function, to a file and reload it again afterwards. I'd like to do that in order to minimize the memory used when running the function in a loop. The actual function I want to store is evaluate() from the dismo package. I tried it with save() and load() but I am not sure if that is the way I should do it as I

2012 May 04

Generate strings from multiple variables

Hi, it is easiest to explain what I want to do with an example: let's assume there are two factors/variables: A <- c(1,2,3) B <- c(1,3,3) Now I would like to generate a list of strings that should look like ("A1_B1","A1_B2","A2_B1","A2_B2"). So the string actually contains all possible combinations of A and B (separated by _). This should be also

2012 May 05

Correct use of ddply with own function

Hi, I am really confused about how ddply works, so maybe you can help me. I created a function that sorts a vector etc. fn <- function(x){ x1 <- sort(x) x2 <- seq(length(x)) x3 <- x2/max(x2) df <- data.frame(x1,x2,x3) df } Probably this is not the best form of the function, but at least it produces what I want (data to plot a cumulative count curve). This function works on a

2012 Jan 02

summary per group

Hello, I know that it'll be quite easy to do what I want, but somehow I am lost as I am new to R. I want to get summary results arranged by groups. In detail, I'd like to get the number (levels) of Species per Family, like for this dataset: SPEC <-

2012 Feb 08

Split dataframe into new dataframes

Hi, I want to split a dataframe based on a grouping variable (in one column). The resulting new dataframes should be stored in a new variable. I tried to split the dataframe using split() and to store it using a for loop, but that's not working so far: df <- data.frame(A=c("A1","A1","A2","A2"),B=seq(1:4)) Fsplit <- function(x,y){ ls <-
