similar to: dataframe indexing by number of cases per group

2005 Dec 05
1
apply() and dropped dimensions
Hi I am having difficulty with apply(). I want apply() to return a matrix, but sometimes a vector is returned. Toy example follows. Function jj() takes a couple of matrices m1 and m2 as arguments and returns a matrix with r rows and c columns where r=nrow(m2) and c=nrow(m1). jj <- function(m1,m2,f,...){ apply(m1, 1, function(y) { apply(m2, 1, function(x) { f(x, y, ...)
2011 Nov 22
4
Removing rows in dataframe w/o duplicated values
Hi, Is there an easy way to remove dataframe rows without duplicated values of a specified column ('id')? e.g., dat <- data.frame(id = c(1,1,1,2,3,3), value = c(5,6,7,4,5,4), value2 = c(1,4,3,3,4,3)) dat id value value2 1 1 5 1 2 1 6 4 3 1 7 3 4 2 4 3 5 3 5 4 6 3 4 3 This is sample data and the real data has hundreds of
2011 Dec 06
5
Argument validation within functions
Hi, I just started writing functions in R, and some questions popped up. I provide some values as arguments to my function, such as: function(a,b,c){} Now I want the function to first check whether the arguments are valid. E.g. argument "a" has to be a number in the range 0-1. How can that easily be done? So far I have: a <- as.numeric(a) if(0 <= a &&
2011 Aug 15
2
Extracting information from lm results (multiple model runs)
Just to inform: I posted that before in R-sig-ecology but as it might be interesting also for other useRs, I post it also to the general r-user list: Hello Alexandre, thank you very much. I also found another way to extract summarizing information from lm results over e.g. 1000 repeated model runs: results2 <- t(as.data.frame(results)) summary(results2) Although some questions popped up in
2012 Feb 03
2
Assigning objects to variable and variable to list in a for loop
Hello, I want to use a for loop to repeatedly calculate a maxent model (package dismo, function maxent()), which creates an object of the class maxent (S4). I want to collect all the resulting objects in a list. I tried to simplify my for loop to explain what I want. There are two problems/questions: 1) How can I create the new variables in the loop (using paste) and assign the objects 2) How
2012 May 31
3
Remove columns from dataframe based on their statistics
Hi, I have a dataframe and want to remove columns from it that are populated with a similar value (for the total column) (the variation of that column is 0). Is there an easier way than to calculate the statistics and then remove them by hand? A <- runif(100) B <- rep(1,100) C <- rep(2.42,100) D <- runif(100) df <- data.frame(A,B,C,D) # if want to conditionally remove column B and
2011 Aug 15
2
MCMC regress, using runif()
Hello, just to follow up on a question from last week. Here is what I've done so far (an example): library(MCMCpack) Y=c(15,14,23,18,19,9,19,13) X1=c(0.2,0.6,0.45,0.27,0.6,0.14,0.1,0.52) X2a=c(17,22,21,18,19,25,8,19) X2b=c(22,22,29,34,19,26,17,22) X2 <- function()runif(length(X2a), X2a, X2b) model1 <- MCMCregress(Y~X1+X2()) summary(model1) but I am not sure if my X2-function is
2012 May 11
2
text(): combine expression and line break
Hi, I would like to add some extra text to my plot. This should be a two-line text including a special character (sigma). So far I have tried to use expression in combination with paste and "\n"... but I can't get the line break... Here is what I've done so far: plot(1,type="n", xaxt='n', yaxt='n', ann=FALSE) text(1,1,labels=expression(paste(sigma,"\n
2011 Nov 03
2
variable transformation for lm
Hello, I am doing a simple regression using lm(Y~X). As my response and my predictor seem to be skewed, I can't meet the model assumptions. Therefore I need to transform my variables. I wanted to ask what is the preferred way to find out if predictor and/or response needs to be transformed and, if yes, how (log-transform?). I found a procedure in "A modern approach to Regression in
2012 Feb 14
3
Wildcard for indexing?
Hi, I'd like to know if it is possible to use wildcards * for indexing... E.g. I have a vector of strings. Now I'd like to select all elements which start with A_*? I'd also need to combine that with logical operators: "Select all elements of a vector that start with A (A*) OR that start with B (B*)" Probably that is quite easy. I looked into grep() which I think might
2008 Nov 20
0
More list to vector puzzle
Many thanks for the answers on my previous question, it got me started. Indeed, stack() was the function I was vaguely remembering. However, I didn't get very far because my data set is way more complicated than I expected. In fact I have a mixture of levels and lists within a list. Basically, it resembles the following list (named data) made of the levels H and the list of lists A and T. For each
2012 Jun 08
4
Sort 1-column dataframe with rownames
Hi, I have a 1-column dataframe with rownames and I want to sort it based on the single column. The typical procedure that is recommended in diverse posts is to use order in the index. But that "destroys" my dataframe structure. Probably there is a very simple solution. Here is a short reproducible example: x <- c(1,3,51,2,34,44,12,33,2,8) df <- data.frame(x) rownames(df) <-
2012 Jul 25
2
reshape -> reshape 2: function cast changed?
Hi, I used to use reshape and moved to reshape2 (R 2.15.1). Now I tried some of my older scripts and was surprised that my cast function wasn't working like before. What I did/want to do: 1) Melt a dataframe based on a vector specifying column names as measure.vars. That's working so far: dfm <- melt(df, measure.vars=n, variable_name = "species", na.rm = FALSE) 2) Recast the
2001 Nov 14
2
lme: how to extract the variance components?
Dear all, Here is the question: For example, using the "petrol" data offered with R. pet3.lme<-lme(Y~SG+VP+V10+EP,random=~1|No,data=petrol) pet3.lme$sigma gives the residual StdDev. But I can't figure out how to extract the "(intercept) StdDev", although it is in the print out if I do "summary(pet3.lme)". In
2011 Nov 01
1
Combine variables of different length
Hi, I have got a dataset with the variables Y,X1,X2,X3. Some of these variables contain NAs. Therefore incomplete datasets aren't recognized when I am doing a regression like: model <- lm(Y~X1+X2+X3) so the resulting vector of resid(model) is obviously shorter than the original variables. How can I combine the residuals-vector with the original dataset (Y,Xi,...). I recognize that the
2012 Mar 08
1
Save/Load function()-result to file in a loop
Hi, I am looking for a way to save the result of a function, e.g. the lm()-function, to a file and reload it again afterwards. I'd like to do that in order to minimize the memory used when running the function in a loop. The actual function I want to store is the evaluate() from the dismo package. I tried it with save() and load() but I am not sure if that is the way I should do it as I
2012 May 04
1
Generate strings from multiple variables
Hi, it is easiest to explain what I want to do by an example: let's assume there are two factors/variables: A <- c(1,2,3) B <- c(1,3,3) Now I would like to generate a list of strings that should look like ("A1_B1","A1_B2","A2_B1","A2_B2"). So actually the string contains all possible combinations of A and B (separated by _). This should be also
2012 May 05
1
Correct use of ddply with own function
Hi, I am really confused about how ddply works, so maybe you can help me. I created a function that sorts a vector etc. fn <- function(x){ x1 <- sort(x) x2 <- seq(length(x)) x3 <- x2/max(x2) df <- data.frame(x1,x2,x3) df } Probably this is not the best form of the function, but at least it produces what I want (data to plot a cumulative count curve). This function works on a
2012 Jan 02
2
summary per group
Hello, I know that it'll be quite easy to do what I want but somehow I am lost as I am new to R. I want to get summary results arranged by groups. In detail I'd like to get the number (levels) of Species per Family like for this dataset: SPEC <-
2012 Feb 08
2
Split dataframe into new dataframes
Hi, I want to split a dataframe based on a grouping variable (in one column). The resulting new dataframes should be stored in a new variable. I tried to split the dataframe using split() and to store it using a for loop, but that's not working so far: df <- data.frame(A=c("A1","A1","A2","A2"),B=seq(1:4)) Fsplit <- function(x,y){ ls <-