similar to: subset grouped data with quantile and NA's

1999 Feb 19
1
Potential problem with tapply
Is the following behaviour of tapply not disappointing? Problem with tapply occurs when dealing with na.rm when an argument additional to na.rm is sent to the applied function (here quantile). Any comment? Thank you, Philippe Lambert > x <- c(12,10,12,2,4,11,3,7,2,1,18,7,NA,NA,7,5) > fac <- gl(4,4,16) > # Works fine > tapply(x,fac,quantile,na.rm=T) $"1" 0% 25%
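A minimal sketch of how such a call behaves in current R releases, where arguments placed after FUN are forwarded to it through `...`. It reuses x and fac from the post; the probs values are an assumption chosen only for illustration, and nothing here reproduces the 1999 behaviour the poster describes.

```r
# Data from the post above.
x <- c(12, 10, 12, 2, 4, 11, 3, 7, 2, 1, 18, 7, NA, NA, 7, 5)
fac <- gl(4, 4, 16)

# Both extra arguments are named, so tapply() passes them on to quantile()
# unchanged. probs = c(0.25, 0.75) is an illustrative choice, not from the post.
res <- tapply(x, fac, quantile, probs = c(0.25, 0.75), na.rm = TRUE)

# One element per level of fac; the fourth group contains the two NAs.
print(res[[4]])
```

Naming every extra argument (and writing TRUE rather than T) avoids surprises from positional or partial matching.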
2011 Apr 04
2
Examples of web-based Sweave use?
I appreciate that this is OT, but I'd be grateful for pointers to examples of where Sweave has been used for web-based applications. In particular, examples of where reports/analyses are produced automatically through submission of data to a web-server. I am mostly interested in situations where pdf reports have been produced rather than, say, a plot/table etc shown on a web page.
2009 Jul 31
2
how use the subset?
Hi, everyone. I want to extract part of a dataset using subset. From the help page, shown by running help(subset), the example is "*subset(airquality, Day == 1, select = -Temp)* ", but when I run my script, written as "*g1data<-subset(errdata, fac>12) *", it fails with the error "*subset.default(newerrdata, fac>12),can not find fac*" and g1 in read
2008 Sep 09
1
how to split a data framed with sequences
Hi all, Given a data frame: my.df <- data.frame(a = c(1:5, 1:10, 1:20), b = runif(35)) I want to split it by "a" such that I end up with a list containing 3 components i.e. the first containing a = 1 to 5, the second a = 1 to 10 etc. In other words, sets of sequences of a. I can't seem to find the right form using the split function - can you help? Much appreciated. David
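One possible approach, sketched under the assumption that a new sequence starts whenever "a" drops below its previous value: build a run index with cumsum() and hand it to split(). The names grp and parts are introduced here for illustration only.

```r
set.seed(1)
my.df <- data.frame(a = c(1:5, 1:10, 1:20), b = runif(35))

# TRUE marks the first row of each sequence: the very first row, plus every
# row where `a` decreases relative to the row before it.
grp <- cumsum(c(TRUE, diff(my.df$a) < 0))

# split() returns a list with one data frame per run.
parts <- split(my.df, grp)
print(sapply(parts, nrow))
```

If every sequence is known to restart at exactly 1, `cumsum(my.df$a == 1)` would give the same grouping.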
2006 Jan 23
1
Sample rows in data frame by subsets
Hi, I need to resample rows in a data frame by subsets L3 <- LETTERS[1:3] d <- data.frame(cbind(x=1, y=1:10), fac=sample(L3, 10, repl=TRUE)) x y fac 1 1 1 A 2 1 2 A 3 1 3 A 4 1 4 A 5 1 5 C 6 1 6 C 7 1 7 B 8 1 8 A 9 1 9 C 10 1 10 A I have seen this used to sample rows with replacement d[sample(nrow(d), replace=T), ] x y fac 7 1 7 B 2
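A hedged sketch of resampling with replacement inside each level of fac, so that every group keeps its original size. The data frame is rebuilt with the construction from the post; idx and boot are hypothetical names, and this is only one of several ways to do it.

```r
set.seed(123)
L3 <- LETTERS[1:3]
d <- data.frame(cbind(x = 1, y = 1:10), fac = sample(L3, 10, replace = TRUE))

# Row indices grouped by fac, then resampled with replacement within each group.
# sample.int(length(i)) is used instead of sample(i) because sample() treats a
# single number n as 1:n, which would be wrong for groups with one row.
idx <- unlist(lapply(split(seq_len(nrow(d)), d$fac),
                     function(i) i[sample.int(length(i), replace = TRUE)]))

boot <- d[idx, ]
print(table(boot$fac))
```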
2011 Mar 27
1
run function on subsets of matrix
I was wondering if it is possible to do the following in a smarter way. I want to get the mean value across the columns of a matrix, but I want to do this on subsets of the rows of the matrix, given by some vector (same length as the number of rows). Something like nObs<- 6 nDim <- 4 m <- matrix(rnorm(nObs*nDim),ncol=nDim) fac<-sample(1:(nObs/2),nObs,rep=T) ##loop through different
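A loop-free sketch of per-group column means, assuming the goal is one row of means per distinct value of fac. It uses rowsum() from base R to sum rows by group and divides by the group sizes; sums, counts and group_means are names introduced here.

```r
set.seed(42)
nObs <- 6
nDim <- 4
m <- matrix(rnorm(nObs * nDim), ncol = nDim)
fac <- sample(1:(nObs / 2), nObs, replace = TRUE)

# Column sums within each group; rows come back sorted by group value.
sums <- rowsum(m, fac)

# Group sizes in the same sorted order.
counts <- as.vector(table(fac))

# A vector divides a matrix down the columns, so row i is divided by counts[i].
group_means <- sums / counts
print(group_means)
```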
2011 Jun 17
1
question about split
Dear R-users I seem to be stumped on something simple. I want to split a data frame by factor levels given in one or more columns e.g. given dat <- data.frame(x = runif(100), fac1 = rep(c("a", "b", "c", "d"), each = 25), fac2 = rep(c("A", "B"), 50)) I know I can split it by fac1, fac2 by:
2006 May 02
1
pairwise.t.test: empty p-table
Hi list-members can anybody tell me why > pairwise.t.test(val, fac) produces an empty p-table. As shown below: Pairwise comparisons using t tests with pooled SD data: val and fac AS AT Fhh Fm Fmk Fmu GBS Gf HFS Hn jAL Kol R_Fill AT - - - - - - - - - - - - - Fhh - - - - - - - - - - - - - Fm - - - - - - -
2005 Mar 31
1
Contingency table: logistic regression
Hi, I am analyzing a data set with greater than 1000 independent cases (collected in an unrestricted manner), where each case has 3 variables associated with it: one, a factor variable with 0/1 levels (called XX), another factor variable with 8 levels (X) and a third response variable with two levels (Y: 0/1). I am trying to see if X1 has an effect on the relationship between X2 and the
2009 Nov 05
2
[LLVMdev] Strange error for libLLVMCore.a
mingw, llvm 2.6 (build with llvm-gcc) Example source code: http://npcontemplation.blogspot.com/2008/06/secret-of-llvm-c-bindings.html I change LLVMCreateJITCompiler(&engine, provider, &error); to LLVMCreateJITCompiler(&engine, provider, 3, &error); $ llvm-gcc `llvm-config --cflags` -c fac.c $ g++ `llvm-config --libs --cflags --ldflags core analysis executionengine jit
2010 May 28
5
difference in sort order linux/Windows (R.2.11.0)
Dear R users, I'm a bit perplexed with the effect sort has here, as it is different on Windows vs. linux. It makes my factor levels and subsequent plots different on the two systems. Given: types <- c("PC-D-Euro-0", "PC-D-Euro-1", "PC-D-Euro-2", "PC-D-Euro-3", "PC-D-Euro-4", "PC-D-Euro-5", "PC-D-Euro-6",
2012 Jun 07
3
conditional statement to replace values in dataframe with NA
Hello and thanks for helping. #some data L3 <- LETTERS[1:3] dat1 <- data.frame(cbind(x=1, y=rep(1:3,2), fac=sample(L3, 6, replace=TRUE))) #When x==1 and y==1 I want to replace the 1 values with NA #I can select the rows I want: dat2<-subset(dat1,x==1 & y==1) #replace the 1 with NA dat2$x<-rep(NA,nrow(dat2)) dat2$y<-rep(NA,nrow(dat2)) #select the other rows and rbind
2006 Sep 27
1
Impossible to merge with a zero rows data frame?
I'm trying to merge two data frames. One of them is a zero rows data frame. I'm using the merge parameter 'all.x = TRUE' so I'd expect to obtain all the rows of x. In fact the merge help says: all.x: logical; if 'TRUE', then extra rows will be added to the output, one for each row in 'x' that has no matching row in 'y'. These rows
2009 Jun 03
1
Need help understanding output from aov and from anova
Hi all, I noticed something strange when I ran aov and anova. vtot=c(7.29917, 7.29917, 7.29917) #identical values fac=as.factor(c(1,1,2)) #group 1 has first two elements, group 2 has the 3rd element When I run: > anova(lm(vtot~fac)) Analysis of Variance Table Response: vtot Df Sum Sq Mean Sq F value Pr(>F) fac 1 1.6818e-30 1.6818e-30 0.3333 0.6667 Residuals 1
2012 May 29
1
GAM interactions, by example
Dear all, I'm using the mgcv library by Simon Wood to fit gam models with interactions and I have been reading (and running) the "factor 'by' variable example" given on the gam.models help page (see below, output from the two first models b, and b1). The example explains that both b and b1 fits are similar: "note that the preceding fit (here b) is the same as
2013 Dec 14
2
Change factor levels
Suppose I have a dataframe 'd' defined as L3 <- LETTERS[1:3] d0 <- data.frame(cbind(x = 1, y = 1:10), fac = sample(L3, 10, replace = TRUE)) (d <- d0[d0$fac %in% c('A', 'B'),]) x y fac 2 1 2 B 3 1 3 A 4 1 4 A 5 1 5 A 6 1 6 B 8 1 8 A Even though factor 'fac' in 'd' only has 2 levels, but it seems to bear the birthmark
2013 Jan 17
3
Colors in interaction plots
Hi, I am trying to plot an interaction.plot with different color for each level of a factor. It has an erratic behavior. For example, it works for the first interaction.plot below, with the example from the ALDA book, but not with the other plots, from the NPK dataset: # from http://www.ats.ucla.edu/stat/r/examples/alda/ch2.htm tolerance <-
2011 Apr 21
1
Sorting values within a raster
I am working with a raster and want to take values assigned to each cell and sort them from largest to smallest, then cumulatively sum them together (in order from largest to smallest). I'll then be coding the individual cells such that the top 10% of the largest cell values can be visualized with one color, the next 10% with another and so on. I have tried a number of schemes but
2013 Feb 19
3
Quantiles of a subset of data
bradleyd wrote > Excuse the request from an R novice! I have a data frame (DATA) that has > two numeric columns (YEAR and DAY) and 4000 rows. For each YEAR I need to > determine the 10% and 90% quantiles of DAY. I'm sure this is easy enough, > but I am new to this. > >> quantile(DATA$DAY,c(0.1,0.9)) > 10% 90% > 12 29 > > But this is for the entire
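A minimal sketch of per-year quantiles, assuming a data frame shaped like the one described (numeric YEAR and DAY columns, 4000 rows). The values below are synthetic stand-ins, since the poster's data is not shown, and q is a name introduced here.

```r
set.seed(1)
# Synthetic stand-in for the poster's DATA: 4 years x 1000 rows.
DATA <- data.frame(YEAR = rep(2001:2004, each = 1000),
                   DAY  = sample(1:40, 4000, replace = TRUE))

# split() groups DAY by YEAR; sapply() applies quantile() to each group and
# returns a matrix with one column per year. t() puts the years in rows.
# na.rm = TRUE makes the call tolerate missing DAY values.
q <- t(sapply(split(DATA$DAY, DATA$YEAR), quantile,
              probs = c(0.1, 0.9), na.rm = TRUE))
print(q)
```

The result has one row per YEAR and columns named "10%" and "90%", which makes it easy to merge back onto DATA for subsetting.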
2011 Jan 21
1
match function causing bad performance when using table function on factors with multibyte characters on Windows
[I originally posted this on the R-help mailing list, and it was suggested that R-devel would be a better place to discuss it.] Running 'table' on a factor with levels containing non-ASCII characters seems to result in extremely bad performance on Windows. Here's a simple example with benchmark results (I've reduced the number of replications to make the function finish within reasonable time):