thr3ads.net - similar to: "subset grouped data with quantile and NA's"

Displaying 20 results from an estimated 10000 matches similar to: "subset grouped data with quantile and NA's"

1999 Feb 19

Potential problem with tapply

Is the following behaviour of tapply not disappointing? Problem with tapply occurs when dealing with na.rm when an argument additional to na.rm is sent to the applied function (here quantile). Any comment? Thank you, Philippe Lambert
> x <- c(12,10,12,2,4,11,3,7,2,1,18,7,NA,NA,7,5)
> fac <- gl(4,4,16)
> # Works fine
> tapply(x,fac,quantile,na.rm=T)
$"1"
0% 25%
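For readers on a current R release, here is a minimal sketch of the call pattern the post describes: `tapply` forwards everything after `FUN` through `...` to the applied function, so an extra argument can be passed together with `na.rm`. The choice of `probs = 0.5` is an assumption for illustration only; the post's failing call is truncated above and is not reproduced here.

```r
# Data from the post: four groups of four values, two NAs in group 4.
x <- c(12, 10, 12, 2, 4, 11, 3, 7, 2, 1, 18, 7, NA, NA, 7, 5)
fac <- gl(4, 4, 16)

# Arguments after FUN are forwarded to quantile().
# probs = 0.5 is an illustrative choice, not taken from the post.
med <- tapply(x, fac, quantile, probs = 0.5, na.rm = TRUE)
print(med)
```

Naming `na.rm` explicitly keeps it from being matched positionally to another argument of `quantile`.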

Examples of web-based Sweave use?

2011 Apr 04

I appreciate that this is OT, but I'd be grateful for pointers to examples of where Sweave has been used for web-based applications. In particular, examples of where reports/analyses are produced automatically through submission of data to a web server. I am mostly interested in situations where pdf reports have been produced rather than, say, a plot/table etc shown on a web page.

how use the subset?

2009 Jul 31

Hi everyone, I want to extract part of a dataset with subset. From the help (running help(subset)), the example is "*subset(airquality, Day == 1, select = -Temp)*", while my script, written as "*g1data<-subset(errdata, fac>12)*", fails with the error "*subset.default(newerrdata, fac>12),can not find fac*" and g1 in read

how to split a data framed with sequences

2008 Sep 09

Hi all, Given a data frame: my.df <- data.frame(a = c(1:5, 1:10, 1:20), b = runif(35)) I want to split it by "a" such that I end up with a list containing 3 components i.e. the first containing a = 1 to 5, the second a = 1 to 10 etc. In other words, sets of sequences of a. I can't seem to find the right form using the split function - can you help? Much appreciated. David
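One way to get the three components asked for, as a sketch and not necessarily the answer given in the original thread: build a run identifier that increases each time `a` restarts, then pass it to `split`. This assumes every sequence starts at 1, as in the example data; `run_id` and `pieces` are names introduced here for illustration.

```r
set.seed(1)  # only so the illustration is reproducible
my.df <- data.frame(a = c(1:5, 1:10, 1:20), b = runif(35))

# Each time 'a' restarts at 1 a new run begins; cumsum() turns those
# restarts into a run id (1, 2, 3, ...), assuming every run starts at 1.
run_id <- cumsum(my.df$a == 1)
pieces <- split(my.df, run_id)

# Number of rows in each component of the resulting list.
sapply(pieces, nrow)
```

If the sequences did not always start at 1, `cumsum(c(TRUE, diff(my.df$a) < 0))` would be one alternative way to detect the restarts.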

Sample rows in data frame by subsets

2006 Jan 23

Hi, I need to resample rows in a data frame by subsets
L3 <- LETTERS[1:3]
d <- data.frame(cbind(x=1, y=1:10), fac=sample(L3, 10, repl=TRUE))
   x  y fac
1  1  1   A
2  1  2   A
3  1  3   A
4  1  4   A
5  1  5   C
6  1  6   C
7  1  7   B
8  1  8   A
9  1  9   C
10 1 10   A
I have seen this used to sample rows with replacement
d[sample(nrow(d), replace=T), ]
  x y fac
7 1 7   B
2
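A possible sketch of resampling within each level of `fac`, so that every subset keeps its original size. The post is truncated, so the exact requirement is an assumption; `rows_by_group`, `resampled_idx` and `d_boot` are illustrative names.

```r
set.seed(42)  # reproducibility of the illustration only
L3 <- LETTERS[1:3]
d <- data.frame(x = 1, y = 1:10, fac = factor(sample(L3, 10, replace = TRUE)))

# Row indices grouped by level of 'fac'.
rows_by_group <- split(seq_len(nrow(d)), d$fac)

# Resample with replacement inside each group. Indexing with sample.int()
# avoids the sample(i) pitfall when a group has a single row.
resampled_idx <- unlist(lapply(rows_by_group, function(i) {
  i[sample.int(length(i), replace = TRUE)]
}))

d_boot <- d[resampled_idx, ]
table(d_boot$fac)  # same group sizes as table(d$fac)
```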

run function on subsets of matrix

2011 Mar 27

I was wondering if it is possible to do the following in a smarter way. I want to get the mean value across the columns of a matrix, but I want to do this on subsets of rows of the matrix, given by some vector (same length as the number of rows). Something like
nObs <- 6
nDim <- 4
m <- matrix(rnorm(nObs*nDim),ncol=nDim)
fac <- sample(1:(nObs/2),nObs,rep=T)
##loop through different
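A loop-free sketch of per-group column means, assuming that is what the truncated loop computed: `rowsum` adds up the rows of `m` within each group, and dividing by the group sizes gives the means. `group_sums`, `group_sizes` and `group_means` are names introduced here.

```r
set.seed(1)  # reproducibility of the illustration only
nObs <- 6
nDim <- 4
m <- matrix(rnorm(nObs * nDim), ncol = nDim)
fac <- sample(1:(nObs / 2), nObs, replace = TRUE)

# rowsum() sums the rows of m within each group of fac (groups sorted).
group_sums <- rowsum(m, fac)

# table() reports group sizes in the same sorted order, so the division
# recycles one size per row of group_sums.
group_sizes <- as.vector(table(fac))
group_means <- group_sums / group_sizes
group_means
```

Each row of `group_means` corresponds to one value of `fac`, and each column to one column of `m`.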

question about split

2011 Jun 17

Dear R-users I seem to be stumped on something simple. I want to split a data frame by factor levels given in one or more columns e.g. given dat <- data.frame(x = runif(100), fac1 = rep(c("a", "b", "c", "d"), each = 25), fac2 = rep(c("A", "B"), 50)) I know I can split it by fac1, fac2 by:

pairwise.t.test: empty p-table

2006 May 02

Hi list-members, can anybody tell me why
> pairwise.t.test(val, fac)
produces an empty p-table? As shown below:
Pairwise comparisons using t tests with pooled SD
data: val and fac
    AS AT Fhh Fm Fmk Fmu GBS Gf HFS Hn jAL Kol R_Fill
AT  -  -  -   -  -   -   -   -  -   -  -   -   -
Fhh -  -  -   -  -   -   -   -  -   -  -   -   -
Fm  -  -  -   -  -   -   -

Contingency table: logistic regression

2005 Mar 31

Hi, I am analyzing a data set with greater than 1000 independent cases (collected in an unrestricted manner), where each case has 3 variables associated with it: one, a factor variable with 0/1 levels (called XX), another factor variable with 8 levels (X) and a third response variable with two levels (Y: 0/1). I am trying to see if X1 has an effect on the relationship between X2 and the

[LLVMdev] Strange error for libLLVMCore.a

2009 Nov 05

mingw, llvm 2.6 (built with llvm-gcc)
Example source code: http://npcontemplation.blogspot.com/2008/06/secret-of-llvm-c-bindings.html
I change LLVMCreateJITCompiler(&engine, provider, &error); to LLVMCreateJITCompiler(&engine, provider, 3, &error);
$ llvm-gcc `llvm-config --cflags` -c fac.c
$ g++ `llvm-config --libs --cflags --ldflags core analysis executionengine jit

difference in sort order linux/Windows (R.2.11.0)

2010 May 28

Dear R users, I'm a bit perplexed with the effect sort has here, as it is different on Windows vs. linux. It makes my factor levels and subsequent plots different on the two systems. Given: types <- c("PC-D-Euro-0", "PC-D-Euro-1", "PC-D-Euro-2", "PC-D-Euro-3", "PC-D-Euro-4", "PC-D-Euro-5", "PC-D-Euro-6",

conditional statement to replace values in dataframe with NA

2012 Jun 07

Hello and thanks for helping.
#some data
L3 <- LETTERS[1:3]
dat1 <- data.frame(cbind(x=1, y=rep(1:3,2), fac=sample(L3, 6, replace=TRUE)))
#When x==1 and y==1 I want to replace the 1 values with NA
#I can select the rows I want:
dat2<-subset(dat1,x==1 & y==1)
#replace the 1 with NA
dat2$x<-rep(NA,nrow(dat2))
dat2$y<-rep(NA,nrow(dat2))
#select the other rows and rbind
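A shorter route, sketched under the assumption that the goal is only to blank out `x` and `y` where both equal 1: assign `NA` through a logical row index, which avoids the subset-and-rbind round trip. The data frame is built without `cbind()` here so that `x` and `y` stay numeric; `hit` is an illustrative name.

```r
set.seed(1)  # reproducibility of the illustration only
L3 <- LETTERS[1:3]
# Built without cbind() so x and y remain numeric (an assumption about intent).
dat1 <- data.frame(x = 1, y = rep(1:3, 2), fac = sample(L3, 6, replace = TRUE))

# Logical index of the rows where both conditions hold.
hit <- dat1$x == 1 & dat1$y == 1

# Replace both columns in place for those rows; other rows are untouched.
dat1[hit, c("x", "y")] <- NA
dat1
```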

Impossible to merge with a zero rows data frame?

2006 Sep 27

I'm trying to merge two data frames. One of them is a zero rows data frame. I'm using the merge parameter 'all.x = TRUE' so I'd expect to obtain all the rows of x. In fact the merge help says: all.x: logical; if 'TRUE', then extra rows will be added to the output, one for each row in 'x' that has no matching row in 'y'. These rows

Need help understanding output from aov and from anova

2009 Jun 03

Hi all, I noticed something strange when I ran aov and anova.
vtot=c(7.29917, 7.29917, 7.29917) #identical values
fac=as.factor(c(1,1,2)) #group 1 has first two elements, group 2 has the 3rd element
When I run:
> anova(lm(vtot~fac))
Analysis of Variance Table
Response: vtot
          Df     Sum Sq    Mean Sq F value Pr(>F)
fac        1 1.6818e-30 1.6818e-30  0.3333 0.6667
Residuals  1

GAM interactions, by example

2012 May 29

Dear all, I'm using the mgcv library by Simon Wood to fit gam models with interactions and I have been reading (and running) the "factor 'by' variable example" given on the gam.models help page (see below, output from the two first models b, and b1). The example explains that both b and b1 fits are similar: "note that the preceding fit (here b) is the same as

Change factor levels

2013 Dec 14

Suppose I have a dataframe 'd' defined as
L3 <- LETTERS[1:3]
d0 <- data.frame(cbind(x = 1, y = 1:10), fac = sample(L3, 10, replace = TRUE))
(d <- d0[d0$fac %in% c('A', 'B'),])
  x y fac
2 1 2   B
3 1 3   A
4 1 4   A
5 1 5   A
6 1 6   B
8 1 8   A
Even though factor 'fac' in 'd' only has 2 levels, it seems to bear the birthmark
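Assuming the truncated question is about the unused level that survives subsetting, a small sketch with `droplevels` shows the level being removed. `factor()` is called explicitly here because `data.frame()` has not converted strings to factors by default since R 4.0.

```r
set.seed(1)  # reproducibility of the illustration only
L3 <- LETTERS[1:3]
d0 <- data.frame(x = 1, y = 1:10,
                 fac = factor(sample(L3, 10, replace = TRUE), levels = L3))
d <- d0[d0$fac %in% c("A", "B"), ]

levels(d$fac)       # still lists "C" after subsetting

# droplevels() removes factor levels that no longer occur in the data.
d <- droplevels(d)
levels(d$fac)       # "C" is gone
```

`d$fac <- factor(d$fac)` would be an equivalent alternative for a single column.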

Colors in interaction plots

2013 Jan 17

Hi, I am trying to plot an interaction.plot with different color for each level of a factor. It has an erratic behavior. For example, it works for the first interaction.plot below, with the example from the ALDA book, but not with the other plots, from the NPK dataset: # from http://www.ats.ucla.edu/stat/r/examples/alda/ch2.htm tolerance <-

Sorting values within a raster

2011 Apr 21

I am working with a raster and want to take values assigned to each cell and sort them from largest to smallest, then cumulatively sum them together (in order from largest to smallest). I'll then be coding the individual cells such that the top 10% of the largest cell values can be visualized with one color, the next 10% with another and so on. I have tried a number of schemes but

Quantiles of a subset of data

2013 Feb 19

bradleyd wrote
> Excuse the request from an R novice! I have a data frame (DATA) that has
> two numeric columns (YEAR and DAY) and 4000 rows. For each YEAR I need to
> determine the 10% and 90% quantiles of DAY. I'm sure this is easy enough,
> but I am new to this.
>
>> quantile(DATA$DAY,c(0.1,0.9))
> 10% 90%
>  12  29
>
> But this is for the entire
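A sketch of the per-year version of that call, not necessarily the reply given in the thread: split `DAY` by `YEAR`, apply `quantile` to each piece, and bind the results into one matrix. The `DATA` frame below is a hypothetical stand-in, since the poster's data is not shown; `q_by_year` is an illustrative name.

```r
set.seed(1)  # reproducibility of the illustration only
# Hypothetical stand-in for the poster's DATA (the real values are not shown).
DATA <- data.frame(YEAR = rep(2001:2004, each = 1000),
                   DAY  = sample(1:40, 4000, replace = TRUE))

# One vector of DAY values per YEAR, then the two quantiles of each.
per_year <- lapply(split(DATA$DAY, DATA$YEAR), quantile, probs = c(0.1, 0.9))

# One row per YEAR, with columns "10%" and "90%".
q_by_year <- do.call(rbind, per_year)
q_by_year
```

`aggregate(DAY ~ YEAR, data = DATA, FUN = quantile, probs = c(0.1, 0.9))` is another base R option that returns a data frame instead of a matrix.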

match function causing bad performance when using table function on factors with multibyte characters on Windows

2011 Jan 21

[I originally posted this on the R-help mailing list, and it was suggested that R-devel would be a better place to discuss it.] Running 'table' on a factor with levels containing non-ASCII characters seems to result in extremely bad performance on Windows. Here's a simple example with benchmark results (I've reduced the number of replications to make the function finish within reasonable time):
