Displaying 20 results from an estimated 10000 matches similar to: "subset grouped data with quantile and NA's"
1999 Feb 19
1
Potential problem with tapply
Is the following behaviour of tapply not disappointing?
Problem with tapply occurs when dealing with na.rm when an
argument additional to na.rm is sent to the applied function (here
quantile).
Any comment?
Thank you,
Philippe Lambert
> x <- c(12,10,12,2,4,11,3,7,2,1,18,7,NA,NA,7,5)
> fac <- gl(4,4,16)
> # Works fine
> tapply(x,fac,quantile,na.rm=T)
$"1"
0% 25%
2011 Apr 04
2
Examples of web-based Sweave use?
I appreciate that this is OT, but I'd be grateful for pointers to examples of
where Sweave has been used for web-based applications. In particular, examples of
where reports/analyses are produced automatically through submission of data
to a web-server. I am mostly interested in situations where pdf reports have
been produced rather than, say, a plot/table etc shown on a web page.
2009 Jul 31
2
how use the subset?
Hi, everyone
I want to extract part of a dataset with subset.
From the help (running help(subset)), the example is "*subset(airquality,
Day == 1, select = -Temp)* "
When I run my script, written as "*g1data<-subset(errdata, fac>12) *",
it fails with the error "*subset.default(newerrdata,
fac>12),can not find fac*"
and g1 in read
2008 Sep 09
1
how to split a data framed with sequences
Hi all,
Given a data frame:
my.df <- data.frame(a = c(1:5, 1:10, 1:20), b = runif(35))
I want to split it by "a" such that I end up with a list containing 3
components i.e. the first containing a = 1 to 5, the second a = 1 to 10 etc.
In other words, sets of sequences of a.
I can't seem to find the right form using the split function - can you help?
Much appreciated.
David
2006 Jan 23
1
Sample rows in data frame by subsets
Hi,
I need to resample rows in a data frame by subsets
L3 <- LETTERS[1:3]
d <- data.frame(cbind(x=1, y=1:10), fac=sample(L3, 10, repl=TRUE))
x y fac
1 1 1 A
2 1 2 A
3 1 3 A
4 1 4 A
5 1 5 C
6 1 6 C
7 1 7 B
8 1 8 A
9 1 9 C
10 1 10 A
I have seen this used to sample rows with replacement
d[sample(nrow(d), replace=T), ]
x y fac
7 1 7 B
2
2011 Mar 27
1
run function on subsets of matrix
I was wondering if it is possible to do the following in a smarter way.
I want to get the mean value across the columns of a matrix, but I want
to do this on subrows of the matrix, given by some vector (same length
as the number of rows). Something like
nObs<- 6
nDim <- 4
m <- matrix(rnorm(nObs*nDim),ncol=nDim)
fac<-sample(1:(nObs/2),nObs,rep=T)
##loop through different
2011 Jun 17
1
question about split
Dear R-users
I seem to be stumped on something simple. I want to split a data frame
by factor levels given in one or more columns e.g. given
dat <- data.frame(x = runif(100),
fac1 = rep(c("a", "b", "c", "d"), each = 25),
fac2 = rep(c("A", "B"), 50))
I know I can split it by fac1, fac2 by:
2006 May 02
1
pairwise.t.test: empty p-table
Hi list-members
can anybody tell me why
> pairwise.t.test(val, fac)
produces an empty p-table. As shown below:
Pairwise comparisons using t tests with pooled SD
data: val and fac
AS AT Fhh Fm Fmk Fmu GBS Gf HFS Hn jAL Kol R_Fill
AT - - - - - - - - - - - - -
Fhh - - - - - - - - - - - - -
Fm - - - - - - -
2005 Mar 31
1
Contingency table: logistic regression
Hi,
I am analyzing a data set with greater than 1000 independent cases
(collected in an unrestricted manner), where each case has 3 variables
associated with it: one, a factor variable with 0/1 levels (called XX),
another factor variable with 8 levels (X) and a third response variable
with two levels (Y: 0/1). I am trying to see if X1 has an effect on the
relationship between X2 and the
2009 Nov 05
2
[LLVMdev] Strange error for libLLVMCore.a
mingw, llvm 2.6 (build with llvm-gcc)
Example source code:
http://npcontemplation.blogspot.com/2008/06/secret-of-llvm-c-bindings.html
I change
LLVMCreateJITCompiler(&engine, provider, &error);
to
LLVMCreateJITCompiler(&engine, provider, 3, &error);
$ llvm-gcc `llvm-config --cflags` -c fac.c
$ g++ `llvm-config --libs --cflags --ldflags core analysis
executionengine jit
2010 May 28
5
difference in sort order linux/Windows (R.2.11.0)
Dear R users,
I'm a bit perplexed with the effect sort has here, as it is different on
Windows vs. linux.
It makes my factor levels and subsequent plots different on the two systems.
Given:
types <- c("PC-D-Euro-0", "PC-D-Euro-1", "PC-D-Euro-2", "PC-D-Euro-3",
"PC-D-Euro-4", "PC-D-Euro-5", "PC-D-Euro-6",
2012 Jun 07
3
conditional statement to replace values in dataframe with NA
Hello and thanks for helping.
#some data
L3 <- LETTERS[1:3]
dat1 <- data.frame(cbind(x=1, y=rep(1:3,2), fac=sample(L3, 6, replace=TRUE)))
#When x==1 and y==1 I want to replace the 1 values with NA
#I can select the rows I want:
dat2<-subset(dat1,x==1 & y==1)
#replace the 1 with NA
dat2$x<-rep(NA,nrow(dat2))
dat2$y<-rep(NA,nrow(dat2))
#select the other rows and rbind
2006 Sep 27
1
Impossible to merge with a zero rows data frame?
I'm trying to merge two data frames. One of them is a zero rows data
frame.
I'm using the merge parameter 'all.x = TRUE' so I'd expect to obtain all
the rows of x. In fact the merge help says:
all.x: logical; if 'TRUE', then extra rows will be added to the
output, one for each row in 'x' that has no matching row in
'y'. These rows
2009 Jun 03
1
Need help understanding output from aov and from anova
Hi all,
I noticed something strange when I ran aov and anova.
vtot=c(7.29917, 7.29917, 7.29917) #identical values
fac=as.factor(c(1,1,2)) #group 1 has first two elements, group 2 has the 3rd element
When I run:
> anova(lm(vtot~fac))
Analysis of Variance Table
Response: vtot
Df Sum Sq Mean Sq F value Pr(>F)
fac 1 1.6818e-30 1.6818e-30 0.3333 0.6667
Residuals 1
2012 May 29
1
GAM interactions, by example
Dear all,
I'm using the mgcv library by Simon Wood to fit gam models with interactions and I have been reading (and running) the "factor 'by' variable example" given on the gam.models help page (see below, output from the two first models b, and b1).
The example explains that both b and b1 fits are similar: "note that the preceding fit (here b) is the same as
2013 Dec 14
2
Change factor levels
Suppose I have a dataframe 'd' defined as
L3 <- LETTERS[1:3]
d0 <- data.frame(cbind(x = 1, y = 1:10), fac = sample(L3, 10, replace
= TRUE))
(d <- d0[d0$fac %in% c('A', 'B'),])
x y fac
2 1 2 B
3 1 3 A
4 1 4 A
5 1 5 A
6 1 6 B
8 1 8 A
Even though factor 'fac' in 'd' only has 2 levels, it seems to bear the
birthmark
2013 Jan 17
3
Colors in interaction plots
Hi,
I am trying to plot an interaction.plot with different color for each
level of a factor. It has an erratic behavior.
For example, it works for the first interaction.plot below, with the
example from the ALDA book, but not with the other plots, from the NPK
dataset:
# from http://www.ats.ucla.edu/stat/r/examples/alda/ch2.htm
tolerance <-
2011 Apr 21
1
Sorting values within a raster
I am working with a raster and want to take values assigned to each
cell and sort them from largest to smallest, then cumulatively sum
them together (in order from largest to smallest). I'll then be
coding the individual cells such that the top 10% of the largest cell
values can be visualized with one color, the next 10% with another and
so on.
I have tried a number of schemes but
2013 Feb 19
3
Quantiles of a subset of data
bradleyd wrote
> Excuse the request from an R novice! I have a data frame (DATA) that has
> two numeric columns (YEAR and DAY) and 4000 rows. For each YEAR I need to
> determine the 10% and 90% quantiles of DAY. I'm sure this is easy enough,
> but I am new to this.
>
>> quantile(DATA$DAY,c(0.1,0.9))
> 10% 90%
> 12 29
>
> But this is for the entire
2011 Jan 21
1
match function causing bad performance when using table function on factors with multibyte characters on Windows
[I originally posted this on the R-help mailing list, and it was suggested that R-devel would be a better
place to discuss it.]
Running 'table' on a factor with levels containing non-ASCII characters
seems to result in extremely bad performance on Windows. Here's a simple
example with benchmark results (I've reduced the number of replications to
make the function finish within reasonable time):