thr3ads.net - similar to: "Filtering out bad data points"

Displaying 20 results from an estimated 500 matches similar to: "Filtering out bad data points"

Logistic Regression - Variable Selection Methods With Prediction

2011 Oct 25

Hello, I am pretty new to R; I have always used SAS and SAS products. My target variable is binary ('Y' and 'N') and I have about 14 predictor variables. My goal is to compare different variable selection methods such as forward, backward, and all possible subsets. I am using misclassification rate to pick the winning method. This is what I have as of now: Reg <- glm (Graduation ~.,

Data frame vs matrix quirk: Hinky error message?

2012 May 01

AdvisoRs: Is the following a bug, feature, hinky error message, or dumb Bert?
    > mtest <- matrix(1:12,nr=4)
    > dftest <- data.frame(mtest)
    > ix <- cbind(1:2,2:3)
    > mtest[ix] <- NA
    > mtest
         [,1] [,2] [,3]
    [1,]    1   NA    9
    [2,]    2    6   NA
    [3,]    3    7   11
    [4,]    4    8   12
    ## But ...
    > dftest[ix] <- NA
    Error in `[<-.data.frame`(`*tmp*`, ix, value

write.table with row.names=FALSE unnecessarily slow?

2008 Mar 10

write.table with large data frames takes quite a long time:
    > system.time({
    +     write.table(df, '/tmp/dftest.txt', row.names=FALSE)
    + }, gcFirst=TRUE)
       user  system elapsed
     97.302   1.532  98.837
One reason is that dimnames is always called, causing 'anonymous' row names to be created as character vectors. Avoiding this in src/library/utils, along the lines of Index:

Conditionally adding a constant

2012 Jan 02

I am trying to add a constant to the previous value of a variable based on certain conditions. Maybe there is a simple way to do this that I am missing completely. I have given an example below:
    df <- data.frame(x = c(1,2,3,4,5), y = c(10,20,30,NA,NA))
    > df
      x  y
    1 1 10
    2 2 20
    3 3 30
    4 4 NA
    5 5 NA
I want to add 2 to the previous value of y, if x exceeds 3 (also will have to handle NAs in

missing value where TRUE/FALSE needed with R ipolygrowth

2025 May 09

Dear R-Help, I am trying to determine the growth rate of bacteria under specific conditions using the ipolygrowth function `ipg_multisample`. While this worked before, I got some data that give the error:
```
Error in if (tb.result$peak.growth.time == 0) { : missing value where TRUE/FALSE needed
In addition: Warning message:
In max(pgr[pgr > 0 & Re(x) >= 0 & Re(x) <= max]) :
```

add constraints to nls or use another function

2012 Jun 28

Hello, I'm trying to fit experimental data with a model and nls. For some experiments, I have data with x from 0 to 1.2 and the fit is quite good. But it can happen that I have data only in the [0, 0.8] range (see the example below), and then the fit is not correct. I would like to add a constraint, for example: the second derivative must be positive. But I don't know how to add this to

timezone specification on windows machine

2011 May 14

Hi, I'm wondering what the right value is for specifying the "America/New_York" time zone on a Windows machine. My code, which specifies this time zone in the as.POSIXct function, works properly with this value on a Linux machine, but it keeps complaining on Windows. Thank you. Cheers, Robert

AFTREG weights

2012 Sep 04

On Wed, Aug 1, 2012 at 3:08 PM, <fra.meucci@hotmail.it> wrote:
> Dear Göran Broström,
> I am trying to use AFTREG function for R to estimate a loglogistic
> survival function, including time dependent covariates.
> Actually, my Subset includes some partial events; the idea is to model
> this kind of events using something similar to “weights” in the SURVREG
> function.

Finding percentile of a value from an empirical distribution

2012 Jan 11

Hello, I am not sure how to do this in R. Any suggestion would be appreciated. I have a vector of values from which I build an empirical CDF. For example:
    > x <- seq(1,100)
    > x <- sample(x,1000,replace=T)
    > quantile(x,probs=seq(0,1,.05))
    0% 5% 10% 15% 20% 25% 30% 35% 40% 45% 50% 55%
    1.00 5.00 10.00 16.00 20.00 25.00 31.00 36.00 41.00

AsOf join in R

2011 Oct 05

Hi, I tried to google for a solution for an asof join operator in R, but I couldn't find one. The asof join operator AsOf(A,B) merges 2 time series by looking for the latest available value of B prior to each time point in A. For example, A <- xts(c(10,15,20,25), order.by=as.POSIXct(c("2011-09-01","2011-09-09","2011-09-10","2011-09-15"))) B <-

Product integral in R

2011 Dec 18

Hi, I am wondering if anybody has ever come across an implementation of the product integral in R? As far as I have googled, I haven't come across any package. Is there one? Thank you. http://en.wikipedia.org/wiki/Product_integral Regards, Robert

Double integration using R

2011 Nov 06

Hi, I have a function on which I need to perform double integration: \int_0^T \int_0^t N(\delta / \sigma \sqrt{u}) (1 - N(\delta / \sigma \sqrt{u})) du dt where N(x) is a standard normal probability of x. I start by writing the inner integral as a function, meaning \int_0^t N(\delta / \sigma \sqrt{u}) (1 - N(\delta / \sigma \sqrt{u})) du. I then call the integrate function on this function. This

strange result with contrasts

2004 Apr 20

Hello, I'm trying to reproduce a SAS result with R (after I got suspicious of the result in R). I struggle with the contrasts in a linear model. I've got three factors:
    > d$dose <- as.factor(d$dose) # 5 levels
    > d$time <- as.factor(d$time) # 2 levels
    > d$batch <- as.factor(d$batch) # 3 levels
The data frame d contains 82 rows. There are 2 to 4 replicates of

seq(0.05,0.95,by=0.002) and logical error

2000 Dec 10

Regardless of which version -- 1.1.1 or 1.2.0 (2000-11-27) -- with a fresh "directory" (i.e. no .RData), I am getting an extremely weird result.
    R : Copyright 2000, The R Development Core Team
    Version 1.2.0 Under development (unstable) (2000-11-27)
    > jj _ seq(0.05,0.95,by=0.002)
    > sum(jj==0.75) ## WRONG ANSWER
    [1] 0
    > 0.05 + 350*.002 ## Double check that 0.75 is in jj
    [1]

Handling of irregular time series in lineChart

2011 Apr 29

Hi, I realized that when I feed an irregular series into lineChart, the spacing of the points in the chart does not seem to respect the irregular time intervals I specified in my input xts time series. Instead, lineChart seems to treat the points as an equally spaced time series. For example, I have the following code: library(quantmod) options(digits.sec=3) t0 <-

[LLVMdev] [RFC] Benchmarking subset of the test suite

2014 May 04

At the LLVM Developers' Meeting in November, I promised to work on isolating a subset of the current test suite that is useful for benchmarking. Having looked at this in more detail, most of the applications and benchmarks in the test suite are useful for benchmarking, and so I think that a better way of phrasing it is that we should construct a list of programs in the test suite that are not

chi2

2007 Oct 10

Hello, I want to use the quantile function, so I read the documentation, but I don't understand this result:
    > qchisq(seq(0.05,0.95,by=0.05),df=(length(don)-1))
     [1] 62667.11 62795.62 62882.42 62951.47 63010.74 63064.00 63113.39 63160.27 63205.65 63250.33 63295.04 63340.48 63387.48 63437.03 63490.53 63550.14 63619.68
    [18] 63707.24 63837.16
Can you help me, please?

bootstrap

2007 Apr 27

Dear All, I would like to use a nonparametric bootstrap to calculate the confidence intervals for the 5% and 95% quantiles using boot.ci. As you know, boot.ci requires the use of boot to generate bootstrap replicates for my statistic. However this last function doesn't work in my case because I am missing something. Here is an example y <- rnorm(100) Quantile <-

R quantreg anova: How to change summary se-type

2012 May 28

Hi folks =) I want to check whether a coefficient has an impact on a quantile regression (by applying the sup-Wald test for a given quantile range [0.05,0.95]). Therefore I am doing the following calculations:
    a=0;
    for (i in 5:95/100){
      fitrestricted=rq(Y~X1+X2,tau=i)
      fitunrestricted=rq(Y~X1+X2+X3,tau=i)
      a[i]=anova(fitrestricted,fitunrestricted)$table$Tn # gives the test value
    }
    supW=max(a)
As anova

Change values in a dateframe

2013 Jul 24

Hello, I have the following problem: the dataframe TEST has multiple rows for the same person because there are different values of Nom or different values of Prenom, while the values of Matricule, Sexe and Date.de.naissance are the same. TEST <- structure(list(Matricule = c(66L, 67L, 67L, 68L, 89L, 90L, 90L, 91L, 108L, 108L, 108L), Nom = structure(c(1L, 2L, 2L, 4L, 8L, 5L, 6L, 9L, 3L, 3L,
