thr3ads.net - similar to: "Eliminate cases in a subset of a dataframe"

Displaying 20 results from an estimated 1000 matches similar to: "Eliminate cases in a subset of a dataframe"

2012 May 29

setting parameters equal in lm

Forgive me if this is a trivial question, but I couldn't find an answer in earlier forum posts. I'm trying to reproduce some SAS results where they set two parameters equal. For example: y = b1X1 + b2X2 + b1X3. Notice that the variables X1 and X3 both have the same slope and the intercept has been removed. How do I get an estimate of this regression model? I know how to remove the intercept
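One common way to impose the equality constraint described above is to regress on the sum X1 + X3, so that both variables are forced to share a single coefficient. The sketch below assumes simulated data (the values, seed, and sample size are made up for illustration; only the names y, X1, X2, X3 come from the question) and is not necessarily what the SAS procedure does internally.

```r
# Simulated data; only the variable names come from the question.
set.seed(42)
n  <- 200
X1 <- rnorm(n)
X2 <- rnorm(n)
X3 <- rnorm(n)
y  <- 2 * X1 + 0.5 * X2 + 2 * X3 + rnorm(n, sd = 0.1)

# I(X1 + X3) gives X1 and X3 one shared slope (b1);
# "0 +" removes the intercept.
fit <- lm(y ~ 0 + I(X1 + X3) + X2)
coef(fit)
```

The first coefficient is the common slope b1 and the second is b2.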

2011 Oct 18

cygwing warming when creating a package in windows

Dear All, I am a beginner creating R packages. I followed the Leisch (2009) tutorial and the document "Writing R Extensions" to write an example. I installed R 2.12.2 (I also tried R 2.13.2), the latest version of Rtools and the recommended packages on a PC with Windows 7 Home Premium. I can run R CMD INSTALL linmod in the command prompt and the R CMD check linmod. The following outputs are

2013 Mar 12

Cook's distance

Dear useRs, I have some trouble with the calculation of Cook's distance in R. The formula for Cook's distance can be found for example here: http://en.wikipedia.org/wiki/Cook%27s_distance I tried to apply it in R: > y <- (1:400)^2 > x <- 1:100 > lm(y~x) -> linmod # just for the sake of a simple example >
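Since the post is cut off, the exact difficulty is unknown. As a rough check, Cook's distance can be computed by hand from the residuals and leverages and compared with the built-in cooks.distance(). The sketch below uses made-up data with x and y of equal length (an assumption; the vectors in the question differ in length).

```r
# Made-up data: x and y have the same length here.
set.seed(1)
x   <- 1:100
y   <- 3 + 2 * x + rnorm(100)
fit <- lm(y ~ x)

e  <- residuals(fit)                # raw residuals
h  <- hatvalues(fit)                # leverages (diagonal of the hat matrix)
p  <- length(coef(fit))             # number of estimated coefficients
s2 <- sum(e^2) / df.residual(fit)   # residual variance estimate

# D_i = e_i^2 / (p * s^2) * h_i / (1 - h_i)^2
d_manual <- e^2 / (p * s2) * h / (1 - h)^2

all.equal(unname(d_manual), unname(cooks.distance(fit)))
```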

2024 May 15

Extracting values from Surv function in survival package

OS X R 4.3.3 Colleagues I have created objects using the Surv function in the survival package: > FIT.1 Call: survfit(formula = FORMULA1) n events median 0.95LCL 0.95UCL SUBDATA$ARM=1, SUBDATA[, EXP.STRAT]=0 18 13 345 156 NA SUBDATA$ARM=2, SUBDATA[, EXP.STRAT]=1 13 5 NA 186 NA SUBDATA$ARM=2, SUBDATA[, EXP.STRAT]=2 5

2024 May 16

Extracting values from Surv function in survival package

Hi Dennis, look at the help page for summary.survfit, the Value n.event. Göran On 2024-05-15 22:41, Dennis Fisher wrote: > OS X > R 4.3.3 > > Colleagues > > I have created objects using the Surv function in the survival package: >> FIT.1 > Call: survfit(formula = FORMULA1) > > n events median 0.95LCL 0.95UCL >

2009 Oct 26

What is the most efficient practice to develop an R package?

I am reading Sections 5 and 6 of http://cran.r-project.org/doc/contrib/Leisch-CreatingPackages.pdf It seems that I have to do the following two steps in order to make an R package. But while I am testing the package, these two steps will be run many times, which may take a lot of time. So while I am still developing the package, should I always source('linmod.R') to test it? Once the code in

2010 Apr 27

Problem calculating multiple regressions on a data frame.

Hi there, I am stuck trying to solve what should be a fairly easy problem. I have a data frame that essentially consists of (ID, time as seqMonth, variable, value) and I want to find the regression coefficient of value vs time for each combination of ID and variable. I have tried several approaches and none of them seems to work as I expected. For example, I have tried:
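One base R approach to the per-group regression described above is to split the data frame on the ID/variable combination and fit lm() within each piece. This is a sketch on invented data; the column names follow the question, while the values and the object names groups and slopes are hypothetical.

```r
# Invented data with the columns named in the question.
set.seed(1)
df <- expand.grid(seqMonth = 1:12,
                  variable = c("v1", "v2"),
                  ID       = c("a", "b"))
df$value <- 0.5 * df$seqMonth + rnorm(nrow(df))

# One data frame per (ID, variable) combination; drop = TRUE skips empty ones.
groups <- split(df, list(df$ID, df$variable), drop = TRUE)

# Slope of value against seqMonth within each group.
slopes <- sapply(groups, function(g) {
  coef(lm(value ~ seqMonth, data = g))[["seqMonth"]]
})
slopes
```

The result is a named numeric vector with one slope per combination, with names of the form ID.variable.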

2006 Jan 11

updating formula inside function

Dear R-Helpers Given a function like foo <- function(data,var1,var2,var3) { f <- formula(paste(var1,'~',paste(var2,var3,sep='+'),sep='')) linmod <- lm(f) return(linmod) } By typing foo(mydata,'a','b','c') I get the result of the linear model a~b+c. How can I rewrite the function so that the formula can be updated inside the function,
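The question is truncated, so what exactly should be updated is not known. As one possible reading, the sketch below builds the formula with reformulate(), passes the data explicitly to lm(), and optionally modifies the formula with update(). The extra argument and the column d are hypothetical additions for illustration.

```r
foo <- function(data, var1, var2, var3, extra = NULL) {
  # Build var1 ~ var2 + var3 from character strings.
  f <- reformulate(c(var2, var3), response = var1)
  # Optionally update the formula inside the function (hypothetical argument).
  if (!is.null(extra)) f <- update(f, extra)
  lm(f, data = data)
}

# Made-up data for illustration.
set.seed(1)
mydata <- data.frame(a = rnorm(20), b = rnorm(20), c = rnorm(20), d = rnorm(20))

fit1 <- foo(mydata, "a", "b", "c")                      # a ~ b + c
fit2 <- foo(mydata, "a", "b", "c", extra = . ~ . + d)   # a ~ b + c + d
```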

2007 Aug 01

Problem to remove loops in a routine

Dear R-users, I have written the following code to generate some trellis plots. It works perfectly fine except that it is quite slow when it is applied to my typical datasets (over several thousands of lines). I believe the problem comes from the loops I am using to subset my data.frame. I read in the archives that the tapply function is often more efficient than a loop in R. Unfortunately,

2010 Jun 18

How to calculate the robust standard error of the dependent variable

Hi, folks linmod=y~x+z summary(linmod) The summary of linmod shows the standard error of the coefficients. How can we get the sd of y and the robust standard errors in R? Thanks!

2010 Jun 21

How to predict the mean and variance of the dependent variable after regression

Hi, folks, As seen in the following code: x1=rlnorm(10) x2=rlnorm(10,mean=2) y=rlnorm(10,mean=10)### Fake dataset linmod=lm(log(y)~log(x1)+log(x2)) After the regression, I would like to know the mean of y. Since log(y) is normal and y is lognormal, I need to know the mean and variance of log(y) first. I tried mean(y) and mean(linmod), but neither one is what I want. Any tips? Thanks in

2011 Aug 17

How to apply a function to subsets of a data frame *and* obtain a data frame again?

Dear all, First, let's create some data to play around: set.seed(1) (df <- data.frame(Group=rep(c("Group1","Group2","Group3"), each=10), Value=c(rexp(10, 1), rexp(10, 4), rexp(10, 10)))[sample(1:30,30),]) ## Now we need the empirical distribution function: edf <- function(x) ecdf(x)(x) # empirical distribution function evaluated at x ##

2003 Mar 25

Help with data.frame subsets

Hello all, I'm trying to get a subset of a data frame by taking all rows where the 2nd column is >= Min and <= Max. I can do that by a 2 step process similar to the following: subData <- dataFrame[dataFrame[,2] >= Min,] subData2 <- subData[subData[,2] <= Max,] Then I try to graph the results where col 2 is the X var and col 3 is the Y var. Therefore I do the following: X
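The two-step subset above can be collapsed into a single step by combining both conditions with the vectorized & operator. A minimal sketch, assuming a made-up dataFrame and made-up Min and Max values:

```r
# Made-up data; column 2 is the X variable, column 3 the Y variable.
dataFrame <- data.frame(id = 1:6,
                        x  = c(1, 4, 7, 10, 13, 16),
                        y  = c(2, 3, 5, 8, 13, 21))
Min <- 4
Max <- 13

# Keep rows whose 2nd column lies in [Min, Max], in one step.
subData <- dataFrame[dataFrame[, 2] >= Min & dataFrame[, 2] <= Max, ]
subData
```

The X and Y vectors for plotting can then be taken as subData[, 2] and subData[, 3].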

2011 Oct 31

googleVis motionchart - slow with Date class

Hi, I am trying to create a googleVis motion chart with monthly data. When formatting the date column as a Date class variable, the plot as presented in the browser becomes considerably slower and very prone to crashing the browser. To illustrate this issue I have modified the WorldBank demo. ### objects from demo("WorldBank", package = "googleVis") M <-

2007 Oct 01

mean of subset of rows

Dear list, this must be an easy one: I have a data.frame of two columns, "ID" with four different levels (A to D) and numerical "size", and each of the 4 different IDs is repeated a different number of times. I would like to get the mean size for each ID as another data.frame. I have tried the following: >ID= as.character(unique(data[,1])) # I use unique() because
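For a per-ID mean returned as a data frame, aggregate() with a formula is one option that avoids looping over unique IDs. The sketch below assumes a small invented data set with the two columns named in the question.

```r
# Invented data: four IDs, each repeated a different number of times.
data <- data.frame(ID   = c("A", "A", "B", "C", "C", "C", "D"),
                   size = c(1, 3, 5, 2, 4, 6, 10))

# Mean size per ID, returned as a data frame with columns ID and size.
means <- aggregate(size ~ ID, data = data, FUN = mean)
means
```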

2008 Mar 10

Weighting data when running regressions

Dear R-Help, I'm new to R and struggling with weighting data when I run regression. I've tried to use search to solve my problem but haven't found anything helpful so far. I (successfully) import data from SPSS (15) and try to run a linear regression on a subset of my data file where WEIGHT is the name of my weighting variable (numeric), e.g.: library(foreign)

2008 Jun 19

Advanced Filtering problem

http://www.nabble.com/file/p18018170/subdata.csv subdata.csv I've attached 100 rows of a data frame I am working with. I have one factor, id, with 27 levels. There are two columns of reference data, x and y (UTM coordinates), one column "date" in POSIXct format, and one column "diff" in times format (chron package). What I am trying to do is as follows: For each day

2007 Jun 21

Overlaying lattice graphs (continued)

Dear R Users, I recently posted an email on this list about the use of data.frame and overlaying multiple plots. Deepayan kindly indicated to me the panel.superposition command which worked perfectly in the context of the example I gave. I'd like to go a little bit further on this topic using a more complex dataset structure (actually the one I want to work on). >mydata Plot

2011 Mar 27

Sweave: include a multi-page-pdf plot

Hi, I'm just starting out with Sweave, and I can't get a plot(linmod) to display all four plots: << bild >>= x1 <- runif(100) x2 <- rexp(100) y <- 3 + 4*x1 + 5*x2 + rnorm(100) mod <- lm(y~x1+x2) plot(mod) @ Some Text <<fig=TRUE>>= <<bild>> @ This plots only the first image of the four-page plot.lm() result. I don't want to use

2012 Sep 24

stop on rows where !is.na(mydata$ti_all)

Dear R experts, I got help to build a loop but there is a bug inside it that causes one part of the mechanism to fail. It should grow once, but it keeps growing on rows where $ti_all is not NA. Here is a wall of code that very crudely demonstrates the problem; there are a couple of dim() outputs at the end where you can see how the second time around it keeps adding (2) rows, but this does not
