thr3ads.net - similar to: "stats tests on large tables"

Repeated analysis over groups / Splitting by group variable

2010 Jul 15

I am performing some analysis over a large data frame and would like to conduct repeated analysis over grouped-up subsets. How can I do that? Here is some example code for clarification: require("flexmix") # for Kullback-Leibler divergence n <- 23 groups <- c(1,2,3) mydata <- data.frame( sequence=c(1:n), data1=c(rnorm(n)), data2=c(rnorm(n)), group=rep(sample(groups, n,

glm(family=binomial) and lmer

2007 Aug 14

Dear R users, I've noticed that there are two ways to conduct a binomial GLM with binomial counts using R. The first way is outlined by Michael Crawley in his "Statistical Computing book" (p 520-521): >dose=c(1,3,10,30,100) >dead = c(2,10,40,96,98) >batch=c(100,90,98,100,100) >response = cbind(dead,batch-dead) >model1=glm(y~log(dose),binomial)

Multivariate Power Test?

2013 Mar 07

Generic question... I am familiar with generic power calculations in R, however a lot of the data I primarily work with is multivariate. Is there any package/function that you would recommend to conduct such power analysis? Any recommendations would be appreciated. Thank you for your time, Charles

how do I "relate" tables in R?

2006 Feb 11

Hi all, I'm new to the list...pretty new at learning to code in R... Is there a way to relate 2 different arrays in R? Hypothetical example: data1 ID z 1 100 2 250 3 75 4 12 5 89 data2 ID z 1 1 1 1 2 3 4 3 4 5 5 5 etc. Goal is to fill column z in data2 with appropriate z-values from data1 that correspond to a given ID. I'm looking for something akin to a
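One common way to do this kind of lookup in base R is `match()`, sketched below. The `data1` values are taken from the post; the `data2` IDs are only one reading of the flattened snippet and should be treated as an assumption.

```r
# Lookup table from the post: one z-value per ID
data1 <- data.frame(ID = 1:5, z = c(100, 250, 75, 12, 89))

# Target table with repeated IDs (assumed reading of the truncated example)
data2 <- data.frame(ID = c(1, 1, 1, 1, 2, 3, 4, 3, 4, 5, 5, 5))

# match() returns, for each data2$ID, the row position of that ID in data1;
# indexing data1$z with those positions fills in the matching z-values
data2$z <- data1$z[match(data2$ID, data1$ID)]

print(data2)
```

`merge(data2, data1, by = "ID")` is an alternative that gives a similar result, though it may reorder the rows, so `match()` is the safer choice when the original row order of `data2` matters.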

ADF test

2007 Aug 16

Hi all, Hope you people do not feel irritated for repeatedly sending mail on Time series. Here I got another problem on the same, and hope I would get some answer from you. I have following dataset: data[,1] [1] 4.96 4.95 4.96 4.96 4.97 4.97 4.97 4.97 4.97 4.98 4.98 4.98 4.98 4.98 4.99 4.99 5.00 5.01 [19] 5.01 5.00 5.01 5.01 5.01 5.01 5.02 5.01 5.02 5.02 5.03 5.03 5.03

interactions and GAM

2007 Feb 27

Dear R-users, I have 1 remark and 1 question on the inclusion of interactions in the gam function from the gam package. I need to fit quantitative predictors in interactions with factors. You can see an example of what I need in fig 9.13 p265 from Hastie and Tibshirani book (1990). It's clearly stated that in ?gam "Interactions with nonparametric smooth terms are not fully

reading tables into R

2005 Jun 03

Hi, The file I am reading is a text file, whose contents are a matrix that has 15 rows and 58 columns. The first row has column names, and the first column has row names, so the format is correct as far as using read.table is concerned. The other values in the table are all float values (numeric). So when I read in the file using data1 <- read.table("HAL001_HAL0015_Signals.txt"), it

How to apply five lines of code to ten dataframes?

2009 Dec 07

Hello R-helpers, I have 10 dataframes (named data1, data2, ... data10) and I would like to add 5 new columns to each dataframe using the following code: data1$LogDepth<-log10(data1[,2]/data1[,4]) data1$LogArea<-log10(data1[,3]/data1[,5]) data1$p<-2*data1[,6]/data1[,7] data1$Exp<-data1[,2]^(2/data1[,8]) data1$s<-data1[,3]/data1[,9] ...but I would prefer not to repeat this chunk of
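A usual approach here is to wrap the five lines in a function and apply it to a list of data frames. The sketch below assumes that approach; `add_columns`, `make_frame`, and the random stand-in data are hypothetical names and values, not part of the post.

```r
# Hypothetical stand-in data: two small frames with 9 positive numeric columns
set.seed(1)
make_frame <- function() as.data.frame(matrix(runif(27, 1, 10), nrow = 3))
frames <- list(data1 = make_frame(), data2 = make_frame())

# With real objects named data1 ... data10 in the workspace, the list could
# instead be built with: frames <- mget(paste0("data", 1:10))

# The five lines from the post, written once against a generic data frame d
add_columns <- function(d) {
  d$LogDepth <- log10(d[, 2] / d[, 4])
  d$LogArea  <- log10(d[, 3] / d[, 5])
  d$p        <- 2 * d[, 6] / d[, 7]
  d$Exp      <- d[, 2]^(2 / d[, 8])
  d$s        <- d[, 3] / d[, 9]
  d  # return the modified copy
}

# Apply the same transformation to every data frame in the list
frames <- lapply(frames, add_columns)
```

Keeping the data frames in a list afterwards (rather than copying them back to separate variables) tends to make later steps easier to repeat as well.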

Removing Outliers Function

2011 Feb 09

I am working on a function that will remove outliers for regression analysis. I am stating that a data point is an outlier if its studentized residual is above 3 or below -3. The code below is what I have thus far for the function: x = c(1:20) y = c(1,3,4,2,5,6,18,8,10,8,11,13,14,14,15,85,17,19,19,20) data1 = data.frame(x,y) rm.outliers =

cleanse columns and unwanted rows

2009 Nov 13

Hello folks, I'm trying to clean out a large file with data I don't need. The column I'm manipulating in the file is called "legal_status". There are three kinds of rows I want to remove: those that have "Private", "Private (Op", or "Unknown" in the legal_status column. I wrote this code but I get errors and it says I'm missing a TRUE/FALSE thingy...im
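The poster's code is cut off, so the cause of the error cannot be confirmed. As a minimal sketch of one way to drop such rows, `%in%` can be used for the filter, since it returns FALSE rather than NA for missing values. The data frame `dat` below is hypothetical.

```r
# Hypothetical data standing in for the poster's file
dat <- data.frame(
  id = 1:5,
  legal_status = c("Public", "Private", "Private (Op", "Unknown", NA),
  stringsAsFactors = FALSE
)

# The three values named in the post
unwanted <- c("Private", "Private (Op", "Unknown")

# %in% yields FALSE for NA, so rows with a missing status are kept
cleaned <- dat[!(dat$legal_status %in% unwanted), ]

print(cleaned)
```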

Errno::ENETUNREACH (Network is unreachable - connect(2)):

2010 Jan 05

I am trying to run my first app on the Solaris server, in a production environment. I get a network unreachable error. Why? Is it related to the database.yml config? Processing CategoriesController#index (for 10.3.70.129 at 2010-01-05 14:00:47) [GET] Errno::ENETUNREACH (Network is unreachable - connect(2)): /usr/ruby-enterprise/lib/ruby/1.8/net/http.rb:560:in `initialize''

combining columns for data.frames

2010 Sep 06

Hi This question is far less simple than the title suggests, please read carefully, thanks. I have 2 sets of data, both read into R >data1<-read.table ("1.txt", header=T, sep="\t") >data2<-read.table ("2.txt", header=T, sep="\t") >data1 Taxon stage1 stage2 stage3 stage4 T1 0 0 1 1 T2 0

Passing options as lists

2002 Dec 05

Hi, I apologize if this has previously been posted. I've just subscribed to the R-help digest. I'm writing a plotting function that uses layout() to plot several different plots on the same device. This function uses plot(), image(), and a custom function that uses text(). Each cell of the layout needs different par() parameters, so what I'd like to do is pass them as lists:

problem on merge data

2009 Nov 06

Hi there, data1<-matrix(data=c(1,1.2,1.3,"3/23/2004",1,1.5,2.3,"3/22/2004",2,0.2,3.3,"4/23/2004",3,1.5,1.3,"5/22/2004"),nrow=4,ncol=4,byrow=TRUE) data1<-data.frame(data1) names(data1)<-c("areaid","x","y","date") data1 areaid x y date 1 1 1.2 1.3 3/23/2004 2 1 1.5 2.3 3/22/2004 3 2

Assessing calibration of Cox model with time-dependent coefficients

2018 Jan 17

I am trying to find methods for testing and visualizing calibration of Cox models with time-dependent coefficients. I have read this nice article <http://journals.sagepub.com/doi/10.1177/0962280213497434>. In this paper, we can fit three models: fit0 <- coxph(Surv(futime, status) ~ x1 + x2 + x3, data = data0) p <- log(predict(fit0, newdata = data1, type = "expected")) lp

SAPPLY function XXXX

2011 May 04

Hello everyone, I am attempting to write a function to count the number of non-missing values of each column in a data frame using the sapply function. I have the following code which is receiving the error message below. > n.valid<-sapply(data1,sum(!is.na)) Error in !is.na : invalid argument type Ultimately, I would like for this to be 1 component in a larger function that will produce
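The error arises because `sum(!is.na)` tries to negate the function `is.na` itself instead of calling it on each column. A minimal sketch of a fix is to pass `sapply` an anonymous function; the small `data1` below is a hypothetical stand-in for the poster's data.

```r
# Hypothetical data frame with some missing values
data1 <- data.frame(
  a = c(1, NA, 3),
  b = c(NA, NA, "x"),
  c = 1:3
)

# sapply needs a function; here each column x is passed to is.na(),
# and the non-missing entries are counted
n.valid <- sapply(data1, function(x) sum(!is.na(x)))

# Equivalent vectorised form: is.na() on a data frame gives a logical matrix
n.valid2 <- colSums(!is.na(data1))

print(n.valid)
```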

subset in dataframes

2011 Oct 02

I need help in subsetting a dataframe: data1<-data.frame(year=c(2001,2002,2003,2004,2001,2002,2003,2004, 2001,2002,2003,2004,2001,2002,2003,2004), firm=c(1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4),x=c(11,22,-32,25,-26,47,85,98, 101,14,87,56,12,43,67,54), y=c(110,220,302,250,260,470,850,980,1010,140,870,560,120,430,670,540)) data1 I want to keep the firms where all x>0 (where there are
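Taking the stated goal (keep only firms whose x values are all positive) at face value, one possible base R sketch uses `tapply()` to test each firm and then filters on the result. The data is the poster's; the names `ok`, `good_firms`, and `result` are hypothetical.

```r
# Data frame from the post
data1 <- data.frame(
  year = rep(2001:2004, times = 4),
  firm = rep(1:4, each = 4),
  x = c(11, 22, -32, 25, -26, 47, 85, 98, 101, 14, 87, 56, 12, 43, 67, 54),
  y = c(110, 220, 302, 250, 260, 470, 850, 980,
        1010, 140, 870, 560, 120, 430, 670, 540)
)

# For each firm, check whether every x value is positive
ok <- tapply(data1$x > 0, data1$firm, all)

# Firm labels that pass the check (tapply names are character strings)
good_firms <- names(ok)[ok]

# Keep only the rows belonging to those firms
result <- data1[data1$firm %in% good_firms, ]

print(result)
```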

Automatization of non-linear regression

2009 Oct 22

Hi everybody, I'm using the method described here to make a linear regression: http://www.apsnet.org/education/advancedplantpath/topics/Rmodules/Doc1/05_Nonlinear_regression.html > ## Input the data that include the variables time, plant ID, and severity > time <- c(seq(0,10),seq(0,10),seq(0,10)) > plant <- c(rep(1,11),rep(2,11),rep(3,11)) > > ## Severity

how to convert character variables into numeric variables directly

2010 Mar 08

Here is the example. > age=18:29 > height=c(76.1,77,78.1,78.2,78.8,79.7,79.9,81.1,81.2,81.8,82.8,83.5) > type=c("A", "B", "C", "D","A", "B", "C", "D","A", "B", "C", "D") >

lm and R-squared (newbie)

2011 Dec 15

Hello, I've two data.frames (data1 and data4), dec="." and sep=";". http://r.789695.n4.nabble.com/file/n4199964/data1.txt data1.txt http://r.789695.n4.nabble.com/file/n4199964/data4.txt data4.txt When I do plot(data1$nx,data1$ny, col="red") points(data4$nx,data4$ny, col="blue") , results seem very similar (at least to me) but the R-squared of
