similar to: using a noisy variable in regression (not an R question)

Displaying 20 results from an estimated 6000 matches similar to: "using a noisy variable in regression (not an R question)"

2012 Mar 21
2
glmnet: obtain predictions using predict and also by extracting coefficients
All, For my understanding, I wanted to see if I can get glmnet predictions using both the predict function and also by multiplying coefficients by the variable matrix. This has not worked out. Could anyone suggest where I am going wrong? I understand that I may not have the mean/intercept correct, but the scaling is also off, which suggests a bigger mistake. Thanks for your help. Juliet Hannah
2008 Aug 23
3
graphs for pretest data
Is there an easy way to make graphs for the following data? I have pretest and posttest scores for men and women. I would like to form a 'tilted segment' plot for the data. That is, make segments joining the scores, with different types of segments for men and women. Example data:
    menpre <- c(43,42,26,39,60,60,46)
    menpost <- c(40,41,36,42,54,58,43)
    womenpre <-
2008 Jul 09
3
randomly select duplicated entries
Using this data as an example
    dat <- read.table(textConnection("Id myvar
    12 1
    12 2
    12 6
    34 9
    34 4
    34 8
    65 15
    65 23"), header = TRUE)
    closeAllConnections()
how can I create another data set that does not have duplicate entries for 'Id', but the included values are randomly selected from the available ones. Thanks! Juliet
2009 Mar 02
3
ways to put multiple graphs on single page (using ggplot2)
Hi, Here are three plots:
    library(ggplot2)
    data(diamonds)
    randind <- sample(nrow(diamonds),1000,replace=FALSE)
    dsmall <- diamonds[randind,]
    qplot(carat, data=dsmall, geom="histogram",binwidth=1)
    qplot(carat, data=dsmall, geom="histogram",binwidth=.1)
    qplot(carat, data=dsmall, geom="histogram",binwidth=.01)
What are ways to put these three plots on a single
2010 Jul 15
2
replace negative numbers by smallest positive value in matrix
Hi Group, I have a matrix, and I would like to replace numbers less than 0 by the smallest positive number. Below is a small matrix, and the loop I used. I would like to get suggestions on the "R way" to do this. Thanks, Juliet
    # example data set
    mymat <- structure(c(-0.503183609420937, 0.179063475173256, 0.130473004669938, -1.80825226960127, -0.794910626384209, 1.03857280868547,
2009 Apr 20
3
what is R best for; what should one learn in addition to R
Hi, I've been working with R for a couple of years, and I've been able to get most of the things done that I needed (sometimes in a roundabout way). A few experienced statisticians told me that R is best for interactive data analysis, but for large-scale computations, one needs something else. I understand that this all depends on what you are trying to accomplish, and R offers many ways
2009 Feb 08
2
how to make this qq plot in lattice and/or ggplot2
Hi Group, Here is some data.
    p <- runif(1000) # sample data
    groups <- rep(c(1,2),each=500) #conditioning variable
    mydata <- cbind(p,groups)
    n <- length(p)
    u <- (1:n)/(n + 1) # uniform distribution reference for qqplot
    logp <- -log(p,base=10)
    logu <- -log(u,base=10)
    qqplot(logp,logu)
How can I make the above qqplot in lattice and/or ggplot2. The sample is uniform, and I take
2008 Nov 19
2
ggplot2; dot plot, jitter, and error bars
With this data
    x <- c(0,0,1,1,2,2)
    y <- c(5,6,4,3,2,6)
    lwr <- y-1
    upr <- y+1
    xlab <- c("Low","Low","Med","Med","High","High")
    mydata <- data.frame(x,xlab,y,lwr,upr)
I would like to make a dot plot and use lwr and upr as error bars. Above 0=Low. I would like there to be some space between the 5 and the 6 corresponding
2011 Aug 11
3
improve formatting of HTML table
I am trying to improve the look of an HTML table for a report (that needs to be pasted into Word). Here is an example. table2 <- structure(c(26L, 0L, 40L, 0L, 10L, 0L, 0L, 188L, 0L, 281L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 4L), .Dim = c(6L, 3L), .Dimnames = structure(list( myvar = c("Don't know", "Somewhat likely", "Somewhat unlikely", "Very
2008 Sep 22
2
adding layers in ggplot2 (data and code included)
Here is some sample data:
    mydata <- read.table(textConnection("Est Group Tri
    0 0 4.639644
    1 0 4.579189
    2 0 4.590714
    0 1 4.443696
    1 1 4.588243
    2 1 4.650505
    0 2 4.296608
    1 2 4.826036
    2 2 4.765386"),header=TRUE);
    closeAllConnections();
I can form two plots,
2009 Jan 24
2
how to prevent duplications of data within a loop
Hi All, I had posted a question on a similar topic, but I think it was not focused. I am posting a modification that I think better accomplishes this. I hope this is ok, and I apologize if it is not. :) I am looping through variables and running several regressions. I have reason to believe that the data is being duplicated because I have been monitoring the memory use on unix. How can I avoid
2010 Jan 30
2
convert data frame of values into correlation matrix
Hi Group, Consider a data frame like this:
    mylabel1 <- rep(c("A","B","C"),each=3)
    mylabel2 <- rep(c("A","B","C"),3)
    corrs <- c(1,.8,.7,.8,1,.7,.7,.7,1)
    myData <- data.frame(mylabel1,mylabel2,corrs)
    myData
      mylabel1 mylabel2 corrs
    1        A        A   1.0
    2        A        B   0.8
    3        A        C   0.7
    4        B
2011 Jun 30
2
error building package: packaging into .tar.gz failed
I am trying to build a package using windows xp. Here is the error I am getting:
    R CMD build myfunctions
    * checking for file 'myfunctions/DESCRIPTION' ... OK
    * preparing 'myfunctions':
    * checking DESCRIPTION meta-information ... OK
    * checking for LF line-endings in source and make files
    * checking for empty or unneeded directories
    * building 'myfunctions_1.0.tar.gz'
2008 Sep 15
1
modifying this barplot
Here is an example barplot that needs some tweaking:
    library(gplots)
    ratios <- data.frame(c(0.05,0.10,0.9),c(0.06,0.15,0.76))
    rownames(ratios) <- c("T1","T2","T3")
    colnames(ratios) <- c("A1","A2")
    ratios <- as.matrix(ratios)
    myplot <- barplot2(ratios, beside = TRUE,col = c("blue",
2008 Sep 19
1
reproduce this graph in ggplot2 (code and data included)
How can I reproduce this graph in ggplot2 (regression lines and data points superimposed)? Thanks, Juliet
    filename="http://personality-project.org/r/datasets/heating.txt"
    heating=read.table(filename,header=TRUE)
    symb=c(19,25,3,23)
    colors=c("black","red","green","blue")
2010 Sep 07
1
average columns of data frame corresponding to replicates
Hi Group, I have a data frame below. Within this data frame there are samples (columns) that are measured more than once. Samples are indicated by "idx". So "id1" is present in columns 1, 3, and 5. Not every id is repeated. I would like to create a new data frame so that the repeated ids are averaged. For example, in the new data frame, columns 1, 3, and 5 of the original
2010 Aug 10
1
partial match of one column in data frame to another character vector
Here is some data (dput output below)
    > myData
             id group
    1      D599     A
    2  002-0004     B
    3    F01932     A
    18      F16     B
    19
2011 Aug 24
2
data manipulation and summaries with few million rows
I have a data set with about 6 million rows and 50 columns. It is a mixture of dates, factors, and numerics. What I am trying to accomplish can be seen with the following simplified data, which is given as dput output below.
    > head(myData)
          mydate gender mygroup id
    1 2012-03-25      F       A  1
    2 2005-05-23      F       B  2
    3 2005-09-08      F       B  2
    4 2005-12-07      F       B  2
2011 Nov 29
2
aggregate syntax for grouped column means
I am calculating the mean of each column grouped by the variable 'id'. I do this using aggregate, data.table, and plyr. My aggregate results do not match the other two, and I am trying to figure out what is incorrect with my syntax. Any suggestions? Thanks. Here is the data. myData <- structure(list(var1 = c(31.59, 32.21, 31.78, 31.34, 31.61, 31.61, 30.59, 30.84, 30.98, 30.79, 30.79,
2009 Jun 14
1
learning about panel functions in lattice
Hi All, I am trying to understand panel functions. Let's use this example.
    library(lattice)
    time<-c(rep(1:10,5))
    y <-time+rnorm(50,5,2)
    group<-c(rep('A',30),rep('B',20))
    subject<-c(rep('a',10),rep('b',10),rep('c',10),rep('d',10),rep('e',10))
    myData <-data.frame(subject,group,time,y)
    head(myData)
Plot 1
    xyplot(y ~ time