Displaying 20 results from an estimated 6000 matches similar to: "using a noisy variable in regression (not an R question)"
2012 Mar 21
2
glmnet: obtain predictions using predict and also by extracting coefficients
All,
For my understanding, I wanted to see if I can get glmnet predictions
using both the predict function and also by multiplying coefficients
by the variable matrix. This has not worked out. Could anyone suggest
where I am going wrong?
I understand that I may not have the mean/intercept correct, but the
scaling is also off, which suggests a bigger mistake.
Thanks for your help.
Juliet Hannah
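
The poster's code is not included in this excerpt, so the following is only a minimal sketch of the comparison being described, using simulated data (the names x, y, fit, and s are illustrative assumptions, not the poster's). glmnet standardizes predictors internally by default but reports coefficients on the original scale, so multiplying the raw matrix, with a leading column of ones for the intercept, by coef() at the same lambda should agree with predict():

```r
library(glmnet)

# Simulated data (hypothetical; stands in for the poster's variable matrix)
set.seed(1)
x <- matrix(rnorm(100 * 5), nrow = 100, ncol = 5)
y <- as.numeric(x %*% c(2, -1, 0, 0, 1) + rnorm(100))

fit <- glmnet(x, y)

# Pick one lambda that lies on the fitted path
s <- fit$lambda[ceiling(length(fit$lambda) / 2)]

# Route 1: predict()
pred_predict <- as.numeric(predict(fit, newx = x, s = s))

# Route 2: intercept and slopes from coef(), applied to the unscaled matrix
beta <- as.numeric(as.matrix(coef(fit, s = s)))
pred_manual <- as.numeric(cbind(1, x) %*% beta)

isTRUE(all.equal(pred_predict, pred_manual))
```

If the two routes disagree, likely places to look are a missing intercept column, coefficients taken at a different lambda than the one passed to predict(), or a matrix that was rescaled by hand before multiplying.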
2008 Aug 23
3
graphs for pretest data
Is there an easy way to make graphs for the following data? I have
pretest and posttest scores for men and
women. I would like to form a 'tilted segment' plot for the data.
That is, make segments joining the scores,
with different types of segments for men and women.
Example data:
menpre <- c(43,42,26,39,60,60,46)
menpost <- c(40,41,36,42,54,58,43)
womenpre <-
2008 Jul 09
3
randomly select duplicated entries
Using this data as an example
dat <- read.table(textConnection("Id myvar
12 1
12 2
12 6
34 9
34 4
34 8
65 15
65 23"), header = TRUE)
closeAllConnections()
How can I create another data set that does not have duplicate entries
for 'Id', but in which the included values
are randomly selected from the available ones?
Thanks!
Juliet
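
One possible base-R approach, offered as a sketch rather than the answer given in the thread: shuffle the rows, then keep the first row seen for each Id. The names shuffled and dedup are illustrative, and set.seed() is only there to make the example reproducible.

```r
dat <- read.table(text = "Id myvar
12 1
12 2
12 6
34 9
34 4
34 8
65 15
65 23", header = TRUE)

set.seed(42)

# Put the rows in random order, so that "first row per Id" is a random pick
shuffled <- dat[sample(nrow(dat)), ]

# Keep only the first occurrence of each Id in the shuffled order
dedup <- shuffled[!duplicated(shuffled$Id), ]

# Optional: restore ordering by Id
dedup <- dedup[order(dedup$Id), ]
dedup
```

Shuffling whole rows avoids the pitfall of calling sample() on a length-one vector, which would sample from 1:x instead of returning x.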
2009 Mar 02
3
ways to put multiple graphs on single page (using ggplot2)
Hi, Here are three plots:
library(ggplot2)
data(diamonds)
randind <- sample(nrow(diamonds),1000,replace=FALSE)
dsmall <- diamonds[randind,]
qplot(carat, data=dsmall, geom="histogram",binwidth=1)
qplot(carat, data=dsmall, geom="histogram",binwidth=.1)
qplot(carat, data=dsmall, geom="histogram",binwidth=.01)
What are ways to put these three plots on a single
2010 Jul 15
2
replace negative numbers by smallest positive value in matrix
Hi Group,
I have a matrix, and I would like to replace numbers less than 0 by
the smallest positive number in the matrix. Below is a
small matrix, and the loop I used. I would like to get suggestions on
the "R way" to do this.
Thanks,
Juliet
# example data set
mymat <- structure(c(-0.503183609420937, 0.179063475173256, 0.130473004669938,
-1.80825226960127, -0.794910626384209, 1.03857280868547,
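
The example matrix is cut off in this excerpt, so the sketch below uses a small made-up matrix (toy is a hypothetical name and its values are not the poster's). It shows one vectorized way to do the replacement with logical indexing instead of a loop:

```r
# Hypothetical stand-in for the truncated example matrix
toy <- matrix(c(-0.5, 0.2, 0.1, -1.8, -0.8, 1.0), nrow = 2)

# Smallest strictly positive entry
smallest_pos <- min(toy[toy > 0])

# Overwrite every negative entry in one step
toy[toy < 0] <- smallest_pos
toy
```

This assumes the matrix has at least one positive entry and no NA values; otherwise min() would need extra handling.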
2009 Apr 20
3
what is R best for; what should one learn in addition to R
Hi,
I've been working with R for a couple of years, and I've
been able to get most of the things done that I needed (sometimes in
a roundabout way). A few experienced statisticians told me that
R is best for interactive data analysis, but for large-scale
computations, one needs something else.
I understand that this all depends on what you are trying to
accomplish, and R offers many ways
2009 Feb 08
2
how to make this qq plot in lattice and/or ggplot2
Hi Group,
Here is some data.
p <- runif(1000) # sample data
groups <- rep(c(1,2),each=500) #conditioning variable
mydata <- cbind(p,groups)
n <- length(p)
u <- (1:n)/(n + 1) # uniform distribution reference for qqplot
logp <- -log(p,base=10)
logu <- -log(u,base=10)
qqplot(logp,logu)
How can I make the above qqplot in lattice and/or ggplot2? The sample
is uniform, and I take
2008 Nov 19
2
ggplot2; dot plot, jitter, and error bars
With this data
x <- c(0,0,1,1,2,2)
y <- c(5,6,4,3,2,6)
lwr <- y-1
upr <- y+1
xlab <- c("Low","Low","Med","Med","High","High")
mydata <- data.frame(x,xlab,y,lwr,upr)
I would like to make a dot plot and use lwr and upr as error bars.
Above 0=Low. I would like there to be
some space between the 5 and the 6 corresponding
2011 Aug 11
3
improve formatting of HTML table
I am trying to improve the look of an HTML table for a report (that
needs to be pasted into Word).
Here is an example.
table2 <- structure(c(26L, 0L, 40L, 0L, 10L, 0L, 0L, 188L, 0L, 281L, 0L,
0L, 0L, 0L, 0L, 0L, 0L, 4L), .Dim = c(6L, 3L), .Dimnames = structure(list(
myvar = c("Don't know", "Somewhat likely", "Somewhat unlikely",
"Very
2008 Sep 22
2
adding layers in ggplot2 (data and code included)
Here is some sample data:
mydata <- read.table(textConnection("Est Group Tri
0 0 4.639644
1 0 4.579189
2 0 4.590714
0 1 4.443696
1 1 4.588243
2 1 4.650505
0 2 4.296608
1 2 4.826036
2 2 4.765386"),header=TRUE);
closeAllConnections();
I can form two plots,
2009 Jan 24
2
how to prevent duplications of data within a loop
Hi All,
I had posted a question on a similar topic, but I think it was not
focused. I am posting a modification that I think better accomplishes
this.
I hope this is ok, and I apologize if it is not. :)
I am looping through variables and running several regressions. I have
reason to believe that the data is being duplicated because I have
been monitoring the memory use on unix.
How can I avoid
2010 Jan 30
2
convert data frame of values into correlation matrix
Hi Group,
Consider a data frame like this:
mylabel1 <- rep(c("A","B","C"),each=3)
mylabel2 <- rep(c("A","B","C"),3)
corrs <- c(1,.8,.7,.8,1,.7,.7,.7,1)
myData <- data.frame(mylabel1,mylabel2,corrs)
myData
mylabel1 mylabel2 corrs
1 A A 1.0
2 A B 0.8
3 A C 0.7
4 B
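
A minimal sketch of one way to reshape such a long-format data frame into a square matrix, using the data defined above (the names labs and cmat are illustrative). It fills an empty matrix by indexing with a two-column character matrix of row and column labels:

```r
mylabel1 <- rep(c("A", "B", "C"), each = 3)
mylabel2 <- rep(c("A", "B", "C"), 3)
corrs <- c(1, .8, .7, .8, 1, .7, .7, .7, 1)
myData <- data.frame(mylabel1, mylabel2, corrs)

# as.character() guards against the labels having been stored as factors
rows <- as.character(myData$mylabel1)
cols <- as.character(myData$mylabel2)
labs <- sort(unique(c(rows, cols)))

# Empty matrix with labelled rows and columns
cmat <- matrix(NA_real_, nrow = length(labs), ncol = length(labs),
               dimnames = list(labs, labs))

# Each (row label, column label) pair addresses one cell
cmat[cbind(rows, cols)] <- myData$corrs
cmat
```

Pairs that are absent from the data frame stay NA, which makes gaps easy to spot.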
2011 Jun 30
2
error building package: packaging into .tar.gz failed
I am trying to build a package using Windows XP. Here is the error I am getting:
R CMD build myfunctions
* checking for file 'myfunctions/DESCRIPTION' ... OK
* preparing 'myfunctions':
* checking DESCRIPTION meta-information ... OK
* checking for LF line-endings in source and make files
* checking for empty or unneeded directories
* building 'myfunctions_1.0.tar.gz'
2008 Sep 15
1
modifying this barplot
Here is an example barplot that needs some tweaking:
library(gplots)
ratios <- data.frame(c(0.05,0.10,0.9),c(0.06,0.15,0.76))
rownames(ratios) <- c("T1","T2","T3")
colnames(ratios) <- c("A1","A2")
ratios <- as.matrix(ratios)
myplot <- barplot2(ratios, beside = TRUE,col = c("blue",
2008 Sep 19
1
reproduce this graph in ggplot2 (code and data included)
How can I reproduce this graph in ggplot2 (regression lines and data
points superimposed)? Thanks, Juliet
filename="http://personality-project.org/r/datasets/heating.txt"
heating=read.table(filename,header=TRUE)
symb=c(19,25,3,23)
colors=c("black","red","green","blue")
2010 Sep 07
1
average columns of data frame corresponding to replicates
Hi Group,
I have a data frame below. Within this data frame there are samples
(columns) that are measured more than once. Samples are indicated by
"idx". So "id1" is present in columns 1, 3, and 5. Not every id is
repeated. I would like to create a new data frame so that the repeated
ids are averaged. For example, in the new data frame, columns 1, 3,
and 5 of the original
2010 Aug 10
1
partial match of one column in data frame to another character vector
Here is some data (dput output below)
> myData
id group
1 D599 A
2 002-0004 B
3 F01932 A
18 F16 B
19
2011 Aug 24
2
data manipulation and summaries with few million rows
I have a data set with about 6 million rows and 50 columns. It is a
mixture of dates, factors, and numerics.
What I am trying to accomplish can be seen with the following
simplified data, which is given as dput output below.
> head(myData)
mydate gender mygroup id
1 2012-03-25 F A 1
2 2005-05-23 F B 2
3 2005-09-08 F B 2
4 2005-12-07 F B 2
2011 Nov 29
2
aggregate syntax for grouped column means
I am calculating the mean of each column grouped by the variable 'id'.
I do this using aggregate, data.table, and plyr. My aggregate results
do not match the other two, and I am trying to figure out what is
incorrect with my syntax. Any suggestions? Thanks.
Here is the data.
myData <- structure(list(var1 = c(31.59, 32.21, 31.78, 31.34, 31.61, 31.61,
30.59, 30.84, 30.98, 30.79, 30.79,
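
The data and the aggregate call are truncated in this excerpt, so the cause of the mismatch cannot be determined from it. As a sketch of one common source of such differences (an assumption, not a diagnosis of the poster's code): the formula interface of aggregate() drops any row that has an NA in any column before computing means, whereas the by= interface passes each column to the function separately. The data frame toy below is made up for illustration.

```r
# Hypothetical data with one missing value in var2
toy <- data.frame(id   = c(1, 1, 2, 2),
                  var1 = c(1, 3, 5, 7),
                  var2 = c(10, NA, 30, 50))

# Formula interface: the row containing the NA is removed for ALL columns
by_formula <- aggregate(. ~ id, data = toy, FUN = mean)

# by= interface: each column is averaged on its own, NA removed per column
by_list <- aggregate(toy[c("var1", "var2")], by = list(id = toy$id),
                     FUN = mean, na.rm = TRUE)

by_formula
by_list
```

Here the mean of var1 for id 1 differs between the two calls, because the formula version discards the whole second row.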
2009 Jun 14
1
learning about panel functions in lattice
Hi All,
I am trying to understand panel functions. Let's use this example.
library(lattice)
time<-c(rep(1:10,5))
y <-time+rnorm(50,5,2)
group<-c(rep('A',30),rep('B',20))
subject<-c(rep('a',10),rep('b',10),rep('c',10),rep('d',10),rep('e',10))
myData <-data.frame(subject,group,time,y)
head(myData)
Plot 1
xyplot(y ~ time