Tal Galili
2010-Jun-15 10:57 UTC
[R] Graphics question: How to create a changing "smudge factor" for overlapping lines?
Hello all,
I am trying to create a Clustergram in R.
(More about it here: http://www.schonlau.net/clustergram.html)
And to produce a picture similar to what is seen here:
http://www.schonlau.net/images/clustergramexample.gif
I was able (more or less) to write the R code for creating the image, but
there is one thing I can't seem to figure out: the *changing* "smudge
factor" of the lines.
I want the overlapping lines to "jitter" a tiny bit so that they give a
sense of thickness to the line (according to how many observations are
present in that cluster).
My current solution is to use a constant jitter (based on "seq") for every
number of clusters k, but that causes glitches in the produced image (run
my code to see).
Here is a simple, self-contained, reproducible example that creates the
image I was able to make:
# ------------------------------------
set.seed(100)
Data <- rbind(matrix(rnorm(100, sd = 0.3), ncol = 2),
              matrix(rnorm(100, mean = 1, sd = 0.3), ncol = 2))
colnames(Data) <- c("x", "y")

# Constant "smudge": each observation gets the same offset at every k
# noise <- runif(100, 0, .05)
noise <- seq(0, .3, length.out = 100)

Y <- NULL
X <- NULL
k.range <- 2:10
for (k in k.range)
{
  cl <- kmeans(Data, k)
  # Mean of each cluster centre's coordinates, looked up per observation
  y <- apply(cl$centers, 1, mean)[cl$cluster] + noise
  Y <- cbind(Y, y)
  x <- rep(k, length(y))
  X <- cbind(X, x)
}

library(colorspace)
COL <- rainbow_hcl(100)
plot(0, 0, col = "white", xlim = c(1, 10), ylim = c(-.5, 1.6),
     xlab = "Number of clusters", ylab = "Cluster means",
     main = "(Basic) Clustergram")
axis(side = 1, at = k.range)
abline(v = k.range, col = "grey")
# One line per observation, tracing its cluster mean across k
matlines(t(X), t(Y), col = COL, lty = 1, lwd = 1.5)

# The next step would be to create a method for different cluster objects,
# but that's for another day...
#--------------------------------------------
Any suggestions on how to do this?
Thanks,
Tal
----------------Contact Details:----------------
Contact me: Tal.Galili@gmail.com | 972-52-7275845
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
www.r-statistics.com (English)
------------------------------------------------
Hadley Wickham
2010-Jun-15 12:45 UTC
[R] Graphics question: How to create a changing "smudge factor" for overlapping lines?
> My current solution is to use a constant jitter (based on "seq") on all the
> k number of clusters, but that causes glitches in the produced image (run my
> code to see).

What are the glitches? It looks pretty good to me. (I'm not sure if the
colour does anything apart from make it pretty, though.)

Hadley

--
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/
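
For context on the quoted point about a constant "seq"-based jitter, here is a
minimal sketch of one way a *changing* smudge factor could work. It is not
from the thread: the helper cluster_offsets() and its step argument are
hypothetical names introduced for illustration. The assumption is that each
cluster's members are spread evenly around the cluster mean, one step apart,
so that the band width grows with the number of observations in the cluster.

```r
# Hypothetical helper (not from the thread): give the members of each
# cluster evenly spaced offsets centred on zero, one `step` apart.
# A cluster with n members then spans (n - 1) * step on the y axis.
cluster_offsets <- function(cluster, step = 0.005) {
  offsets <- numeric(length(cluster))
  for (id in unique(cluster)) {
    idx <- which(cluster == id)
    n <- length(idx)
    offsets[idx] <- (seq_len(n) - (n + 1) / 2) * step
  }
  offsets
}

set.seed(100)
Data <- rbind(matrix(rnorm(100, sd = 0.3), ncol = 2),
              matrix(rnorm(100, mean = 1, sd = 0.3), ncol = 2))
k.range <- 2:10

# One row per observation, one column per k: the cluster mean of the
# observation's cluster plus its within-cluster offset.
Y <- sapply(k.range, function(k) {
  cl <- kmeans(Data, k)
  rowMeans(cl$centers)[cl$cluster] + cluster_offsets(cl$cluster)
})

plot(0, 0, type = "n", xlim = range(k.range), ylim = range(Y),
     xlab = "Number of clusters", ylab = "Cluster means",
     main = "Clustergram with size-dependent line spread")
abline(v = k.range, col = "grey")
# Each column of t(Y) is one observation's path across k.
matlines(k.range, t(Y), col = "black", lty = 1, lwd = 1)
```

In this sketch the offsets follow the order of the observations within each
cluster, so lines inside a band may still cross between neighbouring values
of k. The value of step is arbitrary and would need tuning to the y-axis scale.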
Paul Hiemstra
2010-Jun-16 07:29 UTC
[R] Graphics question: How to create a changing "smudge factor" for overlapping lines?
Hi Tal,
If you use ggplot2, you can use the alpha argument to make lines
transparent. The nice thing is that where lines overlap, their opacity
accumulates, so dense regions look darker. I use this a lot to visualize
outcomes from ensemble modelling (e.g. time series of RMSE).
A small example:
library(ggplot2)
dat = data.frame(x = rep(1:100, 100),
                 y = rep(1:100, 100),
                 grp = rep(sapply(1:100, function(x) sprintf("line%s", x)),
                           each = 100))
dat$y = dat$y + rnorm(length(dat$y), 3, 3)
# Without alpha
ggplot(aes(x = x, y = y, group = grp), data = dat) + geom_line()
# With alpha
ggplot(aes(x = x, y = y, group = grp), data = dat) +
  geom_line(alpha = 0.04, size = 2)
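
The same idea can probably be carried over to the base-graphics clustergram
from the original post. The following is only a sketch under assumptions: it
drops the jitter entirely and relies on a semi-transparent colour built with
adjustcolor() from grDevices, so that the number of overlapping lines shows up
as darkness rather than thickness. The alpha value of 0.05 and the line width
are arbitrary choices, and the graphics device must support transparency.

```r
set.seed(100)
Data <- rbind(matrix(rnorm(100, sd = 0.3), ncol = 2),
              matrix(rnorm(100, mean = 1, sd = 0.3), ncol = 2))
k.range <- 2:10

# One row per observation, one column per k: the mean of the centre
# coordinates of the cluster the observation belongs to (no jitter).
Y <- sapply(k.range, function(k) {
  cl <- kmeans(Data, k)
  rowMeans(cl$centers)[cl$cluster]
})

# Semi-transparent black; overlapping lines accumulate opacity.
line_col <- adjustcolor("black", alpha.f = 0.05)

plot(0, 0, type = "n", xlim = range(k.range), ylim = range(Y),
     xlab = "Number of clusters", ylab = "Cluster means",
     main = "Clustergram with transparent lines")
abline(v = k.range, col = "grey")
# Each column of t(Y) is one observation's path across k.
matlines(k.range, t(Y), col = line_col, lty = 1, lwd = 3)
```

A path shared by many observations is drawn nearly solid, while a path
followed by only a few stays faint.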
cheers,
Paul
--
Drs. Paul Hiemstra
Department of Physical Geography
Faculty of Geosciences
University of Utrecht
Heidelberglaan 2
P.O. Box 80.115
3508 TC Utrecht
Phone: +3130 274 3113 Mon-Tue
Phone: +3130 253 5773 Wed-Fri
http://intamap.geo.uu.nl/~paul
http://nl.linkedin.com/pub/paul-hiemstra/20/30b/770