Displaying 20 results from an estimated 2000 matches similar to: "Variance explained by cluster analysis"
2008 Feb 10
0
[R-sig-Geo] Comparing spatial point patterns - Syrjala test
Hi,
I went ahead and implemented something. However:
- I cannot guarantee it gives correct results since, unfortunately, the
data used in Syrjala 1996 is not published along with the paper. To
avoid mistakes, I started by coding things in a fast and simple way
and then tried to optimize the code. At least all versions give the
same results.
- As expected, the test is still quite slow
2000 Mar 21
1
clustering methods in R
Dear R people,
I need to do some work with clustering, but know next to nothing about it
at present. R has (at least) three clustering packages, cluster, mclust,
cclust.
I was wondering if someone can direct me to some good books where I could
find documentation and background on the functions in these packages. The
html help in these packages lists the following as references. Can people
2005 May 30
2
"FANNY" function in R package "cluster"
Dear All,
I am attempting to use the FANNY fuzzy clustering function in R
(Kaufman & Rousseeuw, 1990), found in the "cluster" package. I have
run into a variety of difficulties; the two most crucial difficulties
are enumerated below.
1. Where is the 'm' parameter in FANNY?
In _Finding Groups in Data: An Introduction to Cluster Analysis_
(1990) by Kaufman & Rousseeuw,
2001 Jan 09
2
PAM clustering (using triangular matrix)
Hi,
I'm trying to use a similarity matrix (triangular) as input for pam() or
fanny() clustering algorithms.
The problem is that these algorithms can only accept a dissimilarity
matrix, normally generated by daisy().
However, daisy only accepts 'data matrix or dataframe. Dissimilarities
will be computed between the rows of x'.
Is there any way to say to that your data are already a
2007 Sep 20
1
ggplot and xlim/ylim
Hello everyone,
I am (happily) using ggplot2 for all my plotting now and I wondered
if there is an easy way to specify xlim and ylim somewhere when using
the ggplot syntax, as opposed to the qplot syntax. E.g.
qplot(data=mtcars,y=wt, x=qsec,xlim=c(0,30))
<->
ggplot(mtcars, aes(y=wt, x=qsec)) + geom_point() + ???
Indeed the ggplot syntax is in general more flexible and powerful and
2007 May 21
1
plot(......,new=T) vs. par(new=T)
Hello everybody,
This is probably a classic but I cannot find an answer to this on the
mailing list (i.e. with a google search restricted to the mailing
list archive). Setting:
par(new=T)
plot(x,y)
works but
plot(x,y,new=T)
doesn't, while plot's help says that ... arguments are passed
to par. What am I missing?
JiHO
---
http://jo.irisson.free.fr/
2004 Jun 29
1
PAM clustering: using my own dissimilarity matrix
Hello,
I would like to use my own dissimilarity matrix in a PAM clustering with
method "pam" (cluster package) instead of a dissimilarity matrix created
by daisy.
I read data from a file containing the dissimilarity values using
"read.csv". This creates a matrix (alternatively: an array or vector)
which is not accepted by "pam": A call
2007 Dec 13
2
use ggplot in a function to which a column name is given
Hi everyone, Hi ggplot users in particular,
ggplot makes it very easy to plot things given their names when you
use it interactively (and therefore can provide the names of the
columns).
qplot(x,foo,data=A) where A has columns (x,y,foo,bar) for example
but I would like to use this from inside a function to which the name
of the column is given. I cannot find an elegant way to make this
2008 Jan 18
1
Selecting rows conditionally between 2 data.frames
Hello everyone,
I have two data.frames that look like
calib:
place zoom scale
left 0.65 8
left 0.80 5.6
left 1.20 3
right 0.65 8.4
right 0.80 6
right 1.20 2.9
X:
... place zoom ....
... left 0.80 ....
... left 1.20 ....
... right 0.65 ....
... NA NA ....
... right 0.8 ....
... left 1.20 ....
and I want to get the corresponding values of 'scale' in a new column
2007 Jul 30
2
apply, lapply and data.frame in R 2.5
Hello everyone,
A recent (in 2.5 I suspect) change in R is giving me trouble. I want
to apply a function (tolower) to all the columns of a data.frame and
get a data.frame in return.
Currently, on a data.frame, both apply (for arrays) and lapply (for
lists) work, but each returns its native class (resp. matrix and list):
apply(mydat,2,tolower) # gives a matrix
lapply(mydat,tolower) # gives
2007 Oct 04
3
pdf() device uses fonts to represent points - data alteration?
Hello all,
I discovered that the pdf device uses fonts to represent "points"
symbols (as in plot(...,type="p",...) ). Namely it uses ZapfDingbats
with symbol U+25cf. This can lead to problems when the font is not
available, or available in another version (such as points being
replaced by other symbols, or worse: slightly displaced).
Furthermore, it also causes
2007 Jul 24
2
x,y,z table to matrix with x as rows and y as columns
Hello all,
I am sure I am missing something obvious but I cannot find the
function I am looking for. I have a data frame with three columns: X,
Y and Z, with X and Y being grid coordinates and Z the value
associated with these coordinates. I want to transform this data
frame in a matrix of Z values, on the grid defined by X and Y (and,
as a plus, fill the X.Y combinations which do no
2007 May 21
1
Comparing multiple distributions
Hello everybody,
I am studying the vertical distribution of plankton and want to study
its variations relative to several factors (time of day, species,
water column structure etc.). So my data is special in that, at each
sampling site (each observation), I don't have *one* number, I have
*several* numbers (abundance of organisms in each depth bin, I sample
5 depth bins) which
2008 Jan 11
1
ggplot2, coord_equal and aspect ratio
Hi everyone, Hi Hadley,
I am a heavy user of coord_equal() in ggplot2 since most of my data is
spatial, on x,y coordinates. Everything works. However by enforcing an
aspect ratio of 1 for the plotting region, coord_equal() usually
wastes a lot of space if the region of interest is not a perfect square.
For example:
x=runif(10)
a=data.frame(x=x*3,y=x)
ggplot(data=a, aes(x=x,y=y)) +
2008 Feb 05
2
Incomplete output with sink and split=TRUE
Dear List,
I am trying to get R's terminal output to a file and to the terminal
at the same time, so that I can walk through some tests and keep a log
concurrently. The function 'sink' with the option split=TRUE seems to
do just that. It works fine for most output but for objects of class
htest, the terminal output is incomplete (the lines are there but
empty). Here is an
2007 May 18
3
lapply not reading arguments from the correct environment
Hello,
I am facing a problem with lapply which I *think* may be a bug.
This is the most basic function in which I can reproduce it:
myfun <- function()
{
    foo = data.frame(1:10,10:1)
    foos = list(foo)
    fooCollumn=2
    cFoo = lapply(foos,subset,select=fooCollumn)
    return(cFoo)
}
I am building a list of dataframes, in each of which I want to keep
only column
2009 Mar 22
3
'require' equivalent for local functions
Hello everyone,
I often create some local "libraries" of functions (.R files with only
functions in them) that I later call. In scripts that call a function
from such a library, I would like to be able to test whether the
function is already known in the namespace and, only if it is not,
source the library file. I.e. what `require` does for packages, I want
to do with my local
2008 Feb 09
1
Comparing spatial point patterns - Syrjala test
Dear Lists,
At several stations distributed regularly in space[1], we sampled
repeatedly (4 times) the abundance of organisms and measured
environmental parameters. I now want to compare the spatial
distribution of various species (and test whether they differ or not),
or to compare the distribution of a particular organism with the
distribution of some environmental variable.
2007 Sep 27
3
Plotting from different data sources on the same plot (with ggplot2)
Hello everyone (and Hadley in particular),
I often need to plot data from multiple datasets on the same graph. A
common example is when mapping some values: I want to plot the
underlying map and then add the points. I currently do it with base
graphics, by recording the maximum region in which my map+point will
fit, plotting both with these xlim and ylim parameters, adding par
(new=T)
2008 Jan 21
1
sorting in 'merge'
Hello everyone,
I've been advised to use merge to extract information from two
data.frames with a number of common columns, but I cannot get a grasp
on how it sorts the result. With sort=FALSE, I would expect it to give
the result back sorted exactly as the input was but it seems it is not
always the case, especially when there are repeats in the input.
For example:
> a =