similar to: cluster

2006 Oct 17
4
cluster in R
Hi, is there a good summary of clustering methods in R? There seem to be many packages for it. I also have two questions on clustering: 1. Is there a way to evaluate the effectiveness (or separation) of a clustering other than by visualization? 2. Is there a search method (such as genetic search) that can help find the subset of attributes giving the best separation? Thanks,
2005 Aug 08
2
computationally singular
Hi, I have a dataset which has around 138 variables and 30,000 cases. I am trying to calculate a Mahalanobis distance matrix for them, and my procedure is like this. Suppose my data is stored in mymatrix:
> S <- cov(mymatrix) # this is fine
> D <- sapply(1:nrow(mymatrix), function(i) mahalanobis(mymatrix, mymatrix[i,], S))
Error in solve.default(cov, ...) : system is computationally
2011 May 27
4
network package in R
Hi there, I need a network builder that can change node size and color; I am not sure whether the network package in R can do this. The other functions I wanted are already in that package. BTW, if there is another R package for this, please suggest it too. Thanks, Weiwei -- Weiwei Shi, Ph.D Research Scientist "Did you always know?" "No, I did not. But I
2005 Aug 12
2
need help
Hi, there: I think I need to rephrase my question, since last time I did not get any reply. I think the question is not that hard; I probably did not make it clear: I want to find cases like 35, 90, 330, 330, 335 from the rest which look like 3, 3, 3, 3.2, 3.3 4, 4.4, 4.5, 4.6, 4.7 .... Basically there is one (or more) big 'gap' in the cases I seek. Thanks, weiwei --
2005 Jul 13
1
read.table
Hi, I have a question on read.table. I have a dataset with 273,000 lines and 195 columns. I used read.table to load the data into R: trn<-read.table('train1.dat', header=F, sep='|', na.strings='.') I found it takes forever. Then I ran read.table again on 1/10 of the data (test), and this time it finished quickly. So, there might be something wrong in my data
2005 Jul 07
2
randomForest
> From: Weiwei Shi
>
> it works.
> thanks,
>
> but: (just curious)
> why i tried previously and i got
>
> > is.vector(sample.size)
> [1] TRUE

Because a list is also a vector:

> a <- c(list(1), list(2))
> a
[[1]]
[1] 1

[[2]]
[1] 2

> is.vector(a)
[1] TRUE
> is.numeric(a)
[1] FALSE

Actually, the way I initialize a list of known length is by
2005 Oct 11
1
a problem in random forest
Hi, there: I spent some time on this but I think I really cannot figure it out, maybe I missed something here: my data looks like this:
> dim(trn3)
[1] 7361  209
> dim(val3)
[1] 7427  209
> mg.rf2<-randomForest(x=trn3[,1:208], y=trn3[,209], data=trn3, xtest=val3[, 1:208], ytest=val3[,209], importance=T)
my test data has 7427 observations but after prediction,
> dim(mg.rf2$votes)
2005 Oct 04
1
generalized linear model and missing handling
Hi, I have a dataset and want to build a generalized linear model on it. Unfortunately, complete.cases(df) returns null, which means I have to find a way to "fill" the missing values. One way, following my previous post, is to replace them with the median (or with the most frequent level in the categorical case), but I am wondering if there are other ways, when glm or something like it is
2005 Dec 15
2
question on write.table
Hi, I have a question on write.table: I have a data.frame called t7 as below:
> dim(t7)
[1] 14015184        6
> t7[1:5,]
  uci uce par line graphical.forms  stems
1   0   0   0    0          active  activ
2   0   0   0    0          policy polici
3   0   0   0    0              wc     PC
4   0   0   0    0             eff    elf
5   0   0   0    0             icn    ICC
I want to write the
2006 Jun 03
1
time series clustering
Dear Listers: I have a problem that requires time-series clustering, since the clusters will change with time (data that is too old needs to be removed while new data comes in). I am wondering if there is a paper or reference on this topic, and whether there is some kind of implementation in R? Thanks, Weiwei -- Weiwei Shi, Ph.D "Did you always know?" "No, I did not. But I
2007 Apr 11
5
how to reverse a list
Hi, there: I am wondering if there is a quick way to "reverse" a list like this:
t0 <- list(a=1, b=1, c=2, d=1)
reverse t0 to t1
> t1
$`1`
[1] "a" "b" "d"
$`2`
[1] "c"
thanks. -- Weiwei Shi, Ph.D Research Scientist GeneGO, Inc. "Did you always know?" "No, I did not. But I believed..." ---Matrix III
2007 Apr 24
5
intersect more than two sets
Hi, I searched the archives and did not find a good solution to this. Assume I have 10 sets and I want the character elements common to all of them. How could I do that? -- Weiwei Shi, Ph.D Research Scientist GeneGO, Inc. "Did you always know?" "No, I did not. But I believed..." ---Matrix III
2007 Jun 25
3
a string to environment or function
Hi, I am wondering how to make a function Fun to make the following work: t0 <- (paste("hgu133a", "ENTREZID", sep="")) xx <- as.list(Fun(t0)) # make it work like xx<-as.list(hgu133aENTREZID) thanks, -- Weiwei Shi, Ph.D Research Scientist GeneGO, Inc. "Did you always know?" "No, I did not. But I believed..." ---Matrix III
2005 Jul 08
1
"more" and "tab" functionalities in R under linux
Hi, forgive me if it is due to my "laziness" :) I am wondering if there are functionalities in R that work like "more" and "tab" in linux: more(one.data.frame) so I can browse through it. Sometimes I can use one.data.frame[1:100,], but it is still not as good as "more" in linux. tab: can I use tab to auto-complete a defined object name in R so I don't
2005 Oct 05
1
pca in dimension reduction
Hi, there: I am wondering if anyone here can provide an example of using PCA for dimension reduction on a dataset. The dataset can be n*q (n>=q or n<=q). As to dimension reduction, are there implementations of other methods such as ICA, Isomap, Locally Linear Embedding... Thanks, weiwei -- Weiwei Shi, Ph.D "Did you always know?" "No, I did not. But I believed..." ---Matrix III
2005 Oct 11
1
an error in my using of nnet
Hi, there: I am trying nnet as follows:
> mg.nnet<-nnet(x=trn3[,r.v[1:100]], y=trn3[,209], size=5, decay = 5e-4, maxit = 200)
# weights: 511
initial value 13822.108453
iter 10 value 7408.169201
iter 20 value 7362.201934
iter 30 value 7361.669408
iter 40 value 7361.294379
iter 50 value 7361.045190
final value 7361.038121
converged
Error in y - tmp : non-numeric argument to binary operator
2007 Oct 29
3
how to split data.frame by row?
Hi, if I have a 20 x 3 data.frame, how can I split it into 10 x 6 (moving the lower 10 x 3 part into columns) or 5 x 12? Thanks -- Weiwei Shi, Ph.D Research Scientist GeneGO, Inc. "Did you always know?" "No, I did not. But I believed..." ---Matrix III
2008 Aug 24
2
similarity between two gene lists with varied length
Dear listers, a little off-topic: I am looking for, and want to compare, algorithms that can calculate a "distance" or "similarity" between two gene lists of different lengths. Any paper, any implementation in R, and any suggestion is welcome! Thanks, -- Weiwei Shi, Ph.D Research Scientist GeneGO, Inc. "Did you always know?" "No, I did not. But I believed..."
2006 Apr 07
2
a statistics question
Hi there, I have a statistics question on a classification problem: Suppose I have 1000 binary variables and one binary dependent variable. I want to find a way similar to PCA, in which I can find a couple of combinations of those variables to discriminate best according to the dependent variable. It is not only for dimension reduction, but more important, for finding best way to construct
2007 May 01
1
dlda{supclust} 's output
Hi, I am using the dlda algorithm from the supclust package and I am wondering if the output can be a continuous probability instead of a discrete class label (zero or one), since it puts some restriction on the covariance matrix compared with lda, while the latter can. Thanks, -- Weiwei Shi, Ph.D Research Scientist GeneGO, Inc. "Did you always know?" "No, I did not. But I believed..."