Displaying 20 results from an estimated 29 matches for "advstats".

2012 Mar 04
1
rpart package, text function, and round of class counts
...it) text(fit, use.n=TRUE) The text labels represent the count of each class at the leaf node. Unfortunately, the numbers are rounded and in scientific notation rather than the exact number of examples sorted by that node in each class. The plot is supposed to look like http://www.statmethods.net/advstats/images/ctree.png as per http://www.statmethods.net/advstats/cart.html. I'm running 2.14.1 on a mac. Can anyone verify or point out if I am doing something obviously wrong for displaying the counts rounded and in scientific notation rather than the true counts in each class at each node? Thank...
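One way to read the exact per-class counts, independent of how text() formats its labels, is to pull them from the fitted object. This is a minimal sketch, assuming the kyphosis data shipped with rpart (the poster's own data and model call are elided in the snippet) and a classification tree, where the columns of fit$frame$yval2 after the first hold the class counts.

```r
library(rpart)

# Assumption: kyphosis example data from rpart; the poster's data is not shown.
fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis, method = "class")

n_class <- length(attr(fit, "ylevels"))   # number of response classes
leaves  <- fit$frame$var == "<leaf>"      # frame rows that are terminal nodes

# yval2: column 1 = predicted class, next n_class columns = class counts
counts <- fit$frame$yval2[leaves, 1 + seq_len(n_class), drop = FALSE]
dimnames(counts) <- list(node  = rownames(fit$frame)[leaves],
                         class = attr(fit, "ylevels"))
print(counts)
```

Each row is one leaf and each column one class, so the entries are plain integers regardless of the plot's number formatting.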
2008 Mar 06
2
Principle component analysis function
Dear All, In a package, I want to use a PCA function. The structure I used follows this page: http://www.statmethods.net/advstats/factor.html. fit<-principle(mydata, nfactors=9, rotation=TRUE) or: result<-PCA(mydata) But I don't know why R on my computer reported: "not found principle", "not found PCA". I downloaded and installed R-2.6.2-win32.exe. Thanks a lot for ans...
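A "not found" message of this kind usually means the function's package is not loaded or the name is misspelled; principal() lives in the psych package and PCA() in FactoMineR, and neither ships with base R. As a sketch that needs no extra packages, the same kind of analysis can be run with prcomp() and varimax() from the stats package. USArrests is only a stand-in for the poster's mydata, and keeping two components is an arbitrary choice for illustration.

```r
# Base R only: PCA with prcomp(), then a varimax rotation of the loadings.
# Assumption: USArrests stands in for the poster's `mydata`.
pca <- prcomp(USArrests, center = TRUE, scale. = TRUE)
print(summary(pca))                # variance explained per component

# Loadings = eigenvectors scaled by the component standard deviations (first 2 PCs).
raw_loadings <- pca$rotation[, 1:2] %*% diag(pca$sdev[1:2])
rotated <- varimax(raw_loadings)   # stats::varimax
print(rotated$loadings)
```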
2010 May 03
1
rpart, cross-validation errors question
I ran this code (several times) from the Quick-R web page ( http://www.statmethods.net/advstats/cart.html) but my cross-validation errors increase instead of decrease (same thing happens with an unrelated data set). Why does this happen? Am I doing something wrong? # Classification Tree with rpart library(rpart) # grow tree fit <- rpart(Kyphosis ~ Age + Number + Start, method="c...
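The two error columns rpart reports behave differently, which a short sketch can show. It assumes the kyphosis data from rpart and method = "class" (the snippet's call is cut off after method="c). The resubstitution error ("rel error") cannot go up as splits are added, while the cross-validated error ("xerror") depends on random folds and may rise once extra splits stop generalizing.

```r
library(rpart)

set.seed(42)  # cross-validation folds are random; fix the seed to reproduce xerror
# Assumption: data = kyphosis and method = "class"; the snippet's call is truncated.
fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis, method = "class")

cp_table <- fit$cptable   # columns: CP, nsplit, rel error, xerror, xstd
print(cp_table)

# Compare the two error columns row by row.
print(cp_table[, c("nsplit", "rel error", "xerror")])
```

Rerunning without set.seed() changes xerror and xstd from run to run but leaves the other columns unchanged.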
2010 Apr 21
1
Can I compare two clusters without using their distance-matrix (dist()) ?
Hello all, I would like to compare the similarity of two cluster solutions using a validation criterion (such as Hubert's gamma coefficient, the Dunn index, the corrected Rand index and so on). I see (from here: http://www.statmethods.net/advstats/cluster.html) that the function cluster.stats() in the fpc package provides a mechanism for comparing 2 cluster solutions - *BUT* - it requires me to give the distance matrix among objects. *My question* is: What ways can you suggest for comparing two cluster solutions, while using the cluster...
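Of the criteria named in the snippet, the corrected (adjusted) Rand index depends only on the two label vectors, so it can be computed from their contingency table without any distance matrix. The function below is a hand-written sketch in base R, not the fpc implementation; adjusted_rand and pair_count are names invented here, and the iris data is a stand-in.

```r
# Adjusted Rand index from two label vectors; no distance matrix required.
adjusted_rand <- function(a, b) {
  stopifnot(length(a) == length(b))
  tab <- table(a, b)                           # contingency table of the two partitions
  pair_count <- function(x) sum(choose(x, 2))  # object pairs that share a cell
  n_pairs   <- choose(length(a), 2)
  sum_cells <- pair_count(tab)
  sum_rows  <- pair_count(rowSums(tab))
  sum_cols  <- pair_count(colSums(tab))
  expected  <- sum_rows * sum_cols / n_pairs   # agreement expected by chance
  max_index <- (sum_rows + sum_cols) / 2
  # NaN when both partitions are trivial (denominator is zero)
  (sum_cells - expected) / (max_index - expected)
}

set.seed(7)
x    <- scale(iris[, 1:4])                     # assumption: stand-in data
sol3 <- kmeans(x, centers = 3, nstart = 20)$cluster
sol4 <- kmeans(x, centers = 4, nstart = 20)$cluster
print(adjusted_rand(sol3, sol4))
```

A value of 1 means the two partitions are identical up to relabelling, and values near 0 mean agreement no better than chance.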
2013 Jul 02
1
Recursive partitioning on censored data
I am interested in applying a "classification tree" analysis where the response variable is a censored variable (survival data). I've discovered the package 'party' through this page: http://www.statmethods.net/advstats/cart.html. However, as my sample is not very big I would like to apply 'bootstrap' and use 'random forests', but with my censored response variable. Are there any packages for that?? Looking forward to your answer, -- vicent @vginer_upv
2011 Sep 08
1
"rpart" or "tree" function issue
...tree using either tree or rpart functions but when it comes to plotting the results the formatting I get is different than what I see in all the tutorials (like http://www.youtube.com/watch?v=9XNhqO1bu0A or http://www.youtube.com/watch?v=m3mLNpeke0I&feature=related or http://www.statmethods.net/advstats/cart.html "tree for kyphosis"). I am trying to take a large demographic population and create a tree which systematically and accurately divides them into 2 pre-defined classifications using multiple predictor variables. What I would like to see is what I have seen in the tutorials simi...
2017 Aug 16
1
Bias-corrected percentile confidence intervals
...reasons). I cannot figure out where I'm going wrong but the estimates from my attempt at the BCP CI are different enough from other methods that I assume I'm doing something wrong. require(boot) data("mtcars") # 1) Bootstrap 95% CI for R-Squared via boot::boot # statmethods.net/advstats/bootstrapping.html # Function for boot rsq <- function(formula, data, indices) { d <- data[indices,] fit <- lm(formula, data=d) return(summary(fit)$r.square) } # bootstrapping with 1000 replications results <- boot(data=mtcars, statistic=rsq, R=1000, formula = mp...
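For reference, percentile and bias-corrected-and-accelerated (BCa) intervals can both be requested from boot.ci() on the same boot object, which gives a baseline to compare a hand-rolled interval against. This is a sketch, not the poster's full script: the model formula is cut off in the snippet, so mpg ~ wt + disp is an assumption made here for illustration, as is the seed.

```r
library(boot)

# Statistic: R-squared of a linear model refitted on the resampled rows.
rsq <- function(formula, data, indices) {
  d <- data[indices, ]
  fit <- lm(formula, data = d)
  summary(fit)$r.squared
}

set.seed(123)
# Assumption: illustrative formula; the snippet's own formula is truncated.
results <- boot(data = mtcars, statistic = rsq, R = 1000,
                formula = mpg ~ wt + disp)

# Percentile and BCa intervals from the same replicates (95% by default).
ci <- boot.ci(results, type = c("perc", "bca"))
print(ci)
```

The interval endpoints are the last two entries of ci$percent and ci$bca.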
2016 Apr 13
0
Decision Tree and Random Forrest
...andom forests and have even used them both before. Mike On Apr 13, 2016 5:32 PM, "Sarah Goslee" <sarah.goslee at gmail.com> wrote: It sounds like you want classification or regression trees. rpart does exactly what you describe. Here's an overview: http://www.statmethods.net/advstats/cart.html But there are a lot of other ways to do the same thing in R, for instance: http://www.r-bloggers.com/a-brief-tour-of-the-trees-and-forests/ You can get the same kind of information from random forests, but it's less straightforward. If you want a clear set of rules as in your golf e...
2013 Jul 26
1
variation in k-means results (Alfredo Alvarez)
...but I am not sure whether another clustering would work better. That is, I am interested in a clustering method that produces the "best" clustering, and since the kmeans results change, I don't know which clustering to choose. I used other clustering methods such as mclust and pvclust ( http://www.statmethods.net/advstats/cluster.html) which, as I understand it, produce the "best" clustering and, above all, do not vary in their results. Unlike kmeans and pvclust, the mclust package does not require the number of groups to be defined in advance. As for controlling the size of the groups, I think I have seen something, but...
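The run-to-run variation in kmeans comes from its random starting centers. A common way to reduce it, sketched here with stand-in data (scaled iris measurements, three centers, and the seed are all assumptions), is to fix the random seed and let kmeans() try many random starts through its nstart argument, which keeps the solution with the smallest total within-cluster sum of squares.

```r
set.seed(2013)                       # makes the random starts reproducible
x <- scale(iris[, 1:4])              # assumption: stand-in data

single <- kmeans(x, centers = 3, nstart = 1)    # one random start
multi  <- kmeans(x, centers = 3, nstart = 50)   # best of 50 random starts

# Lower total within-cluster sum of squares means a tighter solution.
print(c(single = single$tot.withinss, multi = multi$tot.withinss))
```

This stabilizes the result for a given number of clusters; it does not by itself choose that number.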
2016 Apr 14
3
Decision Tree and Random Forrest
...ah.goslee at gmail.com> wrote: > > It sounds like you want classification or regression trees. rpart does > exactly what you describe. > > Here's an overview: > http://www.statmethods.net/advstats/cart.html > > But there are a lot of other ways to do the same thing in R, for instance: > http://www.r-bloggers.com/a-brief-tour-of-the-trees-and-forests/ > > You can get the same kind of information from random forests, but it's > less straightforward. If you want a clear se...
2016 Apr 15
1
Decision Tree and Random Forrest
...;> On Apr 13, 2016 5:32 PM, "Sarah Goslee" <sarah.goslee at gmail.com> wrote: >> >> It sounds like you want classification or regression trees. rpart does >> exactly what you describe. >> >> Here's an overview: >> http://www.statmethods.net/advstats/cart.html >> >> But there are a lot of other ways to do the same thing in R, for instance: >> http://www.r-bloggers.com/a-brief-tour-of-the-trees-and-forests/ >> >> You can get the same kind of information from random forests, but it's >> less straightforward...
2008 Mar 05
2
Principle component analysis
Thanks to Mr. Liviu Androvic and Mr. Richard Rowe, who helped me with PCA. Because I have been learning R for only a few days, I have many problems. 1) I don't know why the PCA rotation function does not run, although I have tried many times. Would you please help me and explain how to read the PCA map (both rotated and unrotated) with a concrete example? 2) Where can I find documentation related to: Plan S(A), S(A*B),
2010 Jan 05
1
bootstrapping a matrix and calculating Pearson's correlation coefficient
Hi All, I have a matrix 'data' of dimension 22000x600. I want to make 50 independent samples of dimension 22000x300 from the original matrix 'data', and then calculate Pearson's correlation coefficient for each of the 50 resulting matrices. It seems it is possible to do this using the 'boot' function from library boot, but I am not able to figure out how. I am really stuck. Please help!
2010 Jul 08
1
Histogram Principal component analysis in R
Hi, I am trying to do a Principal component analysis on histogram data. Basically, I have a group of subjects and for each of them, I have a column of bin-counts (vis-a-vis intervals) and a corresponding column of frequencies (or normalized frequencies). The bin counts are the same for all the subjects. I also have a group-averaged histogram (with the same bin counts and a column of frequencies)
2009 Aug 30
1
Complexity parameter in rpart
...and I am at the point where I wish to prune my overfitted trees. Having read the documentation I understand that to do this requires the use of the complexity parameter. My question is how to go about choosing the correct complexity parameter for my tree? In some places (http://www.statmethods.net/advstats/cart.html) I have read that it is best to select the complexity parameter which minimises the cross-validated (x) error of the model, but elsewhere I have read that the optimum cp is the first value on the left above the '1+SE' line of the complexity parameter plot. I was hoping someone mig...
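The two selection rules mentioned in the snippet can be read directly off the cptable of a fitted rpart object. This is a sketch under stated assumptions: the kyphosis data, a deliberately overgrown tree (cp = 0, minsplit = 5), and the seed are illustrative choices, and the 1-SE rule is implemented here as "the simplest tree whose xerror is within one xstd of the minimum".

```r
library(rpart)

set.seed(1)
# Grow a deliberately large tree (cp = 0) so there is something to prune.
fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis,
             method = "class", control = rpart.control(cp = 0, minsplit = 5))
tab <- fit$cptable   # rows run from the smallest tree to the largest

# Rule 1: cp with the smallest cross-validated error.
i_min  <- which.min(tab[, "xerror"])
cp_min <- tab[i_min, "CP"]

# Rule 2 (1-SE): simplest tree whose xerror is within one xstd of the minimum.
threshold <- tab[i_min, "xerror"] + tab[i_min, "xstd"]
i_1se  <- min(which(tab[, "xerror"] <= threshold))
cp_1se <- tab[i_1se, "CP"]

pruned_min <- prune(fit, cp = cp_min)
pruned_1se <- prune(fit, cp = cp_1se)
print(c(cp_min = cp_min, cp_1se = cp_1se))
```

The 1-SE choice never yields a larger tree than the minimum-xerror choice; it trades a statistically negligible amount of cross-validated error for a simpler model.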
2010 Apr 24
1
Multiple Correlation coefficient (spearman, Kenall)
Hi, I'm currently trying to find/define a relationship between one dependent and several independent variables. The problem is that I cannot use the normal multiple regression/correlation in SPSS because the data are not normally distributed. I calculated the Spearman rho and Kendall's tau correlations and also some partial correlations in R. Now I want to find out the multiple correlation
2010 Jul 27
1
as.dendrogram for DICE coefficient.
Hi R, I was using 'as.dendrogram' with the DIST coefficient, where smaller values of the coefficient mean that the objects are closer to each other, while larger values mean that the objects are farther apart. But now, I have my coefficient as the DICE coefficient (in some sense similar to a correlation coefficient), where the larger coefficient
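The snippet breaks off before the question is stated, but a common way to build a dendrogram from a similarity such as Dice is to convert it to a dissimilarity first. The sketch below assumes a similarity scaled to [0, 1]; the 4x4 matrix is made up for illustration and average linkage is an arbitrary choice.

```r
# Hypothetical 4x4 Dice similarity matrix (1 = identical, 0 = nothing shared).
sim <- matrix(c(1.0, 0.8, 0.2, 0.1,
                0.8, 1.0, 0.3, 0.2,
                0.2, 0.3, 1.0, 0.7,
                0.1, 0.2, 0.7, 1.0),
              nrow = 4, dimnames = list(letters[1:4], letters[1:4]))

d    <- as.dist(1 - sim)             # high similarity becomes small distance
hc   <- hclust(d, method = "average")
dend <- as.dendrogram(hc)            # ready for plot(dend)
print(cutree(hc, k = 2))             # cluster membership at two groups
```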
2010 Oct 22
1
question about decision trees
Hi, I have seen that R has an implementation of decision trees; however, after I have the tree with the classification: R Quinlan's trivial example of the "golf" decision tree. Outlook Temperature Humidity Windy PlayDontPlay 1 sunny 85 85 false DontPlay 2 sunny 80 90 true DontPlay 3 overcast 83 78 false Play 4 rain 70 96 false Play ... What's next? I mean, what is this
2010 Oct 30
1
compare quality of clustering methods?
Hi, Suppose I want to compare the results of two clustering methods, what is the best way to do it? Thanks Regards, -k
2012 Nov 26
0
cluster analysis error - mclust package
I am following instructions online for cluster analysis using the mclust package, and keep getting errors. http://www.statmethods.net/advstats/cluster.html These are the instructions (there is no sample dataset unfortunately): # Model Based Clustering library(mclust) fit <- Mclust(mydata) plot(fit, mydata) # plot results print(fit) # display the best model This is what I did and the error I get: > library(mclust) > fit <-...