thr3ads.net - search: "advstats"

Displaying 20 results from an estimated 29 matches for "advstats".

rpart package, text function, and round of class counts

2012 Mar 04

...it) text(fit, use.n=TRUE) The text labels represent the count of each class at the leaf node. Unfortunately, the numbers are rounded and in scientific notation rather than the exact number of examples sorted by that node in each class. The plot is supposed to look like http://www.statmethods.net/advstats/images/ctree.png as per http://www.statmethods.net/advstats/cart.html. I'm running 2.14.1 on a mac. Can anyone verify or point out if I am doing something obviously wrong for displaying the counts rounded and in scientific notation rather than the true counts in each class at each node? Thank...

Principle component analysis function

2008 Mar 06

Dear All, I want to use a PCA function from a package. The code I used follows this page: http://www.statmethods.net/advstats/factor.html. fit<-principle(mydata, nfactors=9, rotation=TRUE) or: result<-PCA(mydata) But I don't know why R on my computer reports: "not found principle", "not found PCA". I downloaded and installed R-2.6.2-win32.exe. Thanks a lot for ans...
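A "not found" message of this kind is what R reports when a function is not on the search path. As far as I know, the psych package spells its function principal() (not principle), and PCA() comes from FactoMineR; both packages must be installed and loaded with library() first. The sketch below avoids add-on packages entirely and uses only base R (prcomp and varimax). The USArrests data set and the choice of two components are assumptions made for illustration, not part of the original question.

```r
# Sketch only: PCA plus a varimax rotation using base R (stats package).
# USArrests stands in for the poster's 'mydata'.
pc <- prcomp(USArrests, scale. = TRUE)  # PCA on standardized variables
summary(pc)                             # variance explained per component

# Scale the first two eigenvectors by their standard deviations
# to obtain loadings, then rotate them.
raw_loadings <- pc$rotation[, 1:2] %*% diag(pc$sdev[1:2])
rot <- varimax(raw_loadings)
rot$loadings                            # rotated loadings
```

Comparing raw_loadings with rot$loadings shows how the rotation redistributes the variables across the two components.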

rpart, cross-validation errors question

2010 May 03

I ran this code (several times) from the Quick-R web page (http://www.statmethods.net/advstats/cart.html) but my cross-validation errors increase instead of decrease (the same thing happens with an unrelated data set). Why does this happen? Am I doing something wrong?
# Classification Tree with rpart
library(rpart)
# grow tree
fit <- rpart(Kyphosis ~ Age + Number + Start, method="c...

Can I compare two clusters without using their distance-matrix (dist()) ?

2010 Apr 21

Hello all, I would like to compare the similarity of two cluster solutions using a validation criterion (such as Hubert's gamma coefficient, the Dunn index, the corrected Rand index, and so on). I see (from here: http://www.statmethods.net/advstats/cluster.html) that the function cluster.stats() in the fpc package provides a mechanism for comparing 2 cluster solutions - *BUT* - it requires me to give the distance matrix among objects. *My question* is: What ways can you suggest for comparing two cluster solutions, while using the cluster...
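One criterion named in the question, the corrected (adjusted) Rand index, depends only on the two vectors of cluster labels, so it can be computed without a distance matrix. The sketch below is a minimal base R version built from the contingency table of the two labelings. The helper name adjusted_rand and the iris example are assumptions made for illustration; the mclust package also appears to offer adjustedRandIndex() for the same purpose.

```r
# Sketch only: adjusted Rand index from two label vectors.
# 'adjusted_rand' is a hypothetical helper, not part of fpc or any package.
adjusted_rand <- function(a, b) {
  stopifnot(length(a) == length(b))
  tab <- table(a, b)                      # contingency table of the two labelings
  n <- sum(tab)
  sum_ij <- sum(choose(tab, 2))           # pairs placed together in both solutions
  sum_a <- sum(choose(rowSums(tab), 2))   # pairs placed together in solution a
  sum_b <- sum(choose(colSums(tab), 2))   # pairs placed together in solution b
  expected <- sum_a * sum_b / choose(n, 2)
  max_index <- (sum_a + sum_b) / 2
  # Undefined (0/0) when both solutions put everything in one cluster.
  (sum_ij - expected) / (max_index - expected)
}

# Illustration: two k-means solutions on the same (assumed) data.
set.seed(1)
x <- scale(iris[, 1:4])
sol1 <- kmeans(x, centers = 3, nstart = 20)$cluster
sol2 <- kmeans(x, centers = 4, nstart = 20)$cluster
adjusted_rand(sol1, sol2)
```

A value of 1 means the two partitions are identical up to relabeling; values near 0 mean agreement no better than chance.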

Recursive partitioning on censored data

2013 Jul 02

I am interested in applying a "classification tree" analysis where the response variable is a censored variable (survival data). I've discovered the package 'party' through this page: http://www.statmethods.net/advstats/cart.html. However, as my sample is not very big I would like to apply 'bootstrap' and use 'random forests', but with my censored response variable. Are there any packages for that?? Looking forward to your answer, -- vicent @vginer_upv

"rpart" or "tree" function issue

2011 Sep 08

...tree using either tree or rpart functions but when it comes to plotting the results the formatting I get is different than what I see in all the tutorials (like http://www.youtube.com/watch?v=9XNhqO1bu0A or http://www.youtube.com/watch?v=m3mLNpeke0I&feature=related or http://www.statmethods.net/advstats/cart.html "tree for kyphosis"). I am trying to take a large demographic population and create a tree which systematically and accurately divides them into 2 pre-defined classifications using multiple predictor variables. What I would like to see is what I have seen in the tutorials simi...

Bias-corrected percentile confidence intervals

2017 Aug 16

...reasons). I cannot figure out where I'm going wrong but the estimates from my attempt at the BCP CI are different enough from other methods that I assume I'm doing something wrong.
require(boot)
data("mtcars")
# 1) Bootstrap 95% CI for R-Squared via boot::boot
# statmethods.net/advstats/bootstrapping.html
# Function for boot
rsq <- function(formula, data, indices) {
  d <- data[indices,]
  fit <- lm(formula, data=d)
  return(summary(fit)$r.square)
}
# bootstrapping with 1000 replications
results <- boot(data=mtcars, statistic=rsq, R=1000, formula = mp...

Decision Tree and Random Forrest

2016 Apr 13

...andom forrests and have even used them both before. Mike On Apr 13, 2016 5:32 PM, "Sarah Goslee" <sarah.goslee at gmail.com> wrote: It sounds like you want classification or regression trees. rpart does exactly what you describe. Here's an overview: http://www.statmethods.net/advstats/cart.html But there are a lot of other ways to do the same thing in R, for instance: http://www.r-bloggers.com/a-brief-tour-of-the-trees-and-forests/ You can get the same kind of information from random forests, but it's less straightforward. If you want a clear set of rules as in your golf e...

Variation in k-means results (Alfredo Alvarez)

2013 Jul 26

...but I am not sure whether another clustering would work better. That is, I am interested in a clustering method that produces the "best" clustering, and since the kmeans results change, I don't know which clustering to choose. I used other clustering methods such as mclust and pvclust ( http://www.statmethods.net/advstats/cluster.html) which, as I understand it, produce the "best" clustering and, above all, do not vary in their results. Unlike kmeans and pvclust, the mclust package does not require the number of groups to be defined in advance. As for controlling the size of the groups, I think I have seen something, but...

Decision Tree and Random Forrest

2016 Apr 14

...ah.goslee at gmail.com> wrote: > > It sounds like you want classification or regression trees. rpart does > exactly what you describe. > > Here's an overview: > http://www.statmethods.net/advstats/cart.html > > But there are a lot of other ways to do the same thing in R, for instance: > http://www.r-bloggers.com/a-brief-tour-of-the-trees-and-forests/ > > You can get the same kind of information from random forests, but it's > less straightforward. If you want a clear se...

Decision Tree and Random Forrest

2016 Apr 15

...;> On Apr 13, 2016 5:32 PM, "Sarah Goslee" <sarah.goslee at gmail.com> wrote: >> >> It sounds like you want classification or regression trees. rpart does >> exactly what you describe. >> >> Here's an overview: >> http://www.statmethods.net/advstats/cart.html >> >> But there are a lot of other ways to do the same thing in R, for instance: >> http://www.r-bloggers.com/a-brief-tour-of-the-trees-and-forests/ >> >> You can get the same kind of information from random forests, but it's >> less straightforward...

Principle component analysis

2008 Mar 05

Thanks to Mr. Liviu Androvic and Mr. Richard Rowe, who helped me with PCA. Because I have been learning the R language for only a few days, I have many problems. 1) I don't know why the PCA rotation function does not run, although I have tried many times. Would you please help me and explain how to read the PCA map (both rotated and unrotated) with a concrete example? 2) Where can I find documentation related to: Plan S(A), S(A*B),

bootstrapping a matrix and calculating Pearson's correlation coefficient

2010 Jan 05

Hi All, I have a matrix 'data' of dimension 22000x600. I want to draw 50 independent samples of dimension 22000x300 from the original matrix 'data', and then calculate Pearson's correlation coefficient for each of the 50 resulting matrices. It seems possible to do this with the 'boot' function from the boot library, but I cannot figure out how. I am really stuck. Please help!

Histogram Principal component analysis in R

2010 Jul 08

Hi, I am trying to do a Principal component analysis on histogram data. Basically, I have a group of subjects and for each of them, I have a column of bin-counts (vis-a-vis intervals) and a corresponding column of frequencies (or normalized frequencies). The bin counts are the same for all the subjects. I also have a group-averaged histogram (with the same bin counts and a column of frequencies)

Complexity parameter in rpart

2009 Aug 30

...and I am at the point where I wish to prune my overfitted trees. Having read the documentation I understand that to do this requires the use of the complexity parameter. My question is how to go about choosing the correct complexity parameter for my tree? In some places (http://www.statmethods.net/advstats/cart.html) I have read that it is best to select the complexity parameter which minimises the cross-validated (x) error of the model, but elsewhere I have read that the optimum cp is the first value on the left above the '1+SE' line of the complexity parameter plot. I was hoping someone mig...
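The two selection rules mentioned in the question can both be read off the cptable that rpart stores in the fitted object. The sketch below is one possible way to compute them, assuming the kyphosis data set that ships with rpart and an arbitrary seed. The "1-SE" rule is implemented here in its usual form (the simplest tree whose cross-validated error is within one standard error of the minimum), which may differ slightly from the wording the poster read.

```r
# Sketch only: two common ways of choosing cp before pruning an rpart tree.
library(rpart)

set.seed(42)  # cross-validation folds are random, so fix the seed
fit <- rpart(Kyphosis ~ Age + Number + Start,
             method = "class", data = kyphosis)

cp_tab <- fit$cptable  # columns: CP, nsplit, rel error, xerror, xstd

# Rule 1: the cp with the smallest cross-validated error
i_min  <- which.min(cp_tab[, "xerror"])
cp_min <- cp_tab[i_min, "CP"]

# Rule 2 (1-SE): the first (simplest) tree whose xerror is within
# one standard error of the minimum; rows are ordered from the
# simplest tree to the most complex one
threshold <- cp_tab[i_min, "xerror"] + cp_tab[i_min, "xstd"]
i_1se  <- which(cp_tab[, "xerror"] <= threshold)[1]
cp_1se <- cp_tab[i_1se, "CP"]

pruned_min <- prune(fit, cp = cp_min)
pruned_1se <- prune(fit, cp = cp_1se)
```

The 1-SE rule never selects a more complex tree than the minimum-xerror rule, so it is the more conservative choice when the cross-validated errors are noisy.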

Multiple Correlation coefficient (spearman, Kenall)

2010 Apr 24

Hi, I'm currently trying to find/define a relationship between one dependent and several independent variables. The problem is that I cannot use the normal multiple regression/correlation in SPSS because the data is not normally distributed. I calculated the Spearman rho and Kendall's tau correlations and also some partial correlations in R. Now I want to find out the multiple correlation

as.dendrogram for DICE coefficient.

2010 Jul 27

Hi R, I was using 'as.dendrogram' with the DIST coefficient, where smaller values mean that the objects are closer to each other and larger values mean that they are farther apart. But now my coefficient is the DICE coefficient (in some sense similar to a correlation coefficient), where the larger coefficient

question about decision trees

2010 Oct 22

Hi, I have seen that R has an implementation of decision trees; however, after I have the tree with the classification: R Quinlan's trivial example of the "golf" decision tree.
  Outlook  Temperature Humidity Windy PlayDontPlay
1 sunny    85          85       false DontPlay
2 sunny    80          90       true  DontPlay
3 overcast 83          78       false Play
4 rain     70          96       false Play
...
What's next? I mean, what is this

compare quality of clustering methods?

2010 Oct 30

Hi, Suppose I want to compare the results of two clustering methods, what is the best way to do it? Thanks Regards, -k

cluster analysis error - mclust package

2012 Nov 26

I am following instructions online for cluster analysis using the mclust package, and keep getting errors. http://www.statmethods.net/advstats/cluster.html These are the instructions (there is no sample dataset unfortunately):
# Model Based Clustering
library(mclust)
fit <- Mclust(mydata)
plot(fit, mydata) # plot results
print(fit) # display the best model
This is what I did and the error I get:
> library(mclust)
> fit <-...
