thr3ads.net - similar to: "removing factor level represented by less than x rows"

question about "mean"

2010 Jun 09

Hi there: I have a question about generating mean value of a data.frame. Take iris data for example, if I have a data.frame looking like the following: --------------------- Sepal.Length Sepal.Width Petal.Length Petal.Width Species 1 5.1 3.5 1.4 0.2 setosa 2 4.9 3.0 1.4 0.2

Cannot Compute Box's M (Three Days Trying...)

2017 Oct 28

Hey Duncan, Hard to debug? That's an understatement. Eyes bleeding.... In any case, I tried all your suggestions. To get "integer" for the final column, I had to change the code to get integers instead of strings. double[] d1 = ((REXPVector) ((RList) tableRead).get(0)).asDoubles(); double[] d2 = ((REXPVector) ((RList) tableRead).get(1)).asDoubles(); double[] d3 = ((REXPVector)

Cannot Compute Box's M (Three Days Trying...)

2017 Oct 28

Thanks Duncan. Awesome ideas! I think we're getting closer! I tried what you suggested and got a possibly better error... . . . rConnection.assign("boxMVariable", myDf); String resultBV = "str(boxMVariable)"; // your suggestion. RESULTING ERROR: Error in format.default(nam.ob, width = max(ncn), justify = "left") : invalid 'width' argument (No idea

removed data is still there!

2010 Sep 21

I'm confused, hope someone can point out what is not obvious to me. I thought I was creating a new data frame by 'deleting' rows from an existing data frame - I've tried 2 methods. But this new data frame seems to remember values from its parent - even though there are no occurrences. Where does it get the values versicolor and virginica from and give them a count of 0? What
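The behaviour described in this snippet can be reproduced with a short sketch. It assumes the poster subset `iris` by species, which the truncated snippet does not confirm; the names `setosa_only` and `setosa_clean` are illustrative. A factor column keeps its full set of levels after rows are removed, and `droplevels()` is one way to discard the levels that no longer occur.

```r
# Subsetting a data frame keeps every level of its factor columns,
# even levels that no longer occur in any remaining row.
setosa_only <- iris[iris$Species == "setosa", ]
table(setosa_only$Species)    # versicolor and virginica are still listed, with count 0

# droplevels() discards the unused levels from all factor columns.
setosa_clean <- droplevels(setosa_only)
levels(setosa_clean$Species)
```

Calling `factor()` again on the column should have the same effect for a single factor.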

aggregate() naming -- bug or feature

2018 Mar 23

In the examples below, the first loses the name attached by foo(), the second retains names attached by bar(). Is this an intentional difference? I'd prefer that the names be retained in both cases. foo <- function(x) { c(mean = base::mean(x)) } bar <- function(x) { c(mean = base::mean(x), sd = stats::sd(x))} aggregate(iris$Sepal.Length, by = list(iris$Species), FUN = foo) #>

randomForest gives different results for formula call v. x, y methods. Why?

2007 Apr 29

Just out of curiosity, I took the default "iris" example in the RF helpfile... but seeing the admonition against using the formula interface for large data sets, I wanted to play around a bit to see how the various options affected the output. Found something interesting I couldn't find documentation for... Just like the example... > set.seed(12) # to be sure I have

Cannot Compute Box's M (Three Days Trying...)

2017 Oct 29

Thanks Duncan. I can't tell you how helpful all your terrific replies have been. I think the biggest surprise is that nobody appears to be using Java and R together like I'm trying to do. I suppose it should be a surprise since there are no books on the subject and almost no technical documentation other than a few sites here and there. ----- I originally had the "int" as the

How to get the rowindices without using which?

2005 Sep 26

Hi, I was wondering if it is possible to get the rowindices without using the function "which" because I don't have a restriction criteria. Here's an example of what I mean: # take 10 randomly selected instances iris[sample(1:nrow(iris), 10),] # output Sepal.Length Sepal.Width Petal.Length Petal.Width Species 76 6.6 3.0 4.4 1.4
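One possible reading of this question is that the poster wants the indices of the sampled rows. Under that assumption, a minimal sketch is to draw the indices first and store them before subsetting; the names `idx` and `picked` are illustrative, and the seed is only there to make the draw repeatable.

```r
set.seed(1)                       # arbitrary seed, for repeatability only

# Draw the row indices first and keep them.
idx <- sample(nrow(iris), 10)

# Use the stored indices to extract the rows.
picked <- iris[idx, ]

# Because iris has default row names, they match the sampled indices.
rownames(picked)
```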

aggregate() naming -- bug or feature

2018 Mar 23

On Fri, Mar 23, 2018 at 6:43 PM, Rui Barradas <ruipbarradas at sapo.pt> wrote: > Hello, > > Not exactly an answer but here it goes. > If you use the formula interface the names will be retained. Also if you pass named arguments: aggregate(iris["Sepal.Length"], by = iris["Species"], FUN = foo) # Species Sepal.Length # 1 setosa 5.006 # 2
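The point in the quoted reply about the formula interface can be sketched as follows. It reuses `foo()` from the original question; `by_vector` and `by_formula` are illustrative names. The sketch only compares the column names of the results, not the inner name that `foo()` attaches.

```r
foo <- function(x) c(mean = base::mean(x))

# Vector interface with an unnamed list: columns get default names.
by_vector <- aggregate(iris$Sepal.Length, by = list(iris$Species), FUN = foo)
names(by_vector)

# Formula interface: the original column names carry over.
by_formula <- aggregate(Sepal.Length ~ Species, data = iris, FUN = foo)
names(by_formula)
```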

Random Forest confusion matrix

2009 Feb 26

Dear R users, I have a question on the confusion matrix generated by function randomForest. I used the entire data set to generate the forest, for example: > print(iris.rf) Call: randomForest(formula = Species ~ ., data = iris, importance = TRUE, keep.forest = TRUE) confusion setosa versicolor virginica class.error setosa 50 0 0 0.00

segfault during example(svm)

2011 Feb 18

If I do: > library("e1071") > example(svm) I get: svm> data(iris) svm> attach(iris) svm> ## classification mode svm> # default with factor response: svm> model <- svm(Species ~ ., data = iris) svm> # alternatively the traditional interface: svm> x <- subset(iris, select = -Species) svm> y <- Species svm> model <- svm(x, y) svm>

Get a percent variable based on group

2013 Jan 16

Dear all, I'd like to get a percentage variable based on a group, but without creating a new data frame. For example: data(iris) iris$percent <-unlist(tapply(iris$Sepal.Length,iris$Species,function(x) x/sum(x, na.rm=TRUE))) This does not work, I should have only three standard values, respectively for setosa, versicolor, and virginica. How can I do this? MANY THANKS, Karine
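A hedged sketch of one approach, assuming the goal is each row's share of its own group total (the post is ambiguous on this point): `ave()` applies a function within groups and returns the result in the original row order, which `unlist(tapply(...))` does not guarantee when the data are not sorted by group. The copy `df` is illustrative and only avoids modifying `iris`.

```r
df <- iris   # work on a copy

# ave() evaluates FUN within each Species group and returns a vector
# aligned with the original rows, so it can be assigned directly.
df$percent <- ave(df$Sepal.Length, df$Species,
                  FUN = function(x) x / sum(x, na.rm = TRUE))

# Sanity check: the shares within each species add up to 1.
tapply(df$percent, df$Species, sum)
```

If one value per species is wanted instead (each group's share of the overall total), `tapply(iris$Sepal.Length, iris$Species, sum) / sum(iris$Sepal.Length)` would be a possible alternative.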

problem with certain data sets when using randomForest

2005 Aug 26

Hi, Since I've had no replies to my previous post about my problem, I am posting it again in the hope that someone notices it. The problem is that the randomForest function doesn't accept datasets whose instances cover only a subset of all the classes. So the dataset with instances that either belong to class "a" or "b" from the levels "a", "b" and

saving sublist lda object with save.image()

2012 Jun 11

Greetings R experts, I'm having some difficulty recovering lda objects that I've saved within sublists using the save.image() function. I am running a script that exports a variety of different information as a list, included within that list is an lda object. I then take that list and create a list of that with all the different replications I've run. Unfortunately I've been

Newbie question - struggling with boxplots

2011 Aug 16

Hopefully I will not be flamed for this on the list, but I am starting out with R and having some trouble with combining plots. I am playing with the famous iris dataset (checking out example dataset in R while reading through Introduction to datamining). What I would like to do is create three graphs (combined boxplots) beside each other for each of the three species (Setosa, Versicolour and
