thr3ads.net - search: "iris2"

Displaying 20 results from an estimated 21 matches for "iris2".

2005 Apr 07

axis colors in pairs plot

...[1:4], main = "Anderson's Iris Data -- 3 species",pch = "+", col = c("black", "red", "green3", "blue")[ 1+ unclass(iris$Species)]) One very kludgy work-around is to define a new level 1, say "foo" in the first row of iris: iris2=iris iris2$Species = as.character(iris2$Species) iris2$Species[1]="foo" iris2$Species = factor(iris2$Species) pairs(iris2[1:4], main = "Anderson's Iris Data -- 3 species", pch = "+", col = c( "black","red", "green3","blue")...

2005 Jul 08

pairs() uses col argument for axes coloring

Hi list, not sure if this is the wanted behavior, but running the following code: > version platform i386-pc-mingw32 arch i386 os mingw32 system i386, mingw32 status major 2 minor 1.1 year 2005 month 06 day 20 language R > n <- 500 > d <- 4 > m <- matrix(runif(n*d, -1, 1), ncol=d) > c <- hsv(apply(m, 1, function(x) {sum(x*x)/d}),

2013 Jan 01

translate grouped data to their centroid

Given a data set with a group factor, I want to translate the numeric variables to their centroid, by subtracting out the group means (adding back the grand means). The following gives what I want, but there must be an easier way using sweep or apply or some such. iris2 <- iris[,c(1,2,5)] means <- colMeans(iris2[,1:2]) pooled <- lm(cbind(Sepal.Length, Sepal.Width) ~ Species, data=iris2)$residuals pooled[,1] <- pooled[,1] + means[1] pooled[,2] <- pooled[,2] + means[2] pooled <- as.data.frame(pooled) pooled$Species <- iris2$Species -- Michae...

2010 Sep 21

removed data is still there!

...50 50 50 > nrow(iris) [1] 150 > iris1 <- iris[iris$Species == 'setosa',] > nrow(iris1) [1] 50 > summary(iris1$Species) setosa versicolor virginica 50 0 0 boxplot(Petal.Width ~ Species, data = iris1, plot=1) > iris2 <- subset(iris, Species == 'setosa') > nrow(iris2) [1] 50 > summary(iris2$Species) setosa versicolor virginica 50 0 0 > boxplot(Petal.Width ~ Species, data = iris2, plot=1)...

2008 Sep 02

cluster a distance(analogue)-object using agnes(cluster)

I am trying to perform a clustering using an existing dissimilarity matrix that I calculated using distance (analogue). I tried two different things. One of them worked and one did not, and I don't understand why. Here is the code: not working example library(cluster) library(analogue) iris2<-as.data.frame(iris) str(iris2) 'data.frame': 150 obs. of 5 variables: $ Sepal.Length: num 5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ... $ Sepal.Width : num 3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ... $ Petal.Length: num 1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ... $ Petal.Width : num 0....

2006 Jul 11

R newbie: logical subsets

Hello! I'm a newcomer to R hoping to replace some convoluted database code with an R script. Unfortunately, I haven't been able to figure out how to implement the following logic. Essentially, we have a database of transactions that are coded with a geographic locale and a type. These are being loaded into a data.frame with named variables city, type, and price. E.g., trans$city

2009 Feb 12

barplot() x axes are not updated after removal of categories from the dataframe

Hi all, I'd be grateful for your help. I am a new user struggling with a barplot issue. I am plotting categories (X axis) and their mean count (Y axis) with barplot(). The first call to barplot works fine. I remove records from the dataframe using final=[!final$varname == "some value",] I echo the dataframe and the records are no longer in the dataframe. When I call plot again

2018 Jan 28

Newbie wants to compare 2 huge RDSs row by row.

The diffobj package (https://cran.r-project.org/package=diffobj) is really helpful here. It provides "diff" functions diffPrint(), diffStr(), and diffChr() to compare two objects 'x' and 'y' and provide a neat colorized summary output. Example: > iris2 <- iris > iris2[122:125,4] <- iris2[122:125,4] + 0.1 > diffobj::diffPrint(iris2, iris) < iris2 > iris @@ 121,8 / 121,8 @@ ~ Sepal.Length Sepal.Width Petal.Length Petal.Width Species 120 6.0 2.2 5.0 1.5 virginica 121 6.9...

2007 Apr 24

NA and NaN randomForest

Dear R-help, This is about randomForest's handling of NAs and NaNs in test set data. Currently, if the test set data contains an NA or NaN, then predict.randomForest will skip that row in the output. I would like to change that behavior to output an NA instead. Can this be done with flags to randomForest? If not, can some sort of wrapper be built to put the NAs back in? Thanks, Clayton

2012 Aug 28

Don't dput() data frames?

...830 with the commit message "correct the work of dput() on the row names of a data frame with compact representation." Is there a problem / better way to use the result of a hefty dput than source()ing it? This seems to work rather robustly: data(iris) source(textConnection(paste0("iris2 <- ", capture.output(dput(iris))))) identical(iris, iris2) Cheers, Michael

2018 Jan 28

Newbie wants to compare 2 huge RDSs row by row.

The anti_join function from the dplyr package might also be handy. install.packages("dplyr") library(dplyr) anti_join(x1, x2) You can get help on the different functions with ?function.name, so ?anti_join will bring you help - and examples - on the anti_join function. It might be worth testing your approach on a small subset of the data. That makes it easier for you to follow what happens

2006 May 05

converting code into a function - separating a data frame with n columns into n individual vectors

I have many very large dataframes with 20 columns each. In order to conserve memory, I wish to separate the data frame into 20 vectors, each named the name of the dataframe followed by .1, .2, .3, ..., .20. (For example purposes, one data frame is named "testa".) e.g. testa.1, testa.2, testa.3 I have written the code to do this (see below). I am trying to convert this into a function that I can reuse.

2010 Jan 15

randomForest maxnodes

Has anyone successfully used the maxnodes feature in randomForest? I tried setting it, but when it is non-NULL I always get back a forest in which all trees have size 1. I am using a continuous response (regression). Any help would be appreciated. Thanks.

2012 Dec 10

splitting dataset based on variable and re-combining

I have a dataset and I wish to use two different models to predict. Both models are SVM. The reason for two different models is based on the sex of the observation. I wish to be able to make predictions and have the results be in the same order as my original dataset. To illustrate I will use iris: # Take Iris and create a dataframe of just two Species, setosa and versicolor, shuffle them

2012 Apr 15

xyplot type="l"

Probably a stupidly simple question, but I wouldn't know how to google it: xyplot(neuro ~ time | UserID, data=data_sub) creates a proper plot. However, if I add type = "l", the lines do not go first through time1, then time2, then time3, etc., but in about 50% of all subjects the lines go through the points in a seemingly random order (e.g. from 1 to 4 to 2 to 5 to 3). The lines always start at time

2006 Sep 22

extract data from lm object and then use again?

Hi list, I want to write a general function that takes an lm object, extracts its data element, and then uses the data in another R function (e.g., glm). I searched the R-help list and found that this would do the trick for the first part: a.lm$call$data This returns a name object, but it could not be recognized as a data.frame by glm. I also tried call(as.character(a.lm$call$data)) or

2008 Aug 13

mob(party) formula question

I am trying to use mob() with my data.frame ('data.frame': 288 obs. of 81 variables; factors, numerics and ordered factors). My response is a binary variable, so I should model it with a logistic regression (family=binomial). I read in the "MOB" vignette that I could use a formula like this if I would like to have only partitioning variables apart from the response.

2004 Sep 10

Efficient Cartesian product of data.frames

Hello List, I am looking for efficient code to produce the Cartesian product of two or more data.frames. I'd like to be able to do this without resorting to looping. I have searched the FAQ, web, etc without luck. That being said, the help page for merge says that the function can produce what I'm looking for if the by vectors are of zero length. Would someone be so kind as to
