similar to: split data, but ensure each level of the factor is represented


2009 Oct 17
1
Easy way to `iris[,-"Petal.Length"]' subsetting?
Dear all What is the easy way to drop a variable by using its name (and not its number)? Example: > data(iris) > head(iris) Sepal.Length Sepal.Width Petal.Length Petal.Width Species 1 5.1 3.5 1.4 0.2 setosa 2 4.9 3.0 1.4 0.2 setosa 3 4.7 3.2 1.3 0.2 setosa 4 4.6 3.1
2012 Jul 31
1
kernlab kpca predict
Hi! The kernlab function kpca() mentions that new observations can be transformed by using predict. There's also an example in the documentation, but as you can see I am getting an error there (as I do with my own data). I'm not sure what's wrong at the moment. I haven't any predict functions written by myself in the workspace either. I've tested it using the matrix version and the
2010 Jul 29
1
where did the column names go to?
I've just tried to merge 2 data sets thinking they would only keep the common columns, but noticed the column count was not adding up. I've then replicated a simple example and got the same thing happening. q1. why doesn't 'b' have a column name? q2. when I merge, why does the new column 'y' have all values as 5.1? Thanks in advance, Mr. confused > a <-
2005 Apr 27
4
How to add some of data in the first place dataset
Dear R-help, First, I apologize if my question is quite simple. I need to add some data at the beginning of my dataset. How can I do that? I have tried rbind, but I did not succeed. 0.1 3.6 0.4 0.9 rose 4.1 4.0 1.2 1.2 rose 4.4 3.2 1.9 0.5 rose 4.6 1.1 1.1 0.2 rose For example,
2013 Apr 16
1
avoid losing data.frame attributes on cbind()
Dear all, How should I add several variables to a data frame without losing the attributes of the df? Consider the following: > require(Hmisc) > Xa <- iris > label(Xa, self=T) <- "Some df label" > str(Xa) 'data.frame': 150 obs. of 5 variables: $ Sepal.Length: num 5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ... $ Sepal.Width : num 3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9
2011 Aug 10
2
round() a data frame containing 'character' variables?
Dear all It is difficult to use round(..., digits=2) on a data frame since one has to first take care to remove non-numeric variables such as 'character' or 'factor': > head(round(iris, 2)) Error in Math.data.frame(list(Sepal.Length = c(5.1, 4.9, 4.7, 4.6, 5, : non-numeric variable in data frame: Species > head(round(iris[1:4], 2)) Sepal.Length Sepal.Width Petal.Length
2008 Sep 02
2
cluster a distance(analogue)-object using agnes(cluster)
I am trying to perform a clustering using an existing dissimilarity matrix that I calculated using distance (analogue). I tried two different things. One of them worked and one did not, and I don't understand why. Here is the code: not working example library(cluster) library(analogue) iris2<-as.data.frame(iris) str(iris2) 'data.frame': 150 obs. of 5 variables: $ Sepal.Length: num 5.1 4.9 4.7
2009 Sep 09
1
change character to factor in data frame
Dear all, I have a simple problem which I thought would be easy to solve, but what I tried did not work. I want to change character variables to factor in a data frame. It goes easily from factor to character, but I am stuck on how to do the backward conversion. Here is an example: irisf<-iris irisf[,2]<-factor(irisf[,2]) # create second factor str(irisf) 'data.frame': 150 obs. of 5
2010 Jun 09
4
question about "mean"
Hi there: I have a question about generating mean value of a data.frame. Take iris data for example, if I have a data.frame looking like the following: --------------------- Sepal.Length Sepal.Width Petal.Length Petal.Width Species 1 5.1 3.5 1.4 0.2 setosa 2 4.9 3.0 1.4 0.2
2010 Feb 03
1
Calculating subsets "on the fly" with ddply
Hi, [I sent this to the plyr mailing list (late) last night, but it seems to be lost in the moderation queue, so here's a shot to the broadeR community] Apologies in advance for being more verbose than necessary, but I'm not even sure how to ask this question in the context of plyr, so ... here goes. As meaningless as this might be to do with the `iris` data, the spirit of it is what
2011 Dec 23
2
missing value where TRUE/FALSE needed
Merry Xmas to all, I am writing a function and, curiously, it sometimes runs on one data set and fails on another, and I cannot figure out why. Any help much appreciated. If I run the code below with data <- iris[ ,1:4] the code runs fine, but if I run it on a large dataset I get the following error (showing data structures as the matrix is large) > str(cluster.data) num [1:9985, 1:811] 0 0 0 0
2006 May 31
2
a problem 'cor' function
Hi list, One of my co-workers found this problem with 'cor' in his code and I confirm it too (see below). He's using R 2.2.1 under Win 2K and I'm using R 2.3.0 under Win XP. =========================================== > R.Version() $platform [1] "i386-pc-mingw32" $arch [1] "i386" $os [1] "mingw32" $system [1] "i386, mingw32" $status
2008 Feb 27
2
multiple plots per page using hist and pdf
Hello, I am puzzled by the behavior of hist() when generating multiple plots per page on the pdf device. In the following example two pdf files are generated. The first results in 4 plots on one pdf page as expected. However, the second, which swaps one of the plot() calls for hist(), results in a 4 page pdf with one plot per page. How might I get the histogram with 3 other scatter
2011 Aug 16
3
Newbie question - struggling with boxplots
Hopefully I will not be flamed for this on the list, but I am starting out with R and having some trouble with combining plots. I am playing with the famous iris dataset (checking out example datasets in R while reading through Introduction to datamining). What I would like to do is create three graphs (combined boxplots) beside each other, one for each of the three species (Setosa, Versicolour and
2011 Jul 28
2
not working yet: Re: lattice overlay
Hi Dieter and R community: I tried all three of these versions with ylim as suggested; none work. I am getting only the single (pch = 16), not the overlaid (pch = 3), every time. *vs 1* require(lattice) xyplot(Sepal.Length ~ Sepal.Width | Species , data= iris, panel= function(x, y, subscripts) { panel.xyplot(x, y, pch=16, col = "green4", ylim = c(0, 10)) panel.lmline(x, y, lty=4, col =
2012 Jul 23
1
duplicated() variation that goes both ways to capture all duplicates
Dear all, The trouble with the current duplicated() function is that it can report duplicates while searching fromFirst _or_ fromLast, but not both ways. Often users will want to identify and extract all the copies of the item that has duplicates, not only the duplicates themselves. To take the example from the man page: > data(iris) > iris[duplicated(iris), ] ##duplicates while
2009 Aug 18
2
(no subject)
Dear all, I have a problem with the function read.xls from the gdata package; see the error message below. Two examples: first, I tried to read my data, which does not work; second, I tried the example code/data with the iris data, which worked. Any idea? Thanks, Lars > path<-"I:/subProjects/bh/HPGD/" > > setwd(path) > > xls <- "Platten_Liste_090421.xls"
2007 Mar 22
2
unexpected behavior of trellis calls inside a user-defined function
I am making a battery of levelplots and wireframes for several fitted models. I wrote a function that takes the fitted model object as the sole argument and produces these plots. Various strange behavior ensued, but I have identified one very concrete issue (illustrated below): when my figure-drawing function includes the addition of points/lines to trellis plots, some of the
2017 Sep 15
3
Regarding Principal Component Analysis result Interpretation
Dear Sir/Madam, I am trying to do PCA analysis with "iris" dataset and trying to interpret the result. Dataset contains 150 obs of 5 variables Sepal.Length Sepal.Width Petal.Length Petal.Width Species 1 5.1 3.5 1.4 0.2 setosa 2 4.9 3.0 1.4 0.2 setosa
2007 Dec 03
1
cor(data.frame) infelicities
In using cor(data.frame), it is annoying that you have to explicitly filter out non-numeric columns, and when you don't, the error message is misleading: > cor(iris) Error in cor(iris) : missing observations in cov/cor In addition: Warning message: In cor(iris) : NAs introduced by coercion It would be nicer if stats:::cor() did the equivalent *itself* of the following for a data.frame: