Displaying 20 results from an estimated 10000 matches similar to: "split data, but ensure each level of the factor is represented"
2009 Oct 17
1
Easy way to `iris[,-"Petal.Length"]' subsetting?
Dear all
What is an easy way to drop a variable by using its name (and not its
number)? Example:
> data(iris)
> head(iris)
Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1 5.1 3.5 1.4 0.2 setosa
2 4.9 3.0 1.4 0.2 setosa
3 4.7 3.2 1.3 0.2 setosa
4 4.6 3.1
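
A minimal sketch of three common base-R ways to drop a column by name, added here for illustration and not taken from the thread. The object names `iris_a`, `iris_b`, and `iris_c` are hypothetical.

```r
# Drop a column by name rather than by position.
data(iris)

# Option 1: logical index built from the column names
iris_a <- iris[, names(iris) != "Petal.Length"]

# Option 2: setdiff() on the names, convenient when dropping several columns
iris_b <- iris[, setdiff(names(iris), "Petal.Length")]

# Option 3: subset() accepts a negated, unquoted column name
iris_c <- subset(iris, select = -Petal.Length)

names(iris_a)
```

Negating a character string directly, as in the subject line, fails because unary minus is not defined for character vectors; the options above avoid that.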
2010 Jul 29
1
where did the column names go to?
I've just tried to merge 2 data sets thinking they would only keep the common
columns, but noticed the column count was not adding up. I've then
replicated a simple example and got the same thing happening.
q1. why doesn't 'b' have a column name?
q2. when I merge, why does the new column 'y' have all values as 5.1?
Thanks in advance,
Mr. confused
> a <-
2005 Apr 27
4
How to add some data at the beginning of a dataset
Dear R-help,
First I apologize if my question is quite simple.
I need to add some data at the beginning of my dataset. How can I do that?
I have tried rbind, but I did not succeed.
0.1 3.6 0.4 0.9 rose
4.1 4.0 1.2 1.2 rose
4.4 3.2 1.9 0.5 rose
4.6 1.1 1.1 0.2 rose
For example,
2013 Apr 16
1
avoid losing data.frame attributes on cbind()
Dear all,
How should I add several variables to a data frame without losing the
attributes of the df? Consider the following:
> require(Hmisc)
> Xa <- iris
> label(Xa, self=T) <- "Some df label"
> str(Xa)
'data.frame': 150 obs. of 5 variables:
$ Sepal.Length: num 5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
$ Sepal.Width : num 3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9
2011 Aug 10
2
round() a data frame containing 'character' variables?
Dear all
It is difficult to use round(..., digits=2) on a data frame since one
has to first take care to remove non-numeric variables such as
'character' or 'factor':
> head(round(iris, 2))
Error in Math.data.frame(list(Sepal.Length = c(5.1, 4.9, 4.7, 4.6, 5, :
non-numeric variable in data frame: Species
> head(round(iris[1:4], 2))
Sepal.Length Sepal.Width Petal.Length
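
One way to round only the numeric columns while keeping the others in place is sketched below. This is an illustrative assumption, not the thread's answer; `is_num` and `iris_rounded` are hypothetical names.

```r
# Round the numeric columns of a data frame and leave the rest untouched.
data(iris)

# Flag the numeric columns (Species is a factor, so it is FALSE)
is_num <- vapply(iris, is.numeric, logical(1))

# Work on a copy and overwrite only the flagged columns
iris_rounded <- iris
iris_rounded[is_num] <- round(iris[is_num], 2)

head(iris_rounded)
```

Because only the flagged columns are reassigned, the factor column keeps its position and class.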
2012 Jul 31
1
kernlab kpca predict
Hi!
The documentation for the kernlab function kpca() mentions that new observations can be transformed by using predict. There's also an example in the documentation, but as you can see I am getting an error there (as I do with my own data). I'm not sure what's wrong at the moment. I haven't any predict functions written by myself in the workspace either. I've tested it using the matrix version and the
2008 Sep 02
2
cluster a distance(analogue)-object using agnes(cluster)
I am trying to perform a clustering using an existing dissimilarity matrix that I
calculated using distance (analogue).
I tried two different things. One of them worked and one did not, and I don't
understand why.
Here is the code:
Non-working example:
library(cluster)
library(analogue)
iris2<-as.data.frame(iris)
str(iris2)
'data.frame': 150 obs. of 5 variables:
$ Sepal.Length: num 5.1 4.9 4.7
2010 Jun 09
4
question about "mean"
Hi there:
I have a question about computing the mean values of a data.frame. Take the
iris data for example: if I have a data.frame looking like the following:
---------------------
Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1 5.1 3.5 1.4 0.2 setosa
2 4.9 3.0 1.4 0.2
2009 Sep 09
1
change character to factor in data frame
Dear all
I have a simple problem which I thought would be easy to solve, but what I tried
did not work. I want to change character variables to factors in a data
frame. Going from factor to character is easy, but I am stuck on how to
do the reverse conversion.
Here is an example:
irisf<-iris
irisf[,2]<-factor(irisf[,2]) # create second factor
str(irisf)
'data.frame': 150 obs. of 5
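
A minimal sketch of one way to convert every character column to a factor, added for illustration and not quoted from the thread. The names `irisc` and `is_chr` are hypothetical.

```r
# Convert all character columns of a data frame to factors.
irisc <- iris
irisc$Species <- as.character(irisc$Species)  # start from a character column

# Flag the character columns, then replace just those with factors
is_chr <- vapply(irisc, is.character, logical(1))
irisc[is_chr] <- lapply(irisc[is_chr], factor)

str(irisc)
```

Assigning into `irisc[is_chr]` keeps the object a data frame and leaves the numeric columns as they were.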
2006 May 31
2
a problem with the 'cor' function
Hi list,
One of my co-workers found this problem with 'cor' in his code and I confirm it too (see below). He's using R 2.2.1 under Win 2K and I'm using R 2.3.0 under Win XP.
===========================================
> R.Version()
$platform
[1] "i386-pc-mingw32"
$arch
[1] "i386"
$os
[1] "mingw32"
$system
[1] "i386, mingw32"
$status
2011 Dec 23
2
missing value where TRUE/FALSE needed
Merry Xmas to all,
I am writing a function and, curiously, it runs on one data set
but fails on another, and I cannot figure out why.
Any help much appreciated.
If I run the code below with
data <- iris[ ,1:4]
the code runs fine, but if I run it on a large dataset I get the following
error (showing data structures as the matrix is large)
> str(cluster.data)
num [1:9985, 1:811] 0 0 0 0
2008 Feb 27
2
multiple plots per page using hist and pdf
Hello,
I am puzzled by the behavior of hist() when generating multiple plots
per page on the pdf device. In the following example two pdf files
are generated. The first results in 4 plots on one pdf page as
expected. However, the second, which swaps one of the plot() calls
for hist(), results in a 4 page pdf with one plot per page.
How might I get the histogram with 3 other scatter
2010 Feb 03
1
Calculating subsets "on the fly" with ddply
Hi,
[I sent this to the plyr mailing list (late) last night, but it seems
to be lost in the moderation queue, so here's a shot to the broadeR
community]
Apologies in advance for being more verbose than necessary, but I'm
not even sure how to ask this question in the context of plyr, so ...
here goes.
As meaningless as this might be to do with the `iris` data, the spirit
of it is what
2011 Aug 16
3
Newbie question - struggling with boxplots
Hopefully I will not be flamed for this on the list, but I am starting out
with R and having some trouble with combining plots.
I am playing with the famous iris dataset (checking out example dataset in R
while reading through Introduction to datamining)
What I would like to do is create three graphs (combined boxplots) beside
each other for each of the three species (Setosa, Versicolour and
2011 Jul 28
2
not working yet: Re: lattice overlay
Hi Dieter and R community:
I tried all three of these versions with ylim as suggested; none work. I
am getting only the single (pch = 16) points, not the overlaid (pch = 3) ones, every time.
*vs 1*
require(lattice)
xyplot(Sepal.Length ~ Sepal.Width | Species , data= iris,
panel= function(x, y, subscripts) {
panel.xyplot(x, y, pch=16, col = "green4", ylim = c(0, 10))
panel.lmline(x, y, lty=4, col =
2012 Jul 23
1
duplicated() variation that goes both ways to capture all duplicates
Dear all
The trouble with the current duplicated() function in R is that it can
report duplicates while searching fromFirst _or_ fromLast, but not
both ways. Often users will want to identify and extract all the
copies of the item that has duplicates, not only the duplicates
themselves.
To take the example from the man page:
> data(iris)
> iris[duplicated(iris), ] ##duplicates while
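
A minimal sketch of one common idiom for flagging every copy of a duplicated row: combine a forward and a backward pass of `duplicated()`. This is an illustrative assumption, not the poster's proposed function; `all_dups` is a hypothetical name.

```r
# Flag all copies of duplicated rows, not only the later occurrences.
data(iris)

# Forward pass marks later copies; backward pass marks earlier copies
all_dups <- duplicated(iris) | duplicated(iris, fromLast = TRUE)

# Extract every row that has at least one identical twin
iris[all_dups, ]
```

The logical OR of the two passes marks a row if it repeats an earlier row or is repeated by a later one.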
2009 Aug 18
2
(no subject)
Dear all,
I have a problem with the function read.xls from the gdata package; see the error message below. Two examples:
First, I tried to read my data, which did not work;
Second, I tried the example code/data with the Iris data, which worked.
Any idea?
Thanks,
Lars
> path<-"I:/subProjects/bh/HPGD/"
>
> setwd(path)
>
> xls <- "Platten_Liste_090421.xls"
2007 Mar 22
2
unexpected behavior of trellis calls inside a user-defined function
I am making a battery of levelplots and wireframes for several fitted
models. I wrote a function that takes the fitted model object as the
sole argument and produces these plots. Various strange behaviors
ensued, but I have identified one very concrete issue (illustrated
below): when my figure-drawing function includes the addition of
points/lines to trellis plots, some of the
2012 Aug 01
3
Neuralnet Error
I require some help in debugging this code:
library(neuralnet)
ir<-read.table(file="iris_data.txt",header=TRUE,row.names=NULL)
ir1 <- data.frame(ir[1:100,2:6])
ir2 <- data.frame(ifelse(ir1$Species=="setosa",1,ifelse(ir1$Species=="versicolor",0,"")))
colnames(ir2)<-("Output")
ir3 <- data.frame(rbind(ir1[1:4],ir2))
2007 Dec 03
1
cor(data.frame) infelicities
In using cor(data.frame), it is annoying that you have to explicitly
filter out non-numeric columns, and when you don't, the error message
is misleading:
> cor(iris)
Error in cor(iris) : missing observations in cov/cor
In addition: Warning message:
In cor(iris) : NAs introduced by coercion
It would be nicer if stats:::cor() did the equivalent *itself* of the
following for a data.frame: