thr3ads.net - similar to: "split data, but ensure each level of the factor is represented"

Displaying 20 results from an estimated 10000 matches similar to: "split data, but ensure each level of the factor is represented"

Easy way to `iris[,-"Petal.Length"]' subsetting?

2009 Oct 17

Dear all What is the easy way to drop a variable by using its name (and not its number)? Example: > data(iris) > head(iris) Sepal.Length Sepal.Width Petal.Length Petal.Width Species 1 5.1 3.5 1.4 0.2 setosa 2 4.9 3.0 1.4 0.2 setosa 3 4.7 3.2 1.3 0.2 setosa 4 4.6 3.1
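One possible answer to the question above, as a minimal sketch in base R: select columns by comparing against `names()`, or use `subset()` with a negated column name. The variable names `drop_cols`, `iris_a`, and `iris_b` are illustrative only and do not come from the original thread.

```r
# Drop a column by name rather than by position.
data(iris)

drop_cols <- "Petal.Length"  # illustrative: the column(s) to remove

# Option 1: keep every column whose name is not listed in drop_cols.
iris_a <- iris[, !(names(iris) %in% drop_cols), drop = FALSE]

# Option 2: subset() accepts a negated, unquoted column name in `select`.
iris_b <- subset(iris, select = -Petal.Length)

names(iris_a)
```

Option 1 also works when the names to drop are held in a character vector computed at run time; Option 2 is more convenient for interactive use.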

where did the column names go to?

2010 Jul 29

I've just tried to merge 2 data sets thinking they would only keep the common columns, but noticed the column count was not adding up. I've then replicated a simple example and got the same thing happening. q1. why doesn't 'b' have a column name? q2. when I merge, why does the new column 'y' have all values as 5.1? Thanks in advance, Mr. confused > a <-

How to add some of data in the first place dataset

2005 Apr 27

Dear R-help, First I apologize if my question is quite simple. I need to add some data at the beginning of my dataset. How can I do that? I have tried rbind, but I did not succeed. 0.1 3.6 0.4 0.9 rose 4.1 4.0 1.2 1.2 rose 4.4 3.2 1.9 0.5 rose 4.6 1.1 1.1 0.2 rose For example,

avoid losing data.frame attributes on cbind()

2013 Apr 16

Dear all, How should I add several variables to a data frame without losing the attributes of the df? Consider the following: > require(Hmisc) > Xa <- iris > label(Xa, self=T) <- "Some df label" > str(Xa) 'data.frame': 150 obs. of 5 variables: $ Sepal.Length: num 5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ... $ Sepal.Width : num 3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9

round() a data frame containing 'character' variables?

2011 Aug 10

Dear all It is difficult to use round(..., digits=2) on a data frame since one has to first take care to remove non-numeric variables such as 'character' or 'factor': > head(round(iris, 2)) Error in Math.data.frame(list(Sepal.Length = c(5.1, 4.9, 4.7, 4.6, 5, : non-numeric variable in data frame: Species > head(round(iris[1:4], 2)) Sepal.Length Sepal.Width Petal.Length
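The workaround the excerpt describes (rounding only the numeric columns) can be wrapped in a small helper. This is a sketch under the assumption that non-numeric columns should pass through unchanged. `round_df` is a hypothetical name, not a function from base R or from the original thread.

```r
# round_df() is a hypothetical helper: round numeric columns only and
# leave factor/character columns untouched.
round_df <- function(df, digits = 2) {
  is_num <- vapply(df, is.numeric, logical(1))
  df[is_num] <- lapply(df[is_num], round, digits = digits)
  df
}

data(iris)
head(round_df(iris, 1))
```

Detecting the numeric columns with `vapply(..., is.numeric, ...)` avoids hard-coding column positions such as `iris[1:4]`.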

kernlab kpca predict

2012 Jul 31

Hi! The documentation for the kernlab function kpca() mentions that new observations can be transformed by using predict. There's also an example in the documentation, but as you can see I am getting an error there (as I do with my own data). I'm not sure what's wrong at the moment. I don't have any predict functions written by myself in the workspace either. I've tested it using the matrix version and the

cluster a distance(analogue)-object using agnes(cluster)

2008 Sep 02

I am trying to perform a clustering using an existing dissimilarity matrix that I calculated using distance (analogue). I tried two different things. One of them worked and one did not, and I don't understand why. Here is the code: not working example library(cluster) library(analogue) iris2<-as.data.frame(iris) str(iris2) 'data.frame': 150 obs. of 5 variables: $ Sepal.Length: num 5.1 4.9 4.7

question about "mean"

2010 Jun 09

Hi there: I have a question about generating mean value of a data.frame. Take iris data for example, if I have a data.frame looking like the following: --------------------- Sepal.Length Sepal.Width Petal.Length Petal.Width Species 1 5.1 3.5 1.4 0.2 setosa 2 4.9 3.0 1.4 0.2

change character to factor in data frame

2009 Sep 09

Dear all, I have a simple problem which I thought would be easy to solve, but what I tried did not work. I want to change character variables to factors in a data frame. It goes easily from factor to character, but I am stuck on how to do the reverse conversion. Here is an example irisf<-iris irisf[,2]<-factor(irisf[,2]) # create second factor str(irisf) 'data.frame': 150 obs. of 5

a problem 'cor' function

2006 May 31

Hi list, One of my co-workers found this problem with 'cor' in his code and I confirm it too (see below). He's using R 2.2.1 under Win 2K and I'm using R 2.3.0 under Win XP. =========================================== > R.Version() $platform [1] "i386-pc-mingw32" $arch [1] "i386" $os [1] "mingw32" $system [1] "i386, mingw32" $status

missing value where TRUE/FALSE needed

2011 Dec 23

Merry Xmas to all, I am writing a function and curiously it sometimes runs on one data set and fails on another, and I cannot figure out why. Any help much appreciated. If I run the code below with data <- iris[ ,1:4] the code runs fine, but if I run it on a large dataset I get the following error (showing data structures as the matrix is large) > str(cluster.data) num [1:9985, 1:811] 0 0 0 0

multiple plots per page using hist and pdf

2008 Feb 27

Hello, I am puzzled by the behavior of hist() when generating multiple plots per page on the pdf device. In the following example two pdf files are generated. The first results in 4 plots on one pdf page as expected. However, the second, which swaps one of the plot() calls for hist(), results in a 4 page pdf with one plot per page. How might I get the histogram with 3 other scatter

Calculating subsets "on the fly" with ddply

2010 Feb 03

Hi, [I sent this to the plyr mailing list (late) last night, but it seems to be lost in the moderation queue, so here's a shot to the broadeR community] Apologies in advance for being more verbose than necessary, but I'm not even sure how to ask this question in the context of plyr, so ... here goes. As meaningless as this might be to do with the `iris` data, the spirit of it is what

Newbie question - struggling with boxplots

2011 Aug 16

Hopefully I will not be flamed for this on the list, but I am starting out with R and having some trouble with combining plots. I am playing with the famous iris dataset (checking out the example datasets in R while reading through Introduction to datamining). What I would like to do is create three graphs (combined boxplots) beside each other for each of the three species (Setosa, Versicolour and

not working yet: Re: lattice overlay

2011 Jul 28

Hi Dieter and R community: I tried all three of these versions with ylim as suggested, and none work: I am getting only the single (pch = 16) points, not the overlaid (pch =3) ones, every time. *vs 1* require(lattice) xyplot(Sepal.Length ~ Sepal.Width | Species , data= iris, panel= function(x, y, subscripts) { panel.xyplot(x, y, pch=16, col = "green4", ylim = c(0, 10)) panel.lmline(x, y, lty=4, col =

duplicated() variation that goes both ways to capture all duplicates

2012 Jul 23

Dear all, The trouble with the current duplicated() function is that it can report duplicates while searching fromFirst _or_ fromLast, but not both ways. Often users will want to identify and extract all the copies of the item that has duplicates, not only the duplicates themselves. To take the example from the man page: > data(iris) > iris[duplicated(iris), ] ##duplicates while
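The poster's own code is cut off in this excerpt. Independently of it, one common way to flag every copy of a duplicated item is to combine a forward and a backward pass of `duplicated()`. The sketch below assumes that approach. `all_duplicated` is a hypothetical helper name, not a base R function.

```r
# all_duplicated() is a hypothetical helper: TRUE for every element (or
# data frame row) that occurs more than once, including the first occurrence.
all_duplicated <- function(x) {
  duplicated(x) | duplicated(x, fromLast = TRUE)
}

data(iris)
iris[all_duplicated(iris), ]  # all rows that have at least one exact copy
```

The forward pass marks later copies and the `fromLast = TRUE` pass marks earlier ones, so their union covers every member of each duplicated group.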

(no subject)

2009 Aug 18

Dear all, I have a problem with the function read.xls from the gdata package; see the error message below. Two examples: First, I try to read my data, which does not work. Secondly, I tried the example code/data with the Iris data, which worked. Any idea? Thanks, Lars > path<-"I:/subProjects/bh/HPGD/" > > setwd(path) > > xls <- "Platten_Liste_090421.xls"

unexpected behavior of trellis calls inside a user-defined function

2007 Mar 22

I am making a battery of levelplots and wireframes for several fitted models. I wrote a function that takes the fitted model object as the sole argument and produces these plots. Various strange behavior ensued, but I have identified one very concrete issue (illustrated below): when my figure-drawing function includes the addition of points/lines to trellis plots, some of the

Neuralnet Error

2012 Aug 01

I require some help in debugging this code library(neuralnet) ir<-read.table(file="iris_data.txt",header=TRUE,row.names=NULL) ir1 <- data.frame(ir[1:100,2:6]) ir2 <- data.frame(ifelse(ir1$Species=="setosa",1,ifelse(ir1$Species=="versicolor",0,""))) colnames(ir2)<-("Output") ir3 <- data.frame(rbind(ir1[1:4],ir2))

cor(data.frame) infelicities

2007 Dec 03

In using cor(data.frame), it is annoying that you have to explicitly filter out non-numeric columns, and when you don't, the error message is misleading: > cor(iris) Error in cor(iris) : missing observations in cov/cor In addition: Warning message: In cor(iris) : NAs introduced by coercion It would be nicer if stats:::cor() did the equivalent *itself* of the following for a data.frame:
