thr3ads.net - similar to: "using "factor" to eliminate unused levels without dropping other variables"

Displaying 20 results from an estimated 300 matches similar to: "using "factor" to eliminate unused levels without dropping other variables"

problems subsetting

2010 Nov 18

Dear all, I have searched the forums for an answer - and there are plenty of questions along the same line - but none of the approaches shown worked for my problem: I have a data frame that I get from a csv: summarystats<-as.data.frame(read.csv(file=f_summary)); where I have the columns Dataset, Class, Type, Category,.. Problem1: I want to find a subset of this frame, based on values in

Creating objects (data.frames) with names stored in character vector

2011 Feb 24

Hello, I'm fairly new to R. I'm a chemist, not a programmer, so please bear with me. I have a large data.frame that I want to break down (subset) into smaller data.frames for analysis. I would like to give the data.frames descriptive names which I have stored in a character vector. My original thought was that I want the subsets to show up as individual objects, but having them stored

Manage an unknown and variable number of data frames

2009 Sep 13

Hi, In the code below I create a small data.frame (dat) and then cut it into different groups using CutList. The lists in CutList allow me to choose whatever columns I want from dat and allow me to cut it into any number of groups by changing the lists. It seems to work OK, but when I'm done I have a variable number of data frames that I need to do further operations on and I don't know

Is the aggregate function the best way to do this?

2010 Feb 17

Hi, I have a dataframe 'Subset1' with a number of factor variables and 160 numerical variables. Now I want to make sums for all rows that have the same values for the different factor variables, except for the factor variables VAR1, VAR2, VAR3, which may have the same values. With the formula given below this works great, but in a situation with 15000 rows and 13

Multiple options for a package

2004 Dec 14

Hi R-devel, I am facing a situation where the number of options I would like to offer to the user is fairly large (and could easily keep growing as I code a little more - even to the point where a user should be able to implement their own options). What we have to handle options is the pair: options(par=value) and getOption("par"). I was asking myself what

custom subset method / handling columns selection as logic in '...' parameter

2007 Dec 20

Dear R-helpers & bioconductor, Sorry for cross-posting; this concerns R programming applied in a Bioconductor context. Also sorry for this long message, I am trying to be complete in my request. I am trying to write a subset method for a specific class (ExpressionSet from Bioconductor) allowing more flexible selection than the "[" method. The scheme I am thinking of is the following:

Do you keep an archive of "useful" R code? and if so - how?

2009 Nov 22

An embedded and charset-unspecified text was scrubbed... Name: not available URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20091122/430e3297/attachment-0001.pl>

COMPAR.GEE error with logistic model

2007 Dec 29

Hello, I am trying to run the APE program COMPAR.GEE with a model containing a categorical response variable and a mixture of continuous and categorical independent variables. The model runs when I have categorical (binary) response and two continuous independent variables (VAR1 and VAR2), but when I include a categorical (binary) independent variable (VAR3), I receive the following output with

[[.data frame and row names

2007 Jul 12

Hi, I'm wondering why indexing a data frame by row name doesn't work with [[. It works with [:

> sw <- swiss[1:5,1:2]
> sw["Moutier", "Agriculture"]
[1] 36.5

but not with [[:

> sw[["Moutier", "Agriculture"]]
Error in .subset2(.subset2(x, ..2), ..1) : subscript out of bounds

The problem is really with the row name (and not

trouble with ifelse statement

2012 May 16

Hello, I apologize in advance for not providing sample data; I'm very new to R and can't easily generate appropriate sample data quickly. I'm hoping someone can offer advice without it. The code below works and does what I want it to do, which is for a given row in my dataframe, where the variable "peak.cort" = max, it makes the value of another variable

subset data.frame at C level

2020 Jun 17

Hi, Hope you are well. I was wondering if there is a function at C level that is equivalent to mtcars$carb or .subset2(mtcars, "carb"). If I have the index of the column then the answer would be VECTOR_ELT(df, asInteger(idx)), but I was wondering if there is a way to do it directly from the name of the column without having to loop over the column names to find the index? Thank you Best

bug (?) in [.data.frame with matrix-like indexing

2007 Oct 14

Consider in R-2.6.0 (also R-patched from yesterday):

iris[1, c(TRUE, FALSE, FALSE, FALSE, FALSE)]
## Error in .subset2(xx, j) : recursive indexing failed at level 2
iris[1, c(FALSE, FALSE, FALSE, FALSE, TRUE)]
## Error in .subset2(xx, j) : attempt to select less than one element

i.e. matrix-like indexing on data.frames, one logically-indexed dimension with only one value TRUE in it. It is

[.data.frame speedup

2008 Jul 01

Below is a version of [.data.frame that is faster for subscripting rows of large data frames; it avoids calling duplicated(rows) if there is no need to check for duplicate row names, when:

  i is logical
  attr(x, "dup.row.names") is not NULL (S+ compatibility)
  i is numeric and negative
  i is strictly increasing

"[.data.frame" <- function (x, i, j,

True length - length(unclass(x)) - without having to call unclass()?

2018 Aug 24

Is there a low-level function that returns the length of an object 'x' - the length that for instance .subset(x) and .subset2(x) see? An obvious candidate would be to use: .length <- function(x) length(unclass(x)) However, I'm concerned that calling unclass(x) may trigger an expensive copy internally in some cases. Is that concern unfounded? Thxs, Henrik

Help interpreting output of Rprof

2007 Oct 22

Hello there, I am not quite sure how to interpret the output of Rprof (in the following the output I was staring at). I was poking around the web a little bit for documentation but without much success. I guess if I want to figure out what takes so long in my code the 2nd table $by.total and the total.pct column (pct = percent) is the most helpful. What does it mean that [ or [.data.frame is

how to get rid of 2 for-loops and optimize runtime

2009 Oct 19

Short: get rid of the loops I use and optimize runtime. Dear all, I want to calculate for each row the amount from one month ago. I use a matrix with 2100 rows and 22 columns (which is still a very small matrix; nrows of other matrices can easily be more than 100000). Table before: Year month quarter yearmonth Service ... Amount 2009 9 Q3 092009 A ...

use of ddply() within function

2012 Sep 06

Dear all, I am encountering problems with the application of ddply within the body of a self-defined function. The script is the following: moncostcarmoto <- function(costtype){ costaux_result <- data.frame() for (purp in PURPcount){for (per in PERcount){ costcarin =

Making an S3 object act like a data.frame

2006 Mar 07

"[.ggobiDataset" <- function(x, ..., drop=FALSE) {
  x <- as.data.frame(x)
  NextMethod("[", x)
}
"[[.ggobiDataset" <- function(x, ..., drop=FALSE) {
  x <- as.data.frame(x)
  NextMethod("[[", x)
}
"$.ggobiDataset" <- function(x, ..., drop=FALSE) {
  x <- as.data.frame(x)
  NextMethod("$", x)
}
> class(x)
[1]

fast subsetting of lists in lists

2010 Dec 07

Hello, my data is contained in nested lists (which seems not necessarily to be the best approach). What I need is a fast way to get subsets from the data. An example: test <- list(list(a = 1, b = 2, c = 3), list(a = 4, b = 5, c = 6), list(a = 7, b = 8, c = 9)) Now I would like to have all values in the named variables "a", that is the vector c(1, 4, 7). The best I could come up

plot separate groups with plotmeans()

2011 Nov 09

Hi, I often use plotmeans() from the gplots package to quickly visualize a pattern of change. I would like to be able to plot separate lines for different groups, but the function gives an error when a grouping variable is included in the formula argument. For instance, > require(gplots) > x <- data.frame(Score=rnorm(100), Time=rep(1:10, 10),
