similar to: using "factor" to eliminate unused levels without dropping other variables

Displaying 20 results from an estimated 300 matches similar to: "using "factor" to eliminate unused levels without dropping other variables"

2010 Nov 18
3
problems subsetting
Dear all, I have searched the forums for an answer - and there are plenty of questions along the same lines - but none of the approaches shown worked for my problem: I have a data frame that I get from a csv: summarystats<-as.data.frame(read.csv(file=f_summary)); where I have the columns Dataset, Class, Type, Category,.. Problem1: I want to find a subset of this frame, based on values in
2011 Feb 24
1
Creating objects (data.frames) with names stored in character vector
Hello, I'm fairly new to R. I'm a chemist, not a programmer, so please bear with me. I have a large data.frame that I want to break down (subset) into smaller data.frames for analysis. I would like to give the data.frames descriptive names which I have stored in a character vector. My original thought was that I want the subsets to show up as individual objects, but having them stored
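One common way to approach a request like the one above is to keep the subsets in a named list rather than as loose objects. The following is a minimal sketch, not the thread's answer: the data frame `dat`, its column `group`, and the vector `subset_names` are all hypothetical, since the original data is not shown.

```r
# Hypothetical example data; the original post's data frame is not shown.
dat <- data.frame(
  group = c("a", "a", "b", "c"),
  value = c(1.5, 2.0, 3.5, 4.0),
  stringsAsFactors = FALSE
)

# Hypothetical descriptive names, one per group, in sorted group order.
subset_names <- c("acids", "bases", "controls")

# split() returns one data frame per group, ordered by the sorted group values.
subsets <- split(dat, dat$group)
names(subsets) <- subset_names

# Access a subset by its descriptive name.
subsets[["bases"]]

# If separate objects are really wanted, list2env() creates one per list element
# in the current environment.
invisible(list2env(subsets, envir = environment()))
```

Keeping the pieces in one list makes it straightforward to loop over them later with `lapply()`, which is usually harder when each subset lives in its own variable.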
2009 Sep 13
1
Manage an unknown and variable number of data frames
Hi, In the code below I create a small data.frame (dat) and then cut it into different groups using CutList. The lists in CutList allow me to choose whatever columns I want from dat and allow me to cut it into any number of groups by changing the lists. It seems to work OK but when I'm done I have a variable number of data frames that I need to do further operations on and I don't know
2010 Feb 17
2
Is the aggregate function the best way to do this?
Hi, I have a data frame 'Subset1' with a number of factor variables and 160 numerical variables. Now I want to make sums for all rows that have the same values for the different factor variables, except for the factor variables VAR1, VAR2, VAR3, which may have the same values. With the formula given below this works great, but in a situation with 15000 rows and 13
2004 Dec 14
1
Multiple options for a package
Hi R-devel, I am facing a situation where the number of options I would like to offer to the user is fairly large (and could easily grow as I code up a little more - even to the point where a user should be able to implement his own options). What we have to handle options is the pair: options(par=value) and getOption("par"). I was asking myself what
2007 Dec 20
1
custom subset method / handling columns selection as logic in '...' parameter
Dear R-helpers & Bioconductor, Sorry for cross-posting; this concerns R programming applied in a Bioconductor context. Also sorry for this long message, I try to be complete in my request. I am trying to write a subset method for a specific class (ExpressionSet from Bioconductor) allowing more flexible selection than the "[" method. The scheme I am thinking of is the following:
2009 Nov 22
4
Do you keep an archive of "useful" R code? and if so - how?
An embedded and charset-unspecified text was scrubbed... Name: not available URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20091122/430e3297/attachment-0001.pl>
2007 Dec 29
1
COMPAR.GEE error with logistic model
Hello, I am trying to run the APE program COMPAR.GEE with a model containing a categorical response variable and a mixture of continuous and categorical independent variables. The model runs when I have categorical (binary) response and two continuous independent variables (VAR1 and VAR2), but when I include a categorical (binary) independent variable (VAR3), I receive the following output with
2007 Jul 12
1
[[.data frame and row names
Hi, I'm wondering why indexing a data frame by row name doesn't work with [[. It works with [: > sw <- swiss[1:5,1:2] > sw["Moutier", "Agriculture"] [1] 36.5 but not with [[: > sw[["Moutier", "Agriculture"]] Error in .subset2(.subset2(x, ..2), ..1) : subscript out of bounds The problem is really with the row name (and not
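The two forms contrasted above can be reproduced with the built-in `swiss` data. The sketch below also shows one possible workaround that avoids passing a row name to `[[`: extract the column first, then look up the row position with `match()`. This is an illustration only, not necessarily what the thread's replies suggested, and the behavior of `[[` with row names may differ between R versions.

```r
# Same subset of the built-in swiss data as in the post.
sw <- swiss[1:5, 1:2]

# Matrix-style indexing with "[" accepts a row name and a column name.
by_bracket <- sw["Moutier", "Agriculture"]

# Workaround sketch: pull the column out with "[[", then index it by the
# position of the row name.
agri <- sw[["Agriculture"]]
by_match <- agri[match("Moutier", rownames(sw))]

by_bracket
by_match
```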
2012 May 16
2
trouble with ifelse statement
Hello, I apologize in advance for not providing sample data, I'm very new to R and can't easily generate appropriate sample data quickly. I'm hoping someone can offer advice without it. This code below works and does what I want it to do, which is for a given row in my dataframe, where the variable "peak.cort" = max, it makes the value of another variable
2020 Jun 17
2
subset data.frame at C level
Hi, Hope you are well. I was wondering if there is a function at C level that is equivalent to mtcars$carb or .subset2(mtcars, "carb"). If I have the index of the column then the answer would be VECTOR_ELT(df, asInteger(idx)) but I was wondering if there is a way to do it directly from the name of the column without having to loop over column names to find the index? Thank you Best
2007 Oct 14
1
bug (?) in [.data.frame with matrix-like indexing
Consider in R-2.6.0 (also R-patched from yesterday): iris[1, c(TRUE, FALSE, FALSE, FALSE, FALSE)] ## Error in .subset2(xx, j) : recursive indexing failed at level 2 iris[1, c(FALSE, FALSE, FALSE, FALSE, TRUE)] ## Error in .subset2(xx, j) : attempt to select less than one element i.e. matrix-like indexing on data.frames, one logically-indexed dimension with only one value TRUE in it. It is
2008 Jul 01
1
[.data.frame speedup
Below is a version of [.data.frame that is faster for subscripting rows of large data frames; it avoids calling duplicated(rows) if there is no need to check for duplicate row names, when: i is logical; attr(x, "dup.row.names") is not NULL (S+ compatibility); i is numeric and negative; i is strictly increasing. "[.data.frame" <- function (x, i, j,
2018 Aug 24
5
True length - length(unclass(x)) - without having to call unclass()?
Is there a low-level function that returns the length of an object 'x' - the length that for instance .subset(x) and .subset2(x) see? An obvious candidate would be to use: .length <- function(x) length(unclass(x)) However, I'm concerned that calling unclass(x) may trigger an expensive copy internally in some cases. Is that concern unfounded? Thxs, Henrik
2007 Oct 22
2
Help interpreting output of Rprof
Hello there, I am not quite sure how to interpret the output of Rprof (the output I was staring at follows). I was poking around the web a little bit for documentation but without much success. I guess if I want to figure out what takes so long in my code, the 2nd table $by.total and the total.pct column (pct = percent) are the most helpful. What does it mean that [ or [.data.frame is
2009 Oct 19
2
how to get rid of 2 for-loops and optimize runtime
Short: get rid of the loops I use and optimize runtime Dear all, I want to calculate for each row the amount of the month ago. I use a matrix with 2100 rows and 22 columns (which is still a very small matrix; nrows of other matrices can easily be more than 100000). Table before Year month quarter yearmonth Service ... Amount 2009 9 Q3 092009 A ...
2012 Sep 06
1
use of ddply() within function
Dear all, I am encountering problems with the application of ddply within the body of a self-defined function. The script is the following: moncostcarmoto <- function(costtype){ costaux_result <- data.frame() for (purp in PURPcount){for (per in PERcount){ costcarin =
2006 Mar 07
3
Making an S3 object act like a data.frame
"[.ggobiDataset" <- function(x, ..., drop=FALSE) { x <- as.data.frame(x) NextMethod("[", x) } "[[.ggobiDataset" <- function(x, ..., drop=FALSE) { x <- as.data.frame(x) NextMethod("[[", x) } "$.ggobiDataset" <- function(x, ..., drop=FALSE) { x <- as.data.frame(x) NextMethod("$", x) } > class(x) [1]
2010 Dec 07
5
fast subsetting of lists in lists
Hello, my data is contained in nested lists (which seems not necessarily to be the best approach). What I need is a fast way to get subsets from the data. An example: test <- list(list(a = 1, b = 2, c = 3), list(a = 4, b = 5, c = 6), list(a = 7, b = 8, c = 9)) Now I would like to have all values in the named variables "a", that is the vector c(1, 4, 7). The best I could come up
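The extraction described above can be sketched with `vapply()`, which applies a function to each inner list and checks that every result is a single number. This is one possible approach and says nothing about what the poster had already tried or whether it is the fastest option; the name `a_values` is introduced here for illustration.

```r
# Nested list from the post.
test <- list(list(a = 1, b = 2, c = 3),
             list(a = 4, b = 5, c = 6),
             list(a = 7, b = 8, c = 9))

# Pull element "a" out of every inner list; numeric(1) declares that each
# result must be a length-one numeric, so malformed entries raise an error.
a_values <- vapply(test, function(el) el[["a"]], numeric(1))

a_values  # the vector c(1, 4, 7) the post asks for
```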
2011 Nov 09
2
plot separate groups with plotmeans()
Hi, I often use plotmeans() from the gplots package to quickly visualize a pattern of change. I would like to be able to plot separate lines for different groups, but the function gives an error when a grouping variable is included in the formula argument. For instance, > require(gplots) > x <- data.frame(Score=rnorm(100), Time=rep(1:10, 10),