similar to: dataframe subsetting

Displaying 20 results from an estimated 20000 matches similar to: "dataframe subsetting"

2010 Jul 15
1
Very slow subsetting by name
Hi, I'm subsetting a named vector using character indices. My vector of indices (or keys) is 10x longer than the vector I'm subsetting. All my keys are distinct and only 10% of them are valid (i.e. match a name of the vector being subsetted). It is surprisingly slow: x1 <- 1:1000 names(x1) <- paste("a", x1, sep="") keys <- sample(c(names(x1),
2011 Oct 19
1
Subsetting data by eliminating redundant variables
Dear All, I am new to R and have a question that might be easy. I have a large data set with more than 250 variables, and I am reducing the number of variables with the redun function, as in the example below: n <- 100 x1 <- runif(n) x2 <- runif(n) x3 <- x1 + x2 + runif(n)/10 x4 <- x1 + x2 + x3 + runif(n)/10 x5 <- factor(sample(c('a','b','c'),n,replace=TRUE)) x6 <-
2012 Nov 24
5
subsetting - questions
Hello, I have two very basic questions (console attached): 1) Why am I getting an error message for # 5 and # 7? 2) How to fix the code? I would appreciate receiving your help. Thanks, Pradip Muhuri ###### Reproducible Example ##### N <- 100 set.seed(13) df<-data.frame(matrix(sample(c(1:10),N, replace=TRUE),ncol=5)) keep_var <- c("X1", "X2") drop_var
2004 Jul 30
1
Subsetting dataframe
Dear R-help, I have a question on subsetting a dataframe and I searched all of R-help to no avail. Please view the following example dataframe: # Example > x <- factor(rep(c(1,2,3,4),2)) > y <- c(1,4,3,2,1,2,5,1,2) > z <- c(10,12,18,21,24,32,34,12,23) > test <- data.frame(x, y, z) > test x y z 1 1 1 10 2 2 4 12 3 3 3 18 4 4 2 21 5 1 1 24 6 2 2 32 7 3 5 34 8
1999 Jun 30
1
qr and Moore-Penrose
> Date: Wed, 30 Jun 1999 11:12:24 +0200 (MET DST) > From: Torsten Hothorn <hothorn at amadeus.statistik.uni-dortmund.de> > > yesterday I had a little shock using qr (or lm). having a matrix > > X <- cbind(1,diag(3)) > y <- 1:3 > > the qr.coef returns one NA (because X is singular). So I computed the > Moore-Penrose inverse of X (just from the
2003 Jan 22
1
dataframe subsetting behaviour
Hi, I'm trying to understand a behaviour that I have encountered and can't fathom. Here's some code I will use to illustrate the behaviour: # start with some data frame "a" having some named columns a <- data.frame(a=rep(1,3),c=rep(2,3),d=rep(3,3),e=rep(4,3)) # create a subset of the original data frame, but include a # name "b" that is not present in my
2013 Aug 25
3
Rounding a matrix
Thanks, Jorge. And what solution did you arrive at? --JIV Sent from my phone. Please excuse my brevity and misspelling. On Aug 25, 2013, at 8:36 AM, Jorge Ayuso Rejas <jayusor@gmail.com> wrote: I did this in a practical exercise at university. We defined an integer optimization problem that minimized the rounding error, constrained by the row and column sums. On the 23rd of
2010 Sep 01
1
Looks like a bug in subsetting of a complicated object
I don't understand what is happening! I have a (large) object sim1, a matrix list with dim c(101,101) where each element is a 3*3 matrix. I am subsetting that with a matrix coo, of dim c(100,2), of unique indices, but the resulting object has length 99, not 100 as expected. Code reproducing the problem follows: library(RandomFields) set.seed(123) sim0 <- GaussRF(x=seq(0, 100, by=1),
2005 Jun 07
1
Help with possible bug (assigning NA value to data.frame) ?
There's something peculiar that I do not understand here. However, did you realize that the thing you are assigning into parts of `a' is NULL? Check your my.test.boot.ci.1: It's NULL. Be that as it may, I get: > a <- data.frame(matrix(1:4, nrow=2), X3=NA, X4=NA) > a X1 X2 X3 X4 1 1 3 NA NA 2 2 4 NA NA > a[a$X1 == 1,]$X3 <- NULL > a X1 X2 X3 X4 1 1
2009 Feb 25
3
indexing model names for AICc table
Hi folks, I'm trying to build a table that contains information about a series of General Linear Models in order to calculate Akaike weights and other measures to compare all models in the series. I have an issue with indexing models and extracting the information (log-likelihood, AICs, etc.) that I need to compile them into the table. Below is some sample code that illustrates my
2007 Mar 15
2
replacing all NA's in a dataframe with zeros...
I've seen how to replace the NA's in a single column within a data frame: > mydata$ncigs[is.na(mydata$ncigs)] <- 0 But this is just one column... I have thousands of columns (!) that I need to do this for, and I can't figure out a way, outside of the dreaded loop, to replace all NA's in an entire data frame (all vars) without naming each var separately. Yikes. I'm racking my
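One possible way to do what this post asks, offered as a minimal sketch rather than the answer given in the thread: index the whole data frame with the logical matrix returned by `is.na()`. The name `mydata` and the column `ncigs` come from the post; the second column and all values are made up for illustration, and the sketch assumes every column is numeric (assigning 0 into factor columns would produce warnings).

```r
# Hypothetical data: only `mydata` and `ncigs` appear in the post.
mydata <- data.frame(ncigs  = c(1, NA, 3),
                     drinks = c(NA, 2, NA))

# is.na() on a data frame returns a logical matrix of the same shape,
# which can be used to index (and assign into) every column at once.
mydata[is.na(mydata)] <- 0

print(mydata)
```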
2003 Nov 10
5
Subsetting a list of vectors
Hi, I'm trying to subset a list which contains variable length vectors. What I want to do is extract (e.g.) the 3rd item in each vector (with length >= 3). At the moment I'm using sapply(list.of.vectors, function(x) {x[3]}). The problem with this is that sapply returns a list of the same length as list.of.vectors, so I end up with a whole lot of null entries from those vectors
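A minimal sketch of one approach to the question above, not necessarily the one suggested in the thread: drop the short vectors first, then extract the third element from the rest. `list.of.vectors` is the name used in the post; its contents here are invented, and `lengths()` assumes R 3.2.0 or later.

```r
# Hypothetical list of variable-length vectors.
list.of.vectors <- list(a = 1:5, b = 1:2, c = c(10, 20, 30))

# Keep only the vectors that actually have a third element.
long.enough <- lengths(list.of.vectors) >= 3

# Extract the third element from each remaining vector.
third <- sapply(list.of.vectors[long.enough], function(x) x[3])

print(third)
```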
2011 Mar 22
1
help need on working in subset within a dataframe
Dear R-experts, Excuse me for an easy question, but I need help, sorry for that. For days I have been working with a large dataset, where operations are needed within a component of the dataset. Here is my question: I have a big dataset with x1:.....x1000 or so. What I need to do is to work on 4 consecutive variables to calculate a statistic and output it. So far so good. There are more vector
2003 May 16
2
Efficient subsetting
Hi, I'm facing this problem quite a lot, so it seems worthwhile to check to see what the most efficient solution is. I've two vectors x (values ordered) and y. I've ranges x < x0, x0 <= x < x1, x1 <= x < x2, x2 <= x < x3, x > xn and want to construct a subvector yprime of y which consists of the first/last value of y whose x values are in the range. For
2011 Mar 21
3
How to subtract a value from dataframe with condition
Hello All, I need help with my dataframe, it is big but here I am using a small table as an example. My dataframe df looks like: X1 X2 X3 1 2011-02 0.00 96.00 2 2011-02 0.00 2.11 3 2011-02 2.00 3.08 4 2011-02 0.06 2.79 5 2011-02 0.00 96.00 6 2011-02 0.00 97.00 7 2011-02 0.08 2.23 I want values in columns X2 and X3 to be checked if they are greater than
2013 Aug 23
2
Rounding a matrix
Good afternoon everyone, I would like to round the entries of a matrix m while keeping the row and column sums constant (they are known, fixed values). In the application I am working on (in which m is of course larger than in the example), decimal numbers are not allowed, so rounding must be performed. The form of m is: m <- matrix(c(3.546, 4.5345,
2012 Dec 30
4
How to multiple the vector and variables from dataframe
hi all: Here's a dataframe(dat) and a vector(z): dat: x1 x2 x3 0.2 1.2 2.5 0.5 2 5 0.8 3 6.2 > z [1] 10 100 100 I wanna do the following: 10*x1,100*x2,1000*x3 My solution is using the loop for z and dat(since the length of z is the same as ncol of dat),which is tedious. I wanna an efficient solution to do it . Any help? Many thanks! My best
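A loop-free way to scale each column by its own factor might look like the sketch below. It assumes the multipliers are 10, 100 and 1000 as stated in the sentence "10*x1,100*x2,1000*x3" (the printed `z` in the post shows 10 100 100, so this is an assumption), and it rebuilds `dat` from the values shown.

```r
# Data frame reconstructed from the values shown in the post.
dat <- data.frame(x1 = c(0.2, 0.5, 0.8),
                  x2 = c(1.2, 2, 3),
                  x3 = c(2.5, 5, 6.2))

# Assumption: one multiplier per column, following the post's prose.
z <- c(10, 100, 1000)

# Map() pairs each column of dat with the matching element of z.
scaled <- as.data.frame(Map(`*`, dat, z))

print(scaled)
```

`Map()` is used here because it keeps the column names and makes the column-to-multiplier pairing explicit.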
2009 Nov 05
2
Merge records in the same dataframe
Hi: Suppose that I have a data frame as below x1 x2 x3 ... x10 wk1 wk2 ... Wk208 (these are the column names) For each record, x1, x2, x3 ... x10 are attributes, and wk1, wk2, ..., wk208 are the sales recorded for this attribute combination. Suppose now that I want to do the following 1. Merge the data frame so that I have a new data frame grouped by values of x2 and x3 (for example).
2012 Aug 28
4
Search for locations of subsequences?
Is there a function to efficiently search for a subsequence within a vector? For example, with x <- 1:100 I'd like to search for the sequence c(49,50,51), and be told that it occurs exactly once, starting at location 49. (The items in the vectors might be numeric or character, and there might be repetitions within the search pattern or within the vector I'm searching.) Duncan
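The search described above can be sketched in base R as follows. `find_subsequence` is a hypothetical helper name, not a function from base R or from the thread, and the sketch assumes neither vector contains `NA`. It starts from every possible position and narrows the candidates one pattern element at a time.

```r
# Hypothetical helper: returns every start position at which
# `pattern` occurs as a contiguous run inside `x`.
find_subsequence <- function(x, pattern) {
  n <- length(x)
  m <- length(pattern)
  if (m == 0 || m > n) return(integer(0))

  # Every position where a full-length match could begin.
  starts <- seq_len(n - m + 1)

  # Keep only the candidates that match the k-th pattern element.
  for (k in seq_len(m)) {
    starts <- starts[x[starts + k - 1] == pattern[k]]
    if (length(starts) == 0) break
  }
  starts
}

hits <- find_subsequence(1:100, c(49, 50, 51))
print(hits)
```

Because it relies only on `==`, the same helper works for numeric and character vectors and tolerates repetitions in either argument.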
2010 Jun 22
1
applying ifelse to dataframe
The following dataframe will illustrate the problem DF<-data.frame(name=rep(1:5,each=2),x1=rep("A",10),x2=seq(10,19,by=1),x3=rep(NA,10),x4=seq(20,29,by=1)) DF$x3[5]<-50 # we have a data frame. we are interested in the columns x2,x3,x4 which contain sparse # values and many NA. DF name x1 x2 x3 x4 1 1 A 10 NA 20 2 1 A 11 NA 21 3 2 A 12 NA 22 4 2 A 13 NA