similar to: Select only unique rows from a data frame


2013 Feb 01
2
expand.grid on contents of a list
Hello! I have a list of variable length. One example is: X=vector("list",3); X[[1]]=1:2; X[[2]]=1:2; X[[3]]=1:2 How could I run expand.grid on the elements of X so that the results would be the same as expand.grid(1:2,1:2,1:2)? Thank you! Dimitri -- Dimitri Liakhovitski gfk.com <http://marketfusionanalytics.com/>
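One possible approach to the question above, offered as a sketch rather than the answer given in the thread: do.call() hands each list element to expand.grid as a separate argument, so the list can have any length. The names res and ref are only illustrative.

```r
# The list from the question: three elements, each holding 1:2
X <- vector("list", 3)
X[[1]] <- 1:2
X[[2]] <- 1:2
X[[3]] <- 1:2

# do.call() passes every element of X as its own argument to expand.grid
res <- do.call(expand.grid, X)

# Reference result written out by hand, for comparison
ref <- expand.grid(1:2, 1:2, 1:2)

print(dim(res))  # 8 rows, 3 columns
```

Because the number of arguments comes from length(X), the same call should work unchanged for a list with one element or with ten.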
2013 Feb 12
3
grabbing from elements of a list without a loop
Hello! # I have a list with several data frames: mylist<-list(data.frame(a=1:2,b=2:3), data.frame(a=3:4,b=5:6),data.frame(a=7:8,b=9:10)) (mylist) # I want to grab only one specific column from each list element neededcolumns<-c(1,2,0) # number of the column I need from each element of the list # Below, I am doing it using a loop: newlist<-NULL for(i in 1:length(mylist) ) {
2013 Jan 29
4
Fastest way to compare a single value with all values in one column of a data frame
Hello! I have a large data frame x: x<-data.frame(item=letters[1:5],a=1:5,b=11:15) # in actuality, x has 1000 rows x$item<-as.character(x$item) I also have a small data frame y with just 1 row: y<-data.frame(item="f",a=3,b=10) y$item<-as.character(y$item) I have to decide if y$a is larger than the smallest of all the values in x$a. If it is, I want y to replace the whole
2013 Feb 03
1
Looping through rows of all elements of a list that has variable length
Dear R-ers, I have a list of data frames such that the length of the list is unknown in advance (it could be 1 or 2 or more). Each element of the list contains a data frame. I need to loop through all rows of list element 1 AND (if applicable) of list element 2 etc. and do something at each iteration. I am trying to figure out how to write code that is generic, i.e., loops through the
2012 Dec 07
2
Assigning cases to groupings based on the values of several variables
Dear R-ers, my task is simple: to assign cases to desired groupings based on the combined values on 2 variables. I can think of 3 methods of doing it. Method 1 seems to me pretty r-like, but it requires a lot of lines of code - onerous. Method 2 is a loop, so not very good - as it loops through all rows of mydata. Method 3 is a loop but loops through fewer lines, so it seems to me more
2011 Sep 16
2
"rounding" to a number that is LOWER than my number
Hello! What function would allow me to "round" down, rather than up? For example, x<-1.98 I'd like to get 1.9 - rather than 2.0. Reason - I am creating a minimum for an axis for a plot, and I need it to be lower than x (which, in turn, is the lowest number already). Thank you! -- Dimitri Liakhovitski marketfusionanalytics.com
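A minimal sketch of one way to round down to a given number of decimals, assuming base R only: scale the value up, apply floor(), and scale back. The helper name floor_to and its digits argument are hypothetical, not something from the thread.

```r
# Hypothetical helper: round x DOWN to `digits` decimal places
floor_to <- function(x, digits = 1) {
  scale <- 10^digits
  floor(x * scale) / scale
}

x <- 1.98
axis_min <- floor_to(x)   # 1.9
print(axis_min)
```

Since the scaling is done in floating point, a value that already sits exactly on a boundary may occasionally land one step lower than expected, which is usually harmless for an axis minimum.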
2011 Aug 05
2
summing columns with NAs present
Hello! I have a data frame with some NAs. test<-data.frame(a=c(1,2,NA),b=c(10,NA,20)) I need to sum up values in 2 variables. However: test$a+test$b produces NAs in rows that have NAs. How could I sum up columns while ignoring NAs (the way the function sum(..., na.rm=T) works)? Thank you! -- Dimitri Liakhovitski marketfusionanalytics.com
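One way this could be done, sketched with base R: rowSums() accepts na.rm, so it adds the columns row by row while skipping missing values. The name total is only illustrative.

```r
# The data frame from the question
test <- data.frame(a = c(1, 2, NA), b = c(10, NA, 20))

# Row-wise sum of the two columns, ignoring NAs
total <- rowSums(test[, c("a", "b")], na.rm = TRUE)
print(total)  # 11, 2, 20
```

Note that a row in which every value is NA comes out as 0 with this approach, not NA, which may or may not be what is wanted.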
2011 Aug 04
2
Efficient way of creating a shifted (lagged) variable?
Hello! I have a data set: set.seed(123) y<-data.frame(week=seq(as.Date("2010-01-03"), as.Date("2011-01-31"),by="week")) y$var1<-c(1,2,3,round(rnorm(54),1)) y$var2<-c(10,20,30,round(rnorm(54),1)) # All I need is to create lagged variables for var1 and var2. I looked around a bit and found several ways of doing it. They all seem quite complicated - while in
2011 Aug 01
1
Identifying US holidays
Hello! I am trying to identify which ones of a vector of dates are US holidays. And, ideally, which is which. And I do not know (a-priori) which dates those should be. I have, for example: x<-seq(as.Date("2011-01-01"),as.Date("2011-12-31"),by="day") (x) I think chron should help me here - but maybe I am not using it properly: library(chron) is.holiday(chron) #
2012 Jan 27
1
Superimposing one map in spplot onto another
Hello! I have 2 maps - both created in spplot and both identical in terms of outline. Is there any way to superimpose Map1 (which has black borders between Canadian provinces) onto Map2 (which is also a map of Canada)? Thanks a lot for your hints! Dimitri ### A. Reading in Canada data at the province and then at the county level: library(raster) getData('ISO3') # Canada's code is
2013 Mar 11
1
glm and lm can't find weights
Hello, and apologies for not providing an example. However, my question is more general. I have a lengthy function. This function is using another internal function that modifies the data frame I am reading in. This internal function is using the command model.frame (with data and weights inside) and returns a data frame I am using for further analyses. However, when I try to run my function
2011 Nov 10
3
optim seems to be finding a local minimum
Hello! I am trying to create an R optimization routine for a task that's currently being done using Excel (lots of tables, formulas, and Solver). However, optim seems to be finding a local minimum. Example data, functions, and comparison with the solution found in Excel are below. I am not experienced in optimizations so thanks a lot for your advice! Dimitri ### 2 Inputs:
2011 Jul 21
4
squared "pie chart" - is there such a thing?
Hello! It's a shot in the dark, but I'll try. If one has a total of 100 (e.g., %), and three components of the total, e.g., mytotal=data.frame(x=50,y=30,z=20), - one could build a pie chart with 3 sectors representing x, y, and z according to their proportions in the total. I am wondering if it's possible to build something very similar, but not on a circle but in a square - such that
2011 Jul 19
1
calculating mean excluding zeros
Sorry if it's been discussed before - don't seem to find it. I'd like to calculate a mean while ignoring zeros. "mean" doesn't seem to have an option for that. Any other function/package that could do it? Thanks for a pointer! -- Dimitri Liakhovitski marketfusionanalytics.com
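A sketch of one simple option in base R, with no extra package assumed: subset away the zeros before calling mean(). The vector v and the helper name nonzero_mean are made up for illustration.

```r
# Hypothetical helper: mean of the non-zero (and non-missing) values
nonzero_mean <- function(x) {
  mean(x[x != 0], na.rm = TRUE)
}

# Made-up example data, not from the thread
v <- c(0, 2, 4, 0, 6, NA)
print(nonzero_mean(v))  # 4
```

The na.rm = TRUE is there because comparing NA with 0 yields NA, which would otherwise carry through the subsetting into the mean.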
2011 Jul 29
1
splitting a string based on the last underscore
Hello! Hope you could help me split the strings. I have a set of strings: x<-c("name_a1_2.5.o","name_a2_2.53.o","name_a3_bla_1.o") I need to extract from each string: 1. Its unique part that comes before the last "_", i.e.: "a1","a2","a3_bla". 2. The part that comes after the last "_" and before ".o"
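A possible sketch using base R regular expressions. It assumes, going by the examples listed in the question, that the leading "name_" prefix should be dropped from the first part; the names before and after are illustrative.

```r
# The strings from the question
x <- c("name_a1_2.5.o", "name_a2_2.53.o", "name_a3_bla_1.o")

# Part 1: text between the "name_" prefix and the LAST underscore.
# The greedy (.*) runs as far right as it can before the final "_".
before <- sub("^name_(.*)_[^_]*$", "\\1", x)

# Part 2: text after the last underscore, with the trailing ".o" removed
after <- sub("\\.o$", "", sub("^.*_", "", x))

print(before)  # "a1" "a2" "a3_bla"
print(after)   # "2.5" "2.53" "1"
```

The second part is still a character vector here; as.numeric(after) would convert it if numbers are needed.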
2011 Sep 13
1
using vif from package "car" - "aliased coefficients in the model"
Hello! I have run a simple regression - lm and created a regression object "myreg". I can see all the coefficients when I print(myreg). Then I tried to run vif(myreg) from the package "car". However, it's giving me an error: in vif.lm(regr.f) : there are aliased coefficients in the model Very sorry for my question: Is there any way to get the vif's for all predictors?
2011 Aug 02
3
identifying weeks (dates) that certain days (dates) fall into
Hello! I have dates for the beginning of each week, e.g.: weekly<-data.frame(week=seq(as.Date("2010-04-01"), as.Date("2011-12-26"),by="week")) week # each week starts on a Monday I also have a vector of dates I am interested in, e.g.: july4<-as.Date(c("2010-07-04","2011-07-04")) I would like to flag the weeks in my weekly$week that
2011 Aug 12
1
Which Durbin-Watson is correct? (weights involved) - using durbinWatsonTest and dwtest (packages car and lmtest)
Hello! I have a data frame mysample (sorry for a long way of creating it below - but I need it in this form, and it works). I regress Y onto X1 through X11 - first without weights, then with weights: regtest1<-lm(Y~., data=mysample[-13]); regtest2<-lm(Y~., data=mysample[-13], weights=mysample$weight); summary(regtest1); summary(regtest2) Then I calculate Durbin-Watson for both regressions
2009 Sep 04
2
transforming a badly organized data base into a list of data frames
Dear R-ers! I have a badly organized data base in Excel. Once I read it into R it looks like this (all variables become factors because of many spaces and other characters in Excel):
2012 Jul 11
2
help with merging 2 data frames
Dear R-ers, I feel I am close, but can't get it quite right. Thanks a lot for your help! Dimitri # I have 2 data frames: x<-data.frame(a=c("aa","aa","ab","ab","ba","ba","bb","bb"),b=c(1:2,1:2,1:2,1:2),d=c(10,20,30,40,50,60,70,80))