Displaying 20 results from an estimated 10000 matches similar to: "Select only unique rows from a data frame"
2013 Feb 01
2
expand.grid on contents of a list
Hello!
I have a list of variable length. One example is:
X=vector("list",3)
X[[1]]=1:2
X[[2]]=1:2
X[[3]]=1:2
How could I run expand.grid on the elements of X so that the results would
be the same as expand.grid(1:2,1:2,1:2)?
Thank you!
Dimitri
--
Dimitri Liakhovitski
gfk.com <http://marketfusionanalytics.com/>
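One possible approach, sketched here under the assumption that X is the unnamed list built in the question: do.call() hands each list element to expand.grid as a separate argument, so the length of the list does not need to be known in advance. The names grid_from_list and grid_direct are made up for illustration.

```r
# Rebuild the list from the question
X <- vector("list", 3)
X[[1]] <- 1:2
X[[2]] <- 1:2
X[[3]] <- 1:2

# do.call() passes every element of X to expand.grid as its own argument
grid_from_list <- do.call(expand.grid, X)

# The hand-written call, for comparison
grid_direct <- expand.grid(1:2, 1:2, 1:2)

print(grid_from_list)
```

Because X is unnamed, the columns get the default names Var1, Var2, Var3, the same as in the hand-written call.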
2013 Feb 12
3
grabbing from elements of a list without a loop
Hello!
# I have a list with several data frames:
mylist<-list(data.frame(a=1:2,b=2:3),
data.frame(a=3:4,b=5:6),data.frame(a=7:8,b=9:10))
(mylist)
# I want to grab only one specific column from each list element
neededcolumns<-c(1,2,0) # number of the column I need from each element of the list
# Below, I am doing it using a loop:
newlist<-NULL
for(i in 1:length(mylist) ) {
2013 Jan 29
4
Fastest way to compare a single value with all values in one column of a data frame
Hello!
I have a large data frame x:
x<-data.frame(item=letters[1:5],a=1:5,b=11:15) # in actuality, x has 1000 rows
x$item<-as.character(x$item)
I also have a small data frame y with just 1 row:
y<-data.frame(item="f",a=3,b=10)
y$item<-as.character(y$item)
I have to decide if y$a is larger than the smallest of all the values in
x$a. If it is, I want y to replace the whole
2013 Feb 03
1
Looping through rows of all elements of a list that has variable length
Dear R-ers,
I have a list of data frames such that the length of the list is unknown in
advance (it could be 1 or 2 or more). Each element of the list contains a
data frame.
I need to loop through all rows of the list element 1 AND (if applicable)
of the list element 2 etc. and do something at each iteration.
I am trying to figure out how to write code that is generic, i.e., loops
through the
2012 Dec 07
2
Assigning cases to groupings based on the values of several variables
Dear R-ers,
my task is simple: to assign cases to desired groupings based on the
combined values on 2 variables. I can think of 3 methods of doing it.
Method 1 seems to me pretty r-like, but it requires a lot of lines of code
- onerous.
Method 2 is a loop, so not very good - as it loops through all rows of
mydata.
Method 3 is a loop but loops through fewer lines, so it seems to me more
2011 Sep 16
2
"rounding" to a number that is LOWER than my number
Hello!
What function would allow me to "round" down, rather than up?
For example, x<-1.98
I'd like to get 1.9 - rather than 2.0.
Reason - I am creating a minimum for an axis for a plot, and I need it
to be lower than x (which, in turn, is the lowest number already).
Thank you!
--
Dimitri Liakhovitski
marketfusionanalytics.com
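A minimal sketch of one way to round down to a given number of decimals: scale the value, apply floor(), and scale back. The helper name round_down and its digits argument are hypothetical, introduced only for this illustration.

```r
# Hypothetical helper: round x down to `digits` decimal places
round_down <- function(x, digits = 1) {
  scale <- 10^digits
  floor(x * scale) / scale
}

x <- 1.98
axis_min <- round_down(x, digits = 1)
print(axis_min)
```

floor() always moves toward minus infinity, so the same helper also works for negative values; a value that already sits exactly on a one-decimal boundary is returned unchanged rather than lowered further.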
2011 Aug 05
2
summing columns with NAs present
Hello!
I have a data frame with some NAs.
test<-data.frame(a=c(1,2,NA),b=c(10,NA,20))
I need to sum up values in 2 variables. However:
test$a+test$b
produces NAs in rows that have NAs.
How could I sum up columns while ignoring NAs (the way the function
sum(..., na.rm=T) works)?
Thank you!
--
Dimitri Liakhovitski
marketfusionanalytics.com
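A possible approach, shown as a sketch on the data frame from the question: rowSums() accepts na.rm = TRUE and sums across the selected columns row by row. The name row_total is made up for illustration.

```r
test <- data.frame(a = c(1, 2, NA), b = c(10, NA, 20))

# Sum columns a and b within each row, skipping NAs
row_total <- rowSums(test[, c("a", "b")], na.rm = TRUE)
print(row_total)
```

One caveat: a row in which every selected column is NA yields 0 rather than NA, which may or may not be what is wanted.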
2011 Aug 04
2
Efficient way of creating a shifted (lagged) variable?
Hello!
I have a data set:
set.seed(123)
y<-data.frame(week=seq(as.Date("2010-01-03"), as.Date("2011-01-31"),by="week"))
y$var1<-c(1,2,3,round(rnorm(54),1))
y$var2<-c(10,20,30,round(rnorm(54),1))
# All I need is to create lagged variables for var1 and var2. I looked
around a bit and found several ways of doing it. They all seem quite
complicated - while in
2011 Aug 01
1
Identifying US holidays
Hello!
I am trying to identify which ones of a vector of dates are US
holidays. And, ideally, which is which. And I do not know (a-priori)
which dates those should be.
I have, for example:
x<-seq(as.Date("2011-01-01"),as.Date("2011-12-31"),by="day")
(x)
I think chron should help me here - but maybe I am not using it properly:
library(chron)
is.holiday(chron) #
2012 Jan 27
1
Overimposing one map in ssplot onto another
Hello!
I have 2 maps - both created in ssplot and both identical in terms of
outline. Is there any way to superimpose Map1 (which has black borders
between Canadian provinces) onto Map2 (which is also a map of Canada)?
Thanks a lot for your hints!
Dimitri
### A. Reading in Canada data at the province and then at the county level:
library(raster)
getData('ISO3') # Canada's code is
2013 Mar 11
1
glm and lm can't find weights
Hello, and apologies for not providing an example. However, my question is
more general.
I have a lengthy function. This function is using another internal function
that modifies the data frame I am reading in. This internal function is
using the command model.frame (with data and weights inside) and returns a
data frame I am using for further analyses.
However, when I try to run my function
2011 Nov 10
3
optim seems to be finding a local minimum
Hello!
I am trying to create an R optimization routine for a task that's
currently being done using Excel (lots of tables, formulas, and
Solver).
However, optim seems to be finding a local minimum.
Example data, functions, and comparison with the solution found in
Excel are below.
I am not experienced in optimizations so thanks a lot for your advice!
Dimitri
### 2 Inputs:
2011 Jul 21
4
squared "pie chart" - is there such a thing?
Hello!
It's a shot in the dark, but I'll try. If one has a total of 100
(e.g., %), and three components of the total, e.g.,
mytotal=data.frame(x=50,y=30,z=20), - one could build a pie chart with
3 sectors representing x, y, and z according to their proportions in
the total.
I am wondering if it's possible to build something very similar, but
not on a circle but in a square - such that
2011 Jul 19
1
calculating mean excluding zeros
Sorry if it's been discussed before - I can't seem to find it.
I'd like to calculate a mean while ignoring zeros.
"mean" doesn't seem to have an option for that.
Any other function/package that could do it?
Thanks for a pointer!
--
Dimitri Liakhovitski
marketfusionanalytics.com
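One way this could be done without an extra package, sketched on made-up example data: drop the zeros by logical subsetting before calling mean(). The vector x and the name mean_nonzero are assumptions for illustration only.

```r
# Made-up example data
x <- c(0, 2, 4, 0, 6)

# Keep only the non-zero entries, then average them;
# na.rm = TRUE also drops any NAs that survive the subsetting
mean_nonzero <- mean(x[x != 0], na.rm = TRUE)
print(mean_nonzero)
```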
2011 Jul 29
1
splitting a string based on the last underscore
Hello!
Hope you could help me split the strings.
I have a set of strings:
x<-c("name_a1_2.5.o","name_a2_2.53.o","name_a3_bla_1.o")
I need to extract from each string:
1. Its unique part that comes before the last "_", i.e.: "a1","a2","a3_bla".
2. The part that comes after the last "_" and before ".o"
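The two extractions above might be sketched with base R regular expressions. This assumes every string starts with the prefix "name_" and ends in ".o", as in the example; the names unique_part and after_last are hypothetical.

```r
x <- c("name_a1_2.5.o", "name_a2_2.53.o", "name_a3_bla_1.o")

# 1. Drop everything from the last "_" onward, then the assumed "name_" prefix
before_last <- sub("_[^_]*$", "", x)
unique_part <- sub("^name_", "", before_last)

# 2. Greedy ".*_" consumes everything up to the last "_";
#    then strip the trailing ".o"
after_last <- sub("\\.o$", "", sub("^.*_", "", x))

print(unique_part)
print(after_last)
```

The second part is returned as character; as.numeric(after_last) would convert it if numbers are needed.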
2011 Sep 13
1
using vif from package "car" - "aliased coefficients in the model"
Hello!
I have run a simple regression - lm and created a regression object "myreg".
I can see all the coefficients when I print(myreg).
Then I tried to run vif(myreg) from the package "car".
However, it's giving me an error: in vif.lm(regr.f) : there are
aliased coefficients in the model
Very sorry for my question: Is there any way to get the vif's for all
predictors?
2011 Aug 02
3
identifying weeks (dates) that certain days (dates) fall into
Hello!
I have dates for the beginning of each week, e.g.:
weekly<-data.frame(week=seq(as.Date("2010-04-01"),
as.Date("2011-12-26"),by="week"))
weekly$week # each week starts on a Monday
I also have a vector of dates I am interested in, e.g.:
july4<-as.Date(c("2010-07-04","2011-07-04"))
I would like to flag the weeks in my weekly$week that
2011 Aug 12
1
Which Durbin-Watson is correct? (weights involved) - using durbinWatsonTest and dwtest (packages car and lmtest)
Hello!
I have a data frame mysample (sorry for a long way of creating it
below - but I need it in this form, and it works). I regress Y onto X1
through X11 - first without weights, then with weights:
regtest1<-lm(Y~., data=mysample[-13])
regtest2<-lm(Y~., data=mysample[-13],weights=mysample$weight)
summary(regtest1)
summary(regtest2)
Then I calculate Durbin-Watson for both regressions
2009 Sep 04
2
transforming a badly organized data base into a list of data frames
Dear R-ers!
I have a badly organized data base in Excel. Once I read it into R it
looks like this (all variables become factors because of many spaces
and other characters in Excel):
2012 Jul 11
2
help with merging 2 data frames
Dear R-ers,
I feel I am close, but can't get it quite right.
Thanks a lot for your help!
Dimitri
# I have 2 data frames:
x<-data.frame(a=c("aa","aa","ab","ab","ba","ba","bb","bb"),b=c(1:2,1:2,1:2,1:2),d=c(10,20,30,40,50,60,70,80))