Displaying 20 results from an estimated 3000 matches similar to: "Analogue to SPSS regression commands ENTER and REMOVE in R?"
2009 Oct 13
4
replacing period with a space
Dear R-ers!
I have x as a variable in a data frame x.
x<-data.frame(x=c("aa.bb","cc.dd.ee"))
x$x<-as.character(x$x)
x
I am sorry for such a simple question - but how can I replace all
periods in x$x with spaces?
sub('.', ' ', x$x) - removes all letters to the left of each period...
Thanks a lot for your advice!
--
Dimitri Liakhovitski
Ninah.com
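A minimal sketch of one possible answer, using the data from the post. In a regular expression "." matches any character and sub() replaces only the first match, so the call above replaces the first character of each string. gsub() with fixed = TRUE treats the period literally and replaces every occurrence:

```r
x <- data.frame(x = c("aa.bb", "cc.dd.ee"), stringsAsFactors = FALSE)

# fixed = TRUE: "." is a literal period, not the regex wildcard;
# gsub() replaces all matches, sub() only the first one
x$x <- gsub(".", " ", x$x, fixed = TRUE)
x$x
```

Escaping the period, as in gsub("\\.", " ", x$x), should give the same result.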
2010 Mar 30
4
Code is too slow: mean-centering variables in a data frame by subgroup
Dear R-ers,
I have a large data frame (several thousands of rows and about 2.5
thousand columns). One variable ("group") is a grouping variable with
over 30 levels. And I have a lot of NAs.
For each variable, I need to divide each value by variable mean - by
subgroup. I have the code but it's way too slow - takes me about 1.5
hours.
Below is a data example and my code that is too
2009 Sep 04
2
transforming a badly organized data base into a list of data frames
Dear R-ers!
I have a badly organized data base in Excel. Once I read it into R it
looks like this (all variables become factors because of many spaces
and other characters in Excel):
2009 Sep 23
2
Function to check if a vector contains a given value?
Dear R'rs,
is there a function that checks if a given vector contains a certain value.
E.g., x<-c(1,2,3,4).
How can I get a TRUE or FALSE for whether x contains a 2?
--
Dimitri Liakhovitski
Ninah.com
Dimitri.Liakhovitski at ninah.com
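A short sketch of two common ways to do this in base R; the variable names has_two and has_nine are only illustrative:

```r
x <- c(1, 2, 3, 4)

# %in% returns one logical value per element on its left-hand side
has_two <- 2 %in% x

# any() collapses an element-wise comparison into a single TRUE/FALSE
has_nine <- any(x == 9)

has_two
has_nine
```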
2010 Mar 18
1
R takes long time to open
Hello.
Until today I've been using R2.9 and since today R2.10 (on a PC).
In both of them it takes about 20 sec for the prompt to appear in the R
console after I start R. And every time it says: "Previous saved work
space restored" - even if I have not saved any workspace or, in case
of R2.10 - even though I have not used it once.
In the older versions - R would start within 2-3 sec.
Is
2010 Jan 20
5
standardizing one variable by dividing each value by the mean - but within levels of a factor
Hello!
I have a data frame with a factor and a numeric variable:
x<-data.frame(factor=c("b","b","d","d","e","e"),values=c(1,2,10,20,100,200))
For each level of "factor" - I would like to divide each value of
"values" by the mean of "values" that corresponds to the level of
"factor"
In other
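A minimal sketch of one way to do this with ave() from base R, which returns the group statistic aligned with the original rows. The column name scaled is an assumption made for illustration, and na.rm = TRUE is added on the assumption that missing values should be ignored when computing each mean:

```r
x <- data.frame(factor = c("b", "b", "d", "d", "e", "e"),
                values = c(1, 2, 10, 20, 100, 200))

# ave() computes the mean within each level of x$factor and repeats it
# for every row of that level, so the division is element-wise
group_mean <- ave(x$values, x$factor,
                  FUN = function(v) mean(v, na.rm = TRUE))
x$scaled <- x$values / group_mean
x
```

The same call can be applied column by column (for example with lapply() over the numeric columns) when many variables need the same treatment.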
2009 Sep 17
2
referring to a row number and to a row condition, and to columns simultaneously
Hello, dear R-ers!
I have a data frame:
x<-data.frame(a=c(4,2,4,1,3,4),b=c(1,3,4,1,5,0),c=c(NA,2,5,3,4,NA),d=rep(NA,6),e=rep(NA,6))
x
When x$a==1, I would like to replace NAs in columns d and e with 8 and
9, respectively
When x$a != 1, I would like to replace NAs in columns d and e with 101
and 1022, respectively.
However, I only want to do it for rows 2:5 - while ignoring what's
happening in
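A sketch of one way to combine a row range with a row condition, under the assumption that rows outside 2:5 should simply be left untouched (the post is cut off at that point). The helper names rows and is_one are illustrative:

```r
x <- data.frame(a = c(4, 2, 4, 1, 3, 4), b = c(1, 3, 4, 1, 5, 0),
                c = c(NA, 2, 5, 3, 4, NA), d = rep(NA, 6), e = rep(NA, 6))

rows   <- 2:5               # only these rows are modified
is_one <- x$a[rows] == 1    # condition evaluated on the same rows

# Replace only values that are currently NA; keep anything else as it is
x$d[rows] <- ifelse(is.na(x$d[rows]), ifelse(is_one, 8, 101),  x$d[rows])
x$e[rows] <- ifelse(is.na(x$e[rows]), ifelse(is_one, 9, 1022), x$e[rows])
x
```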
2010 Mar 30
1
Efficiency question: replacing all NAs with a zero
Dear R'ers,
I have a very large data frame (over 4000 rows and 2,500 columns). My
task is very simple - I have to replace all NAs with a zero. My code
works fine on smaller data frames - but I have to deal with a huge one
and there are many NAs in each column.
R runs out of memory on me ("Reached total allocation of 1535Mb: see
help(memory.size)"). Is there any other, more efficient
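A small sketch of a column-at-a-time replacement. The poster's own code is not shown, so this is not a fix of it; the idea, stated as an assumption rather than a measured result, is that working on one column at a time avoids building a logical matrix the size of the whole data frame, as is.na(df) would. The toy data frame df stands in for the large one:

```r
df <- data.frame(a = c(1, NA, 3), b = c(NA, NA, 6))

# Process one column at a time so only a single column's worth of
# temporary objects exists at any moment
for (j in seq_along(df)) {
  col <- df[[j]]
  col[is.na(col)] <- 0
  df[[j]] <- col
}
df
```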
2010 Mar 25
1
Precision level
Hello!
I am wondering at what point does R consider a numeric value to be
equal to zero - for statements of the type x==0 and x %in% 0.
Thank you very much!
--
Dimitri Liakhovitski
Ninah.com
Dimitri.Liakhovitski at ninah.com
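A brief illustration of the distinction behind this question. Both == and %in% compare exactly, with no tolerance, so a value that is tiny but not exactly zero is not treated as zero. A tolerance-based check has to be asked for explicitly; the threshold used in near_manual is an arbitrary choice for this sketch:

```r
x <- 0.1 + 0.2 - 0.3   # not exactly zero in floating-point arithmetic

exact <- x == 0                                    # exact comparison
near  <- isTRUE(all.equal(x, 0))                   # comparison with a tolerance
near_manual <- abs(x) < sqrt(.Machine$double.eps)  # explicit threshold

exact
near
near_manual
```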
2010 Mar 26
4
Competing with SPSS and SAS: improving code that loops through rows (data manipulation)
Dear R-ers,
In my question there are no statistics involved - it's all about data
manipulation in R.
I am trying to write code that should replace what's currently being
done in SAS and SPSS. Or, at least, I am trying to show to my
colleagues R is not much worse than SAS/SPSS for the task at hand.
I've written a code that works but it's too slow. Probably because
it's
2011 Mar 30
2
summing values by week - based on daily dates - but with some dates missing
Dear everybody,
I have the following challenge. I have a data set with 2 subgroups,
dates (days), and corresponding values (see example code below).
Within each subgroup: I need to aggregate (sum) the values by week -
for weeks that start on a Monday (for example, 2008-12-29 was a
Monday).
I find it difficult because I have missing dates in my data - so that
sometimes I don't even have the
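A sketch of one approach that does not depend on every date being present: map each date to the Monday of its week, then sum by subgroup and week. The example data and the column names group, value and week_start are made up for illustration, since the poster's example code is not included here. Weeks with no observations at all will not appear in the result:

```r
df <- data.frame(
  group = c("g1", "g1", "g1", "g2", "g2"),
  date  = as.Date(c("2008-12-29", "2008-12-31", "2009-01-06",
                    "2008-12-30", "2009-01-05")),
  value = c(1, 2, 3, 4, 5)
)

# "%u" gives the weekday as 1 (Monday) to 7 (Sunday); subtracting
# (weekday - 1) days moves every date back to the Monday of its week
df$week_start <- df$date - (as.integer(format(df$date, "%u")) - 1)

weekly <- aggregate(value ~ group + week_start, data = df, FUN = sum)
weekly
```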
2010 Oct 01
3
Suppressing printing in the function
Hello!
I wrote a function that returns a data frame. Nowhere in the function
do I say print(my.data.frame), but when I run the function - the data
frame is printed on the console.
Is there any way to suppress it?
Thank you!
--
Dimitri Liakhovitski
Ninah Consulting
www.ninah.com
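A minimal sketch of two ways to avoid the printing. R prints the visible value of any top-level call that is not assigned, so either assign the result or return it with invisible(). The function name make_df is hypothetical:

```r
make_df <- function() {
  my.data.frame <- data.frame(a = 1:3)
  # invisible() returns the value without marking it for auto-printing
  invisible(my.data.frame)
}

result <- make_df()  # assignment never prints
make_df()            # nothing is printed here either
```

The value is still returned in both cases, so print(make_df()) shows it when needed.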
2011 Feb 25
6
preventing repeat in "paste"
Hello!
s<-"start"; e<-"end"
middle<-as.character(c(1,2,3))
I would like to get the following result:
"start 123 end" or "start 1 2 3 end" or "start 1,2,3 end"
How can I avoid this (undesired) result:
paste(s,middle,e,sep=" ")
Thank you!
--
Dimitri Liakhovitski
Ninah Consulting
www.ninah.com
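A short sketch of one way to get each of the three desired strings: collapse the middle vector into a single string first, then paste the pieces together. paste() recycles its arguments element-wise, which is why the call above yields three strings. The names r1, r2 and r3 are illustrative:

```r
s <- "start"; e <- "end"
middle <- as.character(c(1, 2, 3))

# collapse joins the elements of one vector into a single string
r1 <- paste(s, paste(middle, collapse = ""),  e)
r2 <- paste(s, paste(middle, collapse = " "), e)
r3 <- paste(s, paste(middle, collapse = ","), e)

r1
r2
r3
```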
2011 Feb 24
4
Running code sequentially from separate scripts (but not functions)
Hello!
I am wondering if it's possible to run - in sequence - code that is
stored in several R scripts.
For example:
Script in the file "code1.r" contains the code:
a = 3; b = 5; c = a + b
Script in the file "code2.r" contains the code:
d = 10; e = d - c
Script in the file "code3.r" contains the code:
result=e/a
I understand that I could write those 3 scripts
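A sketch of running the three scripts in order with source(). To keep the example self-contained the scripts are written to a temporary directory first, and they are evaluated in a separate environment (run_env, a name chosen for this sketch) so that objects created by one script are visible to the next:

```r
script_dir <- tempdir()
writeLines("a = 3; b = 5; c = a + b", file.path(script_dir, "code1.r"))
writeLines("d = 10; e = d - c",       file.path(script_dir, "code2.r"))
writeLines("result=e/a",              file.path(script_dir, "code3.r"))

run_env <- new.env()
for (f in c("code1.r", "code2.r", "code3.r")) {
  # each script is evaluated in the same environment, in sequence
  source(file.path(script_dir, f), local = run_env)
}
run_env$result
```

Calling source() without the local argument evaluates the scripts in the global environment, which is probably closer to what an interactive session needs.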
2010 Oct 25
3
finding the year of a date
I know that I can use as.yearmon in the package "zoo" to find the year
and the month of a date.
I can use as.yearqtr to find the year and the quarter.
But how can one find just the year of a date?
Thanks a lot!
--
Dimitri Liakhovitski
Ninah Consulting
www.ninah.com
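A minimal base R sketch that needs no extra package: format() with "%Y" extracts the four-digit year from a Date, and as.numeric() converts it to a number. The example dates are arbitrary:

```r
d <- as.Date(c("2010-10-25", "2008-12-29"))

# "%Y" is the year with century; the result of format() is character
yr <- as.numeric(format(d, "%Y"))
yr
```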
2010 Aug 04
6
applying strsplit to a whole column
I am sorry, I'd like to split my column ("names") such that all the
beginning of a string ("X..") is gone and only the rest of the text is
left.
x<-data.frame(names=c("X..aba","X..abb","X..abc","X..abd"))
x$names<-as.character(x$names)
(x)
str(x)
Can't figure out how to apply strsplit in this situation - without
using a
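A sketch of two vectorised options for the data in the post. The first drops the prefix with sub() and an anchored pattern; the second keeps strsplit() and picks the second piece of every split with sapply(). The column names short and short2 are assumptions made for illustration:

```r
x <- data.frame(names = c("X..aba", "X..abb", "X..abc", "X..abd"),
                stringsAsFactors = FALSE)

# Option 1: remove a literal "X.." at the start of each string
x$short <- sub("^X\\.\\.", "", x$names)

# Option 2: split on the literal "X.." and take the second piece;
# strsplit() is already vectorised over the whole column
x$short2 <- sapply(strsplit(x$names, "X..", fixed = TRUE), `[`, 2)
x
```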
2010 Aug 13
3
transforming dates into years
Hello!
If I have in my data frame myFrame a variable saved as a Date and want
to translate it into years, I currently do it like this using "zoo":
library(zoo)
as.year <- function(x) as.numeric(floor(as.yearmon(x)))
myFrame$year<-as.year(myFrame$date)
Is there a function that would do it directly - like "as.yearmon" -
but for years?
Thank you!
--
Dimitri
2010 Mar 09
2
looping through predictors
Dear R-ers,
I have a data frame data with predictors x1 through x5 and the
response variable y.
I am running a simple regression:
reg<-lm(y~x1, data=data)
I would like to loop through all predictors. Something like:
predictors<-c("x1","x2",... "x10")
for(i in predictors){
reg<-lm(y~i)
etc.
}
But it's not working. I am getting an error:
Error in
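A sketch of one way to make such a loop work. Inside the loop i holds a character string, so lm(y~i) looks for a variable called i rather than the column it names. Building the formula from the string with reformulate() avoids that. The simulated data and the list name fits are assumptions for illustration only:

```r
set.seed(1)
data <- data.frame(y = rnorm(20), x1 = rnorm(20),
                   x2 = rnorm(20), x3 = rnorm(20))
predictors <- c("x1", "x2", "x3")

# reformulate("x1", response = "y") builds the formula y ~ x1
fits <- lapply(predictors, function(p) {
  lm(reformulate(p, response = "y"), data = data)
})
names(fits) <- predictors

coef(fits$x2)
```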
2011 Mar 30
3
optim and optimize are not finding the right parameter
Dear all,
I have a function that predicts DV based on one predictor pred:
pred<-c(0,3000000,7800000,15600000,23400000,131200000)
DV<-c(0,500,1000,1400,1700,1900)
## I define Function 1 that computes the predicted value based on pred
values and parameters a and b:
calc_DV_pred <- function(a,b) {
DV_pred <- rep(0,(length(pred)))
for(i in 1:length(DV_pred)){
DV_pred[i] <- a *
2008 Sep 08
7
Question about multiple regression
Dear R-list,
maybe some of you could point me in the right direction:
Are you aware of any FREE Fortran or Java libraries/actual pieces of
code that are VERY efficient (time-wise) in running the regular linear
least-squares multiple regression?
More specifically, I have to run small regression models (between 1
and 15 predictors) on samples of up to N=700 but thousands and
thousands of them.
I