search for: dimitrijoe

Displaying 20 results from an estimated 29 matches for "dimitrijoe".

2005 Oct 20
5
splitting an integer
Hi there, From the vector X of integers, X = c(11999, 122000, 81997) I would like to make these two vectors: Z= c(1999, 2000, 1997) Y =c(1 , 12 , 8) That is, each entry of vector Z receives the last four digits of each entry of X, and Y receives "the rest". Any suggestions? Thanks in advance, Dimitri
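One possible approach to the question above, sketched under the assumption that every entry of X is a non-negative whole number: the modulo operator `%%` keeps the last four digits and integer division `%/%` keeps the rest.

```r
# Example vector taken from the post
X <- c(11999, 122000, 81997)

# Remainder after dividing by 10^4: the last four digits
Z <- X %% 10000

# Integer quotient by 10^4: everything before the last four digits
Y <- X %/% 10000

print(Z)
print(Y)
```

Both operators are vectorized, so no loop over the entries is needed.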
2006 Apr 29
1
splitting and saving a large dataframe
Hi, I searched for this in the mailing list, but found no results. I have a large dataframe ( dim(mydata)= 1297059 16, object.size(mydata)= 145280576 ) , and I want to perform some calculations which can be done by a factor's levels, say, mydata$myfactor. So what I want is to split this dataframe into nlevels(mydata$myfactor) = 80 levels. But I must do this efficiently, that is, I
2006 Jul 05
1
creating a data frame from a list
Dear all, I have a list with three (named) numeric vectors: > lst = list(a=c(A=1,B=8) , b=c(A=2,B=3,C=0), c=c(B=2,D=0) ) > lst $a A B 1 8 $b A B C 2 3 0 $c B D 2 0 Now, I'd love to use this list to create the following data frame: > dtf = data.frame(a=c(A=1,B=8,C=NA,D=NA), + b=c(A=2,B=3,C=0,D=NA), + c=c(A=NA,B=2,C=NA,D=0) ) > dtf a b
2006 Jan 26
1
efficiency with "%*%"
Hi, x and y are (numeric) vectors. I wonder if one of the following is more efficient than the other: x%*%y or sum(x*y) ? Thanks, Dimitri Szerman
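A rough way to explore the question above is to confirm that the two expressions agree and then time them on one's own machine. This is only a sketch: the vector length, the seed and the number of repetitions are arbitrary assumptions, and timings depend on the R build and the BLAS library in use.

```r
set.seed(1)                      # arbitrary seed, for reproducibility only
x <- rnorm(1e5)
y <- rnorm(1e5)

# x %*% y returns a 1x1 matrix; drop() turns it into a plain number
inner_matmul <- drop(x %*% y)
inner_sum    <- sum(x * y)

# The two results should agree up to floating-point tolerance
print(all.equal(inner_matmul, inner_sum))

# Time repeated evaluations of each expression
t_matmul <- system.time(for (i in 1:200) x %*% y)
t_sum    <- system.time(for (i in 1:200) sum(x * y))
print(t_matmul)
print(t_sum)
```

Note that `sum(x * y)` allocates a temporary vector for `x * y`, while `x %*% y` returns a matrix rather than a scalar, so the choice may also depend on what the result is used for.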
2006 Jul 12
1
help in vectorization
Hi, I have two data frames. One is like > dtf = data.frame(y=c(rep(2002,4), rep(2003,5)), + m=c(9:12, 1:5), + def=c(.74,.75,.76,.78,.80,.82,.85,.85,.87)) and the other dtf2 = data.frame(y=rep( c(2002,2003),20), m=c(trunc(runif(20,1,5)),trunc(runif(20,9,12))), inc=rnorm(40,mean=300,sd=150) ) What I want is to divide
2005 Oct 25
2
Inf in regressions
Hi, Suppose I wish to run lm( y ~ x + z + log(w) ) where w assumes non-negative values. A problem arises when w=0, as log(0) = -Inf, and R doesn't accept that (as it "accepts" NA). Is there a way to tell R to do with -Inf the same it does with NA, i.e., to ignore it? ( Otherwise I have to do something like w[w==0] <- NA which doesn't hurt, but might be a bit
2005 Jun 24
2
Gini with frequencies
Hi there, I am trying to compute Gini coefficients for vectors containing income classes. The data I possess look like this: yit <- c(135, 164, 234, 369) piit <- c(367, 884, 341, 74 ) where yit is the vector of income classes, and piit is the vector of associated frequencies. (This data is from Rustichini, Ichino and Checchi (Journal of Public Economics, 1999) ). In the ineq package, Gini( )
2018 Jun 01
1
rasterize SpatialPolygon object using a RasterBrick object
I am trying to rasterize a SpatialPolygon object by a RasterBrick object. The documentation of the raster::rasterize function explicitly says this is allowed. Here's what I am doing # load the raster package library("raster") # create a raster brick object using the example from the brick function documentation b <- brick(system.file("external/rlogo.grd",
2009 Nov 15
1
R crashing
Hello, This is what I am trying to do: I wrote a little function that takes addresses (coordinates) as input, and returns the road distance between every two points using Google Maps. Catch is, there are 2000 addresses, so I have to get around 2x10^6 addresses. On my first go, this is what I did: ######################################### getRoadDist = function(X,complete=F){ # X must be a
2007 Apr 06
2
lm() intercept at the end, rather than at the beginning
Hi, I wonder if someone has already figured out a way of making summary(mylm) # where mylm is an object of the class lm() to print the "(Intercept)" at the last line, rather than the first line of the output. I don't know about, say, biostatistics, but in economics the intercept is usually the least interesting of the parameters of a regression model. That's why, say, Stata
2006 Apr 26
1
help using tapply
Dear R-mates, # Here's what I am trying to do. I have a dataset like this: id = c(rep(1,8), rep(2,8)) dur1 <- c( 17,18,19,18,24,19,24,24 ) est1 <- c( rep(1,5), rep(2,3) ) dur2 <- c(1,1,3,4,8,12,13,14) est2 <- rep(1,8) mydata = data.frame(id, estat=c(est1, est2), durat=c(dur1, dur2)) # I want to have this: id = c(rep(1,8), rep(2,8))
2007 Apr 05
2
creating a data frame from a list
Dear all, A few months ago, I asked for your help on the following problem: I have a list with three (named) numeric vectors: > lst = list(a=c(A=1,B=8) , b=c(A=2,B=3,C=0), c=c(B=2,D=0) ) > lst $a A B 1 8 $b A B C 2 3 0 $c B D 2 0 Now, I'd love to use this list to create the following data frame: > dtf = data.frame(a=c(A=1,B=8,C=NA,D=NA), +
2005 May 05
1
creating names for regressions using assign()
Hi there. I have a data frame, X, with n+m columns. I want to regress each of the first n columns on the last m. This is what I am trying to do: for ( i in 1:n ) assign( paste ("reg",1:14,sep="")[i] , lm( X[,i] ~ X[,i+1] + ... + X[,i+m], data= X ) ) It happens that some of the regressions, say the 3rd, seem to be a singular fit (or something else I don't know
2005 May 14
1
pmvnorm
Hi there, pmvnorm(lo=c(-Inf,-Inf), up=c(Inf,Inf), mean=c(0,0) ) should give me "1", right? But it doesn't - it gives me "0". Would someone help me, please?
2005 May 16
1
memory and step()
Hi there, I'm trying to perform a step(), using variables from a data set with 32.000 observations. The upper model is not so long - (x1 + x2 + x3)^2 - where x1...x3 are the explanatory variables. Yet, I got a memory problem when performing it. The message is the following: Error: cannot allocate vector of size 26859 Kb In addition: Warning messages: 1: Reached total allocation of 246Mb:
2005 Jun 09
1
getting more than the coefficients
Hi there, I am trying to export a regression output to LaTeX. I am using the xtable function in the xtable library. Doing myfit <- lm(myformula, mydata) print.xtable(xtable(myfit), file="myfile") only returns the estimated coefficients and the corresponding standard errors, t-statistics and p-values. But I wish to get a bit more, say, the number of observations used in the
2005 Jul 12
0
transition matrix and discretized data
Hi there, I have data on earnings of 12000 individuals at two points in time. I intend to construct a transition matrix, where the typical element, p_ij, gives the probability that an individual ends at the j-th decile of the earnings distribution given that he was initially at the i-th decile. Thus, this is a bi-stochastic matrix. The problem is that the income data is nearly discrete in the
2005 Jul 12
1
vectorization
Hi, I got 1000 NxN matrices grouped in one array. I want one matrix in which p_ij is the average of all the 1000 matrices in the array. Here's what I'm trying to do: # P is the NxNx1000 array for(i in 1:N) for(j in 1:N) for(k in 1: 1000) mymat[ i, j ] <- mean( P [i , j , k ] ) Otherwise, I could have a NxNx1000 vector, and get the N^2 means of the 1+ (N^2)*(0: 999) elements. I
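A loop-free sketch of the averaging described above, assuming P is an N x N x K numeric array with the K matrices stacked along the third dimension. The small N and the random contents below are illustrative assumptions only, not data from the post.

```r
N <- 3        # illustrative size; the post leaves N unspecified
K <- 1000     # number of stacked matrices, as in the post
set.seed(42)  # arbitrary seed so the example is reproducible
P <- array(runif(N * N * K), dim = c(N, N, K))

# With dims = 2, the first two dimensions are treated as "rows",
# so the mean is taken over the third dimension
mymat <- rowMeans(P, dims = 2)

# Equivalent but usually slower: apply mean() over each (i, j) cell
mymat_apply <- apply(P, c(1, 2), mean)

print(dim(mymat))
print(all.equal(mymat, mymat_apply))
```

Either call returns an N x N matrix whose (i, j) entry is the average of P[i, j, ] across all K slices.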
2005 Dec 12
0
marginal effects in glm's
Hi, I wonder if there is a function in (some package of) R which computes marginal effects of the variables in a glm, say, for concreteness, a probit model. By marginal effects of the covariate x_j I mean d P(y=1 | x), which is approx g(xB)B_j dx_j where g is the pdf of the normal distribution, x is the vector of covariates (at some points, say, the mean values) and B is the estimated
2006 Apr 24
1
omitting coefficients in summary.lm()
Hi, I'm running a regression using lm(), in which one of the right-hand side variables is a factor with many levels (say, 80). I am not interested in the estimates of the resulting dummies, but I have to include them in my regression equation. So, I don't want the estimates associated with these dummies to be printed by summary.lm( ). Is there an easy way to do this? Thank you, Dimitri