thr3ads.net - R help - [R] How to create data frame from data with unequal length [Nov 2007]


tom soyer

2007-Nov-28 18:27 UTC

[R] How to create data frame from data with unequal length

Hi,

I have two sets of data that I would like to put into a data frame. But
since they have different lengths, I am not sure how to do this. Here is an
example of my data:

data set one:
date       growth
1/1/2007   10
1/2/2007   10.2
1/3/2007   10.4
1/4/2007   10.6

data set two:
date       growth
1/1/2007   22
1/2/2007   22.5
1/4/2007   22.4

I would like to combine the two data sets and create a data frame like this:
date       growthA   growthB
1/1/2007   10        22
1/2/2007   10.2      22.5
1/3/2007   10.4      NA
1/4/2007   10.6      22.4

Or skipping the missing data point altogether, like this:
date       growthA   growthB
1/1/2007   10        22
1/2/2007   10.2      22.5
1/4/2007   10.6      22.4

Right now I am doing this by hand, and it is really time-consuming. I am
wondering if there is an easier way of creating data frames from
unequal-length data using existing R functions. Is there a way to align the
two data sets on the date column so that they have equal length? I would
appreciate any help from the group.

Thanks,

-- 
Tom


Matthew Keller

2007-Nov-28 18:33 UTC


[R] How to create data frame from data with unequal length

Tom,

Check out ?merge. It does exactly what you need.

Matt



-- 
Matthew C Keller
Asst. Professor of Psychology
University of Colorado at Boulder
www.matthewckeller.com

Peter Dalgaard

2007-Nov-28 18:38 UTC


[R] How to create data frame from data with unequal length

I'd have a look at merge() if I were you.
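
One detail worth checking before merging on a date column is its type. The sketch below is an illustration, not part of the original reply: it converts the date strings from the question to the `Date` class first, so that the merged rows come back in calendar order rather than in text order (as text, "1/10/2007" would sort before "1/2/2007"). The names `df1` and `df2` and the month/day/year reading of the strings are assumptions.

```r
# Data from the question; df1 and df2 are assumed names.
df1 <- data.frame(
  date   = c("1/1/2007", "1/2/2007", "1/3/2007", "1/4/2007"),
  growth = c(10, 10.2, 10.4, 10.6),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  date   = c("1/1/2007", "1/2/2007", "1/4/2007"),
  growth = c(22, 22.5, 22.4),
  stringsAsFactors = FALSE
)

# Assumption: the strings are month/day/year.
# Use format = "%d/%m/%Y" instead if they are day/month/year.
df1$date <- as.Date(df1$date, format = "%m/%d/%Y")
df2$date <- as.Date(df2$date, format = "%m/%d/%Y")

# Merge on the Date column, keeping dates that appear in either data set.
merged <- merge(df1, df2, by = "date", all = TRUE, suffixes = c("A", "B"))
print(merged)
```

With real dates in the key column, later steps such as plotting or subsetting by date range also work without further conversion.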

-- 
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark          Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)                  FAX: (+45) 35327907

Henrique Dallazuanna

2007-Nov-28 18:40 UTC


[R] How to create data frame from data with unequal length

Try this:

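# keep every date; rows missing from one data set get NA
merge(df1, df2, by.x = 1, by.y = 1, all = TRUE)
# keep only the dates present in both data sets
merge(df1, df2, by.x = 1, by.y = 1)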
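
For readers who want to run the two calls end to end, here is a minimal self-contained sketch using the data from the question. The data frame names `df1` and `df2` follow the reply above. The `suffixes` argument is an addition not in the reply, used here to get the `growthA` / `growthB` column names the question asked for; without it the columns would be called `growth.x` and `growth.y`.

```r
# Data from the original question.
df1 <- data.frame(
  date   = c("1/1/2007", "1/2/2007", "1/3/2007", "1/4/2007"),
  growth = c(10, 10.2, 10.4, 10.6)
)
df2 <- data.frame(
  date   = c("1/1/2007", "1/2/2007", "1/4/2007"),
  growth = c(22, 22.5, 22.4)
)

# all = TRUE keeps every date and fills the gaps with NA;
# suffixes rename the two clashing "growth" columns to growthA / growthB.
with_na <- merge(df1, df2, by = "date", all = TRUE, suffixes = c("A", "B"))

# The default (all = FALSE) keeps only dates present in both data sets.
common_only <- merge(df1, df2, by = "date", suffixes = c("A", "B"))

print(with_na)
print(common_only)
```

The first result corresponds to the table with `NA` in the question, the second to the table that skips the missing date.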


-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O

Malte Brockmann

2007-Nov-28 20:45 UTC


[R] Problem using Tobit models in R (Testing and controlling for distributional assumptions and endogeneity)

Dear R-Community,

I am currently using Tobit models (survreg in the survival package).

1a) Does R provide a straightforward way to test distributional assumptions for
tobit models?
1b) If not: I tried to apply the Hausman-test proposed in Newey (1987), Journal
of Econometrics, on the Tobit estimator and the symmetrically censored least
squares estimator proposed by Powell (1986) (quantreg package). Unfortunately,
quantreg only provides covariance matrices based on the bootstrap which are not
positive semi-definite, therefore the Hausman test statistic based on the
difference between both covariance matrices can be negative. Newey proposes 2
ways to calculate positive semi-definite covariance matrices: Is there a way to
implement any of these without manually coding (or adapting) the tobit and SCLS
estimation procedures to extract the necessary information needed for the
estimation (first derivative of loglik w.r.t. theta, etc.)?

2) I apply the test for endogeneity proposed by Smith and Blundell (1986),
Econometrica, and one of my variables turns out to be endogenous. Does R have a
package for simultaneous equations with censored dependent variables? As far as
I know, the sem package does estimate these types of equations.

Thanks in advance
Malte
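
For context, the kind of Tobit fit mentioned at the top of this message can be sketched as follows. This is an illustration on simulated data, not the poster's actual model: the variable names, the censoring point of zero, and the data-generating process are all assumptions. It relies on `survreg()` and `Surv()` from the survival package, with a Gaussian distribution and left censoring.

```r
library(survival)

set.seed(42)
n <- 200
x <- rnorm(n)
y_star <- 1 + 2 * x + rnorm(n)   # hypothetical latent outcome
y <- pmax(y_star, 0)             # observed outcome, left-censored at 0

# In Surv(), the second argument is 1 for an observed (uncensored) value
# and 0 for a left-censored one.
tobit_fit <- survreg(Surv(y, y > 0, type = "left") ~ x, dist = "gaussian")

print(coef(tobit_fit))    # intercept and slope of the latent regression
print(tobit_fit$scale)    # estimated standard deviation of the error term
```

Changing the `dist` argument (for example to `"logistic"`) refits the same censored regression under a different error distribution, which is one informal way to see how sensitive the estimates are to the normality assumption. It is not a substitute for the formal tests asked about above.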

roger koenker

2007-Nov-28 21:14 UTC


[R] Problem using Tobit models in R (Testing and controlling for distributional assumptions and endogeneity)

url:    www.econ.uiuc.edu/~roger            Roger Koenker
email    rkoenker at uiuc.edu            Department of Economics
vox:     217-333-4558                University of Illinois
fax:       217-244-6678                Champaign, IL 61820


On Nov 28, 2007, at 2:45 PM, Malte Brockmann wrote:
>
> Dear R-Community,
>
> I am currently using Tobit models (survreg in the survival package).
>
> 1a) Does R provide a straight-forward way to test distributional  
> assumptions for tobit models?
> 1b) If not: I tried to apply the Hausman-test proposed in Newey  
> (1987), Journal of Econometrics, on the Tobit estimator and the  
> symmetrically censored least squares estimator proposed by Powell  
> (1986) (quantreg package).
This "symmetrically censored least squares estimator" is NOT what is
computed by the quantreg package. What is computed is the Powell
quantile regression estimator.
> Unfortunately, quantreg only provides covariance matrices based on  
> the bootstrap which are not positive semi-definite,
The bootstrapped covariance provided by quantreg is the usual sample
covariance matrix of the bootstrapped realizations and is therefore
necessarily positive semi-definite. Perhaps what you meant to say was
that the difference between the two covariance matrices that you have
computed was not psd; this could easily happen. Nothing ensures that
the Powell QR estimate is less efficient than the usual (normal theory)
tobit estimator; indeed, there are very plausible conditions under which
this is not the case.
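
The point made in this reply can be illustrated numerically. The sketch below uses made-up numbers only: the bootstrap draws are simulated noise, and the two 2x2 covariance matrices and the coefficient difference `d` are hypothetical values chosen for clarity, not output from quantreg or survreg. It shows that a sample covariance matrix has no negative eigenvalues, while the difference of two such matrices can be indefinite and can therefore yield a negative Hausman-type quadratic form.

```r
set.seed(123)

# Hypothetical bootstrap realizations: 200 replicates of a 3-parameter estimate.
boot_draws <- matrix(rnorm(200 * 3), nrow = 200, ncol = 3)
V_boot <- cov(boot_draws)                       # sample covariance matrix
min_eig <- min(eigen(V_boot, symmetric = TRUE)$values)
print(min_eig)                                  # non-negative by construction

# Two hypothetical covariance matrices, each positive definite on its own.
V_tobit  <- diag(c(2, 1))
V_powell <- diag(c(1, 2))
V_diff   <- V_powell - V_tobit                  # diag(-1, 1): indefinite
print(eigen(V_diff, symmetric = TRUE)$values)   # one positive, one negative

# Hausman-type quadratic form d' V_diff^{-1} d for a hypothetical difference d.
d <- c(1, 0.5)
H <- drop(crossprod(d, solve(V_diff, d)))
print(H)                                        # negative for these values
```

Neither estimator dominates the other in every direction in this toy case, which is exactly the situation in which the statistic can turn negative.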
> therefore the hausman test statistic based on the difference  
> between both covariance matrices can be negative. Newey proposes 2  
> ways to calculate positive semi-definite covariance matrices: Is  
> there a way to implement any of these without manually coding (or  
> adapting) the tobit and SCLS estimation procedures to extract the  
> necessary information needed for the estimation (first derivative  
> of loglik w.r.t. theta, etc.)?
>
> 2) I apply the test for endogeneity proposed by Smith and Blundell  
> (1986), Econometrica, and one of my variables turns out to be  
> endogenous. Does R have a package for simultaneous equations with  
> censored dependent variables? As far as I know, the sem package  
> does estimate these types of equations.
>
> Thanks in advance
> Malte

Malte Brockmann

2007-Nov-29 08:02 UTC


[R] Problem using Tobit models in R (Testing and controlling for distributional assumptions and endogeneity)

Roger, thanks for your reply and especially for pointing out that quantreg does
not calculate the SCLS estimator as I thought. You are also certainly right
about the covariance matrices, I meant the difference to be psd.

Nevertheless, my main questions remain open: How do I test distributional
assumptions and endogeneity for Tobit models? In case I cannot reject
endogeneity, how do I model structural equations with censored dependent
variables?





