thr3ads.net - R help - [R] looping using 'diverse' package measures [Oct 2017]

Li Jiang

2017-Oct-19 09:08 UTC

[R] looping using 'diverse' package measures

Hi everyone,

I'm new to R (although I have been a Stata user for some time and am
reasonably proficient in it), and I'm trying to use the 'diverse' R package
to compute a few diversity measures on a sample of firms over a period of
about 10 years. I was wondering if you could give me some hints on how best
to proceed with the 'diverse' package.

My sample is set up as follows. It comprises a number of firms that varies
from year to year; observations are identified by the companyid variable and
the year variable (an unbalanced panel). I also have a variable identifying
the worker, workerid. I then have a set of variables that I want to use as
the basis for calculating some of the measures in the 'diverse' package. An
example of the sample follows, using the gender variable (0 for male and 1
for female) as the variable of interest:

companyid   year    workerid    gender
85390   1999    46446384    0
85390   1999    126800000   1
85390   1999    163300000   0
85390   1999    60225451    0
85390   1999    60195422    0
85390   2000    60225451    0
85390   2000    3571000000  1
85390   2000    163300000   0
85390   2000    163300000   0
85390   2000    126800000   0
85390   2001    60195422    0
85390   2001    60225451    1
85390   2001    46446384    0
85390   2001    60195422    0
85390   2001    60225451    0
4391076 2005    13753759    0
4391076 2005    49988911    0
4391076 2005    112400000   0
4391076 2005    185500000   0
4391076 2005    35649643    0
4391076 2005    65809705    0
4391076 2005    114200000   0
4391076 2005    192100000   0
4391076 2005    64258701    0
4391076 2005    1212000000  1

Using the 'diverse' package, I need to calculate, for each firm and each
year, a measure such as diversity(gender). In Stata this would be obtained
just by issuing a by firm year command, but I have no idea how to tackle
this issue in R. Any ideas?

Best wishes,

Li


David L Carlson

2017-Oct-19 13:35 UTC

[R] looping using 'diverse' package measures

You really need to spend some time learning the basics of R. There are thousands
of R packages, so you also need to spend time reading the documentation for the
package so that you can show us what the data format should be like. Here are
some simple ways to transform the data. You should also use dput() to include
your data in your email rather than a plain listing, which can lose important
information about the structure of the original data:
> Example <- structure(list(companyid = c(85390L, 85390L, 85390L, 85390L,
    85390L, 85390L, 85390L, 85390L, 85390L, 85390L, 85390L, 85390L,
    85390L, 85390L, 85390L, 4391076L, 4391076L, 4391076L, 4391076L,
    4391076L, 4391076L, 4391076L, 4391076L, 4391076L, 4391076L),
    year = c(1999L, 1999L, 1999L, 1999L, 1999L, 2000L, 2000L,
    2000L, 2000L, 2000L, 2001L, 2001L, 2001L, 2001L, 2001L, 2005L,
    2005L, 2005L, 2005L, 2005L, 2005L, 2005L, 2005L, 2005L, 2005L
    ), workerid = c(46446384, 126800000, 163300000, 60225451,
    60195422, 60225451, 3.571e+09, 163300000, 163300000, 126800000,
    60195422, 60225451, 46446384, 60195422, 60225451, 13753759,
    49988911, 112400000, 185500000, 35649643, 65809705, 114200000,
    192100000, 64258701, 1.212e+09), gender = c(0L, 1L, 0L, 0L,
    0L, 0L, 1L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
    0L, 0L, 0L, 0L, 0L, 1L)), .Names = c("companyid", "year",
    "workerid", "gender"), class = "data.frame", row.names = c(NA,
    -25L))
> aggregate(gender~companyid+year, Example, mean)
  companyid year gender
1     85390 1999    0.2
2     85390 2000    0.2
3     85390 2001    0.2
4   4391076 2005    0.1
> aggregate(gender~companyid+year, Example, table)
  companyid year gender.0 gender.1
1     85390 1999        4        1
2     85390 2000        4        1
3     85390 2001        4        1
4   4391076 2005        9        1
> x <- xtabs(~gender+companyid+year, Example)
> ftable(x, row.vars=2:3, col.vars=1)
               gender 0 1
companyid year           
85390     1999        4 1
          2000        4 1
          2001        4 1
          2005        0 0
4391076   1999        0 0
          2000        0 0
          2001        0 0
          2005        9 1
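
The all-zero rows in the ftable() output are firm-year combinations that
never occur in the data: xtabs() fills in every cell of the cross-tabulation,
and the panel is unbalanced. One possible way to keep only the observed
firm-years is sketched below. The small data frame is a made-up stand-in for
the thread's data, and the names counts, totals and observed are illustrative
only.

```r
# Made-up stand-in data: two firms observed in different years.
Example <- data.frame(
  companyid = c(1L, 1L, 1L, 2L, 2L),
  year      = c(1999L, 1999L, 2000L, 2005L, 2005L),
  gender    = c(0L, 1L, 0L, 0L, 1L)
)

# Full cross-tabulation: every companyid x year cell exists, observed or not.
x <- xtabs(~ gender + companyid + year, Example)

# Long format: one row per (gender, companyid, year) cell with its count n.
counts <- as.data.frame(x, responseName = "n")

# Workers per firm-year; a total of zero means that firm-year is absent.
totals <- aggregate(n ~ companyid + year, counts, sum)

# Keep only the firm-years that actually appear in the data.
observed <- totals[totals$n > 0, ]
print(observed)
```

The formula interface of aggregate() applied to the raw data, as used above,
avoids the issue altogether because it only ever sees combinations that are
present.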

You should read these manual pages:
?dput
?aggregate
?xtabs
?ftable
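
As a rough analogue of Stata's by-group prefix, base R can split a variable
by firm and year and apply a function to each piece. The sketch below does
this for two common diversity indices computed by hand. group_diversity() is
a hypothetical helper written for this illustration; it is not part of the
'diverse' package, whose functions expect their own input layout as described
in the package documentation.

```r
# Hypothetical helper (not from the 'diverse' package): Shannon entropy and
# the Gini-Simpson index for one group's vector of category labels.
group_diversity <- function(v) {
  p <- prop.table(table(v))          # share of each category in the group
  c(n            = length(v),
    shannon      = -sum(p * log(p)),
    gini_simpson = 1 - sum(p^2))
}

# The thread's example, reduced to the columns needed here.
Example <- data.frame(
  companyid = c(rep(85390L, 15), rep(4391076L, 10)),
  year      = c(rep(1999L, 5), rep(2000L, 5), rep(2001L, 5), rep(2005L, 10)),
  gender    = c(0, 1, 0, 0, 0,  0, 1, 0, 0, 0,  0, 1, 0, 0, 0,
                0, 0, 0, 0, 0, 0, 0, 0, 0, 1)
)

# One element per observed firm-year; drop = TRUE omits combinations that
# never occur, which matters in an unbalanced panel.
groups <- split(Example$gender,
                list(Example$companyid, Example$year),
                drop = TRUE)

# Apply the helper to each group; rows are named "companyid.year".
result <- t(sapply(groups, group_diversity))
print(result)
```

Any function that takes one group's values and returns a fixed-length vector
can be substituted for group_diversity() in the sapply() call.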

----------------------------------------
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77843-4352




