thr3ads.net - R help - [R] Simulating data with conditions [May 2011]

AC Del Re

2011-May-24 22:36 UTC

[R] Simulating data with conditions

Hi,

I want to simulate data in which a percentage of the ids are duplicated
(appear on more than one row), with each row of a duplicated id taking a
different value of another factor variable. I'm having trouble figuring
out an efficient way to do so.

For example, consider this mock output. [Note: although the mock data
doesn't show it, I am eventually interested in 73% of ids appearing once,
22% appearing twice (one duplicate) and 5% appearing three times (two
duplicates). Also, I would like the 'al' variable to be randomly selected,
perhaps using sample(), from a 3-level factor ("pt", "th", "ob"), AND for
an id with duplicates to have unique values of the 'al' variable.]

Something like this:

id    z     al

1     .5    "pt"
2     .4    "ob"
3     .7    "pt"
4     .3    "th"
5     .5    "pt"
5     .6    "ob"
6     .3    "th"
6     .2    "ob"
7     .1    "pt"
7     .3    "th"
7     .1    "ob"

This is the general idea, although I will eventually create a much
larger data set, with z generated by rnorm(), etc.

Any help toward a solution is much appreciated!

AC
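
One possible way to approach this, offered only as a rough sketch: draw the
number of rows for each id with sample() using the 73/22/5 split, repeat each
id that many times, then draw 'al' without replacement within each id so the
values cannot repeat. The names n_id, levels_al, n_rows and sim, the seed, and
the default rnorm() parameters are illustrative assumptions, not part of the
original question.

```r
# Sketch under stated assumptions: n_id, the seed and rnorm() defaults are arbitrary.
set.seed(42)

n_id      <- 1000                  # number of distinct ids (assumed)
levels_al <- c("pt", "th", "ob")   # the 3-level factor from the question

# Rows per id: 1 = no duplicate, 2 = one duplicate, 3 = two duplicates
n_rows <- sample(1:3, size = n_id, replace = TRUE,
                 prob = c(0.73, 0.22, 0.05))

# Repeat each id according to its row count
id <- rep(seq_len(n_id), times = n_rows)

# Within each id, sample 'al' without replacement so values are unique per id
al <- unlist(lapply(n_rows, function(k) sample(levels_al, size = k)))

sim <- data.frame(
  id = id,
  z  = rnorm(length(id)),          # placeholder for the eventual z
  al = factor(al, levels = levels_al)
)

head(sim, 10)
```

Because prob only sets sampling probabilities, the realized shares will be
close to 73/22/5 rather than exact. If exact shares are needed, one option
would be to build n_rows with rep() from fixed counts and shuffle it with
sample().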
