thr3ads.net - R help - [R] Help with simulation of unbalanced clustered data [Dec 2020]

Chao Liu

2020-Dec-16 02:48 UTC

[R] Help with simulation of unbalanced clustered data

Dear R experts,

I want to simulate some unbalanced clustered data. The number of clusters
is 20 and the average number of observations per cluster is 30. I would like
to first generate data with 10% more observations per cluster than specified
(i.e., 33 rather than 30, or 660 in total). I then want to randomly exclude
an appropriate number of observations (i.e., 60) to arrive at the specified
average number of observations per cluster (i.e., 30). The probability of
excluding an observation should not be uniform across clusters (i.e., some
clusters have no cases removed and others have more excluded). In the end I
should therefore have 600 observations in total. How can I do that in R?
Thank you for your help!

Best,

Liu


Jeff Newmiller

2020-Dec-16 13:50 UTC

This is R-help, not R-do-my-work-for-me. It is also not a homework help line.
The Posting Guide is required reading. Assuming this is not homework, since each
step in your problem definition can be mapped to a fairly basic operation in R
(the sample function and indexing being key tools), you should be showing your
work with a reproducible example that illustrates where you are stuck or why the
result you are getting does not exhibit the desired properties.
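
For readers following along, the steps in the question map onto sample() and indexing roughly as sketched below. This is only one possible reading of the problem, not code posted in the thread: the data frame layout, the placeholder outcome y, the removal weights w_cluster, the choice of five protected clusters and the seed are all assumptions made for illustration.

```r
set.seed(1)                      # arbitrary seed, for reproducibility only

n_clusters <- 20
n_start    <- 33                 # 10% more than the target of 30 per cluster
n_drop     <- 60                 # 20 * 33 - 20 * 30

# Start balanced: one row per observation, 33 rows per cluster.
dat <- data.frame(cluster = rep(seq_len(n_clusters), each = n_start),
                  y       = rnorm(n_clusters * n_start))  # placeholder outcome

# Hypothetical cluster-level removal weights (an assumption, not from the
# thread). A weight of zero means no row of that cluster can be removed.
w_cluster <- runif(n_clusters)
w_cluster[sample(n_clusters, 5)] <- 0

# Each row inherits its cluster's weight; sample() rescales prob internally.
drop_rows <- sample(nrow(dat), n_drop, replace = FALSE,
                    prob = w_cluster[dat$cluster])

# Negative indexing removes the selected rows.
dat_unbal <- dat[-drop_rows, ]

sizes <- tabulate(dat_unbal$cluster, nbins = n_clusters)
sizes                            # cluster sizes now vary
sum(sizes)                       # total number of remaining observations
```

Because exactly 60 of the 660 rows are removed, the total is 600 and the average cluster size is 30 by construction; only the distribution of sizes across clusters depends on the weights.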

-- 
Sent from my phone. Please excuse my brevity.

Abby Spurdle

2020-Dec-17 04:32 UTC

Hi Chao Liu,

I'm having difficulty following your question and examples.
I also don't see the motivation for increasing, then decreasing,
the sample sizes.
Intuitively, one would compute the correct sample sizes the first time round...

But I thought I'd add some comments, just in case they're useful.

If the problem relates to memberships (in clusters), then the problem
can be simplified.
All one needs is an integer vector, where each value is the index of
the cluster.

To compute random memberships of 600 observations in 20 clusters, one could run:

    m <- sample (1:20, 600, TRUE)

To compute the number of observations per cluster, one could then run:

    table (m)

In the above code, each observation is equally likely to be assigned to
any of the clusters.
Non-uniform sampling can be achieved by supplying a 4th argument (prob) to
the sample function, which is a numeric vector of weights.
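
As a small illustration of that 4th argument, the sketch below draws non-uniform memberships. The weight vector w and the seed are assumptions chosen only for the example; any non-negative weights with at least one positive entry would do, since sample() rescales them to sum to one.

```r
set.seed(2)                      # arbitrary seed, for reproducibility only

# Hypothetical weights: cluster 20 is 20 times as likely as cluster 1.
w <- seq_len(20)

# Same call as above, with weights passed via the prob argument.
m_w <- sample(1:20, 600, replace = TRUE, prob = w)

# tabulate() keeps a slot for every cluster, even one that received no cases.
sizes <- tabulate(m_w, nbins = 20)
sizes
sum(sizes)                       # total number of observations
```

With this approach the total is fixed at 600, so the average cluster size is 30, but the individual cluster sizes are random rather than specified in advance.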
