thr3ads.net - R help - [R] Conditional Random selection [Nov 2015]

Ashta

2015-Nov-21 18:56 UTC

[R] Conditional Random selection

Thank you, David!

I reran your script and it is giving me the first three time periods.
Is it doing random sampling?

      tab.fan
  time X1  X2
2    2  5 230
3    3  1 300
5    5  2  10



On Sat, Nov 21, 2015 at 12:20 PM, David L Carlson <dcarlson at tamu.edu> wrote:
> Use dput() to send data to the list as it is more compact:
>
>> dput(tab)
> structure(list(time = 1:8, X1 = c(0L, 5L, 1L, 0L, 2L, 3L, 1L,
> 4L), X2 = c(251L, 230L, 300L, 25L, 10L, 101L, 300L, 185L)), .Names = c("time",
> "X1", "X2"), class = "data.frame", row.names = c(NA, -8L))
>
> You can just remove the lines with X1 = 0 since you don't want to use them.
>
>> tab.sub <- tab[tab$X1>0, ]
>
> Then the following gives you a sample:
>
>> tab.sub[cumsum(sample(tab.sub$X2))<=500, ]
>
> Note that your "solution" of times 6, 7, and 8 will never appear
> because the sum of the values is 586.
>
>
> David L. Carlson
> Department of Anthropology
> Texas A&M University
>
> -----Original Message-----
> From: R-help [mailto:r-help-bounces at r-project.org] On Behalf Of Ashta
> Sent: Saturday, November 21, 2015 11:53 AM
> To: R help <r-help at r-project.org>
> Subject: [R] Conditional Random selection
>
> Hi all,
>
> I have a data set that contains samples collected over time. In
> each time period the total number of samples is given (X2). The goal
> is to select 500 random samples. The selection should be based on
> time (select time periods until I reach 500 samples). Also, the time
> period should have a value greater than 0 for the X1 variable. X1 is
> an indicator variable.
>
> Select "time" until the sum of X2 is > 500, and only if X1 is > 0
>
> tab  <- read.table(textConnection(" time   X1 X2
> 1      0        251
> 2      5        230
> 3      1        300
> 4      0         25
> 5      2         10
> 6      3         101
> 7      1         300
>  8     4         185   "),header = TRUE)
>
> In the above example, samples from times 1 and 4 will not be selected
> (X1 is zero).
> So I could reach my target by selecting times 6, 7, and 8, or times 2 and
> 3, and so on.
>
> Can anyone help me do that?
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

Bert Gunter

2015-Nov-21 19:15 UTC

[R] Conditional Random selection

David's "solution" is incorrect. It can also fail to give you times
with a total of 500 items to sample from in the time periods.
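
A short sketch of why this happens, assuming the tab data from the
original post (the seed and the names shuffled, keep and picked are
only illustrative). The X2 values are shuffled, but the resulting
TRUE/FALSE mask is applied to the rows of tab.sub in their original
order. Since a cumulative sum of positive values only increases, the
mask is a run of TRUEs followed by FALSEs, so it always picks the
first few eligible rows:

```r
# Data from the original post.
tab <- data.frame(time = 1:8,
                  X1 = c(0L, 5L, 1L, 0L, 2L, 3L, 1L, 4L),
                  X2 = c(251L, 230L, 300L, 25L, 10L, 101L, 300L, 185L))
tab.sub <- tab[tab$X1 > 0, ]

set.seed(1)                       # arbitrary seed, only for reproducibility
shuffled <- sample(tab.sub$X2)    # X2 values in random order
keep <- cumsum(shuffled) <= 500   # TRUEs first, then FALSEs, never mixed

# The mask is positional: it selects the first sum(keep) rows of tab.sub,
# not the rows whose X2 values ended up at the front of `shuffled`.
picked <- tab.sub[keep, ]
identical(picked$time, head(tab.sub$time, sum(keep)))

# The total of the selected rows is unrelated to cumsum(shuffled),
# so it can land below or above 500.
sum(picked$X2)
```

Only the number of rows is random here, which matches the output
reported above (times 2, 3 and 5, with a total of 540).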

It is not entirely clear what you want. The solution below gives you a
random sample of time periods in which X1>0 and the total number of
samples among them is >= 500. It does not give you the fewest number
of periods that can do this. Is this what you want?

tab[with(tab,{
  rownums<- sample(seq_len(nrow(tab))[X1>0])
  sz <- cumsum(X2[rownums])
  rownums[c(TRUE,sz<500)]
}),]
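
The same idea can be wrapped in a function, as a sketch only. The name
sample_periods and the target argument are hypothetical and not part
of the original reply. This version computes the running total before
each period, so it does not need the extra leading TRUE, and it returns
all eligible periods if their total never reaches the target:

```r
# Data from the original post.
tab <- data.frame(time = 1:8,
                  X1 = c(0L, 5L, 1L, 0L, 2L, 3L, 1L, 4L),
                  X2 = c(251L, 230L, 300L, 25L, 10L, 101L, 300L, 185L))

# Hypothetical helper: shuffle the eligible periods and keep them up to
# and including the first one that brings the total to `target`.
sample_periods <- function(dat, target = 500) {
  eligible <- which(dat$X1 > 0)
  # sample.int() shuffles positions, which stays safe with one eligible row
  rownums <- eligible[sample.int(length(eligible))]
  sz <- cumsum(dat$X2[rownums])
  # Running total *before* each period; keep a period while that is < target
  before <- sz - dat$X2[rownums]
  dat[rownums[before < target], ]
}

set.seed(42)   # arbitrary seed, only for reproducibility
res <- sample_periods(tab)
res
sum(res$X2) >= 500
```

Shuffling with sample.int() avoids a known quirk of sample(): when it
is given a single number n greater than 1, it samples from 1:n instead
of returning n.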

Cheers,
Bert


Bert Gunter

"Data is not information. Information is not knowledge. And knowledge
is certainly not wisdom."
   -- Clifford Stoll



Ashta

2015-Nov-21 19:30 UTC

[R] Conditional Random selection

Thank you Bert!

What I want is at least 500 samples based on random sampling of time
periods. This allows samples collected in the same time period to be
included together.

Your script is doing what I wanted to do!!

Many thanks




