thr3ads.net - R help - [R] Conditional Random selection [Nov 2015]

ruipbarradas at sapo.pt

2015-Nov-21 20:38 UTC

[R] Conditional Random selection

Hello,

Try

tapply(tab$S1, tab$time, function(x) length(unique(x)))

Hope this helps,

Rui Barradas

Quoting Ashta <sewashm at gmail.com>:
> Hi Bert and all,
> I have a related question. In each time period there were different
> locations where the samples were collected (S1). I want to count the
> number of unique locations (S1) for each unique time period. So in
> time 1 the samples were collected from two locations, in time 2 from
> only one location, and in time 3 from three locations.
>
> tab <- read.table(textConnection(" time   S1  rep
> 1      1       1
> 1      2       1
> 1      2       2
> 2      1       1
> 2      1       2
> 2      1       3
> 2      1       4
> 3      1       1
> 3      2       1
> 3      3       1   "), header = TRUE)
>
> what I want is
>
> time  S1
>    1    2
>    2    1
>    3    3
>
> Thank you again.
>
> On Sat, Nov 21, 2015 at 1:30 PM, Ashta <sewashm at gmail.com> wrote:
>> Thank you Bert!
>>
>> What I want is at least 500 samples based on random sampling of time
>> periods. That way, samples collected in the same time period are
>> included together.
>>
>> Your script is doing what I wanted to do!!
>>
>> Many thanks
>>
>> On Sat, Nov 21, 2015 at 1:15 PM, Bert Gunter <bgunter.4567 at gmail.com> wrote:
>>> David's "solution" is incorrect. It can also fail to give you times
>>> with a total of 500 items to sample from in the time periods.
>>>
>>> It is not entirely clear what you want. The solution below gives you a
>>> random sample of time periods in which X1>0 and the total number of
>>> samples among them is >= 500. It does not give you the fewest number
>>> of periods that can do this. Is this what you want?
>>>
>>> tab[with(tab,{
>>>   rownums <- sample(seq_len(nrow(tab))[X1>0])
>>>   sz <- cumsum(X2[rownums])
>>>   rownums[c(TRUE,sz<500)]
>>> }),]
>>>
>>> Cheers,
>>> Bert
>>>
>>> Bert Gunter
>>>
>>> "Data is not information. Information is not knowledge. And knowledge
>>> is certainly not wisdom."
>>>    -- Clifford Stoll
>>>
>>> On Sat, Nov 21, 2015 at 10:56 AM, Ashta <sewashm at gmail.com> wrote:
>>>> Thank you David!
>>>>
>>>> I reran your script and it is giving me the first three time periods.
>>>> Is it doing random sampling?
>>>>
>>>>       tab.fan
>>>>   time X1  X2
>>>> 2    2  5 230
>>>> 3    3  1 300
>>>> 5    5  2  10
>>>>
>>>> On Sat, Nov 21, 2015 at 12:20 PM, David L Carlson <dcarlson at tamu.edu> wrote:
>>>>> Use dput() to send data to the list as it is more compact:
>>>>> > dput(tab)
>>>>>
>>>>> structure(list(time = 1:8, X1 = c(0L, 5L, 1L, 0L, 2L, 3L, 1L,
>>>>> 4L), X2 = c(251L, 230L, 300L, 25L, 10L, 101L, 300L, 185L)),
>>>>> .Names = c("time", "X1", "X2"), class = "data.frame",
>>>>> row.names = c(NA, -8L))
>>>>>
>>>>> You can just remove the lines with X1 = 0 since you don't want
>>>>> to use them.
>>>>> > tab.sub <- tab[tab$X1>0, ]
>>>>>
>>>>> Then the following gives you a sample:
>>>>> > tab.sub[cumsum(sample(tab.sub$X2))<=500, ]
>>>>>
>>>>> Note that your "solution" of times 6, 7, and 8 will never
>>>>> appear because the sum of the values is 586.
>>>>>
>>>>> David L. Carlson
>>>>> Department of Anthropology
>>>>> Texas A&M University
>>>>>
>>>>> -----Original Message-----
>>>>> From: R-help [mailto:r-help-bounces at r-project.org] On Behalf Of Ashta
>>>>> Sent: Saturday, November 21, 2015 11:53 AM
>>>>> To: R help <r-help at r-project.org>
>>>>> Subject: [R] Conditional Random selection
>>>>>
>>>>> Hi all,
>>>>>
>>>>> I have a data set that contains samples collected over time. In
>>>>> each time period the total number of samples is given (X2). The goal
>>>>> is to select 500 random samples. The selection should be based on
>>>>> time (select time periods until I reach 500 samples). Also, the time
>>>>> period should have a value greater than 0 for the X1 variable. X1 is
>>>>> an indicator variable.
>>>>>
>>>>> Select "time" until the sum of X2 is > 500, and only if X1 is > 0.
>>>>>
>>>>> tab <- read.table(textConnection(" time   X1 X2
>>>>> 1      0        251
>>>>> 2      5        230
>>>>> 3      1        300
>>>>> 4      0         25
>>>>> 5      2         10
>>>>> 6      3        101
>>>>> 7      1        300
>>>>> 8      4        185   "), header = TRUE)
>>>>>
>>>>> In the above example, samples from time 1 and 4 will not be selected
>>>>> (X1 is zero).
>>>>> So I could reach my target by selecting time 6, 7, and 8, or time 2 and
>>>>> 3, and so on.
>>>>>
>>>>> Can anyone help me do that?
>>>>>
>>>>> ______________________________________________
>>>>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>> PLEASE do read the posting guide
>>>>> http://www.R-project.org/posting-guide.html
>>>>> and provide commented, minimal, self-contained, reproducible code.
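
For readers following the quoted exchange: the one-liner tab.sub[cumsum(sample(tab.sub$X2))<=500, ] shuffles the X2 values, but the resulting TRUE/FALSE mask is applied to the rows in their original order, so it can only ever return the leading rows of tab.sub. That matches the "first three time periods" result reported in the thread. The sketch below shuffles the row positions instead and keeps periods until the running total of X2 first reaches the target. It is only one reading of the request, not a method posted on the list; the helper name sample_periods and its target argument are made up for illustration, and stopping with an error when the eligible periods cannot reach the target is an assumption.

```r
# Data from the thread, rebuilt from the dput() output quoted above
tab <- data.frame(
  time = 1:8,
  X1 = c(0L, 5L, 1L, 0L, 2L, 3L, 1L, 4L),
  X2 = c(251L, 230L, 300L, 25L, 10L, 101L, 300L, 185L)
)

# Hypothetical helper (not from the thread): pick eligible time periods in
# random order until the cumulative X2 first reaches `target`.
sample_periods <- function(df, target = 500) {
  eligible <- which(df$X1 > 0)
  if (sum(df$X2[eligible]) < target) {
    stop("eligible periods hold fewer than `target` samples in total")
  }
  # Shuffle positions with sample.int() so that a single eligible row
  # is not reinterpreted by sample() as the range 1:x
  shuffled <- eligible[sample.int(length(eligible))]
  running <- cumsum(df$X2[shuffled])
  # Keep rows up to and including the first one that reaches the target
  keep <- seq_len(which(running >= target)[1])
  df[shuffled[keep], ]
}

set.seed(1)  # only to make the illustration repeatable
picked <- sample_periods(tab)
picked
```

Repeated calls without set.seed() should return different sets of periods, each with X1 > 0 and a combined X2 of at least 500.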

Ashta

2015-Nov-21 20:48 UTC

Hi Rui,

I tried that one before I sent out my original message.
It gave me only this:

tapply(tab$S1, tab$time, function(x) length(unique(x)))
1 2 3
2 1 3

I am expecting output like this:

 time  S1
    1    2
    2    1
    3    3

ruipbarradas at sapo.pt

2015-Nov-21 21:40 UTC

Hello,

Is that a real question? Like Bert said, you should spend some time with
an R tutorial. All you need to know is how to form a data.frame.

tmp <- tapply(tab$S1, tab$time, function(x) length(unique(x)))
data.frame(time = names(tmp), S1 = tmp)

Rui Barradas
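
If the goal is a two-column data frame in a single step, base R's aggregate() with a formula is one alternative to going through tapply() first. This is a minimal sketch, assuming the tab data posted earlier in the thread; it reads the data with read.table(text = ...) rather than textConnection(), and the name counts is chosen here only for illustration.

```r
# Location data from the thread
tab <- read.table(text = "time S1 rep
1 1 1
1 2 1
1 2 2
2 1 1
2 1 2
2 1 3
2 1 4
3 1 1
3 2 1
3 3 1", header = TRUE)

# Count the distinct S1 values within each time period;
# the result is a data frame with columns `time` and `S1`
counts <- aggregate(S1 ~ time, data = tab,
                    FUN = function(x) length(unique(x)))
counts
```

For this data the result should have one row per time period, with S1 counts of 2, 1 and 3, which is the layout asked for above.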
