On Thu, Apr 21, 2011 at 8:34 PM, Penny Bilton <pennybilton at xnet.co.nz> wrote:
> Hi Josh,
>
> Thanks for your reply.
>
> The problem I have is in trying to retain the proportions of 2 groups in my
> data while sampling into training and test sets. I find that different
> arguments for set.seed give very different proportions of my 2 groups in
> the training and test sets.
Sure, random sampling does not guarantee that the two groups will turn up
in the same proportions in every sample, whatever seed you use. Perhaps you
are looking for some sort of constrained random sampling, like sampling x
from group 1 and x from group 2? If so, try calling sample() separately on
each group (for help applying the same function to different groups, take a
look at ?by or ?tapply for example).
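
As a rough sketch of what sampling within each group could look like: the
data frame dat, its group column, the 70/30 group sizes, and the 80/20
train/test split below are all made up for illustration, and the sketch uses
split() and lapply() rather than by() or tapply(), so adapt it to your own
data.

```r
# Hypothetical data: 100 observations, 70 in group "A" and 30 in group "B".
dat <- data.frame(id = 1:100,
                  group = rep(c("A", "B"), times = c(70, 30)))

# The seed is a single integer; it only makes the draw repeatable.
set.seed(10)

# Row indices for each group.
idx_by_group <- split(seq_len(nrow(dat)), dat$group)

# Sample 80% of the rows within each group (assumed split, not a rule).
# Indexing via sample.int() avoids the surprise that sample(i, ...) treats
# a length-1 numeric i as 1:i.
train_idx <- unlist(lapply(idx_by_group, function(i) {
  i[sample.int(length(i), size = round(0.8 * length(i)))]
}), use.names = FALSE)

train <- dat[train_idx, ]
test  <- dat[-train_idx, ]

# Compare the group proportions in the two sets.
prop.table(table(train$group))
prop.table(table(test$group))
```

Because the draw is done separately inside each group, the proportions in
the training and test sets should match those in the full data (up to
rounding) regardless of which seed is used.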
Josh
PS cced back to list
>
>
> Penny.
>
>
>
> On 22/04/2011 3:27 p.m., Joshua Wiley wrote:
>>
>> Hi,
>>
>> On Thu, Apr 21, 2011 at 8:18 PM, Penny Bilton <pennybilton at xnet.co.nz> wrote:
>>>
>>> I am using /set.seed()/ before the /sample/ function.
>>>
>>> How does the length of the argument of /set.seed()/ and order of the
>>> digits affect how the sampling is carried out?
>>
>> You can use set.seed() to specify a particular seed so that while
>> pseudo-random numbers are sampled, you can repeat it. For example:
>>
>> set.seed(10)
>> rnorm(10)
>> set.seed(10)
>> rnorm(10)
>>
>>> Specifically, I have used set.seed(123456789). Will this configuration
>>> give me a genuinely random sampling?
>>
>> You will never get truly random sampling from a computer algorithm,
>> but it is darn close and more than adequate in the majority of cases.
>> 123456789 is just a length 1 vector containing the number 123456789,
>> not 9 separate numbers.
>>
>> Google will be able to give you a lot of information on pseudo-random
>> number algorithms as well as the concept of "seeds". Also see
>> ?set.seed
>>
>> Cheers,
>>
>> Josh
>>
>>>
>>> Thank you in anticipation.
>>>
>>> Penny.
>>>
>>
>>
>
>
--
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/