thr3ads.net - R help - [R] rlnorm behaviour [Jun 2016]

Ayyappa Chaturvedula

2016-Jun-14 15:02 UTC

[R] rlnorm behaviour

Dear Group,

I am trying to simulate a dataset with 200 individuals with random
assignment of Sex (1,0) and Weight from lognormal distribution specific to
Sex.  I am intrigued by the behavior of rlnorm function to impute a value
of Weight from the specified distribution.  Here is the code:
ID <- 1:200
Sex <- sample(c(0, 1), 200, replace = TRUE, prob = c(0.4, 0.6))
fulldata <- data.frame(ID, Sex)
fulldata$Wt <- ifelse(fulldata$Sex == 1,
                      rlnorm(100, meanlog = log(85.1), sdlog = sqrt(0.0329)),
                      rlnorm(100, meanlog = log(73), sdlog = sqrt(0.0442)))

mean(fulldata$Wt[fulldata$Sex == 0])  # to check the mean is close to 73
mean(fulldata$Wt[fulldata$Sex == 1])  # to check the mean is close to 85
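
A side note on this check, offered as a sketch rather than as part of the original post: for a lognormal distribution, exp(meanlog) is the median, while the mean is exp(meanlog + sdlog^2 / 2). The group means should therefore land slightly above 73 and 85.1 even when the simulation is set up correctly. The helper name lognormal_mean and the variable names are illustrative assumptions.

```r
# Theoretical mean of a lognormal: exp(meanlog + sdlog^2 / 2).
# exp(meanlog) on its own is the median, not the mean.
lognormal_mean <- function(meanlog, sdlog) exp(meanlog + sdlog^2 / 2)

# Parameters taken from the posted code.
expected_wt_sex0 <- lognormal_mean(log(73),   sqrt(0.0442))
expected_wt_sex1 <- lognormal_mean(log(85.1), sqrt(0.0329))

print(c(sex0 = expected_wt_sex0, sex1 = expected_wt_sex1))
```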

I see that the number of simulated values has an effect on the mean
calculated after imputation. That is, the code rlnorm(100, meanlog = log(73),
sdlog = sqrt(0.0442)) gives a much better match compared to
rlnorm(1, meanlog = log(73), sdlog = sqrt(0.0442)) in the ifelse statement in
the code above.

My understanding is that ifelse imputes only one value wherever the
condition is met.  I would appreciate your insights on why increasing the
number of simulated values gives a better match.

Regards,
Ayyappa

Thierry Onkelinx

2016-Jun-14 15:15 UTC

[R] rlnorm behaviour

Dear Ayyappa,

ifelse works on a vector. See the example below.

ifelse(
  sample(c(TRUE, FALSE), size = length(letters), replace = TRUE),
  letters,
  LETTERS
)

However, note that it will recycle short vectors when they are not of equal
length.

ifelse(
  sample(c(TRUE, FALSE), size = 2 * length(letters), replace = TRUE),
  letters,
  LETTERS
)

In your code the length of the condition vector is 200, the length of the
two other vectors is 100.
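
A minimal sketch of what that recycling does in the posted code, using the same sizes (the seed and the intermediate variable names are assumptions for illustration). Because yes and no are recycled to the length of the condition, element i of the result is read from position ((i - 1) %% length(yes)) + 1 of the chosen vector, so a length-1 vector gives every matching row the same weight.

```r
set.seed(1)  # arbitrary seed, only for reproducibility
Sex <- sample(c(0, 1), 200, replace = TRUE, prob = c(0.4, 0.6))

# Length-1 draws: every row in a group receives the same single value.
yes1 <- rlnorm(1, meanlog = log(85.1), sdlog = sqrt(0.0329))
no1  <- rlnorm(1, meanlog = log(73),   sdlog = sqrt(0.0442))
wt1  <- ifelse(Sex == 1, yes1, no1)
print(length(unique(wt1)))  # number of distinct weights across all 200 rows

# Length-100 draws: recycled to 200, so rows i and i + 100 read the same position.
yes100 <- rlnorm(100, meanlog = log(85.1), sdlog = sqrt(0.0329))
no100  <- rlnorm(100, meanlog = log(73),   sdlog = sqrt(0.0442))
wt100  <- ifelse(Sex == 1, yes100, no100)

# Reproduce the recycling by hand and compare.
recycled_yes <- rep(yes100, length.out = 200)
print(all(wt100[Sex == 1] == recycled_yes[Sex == 1]))
```

With a single draw per group, the group mean is just that one draw, which would explain why rlnorm(1, ...) matches the target so much worse than rlnorm(100, ...).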

Best regards,

ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and
Forest
team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
Kliniekstraat 25
1070 Anderlecht
Belgium

To call in the statistician after the experiment is done may be no more
than asking him to perform a post-mortem examination: he may be able to say
what the experiment died of. ~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data. ~ Roger Brinner
The combination of some data and an aching desire for an answer does not
ensure that a reasonable answer can be extracted from a given body of data.
~ John Tukey

Thierry Onkelinx

2016-Jun-14 15:42 UTC

[R] rlnorm behaviour

Please keep r-help in cc.

Yes. Have a look at this example

ifelse(
  sample(c(TRUE, FALSE), size = 0.5 * length(letters), replace = TRUE),
  letters,
  LETTERS
)
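
Put differently (a sketch; cond and res are illustrative names): the result always has the length of the condition. When yes and no are longer than the condition, only their first length(cond) positions are consulted and the remaining elements are ignored.

```r
set.seed(1)  # arbitrary seed, only for reproducibility
cond <- sample(c(TRUE, FALSE), size = 13, replace = TRUE)
res  <- ifelse(cond, letters, LETTERS)

print(length(res))  # same length as cond, not as letters

# Whatever case was chosen, position i of the result is the i-th letter,
# so letters 14 to 26 never appear.
print(all(toupper(res) == LETTERS[1:13]))
```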


2016-06-14 17:31 GMT+02:00 Ayyappa Chaturvedula <ayyappach at gmail.com>:
> Thank you very much for your kind support.  The length of my condition
> vector is ~80 because I want only Sex==1 and else will be the other.  I
> understand now how ifelse works.  If the simulated vector is longer than
> the condition vector, does it take the first few elements to match the
> length of the condition vector and discard the rest?
>
> Regards,
> Ayyappa

Ayyappa Chaturvedula

2016-Jun-14 15:47 UTC

[R] rlnorm behaviour

I am sorry, I missed that.  I think I have now made it more appropriate,
without drawing unnecessary simulated values.  Thank you for your help.

fulldata$Wt <- ifelse(fulldata$Sex == 1,
                      rlnorm(length(fulldata$Sex[fulldata$Sex == 1]),
                             meanlog = log(85.1), sdlog = sqrt(0.0329)),
                      rlnorm(length(fulldata$Sex[fulldata$Sex == 0]),
                             meanlog = log(73), sdlog = sqrt(0.0442)))
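
One caveat, offered as a sketch rather than as a correction made in the thread: ifelse still recycles both vectors to length 200 and reads them by row position, so with group-sized vectors some drawn values can be reused while others are never used. One way to sidestep recycling is to draw exactly one value per row, relying on rlnorm being vectorised over meanlog and sdlog. The seed and the names meanlog_i and sdlog_i are assumptions for illustration.

```r
set.seed(42)  # arbitrary seed, only for reproducibility
ID  <- 1:200
Sex <- sample(c(0, 1), 200, replace = TRUE, prob = c(0.4, 0.6))
fulldata <- data.frame(ID, Sex)

# Build per-row parameters; ifelse is safe here because yes/no are scalars.
meanlog_i <- ifelse(fulldata$Sex == 1, log(85.1), log(73))
sdlog_i   <- ifelse(fulldata$Sex == 1, sqrt(0.0329), sqrt(0.0442))

# One independent draw per row, each with that row's own parameters.
fulldata$Wt <- rlnorm(nrow(fulldata), meanlog = meanlog_i, sdlog = sdlog_i)

# Group means, for the same check as in the original post.
print(tapply(fulldata$Wt, fulldata$Sex, mean))
```

An equivalent alternative is indexed assignment, for example filling fulldata$Wt[fulldata$Sex == 1] with a vector of exactly sum(fulldata$Sex == 1) draws, and likewise for the other group.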

