thr3ads.net - R help - [R] [non-statistics question]methodological problem [Oct 2007]

eugen pircalabelu

2007-Oct-27 08:21 UTC

[R] [non-statistics question]methodological problem

Good afternoon!

As mentioned in the subject, my question regards more the methodological part
that accompanies survey design and the statistical part that is involved.
So, I have the following data:

a <- data.frame(id_hh = 1:5, strata = c(1, 1, 2, 2, 1),
                Nhstrata = c(100, 100, 200, 200, 100), Nrmemb = c(2, 4, 2, 5, 4))
a$ocmemb1 <- c("wk", "jl", "st", "jl", "st")
a$ocmemb2 <- c("wk", "jl", "st", "wk", "wk")

where id_hh is an identification code for the household (my analysis refers to
households), strata is the stratum from which the hh is sampled, Nhstrata is the
size of the population stratum from which the hh is sampled, Nrmemb is the
number of members in a hh, and ocmemb1, ocmemb2, ... are the occupations of the
individual members of the hh (worker, jobless, student).
> a
  id_hh strata Nhstrata Nrmemb ocmemb1 ocmemb2
1     1      1      100      2      wk      wk
2     2      1      100      4      jl      jl
3     3      2      200      2      st      st
4     4      2      200      5      jl      wk
5     5      1      100      4      st      wk

Now, is it possible to design weights for each household based on the
characteristics of the individuals who form the hh? Say I want to calibrate
each hh on its occupational category, but I don't have the auxiliary data
for households; it is available only for individuals. For example, I don't
know that 32% of households fall in the "student hh" category (a classification
based on the status of the head of the hh), but I do know that 32% of all the
individuals in the population from which the hhs are sampled are students.
So, is it possible to design these household weights when the auxiliary
information is available only for the individuals who form the hhs? And is this
a sound way of calibrating, I mean, is it reliable and trustworthy?


Thank you and have a great day!





Thomas Lumley

2007-Oct-29 15:04 UTC


[R] [non-statistics question]methodological problem

On Sat, 27 Oct 2007, eugen pircalabelu wrote:
>
> As mentioned in the subject, my question regards more the methodological 
> part that accompanies survey design and the statistical part that is 
> involved. So, I have the following data:
You might get more helpful (or more authoritative) advice on 
methodological issues in survey sampling on other lists, in particular 
from srmsnet, rather than posting the same question twice to r-help.
>
> Now, is there a possibility of designing some weights for each household 
> based on the characteristics of individuals which form the hh? Say, I 
> want to calibrate each hh for its occupational category but i don't have
> the additional data for household, rather it is available for 
> individuals, ex: I don't know that 32% of households are included in the
> category of studenthh (inclusion which is based on the status of the 
> head of hh), but i know that 32% of all the individuals from which the 
> sample of hhs is drawn are all students.
Yes and no.  You can't calibrate to population totals you don't know.

You can create household-level weights that calibrate the individual-level 
data to individual-level population totals. And the survey package knows 
how to do this: it is the aggregate.stage= argument to calibrate() if you 
are using design information for your standard errors, or the 
aggregate.index= argument if you are using replicate weights.
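
The household-level calibration described above can be sketched roughly as
follows. This is only an illustration under stated assumptions, not a
recommended analysis: the person-level file is built from the two recorded
members of each of the five sampled households, and the population totals in
pop are invented for the example (192 students out of 600 persons, i.e. 32%).
The object names persons, des, pop and cal are likewise made up here.

```r
library(survey)

# Person-level (long) version of the sample: one row per recorded member.
# Only ocmemb1 and ocmemb2 are available, so each household contributes two rows.
persons <- data.frame(
  id_hh    = rep(1:5, each = 2),
  strata   = rep(c(1, 1, 2, 2, 1), each = 2),
  Nhstrata = rep(c(100, 100, 200, 200, 100), each = 2),
  occ      = factor(c("wk", "wk", "jl", "jl", "st", "st", "jl", "wk", "st", "wk"),
                    levels = c("jl", "st", "wk"))
)

# Households are the sampling units within strata; with fpc given as stratum
# population sizes and no explicit weights, svydesign() derives the sampling
# weights from them.
des <- svydesign(id = ~id_hh, strata = ~strata, fpc = ~Nhstrata, data = persons)

# HYPOTHETICAL individual-level population totals by occupation (assumption).
pop <- c(occjl = 150, occst = 192, occwk = 258)

# Calibrate to the person-level totals, forcing the calibrated weight to be
# the same for everyone in a household (stage 1 cluster = id_hh).
cal <- calibrate(des, ~occ - 1, population = pop, aggregate.stage = 1)

svytotal(~occ, cal)                     # estimated totals after calibration
tapply(weights(cal), persons$id_hh, unique)  # one weight per household
```

With a replicate-weight design the analogous call would use aggregate.index=
with a vector of household identifiers instead of aggregate.stage=.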

I don't know if this technique is useful in your setting.  My impression 
is that it is mainly used by national statistics agencies that want to 
avoid weird-looking inconsistencies (eg 2,000,000 marriages involving 
1,100,000 men and 900,000 women [1]).  It is presumably less efficient 
than using individual-level weights.  A description from Statistics 
Belgium is linked from ?calibrate.

 	-thomas

[1] Apart from in civilised places like, eg, Canada or MA.

Thomas Lumley			Assoc. Professor, Biostatistics
tlumley at u.washington.edu	University of Washington, Seattle
