On Mon, May 24, 2021 at 2:11 PM Greg Minshall <minshall at umich.edu>
wrote:
> [...]
> if you have 500 columns of possibly-NA'd variables, you could have one
> column of 500 "bits", where each bit has one of N values, N being the
> number of explanations the corresponding column has for why the NA
> exists.
>
The mere thought of implementing something like that gives me shivers. Not
to mention such a solution should also be robust when subsetting,
splitting, column and row binding, etc. and everything can be lost if the
user deletes that particular column without realising its importance.
Social science datasets are much more alive and complex than one might
first think: there are multi-wave studies with tens of countries, and
aggregating such data is already a complex process, without adding even more
complexity on top of that.
As undocumented as they may be, or even subject to change, I think the R
internals are much more reliable than this.
Best wishes,
Adrian
--
Adrian Dusa
University of Bucharest
Romanian Social Data Archive
Soseaua Panduri nr. 90-92
050663 Bucharest sector 5
Romania
https://adriandusa.eu