Ron Crump wrote:
> Hi,
>
> I have a dataframe that contains pedigree information;
> that is individual, sire and dam identities as separate
> columns. It also has date of birth.
>
> These identifiers are not numeric, or not sequential.
>
> Obviously, an identifier can appear in one or two columns,
> depending on whether it was a parent or not. These should
> be consistent.
>
> Not all identifiers appear in the individual column - it
> is possible for a parent not to have its own record if its
> parents were not known.
>
> Missing parental (sire and/or dam) identifiers can occur.
>
> I need to export the data for use in another program that
> requires the pedigree to be coded as integers, increasing
> with date of birth (therefore sire and dam always have
> lower identifiers than their offspring) and with missing
> values coded as 0.
>
> How would I go about doing this?
>
You might look at http://www.qimr.edu.au/davidD/sib-pair.R,
specifically the read.pedigree() and wrlink() functions. The former is not
very impressive speedwise -- I usually perform these tasks in
my Sib-pair (Fortran) program, which is on the same webpage. It will order
the pedigree by generational position, so a DOB is not required to do the sort.
Terry Therneau's kinship package does that ordering, but doesn't include
output routines for the Linkage format.
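
If you would rather stay in plain R, the recoding described in the question can be sketched roughly as below. This is only an illustration, not what read.pedigree() or Sib-pair do: it sorts on date of birth rather than generational position, and the function name recode_pedigree() and the column names id, sire, dam and dob are assumptions, as is the convention that missing parents are NA and dob is a Date. Parents without a record of their own are added as founders and numbered first.

```r
# Sketch only: assumes columns id, sire, dam (character or factor) and
# dob (Date), with missing parents stored as NA.
recode_pedigree <- function(ped) {
  ped$id   <- as.character(ped$id)
  ped$sire <- as.character(ped$sire)
  ped$dam  <- as.character(ped$dam)

  # Parents that never appear in the individual column become founders
  # with unknown parents and unknown date of birth.
  parents <- unique(na.omit(c(ped$sire, ped$dam)))
  extra   <- setdiff(parents, ped$id)
  founders <- data.frame(id   = extra,
                         sire = rep(NA_character_, length(extra)),
                         dam  = rep(NA_character_, length(extra)),
                         dob  = rep(as.Date(NA), length(extra)),
                         stringsAsFactors = FALSE)
  full <- rbind(founders, ped[, c("id", "sire", "dam", "dob")])

  # Unrecorded founders (NA dob) first, then increasing date of birth.
  full <- full[order(!is.na(full$dob), full$dob), ]

  # New integer codes are the row positions; unknown parents become 0.
  new_id <- seq_len(nrow(full))
  code <- function(x) {
    k <- match(x, full$id)
    k[is.na(k)] <- 0L
    k
  }
  out <- data.frame(id = new_id, sire = code(full$sire), dam = code(full$dam))

  # Fails if the dates of birth are inconsistent with the pedigree.
  stopifnot(all(out$sire < out$id), all(out$dam < out$id))

  list(pedigree = out,
       key = data.frame(old = full$id, new = new_id,
                        stringsAsFactors = FALSE))
}

# Made-up example data.
ped <- data.frame(id   = c("K9", "A7", "Z3"),
                  sire = c("A7", "Q1", "A7"),
                  dam  = c("M2", NA, "K9"),
                  dob  = as.Date(c("2003-05-01", "2001-03-10", "2005-07-20")),
                  stringsAsFactors = FALSE)
res <- recode_pedigree(ped)
```

res$pedigree can then be written out with write.table(), and res$key keeps the mapping back to the original identifiers. If the dates of birth are unreliable, ordering by generational position as described above is the safer route.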
David Duffy.
| David Duffy (MBBS PhD) ,-_|\
| email: davidD at qimr.edu.au ph: INT+61+7+3362-0217 fax: -0101 / *
| Epidemiology Unit, Queensland Institute of Medical Research \_,-._/
| 300 Herston Rd, Brisbane, Queensland 4029, Australia GPG 4D0B994A v