Dear R people,

The new package 'icd9' provides a range of tools for working with ICD-9-CM codes.

http://cran.r-project.org/web/packages/icd9/index.html
https://github.com/jackwasey/icd9

ICD-9 (clinical modification) is primarily used for categorizing diseases in the USA for hospital administration, whereas ICD-10 is used by the rest of the world for disease surveillance. This package is currently restricted to ICD-9-CM codes.

I've seen other R code which manipulates ICD-9 codes, but it often makes the mistake of treating them as numeric. They are not: 100.0 is a different code from both 100 and 100.00. This package takes care of validating these codes, explaining them (converting a code to plain English), comparing them, and attributing codes to groups of codes in order to assign co-morbidities to patients.

ICD-9 codes are often provided in a shortened format without a decimal point, and these short codes have distinct validation rules. Functions to convert between decimal and short forms are provided. All key parts use vectorized code, and comorbidities for a million patient visits can be assigned in a few seconds on a modest workstation.

SAS code is published by AHRQ to allow assignment of ICD-9 codes to comorbidities. This package contains some SAS source to R code translation, so that the canonical ICD-9-CM to comorbidity mapping provided by AHRQ can be derived directly, without the cumbersome and error-prone manual task of re-encoding the relationships in R. I believe a SAS to R converter was an April Fools' joke some time ago, but this is indeed a very limited answer to that problem.

http://www.biostatistics.dk/sas2r/index.html

A short vignette covers the major use-cases.

http://cran.r-project.org/web/packages/icd9/vignettes/icd9.pdf

The code is supported by a fairly thorough test suite, and is well documented in the hope that it will be easier for users of the package to understand it, and to get involved.
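
To make the earlier point about numeric handling concrete, here is a minimal base-R sketch. It does not use the icd9 package and does not show its API: the helper decimal_to_short_sketch is a hypothetical name invented for this illustration, and it assumes the usual short-form convention for purely numeric codes (the part before the decimal point zero-padded to three digits, followed by the part after it). V and E codes are deliberately left out.

```r
# Three distinct ICD-9-CM codes, held as character strings.
codes <- c("100", "100.0", "100.00")

# Numeric coercion collapses them into a single value ...
length(unique(as.numeric(codes)))  # 1

# ... whereas keeping them as character preserves all three.
length(unique(codes))  # 3

# Hypothetical helper, NOT part of the icd9 package: convert purely numeric
# decimal codes to short form. Assumes a 1-3 digit part before the decimal
# point and at most two digits after it; V and E codes are not handled.
decimal_to_short_sketch <- function(x) {
  x <- trimws(as.character(x))
  stopifnot(all(grepl("^[0-9]{1,3}(\\.[0-9]{0,2})?$", x)))
  major <- sub("\\..*$", "", x)        # digits before the decimal point
  minor <- sub("^[0-9]*\\.?", "", x)   # digits after it, possibly none
  paste0(sprintf("%03d", as.integer(major)), minor)
}

decimal_to_short_sketch(codes)  # "100" "1000" "10000"
```

Because the part before the decimal point always occupies exactly three characters in this sketch, the short forms stay unambiguous: "1000" can only mean 100.0, and a code such as 1.1 becomes "0011". The package's own conversion and validation functions cover considerably more cases than this.
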
I chose to export only key functions where I had thought carefully about the external API, but all internal functions are documented and contain potentially useful nuggets for power users.

Comments and contributions are most welcome. In particular, I'd love to see unit tests corresponding to any failures you may encounter working with your own ICD-9 data.

Hope you find this useful.

Jack

--
Jack Wasey
Resident Physician, Anesthesiology and Critical Care Medicine
Johns Hopkins Hospital
Baltimore, MD, USA