Hi group,
My case has N physicians and M patients.
One physician could have seen a group of patients, or
a patient could have been seen by multiple
physicians. In other words, there are overlaps. Now,
I have the following NxM matrix:
              Patient#1  Patient#2  Patient#3  .......  Patient#m
Physician#1       1          0          1      .......      0
Physician#2       1          1          1      .......      1
Physician#3       0          1          0      .......      1
     .            .          .          .      .......      .
     .            .          .          .      .......      .
Physician#n       1          1          0      .......      0
"1" indicates a previous encounter and "0" otherwise. My
aim is to identify physician group practices based on
the common patients they see. Any suggestions on which
R package would best serve this purpose? Thank you so
much!
Regards,
Kelvin
For the dissimilarity metric I would suggest manhattan, as provided by dist
(stats package, part of base R) and by daisy and agnes (both in the cluster
package), because in your case a common "0" is meaningful: it means that
neither physician saw the patient.
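As a minimal sketch of the metric suggestion above: the toy matrix below is made up for illustration (4 physicians, 5 patients), not data from this thread. On 0/1 rows the Manhattan distance is simply the number of patients on which two physicians differ.

```r
library(cluster)  # for daisy()

# Hypothetical toy data: rows = physicians, columns = patients, 1 = previous encounter
visits <- matrix(
  c(1, 0, 1, 0, 0,
    1, 0, 1, 1, 0,
    0, 1, 0, 0, 1,
    0, 1, 0, 1, 1),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("Physician", 1:4), paste0("Patient", 1:5))
)

# Manhattan distance between rows: counts the patients on which two physicians differ
d_dist <- dist(visits, method = "manhattan")

# Same metric via the cluster package; daisy() may warn that binary columns are
# treated as interval scaled, which is what we want here, so the warning is suppressed
d_daisy <- suppressWarnings(daisy(visits, metric = "manhattan"))

print(d_dist)
```

Physician1 and Physician2 differ only on Patient4, so their distance is 1; a shared "0" contributes nothing to the distance, just like a shared "1".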
With complete linkage, the merge height is the largest number of patients on
which any two physicians in a cluster differ, so you can see exactly how many
patients (seen or not seen) any two physicians in one cluster have at least in
common. If the height goes up too fast, so that you would have to extract too
many clusters, you can use average linkage.
For the clustering you can use hclust from the stats package, agnes from the
cluster package, or, when hclust or agnes run out of memory, clara from the
cluster package (see thread "[R] cluster analysis for 80000 observations").
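For the large-data case, a rough sketch with clara might look like the following. The simulated matrix, the number of clusters k = 3, and samples = 20 are all assumptions chosen for illustration. Note that clara is a partitioning (medoid-based) method working on subsamples, not a hierarchical one, so the number of clusters has to be chosen up front and no dendrogram is produced.

```r
library(cluster)  # for clara()

set.seed(1)

# Hypothetical simulated data: 2000 physicians x 20 patients, 1 = previous encounter
n_phys <- 2000
n_pat <- 20
visits <- matrix(rbinom(n_phys * n_pat, size = 1, prob = 0.3),
                 nrow = n_phys, ncol = n_pat)

# clara() clusters around medoids found on subsamples, so it never builds the
# full n x n dissimilarity matrix that hclust() and agnes() require
fit <- clara(visits, k = 3, metric = "manhattan", samples = 20)

# Cluster membership for every physician
table(fit$clustering)
```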
sincerely, Markus
___________________
Markus Preisetanz
Consultant
Client Vela GmbH
Albert-Roßhaupter-Str. 32
81369 München
fon: +49 (0) 89 742 17-113
fax: +49 (0) 89 742 17-150
mailto:markus.preisetanz@clientvela.com