Hi everyone,

I have a number of samples and I want to visualize the groups those samples belong to.
For example, suppose I have three variables (age, sex, and smoker/nonsmoker) and three samples, S1, S2, S3.

S1 is 35, male, nonsmoker
S2 is 24, female, nonsmoker
S3 is 30, female, smoker

In the end I have the following data frame:

     S1  S2  S3
age  35  24  30
sex  M   F   F
smk  N   N   S

What I would like is to see this represented as a matrix, with colors representing the group each sample belongs to. In the example, age would have three levels, while sex and smoker/nonsmoker would have two each.

An example of what I would like to obtain is in the attached image (from The Cancer Genome Browser at UCSC).
You can see the class of each sample represented by the color.
Clearly there are useless variables here, like sample name, but the example gives an idea of what I would like to get.

So far I have been able to get a pseudo-result with colorbar.plot, but I find it hard to get the labels in the correct position: I cannot find a way to automatically put them next to each class bar.

Any suggestions other than colorbar.plot?
On 11/14/2012 11:04 AM, michele caseposta wrote:
> [...]
> Any suggestions other than colorbar.plot?
>

Hi michele,

Your picture didn't come through, but it was fairly easy to find. I'm not entirely sure about this, but are you looking for a hierarchical breakdown of your variables? The illustration on the right side of your example looks like that. Sizetrees provide such a breakdown with successive stacked bars, in which each bar in the leftmost stack splits into its components in the next stack, like smoke -> sex -> age. Alternatively, you can illustrate relationships like these with nested bar plots, in which subcategories are nested within the superordinate categories. See the sizetree and barNest functions in the plotrix package.

Jim
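
To make the smoke -> sex -> age breakdown concrete, here is a minimal base-R sketch of the nested counts that such a hierarchical display is built on. It only tabulates the data and draws nothing. The data frame name mcdat and the column names are assumptions for illustration; the values are the three samples from the original post.

```r
# Three samples from the original post, one row per sample.
# stringsAsFactors = TRUE mimics the data.frame() default of R < 4.0.0.
mcdat <- data.frame(
  smk = c("N", "N", "S"),
  sex = c("M", "F", "F"),
  age = c(35, 24, 30),
  stringsAsFactors = TRUE
)

# First level: how many samples fall in each smoking group.
smk_counts <- table(mcdat$smk)

# Second level: each smoking group split by sex.
smk_sex_counts <- table(mcdat$smk, mcdat$sex)

# Full hierarchy smk -> sex -> age as a flat contingency table.
nested <- ftable(xtabs(~ smk + sex + age, data = mcdat))

print(smk_counts)
print(smk_sex_counts)
print(nested)
```

Each level of this tabulation corresponds to one stack of bars in a hierarchical plot: the first stack is sized by smk_counts, and each of its bars is then subdivided according to the next variable.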
Hi Jim,

thanks again for your support.
Yes, I meant the subject codes; I will add a new variable and set its color to white throughout.

Thanks,
Michele

On Nov 15, 2012, at 1:13 AM, Jim Lemon wrote:

> On 11/15/2012 07:21 AM, michele caseposta wrote:
>> Back again.
>> Is there a quick way to add the sample names in the plot?
>> I was not able to find anything other than creating a new category with the name in it (and the same color all over).
>
> Hi Michele,
> If by "sample names" you mean the variable names in your data frame:
>
> library(plotrix)
> mcdat<-data.frame(smk=c("No","No","Yes"),sex=c("M","M","F"),
>  age=c(35,24,30))
> sizetree(mcdat,toplab=c("Smoker","Sex","Age"),
>  col=list(c("gray80","gray20"),c("pink","lightblue"),rainbow(3)))
>
> This displays the variable names at the top of the stacked bars. If you mean the subject codes, then yes, you would have to create a new variable.
>
> Jim
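
The workaround agreed on above (a new variable holding the subject codes, drawn in white) might look like the following sketch. It reuses Jim's example data. The column name id, the "Sample" label and the codes S1 to S3 are assumptions for illustration, and the plot is only drawn if the plotrix package is installed.

```r
# Jim's example data plus a subject-code column (hypothetical name: id).
# stringsAsFactors = TRUE mimics the data.frame() default of R < 4.0.0.
mcdat <- data.frame(
  smk = c("No", "No", "Yes"),
  sex = c("M", "M", "F"),
  age = c(35, 24, 30),
  id  = c("S1", "S2", "S3"),
  stringsAsFactors = TRUE
)

# One color vector per stack; the subject-code stack is white throughout,
# so it serves only as a label column.
stack_cols <- list(
  c("gray80", "gray20"),
  c("pink", "lightblue"),
  rainbow(3),
  rep("white", nrow(mcdat))
)

# Draw only when plotrix is available, so the data setup runs either way.
if (requireNamespace("plotrix", quietly = TRUE)) {
  plotrix::sizetree(mcdat,
                    toplab = c("Smoker", "Sex", "Age", "Sample"),
                    col = stack_cols)
}
```

Putting the subject codes in the last column keeps them at the right-hand end of the hierarchy, so each code sits next to the age bar of its own sample.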