Rolf Turner
2020-Mar-29 00:46 UTC
[R] Syntax for geom_point in ggplot2, distinguishing point types.
I have an application in which I want to plot points with different colours and symbols to distinguish (three) types of point. I thought I had it figured out, mostly by a trial-and-error approach and by simple-mindedly following instructions from others, without (sad to say) really understanding what I was doing.

I have now encountered an example where my syntax screws up, and I can't see how to fix it in a coherent and generally applicable way. The problem arises if the data column (factor) specifying the point types is missing one level. When this happens, the types get out of synch.

I have attached a source-able file "demo.txt" and the two data sets upon which it depends as "data1.txt" and "data2.txt" (dput() format).

The result from using the data from "data1.txt" (plotObj1) demonstrates the problem, i.e. the unwanted phenomenon. The result from using the data from "data2.txt" (plotObj2) is "as it should be".

In the data from data1.txt the "extremes" column, which specifies the point type, is missing the level "0":

> table(Dat1$extremes)
>
>        0 intermed        1
>        0       39       43

As you will be able to see, the plot of these data has the "0" symbol (a triangle) where there should be an "intermed" symbol (a solid dot), and the "intermed" symbol where there should be a "1" symbol (an upside-down triangle).

The data from data2.txt have an extremes column with all three levels and the corresponding plot is as desired.

I'm pretty sure that the solution to my problem is quite simple, but I am too stupid to figure it out. Can anyone guide me in the right direction?

Thanks.

cheers,

Rolf Turner

--
Honorary Research Fellow
Department of Statistics
University of Auckland
Phone: +64-9-373-7599 ext. 88276

Attachments:
demo.txt: <https://stat.ethz.ch/pipermail/r-help/attachments/20200329/c9433604/attachment.txt>
data1.txt: <https://stat.ethz.ch/pipermail/r-help/attachments/20200329/c9433604/attachment-0001.txt>
data2.txt: <https://stat.ethz.ch/pipermail/r-help/attachments/20200329/c9433604/attachment-0002.txt>
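
The contents of demo.txt are not reproduced in this thread, so the following is only a sketch of one plausible way the symptom described above can arise. It assumes the plot uses scale_shape_manual() with an unnamed values vector; the data frame `dat`, the shape codes 2, 16 and 6, and the names `p_bad` and `merged` are all hypothetical stand-ins, not taken from the attachments.

```r
library(ggplot2)

# Hypothetical stand-in for Dat1: three declared levels, but "0" never occurs.
dat <- data.frame(
  x = 1:6,
  y = c(2, 4, 1, 5, 3, 6),
  extremes = factor(
    c("intermed", "1", "intermed", "1", "intermed", "1"),
    levels = c("0", "intermed", "1")
  )
)

# Unnamed values are handed out by position to the levels that remain after
# the discrete scale drops unused ones (drop = TRUE is the default).
p_bad <- ggplot(dat, aes(x, y, shape = extremes)) +
  geom_point(size = 3) +
  scale_shape_manual(values = c(2, 16, 6))  # assumed: triangle, dot, inverted triangle

# Recover which shape code each level actually received in the built plot,
# joining on x so the result does not depend on row order.
merged <- merge(dat, layer_data(p_bad)[c("x", "shape")], by = "x")
print(unique(merged[c("extremes", "shape")]))
```

Under these assumptions "intermed" is drawn with the first shape in the vector and "1" with the second, which matches the shift described in the message.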
Jeff Newmiller
2020-Mar-29 01:52 UTC
[R] Syntax for geom_point in ggplot2, distinguishing point types.
My general approach is to import data without letting the import convert to factors. Then I explicitly create my factors with the levels I want them to have, with explicit vectors specified for the levels. Fixing this after the analysis is possible but a headache.

--
Sent from my phone. Please excuse my brevity.
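
A minimal sketch of the approach described in this reply: keep the type column as character on import, then build the factor with an explicit levels vector. The plotting half goes one step beyond what the reply states and is an assumption of this sketch: it uses named values vectors so that each symbol and colour is matched to a level by name. The data frame `raw`, the names `type_levels`, `type_shapes` and `type_colours`, and the particular shape codes and colours are hypothetical.

```r
library(ggplot2)

# Hypothetical raw data: the type column arrives as character, not factor
# (for example via read.csv(..., stringsAsFactors = FALSE)).
raw <- data.frame(
  x = 1:6,
  y = c(2, 4, 1, 5, 3, 6),
  extremes = c("intermed", "1", "intermed", "1", "intermed", "1"),
  stringsAsFactors = FALSE
)

# Build the factor explicitly so the level set does not depend on the data.
type_levels <- c("0", "intermed", "1")
raw$extremes <- factor(raw$extremes, levels = type_levels)

# Named vectors tie each symbol and colour to a level by name, not position.
type_shapes  <- c("0" = 2, "intermed" = 16, "1" = 6)
type_colours <- c("0" = "blue", "intermed" = "black", "1" = "red")

p <- ggplot(raw, aes(x, y, shape = extremes, colour = extremes)) +
  geom_point(size = 3) +
  scale_shape_manual(values = type_shapes) +
  scale_colour_manual(values = type_colours)

# Check the mapping in the built plot, joining on x to avoid relying on row order.
merged <- merge(raw, layer_data(p)[c("x", "shape", "colour")], by = "x")
print(unique(merged[c("extremes", "shape", "colour")]))
```

With named values, "intermed" and "1" keep their own symbols whether or not "0" occurs in the data. Passing drop = FALSE to both scales is an alternative that keeps unused levels in the scale, so that an unnamed values vector lines up with the declared level order.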