Anupam Tyagi
2024-Sep-04 13:16 UTC
[R] dotchart and dotplot(lattice) plot with two/three conditioning variables
Hello,

I am trying to make a Cleveland dot plot with two, if possible three, variables on the vertical axis. I was able to do it in Stata with two variables, Year and Population (see the graph at this link: https://drive.google.com/file/d/1SiIfmmqk6IFa_OI5i26Ux1ZxkN2oek-o/view?usp=sharing ). I hope the link to the graph works. I have never tried this before.

I want to make a similar (possibly better) graph in R. I tried several ways to make it with dotchart() and with dotplot() from lattice, but I have been only partially successful so far. I would like Year, Population and popGroup on the vertical axis. If popGroup occupies too much space, then I would like a gap between the groups of Cities and Villages, so they can be seen as distinct "Populations". My code and made-up data are below (the actual data have 18 categories in "Population", instead of only six in the made-up data). How can I make this type of graph?

# Only for 2004-05. How to plot 2011-12 on the same plot?
dotchart(test$"X0_50"[test$"Year"=="2004-05"], labels=test$Population,
         xlab = "Income Share ",
         main = "Income shares of percentiles of population", xlim = c(12, 50))
points(test$"X50_90"[test$"Year"=="2004-05"], 1:6, pch = 2)
points(test$"X90_100"[test$"Year"=="2004-05"], 1:6, pch = 16)
legend(x = "topleft",
       legend = c("0-50%", "50-90%", "90-100%"),
       pch = c(1,2, 16)
)

# reorder so Year 2004-05 is plotted before Year 2011-12. This is not
# plotting correctly for the second and third variables. The gap between
# different Cities and Villages is quite large.
test2 <- test[order(test$seqCode, test$Year, decreasing = T),]

dotchart(test2$"X0_50", labels=test2$Year, xlab = "Income Share ",
         main = "Income shares of percentiles of population",
         groups = as.factor(test2$Population), xlim = c(12, 50))
points(test2$"X50_90", 1:12, pch = 2)
points(test2$"X90_100", 1:12, pch = 16)

# use lattice library
library(lattice)
dotplot(reorder(Population, -seqCode) ~ test$"X0_50" + test$"X50_90" +
        test$"X90_100", data = test, auto.key = TRUE)

testLong <- reshape(test, idvar = c("Population", "Year"),
                    varying = list(5:7),
                    v.names = "ptile", direction = "long")

dotplot(reorder(Population, -seqCode) ~ ptile | Year, data = testLong,
        groups = time, auto.key = T)

The data frame is below, produced with dput(). It is named "test" in my code.

structure(list(seqCode = c(1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L,
4L, 5L, 6L), popGroup = c("City", "City", "City", "Village",
"Village", "Village", "City", "City", "City", "Village", "Village",
"Village"), Population = c("Dallas", "Boston", "Chicago", "Kip",
"Von", "Dan", "Dallas", "Boston", "Chicago", "Kip", "Von", "Dan"
), Year = c("2004-05", "2004-05", "2004-05", "2004-05", "2004-05",
"2004-05", "2011-12", "2011-12", "2011-12", "2011-12", "2011-12",
"2011-12"), X0_50 = c(15.47, 21.29, 18.04, 15.62, 18.89, 24.37,
17.43, 17.99, 18.04, 14.95, 16.33, 28.98), X50_90 = c(44.12,
43.25, 45.72, 46.15, 43.84, 46.24, 44.39, 44.08, 43.62, 42.89,
44.57, 47.14), X90_100 = c(40.42, 35.47, 36.24, 38.24, 37.27,
29.39, 38.18, 37.93, 38.34, 42.16, 39.11, 23.88)), class = "data.frame",
row.names = c(NA, -12L))

--
Anupam.
Deepayan Sarkar
2024-Sep-04 14:31 UTC
[R] dotchart and dotplot(lattice) plot with two/three conditioning variables
For lattice::dotplot(), you are close; this is more like the layout you want:

dotplot(Year ~ ptile | reorder(Population, ptile, mean), testLong,
        groups = c("0-50", "50-90", "90-100")[time],
        layout = c(1, NA),
        par.settings = simpleTheme(pch = 16), auto.key = TRUE)

dotchart() works better with tables, but unfortunately it doesn't seem to handle more than two dimensions, so you can only get one group at a time:

xtabs(ptile ~ Year + Population, testLong, subset = time == 1) |>
    dotchart(pch = 16)

This seems like something that should not be too difficult to improve.

Best,
-Deepayan
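One possible way to get the requested gap between Cities and Villages in the lattice layout is sketched below. This is only a sketch under assumptions, not a tested recipe for the real 18-category data: the column name `band` is made up for illustration, the panels are ordered by seqCode (rather than by mean ptile) so that the three cities and three villages stay adjacent, and the `between` value of 1 is an arbitrary gap size. The data frame is rebuilt from the dput() output in the original post.

```r
library(lattice)

# Rebuild the made-up data from the original post.
test <- data.frame(
  seqCode    = rep(1:6, 2),
  popGroup   = rep(rep(c("City", "Village"), each = 3), 2),
  Population = rep(c("Dallas", "Boston", "Chicago", "Kip", "Von", "Dan"), 2),
  Year       = rep(c("2004-05", "2011-12"), each = 6),
  X0_50   = c(15.47, 21.29, 18.04, 15.62, 18.89, 24.37,
              17.43, 17.99, 18.04, 14.95, 16.33, 28.98),
  X50_90  = c(44.12, 43.25, 45.72, 46.15, 43.84, 46.24,
              44.39, 44.08, 43.62, 42.89, 44.57, 47.14),
  X90_100 = c(40.42, 35.47, 36.24, 38.24, 37.27, 29.39,
              38.18, 37.93, 38.34, 42.16, 39.11, 23.88)
)

# Same reshape as in the original post: one row per Population/Year/band.
testLong <- reshape(test, idvar = c("Population", "Year"),
                    varying = list(5:7),
                    v.names = "ptile", direction = "long")

# 'band' is a hypothetical helper column giving readable legend labels.
testLong$band <- factor(testLong$time, levels = 1:3,
                        labels = c("0-50", "50-90", "90-100"))
testLong$Year <- factor(testLong$Year)

# Order panels by seqCode so cities (1-3) and villages (4-6) are contiguous.
testLong$Population <- reorder(factor(testLong$Population), testLong$seqCode)

# One panel per Population, stacked in a single column. The 'between'
# vector leaves extra space only after the third panel, i.e. between
# the City block and the Village block.
p <- dotplot(Year ~ ptile | Population, data = testLong,
             groups = band,
             layout = c(1, 6), as.table = TRUE,
             between = list(y = c(0, 0, 1, 0, 0)),
             strip = FALSE, strip.left = TRUE,
             par.settings = simpleTheme(pch = 16),
             auto.key = list(columns = 3),
             xlab = "Income share")
print(p)
```

With 18 populations the `layout` and `between` vectors would need to be adjusted to the actual group sizes; popGroup itself is not labelled here, only separated by the gap.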