Anupam Tyagi
2024-Sep-04 13:16 UTC
[R] dotchart and dotplot(lattice) plot with two/three conditioning variables
Hello, I am trying to make a Cleveland dot plot with two, if possible
three, variables on the vertical axis. I was able to do it in Stata
with two variables, Year and Population (see graph at the link:
https://drive.google.com/file/d/1SiIfmmqk6IFa_OI5i26Ux1ZxkN2oek-o/view?usp=sharing
). I hope the link to the graph works. I have never tried this before.
I want to make a similar (possibly better) graph in R. I tried several
ways to make it in R with dotchart() and dotplot(lattice). I have been
only partially successful thus far. I would like Year, Population and
popGroup on the vertical axis. If popGroup occupies too much space,
then I would like a gap between the groups of Cities and Villages, so
they can be seen as distinct "Populations". My code and made-up data
are below (the actual data have 18 categories in "Population",
instead of only six in the made-up data). How can I make this type of
graph?
# Only for 2004-05. How to plot 2011-12 on the same plot?
dotchart(test$X0_50[test$Year == "2004-05"],
         labels = test$Population[test$Year == "2004-05"],
         xlab = "Income Share",
         main = "Income shares of percentiles of population",
         xlim = c(12, 50))
points(test$X50_90[test$Year == "2004-05"], 1:6, pch = 2)
points(test$X90_100[test$Year == "2004-05"], 1:6, pch = 16)
legend(x = "topleft",
       legend = c("0-50%", "50-90%", "90-100%"),
       pch = c(1, 2, 16))
# Reorder so Year 2004-05 is plotted before Year 2011-12. This is not
# plotting correctly for the second and third variables. The gap between
# different Cities and Villages is quite large.
test2 <- test[order(test$seqCode, test$Year, decreasing = TRUE), ]
dotchart(test2$X0_50, labels = test2$Year, xlab = "Income Share",
         main = "Income shares of percentiles of population",
         groups = as.factor(test2$Population), xlim = c(12, 50))
points(test2$X50_90, 1:12, pch = 2)
points(test2$X90_100, 1:12, pch = 16)
# use lattice library
library(lattice)
dotplot(reorder(Population, -seqCode) ~ X0_50 + X50_90 + X90_100,
        data = test, auto.key = TRUE)
testLong <- reshape(test, idvar = c("Population", "Year"),
                    varying = list(5:7), v.names = "ptile",
                    direction = "long")
dotplot(reorder(Population, -seqCode) ~ ptile | Year, data = testLong,
        groups = time, auto.key = TRUE)
The data frame is below, from dput(). It is named "test" in my code.
structure(list(seqCode = c(1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L,
4L, 5L, 6L), popGroup = c("City", "City", "City", "Village",
"Village", "Village", "City", "City", "City", "Village", "Village",
"Village"), Population = c("Dallas", "Boston", "Chicago", "Kip",
"Von", "Dan", "Dallas", "Boston", "Chicago", "Kip", "Von", "Dan"
), Year = c("2004-05", "2004-05", "2004-05", "2004-05", "2004-05",
"2004-05", "2011-12", "2011-12", "2011-12", "2011-12", "2011-12",
"2011-12"), X0_50 = c(15.47, 21.29, 18.04, 15.62, 18.89, 24.37,
17.43, 17.99, 18.04, 14.95, 16.33, 28.98), X50_90 = c(44.12,
43.25, 45.72, 46.15, 43.84, 46.24, 44.39, 44.08, 43.62, 42.89,
44.57, 47.14), X90_100 = c(40.42, 35.47, 36.24, 38.24, 37.27,
29.39, 38.18, 37.93, 38.34, 42.16, 39.11, 23.88)), class = "data.frame",
row.names = c(NA, -12L))
--
Anupam.
Deepayan Sarkar
2024-Sep-04 14:31 UTC
[R] dotchart and dotplot(lattice) plot with two/three conditioning variables
For lattice::dotplot(), you are close; this is more like the layout you
want:
dotplot(Year ~ ptile | reorder(Population, ptile, mean), testLong,
        groups = c("0-50", "50-90", "90-100")[time],
        layout = c(1, NA),
        par.settings = simpleTheme(pch = 16), auto.key = TRUE)
dotchart() works better with tables, but unfortunately it doesn't seem to
handle more than two dimensions, so you can only get one group at a time:
xtabs(ptile ~ Year + Population, testLong, subset = time == 1) |>
dotchart(pch = 16)
This seems like something that should not be too difficult to improve.
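Until then, one possible workaround is to build the full three-way table and
draw one dotchart() per percentile group, side by side. The following is only
a sketch under assumptions: the data frame is rebuilt compactly from the
made-up data in the original post, and the names tab and shareLabels, the
panel titles, and the side-by-side layout are illustrative choices, not
something from the thread.

```r
# Made-up data from the original post, rebuilt compactly.
test <- data.frame(
  seqCode    = rep(1:6, 2),
  popGroup   = rep(rep(c("City", "Village"), each = 3), 2),
  Population = rep(c("Dallas", "Boston", "Chicago", "Kip", "Von", "Dan"), 2),
  Year       = rep(c("2004-05", "2011-12"), each = 6),
  X0_50   = c(15.47, 21.29, 18.04, 15.62, 18.89, 24.37,
              17.43, 17.99, 18.04, 14.95, 16.33, 28.98),
  X50_90  = c(44.12, 43.25, 45.72, 46.15, 43.84, 46.24,
              44.39, 44.08, 43.62, 42.89, 44.57, 47.14),
  X90_100 = c(40.42, 35.47, 36.24, 38.24, 37.27, 29.39,
              38.18, 37.93, 38.34, 42.16, 39.11, 23.88)
)

# Same reshape as in the original post: one row per Population/Year/share.
testLong <- reshape(test, idvar = c("Population", "Year"),
                    varying = list(5:7), v.names = "ptile",
                    direction = "long")

# Year x Population x time table; each cell holds a single ptile value.
tab <- xtabs(ptile ~ Year + Population + time, data = testLong)

# Hypothetical panel titles for the three percentile groups (time = 1, 2, 3).
shareLabels <- c("0-50%", "50-90%", "90-100%")

# One dotchart per percentile group, drawn side by side.
op <- par(mfrow = c(1, 3))
for (k in seq_along(shareLabels)) {
  dotchart(tab[, , k], pch = 16, xlab = "Income share",
           main = shareLabels[k])
}
par(op)
```

Each panel groups the two Years under every Population, as in the
single-group xtabs() call above. Unlike the lattice version, the three
percentile groups end up in separate panels with separate x-axis ranges
rather than sharing one axis.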
Best,
-Deepayan