Jim Lemon
2022-Oct-03 07:45 UTC
[R] Creating a year-month indicator and groupby with category
Hi Tariq,
There were a couple of glitches in your data structure. Here's an
example of a simple plot:

dat<-structure(list(year = c(2018, 2019, 2019, 2019, 2019, 2019, 2019,
2019, 2017, 2017, 2017, 2017, 2017, 2017, 2017, 2017, 2017, 2017,
2017, 2017, 2018, 2018, 2018, 2018, 2018, 2018, 2018, 2018, 2018,
2018, 2018, 2018, 2019, 2019, 2019, 2019, 2019), month = c(12,
1, 2, 3, 4, 5, 6, 7, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1,
2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5), company = c("ABC",
"ABC", "ABC", "ABC", "ABC", "ABC", "ABC", "ABC", "FGH", "FGH",
"FGH", "FGH", "FGH", "FGH", "FGH", "FGH", "FGH", "FGH", "FGH",
"FGH", "FGH", "FGH", "FGH", "FGH", "FGH", "FGH", "FGH", "FGH",
"FGH", "FGH", "FGH", "FGH", "FGH", "FGH", "FGH", "FGH", "FGH"
), share = c(20, 16.5, 15, 15.5, 15.5, 16, 17, 16.5, 61, 55,
53, 53, 54, 53, 58, 54, 50, 47, 55, 50, 52, 51, 51.5, 52, 53,
54, 55, 53, 54, 50, 42, 48, 41, 40, 39, 36.5, 35), com_name = c(1,
1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2)), row.names = c(NA,
-37L), spec = structure(list(cols = list(year = structure(list(), class =
c("collector_double",
"collector")), month = structure(list(), class = c("collector_double",
"collector")), company = structure(list(), class = c("collector_character",
"collector")), share = structure(list(), class = c("collector_double",
"collector")), com_name = structure(list(), class = c("collector_double",
"collector"))), default = structure(list(), class = c("collector_guess",
"collector")), delim = ","), class = "col_spec"), class = c("spec_tbl_df",
"tbl_df", "tbl", "data.frame"))
# convert year and month fields to dates about the middle of each month
dat$date<-as.Date(paste(dat$year,dat$month,15,sep="-"),"%Y-%m-%d")
# plot the values for one company
plot(dat$date[dat$company=="ABC"],dat$share[dat$company=="ABC"],
 main="Plot of dat",xlab="Year",ylab="Share",
 xlim=range(dat$date),ylim=range(dat$share),
 type="l",col="red")
# add a line for the other one
lines(dat$date[dat$company=="FGH"],dat$share[dat$company=="FGH"],col="green")
# get the x plot limits as they are date values
xspan<-par("usr")[1:2]
# place a legend about in the middle of the plot
legend(xspan[1]+diff(xspan)*0.3,35,c("ABC","FGH"),lty=1,col=c("red","green"))

There are many more elegant ways to plot something like this.

Jim

On Mon, Oct 3, 2022 at 10:05 AM Tariq Khasiri <tariqkhasiri at gmail.com> wrote:
>
> Hello,
>
> I have the following data. I want to show in a line plot how each different
> company is earning over the timeline of my data sample.
>
> I'm not sure how I can create the *year-month indicator* to plot it nicely
> in my horizontal axis out of my dataset.
>
> After creating the *year-month* indicator (which will be in my x axis) I
> want to create a dataframe where I can groupby companies over the
> year-month indicator by putting *share* in the y axis as variables.
>
> ### data is like the following
>
> dput(dat)
> structure(list(year = c(2018, 2019, 2019, 2019, 2019, 2019, 2019,
> 2019, 2017, 2017, 2017, 2017, 2017, 2017, 2017, 2017, 2017, 2017,
> 2017, 2017, 2018, 2018, 2018, 2018, 2018, 2018, 2018, 2018, 2018,
> 2018, 2018, 2018, 2019, 2019, 2019, 2019, 2019), month = c(12,
> 1, 2, 3, 4, 5, 6, 7, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1,
> 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5), company = c("ABC",
> "ABC", "ABC", "ABC", "ABC", "ABC", "ABC", "ABC", "FGH", "FGH",
> "FGH", "FGH", "FGH", "FGH", "FGH", "FGH", "FGH", "FGH", "FGH",
> "FGH", "FGH", "FGH", "FGH", "FGH", "FGH", "FGH", "FGH", "FGH",
> "FGH", "FGH", "FGH", "FGH", "FGH", "FGH", "FGH", "FGH", "FGH"
> ), share = c(20, 16.5, 15, 15.5, 15.5, 16, 17, 16.5, 61, 55,
> 53, 53, 54, 53, 58, 54, 50, 47, 55, 50, 52, 51, 51.5, 52, 53,
> 54, 55, 53, 54, 50, 42, 48, 41, 40, 39, 36.5, 35), com_name = c(1,
> 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2)), row.names = c(NA,
> -37L), spec = structure(list(cols = list(year = structure(list(), class =
> c("collector_double",
> "collector")), month = structure(list(), class = c("collector_double",
> "collector")), company = structure(list(), class = c("collector_character",
> "collector")), share = structure(list(), class = c("collector_double",
> "collector")), com_name = structure(list(), class = c("collector_double",
> "collector"))), default = structure(list(), class = c("collector_guess",
> "collector")), delim = ","), class = "col_spec"), problems = <pointer:
> 0x7fd732028680>, class = c("spec_tbl_df",
> "tbl_df", "tbl", "data.frame"))
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
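As one possible generalisation of the base R plot in this message, here is a minimal sketch that builds a year-month indicator, groups share by company and year-month, and draws one line per company in a loop. It assumes a data frame with the same year, month, company and share columns as dat. The small data frame toy, its values, the column name ym and the rainbow() colours are illustrative assumptions and are not taken from the thread.

```r
# Toy data with the same columns as 'dat' (values are made up for illustration)
toy <- data.frame(
  year    = c(2018, 2019, 2019, 2018, 2019, 2019),
  month   = c(12, 1, 2, 12, 1, 2),
  company = c("ABC", "ABC", "ABC", "FGH", "FGH", "FGH"),
  share   = c(20, 16.5, 15, 48, 41, 40)
)

# A real Date (first day of the month) keeps the x axis in time order
toy$date <- as.Date(sprintf("%d-%02d-01", toy$year, toy$month))
# A "YYYY-MM" string that serves as the year-month indicator
toy$ym <- format(toy$date, "%Y-%m")

# Group by company and year-month; with one row per group mean() changes nothing
by_ym <- aggregate(share ~ company + ym, data = toy, FUN = mean)

# One colour and one line per company, however many companies there are
companies <- sort(unique(toy$company))
cols <- setNames(rainbow(length(companies)), companies)

# Empty plot spanning all the data, with a custom year-month axis
plot(range(toy$date), range(toy$share), type = "n",
     xlab = "Year-month", ylab = "Share", xaxt = "n")
axis.Date(1, at = sort(unique(toy$date)), format = "%Y-%m")
for (co in companies) {
  rows <- toy$company == co
  ord <- order(toy$date[rows])
  lines(toy$date[rows][ord], toy$share[rows][ord], col = cols[[co]])
}
legend("topright", legend = companies, lty = 1, col = cols)
```

Because the loop runs over unique(toy$company), the same code should work unchanged for two companies or four.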
Tariq Khasiri
2022-Oct-03 22:30 UTC
[R] Creating a year-month indicator and groupby with category
Thanks everyone for being so kind and patient with me throughout the process!

Mr. Barradas and Mr. Lemon, it was very generous of you to take the time to go over my code and data and to give me meaningful feedback!

With your help and suggestions, I was successful in making a graph from my data. In my main data I have four companies, and I am just making the graph a little more advanced. However, after running the command I get an error saying that 4 values are needed but I gave only 2. Would anyone kindly tell me what the mistake is and how I can rectify it? The data is the same, but my main data has 4 companies, whereas in the R-help post I gave only 2 for convenience.

#### Code needed to execute it #########

library(tidyverse)
library(showtext)
library(usefunc)
library(patchwork)
library(cowplot)
library(rcartocolor)
library(zoo)

# load fonts
font_add_google(name = "Bungee Shade", family = "bungee")
font_add_google(name = "Dosis", family = "dosis")
showtext_auto()

# set colours
f_cols = c("#008080", "#329999", "#66b2b2", "#7fbfbf", "#99cccc", "#cce5e5")
m_cols = c("#4b0082", "#6e329b", "#9366b4", "#a57fc0", "#b799cd", "#dbcce6")

dat$YearMonth <- as.yearmon(paste(dat$year, " ", dat$month), "%Y %m")

# plot of share of companies per year
p1 <- ggplot(data = dat, mapping = aes(x = YearMonth, y = share, colour = company)) +
  geom_line() +
  geom_point(size = 1) +
  scale_colour_manual("", values = c(f_cols[1], m_cols[1]),
                      labels = c("F", "M" , "C" , "G")) +
  scale_y_continuous(limits = c(0, 80)) +
  coord_cartesian(expand = F) +
  labs(x = "Year", y = "Share of Companies") +
  theme(legend.position = c(0.1, 0.9),
        legend.title = element_blank(),
        legend.text = element_text(family = "dosis", size = 14),
        panel.background = element_rect(fill = "#FAFAFA", colour = "#FAFAFA"),
        plot.background = element_rect(fill = "#FAFAFA", colour = "#FAFAFA"),
        legend.background = element_rect(fill = "transparent", colour = "transparent"),
        legend.key = element_rect(fill = "transparent", colour = "transparent"),
        axis.title.y = element_text(margin = margin(0, 20, 0, 0), family = "dosis"),
        axis.text = element_text(family = "dosis"),
        plot.margin = unit(c(0.5, 0.8, 0.5, 0.5), "cm"),
        panel.grid.major = element_line(colour = "#DEDEDE"),
        panel.grid.minor = element_blank())
p1

The error is saying:

    ggplot2 (local) FUN(X[[i]], ...)
 7. base::unlist(...)
 8. base::lapply(scales$scales, function(scale) scale$map_df(df = df))
 9. ggplot2 (local) FUN(X[[i]], ...)
10. scale$map_df(df = df)
11. ggplot2 (local) f(..., self = self)
12. base::lapply(aesthetics, function(j) self$map(df[[j]]))
13. ggplot2 (local) FUN(X[[i]], ...)
14. self$map(df[[j]])
15. ggplot2 (local) f(..., self = self)
16. self$palette(n)
17. ggplot2 (local) f(...)
18. rlang::abort(glue("Insufficient values in manual scale. {n} needed but only {length(values)} provided."))

On Mon, 3 Oct 2022 at 02:45, Jim Lemon <drjimlemon at gmail.com> wrote:
> Hi Tariq,
> There were a couple of glitches in your data structure.
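The last entry of the traceback, "Insufficient values in manual scale", points at scale_colour_manual(): in the code above, values holds two colours (f_cols[1] and m_cols[1]) while labels lists four companies. Below is a minimal sketch of one way to make the two agree, assuming ggplot2 is installed. The data frame toy, its values, the use of F, M, C and G as the company names in the data, and the choice of the third and fourth colours (taken from the f_cols and m_cols palettes) are assumptions for illustration, not details confirmed in the thread.

```r
library(ggplot2)

# Toy data: four companies, two months each (made-up values)
toy <- data.frame(
  date    = rep(as.Date(c("2019-01-15", "2019-02-15")), times = 4),
  company = rep(c("F", "M", "C", "G"), each = 2),
  share   = c(20, 16.5, 41, 40, 30, 28, 10, 12)
)

# One colour per company; naming the vector ties each colour to a company
company_cols <- c(F = "#008080", M = "#4b0082", C = "#66b2b2", G = "#9366b4")

# Fail early if the data contain a company that has no colour assigned
stopifnot(all(unique(toy$company) %in% names(company_cols)))

p <- ggplot(toy, aes(x = date, y = share, colour = company)) +
  geom_line() +
  geom_point(size = 1) +
  scale_colour_manual("", values = company_cols)

# Building the plot trains and applies the scales, the step that failed above
built <- ggplot_build(p)
```

If a labels argument is kept as well, it would need one entry per company too. With a named values vector it is usually simpler to leave labels out and let the company names serve as legend labels.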