I am not averse to a factor-based solution, but I would still have to manually enter that factor each month, correct? If possible, I'd just like to point R at that column and have it do the work.

Nathan Parsons, B.SC, M.Sc, G.C.
Ph.D. Candidate, Dept. of Sociology, Portland State University
Adjunct Professor, Dept. of Sociology, Washington State University
Graduate Advocate, American Association of University Professors (OR)

Recent work (https://www.researchgate.net/profile/Nathan_Parsons3/publications)
Schedule an appointment (https://calendly.com/nate-parsons)

> On Wednesday, Jul 21, 2021 at 8:30 PM, Tom Woolman <twoolman at ontargettek.com> wrote:
>
> Couldn't you convert the date columns to character type data in a data
> frame, and then convert those strings to factors in a 2nd step?
>
> The only downside I think to treating dates as factor levels is that
> you might have an awful lot of factors if you have a large enough
> dataset.
>
> Quoting "N. F. Parsons" <nathan.f.parsons at gmail.com>:
>
> > Hi all,
> >
> > If I have a tibble as follows:
> >
> > tibble(dates = c(rep("2021-07-04", 2), rep("2021-07-25", 3),
> >                  rep("2021-07-18", 4)))
> >
> > how in the world do I add a column that evaluates each of those dates and
> > assigns it a categorical value such that
> >
> > dates       cycle
> > <chr>       <chr>
> > 2021-07-04  1
> > 2021-07-04  1
> > 2021-07-25  3
> > 2021-07-25  3
> > 2021-07-25  3
> > 2021-07-18  2
> > 2021-07-18  2
> > 2021-07-18  2
> > 2021-07-18  2
> >
> > Not to further complicate matters, but some months I may only have one
> > date, and some months I will have 4 dates - so that's not a fixed quantity.
> > We've literally been doing this by hand at my job and I'd like to automate
> > it.
> >
> > Thanks in advance!
> >
> > Nate Parsons
> >
> > ______________________________________________
> > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
Not if you use as.factor to convert a character type column to factor levels. It should recode the distinct string values to factors automatically for you.

i.e.,

df$datefactors <- as.factor(df$datestrings)

Quoting "N. F. Parsons" <nathan.f.parsons at gmail.com>:

> I am not averse to a factor-based solution, but I would still have
> to manually enter that factor each month, correct? If possible, I'd
> just like to point R at that column and have it do the work.
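A minimal sketch of the as.factor suggestion applied to the dates from the original question. The data frame name df and the column names datestrings, datefactors and cycle are illustrative only. Factor levels are the sorted distinct strings, so taking the integer codes of the factor gives the requested numbering without entering anything by hand. This relies on the dates being ISO-formatted (YYYY-MM-DD) strings, for which alphabetical order and chronological order coincide.

```r
# Dates from the original question, stored as character strings.
df <- data.frame(
  datestrings = c(rep("2021-07-04", 2), rep("2021-07-25", 3), rep("2021-07-18", 4)),
  stringsAsFactors = FALSE
)

# Levels are the sorted distinct strings.
df$datefactors <- as.factor(df$datestrings)

# The integer code of each level is its position among the sorted dates.
df$cycle <- as.integer(df$datefactors)

print(levels(df$datefactors))
print(df)
```

Running the same lines on next month's data needs no manual input, because the levels are derived from whatever strings are in the column.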
You have been told how to do it. If you do not understand, you should find a suitable tutorial to learn about how R factors work. There are some difficulties in converting dates on an ongoing basis to factors, so I think you should take Tom's advice to rethink this.

It sounds as if you might also do well to find someone with more R experience to consult with... "How in the world do I..." does not inspire confidence that you know what you are doing.

Cheers,
Bert

On Wed, Jul 21, 2021, 8:47 PM N. F. Parsons <nathan.f.parsons at gmail.com> wrote:

> I am not averse to a factor-based solution, but I would still have to
> manually enter that factor each month, correct? If possible, I'd just like
> to point R at that column and have it do the work.
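The message above does not say which difficulties are meant. One plausible candidate, offered here as an assumption, is that factor codes depend on the full set of values present, so they can shift when data accumulates. The sketch below uses made-up dates to show the codes of existing rows changing once an earlier date is added, and one way to pin the codes by supplying the levels explicitly.

```r
# Hypothetical data: three dates coded by their sorted position.
july <- c("2021-07-04", "2021-07-18", "2021-07-25")
codes_before <- as.integer(factor(july))

# A later batch contains an earlier date; levels are re-sorted,
# so every previously assigned code moves up by one.
combined <- c(july, "2021-06-27")
codes_after <- as.integer(factor(combined))

# Supplying the levels explicitly keeps the earlier codes stable.
codes_pinned <- as.integer(factor(combined, levels = c(july, "2021-06-27")))

print(codes_before)
print(codes_after)
print(codes_pinned)
```

Whether this matters depends on whether the cycle numbers are recomputed from scratch each month or need to stay comparable over time.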
I wonder if you mean that you want the levels of the factor to reset within each month? That is not obvious from your example, but implied by your question.

Andrew

--
Andrew Robinson
Director, CEBRA and Professor of Biosecurity,
School/s of BioSciences and Mathematics & Statistics
University of Melbourne, VIC 3010 Australia
Tel: (+61) 0403 138 955
Email: apro at unimelb.edu.au
Website: https://researchers.ms.unimelb.edu.au/~apro at unimelb/

I acknowledge the Traditional Owners of the land I inhabit, and pay my respects to their Elders.

On 22 Jul 2021, 1:47 PM +1000, N. F. Parsons <nathan.f.parsons at gmail.com>, wrote:

> I am not averse to a factor-based solution, but I would still have to
> manually enter that factor each month, correct? If possible, I'd just like
> to point R at that column and have it do the work.
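If the numbering is indeed meant to restart within each month (an assumption, since the original example covers a single month only), a base R sketch could look like the following. The function name cycle_within_month and the sample dates are hypothetical. The function groups the dates by year and month, then ranks the distinct dates inside each group.

```r
# Hypothetical helper: number the distinct dates within each calendar month.
cycle_within_month <- function(dates) {
  d <- as.Date(dates)                 # parses ISO "YYYY-MM-DD" strings
  month <- format(d, "%Y-%m")         # grouping key, e.g. "2021-07"
  day_number <- as.integer(d)         # days since 1970-01-01, sorts chronologically
  # Within each month, a date's cycle is its position among the sorted distinct dates.
  ave(day_number, month, FUN = function(x) match(x, sort(unique(x))))
}

# Made-up dates spanning two months.
dates <- c("2021-07-04", "2021-07-18", "2021-07-18", "2021-08-01", "2021-08-15")
cycle <- cycle_within_month(dates)
print(data.frame(dates = dates, cycle = cycle))
```

Months with one date and months with four are handled the same way, because the number of cycles is taken from the data in each group.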
Hello,

Here are 3 solutions, one of them the coercion to factor one.
Since you are using tibbles, I assume you also want a dplyr solution.

library(dplyr)

df1 <- tibble(dates = c(rep("2021-07-04", 2),
                        rep("2021-07-25", 3),
                        rep("2021-07-18", 4)))

# base R
as.integer(factor(df1$dates))
match(df1$dates, unique(sort(df1$dates)))

# dplyr
df1 %>% group_by(dates) %>% mutate(cycle = cur_group_id())

My favorite is by far the 1st but that's a matter of opinion.

Hope this helps,

Rui Barradas

At 04:46 on 22/07/21, N. F. Parsons wrote:

> I am not averse to a factor-based solution, but I would still have to
> manually enter that factor each month, correct? If possible, I'd just like
> to point R at that column and have it do the work.
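The two base R expressions suggested in this thread, as.integer(factor(x)) and match(x, unique(sort(x))), can be checked against each other on the data from the original question. The sketch below also converts the result to character, since the requested cycle column was shown as <chr>. The variable names are illustrative.

```r
dates <- c(rep("2021-07-04", 2), rep("2021-07-25", 3), rep("2021-07-18", 4))

# Coercion to factor: codes follow the sorted distinct values.
by_factor <- as.integer(factor(dates))

# Explicit lookup of each date in the sorted distinct values.
by_match <- match(dates, unique(sort(dates)))

# Both approaches should produce the same integer vector.
print(identical(by_factor, by_match))

# The original question showed cycle as a character column.
result <- data.frame(dates = dates,
                     cycle = as.character(by_factor),
                     stringsAsFactors = FALSE)
print(result)
```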
library(dplyr)

date_df <- tibble(dates = c(rep("2021-07-04", 2),
                            rep("2021-07-25", 3),
                            rep("2021-07-18", 4)))

# Look up each date's position among the sorted distinct dates.
# The native pipe |> requires R 4.1 or later.
cycle_from_date <- function(date, dates) {
  ranks <- dates |> unique() |> sort()
  match(date, ranks)
}

date_df |> mutate(cycle_new = cycle_from_date(dates, dates))

> On 22.07.2021, at 05:46, N. F. Parsons <nathan.f.parsons at gmail.com> wrote:
>
>>> tibble(dates = c(rep("2021-07-04", 2), rep("2021-07-25", 3),
>>> rep("2021-07-18", 4)))