Data <- read.csv("./input/Source.csv", header=T)
v1 <- sort(unique(Data[, 1]))
cat(format(v1, justify = "right"), sep = "\n")

OK, working with the options you presented. This is the combination
where I gain the most benefit.

However, the output of this syntax does not include a column header.

> cat(format(v1, justify = "right"), sep = "\n")
 2
 3
 4
 5
 6
 7
 8
 9
10
>

NOTE
The output here is correct (unique) based on the entries from the column.

QUESTION
How does one add a text label, even something as simple as v1, to the
vertical output of this syntax, please?

*Stephen Dawson, DSL*
/Executive Strategy Consultant/
Business & Technology
+1 (865) 804-3454
http://www.shdawson.com <http://www.shdawson.com>


On 12/22/21 11:13 AM, Stephen H. Dawson, DSL via R-help wrote:
> OK, now I get what you are suggesting.
>
> Much appreciated.
>
> Kindest Regards,
>
> On 12/22/21 11:08 AM, Duncan Murdoch wrote:
>> On 22/12/2021 10:55 a.m., Stephen H. Dawson, DSL wrote:
>>> I see.
>>>
>>> So, we are talking about taking the output into a new dataframe. I was
>>> hoping to have the output rendered on screen without another dataframe,
>>> but I can live with this option if it must occur.
>>>
>>> Am I correct that the desired vertical output must first go to a
>>> dataframe?
>>
>> No, that's just one option. The other 3 don't use dataframes.
>>
>> Duncan Murdoch
>>
>>> On 12/22/21 10:47 AM, Duncan Murdoch wrote:
>>>> On 22/12/2021 10:20 a.m., Stephen H. Dawson, DSL wrote:
>>>>> Thanks for the reply.
>>>>>
>>>>> Both syntax options work to render the correct (unique) output.
>>>>> However, the output is rendered as horizontal. What needs to happen
>>>>> to get the output to render vertical, please?
>>>>
>>>> The result of those expressions is a vector of the same type as the
>>>> column, so your question is really about how to get a vector to print
>>>> one element per line.
>>>>
>>>> Probably the simplest way is to put the vector in a dataframe (or
>>>> matrix, or tibble, depending on which formatting you prefer). For
>>>> example,
>>>>
>>>>   > v <- c("red", "green", "blue")
>>>>   > data.frame(v)
>>>>         v
>>>>   1   red
>>>>   2 green
>>>>   3  blue
>>>>
>>>> If you want a more minimal display, try
>>>>
>>>>   > cat(v, sep = "\n")
>>>>   red
>>>>   green
>>>>   blue
>>>>
>>>> or
>>>>
>>>>   > cat(format(v, justify = "right"), sep = "\n")
>>>>     red
>>>>   green
>>>>    blue
>>>>
>>>> If you want this to happen when you auto-print the object, you can
>>>> give it a class attribute and write a function to print that class,
>>>> e.g.
>>>>
>>>>   > class(v) <- "oneperline"
>>>>   >
>>>>   > print.oneperline <- function(x, ...)
>>>>   +     cat(format(x, justify = "right"), sep = "\n")
>>>>   >
>>>>   > v
>>>>     red
>>>>   green
>>>>    blue
>>>>
>>>> Duncan Murdoch
>>>>
>>>>> On 12/21/21 11:38 AM, Duncan Murdoch wrote:
>>>>>> On 21/12/2021 11:31 a.m., Duncan Murdoch wrote:
>>>>>>> On 21/12/2021 11:20 a.m., Stephen H. Dawson, DSL wrote:
>>>>>>>> Thanks for the reply.
>>>>>>>>
>>>>>>>> sort(unique(Data[1]))
>>>>>>>> Error in `[.data.frame`(x, order(x, na.last = na.last,
>>>>>>>>   decreasing = decreasing)) :
>>>>>>>>   undefined columns selected
>>>>>>>
>>>>>>> That's the wrong syntax: Data[1] is not "column one of Data". Use
>>>>>>> Data[[1]] for that, so
>>>>>>>
>>>>>>>     sort(unique(Data[[1]]))
>>>>>>
>>>>>> Actually, I'd probably recommend
>>>>>>
>>>>>>     sort(unique(Data[, 1]))
>>>>>>
>>>>>> instead. This treats Data as a matrix rather than as a list.
>>>>>> Dataframes are lists that look like matrices, but to me the matrix
>>>>>> aspect is usually more intuitive.
>>>>>>
>>>>>> Duncan Murdoch
>>>>>>
>>>>>>> I think Rui already pointed out the typo in the quoted text
>>>>>>> below...
>>>>>>>
>>>>>>> Duncan Murdoch
>>>>>>>
>>>>>>>> The recommended syntax did not work, as listed above.
>>>>>>>>
>>>>>>>> What I want is the sorted list of distinct column entries. Again,
>>>>>>>> the column may be text or numbers. This is a huge analysis effort
>>>>>>>> with data coming at me from many different sources.
>>>>>>>>
>>>>>>>> On 12/21/21 11:07 AM, Duncan Murdoch wrote:
>>>>>>>>> On 21/12/2021 10:16 a.m., Stephen H. Dawson, DSL via R-help
>>>>>>>>> wrote:
>>>>>>>>>> Thanks everyone for the replies.
>>>>>>>>>>
>>>>>>>>>> It is clear one either needs to write a function or put the
>>>>>>>>>> unique entries into another dataframe.
>>>>>>>>>>
>>>>>>>>>> It seems odd R cannot sort a list of unique column entries with
>>>>>>>>>> ease. Python and SQL can do it with ease.
>>>>>>>>>
>>>>>>>>> I've seen several responses that looked pretty simple. It's hard
>>>>>>>>> to beat sort(unique(x)), though there's a fair bit of confusion
>>>>>>>>> about what you actually want. Maybe you should post an example of
>>>>>>>>> the code you'd use in Python?
>>>>>>>>>
>>>>>>>>> Duncan Murdoch
>>>>>>>>>
>>>>>>>>>> QUESTION
>>>>>>>>>> Is there a simpler means, other than the unique function, to
>>>>>>>>>> capture distinct column entries, then sort that list?
>>>>>>>>>>
>>>>>>>>>> On 12/20/21 5:53 PM, Rui Barradas wrote:
>>>>>>>>>>> Hello,
>>>>>>>>>>>
>>>>>>>>>>> Inline.
>>>>>>>>>>>
>>>>>>>>>>> At 21:18 on 20/12/21, Stephen H. Dawson, DSL via R-help wrote:
>>>>>>>>>>>> Thanks.
>>>>>>>>>>>>
>>>>>>>>>>>> sort(unique(Data[[1]]))
>>>>>>>>>>>>
>>>>>>>>>>>> This syntax provides row numbers, not column values.
>>>>>>>>>>>
>>>>>>>>>>> This is not right.
>>>>>>>>>>> The syntax Data[1] extracts a sub-data.frame, the syntax
>>>>>>>>>>> Data[[1]] extracts the column vector.
>>>>>>>>>>>
>>>>>>>>>>> As for my previous answer, it did not address the question; I
>>>>>>>>>>> misinterpreted it as being a question on how to sort in numeric
>>>>>>>>>>> order when the data is not numeric. Here is a, hopefully,
>>>>>>>>>>> complete answer. Still with package stringr.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> cols_to_sort <- 1:4
>>>>>>>>>>>
>>>>>>>>>>> Data2 <- lapply(Data[cols_to_sort], \(x){
>>>>>>>>>>>   stringr::str_sort(unique(x), numeric = TRUE)
>>>>>>>>>>> })
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Or using Avi's suggestion of writing a function to do all the
>>>>>>>>>>> work and simplify the lapply loop later,
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> unisort2 <- function(vec, ...)
>>>>>>>>>>>   stringr::str_sort(unique(vec), ...)
>>>>>>>>>>> Data2 <- lapply(Data[cols_to_sort], unisort2, numeric = TRUE)
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Hope this helps,
>>>>>>>>>>>
>>>>>>>>>>> Rui Barradas
>>>>>>>>>>>
>>>>>>>>>>>> On 12/20/21 11:58 AM, Stephen H. Dawson, DSL via R-help wrote:
>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Running a simple syntax set to review entries in dataframe
>>>>>>>>>>>>> columns. Here is the working code.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Data <- read.csv("./input/Source.csv", header=T)
>>>>>>>>>>>>> describe(Data)
>>>>>>>>>>>>> summary(Data)
>>>>>>>>>>>>> unique(Data[1])
>>>>>>>>>>>>> unique(Data[2])
>>>>>>>>>>>>> unique(Data[3])
>>>>>>>>>>>>> unique(Data[4])
>>>>>>>>>>>>>
>>>>>>>>>>>>> I would like to sort the unique entries. The data in the
>>>>>>>>>>>>> various columns are not all defined as numbers; some are
>>>>>>>>>>>>> text. I realize 1 and 10 will not sort properly, as the
>>>>>>>>>>>>> column is not defined as a number, but I want to see what I
>>>>>>>>>>>>> have in the columns viewed as sorted.
>>>>>>>>>>>>>
>>>>>>>>>>>>> QUESTION
>>>>>>>>>>>>> What is the best process to sort unique output, please?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks.
>>>>>>>>>>>>
>>>>>>>>>>>> ______________________________________________
>>>>>>>>>>>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and
>>>>>>>>>>>> more, see https://stat.ethz.ch/mailman/listinfo/r-help
>>>>>>>>>>>> PLEASE do read the posting guide
>>>>>>>>>>>> http://www.R-project.org/posting-guide.html
>>>>>>>>>>>> and provide commented, minimal, self-contained, reproducible
>>>>>>>>>>>> code.
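The distinction drawn in the quoted replies between Data[1], Data[[1]], and Data[, 1] can be sketched as follows. This uses a small made-up data frame, not the poster's Source.csv; the column names and values are assumptions for illustration only.

```r
# Hypothetical stand-in for the poster's data; not the real Source.csv.
Data <- data.frame(V1 = c(10, 2, 3, 2, 10),
                   V2 = c("b", "a", "b", "c", "a"))

sub_df  <- Data[1]     # single bracket: a one-column data.frame
col_vec <- Data[[1]]   # double bracket: the column itself, as a vector
col_mat <- Data[, 1]   # matrix-style indexing: also the column vector

is.data.frame(sub_df)        # a data.frame, which sort() cannot handle
identical(col_vec, col_mat)  # both vector forms hold the same values

# sort(unique(...)) is applied to a vector form, as recommended in the
# thread, and the result is printed one element per line.
v1 <- sort(unique(Data[, 1]))
cat(format(v1, justify = "right"), sep = "\n")
```

Either vector form should work with sort(unique(...)); which one to use is mostly a matter of whether you prefer to think of the data frame as a list or as a matrix.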
On 22/12/2021 12:01 p.m., Stephen H. Dawson, DSL wrote:
> Data <- read.csv("./input/Source.csv", header=T)
> v1 <- sort(unique(Data[, 1]))
> cat(format(v1, justify = "right"), sep = "\n")
>
> OK, working with the options you presented. This is the combination
> where I gain the most benefit.
>
> However, there is no listing of a column header with the output of this
> syntax.
>
> QUESTION
> How does one add a text label of something as simple as v1 to the
> vertical output of this syntax, please?

In this case, you'd just put in

    cat("v1\n")

before the given command.

In the general case where you want to get the name of the column from
the dataframe, I think you'll need to write your own function. The one
Rui just posted looks pretty good. To get it to print without the row
numbers as in the example above, just change it a little in the header
and one other line:

    print.sortUnique <- function(x, row.names = FALSE, ...){
      n <- max(lengths(x))
      y <- lapply(x, \(.x) c(.x, rep("", n - length(.x))))
      y <- do.call(cbind.data.frame, y)
      names(y) <- names(x)
      print(y, row.names = row.names, ...)
      invisible(x)
    }

This will give the following with his example data (shorter columns are
padded with blanks at the bottom, so the last three rows are incomplete):

    > Data2
    V1 V2 V3 V4
    3 2 2 1
    5 4 3 2
    6 5 4 4
    7 6 5 5
    8 9 6 6
    9 11 8 9
    12 15 9 10
    14 16 11 11
    15 17 14 12
    18 18 15 13
    19 19 17 14
    20 19 16
    20 18
    19

Duncan Murdoch
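As a rough sketch of how the print method above might be used: Rui's original function is not shown in this part of the thread, so the constructor sort_unique below is a hypothetical stand-in (it uses base sort() rather than stringr), and the data frame is made up.

```r
# Duncan's print method, as given in the message above.
print.sortUnique <- function(x, row.names = FALSE, ...) {
  n <- max(lengths(x))
  # Pad shorter columns with "" so all columns have the same length.
  y <- lapply(x, \(.x) c(.x, rep("", n - length(.x))))
  y <- do.call(cbind.data.frame, y)
  names(y) <- names(x)
  print(y, row.names = row.names, ...)
  invisible(x)
}

# Hypothetical constructor (an assumption, not Rui's code): sort the
# unique entries of each column and tag the result with the class.
sort_unique <- function(df) {
  structure(lapply(df, \(x) sort(unique(x))), class = "sortUnique")
}

# Made-up example data.
Data <- data.frame(V1 = c(3, 1, 3, 2), V2 = c("b", "a", "a", "a"))
Data2 <- sort_unique(Data)
Data2  # auto-printing dispatches to print.sortUnique
```

Because the class carries the column names along with the sorted values, the header comes out with the data and no separate cat() call is needed.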
Stephen,

Why should there be a column header when you take your data and
reformat it?

cat(format(v1, justify = "right"), sep = "\n")

The above is no longer your original data structure; it specifies
exactly what you want printed. The column header and other names
associated with your original data.frame are stored as attributes,
which you more or less discarded.

The name you want is associated not with v1 but with Data. The extracted
vector Data[, 1] carries no name of its own, so take the name from the
data.frame with names(Data)[1] and put it where you want. In your case,
if you want a single line above your values to show that name, this
would do it:

cat(format(c(names(Data)[1], v1), justify = "right"), sep = "\n")

-----Original Message-----
From: R-help <r-help-bounces at r-project.org> On Behalf Of Stephen H. Dawson, DSL via R-help
Sent: Wednesday, December 22, 2021 12:02 PM
To: Duncan Murdoch <murdoch.duncan at gmail.com>; Rui Barradas <ruipbarradas at sapo.pt>; Stephen H. Dawson, DSL via R-help <r-help at r-project.org>
Subject: Re: [R] Adding SORT to UNIQUE

Data <- read.csv("./input/Source.csv", header=T)
v1 <- sort(unique(Data[, 1]))
cat(format(v1, justify = "right"), sep = "\n")

QUESTION
How does one add a text label of something as simple as v1 to the
vertical output of this syntax, please?
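A quick check, on a made-up data frame, of where the column name lives once a column has been extracted. The data and the column name V1 are assumptions for illustration.

```r
# Hypothetical data; the real Source.csv is not available here.
Data <- data.frame(V1 = c(10, 2, 3, 2, 10))

name_from_vector <- names(Data[, 1])  # the bare vector has no names
name_from_frame  <- names(Data)[1]    # the data frame holds the header

v1 <- sort(unique(Data[, 1]))

# Put the header on its own line above the values, right-justified
# together with them so everything shares one width.
cat(format(c(name_from_frame, v1), justify = "right"), sep = "\n")
```

Combining the name and the values with c() before calling format() coerces the numbers to character, which is fine for display but means the printed vector should not be reused for arithmetic.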