... and here is a non-dplyr solution:

> z <- gsub("[^[:digit:]]", " ", dd$Lower)
> sapply(strsplit(z, " +"), function(x) sum(as.numeric(x), na.rm = TRUE))
[1] 105  67  60 100  80

Cheers,
Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )

On Mon, Apr 18, 2016 at 10:07 AM, Richard M. Heiberger <rmh at temple.edu> wrote:
> ## Continuing with your data
>
> AA <- stringr::str_extract_all(dd[[2]], "[[:digit:]]+")
> BB <- lapply(AA, as.numeric)
> ## I think you are looking for one of the following two expressions
> sum(unlist(BB))
> sapply(BB, sum)
>
> On Mon, Apr 18, 2016 at 12:48 PM, Burhan ul haq <ulhaqz at gmail.com> wrote:
>> Hi,
>>
>> I request help with the following:
>>
>> INPUT: A data frame where column "Lower" is a character column containing
>> numeric values (a different number of numeric values in each row,
>> mostly 2)
>>
>> > dput(dd)
>> structure(list(State = c("Alabama", "Alaska", "Arizona", "Arkansas",
>> "California"), Lower = c("R 72?33", "R/Coalition 27(23 R, 4 D)?12 D, 1 Ind.",
>> "R 36?24", "R 64?35, 1 Ind.", "D 52?28"), Upper = c("R 26?8, 1 Ind.",
>> "R/Coalition 15(14 R, 1 D)?5 D", "R 18?12", "R 24?11", "D 26?14"
>> )), .Names = c("State", "Lower", "Upper"), row.names = c(NA,
>> 5L), class = "data.frame")
>>
>> PROBLEM: Need to extract all numeric values and sum them. There are a few
>> exceptions, like row 2, but these can be ignored and will be fixed manually.
>>
>> SOLUTION SO FAR:
>> str_extract_all(dd[[2]], "[[:digit:]]+") returns a list of numbers as
>> character. I am unable to unlist it, because it mixes them all together, ...
>>
>> And if I may add, is there a "dplyr" way of doing it ...
>>
>> Thanks
>>
>> [[alternative HTML version deleted]]
>>
>> ______________________________________________
>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
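
To see why na.rm = TRUE is needed in the gsub()/strsplit() approach, it can help to look at the intermediate values for a couple of entries. The following is only a minimal sketch and was not part of the original replies; the names `lower`, `spaced`, `pieces` and `row_sums` are made up for illustration, and `lower` holds just two of the posted "Lower" values.

```r
# Two of the "Lower" strings from the posted data (illustrative subset)
lower <- c("R 72?33", "R 64?35, 1 Ind.")

# Step 1: turn every non-digit character into a space
spaced <- gsub("[^[:digit:]]", " ", lower)

# Step 2: split on runs of spaces; leading spaces yield an empty first piece
pieces <- strsplit(spaced, " +")

# Step 3: as.numeric("") is NA, so na.rm = TRUE drops it before summing
row_sums <- sapply(pieces, function(x) sum(as.numeric(x), na.rm = TRUE))

print(pieces[[1]])
print(row_sums)
```

Without na.rm = TRUE, the NA produced by the empty first piece would make every row sum NA.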
... and a slightly more efficient non-dplyr 1-liner:

> sapply(strsplit(dd$Lower, "[^[:digit:]]"), function(x) sum(as.numeric(x), na.rm = TRUE))
[1] 105  67  60 100  80

Cheers,
Bert

On Mon, Apr 18, 2016 at 10:43 AM, Bert Gunter <bgunter.4567 at gmail.com> wrote:
> ... and here is a non-dplyr solution:
>
> > z <- gsub("[^[:digit:]]", " ", dd$Lower)
> > sapply(strsplit(z, " +"), function(x) sum(as.numeric(x), na.rm = TRUE))
> [1] 105  67  60 100  80
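
For comparison, the digit runs can also be pulled out directly instead of splitting around them, which avoids the empty strings that strsplit() produces here. This is only a sketch, assuming base R's regmatches() and gregexpr() are acceptable; it was not posted in the thread, and the names `lower`, `digits` and `row_sums` are illustrative.

```r
# Two of the "Lower" strings from the posted data (illustrative subset)
lower <- c("R 72?33", "R/Coalition 27(23 R, 4 D)?12 D, 1 Ind.")

# Extract every run of digits per element; the result is a list of
# character vectors, one per input string
digits <- regmatches(lower, gregexpr("[[:digit:]]+", lower))

# Sum within each element; vapply() guarantees one numeric value per row
row_sums <- vapply(digits, function(m) sum(as.numeric(m)), numeric(1))

print(row_sums)
```

Because only digit runs are extracted, no NA values arise and na.rm is not needed; a string without any digits would simply sum to 0.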
Dear Gunter / Heiberger,

Thanks for the help. This is what I was looking for:

> ... and here is a non-dplyr solution:
>
> > z <- gsub("[^[:digit:]]", " ", dd$Lower)
> > sapply(strsplit(z, " +"), function(x) sum(as.numeric(x), na.rm = TRUE))
> [1] 105  67  60 100  80

And that explains why one could not use "unlist": a grand total was not
desired, but rather a sum for each of the rows.

Br /

On Mon, Apr 18, 2016 at 10:57 PM, Bert Gunter <bgunter.4567 at gmail.com> wrote:
> ... and a slightly more efficient non-dplyr 1-liner:
>
> > sapply(strsplit(dd$Lower, "[^[:digit:]]"),
> +        function(x) sum(as.numeric(x), na.rm = TRUE))
> [1] 105  67  60 100  80
>
> Cheers,
> Bert
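
The difference between a grand total and per-row sums, and the question about a "dplyr" way that the replies left open, can be sketched as follows. This assumes the dplyr package is installed; the helper `sum_digits()` and the column name `LowerSum` are hypothetical names introduced here, not something proposed in the thread, and the data frame is a three-row subset of the posted data.

```r
library(dplyr)

# Three rows of the posted data (illustrative subset)
dd <- data.frame(
  State = c("Alabama", "Arizona", "California"),
  Lower = c("R 72?33", "R 36?24", "D 52?28"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: per-element sum of all digit runs in a character vector
sum_digits <- function(x) {
  parts <- strsplit(x, "[^[:digit:]]")
  vapply(parts, function(p) sum(as.numeric(p), na.rm = TRUE), numeric(1))
}

# Grand total over all rows, which is what sum(unlist(...)) amounts to
grand_total <- sum(sum_digits(dd$Lower))

# Per-row sums added as a new column inside a dplyr pipeline
result <- dd %>% mutate(LowerSum = sum_digits(Lower))

print(grand_total)
print(result)
```

Since mutate() only needs a vectorised function that returns one value per row, either of the base R expressions from the replies could be wrapped the same way.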