Using Ulrik's example data (and assuming I understand what is wanted), here is what I would do:

    ex.dat <- c("FName: fname1", "Fval: Fval1.name1", "Fval: ",
                "FName: fname2", "Fval: Fval2.name2", "FName: fname3")
    tst <- data.frame(x = ex.dat, stringsAsFactors = FALSE)

    sp <- strsplit(tst$x, ':', fixed = TRUE)
    chk <- unlist(lapply(sp, function(txt) txt[2] != ' '))
    newtst <- tst[chk, , drop = FALSE]

This both assumes and requires that ALL of the rows are structured as in the example data in the original question. For example:

  - If any row is missing the ":", it will fail.
  - If the ":" is not followed by a space character, it may fail (I have not checked).

-Don

--
Don MacQueen
Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062

On 7/13/17, 6:47 AM, "R-help on behalf of Ulrik Stervbo" <r-help-bounces at r-project.org on behalf of ulrik.stervbo at gmail.com> wrote:

    Hi Vijayan,

    one way of going about it *could* be this:

    library(dplyr)
    library(tidyr)
    library(purrr)

    ex_dat <- c("FName: fname1", "Fval: Fval1.name1", "Fval: ",
                "FName: fname2", "Fval: Fval2.name2", "FName: fname3")

    data.frame(x = ex_dat) %>%
      separate(x, c("F1", "F2"), sep = ": ") %>%   # split into key (F1) and value (F2)
      filter(F2 != "") %>%                         # drop rows with an empty value
      group_by(F1) %>%
      mutate(indx = row_number()) %>%              # number FName and Fval rows separately
      spread(F1, F2, fill = "") %>%                # one row per indx; missing Fval becomes ""
      gather(F1, F2, FName, Fval) %>%              # back to one key/value pair per row
      arrange(indx) %>%
      mutate(x = paste(F1, F2, sep = ": ")) %>%
      select(x) %>%
      flatten_chr()

    It is not particularly nice or clever, but it gets the job done using R.

    HTH
    Ulrik

    On Thu, 13 Jul 2017 at 13:13 Vijayan Padmanabhan <V.Padmanabhan at itc.in> wrote:

    > Dear R-help Group
    >
    > Scenario 1:
    > I have a text file running to 1000s of lines that looks like this:
    >
    > [922] "FieldName: Wk3PackSubMonth"
    > [923] "FieldValue: Apr"
    > [924] "FieldName: Wk3PackSubYear"
    > [925] "FieldValue: 2017"
    > [926] "FieldName: Wk3Code1"
    > [927] "FieldValue: "
    > [928] "FieldValue: K4"
    > [929] "FieldName: Wk3Code2"
    > [930] "FieldValue: "
    > [931] "FieldValue: Q49"
    >
    > I want this to be programmatically corrected to read as follows
    > (all consecutive lines starting with FieldValue are cleaned to
    > retain only one line):
    >
    > [922] "FieldName: Wk3PackSubMonth"
    > [923] "FieldValue: Apr"
    > [924] "FieldName: Wk3PackSubYear"
    > [925] "FieldValue: 2017"
    > [926] "FieldName: Wk3Code1"
    > [927] "FieldValue: K4"
    > [928] "FieldName: Wk3Code2"
    > [929] "FieldValue: Q49"
    >
    > Scenario 2:
    > In the same file, in some instances, the lines could be as follows.
    > In this case, wherever a line begins with FieldName and the
    > subsequent line does not display a FieldValue, I want to
    > programmatically identify such lines and insert a FieldValue
    > (as blank).
    >
    > [941] "FieldName: Wk3Code6"
    > [942] "FieldValue: "
    > [943] "FieldName: Wk3Code7"
    > [944] "FieldValue: "
    > [945] "FieldName: Wk3PackWSColorStiffRemarkCode1"
    > [946] "FieldName: Wk3PackWSColorWrappRemarkCode1"
    > [947] "FieldName: Wk3PackWSDelamiStiffRemarkCode1"
    >
    > i.e., the above should be replaced with
    >
    > [941] "FieldName: Wk3Code6"
    > [942] "FieldValue: "
    > [943] "FieldName: Wk3Code7"
    > [944] "FieldValue: "
    > [945] "FieldName: Wk3PackWSColorStiffRemarkCode1"
    > [946] "FieldValue: "
    > [947] "FieldName: Wk3PackWSColorWrappRemarkCode1"
    > [948] "FieldValue: "
    > [949] "FieldName: Wk3PackWSDelamiStiffRemarkCode1"
    > [950] "FieldValue: "
    >
    > Can anybody suggest how to achieve this in R?
    >
    > Thanks for your time.
    > Regards
    > VP
    >
    > ______________________________________________
    > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
    > https://stat.ethz.ch/mailman/listinfo/r-help
    > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
    > and provide commented, minimal, self-contained, reproducible code.
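Don's caveats about rows that lack a ":" or lack the space after it can be sidestepped by matching a pattern instead of splitting the string. What follows is a minimal base-R sketch, assuming the only goal at this step is to drop lines whose value part is empty. The function name `drop_blank_values` and the regular expression are illustrative assumptions, not something posted in the thread.

```r
# Hypothetical helper (not from the thread): drop lines of the form
# "key:" followed only by optional whitespace.
drop_blank_values <- function(x) {
  # TRUE when a line is a key, one colon, and nothing but whitespace after it.
  # Lines without any colon do not match and are therefore kept unchanged.
  is_blank_value <- grepl("^[^:]*:[[:space:]]*$", x)
  x[!is_blank_value]
}

ex.dat <- c("FName: fname1", "Fval: Fval1.name1", "Fval: ",
            "FName: fname2", "Fval: Fval2.name2", "FName: fname3")

cleaned <- drop_blank_values(ex.dat)
cleaned
```

On the example data this keeps the five non-blank lines and removes only "Fval: ". Because it works on a plain character vector, the result of `readLines()` can be passed in directly without first building a data frame.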
@Don your solution does not solve Vijayan's scenario 2. I used spread and gather for that.

An alternative solution to insert the missing Fval, picking up with Don's newtst, is

    newtst <- c("FName: fname1", "Fval: Fval1.name1", "FName: fname2",
                "Fval: Fval2.name2", "FName: fname3", "FName: fname4",
                "Fval: fval4.fname4")

    # The result holds exactly one FName/Fval pair per FName line
    newtst_new <- vector(mode = "character",
                         length = sum(grepl("FName", newtst)) * 2)

    newtst_len <- length(newtst)

    i <- 1  # position in the input
    j <- 1  # position in the output
    while (i <= newtst_len) {
      if (grepl("FName", newtst[i]) && grepl("Fval", newtst[i + 1])) {
        # FName followed by its Fval: copy both lines
        newtst_new[c(j, j + 1)] <- newtst[c(i, i + 1)]
        i <- i + 2
      } else {
        # FName without an Fval: copy the line and add a blank Fval
        newtst_new[c(j, j + 1)] <- c(newtst[i], "Fval: ")
        i <- i + 1
      }
      j <- j + 2
    }

    newtst_new

which is also not very pretty.

HTH
Ulrik
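The same insertion step can also be written without an explicit loop. The sketch below is one possible vectorized variant, assuming blank values have already been removed and that every line starts with either the name key or the value key. The function `pad_missing_values` and its parameters `name_key` and `value_key` are hypothetical names introduced here for illustration.

```r
# Hypothetical helper (not from the thread): after every name line that is
# not immediately followed by a value line, insert a blank value line.
pad_missing_values <- function(x, name_key = "FName", value_key = "Fval") {
  is_name <- grepl(paste0("^", name_key, ":"), x)
  is_val  <- grepl(paste0("^", value_key, ":"), x)

  # Does the line after each element hold a value? The last line has no successor.
  next_is_val <- c(is_val[-1], FALSE)
  needs_blank <- is_name & !next_is_val

  # Duplicate each line that needs a blank, then overwrite the second copy.
  reps <- 1L + needs_blank
  out <- rep(x, times = reps)
  out[cumsum(reps)[needs_blank]] <- paste0(value_key, ": ")
  out
}

newtst <- c("FName: fname1", "Fval: Fval1.name1", "FName: fname2",
            "Fval: Fval2.name2", "FName: fname3", "FName: fname4",
            "Fval: fval4.fname4")

padded <- pad_missing_values(newtst)
padded
```

Unlike pairing the n-th name with the n-th value, this looks only at the line directly after each name, so a missing value in the middle of the file does not shift later pairs. For the original file the call would presumably be `pad_missing_values(lines, "FieldName", "FieldValue")`.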
Thanks Ulrik and MacQueen

I am taking inputs from both of your options to arrive at a solution that works for my specific requirements. I will post my final solution once I succeed, which could help others facing a similar challenge in their work.

I appreciate the time you both spent suggesting these solutions.

Thanks & Regards
VP

From: Ulrik Stervbo <ulrik.stervbo at gmail.com>
To: "MacQueen, Don" <macqueen1 at llnl.gov>, Vijayan Padmanabhan <V.Padmanabhan at itc.in>, "r-help at r-project.org" <r-help at r-project.org>
Date: 14-07-2017 10:39
Subject: Re: [R] Help with R script