Dear R-help Group,

Scenario 1:
I have a text file running to thousands of lines that looks like this:

[922] "FieldName: Wk3PackSubMonth"
[923] "FieldValue: Apr"
[924] "FieldName: Wk3PackSubYear"
[925] "FieldValue: 2017"
[926] "FieldName: Wk3Code1"
[927] "FieldValue: "
[928] "FieldValue: K4"
[929] "FieldName: Wk3Code2"
[930] "FieldValue: "
[931] "FieldValue: Q49"

I want this to be corrected programmatically so that each run of consecutive lines starting with FieldValue is reduced to a single line:

[922] "FieldName: Wk3PackSubMonth"
[923] "FieldValue: Apr"
[924] "FieldName: Wk3PackSubYear"
[925] "FieldValue: 2017"
[926] "FieldName: Wk3Code1"
[927] "FieldValue: K4"
[928] "FieldName: Wk3Code2"
[929] "FieldValue: Q49"

Scenario 2:
In the same file, a line beginning with FieldName is sometimes not followed by a FieldValue line. I want to identify such lines programmatically and insert a blank FieldValue after each of them. For example:

[941] "FieldName: Wk3Code6"
[942] "FieldValue: "
[943] "FieldName: Wk3Code7"
[944] "FieldValue: "
[945] "FieldName: Wk3PackWSColorStiffRemarkCode1"
[946] "FieldName: Wk3PackWSColorWrappRemarkCode1"
[947] "FieldName: Wk3PackWSDelamiStiffRemarkCode1"

should be replaced by:

[941] "FieldName: Wk3Code6"
[942] "FieldValue: "
[943] "FieldName: Wk3Code7"
[944] "FieldValue: "
[945] "FieldName: Wk3PackWSColorStiffRemarkCode1"
[946] "FieldValue: "
[947] "FieldName: Wk3PackWSColorWrappRemarkCode1"
[948] "FieldValue: "
[949] "FieldName: Wk3PackWSDelamiStiffRemarkCode1"
[950] "FieldValue: "

Can anybody suggest how to achieve this in R?

Thanks for your time.
Regards
VP
Hi Vijayan,

one way of going about it *could* be this:

    library(dplyr)
    library(tidyr)
    library(purrr)

    ex_dat <- c("FName: fname1", "Fval: Fval1.name1", "Fval: ",
                "FName: fname2", "Fval: Fval2.name2", "FName: fname3")

    data.frame(x = ex_dat) %>%
      # split each line into its key (F1) and its value (F2)
      separate(x, c("F1", "F2"), sep = ": ") %>%
      # drop the lines whose value is empty
      filter(F2 != "") %>%
      # number the names and the values separately, so that the
      # k-th name is paired with the k-th remaining value
      group_by(F1) %>%
      mutate(indx = row_number()) %>%
      # one row per pair; a name without a value gets ""
      spread(F1, F2, fill = "") %>%
      # back to one line per key or value, in the original order
      gather(F1, F2, FName, Fval) %>%
      arrange(indx) %>%
      mutate(x = paste(F1, F2, sep = ": ")) %>%
      select(x) %>%
      flatten_chr()

It is not particularly nice or clever, but it gets the job done using R.

HTH
Ulrik

On Thu, 13 Jul 2017 at 13:13 Vijayan Padmanabhan <V.Padmanabhan at itc.in> wrote:

> [...]

______________________________________________
R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
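The pipeline above pairs the k-th name with the k-th non-empty value by position, so a name that has no value but comes before one that does could pick up the wrong value. A minimal base-R sketch of an alternative follows, which attaches each value to the FieldName line it comes after. The function name `clean_fields` is hypothetical, and the sketch assumes that every line starts with either `FieldName:` or `FieldValue:` and that the first non-blank value in a run is the one to keep (the original question does not say which).

```r
# Hypothetical helper, not from the thread: emits exactly one
# FieldValue line after every FieldName line.
clean_fields <- function(lines) {
  is_name <- grepl("^FieldName:", lines)
  # Every FieldName line opens a new record; the lines after it
  # belong to that record until the next FieldName line.
  record <- cumsum(is_name)
  # Lines before the first FieldName (record 0) are dropped.
  keep <- record > 0
  records <- split(lines[keep], record[keep])
  out <- lapply(records, function(rec) {
    # Strip the prefix from the value lines and discard blank values.
    values <- sub("^FieldValue:\\s*", "", rec[-1])
    values <- values[nzchar(trimws(values))]
    # Assumption: keep the first non-blank value, else a blank one.
    value <- if (length(values) > 0) values[1] else ""
    c(rec[1], paste0("FieldValue: ", value))
  })
  unlist(out, use.names = FALSE)
}

lines <- c("FieldName: Wk3Code1", "FieldValue: ", "FieldValue: K4",
           "FieldName: Wk3Code6", "FieldValue: ",
           "FieldName: Wk3PackWSColorStiffRemarkCode1",
           "FieldName: Wk3PackWSColorWrappRemarkCode1")
cleaned <- clean_fields(lines)
print(cleaned)
```

For a real file, `readLines()` would supply `lines` and `writeLines()` would write the result back out; both scenarios from the question are handled in the same pass.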
Using Ulrik's example data (and assuming I understand what is wanted), here is what I would do:

    ex.dat <- c("FName: fname1", "Fval: Fval1.name1", "Fval: ",
                "FName: fname2", "Fval: Fval2.name2", "FName: fname3")

    tst <- data.frame(x = ex.dat, stringsAsFactors = FALSE)

    # split each line at the ':' and keep the rows whose value part
    # is something other than a single space
    sp <- strsplit(tst$x, ':', fixed = TRUE)
    chk <- unlist(lapply(sp, function(txt) txt[2] != ' '))
    newtst <- tst[chk, , drop = FALSE]

This both assumes and requires that ALL of the rows are structured as in the example data in the original question. For example, if any row is missing the ':', it will fail. If the ':' is not followed by a space character, it may fail (I have not checked).

-Don

--
Don MacQueen
Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062

On 7/13/17, 6:47 AM, "R-help on behalf of Ulrik Stervbo" <r-help-bounces at r-project.org on behalf of ulrik.stervbo at gmail.com> wrote:

> [...]
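The structural assumption mentioned above (every row has a ':' followed by a space) can be checked before any filtering. The following is a minimal sketch, where `malformed_lines` is a hypothetical helper name and the extra test strings are made up for illustration:

```r
# Hypothetical helper, not from the thread: returns the indices of
# lines that do not start with a key followed by ": ".
malformed_lines <- function(lines) {
  which(!grepl("^[^:]+: ", lines))
}

# Made-up test data: the third line has no space after the colon,
# the fourth has no colon at all.
ex.dat <- c("FName: fname1", "Fval: ", "Fval:", "no separator here")
bad <- malformed_lines(ex.dat)
print(bad)
```

Lines reported here could then be inspected or repaired by hand before the cleaning step is run.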