Dear R-help Group,

Scenario 1: I have a text file running to thousands of lines that looks like this:

    [922] "FieldName: Wk3PackSubMonth"
    [923] "FieldValue: Apr"
    [924] "FieldName: Wk3PackSubYear"
    [925] "FieldValue: 2017"
    [926] "FieldName: Wk3Code1"
    [927] "FieldValue: "
    [928] "FieldValue: K4"
    [929] "FieldName: Wk3Code2"
    [930] "FieldValue: "
    [931] "FieldValue: Q49"

I want this to be programmatically corrected to read as follows (all consecutive lines starting with FieldValue are cleaned up so that only one line is retained):

    [922] "FieldName: Wk3PackSubMonth"
    [923] "FieldValue: Apr"
    [924] "FieldName: Wk3PackSubYear"
    [925] "FieldValue: 2017"
    [926] "FieldName: Wk3Code1"
    [927] "FieldValue: K4"
    [928] "FieldName: Wk3Code2"
    [929] "FieldValue: Q49"

Scenario 2: In the same file, in some instances, the lines could be as shown below. Wherever a line begins with FieldName and the next line is not a FieldValue, I would like to programmatically identify such lines and insert a blank FieldValue after them.

    [941] "FieldName: Wk3Code6"
    [942] "FieldValue: "
    [943] "FieldName: Wk3Code7"
    [944] "FieldValue: "
    [945] "FieldName: Wk3PackWSColorStiffRemarkCode1"
    [946] "FieldName: Wk3PackWSColorWrappRemarkCode1"
    [947] "FieldName: Wk3PackWSDelamiStiffRemarkCode1"

That is, the above should be replaced with:

    [941] "FieldName: Wk3Code6"
    [942] "FieldValue: "
    [943] "FieldName: Wk3Code7"
    [944] "FieldValue: "
    [945] "FieldName: Wk3PackWSColorStiffRemarkCode1"
    [946] "FieldValue: "
    [947] "FieldName: Wk3PackWSColorWrappRemarkCode1"
    [948] "FieldValue: "
    [949] "FieldName: Wk3PackWSDelamiStiffRemarkCode1"
    [950] "FieldValue: "

Can anybody suggest how to achieve this in R? Thanks for your time.

Regards
VP

Hi Vijayan,

one way of going about it *could* be this:

    library(dplyr)
    library(tidyr)
    library(purrr)

    ex_dat <- c("FName: fname1", "Fval: Fval1.name1", "Fval: ",
                "FName: fname2", "Fval: Fval2.name2", "FName: fname3")

    data.frame(x = ex_dat) %>%
      # split each line into its prefix (F1) and its value (F2)
      separate(x, c("F1", "F2"), sep = ": ") %>%
      # drop lines whose value is empty
      filter(F2 != "") %>%
      # number the entries within each prefix
      group_by(F1) %>%
      mutate(indx = row_number()) %>%
      # one row per index; a missing FName or Fval is filled with ""
      spread(F1, F2, fill = "") %>%
      # back to one line per row, ordered by index
      gather(F1, F2, FName, Fval) %>%
      arrange(indx) %>%
      mutate(x = paste(F1, F2, sep = ": ")) %>%
      select(x) %>%
      flatten_chr()

It is not particularly nice or clever, but it gets the job done using R.

HTH
Ulrik

Using Ulrik's example data (and assuming I understand what is wanted), here is what I would do:

    ex.dat <- c("FName: fname1", "Fval: Fval1.name1", "Fval: ",
                "FName: fname2", "Fval: Fval2.name2", "FName: fname3")
    tst <- data.frame(x = ex.dat, stringsAsFactors=FALSE)

    ## split each row at the ':' and keep rows whose second part is not a single space
    sp <- strsplit(tst$x, ':', fixed=TRUE)
    chk <- unlist(lapply(sp, function(txt) txt[2] != ' '))
    newtst <- tst[chk,,drop=FALSE]

This both assumes and requires that ALL of the rows are structured as in the example data in the original question. For example, if any row is missing the ':', it will fail. If the ':' is not followed by a space character it may fail (I have not checked).

-Don

--
Don MacQueen
Lawrence Livermore National Laboratory
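
For the FieldName/FieldValue layout in the original question, both scenarios could probably also be handled in one pass with base R. What follows is only a sketch under stated assumptions, not something posted in the thread: clean_fields is a hypothetical helper name, the lines are assumed to have been read in as a character vector (for instance with readLines()) so that each one starts with either "FieldName:" or "FieldValue:", and the first non-blank FieldValue after a FieldName is assumed to be the one to keep.

```r
# Hypothetical helper (not from the thread): tidy FieldName/FieldValue lines.
# Assumes every line starts with "FieldName:" or "FieldValue:".
clean_fields <- function(lines) {
  is_name  <- startsWith(lines, "FieldName:")
  is_value <- startsWith(lines, "FieldValue:")
  stopifnot(all(is_name | is_value))  # fail loudly on unexpected lines

  # Each FieldName opens a new record; cumsum() gives every line its record id.
  record <- cumsum(is_name)

  # Strip the prefix and surrounding whitespace from the value lines.
  values <- trimws(sub("^FieldValue:", "", lines[is_value]))
  recs   <- record[is_value]

  # Keep non-blank values that belong to some FieldName (record id > 0).
  keep   <- nzchar(values) & recs > 0
  values <- values[keep]
  recs   <- recs[keep]

  # Assumption: the first non-blank value after a FieldName is the one wanted.
  first <- !duplicated(recs)
  out   <- rep("", sum(is_name))  # default: blank value (scenario 2)
  out[recs[first]] <- values[first]

  # Interleave names and values: rbind() stacks them as two rows and
  # as.vector() reads the resulting matrix column by column.
  as.vector(rbind(lines[is_name], sprintf("FieldValue: %s", out)))
}

# Scenario 1 from the question: repeated FieldValue lines.
scenario1 <- c("FieldName: Wk3Code1", "FieldValue: ", "FieldValue: K4",
               "FieldName: Wk3Code2", "FieldValue: ", "FieldValue: Q49")
fixed1 <- clean_fields(scenario1)

# Scenario 2 from the question: FieldName lines with no FieldValue.
scenario2 <- c("FieldName: Wk3Code7", "FieldValue: ",
               "FieldName: Wk3PackWSColorStiffRemarkCode1",
               "FieldName: Wk3PackWSColorWrappRemarkCode1")
fixed2 <- clean_fields(scenario2)
```

The stopifnot() check is there because of the caveat above: rather than silently mangling a row that does not follow the expected layout, the sketch stops with an error. If the real file contains other kinds of lines, that check and the final interleaving step would need to be adapted.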