thr3ads.net - R help - [R] Help with R script [Jul 2017]


MacQueen, Don

2017-Jul-13 14:48 UTC

[R] Help with R script

Using Ulrik's example data (and assuming I understand what is wanted), here is
what I would do:

ex.dat <- c("FName: fname1", "Fval: Fval1.name1",
"Fval: ", "FName: fname2", "Fval: Fval2.name2",
"FName: fname3")
tst <- data.frame(x = ex.dat, stringsAsFactors=FALSE)

sp <- strsplit(tst$x, ':', fixed=TRUE)
chk <- unlist(lapply(sp, function(txt) txt[2] != ' '))
newtst <- tst[chk,,drop=FALSE]

This both assumes and requires that ALL of the rows are structured as in the
example data in the original question.
For example:
  If any row is missing the ':', it will fail.
  If the ':' is not followed by a space character, it may fail (I have not
checked).
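
A more defensive variant of the filter above could be sketched as follows. It is only an illustration, not Don's code: the helper name drop_empty_fval and the two extra test lines ("no colon here" and "Fval:") are assumptions added to exercise the failure cases just listed. It matches empty "Fval" lines with a regular expression instead of splitting on ':', so lines without a colon are left alone rather than producing NA.

```r
# Hypothetical helper (not from the thread): keep every line except
# "Fval" lines whose value is empty or whitespace only.
drop_empty_fval <- function(x) {
  # TRUE for lines such as "Fval:", "Fval: " or "Fval:   "
  empty_val <- grepl("^Fval:[[:space:]]*$", x)
  x[!empty_val]
}

# Don's example data plus two made-up edge cases:
# a line with no colon, and an "Fval:" with no trailing space.
ex.dat <- c("FName: fname1", "Fval: Fval1.name1",
            "Fval: ", "FName: fname2", "Fval: Fval2.name2",
            "FName: fname3", "no colon here", "Fval:")

cleaned <- drop_empty_fval(ex.dat)
print(cleaned)
```

Like the strsplit version, this drops every empty "Fval" line, so the blank values wanted in scenario 2 would still have to be re-inserted in a separate step.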

-Don

-- 
Don MacQueen

Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062


On 7/13/17, 6:47 AM, "R-help on behalf of Ulrik Stervbo"
<r-help-bounces at r-project.org on behalf of ulrik.stervbo at gmail.com>
wrote:

    Hi Vijayan,
    
    one way going about it *could* be this:
    
    library(dplyr)
    library(tidyr)
    library(purrr)
    
    ex_dat <- c("FName: fname1", "Fval: Fval1.name1", "Fval: ",
                "FName: fname2", "Fval: Fval2.name2", "FName: fname3")
    
    data.frame(x = ex_dat) %>%
      separate(x, c("F1", "F2"), sep = ": ") %>%
      filter(F2 != "") %>%
      group_by(F1) %>%
      mutate(indx = row_number()) %>%
      spread(F1, F2, fill = "") %>%
      gather(F1, F2, FName, Fval) %>%
      arrange(indx) %>%
      mutate(x = paste(F1, F2, sep = ": ")) %>%
      select(x) %>%
      flatten_chr()
    
    It is not particularly nice or clever, but it gets the job done using R.
    
    HTH
    Ulrik
    
    On Thu, 13 Jul 2017 at 13:13 Vijayan Padmanabhan <V.Padmanabhan at itc.in>
    wrote:
    
    >
    > Dear R-help Group
    >
    >
    > Scenario 1:
    > I have a text file running to 1000s of lines that
    > looks as follows:
    >
    > [922] "FieldName: Wk3PackSubMonth"
    >
    >  [923] "FieldValue: Apr"
    >
    >  [924] "FieldName: Wk3PackSubYear"
    >
    >  [925] "FieldValue: 2017"
    >
    >  [926] "FieldName: Wk3Code1"
    >
    >  [927] "FieldValue: "
    >
    >  [928] "FieldValue: K4"
    >
    >  [929] "FieldName: Wk3Code2"
    >
    >  [930] "FieldValue: "
    >
    >  [931] "FieldValue: Q49"
    >
    >
    > I want this to be programmatically corrected to
    > read as follows (all consecutive lines starting
    > with FieldValue are cleaned to retain only one
    > line):
    >
    > [922] "FieldName: Wk3PackSubMonth"
    >
    >  [923] "FieldValue: Apr"
    >
    >  [924] "FieldName: Wk3PackSubYear"
    >
    >  [925] "FieldValue: 2017"
    >
    >  [926] "FieldName: Wk3Code1"
    >
    >  [927] "FieldValue: K4"
    >
    >  [928] "FieldName: Wk3Code2"
    >
    >  [929] "FieldValue: Q49"
    >
    > Scenario 2:
    > In the same file, in some instances, the lines
    > could be as follows: in this case, wherever a line
    > is beginning with FieldName and the subsequent
    > line is not displaying a FieldValue, I would want
    > to programmatically identify such lines and insert
    > FieldValue (as blank).
    >
    > [941] "FieldName: Wk3Code6"
    >
    >  [942] "FieldValue: "
    >
    >  [943] "FieldName: Wk3Code7"
    >
    >  [944] "FieldValue: "
    >
    >  [945] "FieldName: Wk3PackWSColorStiffRemarkCode1"
    >
    >  [946] "FieldName: Wk3PackWSColorWrappRemarkCode1"
    >
    >  [947] "FieldName: Wk3PackWSDelamiStiffRemarkCode1"
    >
    >
    > i.e., the above should be replaced with
    >
    > [941] "FieldName: Wk3Code6"
    >
    >  [942] "FieldValue: "
    >
    >  [943] "FieldName: Wk3Code7"
    >
    >  [944] "FieldValue: "
    >
    >  [945] "FieldName: Wk3PackWSColorStiffRemarkCode1"
    >  [946] "FieldValue: "
    >
    >  [947] "FieldName: Wk3PackWSColorWrappRemarkCode1"
    >  [948] "FieldValue: "
    >
    >  [949] "FieldName: Wk3PackWSDelamiStiffRemarkCode1"
    >  [950] "FieldValue: "
    >
    >
    > Can anybody suggest how to achieve this in R?
    >
    > Thanks for your time.
    > Regards
    > VP
    >
    >
    >
    >
    > ______________________________________________
    > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
    > https://stat.ethz.ch/mailman/listinfo/r-help
    > PLEASE do read the posting guide
    > http://www.R-project.org/posting-guide.html
    > and provide commented, minimal, self-contained, reproducible code.
    >
    

Ulrik Stervbo

2017-Jul-14 05:09 UTC


[R] Help with R script

@Don your solution does not solve Vijayan's scenario 2. I used spread and
gather for that.

An alternative solution to insert missing Fval - picking up with Don's
newtst - is

newtst <- c("FName: fname1", "Fval: Fval1.name1",
            "FName: fname2", "Fval: Fval2.name2",
            "FName: fname3", "FName: fname4", "Fval: fval4.fname4")

# One FName/Fval pair per FName line
newtst_new <- vector(mode = "character",
                     length = sum(grepl("FName", newtst)) * 2)
newtst_len <- length(newtst)
i <- 1
j <- 1
while (i <= newtst_len) {
  if (grepl("FName", newtst[i]) && grepl("Fval", newtst[i + 1])) {
    # FName followed by its Fval: copy the pair
    newtst_new[c(j, j + 1)] <- newtst[c(i, i + 1)]
    i <- i + 2
  } else {
    # FName without an Fval: insert a blank Fval
    newtst_new[c(j, j + 1)] <- c(newtst[i], "Fval: ")
    i <- i + 1
  }
  j <- j + 2
}
newtst_new

which is also not very pretty.
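
For comparison, the same pairing step can be sketched without an explicit loop. This is only an illustration under the same assumption as the loop (empty duplicate "Fval: " lines have already been dropped, as in Don's newtst); the helper name pad_missing_fval is hypothetical and does not come from the thread.

```r
# Hypothetical helper (not from the thread): make sure every "FName"
# line is followed by an "Fval" line, inserting a blank one where it
# is missing.
pad_missing_fval <- function(x) {
  is_name <- grepl("^FName:", x)
  # Is the following line an Fval line? The last line has no successor.
  next_is_val <- c(grepl("^Fval:", x[-1]), FALSE)
  needs_pad <- is_name & !next_is_val
  # Repeat each line once, or twice where a blank value must follow.
  out <- rep(x, times = 1 + needs_pad)
  # The second copy of each padded line sits at the cumulative count
  # of output lines up to and including that line.
  pad_pos <- cumsum(1 + needs_pad)[needs_pad]
  out[pad_pos] <- "Fval: "
  out
}

newtst <- c("FName: fname1", "Fval: Fval1.name1",
            "FName: fname2", "Fval: Fval2.name2",
            "FName: fname3", "FName: fname4", "Fval: fval4.fname4")

padded <- pad_missing_fval(newtst)
print(padded)
```

For this input the result has eight elements, with a blank "Fval: " inserted after "FName: fname3", which is the same vector the while loop builds.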

HTH
Ulrik


Vijayan Padmanabhan

2017-Jul-14 11:59 UTC


[R] Help with R script

Thanks, Ulrik and MacQueen.
I am taking inputs from both of your options to
arrive at a solution that will work for my
specific requirements.
I will post my final solution once I succeed, which
could help others with a similar challenge in their
work.
I appreciate the time you both spent suggesting
these solutions.

Thanks & Regards
VP





