thr3ads.net - R help - [R] Reformatting text inside a data frame [Sep 2015]

Jon BR

2015-Sep-07 19:27 UTC

[R] Reformatting text inside a data frame

Hi all,
    I've read in a large data frame that has formatting similar to the one
in the small example below:

df <-
data.frame(c(1,2,3),c(NA,"AD=2;BA=8","AD=9;BA=1"),c("AD=13;BA=49","AD=1;BA=2",NA));
names(df) <- c("rowNum","first","second")

> df
  rowNum     first      second
1      1      <NA> AD=13;BA=49
2      2 AD=2;BA=8   AD=1;BA=2
3      3 AD=9;BA=1        <NA>


I'd like to reformat all of the non-NA entries in df from "first" and
"second" and so on, such that "AD=13;BA=49" will be replaced by the
following string: "13_13-49".

So applied to df, the output would be the following:

  rowNum first   second
1      1  <NA> 13_13-49
2      2 2_2-8    1_1-2
3      3 9_9-1     <NA>


I'm generally a big proponent of shell scripting with awk, but I'd prefer
an all-R solution if one exists (and also to learn how to do this more
generally).

Could someone point out an appropriate paradigm or otherwise point me in
the right direction?

Best,
Jonathan

John Kane

2015-Sep-07 19:48 UTC

I'm not making a lot of sense of the data; it looks like you want more
recodes than you have mentioned, but in any case you might want to look at the
recode function in the car package. It "should" do what you want,
though there may be faster ways to do it.

BTW, for supplying sample data have a look at ?dput . Using dput() means that we
see exactly the same data as you do.

Sorry not to be of more help
John Kane
Kingston ON Canada

Jon BR

2015-Sep-07 20:20 UTC

Hi John,
     Thanks for the reply; I'm pasting here the output from dput, with a
'df <-' added in front:

df <- structure(list(rowNum = c(1, 2, 3), first = structure(c(NA, 1L,
2L), .Label = c("AD=2;BA=8", "AD=9;BA=1"), class = "factor"),
    second = structure(c(2L, 1L, NA), .Label = c("AD=1;BA=2",
    "AD=13;BA=49"), class = "factor")), .Names = c("rowNum",
"first", "second"), row.names = c(NA, -3L), class = "data.frame")




To be more specific about what I would like: each value to be adjusted
has the following general format:

"AD=X;BA=Y"

I would like to extract the values of X and Y and format them as a string
as such:

"X_X-Y"
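
For reference, a minimal sketch of this single-string transformation in R, using a regular expression with sub(). The helper name reformat_ad_ba is hypothetical (it does not come from the thread), and the pattern assumes every non-NA value has exactly the "AD=X;BA=Y" shape; values that do not match are returned unchanged.

```r
# Hypothetical helper (not from the thread): rewrite "AD=X;BA=Y" as "X_X-Y".
reformat_ad_ba <- function(x) {
  # Work on character data; a factor is converted to its labels first.
  x <- as.character(x)
  # \\1 captures the AD value (everything up to the ';'),
  # \\2 captures the BA value. NA inputs stay NA.
  sub("^AD=([^;]*);BA=(.*)$", "\\1_\\1-\\2", x)
}

result <- reformat_ad_ba(c("AD=13;BA=49", NA, "AD=2;BA=8"))
print(result)
```

Because sub() is vectorised, the same call works on a whole column at once rather than one string at a time.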


Here's how I would handle a specific instance using awk in a shell script:

echo "AD=X;BA=Y" | awk '{split($1,a,"AD="); split(a[2],b,";");
  split(b[2],c,"BA="); print b[1]"_"b[1]"-"c[2]}'
X_X-Y

I'd like this to apply for all the entries that aren't NA to the right of
column 1.
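
One possible way to apply the rewrite to every column to the right of column 1 is sketched below, assuming all of those columns hold values of the form "AD=X;BA=Y" or NA. The example rebuilds the small data frame from the question with factor columns (as in the dput() output); lapply() visits each selected column and sub() handles the whole column in one call. The converted columns come back as character vectors rather than factors, which is a choice of this sketch, not a requirement.

```r
# Rebuild the example data; stringsAsFactors = TRUE mimics the factor
# columns shown in the dput() output.
df <- data.frame(
  rowNum = c(1, 2, 3),
  first  = c(NA, "AD=2;BA=8", "AD=9;BA=1"),
  second = c("AD=13;BA=49", "AD=1;BA=2", NA),
  stringsAsFactors = TRUE
)

# Every column to the right of column 1.
cols <- names(df)[-1]

# Rewrite "AD=X;BA=Y" as "X_X-Y" in each selected column; NA stays NA.
df[cols] <- lapply(df[cols], function(col) {
  sub("^AD=([^;]*);BA=(.*)$", "\\1_\\1-\\2", as.character(col))
})

print(df)
```

Selecting columns by position with names(df)[-1] keeps the sketch general for data frames with more than two such columns.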

Hoping this adds clarity for any others who also didn't follow my example.

Thanks in advance for any tips.

Best,
Jonathan
