thr3ads.net - R help - [R] how to Subset based on partial matching of columns? [Apr 2015]


samarvir singh

2015-Apr-09 00:57 UTC

[R] how to Subset based on partial matching of columns?

So I have a list that contains certain characters as shown below

`list <- c("MY","GM+"
,"TY","RS","LG")`

And I have a variable named "CODE" in the data frame as follows

`code <- c("MY GM+", ,"LGTY",
"RS","TY")`
'x <- c(1:5)
`df <- data.frame(x,code)`

df

    x  code
    1  MY GM+
    2
    3  LGTY
    4  RS
    5  TY


Now I want to create 5 new variables named "MY","GM+","TY","RS","LG"

Each takes a binary value: 1 if its name matches part of the CODE variable, 0 otherwise

    df
    x  code    MY  GM+  TY  RS  LG
    1  MY GM+  1   1    0   0   0
    2          0   0    0   0   0
    3  LGTY    0   0    1   0   1
    4  RS      0   0    0   1   0
    5  TY      0   0    1   0   0

Really appreciate your help. Thank you.


Sarah Goslee

2015-Apr-09 13:24 UTC


Hi,

Please don't put quotes around your code. It makes it hard to copy and
paste. Alternatively, don't post in HTML, because it screws up your
code.

On Wed, Apr 8, 2015 at 8:57 PM, samarvir singh <samarvir1996 at gmail.com> wrote:
> So I have a list that contains certain characters as shown below
>
> `list <- c("MY","GM+","TY","RS","LG")`

That's a character vector, not a list. A list is a specific type of object
in R.
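
To illustrate the distinction, here is a minimal sketch. The names codes_vec and codes_list are made up for this example and do not appear in the original post.

```r
# c() builds an atomic character vector; list() builds a list.
codes_vec  <- c("MY", "GM+", "TY", "RS", "LG")
codes_list <- list("MY", "GM+", "TY", "RS", "LG")

is.character(codes_vec)   # TRUE
is.list(codes_vec)        # FALSE
is.list(codes_list)       # TRUE

# unlist() flattens a list of strings back into a character vector.
identical(unlist(codes_list), codes_vec)
```
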

> And I have a variable named "CODE" in the data frame as follows
>
> `code <- c("MY GM+", ,"LGTY","RS","TY")`

That doesn't work, and I have no idea what you expect to have there,
so I'm deleting the extra comma. Also, your vector is named code, not
CODE.

code <- c("MY GM+", "LGTY",
"RS","TY")
x <- c(1:4)
> 'x <- c(1:5)
> `df <- data.frame(x,code)`
You probably actually want
mydf <- data.frame(x, code, stringsAsFactors=FALSE)

Note I changed the name, because df() is already an R function (the
density of the F distribution, in the stats package).
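
A small sketch of what stringsAsFactors changes, using the corrected four-element vector. The names as_factor and as_char are illustrative only. Since R 4.0.0 the default is already FALSE, so the argument matters mainly on older versions or when set explicitly.

```r
x <- 1:4
code <- c("MY GM+", "LGTY", "RS", "TY")

# With stringsAsFactors = TRUE the strings are stored as a factor.
as_factor <- data.frame(x, code, stringsAsFactors = TRUE)

# With stringsAsFactors = FALSE they stay plain character strings,
# which is what pattern matching on the column usually wants.
as_char <- data.frame(x, code, stringsAsFactors = FALSE)

class(as_factor$code)  # "factor"
class(as_char$code)    # "character"
```
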

> Now I want to create 5 new variables named "MY","GM+","TY","RS","LG"
>
> Each takes a binary value: 1 if its name matches part of the CODE variable, 0 otherwise
>
>     df
>     x  code    MY  GM+  TY  RS  LG
>     1  MY GM+  1   1    0   0   0
>     2          0   0    0   0   0
>     3  LGTY    0   0    1   0   1
>     4  RS      0   0    0   1   0
>     5  TY      0   0    1   0   0

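grepl() will give you a logical match. Loop over the patterns (the
character vector you named list), not over code itself, and use
fixed=TRUE so that the "+" in "GM+" is matched literally rather than
treated as a regular-expression quantifier:

data.frame(mydf, sapply(list, function(p) grepl(p, mydf$code, fixed=TRUE)),
stringsAsFactors=FALSE, check.names=FALSE)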
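
For reference, a self-contained sketch of the grepl() approach that also produces the 1/0 values asked for. Two assumptions are made here that are not in the original messages: the second element of code is taken to be an empty string (as the desired output suggests), and the pattern vector is called patterns so that it does not mask list().

```r
# Patterns to look for (hypothetical name, to avoid masking list()).
patterns <- c("MY", "GM+", "TY", "RS", "LG")

# Assumption: the missing second element was meant to be an empty string.
mydf <- data.frame(x = 1:5,
                   code = c("MY GM+", "", "LGTY", "RS", "TY"),
                   stringsAsFactors = FALSE)

# One logical column per pattern; fixed = TRUE treats "GM+" as a
# literal substring instead of a regular expression.
hits <- sapply(patterns, function(p) grepl(p, mydf$code, fixed = TRUE))

# Multiplying by 1L converts TRUE/FALSE to 1/0 and keeps the matrix shape.
result <- data.frame(mydf, hits * 1L, check.names = FALSE)
result
```

check.names = FALSE keeps the column name "GM+" as written; without it, data.frame() would rewrite it into a syntactically valid name.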

Sarah


-- 
Sarah Goslee
http://www.functionaldiversity.org

samarvir singh

2015-Apr-09 13:49 UTC


Thank you, Sarah Goslee. I am rather new to R, so people like you
are a great support. I really appreciate you taking the time to correct my
mistakes. Thanks.

