thr3ads.net - R help - [R] Separating a Complicated String Vector [Jan 2015]


npretnar

2015-Jan-04 02:41 UTC

[R] Separating a Complicated String Vector

I have a string variable (V1) in a data frame structured as follows:

V1	V2	
A	5
a1	1
a2	1
a3	1
a4	1
a5	1
B	4
b1	1	
b2	1
b3	1
b4	1

I want the following:

V1	V2	V3	
a1	1	A	
a2	1	A
a3	1	A
a4	1	A
a5	1	A
b1	1	B
b2	1	B
b3	1	B
b4	1	B

I am not sure how to go about making this transformation besides writing a long
vector that contains each of the categorical string names (these are state
names, so it would be a really long vector). Any help would be greatly
appreciated.

Thanks,

Nicholas Pretnar
Mizzou Economics Grad Assistant
npretnar at gmail.com

Ista Zahn

2015-Jan-04 03:22 UTC


[R] Separating a Complicated String Vector

I'm not sure what's so complicated about that (am I missing
something?). You can search using grepl, and replace using gsub, so

tmpDF <- read.table(text="V1      V2
A       5
a1      1
a2      1
a3      1
a4      1
a5      1
B       4
b1      1
b2      1
b3      1
b4      1",
                    header=TRUE)
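# Keep only the rows whose V1 contains a digit (drops the header rows "A" and "B").
tmpDF <- tmpDF[grepl("[0-9]", tmpDF$V1), ]
# Derive V3 from V1 itself: strip the digits, then upper-case what is left.
data.frame(tmpDF, V3 = toupper(gsub("[0-9]", "", tmpDF$V1)))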

Seems to do the trick.

Best,
Ista
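
A variant of the approach above, as a minimal sketch: instead of deriving V3 from the lower-level names, it takes the label from the header row itself and carries it down to the rows beneath. It assumes that header rows are exactly the rows whose V1 contains no digit and that the first row is a header; the names is_header and result are illustrative and do not appear in the original thread.

```r
# Rebuild the example data; stringsAsFactors = FALSE keeps V1 as character.
tmpDF <- read.table(text = "V1 V2
A 5
a1 1
a2 1
a3 1
a4 1
a5 1
B 4
b1 1
b2 1
b3 1
b4 1", header = TRUE, stringsAsFactors = FALSE)

# Assumption: a header row is any row whose V1 contains no digit.
is_header <- !grepl("[0-9]", tmpDF$V1)

# cumsum() numbers the blocks 1, 2, ...; indexing the header labels by that
# block number carries each header down to the rows beneath it.
tmpDF$V3 <- tmpDF$V1[is_header][cumsum(is_header)]

# Drop the header rows themselves and reset the row names.
result <- tmpDF[!is_header, ]
rownames(result) <- NULL
result
```

Because the label comes from the header row, this does not depend on the detail rows sharing a prefix with their header.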


npretnar

2015-Jan-04 05:20 UTC


[R] Separating a Complicated String Vector

Sorry. Bad example on my part. Try this. V1 is ...

V1
alabama
bates
tuscaloosa
smith
arkansas
fayette
little rock
alaska
juneau
nome

And I want:

V1          V2
alabama     bates
alabama     tuscaloosa
alabama     smith
arkansas    fayette
arkansas    little rock
alaska      juneau
alaska      nome

This is more representative of the problem, extended to all 50 states.

- Nick
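
One possible way to get the layout described above, as a sketch only: the thread does not say how state rows are told apart from the other rows, so this assumes a row is a state header when it matches R's built-in state.name vector (lower-cased). The names df, is_state, state and result are illustrative.

```r
# The example column from the message above.
df <- data.frame(
  V1 = c("alabama", "bates", "tuscaloosa", "smith",
         "arkansas", "fayette", "little rock",
         "alaska", "juneau", "nome"),
  stringsAsFactors = FALSE
)

# Assumption: a row is a state header if it matches a built-in state name.
# state.name is the vector of 50 US state names in R's datasets package.
is_state <- df$V1 %in% tolower(state.name)

# Carry each state name down over the rows that follow it.
state <- df$V1[is_state][cumsum(is_state)]

# Keep only the non-state rows, paired with the state they fall under.
result <- data.frame(V1 = state[!is_state], V2 = df$V1[!is_state],
                     stringsAsFactors = FALSE)
result
```

This would misclassify any lower-level name that happens to equal a state name (a county called "washington", say), so a real data set may need a more reliable marker for the header rows.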


