thr3ads.net - R help - [R] Split data frame and create a new column [Nov 2012]

Zlatan

2012-Nov-16 00:05 UTC

[R] Split data frame and create a new column

I need to split one column of a data frame into 3 columns. The column I want
to split encodes the lag (prefix L1 or L2, or suffix 01, 03, 04), the station
name (shown in the sample data as a capital G, P or S) and the pollutant
name. Names with no "L" prefix and no suffix such as 01 or 04 are lag 0. Lag 01
is the average of lags 0 and 1, and 04 is the average of days 0 to 4. How can
one do that in R? I will ignore the other components (e.g. 10, max or mean).



Current data:

L1o3maxG10
L1o3P10
L2o3G10
noxP10
pm25S_01
comeanS_03
noxP_04

What I want to get:

pollutant  Lag  station
o3         1    G
o3         1    P
o3         2    G
nox        0    P
pm25       01   S
co         03   S
nox        04   P


Thanks




--
View this message in context:
http://r.789695.n4.nabble.com/Split-data-frame-and-create-a-new-column-tp4649683.html
Sent from the R help mailing list archive at Nabble.com.

Rui Barradas

2012-Nov-17 15:22 UTC

Hello,

I don't know whether this is general enough for your real data, but try:


x <- scan(what = "character", text="
L1o3maxG10
L1o3P10
L2o3G10
noxP10
pm25S_01
comeanS_03
noxP_04")

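fun <- function(x){
     # Pollutant: split on the lag prefix and the station letter, keep the
     # first non-empty piece of each name, then strip "max"/"mean".
     r1 <- unlist(strsplit(x, "L[[:digit:]]+|G|P|S"))
     r1 <- r1[nchar(r1) != 0]
     r1 <- r1[rep(c(TRUE, FALSE), length(r1)/2)]
     r1 <- unlist(strsplit(r1, "max|mean"))
     r1 <- r1[nchar(r1) != 0]

     # Lag: defaults to "0"; kept as character so "01" stays distinct from "1".
     r2 <- rep("0", length(x))
     # Lag given as an "L<digits>" prefix.
     w2 <- grep("L[[:digit:]]+", x)
     re2 <- regexpr("L[[:digit:]]+", x)
     re2 <- unlist(strsplit(regmatches(x, re2), "L"))
     re2 <- re2[nchar(re2) != 0]
     r2[w2] <- re2
     # Lag given as a "_<digits>" suffix after the station letter.
     w2 <- grep("G_|P_|S_", x)
     re2 <- regmatches(x, regexpr("(G_|P_|S_)[[:digit:]]+", x))
     re2 <- unlist(strsplit(re2, "G_|P_|S_"))
     re2 <- re2[nchar(re2) != 0]
     r2[w2] <- re2

     # Station: the first capital G, P or S in each name.
     r3 <- regmatches(x, regexpr("G|P|S", x))

     data.frame(pollutant = r1, lag = r2, station = r3)
}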

fun(x)
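
A shorter variant of the same idea, using only sub() and grepl(), might look like the sketch below. It is not a general solution: it assumes every name ends in the station letter followed by digits (optionally after an underscore), that the station is exactly one of G, P or S, and that pollutant names are lower case and do not start with a digit. The variable names are only illustrative.

```r
x <- c("L1o3maxG10", "L1o3P10", "L2o3G10", "noxP10",
       "pm25S_01", "comeanS_03", "noxP_04")

# Which names carry a lag prefix ("L<digits>") or a lag suffix ("_<digits>")?
has_prefix <- grepl("^L[0-9]+", x)
has_suffix <- grepl("_[0-9]+$", x)

# Lag defaults to "0"; kept as character so "01" stays distinct from "1".
lag <- rep("0", length(x))
lag[has_prefix] <- sub("^L([0-9]+).*$", "\\1", x[has_prefix])
lag[has_suffix] <- sub("^.*_([0-9]+)$", "\\1", x[has_suffix])

# Station: the capital letter just before the trailing digits.
station <- sub("^.*([GPS])_?[0-9]+$", "\\1", x)

# Pollutant: drop the lag prefix, the station and everything after it,
# then a trailing "max" or "mean".
core <- sub("^L[0-9]+", "", x)
core <- sub("[GPS]_?[0-9]+$", "", core)
pollutant <- sub("(max|mean)$", "", core)

res <- data.frame(pollutant, lag, station, stringsAsFactors = FALSE)
print(res)
```

If the real data contain station codes other than G, P and S, or pollutant names with capital letters, the patterns would need to be adjusted.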


Hope this helps,

Rui Barradas
