thr3ads.net - R help - [R] restructuring datset problem [Sep 2008]

Gellrich Mario

2008-Sep-07 18:23 UTC

[R] restructuring datset problem

Hi,

I've got a question regarding the restructuring of a data set. What I have
are municipality zip codes and the names of 5'000 built-up areas within
municipalities. The following example shows what I would like to do:

Input (Zip-Codes and Names): 

#     CODE     NAME
#1       3      aaa
#2       3      aab
#3       3      aac
#4       4      bba
#5       4      bbb
#6       4      bbc
#7       4      bbd
#8       5      cca
#9       5      ccb

Desired Output (Zip-Codes and restructured names)

#  CODE  V2    V3    V4    V5
#1  3   aaa   aab   aac    NA
#2  4   bba   bbb   bbc   bbd
#3  5   cca   ccb   NA     NA

I thought about this problem for several hours and tried functions like
aggregate() and t() in combination with for-loops, but didn't arrive at the
output above. Can anybody help me?

Best regards,

Mario

jim holtman

2008-Sep-07 20:55 UTC

[R] restructuring datset problem

This should do it for you:

> x
  CODE NAME
1    3  aaa
2    3  aab
3    3  aac
4    4  bba
5    4  bbb
6    4  bbc
7    4  bbd
8    5  cca
9    5  ccb
> x.s <- split(x$NAME, x$CODE)
> maxLine <- max(table(x$CODE))
> # pad out the lines
> x.pad <- lapply(x.s, function(line){
+     # convert to character
+     line <- as.character(line)
+     length(line) <- maxLine
+     line
+ })
> as.data.frame(do.call(rbind, x.pad))
   V1  V2   V3   V4
3 aaa aab  aac <NA>
4 bba bbb  bbc  bbd
5 cca ccb <NA> <NA>
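
The split-and-pad idea above can be wrapped into a small function that also adds the CODE column and the V2, V3, ... names shown in the desired output. This is only a sketch: the helper name widen_names is made up for illustration and is not part of the thread, and the input is rebuilt by hand from the example in the original post.

```r
# Rebuild the example input from the original post (NAME kept as character).
x <- data.frame(
  CODE = c(3, 3, 3, 4, 4, 4, 4, 5, 5),
  NAME = c("aaa", "aab", "aac", "bba", "bbb", "bbc", "bbd", "cca", "ccb"),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not from the thread): one row per CODE,
# with the names of each group padded with NA to a common width.
widen_names <- function(df) {
  groups <- split(as.character(df$NAME), df$CODE)
  width <- max(lengths(groups))
  padded <- lapply(groups, function(g) {
    length(g) <- width  # extending a vector fills the new slots with NA
    g
  })
  out <- as.data.frame(do.call(rbind, padded), stringsAsFactors = FALSE)
  names(out) <- paste0("V", seq_len(width) + 1L)  # V2, V3, ... as in the post
  # split() orders groups by sorted unique CODE, so this lines up row by row
  res <- cbind(CODE = sort(unique(df$CODE)), out)
  rownames(res) <- NULL
  res
}

result <- widen_names(x)
print(result)
```

Because split() returns the groups in sorted order of CODE, the rows come out sorted by zip code even if the input is not.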



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?

Gabor Grothendieck

2008-Sep-07 21:30 UTC

[R] restructuring datset problem

Try this:

> # read in data ensuring NAME is character, not factor
> Lines <- " CODE NAME
+ 1    3  aaa
+ 2    3  aab
+ 3    3  aac
+ 4    4  bba
+ 5    4  bbb
+ 6    4  bbc
+ 7    4  bbd
+ 8    5  cca
+ 9    5  ccb
+ "
> DF <- read.table(textConnection(Lines), header = TRUE, as.is = TRUE)
>
> DF$seq = ave(DF$CODE, DF$CODE, FUN = seq_along)
> tapply(DF$NAME, DF[c("CODE", "seq")], c)
    seq
CODE 1     2     3     4
   3 "aaa" "aab" "aac" NA
   4 "bba" "bbb" "bbc" "bbd"
   5 "cca" "ccb" NA    NA
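
The tapply() call above returns a character matrix. If a data frame is wanted instead, the same within-group sequence column can be fed to base R's reshape(). This is a different technique from the one posted, shown only as a sketch; the input is rebuilt by hand from the example, and the NAME.1, NAME.2, ... column names are what reshape() generates by default, not the V2, V3, ... names from the original question.

```r
# Rebuild the example input from the original post (NAME kept as character).
DF <- data.frame(
  CODE = c(3, 3, 3, 4, 4, 4, 4, 5, 5),
  NAME = c("aaa", "aab", "aac", "bba", "bbb", "bbc", "bbd", "cca", "ccb"),
  stringsAsFactors = FALSE
)

# Number the rows within each CODE: 1, 2, 3, 1, 2, 3, 4, 1, 2
DF$seq <- ave(DF$CODE, DF$CODE, FUN = seq_along)

# Long to wide: one row per CODE, one NAME.<seq> column per position.
# Positions that do not exist for a CODE are filled with NA.
wide <- reshape(DF, idvar = "CODE", timevar = "seq", direction = "wide")
rownames(wide) <- NULL
print(wide)
```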

