thr3ads.net - R help - [R] Reorganize(stack data) a dataframe inducing names [Jul 2011]

Francesca

2011-Jul-27 08:28 UTC

[R] Reorganize(stack data) a dataframe inducing names

Dear Contributors,
thanks for your collaboration.
I am trying to reorganize a data frame that looks like this:

  n1.Index  Date      PX_LAST  n2.Index  Date.1    PX_LAST.1  n3.Index  Date.2    PX_LAST.2
1 NA        04/02/07  1.34     NA        04/02/07  1.36       NA        04/02/07  1.33
2 NA        04/09/07  1.34     NA        04/09/07  1.36       NA        04/09/07  1.33
3 NA        04/16/07  1.34     NA        04/16/07  1.36       NA        04/16/07  1.33
4 NA        04/30/07  1.36     NA        04/30/07  1.40       NA        04/30/07  1.37
5 NA        05/07/07  1.36     NA        05/07/07  1.40       NA        05/07/07  1.37
6 NA        05/14/07  1.36     NA        05/14/07  1.40       NA        05/14/07  1.37
7 NA        05/22/07  1.36     NA        05/22/07  1.40       NA        05/22/07  1.37


What I would like to obtain is stacked data like this:

n1.Index  Date      PX_LAST
n1.Index  04/02/07  1.34
n1.Index  04/09/07  1.34
n1.Index  04/16/07  1.34
n1.Index  04/30/07  1.36
n1.Index  05/07/07  1.36
n1.Index  05/14/07  1.36
n1.Index  05/22/07  1.36
n2.Index  04/02/07  1.36
n2.Index  04/09/07  1.36
n2.Index  04/16/07  1.36
n2.Index  04/30/07  1.40
n2.Index  05/07/07  1.40
n2.Index  05/14/07  1.40
n2.Index  05/22/07  1.40
n3.Index  04/02/07  1.33
n3.Index  04/09/07  1.33
n3.Index  04/16/07  1.33
n3.Index  04/30/07  1.37

I have tried the function stack, but it uses only one argument. Then I
have tested the melt function from the package reshape, but it
seems not to be reproducing the correct organization of the data, as
it takes date as the id values.
PS: the n1 index names are not ordered in the original database, so
I cannot fill in the NA with the names using a recursive formula.
Thank you for any help you can provide.
Francesca

-- 
Francesca

----------------------------------
Francesca Pancotto, PhD
Dipartimento di Economia
Università di Bologna
Piazza Scaravilli, 2
40126 Bologna
Office: +39 051 2098135
Cell: +39 393 6019138
Web: http://www2.dse.unibo.it/francesca.pancotto/
----------------------------------


Jim Lemon

2011-Jul-27 10:02 UTC

On 07/27/2011 06:28 PM, Francesca wrote:
> Dear Contributors,
> thanks for collaboration.
> I am trying to reorganize data frame, that looks like this:
>
>       n1.Index   Date        PX_LAST    n2.Index   Date.1     PX_LAST.1
> n3.Index       Date.2             PX_LAST.2
> 1     NA        04/02/07    1.34      NA              04/02/07      1.36
>         NA              04/02/07      1.33
> 2     NA        04/09/07    1.34      NA              04/09/07
> 1.36           NA              04/09/07      1.33
> 3     NA         04/16/07    1.34      NA             04/16/07      1.36
>        NA              04/16/07      1.33
> 4     NA         04/30/07    1.36      NA             04/30/07
> 1.40           NA              04/30/07      1.37
> 5     NA        05/07/07    1.36      NA              05/07/07
> 1.40           NA              05/07/07      1.37
> 6     NA         05/14/07    1.36      NA             05/14/07      1.40
>        NA              05/14/07      1.37
> 7     NA         05/22/07    1.36      NA             05/22/07      1.40
>        NA              05/22/07      1.37
>
>
> While what I would like to obtain is:
> I would like to obtain stacked data as:
>
> n1.Index       Date        PX_LAST
> n1.Index    04/02/07    1.34
> n1.Index    04/09/07    1.34
> n1.Index     04/16/07    1.34
> n1.Index     04/30/07    1.36
> n1.Index    05/07/07    1.36
> n1.Index     05/14/07    1.36
> n1.Index     05/22/07    1.36
> n2.Index      04/02/07    1.36
> n2.Index     04/16/07    1.36
> n2.Index     04/16/07    1.36
> n2.Index     04/30/07    1.40
> n2.Index     05/07/07    1.40
> n2.Index     05/14/07    1.40
> n2.Index     05/22/07    1.40
> n3.Index     04/02/07    1.33
> n3.Index     04/16/07    1.33
> n3.Index     04/16/07    1.33
> n3.Index     04/30/07    1.37
>
> I have tried the function stack, but it uses only one argument. Then I
> have tested the melt function from the package reshape, but it
> seems not to be reproducing the correct organization of the data, as
> it takes date as the id values.
> PS: the n1 index names are not ordered in the original database, so
> I cannot fill in the NA with the names using a recursive formula.
Hi Francesca,
Oddly enough, I answered a similar question a few days ago. The function 
below turns one or more columns in a data frame into two columns, one a 
factor that defaults to the name(s) of the columns and the other the 
data that was in that column. It also "stretches" the remaining columns
in the data frame to the same number of rows and sticks the two
together. It doesn't do exactly what you show above, but it might be 
good enough. A bit of coding could get the factor levels the way you want.

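# Stack the columns given in 'to.stretch' into two columns: one holding
# the source column names, the other the values. The remaining columns
# are repeated once per stacked column so that the row counts match.
stretch.var<-function(data,to.stretch,
  stretch.names=c("newvar","scores")) {

  datadim<-dim(data)
  # columns that are kept and repeated
  to.rep<-which(!(seq_len(datadim[2]) %in% to.stretch))
  nrep<-length(to.rep)
  newDF<-data.frame(rep(data[[to.rep[1]]],length(to.stretch)))
  if(nrep > 1) {
   for(repvar in 2:nrep)
    newDF[[repvar]]<-rep(data[[to.rep[repvar]]],length(to.stretch))
  }
  # names(data)[to.stretch] also works when only one column is stretched
  newDF<-cbind(newDF,rep(names(data)[to.stretch],each=datadim[1]),
   unlist(data[to.stretch]))
  names(newDF)<-c(names(data)[to.rep],stretch.names)
  rownames(newDF)<-NULL
  return(newDF)
}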
# read in the data
fp<-read.table("fp.dat",header=TRUE)
# pass only the columns that you want in the result
stretch.var(fp[,c(2,3,6,9)],2:4,c("n1.index","PX_LAST"))

        Date  n1.index PX_LAST
1  04/02/07   PX_LAST    1.34
2  04/09/07   PX_LAST    1.34
3  04/16/07   PX_LAST    1.34
4  04/30/07   PX_LAST    1.36
5  05/07/07   PX_LAST    1.36
6  05/14/07   PX_LAST    1.36
7  05/22/07   PX_LAST    1.36
8  04/02/07 PX_LAST.1    1.36
9  04/09/07 PX_LAST.1    1.36
10 04/16/07 PX_LAST.1    1.36
11 04/30/07 PX_LAST.1    1.40
12 05/07/07 PX_LAST.1    1.40
13 05/14/07 PX_LAST.1    1.40
14 05/22/07 PX_LAST.1    1.40
15 04/02/07 PX_LAST.2    1.33
16 04/09/07 PX_LAST.2    1.33
17 04/16/07 PX_LAST.2    1.33
18 04/30/07 PX_LAST.2    1.37
19 05/07/07 PX_LAST.2    1.37
20 05/14/07 PX_LAST.2    1.37
21 05/22/07 PX_LAST.2    1.37

Jim
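
For readers who want the block labels to be the *.Index column names rather than the PX_LAST names, here is a minimal base R sketch. It is not Jim's function; it assumes the columns always come in (Index, Date, PX_LAST) triplets in that order, and the names fp, starts, pieces and stacked are only illustrative. The toy data uses the first three rows from the post.

```r
# Toy version of the wide data (first three rows, values from the post)
dates <- c("04/02/07", "04/09/07", "04/16/07")
fp <- data.frame(
  n1.Index = NA, Date   = dates, PX_LAST   = 1.34,
  n2.Index = NA, Date.1 = dates, PX_LAST.1 = 1.36,
  n3.Index = NA, Date.2 = dates, PX_LAST.2 = 1.33,
  stringsAsFactors = FALSE
)

# Assumption: columns are laid out as (Index, Date, PX_LAST) triplets
starts <- seq(1, ncol(fp), by = 3)

pieces <- lapply(starts, function(i) {
  data.frame(
    Index   = names(fp)[i],  # label the block with its Index column name
    Date    = fp[[i + 1]],
    PX_LAST = fp[[i + 2]],
    stringsAsFactors = FALSE
  )
})

# Bind the per-block data frames on top of each other
stacked <- do.call(rbind, pieces)
rownames(stacked) <- NULL
stacked
```

Because the label is taken from the column name itself, the order of the Index names in the original data does not matter.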

David Winsemius

2011-Jul-27 12:24 UTC

On Jul 27, 2011, at 4:28 AM, Francesca wrote:
> Dear Contributors,
> thanks for collaboration.
> I am trying to reorganize data frame, that looks like this:
>
>     n1.Index   Date        PX_LAST    n2.Index   Date.1     PX_LAST.1
> n3.Index       Date.2             PX_LAST.2
> 1     NA        04/02/07    1.34      NA              04/02/07       
> 1.36
>       NA              04/02/07      1.33
> 2     NA        04/09/07    1.34      NA              04/09/07
> 1.36           NA              04/09/07      1.33
> 3     NA         04/16/07    1.34      NA             04/16/07       
> 1.36
>      NA              04/16/07      1.33
> 4     NA         04/30/07    1.36      NA             04/30/07
> 1.40           NA              04/30/07      1.37
> 5     NA        05/07/07    1.36      NA              05/07/07
> 1.40           NA              05/07/07      1.37
> 6     NA         05/14/07    1.36      NA             05/14/07       
> 1.40
>      NA              05/14/07      1.37
> 7     NA         05/22/07    1.36      NA             05/22/07       
> 1.40
>      NA              05/22/07      1.37
>
>
> While what I would like to obtain is:
> I would like to obtain stacked data as:
>
> n1.Index       Date        PX_LAST
> n1.Index    04/02/07    1.34
> n1.Index    04/09/07    1.34
> n1.Index     04/16/07    1.34
> n1.Index     04/30/07    1.36
> n1.Index    05/07/07    1.36
> n1.Index     05/14/07    1.36
> n1.Index     05/22/07    1.36
> n2.Index      04/02/07    1.36
> n2.Index     04/16/07    1.36
> n2.Index     04/16/07    1.36
> n2.Index     04/30/07    1.40
> n2.Index     05/07/07    1.40
> n2.Index     05/14/07    1.40
> n2.Index     05/22/07    1.40
> n3.Index     04/02/07    1.33
> n3.Index     04/16/07    1.33
> n3.Index     04/16/07    1.33
> n3.Index     04/30/07    1.37
>
> I have tried the function stack, but it uses only one argument. Then I
> have tested the melt function from the package reshape, but it
> seems not to be reproducing the correct organization of the data, as
> it takes date as the id values.
> PS: the n1 index names are not ordered in the original database, so
> I cannot fill in the NA with the names using a recursive formula.
> Thank you for any help you can provide.
(Only on the last point, since you have already been offered a
solution ...) You should read more R-help questions and answers. A
thread yesterday offered three different ways to replace the values
of those *.Index columns with their names:

[R] Recoding Multiple Variables in a Data Frame in One Step

Ehlers liked Dunlap's solution, but I thought those two were equally  
clever. Mine was clearly not the best.
David Winsemius, MD
West Hartford, CT
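
The solutions from the thread David mentions are not reproduced here. As a rough illustration of the idea only, the sketch below fills every column whose name ends in ".Index" with its own name. It assumes that naming pattern holds for all the columns of interest; fp and idx.names are illustrative names, and the two-row data frame is a cut-down version of the posted data.

```r
# Cut-down version of the posted data: all-NA Index columns
fp <- data.frame(
  n1.Index = NA, Date   = c("04/02/07", "04/09/07"), PX_LAST   = c(1.34, 1.34),
  n2.Index = NA, Date.1 = c("04/02/07", "04/09/07"), PX_LAST.1 = c(1.36, 1.36),
  stringsAsFactors = FALSE
)

# Assumption: the columns to fill are exactly those ending in ".Index"
idx.names <- grep("\\.Index$", names(fp), value = TRUE)

# Replace each such column with its own name, recycled to every row
for (nm in idx.names) {
  fp[[nm]] <- nm
}
fp
```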

Francesca

2011-Aug-01 15:13 UTC

Dear Contributors
thanks for any help you can provide. I searched the threads
but I could not find any query that satisfied my needs.
This is my database:
              index                time      values
13732  27965  DATA.Q211.SUM.Index  04/08/11  1.42
13733  27974  DATA.Q211.SUM.Index  05/10/11  1.45
13734  27984  DATA.Q211.SUM.Index  06/01/11  1.22
13746  28615  DATA.Q211.TDS.Index  04/07/11  1.35
13747  28624  DATA.Q211.TDS.Index  05/20/11  1.40
13754  29262  DATA.Q211.UBS.Index  05/02/11  1.30
13755  29272  DATA.Q211.UBS.Index  05/03/11  1.48
13761  29915  DATA.Q211.UCM.Index  04/28/11  1.43
13768  30565  DATA.Q211.VDE.Index  05/02/11  1.48
13775  31215  DATA.Q211.WF.Index   04/14/11  1.44
13776  31225  DATA.Q211.WF.Index   05/12/11  1.42
13789  31865  DATA.Q211.WPC.Index  04/01/11  1.40
13790  31875  DATA.Q211.WPC.Index  04/08/11  1.42
13791  31883  DATA.Q211.WPC.Index  05/10/11  1.43
13804  32515  DATA.Q211.XTB.Index  04/29/11  1.50
13805  32525  DATA.Q211.XTB.Index  05/30/11  1.40
13806  32532  DATA.Q211.XTB.Index  06/28/11  1.43

I need to select only the rows of this database that correspond to each
of the first occurrences of the string represented in column
index. In the example shown I would like to obtain a new
data.frame which is

              index                time      values
13732  27965  DATA.Q211.SUM.Index  04/08/11  1.42
13746  28615  DATA.Q211.TDS.Index  04/07/11  1.35
13754  29262  DATA.Q211.UBS.Index  05/02/11  1.30
13761  29915  DATA.Q211.UCM.Index  04/28/11  1.43
13768  30565  DATA.Q211.VDE.Index  05/02/11  1.48
13775  31215  DATA.Q211.WF.Index   04/14/11  1.44
13789  31865  DATA.Q211.WPC.Index  04/01/11  1.40
13804  32515  DATA.Q211.XTB.Index  04/29/11  1.50

As you can see, it is not the whole string that changes, but only a
substring within it. I want to keep only the first row in which each
distinct substring appears.
I know how to select rows according to a substring condition on the
index column, but I cannot use that here because the substring changes
and the number of occurrences per substring varies.

Thank you for any help you can provide.
Francesca
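
No reply to this follow-up appears in the thread. One possible approach, offered only as a sketch, is base R's duplicated(), which flags every repeat of a value after its first occurrence. The sketch assumes the rows are already in the order in which "first" should be judged; db, key and first.rows are illustrative names, and the data is a six-row subset of the table above.

```r
# Six-row subset of the posted table
db <- data.frame(
  index  = c("DATA.Q211.SUM.Index", "DATA.Q211.SUM.Index",
             "DATA.Q211.SUM.Index", "DATA.Q211.TDS.Index",
             "DATA.Q211.TDS.Index", "DATA.Q211.UBS.Index"),
  time   = c("04/08/11", "05/10/11", "06/01/11",
             "04/07/11", "05/20/11", "05/02/11"),
  values = c(1.42, 1.45, 1.22, 1.35, 1.40, 1.30),
  stringsAsFactors = FALSE
)

# Optional: pull out the varying part of the string (SUM, TDS, ...).
# Assumption: it is the segment just before the trailing ".Index".
key <- sub("^.*\\.([^.]+)\\.Index$", "\\1", db$index)

# Keep only the first row seen for each distinct key
first.rows <- db[!duplicated(key), ]
first.rows
```

If the whole index string identifies the group, as it does in the example, db[!duplicated(db$index), ] gives the same rows without the extraction step.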
