I think I must be missing something obvious, but I'm having trouble getting a data transformation to work on groupings of data within a data frame (csss3) as defined by 2 factors (population, locid). The data are sorted by year within locid within population, and I want to lag another variable (dbc), i.e., shift its values down by 1 row, replacing the first row with NA, within groups defined by locid nested within population. I thought I could do something using by(csss3, list(locid, population), function) but don't seem to be having any success. Any suggestions?
Brian
Brian S. Cade
U. S. Geological Survey
Fort Collins Science Center
2150 Centre Ave., Bldg. C
Fort Collins, CO  80526-8818
email:  brian_cade@usgs.gov
tel:  970 226-9326
Fair enough.  To clarify what I'm trying to achieve, I've pasted below a 
small piece of the larger data frame, showing the hierarchical structure of 
the factors POPULATION and LOCID, the ascending order of YEAR, and the 
variable DBC.  I would like to create another variable that is the lag of 
the previous year's DBC (call it LAG1DBC) within LOCID within POPULATION.  
The desired outcome is shown in the second example data set, pasted below 
the first.  This setup is for doing some first-order autoregressive 
analyses (not using the time series library).  The examples I've tried 
using by() only seem to work for outputting results, not for creating new 
variables in an existing data frame.  I suspect that people do similar 
types of hierarchical subgroup data manipulations all the time in R (I know 
how to do these easily in SYSTAT), so I'm sure I'm missing some obvious, 
simple trick.  My search of the R mailing list archives and various other R 
documentation has not yielded any solutions yet.  Suggestions are 
graciously welcomed.
       LOCID  POPULATION  YEAR        DBC
1      algb-1           A 1992 0.70451575
2      algb-1           A 1993 0.59506851
3      algb-1           A 1997 0.84837544
4      algb-1           A 1998 0.50283182
5      algb-1           A 2000 0.91242707
6      algb-2           A 1992 0.09747155
7      algb-2           A 1993 0.84772253
8      algb-2           A 1997 0.43974081
9      algb-2           A 1998 0.83108544
10     algb-2           A 2000 0.22291192
11     algb-3           A 1992 0.44234175
12     algb-3           A 1993 0.54089534
5680 taylr-73           B 2001 0.43918082
5681 taylr-73           B 2002 0.34694427
5682 taylr-73           B 2003 3.35619190
5683 taylr-73           B 2004 0.71575815
5684 taylr-73           B 2005 0.42038506
5685 taylr-74           B 1992 3.88410354
5686 taylr-74           B 1993 3.32472557
5687 taylr-74           B 1994 3.29861501
5688 taylr-74           B 1996 0.48153827
5689 taylr-74           B 1997 3.63570636
5690 taylr-74           B 1998 1.94630194
       LOCID  POPULATION  YEAR        DBC    LAG1DBC
1      algb-1           A 1992 0.70451575         NA
2      algb-1           A 1993 0.59506851 0.70451575
3      algb-1           A 1997 0.84837544 0.59506851
4      algb-1           A 1998 0.50283182 0.84837544
5      algb-1           A 2000 0.91242707 0.50283182
6      algb-2           A 1992 0.09747155         NA
7      algb-2           A 1993 0.84772253 0.09747155
8      algb-2           A 1997 0.43974081 0.84772253
9      algb-2           A 1998 0.83108544 0.43974081
10     algb-2           A 2000 0.22291192 0.83108544
11     algb-3           A 1992 0.44234175         NA
12     algb-3           A 1993 0.54089534 0.44234175
5680 taylr-73           B 2001 0.43918082         NA
5681 taylr-73           B 2002 0.34694427 0.43918082
5682 taylr-73           B 2003 3.35619190 0.34694427
5683 taylr-73           B 2004 0.71575815 3.35619190
5684 taylr-73           B 2005 0.42038506 0.71575815
5685 taylr-74           B 1992 3.88410354         NA
5686 taylr-74           B 1993 3.32472557 3.88410354
5687 taylr-74           B 1994 3.29861501 3.32472557
5688 taylr-74           B 1996 0.48153827 3.29861501
5689 taylr-74           B 1997 3.63570636 0.48153827
5690 taylr-74           B 1998 1.94630194 3.63570636
Brian
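by() can create a new variable if the function it applies returns the modified subset and the pieces are then bound back together. The following is a minimal sketch under that assumption, not code from the thread: the toy data frame dat reuses a few rows from the example above, and add_lag is a hypothetical helper name.

```r
# Toy data: a few rows taken from the example above
dat <- data.frame(
  LOCID      = c("algb-1", "algb-1", "algb-1", "algb-2", "algb-2",
                 "taylr-73", "taylr-73"),
  POPULATION = c("A", "A", "A", "A", "A", "B", "B"),
  YEAR       = c(1992, 1993, 1997, 1992, 1993, 2001, 2002),
  DBC        = c(0.70451575, 0.59506851, 0.84837544, 0.09747155,
                 0.84772253, 0.43918082, 0.34694427)
)

# Hypothetical helper: takes the rows of one group and returns them
# with DBC shifted down one row (NA in the first row)
add_lag <- function(d) {
  d$LAG1DBC <- c(NA, d$DBC)[seq_len(nrow(d))]
  d
}

# by() returns one data frame per LOCID x POPULATION combination
# (NULL for combinations that do not occur in the data)
pieces <- by(dat, list(dat$LOCID, dat$POPULATION), add_lag)

# Bind the pieces back into one data frame and restore the sort order
out <- do.call(rbind, pieces)
out <- out[order(out$POPULATION, out$LOCID, out$YEAR), ]
print(out)
```

The lag here is by row position within each group, so the rows must already be sorted by YEAR within LOCID, as they are in the posted data.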
From: Florence Combes <fcombes@gmail.com>
Date: 10/13/2005 05:34 AM
To: Brian S Cade <brian_cade@usgs.gov>
Subject: Re: [R] subsetting with by() or other function??
Maybe an example of the data you have and the data you want would 
help the people on the list understand the problem, and so be able to 
help you? 
Best regards, 
Florence. 
______________________________________________ 
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! 
http://www.R-project.org/posting-guide.html
Dimitris:  Thank you for the suggestion, but I get an error, just as I did 
with similar commands using by().  The error given is:
Error in "$<-.data.frame"(`*tmp*`, "LAGDBC", value = 
tapply(csss3lagm81$DBC,  : 
        replacement has 1089 rows, data has 8314
So I'm not sure what the problem is.  Why does the replacement value only 
have 1089 rows instead of 8314 like the full data frame?
Brian
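A likely explanation for the row mismatch: when the function passed to tapply() returns a vector rather than a single value, tapply() returns a list with one element per group, so its length is the number of groups (presumably 1089 LOCID levels here) rather than the number of rows (8314). A small sketch with made-up data illustrates this and shows one way to map the per-group results back onto the rows with split() and unsplit(); the data frame and the name lag1 are illustrative assumptions.

```r
# Made-up data: two groups, five rows
dat <- data.frame(
  LOCID = factor(c("a", "a", "a", "b", "b")),
  DBC   = c(1, 2, 3, 10, 20)
)

# The lag function from the suggested tapply() call
lag1 <- function(x) c(NA, x[-length(x)])

# tapply() gives one list element per group, not one value per row
res <- tapply(dat$DBC, dat$LOCID, lag1)
length(res)   # number of groups
nrow(dat)     # number of rows

# split() / lapply() / unsplit() puts the per-group results back
# into the original row positions
lagged <- lapply(split(dat$DBC, dat$LOCID), lag1)
dat$LAG1DBC <- unsplit(lagged, dat$LOCID)
print(dat)
```

unsplit() uses the grouping factor to place each value in its original row, so it does not depend on the groups being contiguous in the data frame.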
From: "Dimitris Rizopoulos" <dimitris.rizopoulos@med.kuleuven.be>
Date: 10/13/2005 10:04 AM
To: "Brian S Cade" <brian_cade@usgs.gov>
Subject: Re: [R] subsetting with by() or other function??
I think this should be something like:
dat$LAG1DBC <- tapply(dat$DBC, dat$LOCID, function(x) c(NA, x[-length(x)]))
I hope it helps.
Best,
Dimitris
----
Dimitris Rizopoulos
Ph.D. Student
Biostatistical Centre
School of Public Health
Catholic University of Leuven
Address: Kapucijnenvoer 35, Leuven, Belgium
Tel: +32/(0)16/336899
Fax: +32/(0)16/337015
Web: http://www.med.kuleuven.be/biostat/
     http://www.student.kuleuven.be/~m0390867/dimitris.htm
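A close relative of the tapply() idea is ave(), which applies a function within groups and returns a vector of the same length as its input, so the result can be assigned directly as a new column. The following is a minimal sketch, not the posters' code: the toy data frame reuses a few rows from the example in this thread, and lag1 is a hypothetical helper name.

```r
# Toy data: a few rows taken from the example in this thread
dat <- data.frame(
  LOCID      = c("algb-1", "algb-1", "algb-1", "algb-2", "algb-2",
                 "taylr-73", "taylr-73"),
  POPULATION = c("A", "A", "A", "A", "A", "B", "B"),
  YEAR       = c(1992, 1993, 1997, 1992, 1993, 2001, 2002),
  DBC        = c(0.70451575, 0.59506851, 0.84837544, 0.09747155,
                 0.84772253, 0.43918082, 0.34694427)
)

# Hypothetical helper: shift a vector down one position, padding with NA.
# Indexing with seq_along(x) keeps the output the same length as the
# input, including for empty groups.
lag1 <- function(x) c(NA, x)[seq_along(x)]

# ave() applies lag1 within each POPULATION x LOCID combination and
# returns a vector aligned with the rows of dat. FUN must be named.
dat$LAG1DBC <- ave(dat$DBC, dat$POPULATION, dat$LOCID, FUN = lag1)
print(dat)
```

As in the desired output above, this lags by row position within each group, so the 1997 row of algb-1 receives the 1993 value; the data need to be sorted by YEAR within each group beforehand.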