Kristiina Hurme
2012-Jun-30 21:21 UTC
[R] How to adjust the start of a series to zero? (i.e. subtract the first value from the sequence)
Hello,

I have a time series in which I am plotting the means and sd of a distance for a variety of positions along a bird's bill. I'd like to set each line (represented by "point") to start at zero, so that I can look at the absolute change along the series. At the moment I only know how to do that in Excel, by subtracting the value of time 1, point 1 from all other times for point 1. My actual data set has many points (20 per bird, only 3 shown here), so I would love to make this faster in R. Ideally, I would have another column titled "adj_mean" for the adjusted means.

Here is an example.

> sort2v4
    point time      mean        sd
1       1    1 52.501000 1.5073927
3       1    2 54.501818 0.8510329
4       1    3 56.601739 1.5787222
5       1    4 57.200000 1.2292726
6       1    5 59.300000 2.2632327
7       1    6 57.800893 1.4745218
8       1    7 55.303508 2.2661855
9       1    8 51.100943 1.8540025
10      1    9 50.600000 1.7126977
2       1   10 52.904716 1.1010460
111     2    1 50.605963 1.2633969
113     2    2 52.203828 0.7890765
114     2    3 54.100909 1.1013344
115     2    4 55.000000 1.1547005
116     2    5 57.001725 1.6341500
117     2    6 55.003591 1.5652438
118     2    7 52.911089 1.7373914
119     2    8 49.204022 1.0350809
120     2    9 48.904103 0.8747568
112     2   10 50.915700 0.8765483
131     3    1 48.608228 0.8433913
133     3    2 49.307101 0.4827703
134     3    3 51.310824 0.9424023
135     3    4 52.413350 0.6997860
136     3    5 54.116723 1.1927297
137     3    6 52.618161 1.1686288
138     3    7 49.822764 1.6303473
139     3    8 47.107336 1.2013356
140     3    9 47.104214 1.1986148
132     3   10 48.719484 0.6765047

and I would like it to look like this (which I did in Excel). Each point's series (times 1-10) starts with an adj_mean of 0.

> sort2v4
    point time      mean        sd  adj_mean
1       1    1    52.501 1.5073927         0
3       1    2 54.501818 0.8510329  2.000818
4       1    3 56.601739 1.5787222  4.100739
5       1    4      57.2 1.2292726     4.699
6       1    5      59.3 2.2632327     6.799
7       1    6 57.800893 1.4745218  5.299893
8       1    7 55.303508 2.2661855  2.802508
9       1    8 51.100943 1.8540025 -1.400057
10      1    9      50.6 1.7126977    -1.901
2       1   10 52.904716  1.101046  0.403716
111     2    1 50.605963 1.2633969         0
113     2    2 52.203828 0.7890765  1.597865
114     2    3 54.100909 1.1013344  3.494946
115     2    4        55 1.1547005  4.394037
116     2    5 57.001725   1.63415  6.395762
117     2    6 55.003591 1.5652438  4.397628
118     2    7 52.911089 1.7373914  2.305126
119     2    8 49.204022 1.0350809 -1.401941
120     2    9 48.904103 0.8747568  -1.70186
112     2   10   50.9157 0.8765483  0.309737
131     3    1 48.608228 0.8433913         0
133     3    2 49.307101 0.4827703  0.698873
134     3    3 51.310824 0.9424023  2.702596
135     3    4  52.41335  0.699786  3.805122
136     3    5 54.116723 1.1927297  5.508495
137     3    6 52.618161 1.1686288  4.009933
138     3    7 49.822764 1.6303473  1.214536
139     3    8 47.107336 1.2013356 -1.500892
140     3    9 47.104214 1.1986148 -1.504014
132     3   10 48.719484 0.6765047  0.111256

Thank you so much for your help.
Kristiina

--
View this message in context: http://r.789695.n4.nabble.com/How-to-adjust-the-start-of-a-series-to-zero-i-e-subtract-the-first-value-from-the-sequence-tp4634999.html
Sent from the R help mailing list archive at Nabble.com.
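For a single point, the adjustment described in the question comes down to one vectorised subtraction. A minimal sketch, using the first three means for point 1 from the posted table; the vector names mean_point1 and adj_mean_point1 are made up for illustration and do not appear in the thread.

```r
# First three means for point 1, copied from the posted data
mean_point1 <- c(52.501000, 54.501818, 56.601739)

# Subtract the first (time 1) value from every element of the series;
# R recycles the single value across the whole vector
adj_mean_point1 <- mean_point1 - mean_point1[1]

print(adj_mean_point1)
```

The remaining work is to repeat this subtraction separately within each value of "point", which is what the replies address.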
Phil Spector
2012-Jun-30 22:31 UTC
[R] How to adjust the start of a series to zero? (i.e. subtract the first value from the sequence)
Kristiina -

If the data will always be sorted so that the first time for a point appears first in the data frame, you can use:

  sort2v4$adj_mean = sort2v4$mean - ave(sort2v4$mean,sort2v4$point,FUN=function(x)x[1])

Otherwise, something like this should work:

  firstmeans = subset(sort2v4,time==1,select=c(point,mean))
  names(firstmeans)[2] = 'adj'
  sort2v4 = merge(sort2v4,firstmeans)
  sort2v4$adj_mean = with(sort2v4,mean-adj)
  sort2v4$adj = NULL

In the future, you may want to learn about the dput function, which makes it a little easier for others to reproduce your data.

- Phil Spector
  Statistical Computing Facility
  Department of Statistics
  UC Berkeley
  spector at stat.berkeley.edu

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
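The ave() approach relies on the time-1 row coming first within each point. A minimal sketch of one way to guarantee that, assuming it is acceptable to reorder the rows: sort by point and time before calling ave(). The data frame name dat and its six rows (a deliberately shuffled subset of the posted data) are chosen for illustration only. The last line shows what dput() produces for such an object.

```r
# Subset of the posted data (points 1 and 2, times 1-3), deliberately unsorted
dat <- data.frame(
  point = c(1, 1, 1, 2, 2, 2),
  time  = c(2, 1, 3, 1, 2, 3),
  mean  = c(54.501818, 52.501000, 56.601739, 50.605963, 52.203828, 54.100909)
)

# Order by point, then time, so that x[1] within each group is the time-1 value
dat <- dat[order(dat$point, dat$time), ]

# ave() returns a vector as long as its input, with each group's
# first value repeated across that group's rows
dat$adj_mean <- dat$mean - ave(dat$mean, dat$point, FUN = function(x) x[1])

print(dat)

# dput() writes an R expression that recreates the object,
# which can be pasted into a message so others can rebuild the data
dput(dat)
```

Sorting first keeps the one-line ave() call while removing its dependence on the incoming row order; the merge() variant above avoids the sort but returns the rows in merge()'s own order.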
Rui Barradas
2012-Jun-30 22:41 UTC
[R] How to adjust the start of a series to zero? (i.e. subtract the first value from the sequence)
Hello,

Try, where 'dat' is your dataset,

  dd <- lapply(split(dat, dat$point), function(x) x$mean - x$mean[1])
  dat$adj_mean <- NA
  for(i in names(dd)) dat$adj_mean[dat$point == i] <- dd[[i]]
  rm(dd) # clean-up

Now 'dat' has one extra column, with the adjusted mean values.

Hope this helps,

Rui Barradas
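The split-and-loop idea can also be written without the explicit loop. A minimal sketch, assuming base R's unsplit() is used to put the per-point results back in the original row order; the four-row data frame below is an interleaved subset of the posted data, made up to show that row order is preserved. Like the loop version, it treats the first row encountered for each point as the baseline.

```r
# Interleaved subset of the posted data: rows alternate between points 1 and 2
dat <- data.frame(
  point = c(1, 2, 1, 2),
  time  = c(1, 1, 2, 2),
  mean  = c(52.501000, 50.605963, 54.501818, 52.203828)
)

# Split the means by point and subtract each group's first value
adj_by_point <- lapply(split(dat$mean, dat$point), function(x) x - x[1])

# unsplit() reverses split(): values return to the rows they came from
dat$adj_mean <- unsplit(adj_by_point, dat$point)

print(dat)
```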
arun
2012-Jun-30 23:16 UTC
[R] How to adjust the start of a series to zero? (i.e. subtract the first value from the sequence)
Hi,

Try this:

  #dat1: data
  dat2<-split(dat1,dat1$point)
  adjmeanlist<-lapply(dat2,function(x)x[,3]-x[,3][1])
  dat3<-data.frame(dat1,adjmean=unlist(adjmeanlist))
  head(dat3)
    point time     mean        sd  adjmean
  1     1    1 52.50100 1.5073927 0.000000
  3     1    2 54.50182 0.8510329 2.000818
  4     1    3 56.60174 1.5787222 4.100739
  5     1    4 57.20000 1.2292726 4.699000
  6     1    5 59.30000 2.2632327 6.799000
  7     1    6 57.80089 1.4745218 5.299893

A.K.
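Binding unlist() output back onto the data frame lines up only when the rows are already grouped by point in the order split() returns the groups, as they are in the posted data. A minimal sketch of an order-independent alternative, assuming every point has exactly one row with time equal to 1: look the baseline up with match(). The names first and baseline, and the shuffled four-row data frame, are introduced here for illustration.

```r
# Shuffled subset of the posted data: point 2 appears before point 1
dat1 <- data.frame(
  point = c(2, 1, 2, 1),
  time  = c(1, 1, 2, 2),
  mean  = c(50.605963, 52.501000, 52.203828, 54.501818)
)

# Baseline table: the mean recorded at time 1 for each point
first <- dat1[dat1$time == 1, c("point", "mean")]

# match() gives, for every row, the position of its point in the baseline table
baseline <- first$mean[match(dat1$point, first$point)]

# Subtract each row's own baseline; row order is left untouched
dat1$adjmean <- dat1$mean - baseline

print(dat1)
```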