michael.hopgood
2011-Jan-17 20:51 UTC
[R] Extraction and replacement of data in a data frame
Dear R family,

I am a relative newbie and have been dabbling with R for a little while. Simple things really, but my employers are beginning to see the benefits of using R instead of Excel. We have a remote monitoring station measuring groundwater levels. We download the data as a .csv file and, up until now, we have been using Excel to analyse it. It's been a hassle trying to wrestle with that damn program, as my boss wants to do things that Excel was never meant to do, so I've convinced my boss to give R a chance. It's been a steep learning curve, but I'm fairly confident I can reduce the amount of labour involved in producing and improving the graphs we show our clients.

The groundwater levels are measured by pressure sensors lowered into the monitoring wells. After a certain time, the sensors were lowered further into the well, thus creating a disparity in the measurements.

The data frame I import into R looks something like this:

    Date      Waterhead (mm)
    10-01-01  100
    10-01-02  105
    10-01-03  101
    10-01-04   99
    10-01-05   85
    10-01-06  200
    10-01-07  199
    10-01-08  195
    10-01-09  185
    10-01-10  170

For example, on 10-01-06 the sensor was lowered by 115 mm. When I download the csv file, I download the data from the beginning of the measurement period. I then need to adjust the height by 115 mm to account for the lowering of the sensor. My question to you is: how do I do that in R?

I am after a formula or a manipulation that selects the first five measurements and adds a fixed amount. The adjustment has to be applied every time I download the csv file and import it into R, so that what I display is based on the following data frame:

    Date      Waterhead (mm)
    10-01-01  215
    10-01-02  220
    10-01-03  216
    10-01-04  214
    10-01-05  200
    10-01-06  200
    10-01-07  199
    10-01-08  195
    10-01-09  185
    10-01-10  170

In short, I want to select a fixed number of rows of a column from my data frame, add a constant to these, and insert the new values into their respective rows without affecting the subsequent rows. I hope I have produced a reproducible example. I have been searching high and low for a solution, but have come up against a brick wall. I feel I have read something that tackles this some time in the past, but can't find it again.

Thanks in advance!
Mike Marchywka
2011-Jan-17 22:56 UTC
[R] Extraction and replacement of data in a data frame
See if this helps. I'm still learning how to write good R, but this seems to work. Just as a personal preference, I first converted your data to csv:

    cat xxx.txt | awk '{print "20"$1","$2}' > xxx.csv

I've never used POSIX dates before and am just copying what I've seen here, but it seemed to work as shown below:

    x <- read.table("xxx.csv", sep = ",")    # columns come in as V1 (date) and V2 (value)
    str(x)
    x$V1 <- as.POSIXct(x$V1)                 # convert the date strings to POSIXct
    str(x)
    y <- (x$V1 > as.POSIXct("2010-01-05"))   # TRUE for rows dated after 2010-01-05
    y
    x$V2[y] <- x$V2[y] + 10000               # add a constant to the selected rows only
    x

The output ends like this:

    5  2010-01-05   200
    6  2010-01-06 10200
    7  2010-01-07 10199
    8  2010-01-08 10195
    9  2010-01-09 10185
    10 2010-01-10 10170
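The logical-mask idea above can also be pointed at the adjustment actually asked for, that is, adding 115 mm to the readings taken before the sensor was lowered. The following is only a minimal sketch: the data frame name `gw`, the column name `Waterhead`, and the four-digit year 2010 are assumptions made for illustration, not names from the original post.

```r
# Sample data from the original post; the year 2010 is an assumption.
gw <- data.frame(
  Date      = seq(as.Date("2010-01-01"), by = "day", length.out = 10),
  Waterhead = c(100, 105, 101, 99, 85, 200, 199, 195, 185, 170)
)

# TRUE for readings taken before the sensor was lowered on 2010-01-06.
before <- gw$Date < as.Date("2010-01-06")

# Add the 115 mm offset to those rows only; later rows are left untouched.
gw$Waterhead[before] <- gw$Waterhead[before] + 115

gw
```

Because the mask is built from the date rather than from row numbers, the same lines should keep working as each new download adds rows at the end.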
Dennis Murphy
2011-Jan-18 01:28 UTC
[R] Extraction and replacement of data in a data frame
Hi:

Try this little utility function to see if it meets your needs. It takes a data frame, an adjustment date and an adjustment amount as parameters; the new variable (whadj) is for testing.

    headAdj <- function(df, day, amt) {
        # Check for a four- or two-digit year and format accordingly
        u <- unlist(strsplit(day, '-'))[1]
        if (nchar(u) == 4L) {
            day <- as.Date(day, format = '%Y-%m-%d')
        } else if (nchar(u) == 2L) {
            day <- as.Date(day, format = '%y-%m-%d')
        }
        # Make the adjustment and return the modified data frame
        df$whadj <- df$Waterhead + (df$Date <= day) * amt
        df
    }

    > headAdj(df, '2010-01-05', 115)
             Date Waterhead whadj
    1  2010-01-01       100   215
    2  2010-01-02       105   220
    3  2010-01-03       101   216
    4  2010-01-04        99   214
    5  2010-01-05        85   200
    6  2010-01-06       200   200
    7  2010-01-07       199   199
    8  2010-01-08       195   195
    9  2010-01-09       185   185
    10 2010-01-10       170   170
    > headAdj(df, '10-01-05', 115)
    <ditto>

HTH,
Dennis
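The function above expects a data frame whose `Date` column is already of class Date and whose value column is named `Waterhead`. One possible way to get there from the two-digit-year layout in the original post is sketched below. The inline text stands in for the real csv file, and the shortened header `Waterhead` is an assumption, since a header such as "Waterhead (mm)" would be altered on import.

```r
# Raw text in the layout of the original post (two-digit years).
raw <- "Date Waterhead
10-01-01 100
10-01-02 105
10-01-03 101
10-01-04 99
10-01-05 85
10-01-06 200
10-01-07 199
10-01-08 195
10-01-09 185
10-01-10 170"

df <- read.table(text = raw, header = TRUE, stringsAsFactors = FALSE)

# %y reads a two-digit year, so "10" becomes 2010.
df$Date <- as.Date(df$Date, format = "%y-%m-%d")
str(df)

# Same arithmetic as inside headAdj(): the logical vector is coerced to 0/1,
# so only rows on or before the cut-off date receive the offset.
df$whadj <- df$Waterhead + (df$Date <= as.Date("2010-01-05")) * 115
df
```

Keeping the adjusted values in a separate column, as the function does, leaves the raw readings available for checking.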
Gabor Grothendieck
2011-Jan-18 02:31 UTC
[R] Extraction and replacement of data in a data frame
Try this using the builtin data frame, BOD:

    > BOD
      Time demand
    1    1    8.3
    2    2   10.3
    3    3   19.0
    4    4   16.0
    5    5   15.6
    6    7   19.8
    >
    > # add 100 to the first two rows in column 2
    > BOD[1:2, 2] <- BOD[1:2, 2] + 100
    > BOD
      Time demand
    1    1  108.3
    2    2  110.3
    3    3   19.0
    4    4   16.0
    5    5   15.6
    6    7   19.8

--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
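Applied to the groundwater example, the same row-index replacement might look like the sketch below. The names `gw`, `change_day` and `n_before` are hypothetical, the year 2010 is assumed, and the rows are assumed to be sorted by date so that the readings to adjust are the first ones in the data frame.

```r
# Sample data from the original post; the year 2010 is an assumption.
gw <- data.frame(
  Date      = seq(as.Date("2010-01-01"), by = "day", length.out = 10),
  Waterhead = c(100, 105, 101, 99, 85, 200, 199, 195, 185, 170)
)

# Day on which the sensor was lowered.
change_day <- as.Date("2010-01-06")

# Count the rows recorded before that day instead of hard-coding 1:5.
n_before <- sum(gw$Date < change_day)
rows <- seq_len(n_before)

# Replace the selected rows of one column in place, as in the BOD example.
gw[rows, "Waterhead"] <- gw[rows, "Waterhead"] + 115
gw
```

Using `seq_len()` rather than `1:n_before` avoids selecting rows by accident when no readings fall before the change day.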
michael.hopgood
2011-Jan-21 12:03 UTC
[R] Extraction and replacement of data in a data frame
Dear all,

Thank you for the prompt responses. It is not until today that I have managed to scrape together the time to develop my R project further. In my free time, I have been reading various intro manuals, so I have a rough idea of what needs doing. Sometimes, though, putting it into practice is more troublesome than it looks. It is fascinating how pliable this programming language is. I will report on my progress as soon as I can.

Sincerely,
Michael Hopgood