Hello All,

I have a data file in text format that contains two data sets, stored one after the other. Each data set starts with a header line that gives the number of data rows and the name of the data series. For example, the first data set has the header "6240 Terry Cove-Model", and the data for that series follow for up to 6240 rows. Then the next data set starts with its own header, such as "5200 Terry-Observed".

The sample data look like this:

6240 Terry Cove-Model
300 .300110459327698
300.041656494141 .289277672767639
300.083343505859 .276237487792969
300.125 .258902788162231
300.166656494141 .236579895019531
300.208343505859 .221315026283264
300.25 .214318037033081
300.291656494141 .190926909446716
300.333343505859 .158144593238831
300.375 .113302707672119
300.416656494141 .103684902191162
300.458343505859 9.72903966903687E-02
300.5 8.76948833465576E-02
300.541656494141 8.42459201812744E-02
300.583343505859 .078397274017334
300.625 8.44632387161255E-02
300.666656494141 9.32939052581787E-02
300.708343505859 .113663911819458
300.75 .123064398765564
300.791656494141 .157548069953918
300.833343505859 .148393034934998
300.875 .135645747184753
300.916656494141 .137590646743774
300.958343505859 .133154153823853
301 .131152510643005
301.041656494141 .114152908325195
301.083343505859 8.04083347320557E-02
301.125 5.53587675094604E-02
301.166656494141 3.17397117614746E-02
301.208343505859 4.07266616821289E-03
301.25 -2.15455293655396E-02
301.291656494141 -4.07489538192749E-02
301.333343505859 -5.85414171218872E-02
301.375 -7.53517150878906E-02
301.416656494141 -8.49723815917969E-02
301.458343505859 -7.91778564453125E-02
301.5 -7.02846050262451E-02
301.541656494141 -7.24701881408691E-02
301.583343505859 -7.76907205581665E-02
301.625 -6.82642459869385E-02
62401 Terry Cove-Data
300 .216407993
300.0042 .204216005
300.0083 .210311999
300.0125 .195071996
300.0167 .192023999
300.0208 .179831992
300.025 .188976001
300.0292 .185928004
300.0333 .195071996
300.0375 .219456009
300.0417 .210311999
300.0458 .204216005
300.05 .195071996
300.0542 .188976001
300.0583 .195071996
300.0625 .195071996
300.0667 .185928004
300.0708 .173735998
300.075 .170688001
300.0792 .167640004
300.0833 .167640004
300.0875 .167640004
300.0917 .167640004
300.0958 .161543991
300.1 .1524
300.1042 .158495994
300.1083 .149352003
300.1125 .158495994
300.1167 .1524
300.1208 .1524
300.125 .149352003
300.1292 .143256
300.1333 .146303997
300.1375 .149352003
300.1417 .146303997
300.1458 .137159996
300.15 .131064002
300.1542 .124967999
300.1583 .128015996
300.1625 .124967999
300.1667 .131064002
300.1708 .124967999
300.175 .124967999
300.1792 .134111999
300.1833 .118871996
300.1875 .128015996
300.1917 .131064002
300.1958 .128015996
300.2 .131064002
300.2042 .128015996
300.2083 .121920002
300.2125 .115823999
300.2167 .112776001
300.2208 .103632001
300.225 .097535998
300.2292 .103632001
300.2333 .094488001
300.2375 .082296003
300.2417 .0762
300.2458 .079247997
300.25 .067056
300.2542 .064007998
300.2583 .045720002
300.2625 .033528
300.2667 .036575999
300.2708 .036575999
300.275 .036575999
300.2792 .027432001
300.2833 .027432001
300.2875 .021336
300.2917 .012192
300.2958 .009144
300.3 .009144
300.3042 .003048
300.3083 0
300.3125 -.003048
300.3167 -.006096
300.3208 0
300.325 .006096
300.3292 -.003048
300.3333 .006096

The full data set can be downloaded from
https://www.dropbox.com/s/chhw3vz6ru1godk/Practicedata.Dat

I want to make a comparison graph of the modeled and observed series. Once I can read the two data sets, either as two separate sets or combined into one, I will be able to create the time series graph.

I also need to create a subset in which both series share common times, since one series may have more intervals than the other. Once I have the two data sets on the same interval, I want to plot a correlation graph.

I hope I have made clear what I want to do.

Thank you so much.

Best Regards,
Janesh
Janesh Devkota
2013-Jan-21 08:19 UTC
[R] How to read a file with two data sets in text format
I was able to read the data using the following code:

library(ggplot2)

jd1 <- read.table('Practicedata.dat', header=TRUE, sep="\t", nrows=6240)
jd2 <- read.table('Practicedata.dat', header=TRUE, sep="\t", skip=6241)
colnames(jd1) <- c("Date", "Mod")
colnames(jd2) <- c("Date", "Obs")
p <- ggplot(jd1, aes(x=Date, y=Mod)) + geom_line()
p <- p + geom_line(data=jd2, aes(x=Date, y=Obs), color="red")
p

Now I want to make a scatter plot of jd1$Mod against jd2$Obs, but I cannot create one because the two have different numbers of rows. Since Mod has fewer rows, I am planning to use the dates of Mod and find the corresponding values of Obs for those times. How can I find the values of Obs that correspond to the given dates in jd1? Or is there some other way to create a scatter plot that shows the regression equation and the correlation coefficient?

Thank you so much.

Best Regards,
Janesh
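One possible way to pair the two series is sketched below. It is only an illustration under stated assumptions: the small jd1 and jd2 data frames are stand-ins built from a few rows of the sample data, and linear interpolation with approx() is an assumed choice for matching times, because the model times (for example 300.041656494141) do not exactly equal the rounded observed times (300.0417), so an exact merge on Date would find few matches.

```r
# Hypothetical stand-ins for jd1 (model) and jd2 (observed), built from a few
# rows of the sample data; the real data frames would come from read.table().
jd1 <- data.frame(
  Date = c(300, 300.041656494141, 300.083343505859, 300.125),
  Mod  = c(0.300110459327698, 0.289277672767639,
           0.276237487792969, 0.258902788162231)
)
jd2 <- data.frame(
  Date = c(300, 300.0375, 300.0417, 300.0458, 300.0792,
           300.0833, 300.0875, 300.1208, 300.125, 300.1292),
  Obs  = c(0.216407993, 0.219456009, 0.210311999, 0.204216005, 0.167640004,
           0.167640004, 0.167640004, 0.1524, 0.149352003, 0.143256)
)

# Linearly interpolate Obs onto the model times. Model times outside the
# observed range come back as NA (approx() default, rule = 1).
obs_at_mod <- approx(x = jd2$Date, y = jd2$Obs, xout = jd1$Date)$y

# Keep only the times for which both series have a value.
paired <- data.frame(Date = jd1$Date, Mod = jd1$Mod, Obs = obs_at_mod)
paired <- paired[complete.cases(paired), ]

# Regression of observed on modeled values, plus the Pearson correlation.
fit   <- lm(Obs ~ Mod, data = paired)
r     <- cor(paired$Mod, paired$Obs)
label <- sprintf("Obs = %.3f + %.3f * Mod,  r = %.3f",
                 coef(fit)[1], coef(fit)[2], r)

# Draw the scatter plot only in an interactive session.
if (interactive()) {
  plot(paired$Mod, paired$Obs, xlab = "Modeled", ylab = "Observed")
  abline(fit, col = "red")
  legend("topleft", legend = label, bty = "n")
}
```

If the observed series should not be interpolated, rounding both Date columns to a common precision and using merge() might be an alternative; which is appropriate depends on the data.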
jim holtman
2013-Jan-21 13:31 UTC
[R] How to read a file with two data sets in text format
Here is one way to read the data. I modified your sample so that the line counts in the headers match the actual data:

x <- readLines(textConnection("40 Terry Cove-Model
300 .300110459327698
300.041656494141 .289277672767639
300.083343505859 .276237487792969
300.125 .258902788162231
300.166656494141 .236579895019531
300.208343505859 .221315026283264
300.25 .214318037033081
300.291656494141 .190926909446716
300.333343505859 .158144593238831
300.375 .113302707672119
300.416656494141 .103684902191162
300.458343505859 9.72903966903687E-02
300.5 8.76948833465576E-02
300.541656494141 8.42459201812744E-02
300.583343505859 .078397274017334
300.625 8.44632387161255E-02
300.666656494141 9.32939052581787E-02
300.708343505859 .113663911819458
300.75 .123064398765564
300.791656494141 .157548069953918
300.833343505859 .148393034934998
300.875 .135645747184753
300.916656494141 .137590646743774
300.958343505859 .133154153823853
301 .131152510643005
301.041656494141 .114152908325195
301.083343505859 8.04083347320557E-02
301.125 5.53587675094604E-02
301.166656494141 3.17397117614746E-02
301.208343505859 4.07266616821289E-03
301.25 -2.15455293655396E-02
301.291656494141 -4.07489538192749E-02
301.333343505859 -5.85414171218872E-02
301.375 -7.53517150878906E-02
301.416656494141 -8.49723815917969E-02
301.458343505859 -7.91778564453125E-02
301.5 -7.02846050262451E-02
301.541656494141 -7.24701881408691E-02
301.583343505859 -7.76907205581665E-02
301.625 -6.82642459869385E-02
81 Terry Cove-Data
300 .216407993
300.0042 .204216005
300.0083 .210311999
300.0125 .195071996
300.0167 .192023999
300.0208 .179831992
300.025 .188976001
300.0292 .185928004
300.0333 .195071996
300.0375 .219456009
300.0417 .210311999
300.0458 .204216005
300.05 .195071996
300.0542 .188976001
300.0583 .195071996
300.0625 .195071996
300.0667 .185928004
300.0708 .173735998
300.075 .170688001
300.0792 .167640004
300.0833 .167640004
300.0875 .167640004
300.0917 .167640004
300.0958 .161543991
300.1 .1524
300.1042 .158495994
300.1083 .149352003
300.1125 .158495994
300.1167 .1524
300.1208 .1524
300.125 .149352003
300.1292 .143256
300.1333 .146303997
300.1375 .149352003
300.1417 .146303997
300.1458 .137159996
300.15 .131064002
300.1542 .124967999
300.1583 .128015996
300.1625 .124967999
300.1667 .131064002
300.1708 .124967999
300.175 .124967999
300.1792 .134111999
300.1833 .118871996
300.1875 .128015996
300.1917 .131064002
300.1958 .128015996
300.2 .131064002
300.2042 .128015996
300.2083 .121920002
300.2125 .115823999
300.2167 .112776001
300.2208 .103632001
300.225 .097535998
300.2292 .103632001
300.2333 .094488001
300.2375 .082296003
300.2417 .0762
300.2458 .079247997
300.25 .067056
300.2542 .064007998
300.2583 .045720002
300.2625 .033528
300.2667 .036575999
300.2708 .036575999
300.275 .036575999
300.2792 .027432001
300.2833 .027432001
300.2875 .021336
300.2917 .012192
300.2958 .009144
300.3 .009144
300.3042 .003048
300.3083 0
300.3125 -.003048
300.3167 -.006096
300.3208 0
300.325 .006096
300.3292 -.003048
300.3333 .006096"))

indx <- grep("^[0-9]+ [[:alpha:]]", x)  # determine where the header lines (breaks) are

# read data into a list, one data frame per header
result <- lapply(indx, function(.start){
    # extract the line count from the header
    n <- as.integer(sub("^\\s*([0-9]+).*", "\\1", x[.start]))
    # read the n data lines that follow the header
    read.table(text = x[seq(.start + 1L, length.out = n)])
})

str(result)

> str(result)
List of 2
 $ :'data.frame': 40 obs. of 2 variables:
  ..$ V1: num [1:40] 300 300 300 300 300 ...
  ..$ V2: num [1:40] 0.3 0.289 0.276 0.259 0.237 ...
 $ :'data.frame': 81 obs. of 2 variables:
  ..$ V1: num [1:81] 300 300 300 300 300 ...
  ..$ V2: num [1:81] 0.216 0.204 0.21 0.195 0.192 ...
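The same header-driven idea could be wrapped in a small function that also keeps the series names from the header lines. This is only a sketch: read_blocks, sample_lines, and the column names Date and Value are hypothetical names chosen for illustration, and it assumes every header's row count matches the number of data lines that follow it.

```r
# Hypothetical helper: split lines into blocks using "<count> <name>" headers.
read_blocks <- function(lines) {
  # Header lines start with an integer followed by whitespace and a letter.
  hdr <- grep("^\\s*[0-9]+\\s+[[:alpha:]]", lines)
  blocks <- lapply(hdr, function(start) {
    # Row count taken from the header line.
    n <- as.integer(sub("^\\s*([0-9]+).*", "\\1", lines[start]))
    # Read the n data lines that follow the header.
    read.table(text = lines[seq(start + 1L, length.out = n)],
               col.names = c("Date", "Value"))
  })
  # Name each block after the series name in its header.
  names(blocks) <- trimws(sub("^\\s*[0-9]+\\s+", "", lines[hdr]))
  blocks
}

# A shortened sample with header counts adjusted to match the rows shown.
sample_lines <- c("3 Terry Cove-Model",
                  "300 .300110459327698",
                  "300.041656494141 .289277672767639",
                  "300.083343505859 .276237487792969",
                  "2 Terry Cove-Data",
                  "300 .216407993",
                  "300.0042 .204216005")

series <- read_blocks(sample_lines)
str(series)
```

For a file on disk, something like read_blocks(readLines("Practicedata.dat")) should work in the same way, provided the header counts in the file are correct.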
--
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.