Good morning R-Help,

I have a data frame with 7 columns and 10000+ rows. I want to subset/extract the rows of that data frame that match specific dates (not in order). Here is the head of my data frame:

    > head(mjo30)
      year month date      rmm1     rmm2 phase     amp
    1 1986     1    1 -0.326480 -1.55895     2 1.59277
    2 1986     1    2 -0.417700 -1.82689     2 1.87403
    3 1986     1    3  0.032915 -2.40150     3 2.40172
    4 1986     1    4  0.492743 -2.49216     3 2.54041
    5 1986     1    5  0.585106 -2.76866     3 2.82981
    6 1986     1    6  0.665013 -3.13883     3 3.20851

and here are my specific dates:

    > date
     [1] "1986-04-25" "1987-06-10" "1988-09-03" "1989-10-05" "1990-10-26" "1991-05-07" "1992-11-19" "1993-01-23" "1994-12-04"
    [10] "1995-05-11" "1996-10-04" "1997-04-29" "1998-04-08" "1999-01-16" "2000-08-01" "2001-10-02" "2002-05-08" "2003-04-01"
    [19] "2004-05-07" "2005-09-02" "2006-12-30" "2007-09-03" "2008-10-24" "2009-11-14" "2010-07-05" "2011-04-30" "2012-05-21"
    [28] "2013-04-07" "2014-05-07" "2015-07-26"

I was also confused when I dput my dates, because it shows this:

    > dput(date)
    structure(c(5958, 6369, 6820, 7217, 7603, 7796, 8358, 8423, 9103,
    9261, 9773, 9980, 10324, 10607, 11170, 11597, 11815, 12143, 12545,
    13028, 13512, 13759, 14176, 14562, 14795, 15094, 15481, 15802,
    16197, 16642), class = "Date")

What does that mean? I mean, why does it not show the dates but some values (5958, 6369, 7217, ...)?

Any comments and recommendations are appreciated. Thank you.

Best,

Ani
The dput function is for re-creating an R object in another R workspace, so it uses fundamental base types to define objects. A Date is really the number of days since a specific date (1970-01-01 in R) that gets converted to look like a date whenever you display or print it, so what you are seeing are those numbers. If you enter the R code returned by dput into an R session, you will see the dates again.

Your mjo30 table seems to call the day of the month the "date"... which is confusing. I would combine those three columns into one like

    mjo30$Dt <- as.Date( ISOdate( mjo30$year, mjo30$month, mjo30$date ) )

You could then use indexing

    mjo30[ date[1] == mjo30$Dt, ]

or

    mjo30[ mjo30$Dt %in% date, ]

but the subset function would not work in this case because you have two different objects (a column in mjo30 and a vector in your global environment) both referred to as 'date'.

On January 13, 2020 8:53:38 PM PST, ani jaya <gaaauul at gmail.com> wrote:
> [...]

--
Sent from my phone. Please excuse my brevity.
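A minimal sketch of the points in the reply above, not a definitive recipe. It assumes a three-row stand-in for mjo30 with made-up amp values, and the name `wanted` is introduced here only for illustration. It shows the dput() output turning back into dates, bracket indexing on a combined Date column, and the 'date' name clash inside subset():

```r
# First three values copied from the dput() output in the question.
wanted <- structure(c(5958, 6369, 6820), class = "Date")
wanted             # prints as dates, not as numbers
unclass(wanted)    # the stored day counts since 1970-01-01

# Hypothetical three-row stand-in for mjo30 (amp values are made up).
mjo30 <- data.frame(year  = c(1986, 1986, 1987),
                    month = c(4, 4, 6),
                    date  = c(24, 25, 10),
                    amp   = c(0.5, 0.6, 0.7))

# Combine year, month and day-of-month into one Date column.
mjo30$Dt <- as.Date(ISOdate(mjo30$year, mjo30$month, mjo30$date))

# Bracket indexing: 'wanted' is looked up in the workspace.
hits <- mjo30[mjo30$Dt %in% wanted, ]

# Name clash: inside subset(), 'date' resolves to the day-of-month column
# of mjo30, not to the Date vector in the workspace, so nothing matches.
date  <- wanted
clash <- subset(mjo30, Dt %in% date)
nrow(clash)
```

Giving the vector of wanted dates a name that is not also a column name avoids the clash altogether.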
Inline.

Bert Gunter

On Mon, Jan 13, 2020 at 8:54 PM ani jaya <gaaauul at gmail.com> wrote:

> Good morning R-Help,
>
> I have a dataframe with 7 columns and 10000+ rows. I want to subset/extract
> those data frame with specific date (not in order). Here the head of my
> data frame:
>
> head(mjo30)
>   year month date      rmm1     rmm2 phase     amp
> 1 1986     1    1 -0.326480 -1.55895     2 1.59277
> 2 1986     1    2 -0.417700 -1.82689     2 1.87403
> 3 1986     1    3  0.032915 -2.40150     3 2.40172
> 4 1986     1    4  0.492743 -2.49216     3 2.54041
> 5 1986     1    5  0.585106 -2.76866     3 2.82981
> 6 1986     1    6  0.665013 -3.13883     3 3.20851

These are columns of numeric values. That you label them as year, month, date is irrelevant.

> and here my specific date:
> > date
>  [1] "1986-04-25" "1987-06-10" "1988-09-03" "1989-10-05" "1990-10-26"
> "1991-05-07" "1992-11-19" "1993-01-23" "1994-12-04"
> [10] "1995-05-11" "1996-10-04" "1997-04-29" "1998-04-08" "1999-01-16"
> "2000-08-01" "2001-10-02" "2002-05-08" "2003-04-01"
> [19] "2004-05-07" "2005-09-02" "2006-12-30" "2007-09-03" "2008-10-24"
> "2009-11-14" "2010-07-05" "2011-04-30" "2012-05-21"
> [28] "2013-04-07" "2014-05-07" "2015-07-26"

This is how the print method for Date objects prints the dates. See ?Dates

> And also I was confused when I dput my date, it show like this:
> > dput(date)
> structure(c(5958, 6369, 6820, 7217, 7603, 7796, 8358, 8423, 9103,
> 9261, 9773, 9980, 10324, 10607, 11170, 11597, 11815, 12143, 12545,
> 13028, 13512, 13759, 14176, 14562, 14795, 15094, 15481, 15802,
> 16197, 16642), class = "Date")

This is how objects of class "Date" are represented internally: as numbers (the count of days since 1970-01-01). See ?Dates. Use str() (see ?str) to see the structure of an object, not dput().

I think you need to go through a tutorial or two on dates in R. And probably also on S3 methods in R.

> what is that mean? I mean why it is not recall the dates but some
> values (5958,6369,7217,..)?
>
> Any comment and recommendation is appreciate. Thank you.

Extended tutorials on these topics are inappropriate here. There are many places they can be found on the web. But here's an example of one simple way to do it:

    > d <- as.Date("2004-10-5") ## create object of class "Date"
    > ## This is what you want to subset with
    > d ## how they are printed
    [1] "2004-10-05"
    > str(d)
     Date[1:1], format: "2004-10-05"
    > class(d)
    [1] "Date"
    > dput(d) ## the internal representation of Date objects
    structure(12696, class = "Date")
    >
    > ## Now create a data frame that you want to subset with d
    > df <- data.frame (year = c(2004,2005),
    +                   month = c(10,2),
    +                   date = c(5,15))
    > df
      year month date
    1 2004    10    5
    2 2005     2   15
    > ## convert to a formatted character column of dates
    > alldates <- with(df,paste(year,month,date, sep ="-"))
    > alldates ## vector of formatted character strings.
    [1] "2004-10-5" "2005-2-15"
    > class(alldates)
    [1] "character"
    > ## convert it to "Date" class
    > alldates <- as.Date(alldates)
    > class(alldates)
    [1] "Date"
    > ## Now use this to subset the data frame
    > df[alldates %in% d, ]
      year month date
    1 2004    10    5

And please post in **plain text**, not HTML, in future.

Cheers,
Bert
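As a small illustration of the S3 point in the reply above (a sketch using only base R, with the same example date), the class attribute is what makes a plain number print as a date. Removing the attribute with unclass() leaves just the stored number:

```r
d <- as.Date("2004-10-05")

class(d)      # "Date": the S3 class that print() and format() dispatch on
typeof(d)     # "double": the underlying storage is an ordinary number
unclass(d)    # 12696, the number of days since 1970-01-01
format(d)     # "2004-10-05", the character form that print() displays

# Putting the class back onto the bare number restores the date behaviour.
d2 <- structure(12696, class = "Date")
identical(d, d2)
```

This is why dput() shows numbers: it writes out the stored value plus the class attribute, rather than the formatted text.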
Dear Jeff and Bert,

Thank you very much for your corrections and explanations. And yes, I need to study date formats more. Sorry for the HTML mail, I didn't realize.

I was able to subset the data that I want:

    mjo30 <- read.table("rmm.txt", header=FALSE, skip=4234, nrows=10957)
    mjo30$V8 <- NULL
    names(mjo30) <- c("year","month","day", "rmm1","rmm2","phase","amp")
    mjo3 <- as.Date(with(mjo30, paste(year, month, day, sep="-")), "%Y-%m-%d")
    mjo <- mjo30[which(mjo3 %in% date),]
    head(mjo)
         year month day      rmm1      rmm2 phase      amp
    115  1986     4  25 -0.319090 -0.363030     2 0.483332
    526  1987     6  10  1.662870  0.291632     5 1.688250
    977  1988     9   3 -0.604950 -0.299850     1 0.675181
    1374 1989    10   5  0.972298 -0.461030     4 1.076060
    1760 1990    10  26 -1.183110 -1.589810     2 1.981730
    1953 1991     5   7 -0.317180  0.953061     7 1.004450

Best,
Ani

On Tue, Jan 14, 2020 at 3:20 PM Jeff Newmiller <jdnewmil at dcn.davis.ca.us> wrote:
> [...]
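Since the wanted dates were described as "not in order", here is a hedged variation on the approach above. `%in%` returns rows in data-frame order, while match() returns them in the order of the wanted dates and also shows which dates were not found. The data frame is a three-row stand-in with made-up amp values, and the names `wanted`, `obs`, `idx` and `not_found` are hypothetical:

```r
# Hypothetical stand-in for mjo30 after renaming the third column to "day".
mjo30 <- data.frame(year  = c(1986, 1986, 1987),
                    month = c(4, 4, 6),
                    day   = c(24, 25, 10),
                    amp   = c(0.5, 0.6, 0.7))

# Wanted dates, deliberately out of order, one of them absent from the data.
wanted <- as.Date(c("1987-06-10", "1986-04-25", "1999-01-01"))

# Build the observation dates the same way as in the message above.
obs <- as.Date(with(mjo30, paste(year, month, day, sep = "-")), "%Y-%m-%d")

# match() gives, for each wanted date, its row position in obs (NA if absent).
idx <- match(wanted, obs)

mjo       <- mjo30[idx[!is.na(idx)], ]   # rows in the order of 'wanted'
not_found <- wanted[is.na(idx)]          # wanted dates with no matching row
```

Checking `not_found` is a cheap way to confirm that every requested date actually exists in the file.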