Prof Brian D Ripley
2000-Jul-20 07:33 UTC
[Rd] RFC: System and time support functions in R
I've been looking over system utility functions that we might want to
add to R.  A few come out of specific needs, others from looking at
other systems and what people are using system() for.  I've taken
account of Paul Gilbert's comments posted here a while ago (and I
think covered all except the use of mailers).

We currently have

date
*.socket
file.create
file.exists
file.remove
file.append
dir.create
basename
dirname
list.files/dir
unlink -- it is none too clear what this should do for dirs.
file.show
getenv

S-PLUS 5.x has

access
files.in.dir -- we have list.files.
is.dir
mkdir -- we have dir.create
rmdir -- recursive or not (unlink only removes empty directories)

I have added today for R-devel (Unix and Windows)

file.access() -- an access() work-alike.
file.info() -- subsumes is.dir(), and gives the information from
               stat(2) calls.
file.copy -- via file.create and file.append.
Sys.info() -- gives the information from uname(2) (including machine
              name) and getlogin(2) (the login name).

Things which I think we still may need:

putenv() -- or is `setenv' a better name? (putenv is the POSIX name).
sleep() -- called Sys.sleep()?, and with sub-second accuracy.  Tricky
           to do with event loops running, but looks possible.
           Package xgobi under Unix has system("sleep 3"), which is
           not a good idea in an event-driven system.  I have this
           running on Windows for xgobi there.
unlink() -- I suggest we add a recursive argument, defaulting to
            FALSE? (It is currently TRUE on most platforms.)

The other main area that needs something more is date/times.  For the
moment file.info returns times as days/fractional days since 1 Jan
1970, which chron() can interpret.  But that is not *quite* correct,
as not all days are the same length due to the (rare) use of
leap-seconds.  And chron does not know about timezones.

My suggestion here is to implement a time class called POSIXtime which
is just POSIX's time_t.  (Number of seconds since 1 Jan 1970.)  And
another time class POSIXtm which is an R list giving a struct tm (secs,
mins, hours, day of month, month, year, day of week, day of year).  (I
think it also needs to record the timezone used.)  Then we can have R
functions as vectorized wrappers for the POSIX functions (not
necessarily with these names)

time (say Sys.date): date() as a POSIXtime variable.
localtime / gmtime: convert POSIXtime to POSIXtm (local TZ/UTC)
mktime: convert POSIXtm to POSIXtime
strftime: convert POSIXtm to character string, flexibly.
difftime: difference between times in secs.
          (The wrappers for the last two could handle
          POSIXtime and POSIXtm objects.)

(Perhaps if these do not exist on a platform (unlikely) we can have
less accurate alternatives in our code.  They exist on Windows.)

Possibly we might want to allow

tzset: set a time zone, for the above functions

or perhaps better just have tz as an argument to the conversion
functions.

Is this a sensible design strategy?  I am reluctant to add another
set of date functions after packages date and chron, but cannot see
how to easily leverage those to do what I need, and in any case POSIX
has thought this through.

-- 
Brian D. Ripley,                  ripley@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272860 (secr)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-devel mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-devel-request@stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._

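For readers less familiar with the C side, the POSIX calls named in the proposal can be exercised directly. The following is a minimal sketch, not the proposed R implementation: the helper names format_utc, roundtrip_local and seconds_between are invented for this illustration, and the R wrappers discussed above would additionally be vectorized.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Format a time_t as a UTC string such as "2000-07-20 07:33:00".
   Returns the number of characters written, or 0 on failure. */
size_t format_utc(time_t t, char *buf, size_t n)
{
    assert(buf != NULL);
    struct tm *tm = gmtime(&t);          /* time_t -> broken-down UTC time */
    if (tm == NULL)
        return 0;
    return strftime(buf, n, "%Y-%m-%d %H:%M:%S", tm);
}

/* Convert to broken-down local time and back again:
   localtime() is the time_t -> struct tm direction,
   mktime() is the struct tm -> time_t direction. */
time_t roundtrip_local(time_t t)
{
    struct tm *p = localtime(&t);
    if (p == NULL)
        return (time_t)-1;
    struct tm copy = *p;                 /* localtime() returns static storage */
    return mktime(&copy);
}

/* Difference between two times in seconds, as a double. */
double seconds_between(time_t later, time_t earlier)
{
    return difftime(later, earlier);
}
```

On a POSIX system, where time_t counts seconds since 1 Jan 1970 UTC, format_utc((time_t)964078380, buf, sizeof buf) fills buf with "2000-07-20 07:33:00". Only the gmtime-based helper is independent of the local timezone setting, which is one reason to prefer an explicit tz argument over a global tzset.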
>>>>> Prof Brian D Ripley writes:

> I've been looking over system utility functions that we might want to
> add to R.  A few come out of specific needs, others from looking at
> other systems and what people are using system() for.  I've taken
> account of Paul Gilbert's comments posted here a while ago (and I
> think covered all except the use of mailers).

> ...

> The other main area that needs something more is date/times.  For the
> moment file.info returns times as days/fractional days since 1 Jan
> 1970, which chron() can interpret.  But that is not *quite* correct,
> as not all days are the same length due to the (rare) use of
> leap-seconds.  And chron does not know about timezones.

> My suggestion here is to implement a time class called POSIXtime which is
> just POSIX's time_t.  (Number of seconds since 1 Jan 1970.)  And another time
> class POSIXtm which is an R list giving a struct tm (secs, mins, hours, day
> of month, month, year, day of week, day of year).  (I think it also needs
> to record the timezone used.)  Then we can have R functions as vectorized
> wrappers for the POSIX functions (not necessarily with these names)

> time (say Sys.date): date() as a POSIXtime variable.
> localtime / gmtime: convert POSIXtime to POSIXtm (local TZ/UTC)
> mktime: convert POSIXtm to POSIXtime
> strftime: convert POSIXtm to character string, flexibly.
> difftime: difference between times in secs.
>           (The wrappers for the last two could handle
>           POSIXtime and POSIXtm objects.)

> (Perhaps if these do not exist on a platform (unlikely) we can have
> less accurate alternatives in our code.  They exist on Windows.)

> Possibly we might want to allow

> tzset: set a time zone, for the above functions

> or perhaps better just have tz as an argument to the conversion functions.

> Is this a sensible design strategy?  I am reluctant to add another
> set of date functions after packages date and chron, but cannot see
> how to easily leverage those to do what I need, and in any case POSIX
> has thought this through.

David James and I are currently discussing re-implementing chron, and
among the issues is interfacing ANSI C time & date functions as you
suggested above.  We should also provide strptime(), which is not ANSI
but is simple to implement without locale support, and may otherwise be
possible to take from glibc (assuming it is not in the system's libs).

(Btw, why POSIX?  K&R gave me the feeling that the above is all ANSI.)

I will cc David on this.  The R list (corresponding to struct tm) could
as well be the basic representation of a chron object ...

-k

> From: Kurt Hornik <Kurt.Hornik@ci.tuwien.ac.at>
> Date: Thu, 20 Jul 2000 13:17:35 +0200 (CEST)
>
> >>>>> Prof Brian D Ripley writes:
>
> > I've been looking over system utility functions that we might want to
> > add to R.  A few come out of specific needs, others from looking at
> > other systems and what people are using system() for.  I've taken
> > account of Paul Gilbert's comments posted here a while ago (and I
> > think covered all except the use of mailers).
>
> > ...
>
> > The other main area that needs something more is date/times.  For the
> > moment file.info returns times as days/fractional days since 1 Jan
> > 1970, which chron() can interpret.  But that is not *quite* correct,
> > as not all days are the same length due to the (rare) use of
> > leap-seconds.  And chron does not know about timezones.
>
> > My suggestion here is to implement a time class called POSIXtime which is
> > just POSIX's time_t.  (Number of seconds since 1 Jan 1970.)  And another time
> > class POSIXtm which is an R list giving a struct tm (secs, mins, hours, day
> > of month, month, year, day of week, day of year).  (I think it also needs
> > to record the timezone used.)  Then we can have R functions as vectorized
> > wrappers for the POSIX functions (not necessarily with these names)
>
> > time (say Sys.date): date() as a POSIXtime variable.
> > localtime / gmtime: convert POSIXtime to POSIXtm (local TZ/UTC)
> > mktime: convert POSIXtm to POSIXtime
> > strftime: convert POSIXtm to character string, flexibly.
> > difftime: difference between times in secs.
> >           (The wrappers for the last two could handle
> >           POSIXtime and POSIXtm objects.)
>
> > (Perhaps if these do not exist on a platform (unlikely) we can have
> > less accurate alternatives in our code.  They exist on Windows.)
>
> > Possibly we might want to allow
>
> > tzset: set a time zone, for the above functions
>
> > or perhaps better just have tz as an argument to the conversion functions.
>
> > Is this a sensible design strategy?  I am reluctant to add another
> > set of date functions after packages date and chron, but cannot see
> > how to easily leverage those to do what I need, and in any case POSIX
> > has thought this through.
>
> David James and I are currently discussing re-implementing chron, and
> among the issues is interfacing ANSI C time & date functions as you
> suggested above.  We should also provide strptime() which is not ANSI
> but simple to implement without locale support, and maybe possible to
> take from glibc otherwise (assuming it is not in the system's libs).
>
> (Btw, why POSIX?  K&R gave me the feeling that the above is all ANSI.)

ISO C (aka ANSI C) has time_t, but no guarantee that it is seconds past
1/1/1970.  POSIX in general tightens up the definitions considerably.

> I will cc David on this.  The R list (corresponding to struct tm) could
> as well be the basic representation of a chron object ...

I would be very happy for someone else to do this as part of chron
(say), *but* I do think at least the basic stuff should be in base R
1.2.x.  (Not even in package chron bundled with R.)  That way we can
use this for calendar and file dates.

If we go for the R list I think we do need to record the timezone in
the list (POSIX does not).

Brian

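The two points made in this reply, that file times held as fractional days need converting to seconds and that a broken-down time should carry its timezone, might look like this in C. This is only a sketch under stated assumptions: struct zoned_tm, days_to_seconds and zoned_utc are hypothetical names, and the standard struct tm has no zone field (some systems add one as an extension), which is why the record below stores the zone alongside it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Hypothetical record: a broken-down time plus the zone it refers to. */
struct zoned_tm {
    struct tm tm;
    char zone[16];
};

/* Fractional days since 1 Jan 1970 -> seconds since 1 Jan 1970.
   Every day is taken to be 86400 seconds long, as POSIX time does,
   so leap seconds are ignored here. */
double days_to_seconds(double days)
{
    return days * 86400.0;
}

/* Break t down in UTC and record that zone in the result.
   Returns 0 on success, -1 if the conversion fails. */
int zoned_utc(time_t t, struct zoned_tm *out)
{
    assert(out != NULL);
    struct tm *p = gmtime(&t);
    if (p == NULL)
        return -1;
    out->tm = *p;                        /* copy out of static storage */
    strcpy(out->zone, "UTC");            /* fits easily in zone[16] */
    return 0;
}
```

Keeping the zone next to the fields means a later conversion back to seconds does not have to guess whether gmtime or localtime produced them.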
> From: Martin Maechler <maechler@stat.math.ethz.ch>
> Date: Thu, 20 Jul 2000 15:24:33 +0200 (CEST)
> To: Prof Brian Ripley <ripley@stats.ox.ac.uk>

>     BDR> ISO C (aka ANSI C) has time_t, but no guarantee that it is seconds past
>     BDR> 1/1/1970.  POSIX in general tightens up the definitions considerably.
>
> Most of you probably know this, .. but
> since there's the "unix millennium bug" aka "C millennium bug",
> POSIX or its successor will have to extend/change the definition of time_t
> to use 64bit integers (or something else) instead of (32-bit) long int as
> it is now, since time will overflow some time in 2038 (?).
> Our structure should make sure to accommodate more than only about 2^31
> seconds since 1970-Jan-01.

Nothing in POSIX that I know of says time_t has to be a signed 32-bit
int, and some systems use long (64-bit) or double or long double
already, my book says.

There is already a problem with 32-bit ints in only going back to the
early years of this century.  This is an argument against making use
of the system conversion functions for a chron replacement, and I think
that what I called POSIXtm (a list) is a more general format.

> chron() even now uses double which should be okay, even if we'd go down to
> parts of seconds -- something we might consider.

I intended to use double in R to store time_t.  (Yes, I know file.info
does not currently, but what is there is a place holder.)

> How are financial "tick data" recorded nowadays (and in five years from now)

To one second, those that I have used.

Brian D. Ripley

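To make the range argument concrete: a signed 32-bit count of seconds runs from 1901-12-13 20:45:52 UTC to 2038-01-19 03:14:07 UTC, whereas a double holds every integer up to 2^53 exactly and leaves room for fractions of a second. The sketch below converts such a double to a UTC calendar date without going through time_t or the system conversion functions. It is an illustration only: struct civil and civil_from_seconds are hypothetical names, the calendar arithmetic is a standard days-to-civil-date algorithm for the proleptic Gregorian calendar rather than anything proposed in this thread, and leap seconds are ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical broken-down UTC time with a fractional seconds field. */
struct civil {
    int64_t year;
    int month, day, hour, min;
    double sec;
};

/* Seconds since 1970-01-01 00:00:00 UTC (held in a double) -> civil date. */
void civil_from_seconds(double secs, struct civil *out)
{
    assert(out != NULL);

    /* Split into whole seconds and a fraction in [0, 1). */
    int64_t whole = (int64_t)secs;               /* truncates toward zero */
    double frac = secs - (double)whole;
    if (frac < 0) { whole -= 1; frac += 1.0; }

    /* Split whole seconds into a day number and a second of day in [0, 86400). */
    int64_t days = whole / 86400;
    int64_t sod = whole % 86400;
    if (sod < 0) { sod += 86400; days -= 1; }

    /* Day number -> year/month/day, counting in 400-year eras from 0000-03-01. */
    int64_t z = days + 719468;
    int64_t era = (z >= 0 ? z : z - 146096) / 146097;
    int64_t doe = z - era * 146097;                                  /* [0, 146096] */
    int64_t yoe = (doe - doe / 1460 + doe / 36524 - doe / 146096) / 365;
    int64_t doy = doe - (365 * yoe + yoe / 4 - yoe / 100);           /* [0, 365] */
    int64_t mp = (5 * doy + 2) / 153;                                /* March = 0 */

    out->day = (int)(doy - (153 * mp + 2) / 5 + 1);
    out->month = (int)(mp < 10 ? mp + 3 : mp - 9);
    out->year = yoe + era * 400 + (out->month <= 2 ? 1 : 0);
    out->hour = (int)(sod / 3600);
    out->min = (int)(sod % 3600 / 60);
    out->sec = (double)(sod % 60) + frac;
}
```

Passing 2147483647.0, the largest value a signed 32-bit time_t can hold, gives 2038-01-19 03:14:07; passing 2147483648.0 simply gives the next second, since nothing here depends on the width of time_t.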
On Thu, 20 Jul 2000 08:33:01 +0100 (BST), you wrote in message
<Pine.GSO.4.05.10007200826030.9485-100000@auk.stats>:

>I've been looking over system utility functions that we might want to
>add to R. A few come out of specific needs, others from looking at
>other systems and what people are using system() for. I've taken
>account of Paul Gilbert's comments posted here a while ago (and I
>think covered all except the use of mailers).
>
>We currently have
>
>date
>*.socket
>file.create
>file.exists
>file.remove
>file.append

Is there any interest in adding binary file access to the base?  I
think it would be really useful, and have put together a prototype
(still for Windows only) that's on my web site at

<http://www.stats.uwo.ca/faculty/murdoch/software/Rstreams.zip>

Duncan Murdoch