thr3ads.net - R help - [R] using "eval(parse(text)) " , gsub(pattern, replacement, x) , to process "code" within a loop/custom function [Dec 2007]


Thomas Pujol

2007-Dec-06 17:10 UTC

[R] using "eval(parse(text)) " , gsub(pattern, replacement, x) , to process "code" within a loop/custom function

R-help users,
  Thanks in advance for any assistance ... I truly appreciate your expertise.  I
searched help and could not figure this out, and think you can probably offer
some helpful tips. I apologize if I missed something, which I'm sure I
probably did.
   
  I have data for many "samples". (e.g. 1950, 1951, 1952, etc.)

  For each "sample", I have many data-frames. (e.g. temp.1952,
births.1952, gdp.1952, etc.)

  (Because the data is rather "large" (and for other reasons), I have
chosen to store the data as individual files, as opposed to a list of data
frames.)
   
  I wish to write a function that enables me to "run" any of many
custom "functions/processes" on each sample of data.

  I currently accomplish this by using a custom function that uses:
"eval(parse(t=text.i2)) ", and "gsub(pat, rep, x)" (this
changes the "sample number" for each line of text I submit to
"eval(parse(t=text.i2))" ).

  Is there a better/preferred/more flexible way to do this?

  One issue/obstacle that I have encountered: Some of the custom functions I use
need to take as input the value of "d" in the loop below.
(Please see the sample function "fn.mn.d" below.)
  
#creates sample data
temp.1951 <- c(11,13,15)
births.1951 <- c(123, 156, 178)
temp.1952 <- c(21,23,25)
births.1952 <- c(223, 256, 278)
#######################
# function that looks for a pattern "pat.i" within "x" and replaces it with "rep.i"
recurse <- function(x, pat.i, rep.i) {
  f <- function(x, pat, rep) if (mode(x) == "character") gsub(pat, rep, x) else x
  if (length(x) == 0) return(x)
  if (is.list(x)) for (i in seq_along(x)) x[[i]] <- recurse(x[[i]], pat.i, rep.i)
  else x <- f(x, pat.i, rep.i)
  x
  # f <- function(x) if (mode(x) == "character") gsub("a", "green", x) else x
} # end recurse
#######################
#######################
# function that processes code submitted as "text.i" for each date in "dates.i"
fn.dateloop <- function(text.i, dates.i) {
  for (d in seq_along(dates.i)) {  # seq_along() also copes with an empty dates.i
    tempdate <- dates.i[d]
    text.i2 <- recurse(text.i, pat.i = '#', rep.i = tempdate)
    temp0 <- eval(parse(t = text.i2))
    tempname <- paste(names(temp0)[1], tempdate, sep = '.')
    save(list = 'temp0', file = tempname)
  } # next d
} # end fn.dateloop
#######################
  #####################
#a sample custom function that I want to run on each sample of data
fn.mn <- function(x, y) {
res = x - y
names(res) = 'mn'
res
} 
#####################
#####################
#example of function that takes d as input...
#I have not been able to get this to work with the custom function "fn.dateloop" above
#I request assistance in learning how to accomplish this
fn.mn.d <- function(x, y, d) {x[d] - y[d]} 
#####################
  #####################
setwd('c:/') #specifies location where sample data will be saved
getwd() #checks location
fn.mn(x=temp.1951, y=births.1951)
fn.mn(x=temp.1952, y=births.1952)
#
fn.dateloop(text.i = "fn.mn(x=get('temp.#'), y=get('births.#'))", dates.i = c('1951','1952'))
get(load('mn.1951'))
get(load('mn.1952'))
   
   
   

       

Emmanuel Charpentier

2007-Dec-06 23:00 UTC


Thomas Pujol wrote:
> R-help users,
>   Is there a better/preferred/more flexible way to do this?
Beware: what follows is the advice of someone used to working with data
through an RDBMS and SQL; as everyone knows, everything looks like a nail
to a man with a hammer. Caveat emptor...

Unless I misunderstand you, you are trying to treat piecewise a large
dataset made of a large number of reasonably-sized independent chunks.

What you're trying to do looks to me a bit like reinventing the SAS macro
language. What's the point?

IMNSHO, "large" datasets that are used only piecewise are much better
handled in a real database (RDBMS), queried at runtime via, for example,
Brian Ripley's RODBC.

In your example, I'd create a table Births holding all your data plus the
relevant year. Off the top of my head:

# Do that ONCE in the lifetime of your data : a RDBMS is probably more
# apt than R dataframes for this kind of management

library(RODBC)
channel<-odbcConnect(WhateverYouHaveToUseForYourFavoriteDBMS)

sqlSave(channel, tablename="Births",
        rbind(cbind(data.frame(Year=rep(1952,nrow(births.1952))),
                    births.1952),
              cbind(data.frame(Year=rep(1953,nrow(births.1953))),
                    births.1953),
# ... ^W^Y ad nauseam ...
))

rm(births.1951, births.1952, ...) # get back breathing space

Beware : certain data types may be tricky to save ! I got bitten by
Dates recently... See RODBC documentation, your DBMS documentation and
the "R Data Import/Export guide"...
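The hand-written rbind(cbind(...)) stacking in the sqlSave() call above could
probably be generated rather than typed out year by year. A minimal sketch in
base R, assuming the per-year objects are data frames named births.<year> in
the workspace; the sample frames and the helper name stack_years are made up
for illustration and are not part of RODBC:

```r
# Hypothetical sample data: one small data frame per year.
births.1951 <- data.frame(births = c(123, 156, 178))
births.1952 <- data.frame(births = c(223, 256, 278))

# stack_years() is an illustrative helper (an assumption, not an RODBC function).
# It fetches each "<prefix>.<year>" object, tags its rows with the year,
# and stacks the pieces into a single data frame.
stack_years <- function(prefix, years, envir = parent.frame()) {
  force(envir)  # fix the lookup environment before entering lapply()
  pieces <- lapply(years, function(yr) {
    df <- get(paste(prefix, yr, sep = "."), envir = envir)
    cbind(Year = rep(yr, nrow(df)), df)
  })
  do.call(rbind, pieces)
}

births_all <- stack_years("births", c(1951, 1952))
```

The resulting births_all is the kind of single table one would then hand to
sqlSave() once.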

At analysis time, you may use the result of the relevant query exactly
as one of your dataframes. Instead of:
foo(... data=births.1952, ...)
type:
foo(... data=sqlQuery(channel, "select * from \"Births\" where \"Year\"=1952;"), ...)
# Syntax illustrating talking to a "picky" DBMS...

Furthermore, the variable "Year" bears your "d" information. Problem
(dis)solved.

You may loop (or even sapply()...) at will on d:
for(year in 1952:1978) {
  query <- sprintf("select * from \"Births\" where \"Year\"=%d;", year)
  foo(... data=sqlQuery(channel,query), ...)
  ...
}
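For what it's worth, the sapply() variant of the loop above might look like
the sketch below. It runs without a database: fetch_year() is a hypothetical
stand-in for sqlQuery(channel, query) that filters an in-memory data frame,
and the Births data are invented for the example.

```r
years <- 1952:1954

# sprintf() is vectorised, so one call yields one query string per year.
queries <- sprintf("select * from \"Births\" where \"Year\"=%d;", years)

# Invented data standing in for the "Births" table.
Births <- data.frame(Year   = rep(years, each = 2),
                     births = c(10, 12, 20, 22, 30, 32))

# fetch_year() is a hypothetical stand-in for sqlQuery(channel, query).
fetch_year <- function(year) Births[Births$Year == year, ]

# One result per year; here the "process" is the mean of the births column.
means <- sapply(years, function(year) mean(fetch_year(year)$births))
names(means) <- years
```

With a live connection, fetch_year() would presumably be replaced by a call to
sqlQuery() on the matching element of queries.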

If you already use a DBMS with some connection to R (via RODBC or
otherwise), use that. If not, sqlite is a very lightweight library that
enables you to use a (very considerable) subset of SQL92 to manipulate
your data.

I understand that some people on this list have undertaken the creation
of a sqlite-based package dedicated to this kind of large data management.

HTH,

					Emmanuel Charpentier

Thomas Pujol

2007-Dec-07 15:12 UTC


Emmanuel,
  Thanks for your reply.  Please allow me to clarify.  I am already extensively
using an RDBMS to store the data, and have used SQL and ODBC to extract the
data into a set of R-files.  (I have experimented with this a bit, and for my
specific application, storing the data in R seems to improve speed and
convenience.  For example, I can extract the data only once, store it as an
R-file, and then use the data an infinite number of times, without ever again
needing to "hit" the RDBMS.)
   
  What I am trying to do:  I need to perform certain
operations/processes/custom-functions on each "sample".  I can easily
write the code to do this, using a "FOR-loop".  But I will then need
to have a separate loop for each process I want to run, and will re-write much
of the code within the "FOR-loop".
   
  I have many different "processes" I might want to perform on each
sample on any given day.  So instead of always re-writing the same loop, I want
to write a function that takes as its input the "process", and then
goes and runs it on each sample.
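One possible way to get such a function without pasting code together as text
is to pass the process itself as an R function and look the data up by name.
A minimal sketch, assuming objects named temp.<year> and births.<year> exist
in the calling environment. The helper run_on_samples and its arguments are
illustrative names, fn.mn and fn.mn.d are adapted so that their argument names
match the variable names, and results are returned in a list rather than saved
to files:

```r
# Sample data as in the original post.
temp.1951 <- c(11, 13, 15); births.1951 <- c(123, 156, 178)
temp.1952 <- c(21, 23, 25); births.1952 <- c(223, 256, 278)

# run_on_samples() is a hypothetical helper.
#   process: a function whose arguments are named after `vars` (plus, optionally, d)
#   vars:    base names of the objects to look up for each sample
#   dates:   suffixes identifying each sample
run_on_samples <- function(process, vars, dates, envir = parent.frame()) {
  force(envir)  # fix the lookup environment before entering lapply()
  out <- lapply(seq_along(dates), function(d) {
    # Fetch e.g. temp.1951 and births.1951, then rename them temp and births.
    args <- mget(paste(vars, dates[d], sep = "."), envir = envir)
    names(args) <- vars
    # Supply the loop index only to processes that declare a "d" argument.
    if ("d" %in% names(formals(process))) args$d <- d
    do.call(process, args)
  })
  names(out) <- dates
  out
}

fn.mn   <- function(temp, births) temp - births
fn.mn.d <- function(temp, births, d) temp[d] - births[d]

res   <- run_on_samples(fn.mn,   c("temp", "births"), c("1951", "1952"))
res.d <- run_on_samples(fn.mn.d, c("temp", "births"), c("1951", "1952"))
```

Because the loop index is handed over as an ordinary argument, a process that
needs "d" only has to declare it; no text substitution is involved.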
   
  Thanks
   
   

Gabor Grothendieck

2007-Dec-07 18:41 UTC


Use the same names (births, temp, ...) in each Rdata file and then load
each file into its own environment or proto object:

	library(proto); x1951 <- proto() # or x1951 <- new.env()
	load("1951.rda", envir = x1951)

Then pass the environment or proto object to each of your functions:

	f <- function(x) x$difference <- x$births - x$temp
	f(x1951)

The above completely avoids renaming variables and instead treats each
year as an object. If you use proto objects the home page
is: http://r-proto.googlecode.com
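A self-contained sketch of the plain-environment variant, looping over several
years. It assumes one .rda file per year, each holding objects called temp and
births; the files are written to a temporary directory here only so that the
example runs on its own, and the values are the sample data from the original
post.

```r
# Write one hypothetical .rda file per year into a temporary directory,
# using the same variable names (temp, births) in every file.
dir <- tempdir()
years <- c("1951", "1952")
for (yr in years) {
  temp   <- if (yr == "1951") c(11, 13, 15) else c(21, 23, 25)
  births <- if (yr == "1951") c(123, 156, 178) else c(223, 256, 278)
  save(temp, births, file = file.path(dir, paste0(yr, ".rda")))
}

# Load each file into its own environment, keyed by year.
samples <- lapply(years, function(yr) {
  e <- new.env()
  load(file.path(dir, paste0(yr, ".rda")), envir = e)
  e
})
names(samples) <- years

# Any process is now an ordinary function of one environment.
# Environments are modified in place, so the result is stored in each sample.
f <- function(x) x$difference <- x$births - x$temp
invisible(lapply(samples, f))

samples[["1952"]]$difference
```

Each year then carries its own results alongside its data, and a new process
is just another function applied with lapply().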

