Currently in R, constructing a string containing values of variables is done using 'paste' and can be an error-prone and traumatic experience. For example, when constructing a db query we have to write

    paste("SELECT ", value, " FROM table where date ='", cdate, "'")

and we get a null result from it, because without the (forgotten...) sep="" we get

    SELECT  value  FROM table where date =' 2005-05-05 '

instead of

    SELECT value FROM table where date ='2005-05-05'

Adding sep="" as a habit results in other errors, like column names joined with keywords because of forgotten spaces. Not to mention mixing up or unbalancing quote marks etc. The approach used by paste is similar to that of many other languages (like early Java, VB etc) and is inherently error-prone because of poor visualization. There is a way to improve it.

In the Java world gstrings were introduced specifically for this purpose. A gstring is a string with variable names embedded and replaced by values (converted to strings, lazy eval) before use. An example in R-syntax would be:

    > alpha <- 8; beta="xyz"
    > gstr <- "the result is ${alpha} with the comment ${beta}"
    > cat(gstr)
    the result is 8 with the comment xyz

This syntactic sugar significantly reduces the number of mistakes made with normal string concatenations. Gstrings are used in ant and groovy (for details see http://groovy.codehaus.org/Strings, jump to GStrings). They are particularly useful for creating readable and error-free SQL statements, but obviously they simplify 'normal' string+value handling in all situations. [ps: gstrings are not nestable]

I was wondering how difficult it would be to add such syntactic sugar to R, and would that create some language problems? Maybe it could be done as some gpaste function, parsing the argument for ${var}, extracting variables from the environment, evaluating them and producing the final string?
I admit my bias - using ant for years and groovy for months and having to do a lot of SQL queries does not put me in the mainstream of R users - so it may be that this idea is not usable to a wider group of users.
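A minimal sketch of the kind of function the post asks for, using only base R. The name gpaste_sketch and its arguments are hypothetical, chosen here for illustration; it assumes a single template string, plain variable names inside ${...}, and no nesting.

```r
# Hypothetical helper sketching the ${var} idea; not part of base R.
gpaste_sketch <- function(template, env = parent.frame()) {
  # Locate every ${...} placeholder in the (length-one) template string.
  m <- gregexpr("\\$\\{[^}]+\\}", template, perl = TRUE)
  placeholders <- regmatches(template, m)[[1]]
  values <- vapply(placeholders, function(p) {
    # Strip the leading "${" and the trailing "}" to get the variable name.
    name <- substr(p, 3, nchar(p) - 1)
    # Look the variable up; get() stops with an error if it does not exist.
    paste(as.character(get(name, envir = env)), collapse = " ")
  }, character(1), USE.NAMES = FALSE)
  # Put the values back where the placeholders were.
  regmatches(template, m) <- list(values)
  template
}

alpha <- 8; beta <- "xyz"
gpaste_sketch("the result is ${alpha} with the comment ${beta}")
# [1] "the result is 8 with the comment xyz"
```

Values are substituted as plain text, so any quoting or escaping the SQL needs is still up to the caller.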
Hi,

In Bioconductor, we have something called copySubstitute, which does what you want, I believe:

    x="select @var1@ from @tab1@"
    copySubstitute(textConnection(x), symbolValues=list(var1="Race", tab1="ReallyBigTable"), dest=stdout())

yields

    select Race from ReallyBigTable

You can read in from any connection and write out to any connection, change the delimiter, etc. We use it to autogenerate manual pages and other documentation for packages that have lots of similar structure, as well as for things like what you want to do.

Best wishes,
Robert

On May 7, 2005, at 1:36 AM, charles loboz wrote:
> Currently in R, constructing a string containing values of variables is done using 'paste' and can be an error-prone and traumatic experience. [...]

Robert Gentleman
Head, Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
phone: (206) 667-7700, fax: (206) 667-1319, office: M2-B865
email: rgentlem at fhcrc.org
The other thing to use is 'sprintf', which would be fantastic in R if it imputed types based on the format string. As it is now, for your query you would do:

    > sprintf("SELECT %s FROM table WHERE date = '%s'", "column", "2005-10-12")
    [1] "SELECT column FROM table WHERE date = '2005-10-12'"

Which, in my opinion, is nicer than the corresponding paste, and about as nice as gstring. The issue that I always have with sprintf is when I use numbers, specifically integers. As the function is just a wrapper for the C function, and because numbers are implicitly doubles, the following doesn't work:

    > sprintf("SELECT %s FROM table WHERE age = %d", "column", 1)
    Error in sprintf("SELECT %s FROM table WHERE age = %d", "column", 1) :
      use format %f, %e or %g for numeric objects

It does work, however, if you do

    > sprintf("SELECT %s FROM table WHERE age = %d", "column", as.integer(1))
    [1] "SELECT column FROM table WHERE age = 1"

This, however, is not so nice. Are there reasons why it has to be like this? This might be naive, but I would think it would be pretty simple in R to do this automatically.

Thanks for any insight.

jim

charles loboz wrote:
> Currently in R, constructing a string containing values of variables is done using 'paste' and can be an error-prone and traumatic experience. [...]

--
James Bullard
bullard at berkeley.edu
760.267.0986
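The automatic conversion asked about above can be sketched as a thin wrapper around sprintf. The name sprintf_auto is hypothetical and the rule it applies (coerce doubles that hold exact whole numbers to integer before formatting) is an assumption made for illustration. Recent versions of R document a similar coercion inside sprintf itself, so the wrapper mainly shows the idea.

```r
# Hypothetical wrapper, not part of base R: whole-number doubles are
# converted to integer so that they can be used with %d.
sprintf_auto <- function(fmt, ...) {
  args <- lapply(list(...), function(a) {
    whole <- is.double(a) && all(is.finite(a)) && all(a == round(a)) &&
      all(abs(a) <= .Machine$integer.max)
    # Leave strings, non-whole numbers, NA and Inf untouched.
    if (whole) as.integer(a) else a
  })
  do.call(sprintf, c(list(fmt), args))
}

sprintf_auto("SELECT %s FROM table WHERE age = %d", "column", 1)
# [1] "SELECT column FROM table WHERE age = 1"
```

One trade-off of this approach: a whole-number double that was meant for %f also arrives as an integer, so the wrapper relies on sprintf accepting integers for floating-point formats.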
On 5/7/05, charles loboz <charles_loboz at yahoo.com> wrote:
> Currently in R, constructing a string containing values of variables is done using 'paste' and can be an error-prone and traumatic experience. [...]
>
> I was wondering how difficult it would be to add such syntactic sugar to R and would that create some language problems? May be it is possible that it could be done as some gpaste function, parsing the argument for ${var}, extracting variables from the environment, evaluating them and producing the final string?

Here is one attempt. It eliminates the necessity to quote the elements altogether, but in exchange requires that the argument be a valid R expression. It is based on the R bquote function.

    gpaste <- function(expr, where = parent.frame()) {
      # turn a symbol or constant into a symbol (name)
      dequote <- function(e) as.name(noquote(as.character(e)))
      # walk the expression; .(x) is evaluated in 'where' before being converted
      unquote <- function(e) {
        if (length(e) <= 1) dequote(e)
        else if (e[[1]] == as.name(".")) dequote(eval(e[[2]], where))
        else as.call(lapply(e, unquote))
      }
      rval <- paste(unquote(substitute(expr)), collapse = " ")
      # drop the "+" separators and the backticks left by deparsing
      rval <- gsub("+ ", "", rval, fixed = TRUE)
      gsub("`", "", rval)
    }

    # test
    var <- "myvar"
    gpaste( select + .(var) + from + table + where + date + " =" + .(sQuote(Sys.Date())) )

When you run it you get this:

    > gpaste( select + .(var) + from + table + where +
    + date +" =" + .(sQuote(Sys.Date())) )
    [1] "select myvar from table where date = '2005-05-07'"
charles loboz <charles_loboz at yahoo.com> wrote:
> A gstring is a string with variable names embedded and replaced by
> values (converted to strings, lazy eval) before use.

I use the following function, which will take variables either from named arguments or from the environment. It also concatenates all unnamed arguments (with sep="") as a convenience for long strings.

    g.p <- function(..., esc="\\$", sep="", collapse=" ", parent=1) {
      a <- lapply(list(...), as.character)
      n <- names(a); if (is.null(n)) n <- rep("", length(a))
      s <- do.call("paste", c(a[n==""], sep=sep, collapse=collapse))
      for (i in which(n != "")) s <- gsub(paste(esc,n[i],sep=""), a[[i]], s)
      while ((r <- regexpr(paste(esc,"\\w*",sep=""), s)) > 0) {
        v <- substring(s, r+1, r+attr(r,"match.length")-1)
        s <- if (v=="") paste(substring(s,1,r-1), substring(s,r+2), sep="")
             else gsub(paste(esc,v,sep=""), as.character(eval.parent(parse(text=v), parent)), s)
      }
      s
    }

Here's a simple example:

    R> alpha <- 8
    R> g.p("the result is $alpha with the comment $beta", beta="xyz")
    [1] "the result is 8 with the comment xyz"

--
David Brahm (brahm at alum.mit.edu)