Hi Ivan,
Can you please clarify what input files should be used with your
proposed function? I tried a few files in r-svn/src/include and one of
them gave me an error.
> getdecl("~/R/r-svn/src/include/R.h")
[1] "R_FlushConsole"  "R_ProcessEvents" "R_WaitEvent"
> getdecl("~/R/r-svn/src/include/Rdefines.h")
Error in regmatches(lines, gregexec(rx, lines, perl = TRUE))[[1]][3, ] :
  incorrect number of dimensions
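
My guess is that the error comes from the no-match case: when the regex
finds nothing, regmatches() gives back character(0) instead of a matrix,
so [3, ] has no second dimension to index. I have not checked whether
Rdefines.h really has no declarations left once the preprocessor
directives are dropped, so treat that part as an assumption. Below is a
minimal sketch of a guard. The regex is a simplified stand-in for the
one in getdecl(), and the helper name extract_names is hypothetical;
neither comes from your message.

```r
# Simplified stand-in for the regex in getdecl(): return type, name, arguments.
# This is an assumption for illustration only, not the original pattern.
rx <- "(\\w+)\\s+(\\w+)\\s*\\(([^)]*)\\)\\s*;"

# Hypothetical helper: same extraction step as getdecl(), plus a no-match guard.
extract_names <- function(text) {
  m <- regmatches(text, gregexec(rx, text, perl = TRUE))[[1]]
  # With at least one match, m is a character matrix: one column per match,
  # row 1 is the full match, rows 2..4 are the capture groups.
  # With no match, m is character(0), and m[3, ] would raise
  # "incorrect number of dimensions".
  if (!is.matrix(m)) return(character(0))
  m[3, ]
}

# Made-up header text with two declarations: returns both function names.
extract_names("int foo(void);\ndouble bar(int x);")

# Text without any declaration: returns character(0) instead of an error.
extract_names("/* only macros here */")
```

If that is what is happening, the same is.matrix() check before the
[3, ] subscript in getdecl() would make it return an empty vector for
headers without function declarations.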
On Mon, Jul 15, 2024 at 10:32 AM Ivan Krylov via R-devel
<r-devel at r-project.org> wrote:
>
> Hi all,
>
> I've noticed some peculiarities in the tools:::funAPI output that
> complicate its programmatic use a bit.
>
> - Is it for remapped symbol names (with Rf_ or the Fortran
> underscore), or for unmapped names (without Rf_ or the underscore)?
>
> I see that the functions marked in WRE are almost all (except
> Rf_installChar and Rf_installTrChar) unmapped. This makes a lot of
> sense because some of those interfaces (e.g. CONS(), CHAR(),
> NOT_SHARED()) are C preprocessor macros, not functions. I also see that
> installTrChar is not explicitly marked.
>
> Are we allowed to call tools:::unmap(tools:::funAPI()$name) and
> consider the return value to be the list of all unmapped APIs, despite,
> e.g., installTrChar not being explicitly marked?
>
> - Should R_PV be an @apifun if it's currently caught by checks in
> sotools.R?
>
> - Should R_FindSymbol be commented /* Not API */ if it's marked as
> @apifun in WRE and not caught by sotools.R? It is currently used by 8
> CRAN packages.
>
> - The names 'select', 'delztg' from R_ext/Lapack.h are function
> pointer arguments, not functions or type declarations. They are
> being found because funcRegexp is written to match incomplete
> function declarations (e.g. when they end up being split over
> multiple lines, like in R_ext/Lapack.h), and function pointer
> argument declarations look sufficiently similar.
>
> A relatively compact (but still brittle) way to match function
> declarations in C header files is shown at the end of this message. I
> have confirmed that compared to tools:::getFunsHdr, the only extraneous
> symbols that it finds in preprocessed headers are "R_SetWin32",
> "user_unif_rand", "user_unif_init", "user_unif_nseed",
> "user_unif_seedloc", "user_norm_rand", which are special-cased in
> tools:::getFunsHdr, and the only symbols it doesn't find are "select"
> and "delztg" in R_ext/Lapack.h, which we should not be finding.
>
> # "Bird's eye" view, gives unmapped names on non-preprocessed headers
> getdecl <- function(file, lines = readLines(file)) {
>   # have to combine to perform multi-line matches
>   lines <- paste(c(lines, ''), collapse = '\n')
>   # first eat the C comments, dotall but non-greedy match
>   lines <- gsub('(?s)/\\*.*?\\*/', '', lines, perl = TRUE)
>   # C++-style comments too, multiline not dotall
>   lines <- gsub('(?m)//.*$', '', lines, perl = TRUE)
>   # drop all preprocessor directives
>   lines <- gsub('(?m)^\\s*#.*$', '', lines, perl = TRUE)
>
>   rx <- r"{(?xs)
>     (?!typedef)(?<!\w) # please no typedefs
>     # return type with attributes
>     (
>       # words followed by whitespace or stars
>       (?: \w+ (?:\s+ | \*)+)+
>     )
>     # function name, assumes no extra whitespace
>     (
>       \w+\(\w+\) # macro call
>       | \(\w+\) # in parentheses
>       | \w+ # a plain name
>     )
>     # arguments: non-greedy match inside parentheses
>     \s* \( (.*?) \) \s* # using dotall here
>     # will include R_PRINTF_FORMAT(1,2 but we don't care
>     # finally terminated by semicolon
>     ;
>   }"
>
>   regmatches(lines, gregexec(rx, lines, perl = TRUE))[[1]][3,]
> }
>
> # Preprocess then extract remapped function names like getFunsHdr
> getdecl2 <- function(file)
>   file |>
>   readLines() |>
>   grep('^\\s*#\\s*error', x = _, value = TRUE, invert = TRUE) |>
>   tools:::ccE() |>
>   getdecl(lines = _)
>
> --
> Best regards,
> Ivan
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel