Hello,
I'm trying to fetch a data frame through the C API,
and have no problem doing this when all columns
are numbers, but when there is a column of
strings I have a problem. On the C-side the
function looks like:
SEXP myfunc(SEXP df),
and it is called with a dataframe from
the R side with:
.Call("myfunc", somedataframe)
On the C side (actually C++ side) I use code
like this:
SEXP colnames = getAttrib(df, R_NamesSymbol);
cname = string(CHAR(STRING_ELT(colnames, i)));
SEXP colData = VECTOR_ELT(df, i);  // data for i-th column
if(isReal(colData))
    x = REAL(colData)[j];
else if(isInteger(colData))
    k = INTEGER(colData)[j];
else if(isString(colData))
    s = CHAR(STRING_ELT(colData, j));
The problem is that the last test (isString) never passes,
even when I pass in a frame for which one or more cols
contain character strings. When the column contains
strings the isVector(colData) test passes, but no matter
how I try to fetch the string data I get a seg fault. That
is, forcing CHAR(STRING_ELT(colData,j)) will
fault, and so will VECTOR_ELT(colData,0), even
though colData passes the isVector test.
Any ideas?
Thanks,
ds
While I do not know how to handle this at the C level, I do know that
character columns in a data frame are, by default, not stored as
character vectors: data.frame() converts them to factors. Internally a
factor is a vector of integer codes, and each code has a label (its
level), which is the string you see. For example (in R):
 > test <- data.frame(tmp = letters[1:10])
 > test
    tmp
1    a
2    b
3    c
4    d
5    e
6    f
7    g
8    h
9    i
10   j
 > is.character(test$tmp)
[1] FALSE
 > as.numeric(test$tmp) # The internal code of the factor
[1]  1  2  3  4  5  6  7  8  9 10
 > levels(test$tmp) # gives you the translation from internal code to actual label
 [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j"
You probably need to convert the factor to a character vector, which I
do not know how to do in C off the top of my head, but which is probably
not that difficult. At least now you should have some idea of where to look.
/Kasper
On Jun 21, 2006, at 10:07 PM, Dominick Samperi wrote:
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
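To make the codes-plus-labels picture above concrete, here is a minimal sketch in plain C. It is only a model of the idea, not R's actual memory layout or API: the struct `factor_model`, the function `factor_label` and the `MODEL_NA` marker are hypothetical names invented for this illustration.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical plain-C model of a factor column (not R's real layout). */
#define MODEL_NA INT_MIN           /* stand-in for a missing value */

typedef struct {
    const int *codes;              /* 1-based level codes, one per row */
    size_t n_rows;
    const char *const *levels;     /* label for code k is levels[k - 1] */
    size_t n_levels;
} factor_model;

/* Return the label for row j, or NULL when the row is out of range,
   missing, or carries a code with no matching level. */
const char *factor_label(const factor_model *f, size_t j)
{
    if (f == NULL || j >= f->n_rows)
        return NULL;
    int code = f->codes[j];
    if (code == MODEL_NA || code < 1 || (size_t)code > f->n_levels)
        return NULL;
    return f->levels[code - 1];    /* codes are 1-based, arrays 0-based */
}
```

The point of the sketch is the `code - 1` lookup: the column itself only holds integers, and the strings live once in the levels table, which is why a test for a string vector does not match such a column.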
I think you want
        else if (TYPEOF(colData) == STRSXP)
... instead.
I don't know if this will convert from factors to strings,
but somewhere it probably involves something like this:
     PROTECT(colData = coerceVector(colData, STRSXP));
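Putting the two replies together, one possible way to read a column as strings is sketched below. It assumes the column arrives as a factor or as some other atomic vector; the function name `column_as_strings` and its 1-based `index` argument are made up for this example. A factor is an integer vector carrying a "levels" attribute, so `isString()` is false for it, and `isInteger()` also deliberately returns false for factors, which would explain why none of the branches in the original code fired. The sketch tests `Rf_isFactor()` explicitly and converts with `Rf_asCharacterFactor()` rather than relying on what `coerceVector()` does with a factor. The `Rf_` names are the full names of the functions the thread calls through their short aliases.

```c
#include <R.h>
#include <Rinternals.h>

/* Hypothetical helper: return column `index` (1-based) of a data frame
   as a character vector, printing each cell along the way. */
SEXP column_as_strings(SEXP df, SEXP index)
{
    if (TYPEOF(df) != VECSXP)
        Rf_error("expected a data frame (a list of columns)");

    int idx = Rf_asInteger(index);
    if (idx == NA_INTEGER || idx < 1 || idx > LENGTH(df))
        Rf_error("column index out of range");

    SEXP col = VECTOR_ELT(df, idx - 1);
    SEXP out;

    if (Rf_isFactor(col))
        /* integer codes -> level labels */
        out = PROTECT(Rf_asCharacterFactor(col));
    else
        /* returns col unchanged when it already is a character vector */
        out = PROTECT(Rf_coerceVector(col, STRSXP));

    for (R_xlen_t j = 0; j < XLENGTH(out); j++) {
        SEXP cell = STRING_ELT(out, j);
        /* NA_STRING marks a missing value */
        Rprintf("%s\n", cell == NA_STRING ? "NA" : CHAR(cell));
    }

    UNPROTECT(1);
    return out;
}
```

From R this would be called as `.Call("column_as_strings", somedataframe, 1L)` once the compiled code is loaded. Alternatively, the conversion can be avoided on the R side by building the data frame so that the column stays a character vector, for example by wrapping it in `I()`.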