Hello,

I'm trying to fetch a data frame through the C API. This works when all
columns are numeric, but fails when a column contains strings. On the C
side the function looks like:

    SEXP myfunc(SEXP df)

and it is called with a data frame from the R side with:

    .Call("myfunc", somedataframe)

On the C side (actually the C++ side) I use code like this:

    SEXP colnames = getAttrib(df, R_NamesSymbol);
    cname = string(CHAR(STRING_ELT(colnames, i)));
    SEXP colData = VECTOR_ELT(df, i);   /* data for i-th column */
    if (isReal(colData))
        x = REAL(colData)[j];
    else if (isInteger(colData))
        n = INTEGER(colData)[j];
    else if (isString(colData))
        s = CHAR(STRING_ELT(colData, j));

The problem is that the last test (isString) never passes, even when I
pass in a frame in which one or more columns contain character strings.
When the column contains strings the isVector(colData) test passes, but
no matter how I try to fetch the string data I get a seg fault. That is,
forcing CHAR(STRING_ELT(colData, j)) will fault, and so will
VECTOR_ELT(colData, 0), even though colData passes the isVector test.

Any ideas?

Thanks,
ds

While I do not know how to handle this at the C level, I do know that by
default you do not get character columns in a data frame: data.frame()
converts them to factors. Internally a factor is stored as a vector of
integer codes, with the levels holding the labels (which are the strings
you see). So, e.g. (in R):

    > test <- data.frame(tmp = letters[1:10])
    > test
       tmp
    1    a
    2    b
    3    c
    4    d
    5    e
    6    f
    7    g
    8    h
    9    i
    10   j
    > is.character(test$tmp)
    [1] FALSE
    > as.numeric(test$tmp)   # the internal codes of the factor
     [1]  1  2  3  4  5  6  7  8  9 10
    > levels(test$tmp)       # the translation from internal code to label
     [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j"

You probably need to convert the factor to a character vector. I do not
know how to do that in C off the top of my head, but it is probably not
that difficult. At least now you should have some idea of where to look.

/Kasper

On Jun 21, 2006, at 10:07 PM, Dominick Samperi wrote:
> The problem is that the last test (isString) never passes,
> even when I pass in a frame for which one or more cols
> contain character strings. When the column contains
> strings the isVector(colData) test passes, but no matter
> how I try to fetch the string data I get a seg fault.
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

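The translation from internal code to label that Kasper describes can be modelled in plain C without any R headers. The following is only a toy sketch of the idea: the struct `model_factor`, the function `model_factor_label`, and the macro `MODEL_NA_CODE` are hypothetical names invented for illustration and are not part of the R API. The one R-specific fact it relies on is that R represents an integer NA as INT_MIN.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-in for R's integer NA, which is stored as INT_MIN. */
#define MODEL_NA_CODE INT_MIN

/* Toy model of a factor column: 1-based integer codes plus a label table. */
typedef struct {
    const int *codes;           /* one code per row: 1..n_levels, or NA */
    size_t n_rows;
    const char *const *levels;  /* labels, stored 0-based as usual in C */
    size_t n_levels;
} model_factor;

/* Return the label for row j, or NULL if the row index is out of range,
 * the value is missing, or the code does not match any level. */
const char *model_factor_label(const model_factor *f, size_t j)
{
    assert(f != NULL);
    if (j >= f->n_rows)
        return NULL;
    int code = f->codes[j];
    if (code == MODEL_NA_CODE || code < 1 || (size_t)code > f->n_levels)
        return NULL;
    return f->levels[code - 1];  /* codes are 1-based, arrays are 0-based */
}
```

The off-by-one between the 1-based codes and the 0-based level table, and the separate check for a missing value, are the two details that carry over directly to code written against the real API.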
I think you want

    else if (TYPEOF(colData) == STRSXP) ...

instead. I don't know if this will convert from factors to strings, but
somewhere it probably involves something like this:

    PROTECT(colData = coerceVector(colData, STRSXP));

Dominick Samperi wrote:
> The problem is that the last test (isString) never passes,
> even when I pass in a frame for which one or more cols
> contain character strings. When the column contains
> strings the isVector(colData) test passes, but no matter
> how I try to fetch the string data I get a seg fault.
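
For anyone finding this thread later, here is a minimal sketch of how the column dispatch might handle factors. As far as I can tell from Rinternals.h, isString(x) is the same test as TYPEOF(x) == STRSXP, and isInteger(x) deliberately returns false for factors, so a factor column (integer storage with a "levels" attribute) falls through all three branches of the original code. Forcing STRING_ELT on it then reads integer storage as if it held string pointers, which would explain the seg fault. The helper name `print_cell` is hypothetical, and the code assumes it is compiled as C inside a package or with R CMD SHLIB, so it needs the R headers to build.

```c
#include <R.h>
#include <Rinternals.h>

/* Hypothetical helper: print row j of one data frame column. */
static void print_cell(SEXP col, R_xlen_t j)
{
    if (isFactor(col)) {
        /* Factor: 1-based integer codes plus a "levels" character vector. */
        SEXP levels = getAttrib(col, R_LevelsSymbol);
        int code = INTEGER(col)[j];
        if (code == NA_INTEGER)
            Rprintf("NA\n");
        else
            Rprintf("%s\n", CHAR(STRING_ELT(levels, code - 1)));
    } else if (isReal(col)) {
        Rprintf("%g\n", REAL(col)[j]);
    } else if (isInteger(col)) {
        Rprintf("%d\n", INTEGER(col)[j]);
    } else if (isString(col)) {
        SEXP s = STRING_ELT(col, j);
        Rprintf("%s\n", s == NA_STRING ? "NA" : CHAR(s));
    }
}

/* Entry point as in the original post: .Call("myfunc", somedataframe).
 * Nothing is allocated here, so no PROTECT calls are needed. */
SEXP myfunc(SEXP df)
{
    SEXP colnames = getAttrib(df, R_NamesSymbol);
    for (R_xlen_t i = 0; i < XLENGTH(df); i++) {
        SEXP col = VECTOR_ELT(df, i);
        Rprintf("%s:\n", CHAR(STRING_ELT(colnames, i)));
        for (R_xlen_t j = 0; j < XLENGTH(col); j++)
            print_cell(col, j);
    }
    return R_NilValue;
}
```

The isFactor test comes first so that factor columns are decoded through their levels. The isString branch is still reached for genuine character columns, for example ones created with I() or with stringsAsFactors = FALSE.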