Juan Manuel Barreneche
2015-Oct-22 00:45 UTC
[Rd] alternative read.arff function for the package foreign
Hello everyone,

I guess this is really directed to the R Core Team, but I understand that this is the best channel for submitting it (please correct me if I'm wrong!).

I would like to submit a function for consideration, as an upgrade to the current read.arff in the foreign package. The code is on GitHub:

https://raw.githubusercontent.com/jumanbar/misc/master/R/read.arff.R

This function is a modified version of the one found in the foreign package. The changes aim to correct a problem I found with the standard read.arff: the levels of factors do not match what is explicitly written in the original ARFF file.

For example, if a nominal attribute in some ARFF data file has this line in the header:

    @attribute X {'A', 'B', 'C'}

but the data only contain instances of 'A' and 'B', and none of 'C', then what R imports is:

    dat <- read.arff("data.arff")
    levels(dat$X)
    [1] "a" "b"

Not only are the levels in lowercase, but one level has also disappeared. This is troublesome, especially if I wish to export my data frame to an ARFF file using write.arff.

With this version of read.arff, when dealing with the aforementioned case, I get:

    levels(dat$X)
    [1] "A" "B" "C"

I can also set a couple of parameters that help me tune my workflow to better fit my needs (for example, reading only a limited number of lines when I just want to run a couple of quick tests and therefore don't need the whole dataset).

Thanks for your time,
Juan Manuel

--
MSc. Juan M. Barreneche Sarasola
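
P.S. To illustrate the idea, here is a minimal sketch in base R. It is not the code in the linked file: parse_levels is a hypothetical helper written only for this example, and it assumes that level names contain no commas. It shows how levels declared in the header can be kept even when some of them never occur in the data.

```r
# Hypothetical helper (not part of the foreign package): extract the declared
# levels from an ARFF nominal attribute line such as
#   @attribute X {'A', 'B', 'C'}
# Assumption: level names contain no commas.
parse_levels <- function(attr_line) {
  # Keep only the text between the braces
  inner <- sub("^[^{]*\\{(.*)\\}[[:space:]]*$", "\\1", attr_line)
  # Split on commas, then trim whitespace and surrounding quotes
  parts <- trimws(strsplit(inner, ",", fixed = TRUE)[[1]])
  gsub("^['\"]|['\"]$", "", parts)
}

declared <- parse_levels("@attribute X {'A', 'B', 'C'}")
values   <- c("A", "B", "A")   # the data section contains no 'C'

x_default  <- factor(values)                     # levels inferred from the data
x_declared <- factor(values, levels = declared)  # levels taken from the header

levels(x_default)   # "A" "B"
levels(x_declared)  # "A" "B" "C"
```

Passing the declared levels to factor() explicitly is what keeps the unused level 'C', so a later write.arff would reproduce the original header.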