The Current Population Survey March Supplements contain records on households, families and individuals. Each has a distinct record type, and all of them are in the same file. I'm trying to read these files efficiently.

The following function reads the data file "indata". The records are described in lists contained in "dd_by_type", and flag_pos gives the character position in the data that indicates which record type to use. The function I've written "works" (see below, please), but it's awful at its job: it has to perform two read operations and one write operation per line, and I don't know how to make it more efficient.

There is a function for handling similar problems (read.fwf.multi), but it requires each collection of records to be of a defined row length (one household entry = n for all households), and that doesn't work here. So the function needs to read.fwf (or scan) a line and parse it according to the flag at the character position given by flag_pos.

Thoughts?

loadhierarchy <- function(indata, dd_by_type, flag_pos) {
  # read indata line by line and add each row based on its record type
  # we'll grab the line, compare the character in col position flag_pos to
  # type_flag (column 3 in dd_by_type) and
  # rbind to the data with the right formatting
  i <- 1
  width <- max(sum(unlist(dd_by_type[1:dim(dd_by_type)[1], 3])))
  #for (i in 1:dim(dd_by_type)[1]) {
  #  assign(paste("con_", i, sep=""),
  #         file(paste(file.path(indata), "_", i, ".csv", sep=""), open="w"))
  #}
  while (length(line <- (scan(indata, skip=(i-1), nlines=1, what=character(),
                              fill=TRUE, sep=","))) > 0) {
    typeflag <- as.integer(substr(line, flag_pos, flag_pos))
    inline <- read.fwf(indata, skip=(i-1), n=1,
                       widths=as.vector(unlist(dd_by_type[typeflag, 3])))
    inline <- matrix(unlist(inline), nrow=1)
    write.table(inline, paste(file.path(indata), "_", typeflag, ".csv", sep=""),
                row.names=FALSE, col.names=FALSE, append=TRUE, sep=",")
    print(i)
    i <- (i+1)
  }
}

--
View this message in context:
http://r.789695.n4.nabble.com/multiple-record-types-from-a-single-file-efficiently-tp2965249p2965249.html Sent from the R help mailing list archive at Nabble.com.
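
One possible direction, offered only as a rough sketch: because scan() and read.fwf() are each given a file name plus skip=, every iteration reopens the file and skips forward again. Reading all lines once with readLines(), grouping them by the flag character, and cutting each group into fields with substring() avoids that. The names parse_fixed, load_hierarchy_split and widths_by_type below are hypothetical; in particular, widths_by_type is assumed to be a plain named list of field widths keyed by the flag character, which is a simplification of dd_by_type, whose exact structure is not shown in the post. The sample lines are made up for illustration.

```r
# Hypothetical helper: cut a character vector of fixed-width lines into
# one column per field, given the field widths for that record type.
parse_fixed <- function(lines, widths) {
  ends <- cumsum(widths)
  starts <- ends - widths + 1
  fields <- lapply(seq_along(widths), function(j) {
    substring(lines, starts[j], ends[j])  # vectorised over all lines
  })
  names(fields) <- paste0("V", seq_along(widths))
  data.frame(fields, stringsAsFactors = FALSE)
}

# Hypothetical replacement for the line-by-line loop.
# ASSUMPTION: widths_by_type is a named list, e.g. list("1" = c(...), "2" = c(...)),
# keyed by the character found at flag_pos.
load_hierarchy_split <- function(indata, widths_by_type, flag_pos) {
  lines <- readLines(indata)                  # one pass over the file
  flags <- substr(lines, flag_pos, flag_pos)  # record-type flag of every line
  groups <- split(lines, flags)               # one character vector per type
  out <- lapply(names(groups), function(f) {
    parse_fixed(groups[[f]], widths_by_type[[f]])
  })
  names(out) <- names(groups)
  out
}

# Made-up example data: two record types, flag in column 1.
tmp <- tempfile()
writeLines(c("1AAA01", "2BB0203", "2CC0405", "1DDD06"), tmp)
widths <- list("1" = c(1, 3, 2), "2" = c(1, 2, 2, 2))
res <- load_hierarchy_split(tmp, widths, flag_pos = 1)
```

The result is a list with one data frame per record type, so each type can be written out with a single write.table() call instead of one append per line. All fields come back as character; type.convert() could be applied per column if numeric columns are needed. For files too large to hold in memory, the same idea should still work on chunks read from an open connection with readLines(con, n = ...).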