Tania Oh
2008-Apr-22 13:05 UTC
[R] how to read in multiple files with unequal number of columns
Dear all,

I want to read in 1000 files which contain varying numbers of columns.
For example:

file[1] contains 8 columns (mixture of characters and numbers)
file[2] contains 16 columns, etc.

I'm reading everything into one big data frame, and when I try rbind, R
returns this error:

"Error in rbind(deparse.level, ...) :
  numbers of columns of arguments do not match"

Below is my code:

    all <- NULL
    all <- as.data.frame(all)

    ##read in the contents of the files
    for (f in 1:length(fnames)){

        tmp <- try(read.table(fnames[f], header=F, fill=T, sep="\t"), TRUE)

        if (class(tmp) == "try-error") {
            next ## skip this file if it's empty/non-existent
        }else{
            ## combine all the file contents into one big data frame
            all <- rbind(all, tmp)
        }
    }

Here is an example of the data in the files:

    L3 <- LETTERS[1:3]
    (d <- data.frame(cbind(x=1, y=1:10), fac=sample(L3, 10, replace=TRUE)))

    > str(d)
    'data.frame': 10 obs. of 3 variables:
     $ x  : num 1 1 1 1 1 1 1 1 1 1
     $ y  : num 1 2 3 4 5 6 7 8 9 10
     $ fac: Factor w/ 3 levels "A","B","C": 1 3 1 2 2 2 2 1 1 2

    my.fake.data <- data.frame(cbind(x=1, y=2))
    > str(my.fake.data)
    'data.frame': 1 obs. of 2 variables:
     $ x: num 1
     $ y: num 2

    all <- rbind(d, my.fake.data)

    Error in rbind(deparse.level, ...) :
      numbers of columns of arguments do not match

I've searched the R site but couldn't find any relevant solution. I
might have used the wrong keywords to search, so if this question has
been answered already, I'd be very grateful if someone could point me
to the post. Otherwise, any help or suggestions would be greatly appreciated.

Many thanks in advance,
tania

D.Phil student
Department of Physiology, Anatomy and Genetics
University of Oxford
Ingmar Visser
2008-Apr-22 13:12 UTC
[R] how to read in multiple files with unequal number of columns
You may be looking for ?merge

hth,
Ingmar

Ingmar Visser
Department of Psychology, University of Amsterdam
Roetersstraat 15
1018 WB Amsterdam
The Netherlands
t: +31-20-5256723
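A minimal sketch of what the ?merge suggestion could look like on the example data from the original post. Note that merge() joins on the shared columns rather than stacking rows, so a row whose key values already exist in the other data frame is matched instead of appended. The set.seed() call and the second data frame `other` (with y = 99) are assumptions added purely for illustration.

```r
set.seed(1)  # assumption: only here so the sampled factor is reproducible
L3 <- LETTERS[1:3]
d <- data.frame(x = 1, y = 1:10, fac = sample(L3, 10, replace = TRUE))
my.fake.data <- data.frame(x = 1, y = 2)

# all = TRUE keeps unmatched rows from both sides (a full outer join on x and y).
# The single row of my.fake.data matches the existing (x = 1, y = 2) row of d,
# so no new row is appended here.
merged <- merge(d, my.fake.data, all = TRUE)
nrow(merged)   # 10

# Hypothetical second data frame whose key values do not occur in d:
other <- data.frame(x = 1, y = 99)
merged2 <- merge(d, other, all = TRUE)
nrow(merged2)  # 11; fac is NA in the new row
```

For the stated goal of stacking many files on top of each other, this matching behaviour may not be what is wanted, so it is worth checking the row count after merging.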
John Kane
2008-Apr-22 18:29 UTC
[R] how to read in multiple files with unequal number of columns
You might want to have a look at the merge_all function in the reshape package.
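As a rough base-R stand-in for the idea of merging a whole list of data frames, and not a reproduction of how merge_all itself works, one can fold merge(..., all = TRUE) over the list with Reduce(). The three small data frames and the helper name `merge_pair` are hypothetical and exist only for this sketch.

```r
# Hypothetical data frames with 2, 3 and 1 columns
frames <- list(
  data.frame(x = 1:2, y = c("a", "b")),
  data.frame(x = 3L, y = "c", z = 10),
  data.frame(x = 4L)
)

# Full outer join of two data frames on whatever columns they share
merge_pair <- function(a, b) merge(a, b, all = TRUE)

# Fold the pairwise merge over the list, left to right
combined <- Reduce(merge_pair, frames)
combined
```

Because every step is a join on the shared column names, rows with identical key values in two data frames are matched rather than duplicated; this sketch only behaves like row stacking when the key values differ between files.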
jim holtman
2008-Apr-23 12:47 UTC
[R] how to read in multiple files with unequal number of columns
Is this what you want? I am assuming that you will read the data frames into a list and then process them as below:

    # put the data frames in a list -- would have read them in via a list
    x <- list(d, my.fake.data)

    # determine the maximum number of columns and then pad out the short ones;
    # also use the column names of the widest one
    # (which.max picks the first widest data frame, wherever it sits in the list)
    col.max <- max(sapply(x, ncol))
    colNames <- colnames(x[[which.max(sapply(x, ncol))]])

    new.data <- lapply(x, function(.data){
        if (ncol(.data) < col.max){
            # append NA columns up to col.max, then align the names
            .data[(ncol(.data) + 1):col.max] <- NA
            colnames(.data) <- colNames
        }
        .data
    })

    all <- do.call(rbind, new.data)

    > all
       x  y  fac
    1  1  1    B
    2  1  2    B
    3  1  3    B
    4  1  4    B
    5  1  5    A
    6  1  6    A
    7  1  7    C
    8  1  8    C
    9  1  9    A
    10 1 10    C
    11 1  2 <NA>

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?
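The padding approach above can be sketched end to end, starting from files on disk. This is only an illustration under stated assumptions: the two temporary files, the missing third path, and the helper names `read_one` and `pad` are all made up, and padding by position assumes that column k means the same thing in every file.

```r
# Assumption: two small tab-separated files written to a temporary directory,
# plus one path that does not exist, stand in for the 1000 real files.
tmp.dir <- tempfile("unequal_cols_")
dir.create(tmp.dir)
writeLines(c("a\t1\t2.5", "b\t2\t3.5"), file.path(tmp.dir, "f1.txt"))
writeLines("c\t3", file.path(tmp.dir, "f2.txt"))
fnames <- file.path(tmp.dir, c("f1.txt", "f2.txt", "missing.txt"))

# Read one file; return NULL if it is empty or cannot be opened
read_one <- function(f) {
  tryCatch(
    suppressWarnings(
      read.table(f, header = FALSE, fill = TRUE, sep = "\t",
                 stringsAsFactors = FALSE)
    ),
    error = function(e) NULL
  )
}

# Read everything into a list first, then drop the files that failed
frames <- Filter(Negate(is.null), lapply(fnames, read_one))

# Pad each data frame with NA columns up to the widest one, by position
col.max <- max(sapply(frames, ncol))
pad <- function(df) {
  if (ncol(df) < col.max) df[(ncol(df) + 1):col.max] <- NA
  names(df) <- paste0("V", seq_len(col.max))
  df
}

# A single rbind over the whole list, instead of growing a data frame in a loop
all.data <- do.call(rbind, lapply(frames, pad))
all.data
```

Collecting the data frames in a list and calling rbind once avoids repeatedly copying an ever-growing data frame inside the loop, which tends to matter with a thousand files.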