Hello!

I have data in the form of a symmetric distance matrix; in the file I have recorded only the upper triangular part, with the diagonal. The matrix is 21x21, and the file has row and column names, and some other information. I am trying to read it with the following code (I tried many variations on it, but all give the same error). The items in the data file are delimited by white space.

(Part of) script to read:

    myfile <- file("Paises.dat", open="r")
    # opens a connection which stays open until closed by close(myfile)
    name <- readLines(con=myfile, n=1)
    varnames <- scan( myfile, what=character(0), nlines=1 )

    stopifnot( length(varnames) == 21 )
    Paises <- matrix(0, 21, 21)
    colnames(Paises) <- varnames
    rownames(Paises) <- varnames
    for (i in 1:21) {
        temp <- scan(myfile, what=list("a", rep(0,22-i) ), nlines=1, sep="")
        Paises[i, i:21] <- temp[[2]]
    }

I get the following result:

    > source("Paises.R", echo=TRUE)

    > myfile <- file("Paises.dat", open = "r")

    > name <- readLines(con = myfile, n = 1)

    > varnames <- scan(myfile, what = character(0), nlines = 1)
    Read 21 items

    > stopifnot(length(varnames) == 21)

    > Paises <- matrix(0, 21, 21)

    > colnames(Paises) <- varnames

    > rownames(Paises) <- varnames

    > for (i in 1:21) {
          temp <- scan(myfile, what = list("a", rep(0, 22 - i)), nlines = 1,
              sep = "")
          Paises[i, i:21] <- temp[[2]]
      }
    Read 11 records
    Error in "[<-"(`*tmp*`, i, i:21, value = temp[[2]]) :
            number of items to replace is not a multiple of replacement length
    > i
    [1] 1
    > temp
    [[1]]
     [1] "Bolivia" "1" "2" "3" "2" "4" "4"
     [8] "6" "6" "8" "8"

    [[2]]
     [1] 0 2 3 2 3 5 5 6 6 7 8

    >

While I am asking for only one character string, multiple items are read as strings! What is happening?

Kjetil Halvorsen
Kjetil -

Frankly, your file would be much, much easier to read if it didn't have a row name at the beginning of each line. Any chance you can edit it to remove those? Then, I think you could read in the numeric data with just one call to scan:

    mat <- matrix(0, 21, 21)
    # skip the title line and the line of column names
    mat[row(mat) >= col(mat)] <- scan("filename", skip=2)
    Paises <- t(mat)

(R fills a matrix column by column, so the values stored row-wise in the file land in the lower triangle; in effect the upper-triangular matrix is read in transposed. I transpose it back to upper-triangular form when assigning it to 'Paises'.)

Last, you will need to read in the column names and assign these to both dimnames of Paises, but it's clear that you already know how to do that.

Seems to me that you're going through a great deal of unnecessary gyrations by trying to use a "connection" rather than just passing the literal filename to scan(). I've never understood why people do that.

- tom blackwell - u michigan medical school - ann arbor -

On Mon, 10 Nov 2003 kjetil at entelnet.bo wrote:

> Hola!
>
> I have data in the form of a symmetric distance matrix, in the file I
> have recorded only the upper triangular part, with diagonal.
> [...]
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://www.stat.math.ethz.ch/mailman/listinfo/r-help
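The fill-then-transpose idea in Tom Blackwell's reply can be tried on a small example. The following is only a sketch: the 3x3 toy data, the names `txt`, `vals`, `upper` and `full`, and the use of `scan(text = ...)` in place of a real file are assumptions made for illustration, and the layout (one title line, one line of column names, no row labels) is the edited file format the reply supposes.

```r
# Toy stand-in for the 21x21 file (invented data): a title line, a line of
# column names, then the upper triangle with diagonal, one matrix row per line.
txt <- "Toy distances
A B C
0 1 2
0 3
0"

n <- 3
varnames <- scan(text = txt, what = character(0), skip = 1, nlines = 1,
                 quiet = TRUE)
vals <- scan(text = txt, skip = 2, quiet = TRUE)   # all numbers, row by row
stopifnot(length(varnames) == n, length(vals) == n * (n + 1) / 2)

# R fills column by column, so the row-wise values land in the lower
# triangle; transposing moves them to the upper triangle.
mat <- matrix(0, n, n)
mat[row(mat) >= col(mat)] <- vals
upper <- t(mat)
dimnames(upper) <- list(varnames, varnames)

# Optional: rebuild the full symmetric matrix from the upper triangle.
full <- upper + t(upper) - diag(diag(upper))
print(upper)
print(full)
```

For the real file, `n` would be 21 and `scan()` would be given the file name instead of `text = txt`.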
You supplied "what" as a list of length 2, which is not what you intended. I presume you read every other item on the first line.

Try list("a", rep(list(0), 22-i))

I would have read the whole of the matrix as a character vector, removed the items corresponding to the labels, converted to numeric.

On Mon, 10 Nov 2003 kjetil at entelnet.bo wrote:

> [...]
> While I am asking only for one character string, multiple items are
> read as strings! What is happening?

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
On Tue, 11 Nov 2003, Prof Brian Ripley wrote:

> You supplied "what" as a list of length 2, which is not what you intended.
> I presume you read every other item on the first line.
>
> Try list("a", rep(list(0), 22-i))

Sorry, that got mangled when my cable modem went down. I tested

    c(list("a"), rep(list(0), 22-i))

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
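Both of Prof Ripley's suggestions can be checked on toy input. This is a sketch under assumptions: the one-line record, the small 3x3 file, and names such as `what_bad`, `what_good`, `tokens` and `label_pos` are invented for illustration, and `scan(text = ...)` stands in for reading from the connection.

```r
# --- 1. Why the original 'what' misbehaves, and the corrected form ---------
line <- "Bolivia 0 1 2"   # one label followed by k numbers (invented data)
k <- 3

# A list of length 2: scan() sees two fields per record (character, numeric),
# so items are dealt out alternately to the two components.
what_bad <- list("a", rep(0, k))
bad <- scan(text = line, what = what_bad, quiet = TRUE)

# A list of length k + 1: one character field, then k numeric fields.
what_good <- c(list("a"), rep(list(0), k))
good <- scan(text = line, what = what_good, quiet = TRUE)
label  <- good[[1]]
values <- unname(unlist(good[-1]))

# --- 2. Alternative: read everything as character, then drop the labels ----
txt <- "Toy distances
A B C
A 0 1 2
B 0 3
C 0"
n <- 3
varnames <- scan(text = txt, what = character(0), skip = 1, nlines = 1,
                 quiet = TRUE)
tokens <- scan(text = txt, what = character(0), skip = 2, quiet = TRUE)

# Line i holds one label plus n - i + 1 numbers, which fixes where the
# labels sit in the token vector.
row_len   <- (n:1) + 1
label_pos <- cumsum(c(1, head(row_len, -1)))
stopifnot(identical(tokens[label_pos], varnames))
vals <- as.numeric(tokens[-label_pos])

upper <- matrix(0, n, n, dimnames = list(varnames, varnames))
upper[lower.tri(upper, diag = TRUE)] <- vals   # column-wise fill ...
upper <- t(upper)                              # ... so transpose to upper
print(upper)
```

In the original loop, the corrected `what` means the numbers for row `i` arrive as separate list components, so the assignment would become something like `Paises[i, i:21] <- unlist(temp[-1])` rather than `temp[[2]]`.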