thr3ads.net - R help - [R] reading in a tricky computer program output [Feb 2006]

Taka Matzmoto

2006-Feb-05 07:27 UTC

[R] reading in a tricky computer program output

Hi R users,

I need to read in some values from a computer program output.

I can't change the output format because the developer of the program 
does not allow the output format to be changed.

There are two formats.

The first one looks like this when I have 10 variables:

------------------------------------------------------------------------------------------------------
      [1]           [2]           [3]           [4]           [5]
[ 1]  0.000
[ 2]  0.001         0.000
[ 3] -0.002         0.019         0.000
[ 4]  0.012        -0.004        -0.020         0.000
[ 5] -0.015         0.003         0.011         0.008         0.000
[ 6]  0.005        -0.008        -0.005         0.002         0.005
[ 7]  0.008        -0.007         0.013         0.003         0.007
[ 8] -0.014        -0.011        -0.010        -0.025         0.002
[ 9]  0.006         0.003        -0.010         0.002        -0.020
[10]  0.006         0.010        -0.006         0.005         0.008
[ 6]  0.000
[ 7] -0.037         0.000
[ 8]  0.010         0.027         0.000
[ 9]  0.032        -0.004         0.008         0.000
[10] -0.008        -0.011         0.015        -0.020         0.000

------------------------------------------------------------------------------------------------
NOTE: I added the [number] labels to show that this output is like a lower 
triangular matrix including the diagonal. The actual output contains no 
[number] labels.


The second format looks like this
--------------------------------------------------------------------------------------
      [1]           [2]           [3]           [4]           [5]
[ 2] -0.002
[ 3]  0.003        -0.053
[ 4] -0.026         0.010         0.045
[ 5]  0.023        -0.008        -0.025        -0.016
[ 6] -0.012         0.023         0.013        -0.005        -0.011
[ 7] -0.031         0.031        -0.054        -0.013        -0.027
[ 8]  0.040         0.042         0.031         0.075        -0.007
[ 9] -0.012        -0.009         0.023        -0.005         0.037
[10] -0.013        -0.027         0.014        -0.013        -0.020
[ 7]  0.127
[ 8] -0.035        -0.166
[ 9] -0.083         0.015        -0.027
[10]  0.021         0.047        -0.052         0.048
---------------------------------------------------------------------------------------------------------
NOTE: I added the [number] labels to show that this output is like a lower 
triangular matrix without the diagonal. The actual output contains no 
[number] labels.

The problem with this format is the fixed width of five columns.

To make matters worse, the number of variables keeps changing (10, 20, 30, 
40, 50, 60, 70, 80, 90, and 100), so I need to take the number of variables 
into account when I write an R function to read in these numbers.

If the number of variables is 80, the output is very long.

I have only come up with this tedious approach.

First I read in the output using scan(), which gives a numeric vector.

I created 10 character vectors. Creating the character vector for 100 
variables is the most boring thing I have ever done.

The character vector that matches the first-format output for 10 variables 
is

first.10<-c(
            "i.001.001",
            "i.002.001","i.002.002",
            "i.003.001","i.003.002","i.003.003",
            "i.004.001","i.004.002","i.004.003","i.004.004",
            "i.005.001","i.005.002","i.005.003","i.005.004","i.005.005",
            "i.006.001","i.006.002","i.006.003","i.006.004","i.006.005",
            "i.007.001","i.007.002","i.007.003","i.007.004","i.007.005",
            "i.008.001","i.008.002","i.008.003","i.008.004","i.008.005",
            "i.009.001","i.009.002","i.009.003","i.009.004","i.009.005",
            "i.010.001","i.010.002","i.010.003","i.010.004","i.010.005",
            "i.006.006",
            "i.007.006","i.007.007",
            "i.008.006","i.008.007","i.008.008",
            "i.009.006","i.009.007","i.009.008","i.009.009",
            "i.010.006","i.010.007","i.010.008","i.010.009","i.010.010"
           )

The character vector that matches the second-format output for 10 variables 
is

second.10<-c(
            "i.002.001",
            "i.003.001","i.003.002",
            "i.004.001","i.004.002","i.004.003",
            "i.005.001","i.005.002","i.005.003","i.005.004",
            "i.006.001","i.006.002","i.006.003","i.006.004","i.006.005",
            "i.007.001","i.007.002","i.007.003","i.007.004","i.007.005",
            "i.008.001","i.008.002","i.008.003","i.008.004","i.008.005",
            "i.009.001","i.009.002","i.009.003","i.009.004","i.009.005",
            "i.010.001","i.010.002","i.010.003","i.010.004","i.010.005",
            "i.007.006",
            "i.008.006","i.008.007",
            "i.009.006","i.009.007","i.009.008",
            "i.010.006","i.010.007","i.010.008","i.010.009"
           )

and then assign the character vector to the numeric vector by

names<-first.10
first.10 = numeric.vector
combined.one <- cbind(names,first.10)
container <- diag(10)
for (i in 1:(10*10))
    {
        k   <- as.numeric(substr(combined.one[i,1],7,9))
        l   <- as.numeric(substr(combined.one [i,1],3,5))
        val <- as.numeric(combined.one [i,2])
        container [k,l] <- val
    }

container <- t(container )

Is there any other neat way to do this?

Any help would be appreciated

TM
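
On the narrower question of building the label vectors without typing them
by hand: the labels simply follow the print order of the output, so they can
be generated. What follows is a minimal sketch, assuming the layout shown
above (blocks of five columns, each block starting at its first non-empty
row) and at least two variables. The name block.labels and its arguments are
hypothetical and do not come from the thread.

```r
# Hypothetical helper: build the "i.RRR.CCC" labels in print order.
# nvar  = number of variables, diag = TRUE if the diagonal is printed,
# width = number of columns per printed block (5 in the posted output).
block.labels <- function(nvar, diag = TRUE, width = 5) {
  last.col <- if (diag) nvar else nvar - 1   # last column that is printed
  first.off <- if (diag) 0 else 1            # row offset of a block's first row
  labels <- character(0)
  for (start in seq(1, last.col, by = width)) {
    cols <- start:min(start + width - 1, last.col)
    for (r in (start + first.off):nvar) {
      # keep only the columns that lie in the lower triangle for row r
      keep <- if (diag) cols[cols <= r] else cols[cols < r]
      labels <- c(labels, sprintf("i.%03d.%03d", r, keep))
    }
  }
  labels
}

first.10  <- block.labels(10, diag = TRUE)    # 55 labels
second.10 <- block.labels(10, diag = FALSE)   # 45 labels
```

With labels generated this way, the filling loop would run over
seq_along(first.10) rather than over a fixed count.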

Berwin A Turlach

2006-Feb-05 08:51 UTC

[R] reading in a tricky computer program output

G'day Taka,
>>>>> "TM" == Taka Matzmoto <sell_mirage_ne at hotmail.com> writes:
    TM> and then assign the character vector to the numeric vector by

    TM> names<-first.10
    TM> first.10 = numeric.vector
    TM> combined.one <- cbind(names,first.10)
    TM> container <- diag(10)
    TM> for (i in 1:(10*10))
I don't really understand this loop.  If I reverse-engineer this code
correctly, then the matrix `combined.one' is not a 100*2 matrix, so you
should get an error while executing this loop.

    TM> Is there any other neat way to do this?
Neat way to create those character vectors?  Or a neat way to read in
the data from a file?

If the latter, I would use the following code:

      matzmoto <- function(file, diag=TRUE){
      
        dat <- scan(file)
        if(diag){
          nvar <- sqrt(2*length(dat)+0.25) - 0.5
          nn <- nvar
        }else{
          nvar <- sqrt(2*length(dat)+0.25) + 0.5
          nn <- nvar - 1
        }
        res <- matrix(0,nvar,nvar)
        ind <- upper.tri(res, diag=diag)
      
        rind <- 1:5
        while(nn > 0){
          if( nn < 5 ){
            rind <- rind[1:nn]
            tmp <- matrix(0,nn,nvar)
          }else{
            tmp <- matrix(0,5,nvar)
          }
      
          how.many <- sum(ind[rind,])
          tmp[ind[rind,]] <- dat[1:how.many]
          res[rind,] <- tmp
      
          dat <- dat[-(1:how.many)]
      
          rind <- rind + 5
          nn <- nn - 5
        }
        t(res)
      }
      
      res <- matzmoto("matzmoto1.dat", TRUE)
      print(res)
      
      res <- matzmoto("matzmoto2.dat", FALSE)
      print(res)


After storing the two examples that you posted into the files
matzmoto1.dat and matzmoto2.dat, respectively, and removing the part
that you said you have added, I get the following result on my machine
when sourcing the above code:

      > source("matzmoto.R")
      Read 55 items
              [,1]   [,2]   [,3]   [,4]   [,5]   [,6]   [,7]  [,8]  [,9] [,10]
       [1,]  0.000  0.000  0.000  0.000  0.000  0.000  0.000 0.000  0.00     0
       [2,]  0.001  0.000  0.000  0.000  0.000  0.000  0.000 0.000  0.00     0
       [3,] -0.002  0.019  0.000  0.000  0.000  0.000  0.000 0.000  0.00     0
       [4,]  0.012 -0.004 -0.020  0.000  0.000  0.000  0.000 0.000  0.00     0
       [5,] -0.015  0.003  0.011  0.008  0.000  0.000  0.000 0.000  0.00     0
       [6,]  0.005 -0.008 -0.005  0.002  0.005  0.000  0.000 0.000  0.00     0
       [7,]  0.008 -0.007  0.013  0.003  0.007 -0.037  0.000 0.000  0.00     0
       [8,] -0.014 -0.011 -0.010 -0.025  0.002  0.010  0.027 0.000  0.00     0
       [9,]  0.006  0.003 -0.010  0.002 -0.020  0.032 -0.004 0.008  0.00     0
      [10,]  0.006  0.010 -0.006  0.005  0.008 -0.008 -0.011 0.015 -0.02     0
      Read 45 items
              [,1]   [,2]   [,3]   [,4]   [,5]   [,6]   [,7]   [,8]  [,9] [,10]
       [1,]  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000 0.000     0
       [2,] -0.002  0.000  0.000  0.000  0.000  0.000  0.000  0.000 0.000     0
       [3,]  0.003 -0.053  0.000  0.000  0.000  0.000  0.000  0.000 0.000     0
       [4,] -0.026  0.010  0.045  0.000  0.000  0.000  0.000  0.000 0.000     0
       [5,]  0.023 -0.008 -0.025 -0.016  0.000  0.000  0.000  0.000 0.000     0
       [6,] -0.012  0.023  0.013 -0.005 -0.011  0.000  0.000  0.000 0.000     0
       [7,] -0.031  0.031 -0.054 -0.013 -0.027  0.127  0.000  0.000 0.000     0
       [8,]  0.040  0.042  0.031  0.075 -0.007 -0.035 -0.166  0.000 0.000     0
       [9,] -0.012 -0.009  0.023 -0.005  0.037 -0.083  0.015 -0.027 0.000     0
      [10,] -0.013 -0.027  0.014 -0.013 -0.020  0.021  0.047 -0.052 0.048     0

The documentation of the function is quite simple.  Just pass the name
of the file that contains the output and say whether or not the file
includes the diagonal.  If you don't trust the calculation of how many
variables are involved, then you might want to change the function so
that this is another input parameter.
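
For readers who want to see where that calculation comes from: with the
diagonal, n variables give n*(n+1)/2 values, and without it n*(n-1)/2, so n
can be recovered from the number of values read. A small sketch of that step
with an added sanity check is below. The name n.vars is hypothetical, and the
check is an addition that is not part of the function above.

```r
# Hypothetical helper: number of variables implied by the count of values.
#   with diagonal:    len = n * (n + 1) / 2  =>  n = sqrt(2 * len + 0.25) - 0.5
#   without diagonal: len = n * (n - 1) / 2  =>  n = sqrt(2 * len + 0.25) + 0.5
n.vars <- function(len, diag = TRUE) {
  n <- sqrt(2 * len + 0.25) + if (diag) -0.5 else 0.5
  # A non-integer result means the file is not a complete triangle.
  if (abs(n - round(n)) > 1e-8)
    stop("number of values does not match a triangular matrix")
  as.integer(round(n))
}

n.vars(55)                 # first format, 55 values read
n.vars(45, diag = FALSE)   # second format, 45 values read
```

Both calls should give 10 for the two posted examples, matching the
"Read 55 items" and "Read 45 items" lines in the output above.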

HTH.

Cheers,

        Berwin

Gabor Grothendieck

2006-Feb-05 15:29 UTC

[R] reading in a tricky computer program output

It's not clear to me what format you want to put the data in but this
will read it into a list, one list element per lower triangular matrix.
Modify to suit.

DF <- read.table("myfile.dat", fill = TRUE)
id <- cumsum(is.na(DF[,2]))
result <- by(DF, id, as.matrix)

# if the input is in the second format add this line after the above
result2 <- lapply(result, function(x) rbind(NA, x))
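
If a single square matrix is wanted, the blocks can be placed back into
position. The following is only a sketch for the first format (diagonal
included), assuming five columns per block and a known number of variables.
It uses split() in place of by(), writes the posted 10-variable example to a
temporary file so that it is self-contained, and the names blocks and full
are illustrative.

```r
# The posted first-format example, without the [number] labels.
txt <- c("0.000",
         "0.001 0.000",
         "-0.002 0.019 0.000",
         "0.012 -0.004 -0.020 0.000",
         "-0.015 0.003 0.011 0.008 0.000",
         "0.005 -0.008 -0.005 0.002 0.005",
         "0.008 -0.007 0.013 0.003 0.007",
         "-0.014 -0.011 -0.010 -0.025 0.002",
         "0.006 0.003 -0.010 0.002 -0.020",
         "0.006 0.010 -0.006 0.005 0.008",
         "0.000",
         "-0.037 0.000",
         "0.010 0.027 0.000",
         "0.032 -0.004 0.008 0.000",
         "-0.008 -0.011 0.015 -0.020 0.000")
f <- tempfile()
writeLines(txt, f)

DF <- read.table(f, fill = TRUE)
id <- cumsum(is.na(DF[, 2]))   # a row with a single value starts a new block
blocks <- split(DF, id)

nvar <- 10                     # assumed to be known here
full <- matrix(0, nvar, nvar)
for (k in seq_along(blocks)) {
  b <- as.matrix(blocks[[k]])
  b[is.na(b)] <- 0                           # padding above the diagonal
  rows <- (nvar - nrow(b) + 1):nvar          # each block runs to the last row
  cols <- (5 * (k - 1) + 1):min(5 * k, nvar)
  full[rows, cols] <- b[, seq_along(cols)]
}
print(full)
```

For the second format the row and column offsets differ by one, so the
index arithmetic would need adjusting.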



jim holtman

2006-Feb-06 03:13 UTC

[R] reading in a tricky computer program output

Is this what you want?  You can use 'scan' to read in and 'fill' out data in
a row.

> x <- scan('/temp/document1.txt', what=list(0, 0, 0, 0, 0), fill=T, multi.line=F)
Read 15 records
> x <- do.call('rbind', x)  # create a matrix
> x
     [,1]  [,2]   [,3]   [,4]   [,5]   [,6]   [,7]   [,8]   [,9]  [,10] [,11]  [,12] [,13]  [,14]
[1,]    0 0.001 -0.002  0.012 -0.015  0.005  0.008 -0.014  0.006  0.006     0 -0.037 0.010  0.032
[2,]   NA 0.000  0.019 -0.004  0.003 -0.008 -0.007 -0.011  0.003  0.010    NA  0.000 0.027 -0.004
[3,]   NA    NA  0.000 -0.020  0.011 -0.005  0.013 -0.010 -0.010 -0.006    NA     NA 0.000  0.008
[4,]   NA    NA     NA  0.000  0.008  0.002  0.003 -0.025  0.002  0.005    NA     NA    NA  0.000
[5,]   NA    NA     NA     NA  0.000  0.005  0.007  0.002 -0.020  0.008    NA     NA    NA     NA
      [,15]
[1,] -0.008
[2,] -0.011
[3,]  0.015
[4,] -0.020
[5,]  0.000
> t(x)  # transpose to get in the order you want
        [,1]   [,2]   [,3]   [,4]   [,5]
 [1,]  0.000     NA     NA     NA     NA
 [2,]  0.001  0.000     NA     NA     NA
 [3,] -0.002  0.019  0.000     NA     NA
 [4,]  0.012 -0.004 -0.020  0.000     NA
 [5,] -0.015  0.003  0.011  0.008  0.000
 [6,]  0.005 -0.008 -0.005  0.002  0.005
 [7,]  0.008 -0.007  0.013  0.003  0.007
 [8,] -0.014 -0.011 -0.010 -0.025  0.002
 [9,]  0.006  0.003 -0.010  0.002 -0.020
[10,]  0.006  0.010 -0.006  0.005  0.008
[11,]  0.000     NA     NA     NA     NA
[12,] -0.037  0.000     NA     NA     NA
[13,]  0.010  0.027  0.000     NA     NA
[14,]  0.032 -0.004  0.008  0.000     NA
[15,] -0.008 -0.011  0.015 -0.020  0.000




--
Jim Holtman
Cincinnati, OH
+1 513 247 0281

What the problem you are trying to solve?
