thr3ads.net - R help - [R] how to read this kind of csv in R？ [Oct 2019]

vod vos

2019-Oct-06 11:29 UTC

[R] how to read this kind of csv in R？

I have hundreds of csv files. The actual layout of each file is as follows:

aa(cm)
1, 2 , 3,

bb(mm)
1, 2, 3,
4, 5, 6,
7, 8, 9,

cc(mm)
3, 4, 5,
7, 5, 9,
6, 5, 8,

How can I use read.table or read.csv to convert the csv files
to a tidy data frame like the following:

aa, bb, cc
1, 1, 3
1, 2, 4
1, 3, 5
2, 4, 7
2, 5, 5
2, 6, 9
3, 7, 6
3, 8, 5
3, 9, 8

many thanks.

Duncan Murdoch

2019-Oct-06 12:08 UTC

You'll need more than those two functions to do the transformation you 
want.  To work out what you need, write out the process in detail in 
English (or another natural language), not in code.  For example:

1.  Read aa from file 1.
2.  Read bb from file 2.
3.  Read cc from file 3.
4.  Expand all vectors to the same length.
5.  Combine them into a single dataframe.

Then work out each step separately.  I think you'll want to use
something like scan("filename", skip = 1, sep = ",") in steps 1, 2,
and 3, but this will add NA values at the end of each line because of
the final comma, so you could do this:

aa <- scan("file1", skip = 1, sep = ",")
aa <- aa[!is.na(aa)]

and similarly for the others.
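
A minimal sketch of the trailing-comma clean-up, using scan()'s text argument instead of a file so that it runs on its own. The input string is made up to mirror the aa line of the example.

```r
# One data line from the example, with its trailing comma.
line <- "1, 2 , 3,"

# With sep = ",", the empty field after the last comma can come back as NA.
aa <- scan(text = line, sep = ",", quiet = TRUE)

# Drop any NA values introduced by the trailing comma.
aa <- aa[!is.na(aa)]
```

After this, aa holds only the three numbers from the line.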

I don't know the rules for expanding that you'll need in your real data,
but for your example step 4 could be

   aa <- rep(aa, each = 3)

Then step 5 could be

   result <- data.frame(aa, bb, cc)
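
One possible expansion rule for step 4, written out as a sketch. It rests on an assumption the thread does not confirm: each aa value belongs to one row of bb, so it is repeated once per value in that row. The vectors are typed in by hand to stand for the result of steps 1 to 3.

```r
# Values as they would look after steps 1-3 on the example data.
aa <- c(1, 2, 3)
bb <- c(1, 2, 3, 4, 5, 6, 7, 8, 9)  # read row by row
cc <- c(3, 4, 5, 7, 5, 9, 6, 5, 8)

# Assumed rule: bb holds one row per aa value, so the repeat count is
# the number of bb values per aa value (3 in the example).
stopifnot(length(bb) %% length(aa) == 0, length(bb) == length(cc))
times <- length(bb) / length(aa)

aa_long <- rep(aa, each = times)
result <- data.frame(aa = aa_long, bb = bb, cc = cc)
```

Computing the repeat count from the lengths avoids hard-coding 3, which only fits the posted example.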

Duncan Murdoch

vod vos

2019-Oct-06 12:23 UTC


The problem is that aa, bb and cc are all in a single csv file,
which contains no blank lines.
The file looks like the printed output of a list:

aa(cm)
 1, 2 , 3,
 bb(mm)
  1, 2, 3,
 4, 5, 6,
 7, 8, 9,
 cc(mm)
 3, 4, 5,
 7, 5, 9,
 6, 5, 8,




Rui Barradas

2019-Oct-06 14:58 UTC


Hello,

It is not clear if all files have

* a first block with just one data line
* all other blocks with as many rows as the numbers in that first data line.

If yes, maybe something like this?

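# Read the file, drop empty lines and the trailing comma on each line
lns <- readLines("strange.csv")
lns <- lns[nzchar(lns)]
lns <- sub(",$", "", lns)
# Title lines are the ones that contain letters
i_title <- grep("[[:alpha:]]", lns)

# For each title, collect the data lines up to the next title (or EOF)
tmp <- lapply(seq_along(i_title), function(i){
   blk <- if(i < length(i_title)){
     lns[(i_title[i] + 1):(i_title[i + 1] - 1)]
   }else{
     lns[(i_title[i] + 1):length(lns)]
   }
   list(n = length(blk), text = unlist(strsplit(blk, ",")))
})

# n is the largest number of data lines in any block
n <- max(sapply(tmp, '[[', 'n'))
tmp <- lapply(tmp, function(x) as.numeric(x$text))
# Repeat the first block to match the others (fits the 3 x 3 example)
tmp[[1]] <- rep(tmp[[1]], each = n)
res <- do.call(cbind.data.frame, tmp)
names(res) <- lns[i_title]
res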


If you have hundreds of files, you should make a function out of the 
code above.
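
A rough sketch of what that could look like. The function name read_blocks, the temporary directory and the two demo files are all made up for illustration. The repeat count is taken from the block lengths, which is an assumption about the real files: the first block has one value per row of the others.

```r
# Hypothetical wrapper around the steps above; the name read_blocks is
# not from the thread.
read_blocks <- function(path) {
  lns <- readLines(path)
  lns <- lns[nzchar(trimws(lns))]
  lns <- sub(",[[:space:]]*$", "", lns)
  i_title <- grep("[[:alpha:]]", lns)
  # Data lines of block i run from just after its title to just before
  # the next title (or to the end of the file for the last block).
  ends <- c(i_title[-1] - 1, length(lns))
  blocks <- lapply(seq_along(i_title), function(i) {
    as.numeric(unlist(strsplit(lns[(i_title[i] + 1):ends[i]], ",")))
  })
  # Assumption: the first block has one value per row of the others.
  times <- length(blocks[[2]]) / length(blocks[[1]])
  blocks[[1]] <- rep(blocks[[1]], each = times)
  res <- do.call(cbind.data.frame, blocks)
  names(res) <- trimws(lns[i_title])
  res
}

# Demo: two small files in a temporary directory, in the posted layout.
dir <- tempfile("csvdemo")
dir.create(dir)
sample_lines <- c("aa(cm)", "1, 2 , 3,", "bb(mm)", "1, 2, 3,", "4, 5, 6,",
                  "7, 8, 9,", "cc(mm)", "3, 4, 5,", "7, 5, 9,", "6, 5, 8,")
writeLines(sample_lines, file.path(dir, "a.csv"))
writeLines(sample_lines, file.path(dir, "b.csv"))

# Apply the function to every csv file and keep the results in a named list.
files <- list.files(dir, pattern = "\\.csv$", full.names = TRUE)
all_data <- lapply(files, read_blocks)
names(all_data) <- basename(files)
```

Keeping one data frame per file in a named list makes it easy to spot which file a problem came from before combining anything.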

Hope this helps,

Rui Barradas

______________________________________________
R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

vod vos

2019-Oct-07 08:23 UTC


The csv file was exported from Windows (DOS format), so its line endings
(CRLF) differ from the Unix ones (LF).
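
A small sketch of how the Windows line endings could be handled. According to its help page, readLines() accepts LF, CRLF or CR as the end-of-line marker, so the explicit sub() below is only a defensive extra. The file content is made up.

```r
# Write a small file with Windows (CRLF) line endings, byte for byte.
path <- tempfile(fileext = ".csv")
writeBin(charToRaw("aa(cm)\r\n1, 2 , 3,\r\nbb(mm)\r\n1, 2, 3,\r\n"), path)

lns <- readLines(path)
# Defensive: remove a stray carriage return if one survived.
lns <- sub("\r$", "", lns)
```

If the lines still look wrong after this, the cause is probably something other than the line breaks, for example the field separator.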


 ---- On 07 Oct 2019 01:18:54 -0700, <vodvos at zoho.com> wrote ----
 > Importing this strange csv format is driving me mad.
 > 
 > The real csv has been attached now. The raw data points are huge.
 > 
 > Many thanks.

Rui Barradas

2019-Oct-07 16:55 UTC


Hello,

OK, I had some spare time. Try



readCSVFile <- function(filename){
   # Read the file, drop empty lines, blanks and the trailing ";"
   lns <- readLines(filename)
   lns <- lns[nzchar(lns)]
   lns <- gsub(" ", "", lns)
   lns <- sub(";$", "", lns)
   # Title lines are the ones that contain letters
   i_title <- grep("[[:alpha:]]", lns)

   # Data lines of every block except the first
   blocks <- lapply(seq_along(i_title)[-1], function(i){
     j <- i_title[i] + 1
     k <- if(i == length(i_title)) length(lns) else i_title[i + 1] - 1
     lns[j:k]
   })

   # n is the number of fields per data line in the second block;
   # each value of the first block is repeated n times
   n <- length(unlist(strsplit(blocks[[1]][1], ";")))
   first <- unlist(strsplit(lns[i_title[1] + 1], ";"))
   first <- as.numeric(first)
   first <- rep(first, each = n)

   # Split the remaining blocks into fields (kept as character)
   blocks <- lapply(blocks, function(x){
     unlist(strsplit(x, ";"))
   })
   res <- do.call(cbind.data.frame, blocks)
   res <- cbind.data.frame(first, res)

   # Column names are the titles without the unit in square brackets
   names(res) <- sub("\\[.*\\]$", "", lns[i_title])
   res
}

df1 <- readCSVFile("strange.csv")
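
To make the two regular expressions in the function easier to follow, here is a short sketch on made-up lines. The attached file is not shown in the thread, so the layout below (units in square brackets, ";" as separator) is only inferred from the patterns the function uses.

```r
# Made-up lines in the layout the function appears to expect:
# titles with a unit in square brackets, fields separated by ";".
lns <- c("aa[cm]", "1;2;3;", "bb[mm]", "1;2;3;", "4;5;6;")

# Title lines are the ones containing at least one letter.
i_title <- grep("[[:alpha:]]", lns)

# Strip the bracketed unit to get clean column names.
col_names <- sub("\\[.*\\]$", "", lns[i_title])

# Caveat: a data line in scientific notation also contains a letter,
# so the same pattern would mistake it for a title.
looks_like_title <- grepl("[[:alpha:]]", "1e-5;2;3;")
```

If the real data can contain values such as 1e-5, a stricter title pattern would be needed.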


If this function doesn't do it, please make an effort on your own.
R-Help is not a code-writing service; it's a mailing list for
*questions* about R code.

Hope this helps,

Rui Barradas

