thr3ads.net - R help - [R] R - Reading a horizontally arranged csv file, replacing values and saving again [Oct 2018]

If this information is useful, please help other people find it:
Share via:

Manoranjan Muthusamy

2018-Oct-03 10:05 UTC

[R] R - Reading a horizontally arranged csv file, replacing values and saving again

Thanks for the short but informative answer, Bill. But still, each row has
four columns..right? Although the *NA* is replaced by a blank cell,
because of the extra comma it still is a four column row. Is there any way
to avoid/remove the extra comma when NA is replaced which will make it a
three column row?



On Mon, Oct 1, 2018 at 4:56 PM William Dunlap <wdunlap at tibco.com>
wrote:
> Read the help file for write.table
>    ?write.table
> and look at the descriptions of its arguments.
>   > df <-
data.frame(Text=c("Abe","Ben",NA,"David"),
Age=c(19, NA, 12, 10))
>   > write.table(file=stdout(), t(df), sep=",")
>   "V1","V2","V3","V4"
>   "Text","Abe","Ben",NA,"David"
>   "Age","19",NA,"12","10"
>   > write.table(file=stdout(), t(df), quote=FALSE, na="",
sep=",")
>   V1,V2,V3,V4
>   Text,Abe,Ben,,David
>   Age,19,,12,10
>
>
>
>
> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com
>
> On Mon, Oct 1, 2018 at 2:24 AM, Manoranjan Muthusamy <
> ranjanmano167 at gmail.com> wrote:
>
>> I have a horizontally arranged CSV file
>> <https://www.dropbox.com/s/0ukyuifvpq1olqk/samplefile.csv?dl=0> (
>> samplefile.csv) with headers are in the first column. Also, each row
has a
>> different number of columns. I want to read this CSV file, replace one
of
>> the cell value and save again as a CSV file with the same format as the
>> original file with exactly same number of columns and rows. It sounds
like
>> a simple task, but I am struggling to find a way. I tried to do this
with
>> the help of this
>> <
>>
https://stackoverflow.com/questions/17288197/reading-a-csv-file-organized-horizontally
>> >
>>  and this
>> <
>>
https://stackoverflow.com/questions/20308621/how-do-i-write-a-csv-file-in-r-where-my-input-is-written-to-the-file-as-row
>> >
>> posts,
>> but still, I can't get the output the way I want. Can somebody help
me
>> here?
>>
>> My attempt using the answer in this post
>> <
>>
https://stackoverflow.com/questions/17288197/reading-a-csv-file-organized-horizontally
>> >
>> to
>> read the CSV file (samplefile.csv) gives me the following output where
>> headers are kinda messed up and empty cells are replaced with NA values
>> which is not what I want
>>
>> aaa <- read.tcsv("samplefile.csv")
>> aaa
>>
>>      Header.1 Header.2..units. Header.3..units. Header.3..units..11
>> Some text           0.0625                0             2648962
>> <NA>           0.0625             1200             6647473       
<NA>
>>           0.0625             1380                  14        <NA>
>>      0.2500             1500                  15        <NA>
>> 0.6250             1620                 NA6        <NA>
>> 1.3125             1740                 NA7        <NA>
>> 2.4375             1860                 NA8        <NA>
>> 3.5625             1980                 NA9        <NA>
>> 4.6250             2100                 NA10       <NA>
>> 5.0000             2220                 NA11       <NA>
>> 5.0000             2340                 NA12       <NA>
>> 4.6250             2460                 NA13       <NA>
>> 3.5625             2580                 NA14       <NA>
>> 2.4375             2700                 NA15       <NA>
>> 1.3125             2820                 NA16       <NA>
>> 0.6250             2940                 NA17       <NA>
>> 0.2500             3060                 NA18       <NA>
>> 0.0625             3180                 NA19       <NA>
>> 0.0000             3300                 NA20       <NA>
>> 0.0000            18000                 NA
>>
>> Also, I am not sure how to go back to original format when I save the
file
>> again after a modification (for example after replacing a cell value)
>>
>> I tried saving the file again by using t (transpose) as given below
>>
>> write.csv(t(aaa), file ="samplefile_e.csv", row.names=T)
>>
>> but still, there are following issues in the saved file
>>
>>    1. messed up headers
>>    2. empty cells are replaced with NA
>>    3. when I open the file in a text editor all the values are shown as
>>    characters
>>
>> Thanks,
>> Mano
>>
>>         [[alternative HTML version deleted]]
>>
>> ______________________________________________
>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
>
	[[alternative HTML version deleted]]

William Dunlap

2018-Oct-03 16:51 UTC

head link

[R] R - Reading a horizontally arranged csv file, replacing values and saving again

Using read.table to read data with a variable number of entries per row is
probably a mistake - data.frames (which read.table returns) are not meant
for this sort of data.  You want to store the data in R as a list with
names.  E.g.,

readYourFormat <- function(connection) {
    # connection is a file name or connection object referring
    # to text of the form
    # Header A,Some text
    # Header B,<number>,<number>,<number>
    # Header C,<number>
    # Header D,<number>,<number>,<number>,<number>
    lines <- readLines(connection)
    lines <- lines[ nchar(lines)>0 ] # omit empty lines
    fields <- strsplit(lines, ",") # split lines by commas into
fields
    headers <- vapply(fields, function(x)x[1], FUN.VALUE="") #
first entry
is header
    names(fields) <- headers
    contents <- lapply(fields, function(x)x[-1]) # remaining entries are
contents
    contents[-1] <- lapply(contents[-1], as.numeric) # convert all but
first line's entries to numeric
    contents
}

After reading the data and fiddling with it (but keeping it a named list),
you can write it out in a similar format with

writeYourFormat <- function (data, connection = stdout())
{
    stopifnot(is.list(data), !is.null(names(data)))
    cat(file = connection, sep = "\n", paste(sep = ",",
names(data),
        vapply(data, paste, collapse = ",", FUN.VALUE =
"")))
}


Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Wed, Oct 3, 2018 at 3:05 AM, Manoranjan Muthusamy <
ranjanmano167 at gmail.com> wrote:
> Thanks for the short but informative answer, Bill. But still, each row has
> four columns..right? Although the *NA* is replaced by a blank cell,
> because of the extra comma it still is a four column row. Is there any way
> to avoid/remove the extra comma when NA is replaced which will make it a
> three column row?
>
>
>
> On Mon, Oct 1, 2018 at 4:56 PM William Dunlap <wdunlap at tibco.com>
wrote:
>
>> Read the help file for write.table
>>    ?write.table
>> and look at the descriptions of its arguments.
>>   > df <-
data.frame(Text=c("Abe","Ben",NA,"David"),
Age=c(19, NA, 12,
>> 10))
>>   > write.table(file=stdout(), t(df), sep=",")
>>   "V1","V2","V3","V4"
>>   "Text","Abe","Ben",NA,"David"
>>   "Age","19",NA,"12","10"
>>   > write.table(file=stdout(), t(df), quote=FALSE, na="",
sep=",")
>>   V1,V2,V3,V4
>>   Text,Abe,Ben,,David
>>   Age,19,,12,10
>>
>>
>>
>>
>> Bill Dunlap
>> TIBCO Software
>> wdunlap tibco.com
>>
>> On Mon, Oct 1, 2018 at 2:24 AM, Manoranjan Muthusamy <
>> ranjanmano167 at gmail.com> wrote:
>>
>>> I have a horizontally arranged CSV file
>>>
<https://www.dropbox.com/s/0ukyuifvpq1olqk/samplefile.csv?dl=0> (
>>> samplefile.csv) with headers are in the first column. Also, each
row has
>>> a
>>> different number of columns. I want to read this CSV file, replace
one of
>>> the cell value and save again as a CSV file with the same format as
the
>>> original file with exactly same number of columns and rows. It
sounds
>>> like
>>> a simple task, but I am struggling to find a way. I tried to do
this with
>>> the help of this
>>> <https://stackoverflow.com/questions/17288197/reading-a-
>>> csv-file-organized-horizontally>
>>>  and this
>>> <https://stackoverflow.com/questions/20308621/how-do-i-
>>>
write-a-csv-file-in-r-where-my-input-is-written-to-the-file-as-row>
>>> posts,
>>> but still, I can't get the output the way I want. Can somebody
help me
>>> here?
>>>
>>> My attempt using the answer in this post
>>> <https://stackoverflow.com/questions/17288197/reading-a-
>>> csv-file-organized-horizontally>
>>> to
>>> read the CSV file (samplefile.csv) gives me the following output
where
>>> headers are kinda messed up and empty cells are replaced with NA
values
>>> which is not what I want
>>>
>>> aaa <- read.tcsv("samplefile.csv")
>>> aaa
>>>
>>>      Header.1 Header.2..units. Header.3..units. Header.3..units..11
>>> Some text           0.0625                0             2648962
>>> <NA>           0.0625             1200             6647473   
<NA>
>>>           0.0625             1380                  14       
<NA>
>>>      0.2500             1500                  15        <NA>
>>> 0.6250             1620                 NA6        <NA>
>>> 1.3125             1740                 NA7        <NA>
>>> 2.4375             1860                 NA8        <NA>
>>> 3.5625             1980                 NA9        <NA>
>>> 4.6250             2100                 NA10       <NA>
>>> 5.0000             2220                 NA11       <NA>
>>> 5.0000             2340                 NA12       <NA>
>>> 4.6250             2460                 NA13       <NA>
>>> 3.5625             2580                 NA14       <NA>
>>> 2.4375             2700                 NA15       <NA>
>>> 1.3125             2820                 NA16       <NA>
>>> 0.6250             2940                 NA17       <NA>
>>> 0.2500             3060                 NA18       <NA>
>>> 0.0625             3180                 NA19       <NA>
>>> 0.0000             3300                 NA20       <NA>
>>> 0.0000            18000                 NA
>>>
>>> Also, I am not sure how to go back to original format when I save
the
>>> file
>>> again after a modification (for example after replacing a cell
value)
>>>
>>> I tried saving the file again by using t (transpose) as given below
>>>
>>> write.csv(t(aaa), file ="samplefile_e.csv", row.names=T)
>>>
>>> but still, there are following issues in the saved file
>>>
>>>    1. messed up headers
>>>    2. empty cells are replaced with NA
>>>    3. when I open the file in a text editor all the values are
shown as
>>>    characters
>>>
>>> Thanks,
>>> Mano
>>>
>>>         [[alternative HTML version deleted]]
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more,
see
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/
>>> posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>
>>
	[[alternative HTML version deleted]]

Manoranjan Muthusamy

2018-Oct-04 16:03 UTC

head link

[R] R - Reading a horizontally arranged csv file, replacing values and saving again

That's exactly what I was looking for. Never used readLines before which is
why I was struggling.  Many thanks. I asked the same question in Stack
Overflow as well. But no answers yet. I will post the link to your answer.
Thanks again.

On Wed, Oct 3, 2018 at 5:51 PM William Dunlap <wdunlap at tibco.com>
wrote:
> Using read.table to read data with a variable number of entries per row is
> probably a mistake - data.frames (which read.table returns) are not meant
> for this sort of data.  You want to store the data in R as a list with
> names.  E.g.,
>
> readYourFormat <- function(connection) {
>     # connection is a file name or connection object referring
>     # to text of the form
>     # Header A,Some text
>     # Header B,<number>,<number>,<number>
>     # Header C,<number>
>     # Header D,<number>,<number>,<number>,<number>
>     lines <- readLines(connection)
>     lines <- lines[ nchar(lines)>0 ] # omit empty lines
>     fields <- strsplit(lines, ",") # split lines by commas
into fields
>     headers <- vapply(fields, function(x)x[1], FUN.VALUE="") #
first entry
> is header
>     names(fields) <- headers
>     contents <- lapply(fields, function(x)x[-1]) # remaining entries are
> contents
>     contents[-1] <- lapply(contents[-1], as.numeric) # convert all but
> first line's entries to numeric
>     contents
> }
>
> After reading the data and fiddling with it (but keeping it a named list),
> you can write it out in a similar format with
>
> writeYourFormat <- function (data, connection = stdout())
> {
>     stopifnot(is.list(data), !is.null(names(data)))
>     cat(file = connection, sep = "\n", paste(sep = ",",
names(data),
>         vapply(data, paste, collapse = ",", FUN.VALUE =
"")))
> }
>
>
> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com
>
> On Wed, Oct 3, 2018 at 3:05 AM, Manoranjan Muthusamy <
> ranjanmano167 at gmail.com> wrote:
>
>> Thanks for the short but informative answer, Bill. But still, each row
>> has four columns..right? Although the *NA* is replaced by a blank cell,
>> because of the extra comma it still is a four column row. Is there any
way
>> to avoid/remove the extra comma when NA is replaced which will make it
a
>> three column row?
>>
>>
>>
>> On Mon, Oct 1, 2018 at 4:56 PM William Dunlap <wdunlap at
tibco.com> wrote:
>>
>>> Read the help file for write.table
>>>    ?write.table
>>> and look at the descriptions of its arguments.
>>>   > df <-
data.frame(Text=c("Abe","Ben",NA,"David"),
Age=c(19, NA, 12,
>>> 10))
>>>   > write.table(file=stdout(), t(df), sep=",")
>>>   "V1","V2","V3","V4"
>>>  
"Text","Abe","Ben",NA,"David"
>>>   "Age","19",NA,"12","10"
>>>   > write.table(file=stdout(), t(df), quote=FALSE,
na="", sep=",")
>>>   V1,V2,V3,V4
>>>   Text,Abe,Ben,,David
>>>   Age,19,,12,10
>>>
>>>
>>>
>>>
>>> Bill Dunlap
>>> TIBCO Software
>>> wdunlap tibco.com
>>>
>>> On Mon, Oct 1, 2018 at 2:24 AM, Manoranjan Muthusamy <
>>> ranjanmano167 at gmail.com> wrote:
>>>
>>>> I have a horizontally arranged CSV file
>>>>
<https://www.dropbox.com/s/0ukyuifvpq1olqk/samplefile.csv?dl=0> (
>>>> samplefile.csv) with headers are in the first column. Also,
each row
>>>> has a
>>>> different number of columns. I want to read this CSV file,
replace one
>>>> of
>>>> the cell value and save again as a CSV file with the same
format as the
>>>> original file with exactly same number of columns and rows. It
sounds
>>>> like
>>>> a simple task, but I am struggling to find a way. I tried to do
this
>>>> with
>>>> the help of this
>>>> <
>>>>
https://stackoverflow.com/questions/17288197/reading-a-csv-file-organized-horizontally
>>>> >
>>>>  and this
>>>> <
>>>>
https://stackoverflow.com/questions/20308621/how-do-i-write-a-csv-file-in-r-where-my-input-is-written-to-the-file-as-row
>>>> >
>>>> posts,
>>>> but still, I can't get the output the way I want. Can
somebody help me
>>>> here?
>>>>
>>>> My attempt using the answer in this post
>>>> <
>>>>
https://stackoverflow.com/questions/17288197/reading-a-csv-file-organized-horizontally
>>>> >
>>>> to
>>>> read the CSV file (samplefile.csv) gives me the following
output where
>>>> headers are kinda messed up and empty cells are replaced with
NA values
>>>> which is not what I want
>>>>
>>>> aaa <- read.tcsv("samplefile.csv")
>>>> aaa
>>>>
>>>>      Header.1 Header.2..units. Header.3..units.
Header.3..units..11
>>>> Some text           0.0625                0             2648962
>>>> <NA>           0.0625             1200            
6647473        <NA>
>>>>           0.0625             1380                  14       
<NA>
>>>>      0.2500             1500                  15       
<NA>
>>>> 0.6250             1620                 NA6        <NA>
>>>> 1.3125             1740                 NA7        <NA>
>>>> 2.4375             1860                 NA8        <NA>
>>>> 3.5625             1980                 NA9        <NA>
>>>> 4.6250             2100                 NA10       <NA>
>>>> 5.0000             2220                 NA11       <NA>
>>>> 5.0000             2340                 NA12       <NA>
>>>> 4.6250             2460                 NA13       <NA>
>>>> 3.5625             2580                 NA14       <NA>
>>>> 2.4375             2700                 NA15       <NA>
>>>> 1.3125             2820                 NA16       <NA>
>>>> 0.6250             2940                 NA17       <NA>
>>>> 0.2500             3060                 NA18       <NA>
>>>> 0.0625             3180                 NA19       <NA>
>>>> 0.0000             3300                 NA20       <NA>
>>>> 0.0000            18000                 NA
>>>>
>>>> Also, I am not sure how to go back to original format when I
save the
>>>> file
>>>> again after a modification (for example after replacing a cell
value)
>>>>
>>>> I tried saving the file again by using t (transpose) as given
below
>>>>
>>>> write.csv(t(aaa), file ="samplefile_e.csv",
row.names=T)
>>>>
>>>> but still, there are following issues in the saved file
>>>>
>>>>    1. messed up headers
>>>>    2. empty cells are replaced with NA
>>>>    3. when I open the file in a text editor all the values are
shown as
>>>>    characters
>>>>
>>>> Thanks,
>>>> Mano
>>>>
>>>>         [[alternative HTML version deleted]]
>>>>
>>>> ______________________________________________
>>>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and
more, see
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide
>>>> http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible
code.
>>>>
>>>
>>>
>
	[[alternative HTML version deleted]]

R help - Oct 2018 - R - Reading a horizontally arranged csv file, replacing values and saving again

[R] R - Reading a horizontally arranged csv file, replacing values and saving again

[R] R - Reading a horizontally arranged csv file, replacing values and saving again

[R] R - Reading a horizontally arranged csv file, replacing values and saving again