thr3ads.net - R help - [R] Help with read.csv.sql() [Jul 2020]

If this information is useful, please help other people find it:
Share via:

2020-Jul-18 01:27 UTC

[R] Help with read.csv.sql()

I have created a dataframe with columns that are characters, integers and
numeric and with column names assigned by me. I am using read.csv.sql() to read
portions of a number of large csv files into this dataframe, each csv file
having a header row with columb names.

The problem I am having is that the csv files have header rows with column names
that are slightly different from the column names I have assigned in the
dataframe and it seems that when I read the csv data into the dataframe, the
column names from the csv file replace the column names I chose when creating
the dataframe.

I have been unable to figure out if it is possible to assign column names of my
choosing in the read.csv.sql() function? I have tried various variations but
none seem to work. I tried colClasses = c(....) but that did not work, I tried
field.types = c(...) but could not get that to work either.

It seems that the above should be feasible but I am missing something? Does
anyone know?

A secondary issue is that the csv files have a column with a date in mm/dd/yyyy
format that I would like to make into a Date type column in my dataframe. Again,
I have been unable to find a way - if at all possible - to force a conversion
into a Date format when importing into the dataframe. The best I have so far is
to import is a character column and then use as.Date() to later force the
conversion of the dataframe column.

Is it possible to do this when importing using read.csv.sql()?

Bert Gunter

2020-Jul-18 01:49 UTC

head link

[R] Help with read.csv.sql()

Is there some reason that you can't make the changes to the data frame
(column names, as.date(), ...) *after* you have read all your data in?

Do all your csv files use the same names and date formats?


Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Fri, Jul 17, 2020 at 6:28 PM H <agents at meddatainc.com> wrote:
> I have created a dataframe with columns that are characters, integers and
> numeric and with column names assigned by me. I am using read.csv.sql() to
> read portions of a number of large csv files into this dataframe, each csv
> file having a header row with columb names.
>
> The problem I am having is that the csv files have header rows with column
> names that are slightly different from the column names I have assigned in
> the dataframe and it seems that when I read the csv data into the
> dataframe, the column names from the csv file replace the column names I
> chose when creating the dataframe.
>
> I have been unable to figure out if it is possible to assign column names
> of my choosing in the read.csv.sql() function? I have tried various
> variations but none seem to work. I tried colClasses = c(....) but that did
> not work, I tried field.types = c(...) but could not get that to work
> either.
>
> It seems that the above should be feasible but I am missing something?
> Does anyone know?
>
> A secondary issue is that the csv files have a column with a date in
> mm/dd/yyyy format that I would like to make into a Date type column in my
> dataframe. Again, I have been unable to find a way - if at all possible -
> to force a conversion into a Date format when importing into the dataframe.
> The best I have so far is to import is a character column and then use
> as.Date() to later force the conversion of the dataframe column.
>
> Is it possible to do this when importing using read.csv.sql()?
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
	[[alternative HTML version deleted]]

2020-Jul-18 15:43 UTC

head link

[R] Help with read.csv.sql()

On 07/17/2020 09:49 PM, Bert Gunter wrote:> Is there some reason that you can't make the changes to the data frame
(column names, as.date(), ...) *after* you have read all your data in?
>
> Do all your csv files use the same names and date formats?
>
>
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along
and sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip
)
>
>
> On Fri, Jul 17, 2020 at 6:28 PM H <agents at meddatainc.com
<mailto:agents at meddatainc.com>> wrote:
>
>     I have created a dataframe with columns that are characters, integers
and numeric and with column names assigned by me. I am using read.csv.sql() to
read portions of a number of large csv files into this dataframe, each csv file
having a header row with columb names.
>
>     The problem I am having is that the csv files have header rows with
column names that are slightly different from the column names I have assigned
in the dataframe and it seems that when I read the csv data into the dataframe,
the column names from the csv file replace the column names I chose when
creating the dataframe.
>
>     I have been unable to figure out if it is possible to assign column
names of my choosing in the read.csv.sql() function? I have tried various
variations but none seem to work. I tried colClasses = c(....) but that did not
work, I tried field.types = c(...) but could not get that to work either.
>
>     It seems that the above should be feasible but I am missing something?
Does anyone know?
>
>     A secondary issue is that the csv files have a column with a date in
mm/dd/yyyy format that I would like to make into a Date type column in my
dataframe. Again, I have been unable to find a way - if at all possible - to
force a conversion into a Date format when importing into the dataframe. The
best I have so far is to import is a character column and then use as.Date() to
later force the conversion of the dataframe column.
>
>     Is it possible to do this when importing using read.csv.sql()?
>
>     ______________________________________________
>     R-help at r-project.org <mailto:R-help at r-project.org> mailing
list -- To UNSUBSCRIBE and more, see
>     https://stat.ethz.ch/mailman/listinfo/r-help
>     PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
>     and provide commented, minimal, self-contained, reproducible code.
>Yes, the files use the same column names and date format (at least as far as I
know now.) I agree I could do it as you suggest above but from a purist
perspective I would rather do it when importing the data using read.csv.sql(),
particularly if column names and/or date format might change, or be different
between different files. I am indeed selecting rows from a large number of csv
files so this is entirely plausible.

Has anyone been able to name columns in the read.csv.sql() call and/or force
date format conversion in the call itself? The first refers to naming columns
differently from what a header in the csv file may have.


	[[alternative HTML version deleted]]

Gabor Grothendieck

2020-Jul-30 22:09 UTC

head link

[R] Help with read.csv.sql()

Probably simplest to assign the names afterwards as others have
suggested but it could be done like this:

  library(sqldf)
  write.csv(BOD, "BOD.csv", quote = FALSE, row.names = FALSE)  # test
data

  read.csv.sql("BOD.csv", "select Time as Time2, demand as
demand2 from file")

giving the column names Time2 and demand2 rather than the original column names.

    Time2 demand2
  1     1     8.3
  2     2    10.3
  3     3    19.0
  4     4    16.0
  5     5    15.6
  6     7    19.8

On Fri, Jul 17, 2020 at 9:28 PM H <agents at meddatainc.com>
wrote:>
> I have created a dataframe with columns that are characters, integers and
numeric and with column names assigned by me. I am using read.csv.sql() to read
portions of a number of large csv files into this dataframe, each csv file
having a header row with columb names.
>
> The problem I am having is that the csv files have header rows with column
names that are slightly different from the column names I have assigned in the
dataframe and it seems that when I read the csv data into the dataframe, the
column names from the csv file replace the column names I chose when creating
the dataframe.
>
> I have been unable to figure out if it is possible to assign column names
of my choosing in the read.csv.sql() function? I have tried various variations
but none seem to work. I tried colClasses = c(....) but that did not work, I
tried field.types = c(...) but could not get that to work either.
>
> It seems that the above should be feasible but I am missing something? Does
anyone know?
>
> A secondary issue is that the csv files have a column with a date in
mm/dd/yyyy format that I would like to make into a Date type column in my
dataframe. Again, I have been unable to find a way - if at all possible - to
force a conversion into a Date format when importing into the dataframe. The
best I have so far is to import is a character column and then use as.Date() to
later force the conversion of the dataframe column.
>
> Is it possible to do this when importing using read.csv.sql()?
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.


-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

2020-Jul-30 23:56 UTC

head link

[R] Help with read.csv.sql()

On 07/30/2020 06:09 PM, Gabor Grothendieck wrote:> Probably simplest to assign the names afterwards as others have
> suggested but it could be done like this:
>
>   library(sqldf)
>   write.csv(BOD, "BOD.csv", quote = FALSE, row.names = FALSE)  #
test data
>
>   read.csv.sql("BOD.csv", "select Time as Time2, demand as
demand2 from file")
>
> giving the column names Time2 and demand2 rather than the original column
names.
>
>     Time2 demand2
>   1     1     8.3
>   2     2    10.3
>   3     3    19.0
>   4     4    16.0
>   5     5    15.6
>   6     7    19.8
>
> On Fri, Jul 17, 2020 at 9:28 PM H <agents at meddatainc.com> wrote:
>> I have created a dataframe with columns that are characters, integers
and numeric and with column names assigned by me. I am using read.csv.sql() to
read portions of a number of large csv files into this dataframe, each csv file
having a header row with columb names.
>>
>> The problem I am having is that the csv files have header rows with
column names that are slightly different from the column names I have assigned
in the dataframe and it seems that when I read the csv data into the dataframe,
the column names from the csv file replace the column names I chose when
creating the dataframe.
>>
>> I have been unable to figure out if it is possible to assign column
names of my choosing in the read.csv.sql() function? I have tried various
variations but none seem to work. I tried colClasses = c(....) but that did not
work, I tried field.types = c(...) but could not get that to work either.
>>
>> It seems that the above should be feasible but I am missing something?
Does anyone know?
>>
>> A secondary issue is that the csv files have a column with a date in
mm/dd/yyyy format that I would like to make into a Date type column in my
dataframe. Again, I have been unable to find a way - if at all possible - to
force a conversion into a Date format when importing into the dataframe. The
best I have so far is to import is a character column and then use as.Date() to
later force the conversion of the dataframe column.
>>
>> Is it possible to do this when importing using read.csv.sql()?
>>
>> ______________________________________________
>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
>Apologies, I had tuned out from this discussion since I solved the problem by
renaming the columns after reading the file. Your suggestion to do it in the SQL
statement itself, however, seems to be neatest one though!

Thank you.

R help - Jul 2020 - Help with read.csv.sql()

[R] Help with read.csv.sql()

[R] Help with read.csv.sql()

[R] Help with read.csv.sql()

[R] Help with read.csv.sql()

[R] Help with read.csv.sql()