thr3ads.net - R help - [R] Reading very large text files into R [Sep 2022]

Bert Gunter

2022-Sep-29 15:16 UTC

[R] Reading very large text files into R

I had no trouble reading your text snippet with
read.csv(text = "... your text... ")

There were 15 columns. The last column was all empty except for the row
containing the "B".

So there seems to be some confusion here.
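
For readers who have not used the text argument, here is a minimal sketch of the call above. The three-column snippet is made up for illustration (the real data reportedly has 15 columns), and the names snippet and df are hypothetical.

```r
# Hypothetical three-column stand-in for the pasted Met Office text.
# The last column is empty except for the row containing "B".
snippet <- "id,value,flag
1,10.5,
2,11.0,B
3,9.8,"

# text = lets read.csv() parse a character string instead of a file
df <- read.csv(text = snippet)

str(df)   # inspect the number and type of columns
```

The trailing comma on each data row is what gives the otherwise empty last column its place, which matches the behaviour described above.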

-- Bert






On Thu, Sep 29, 2022 at 6:54 AM Nick Wray <nickmwray at gmail.com> wrote:
> Hello, I may be offending the R purists with this question, but it is
> linked to R, as will become clear. I have very large data sets from the UK
> Met Office in Notepad form. Unfortunately, I can't read them directly
> into R because, for some reason, although most lines in the text doc
> consist of 15 elements, every so often there is a sixteenth one, and R
> doesn't like this and gives me an error message because it has assumed that
> every line has 15 elements and doesn't like finding one with more. I have
> tried playing around with the text document, inserting an extra element
> into the top line etc., but to no avail.
>
> Also, unfortunately, you need access permission from the Met Office to get
> the files in question, so this link probably won't work:
>
> https://catalogue.ceda.ac.uk/uuid/bbd6916225e7475514e17fdbf11141c1
>
> So what I have done is simply to copy and paste the text docs into Excel
> csv and then read them in, which is time-consuming but works. However, the
> later datasets are over the Excel limit of 1048576 lines. I can paste in
> the first 1048576 lines, but then trying to isolate the remainder of the
> text doc to paste it into a second csv doc is proving very difficult - the
> only way I have found is to scroll down by hand, and that's taking ages. I
> cannot find another way of editing the Notepad text doc to get rid of the
> part which I have already copied and pasted.
>
> Can anyone help with a) ideally being able to simply read the text tables
> into R, or b) suggest a way of editing out the bits of the text file I have
> already pasted in without laborious scrolling?
>
> Thanks Nick Wray
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

Nick Wray

2022-Sep-29 15:26 UTC

Hi Bert. Right. The thing is, I didn't know that there even was an instruction
like read.csv(text = "... your text... "), so at any rate I can paste
the original text files in by hand if there's no shorter cut.
Thanks very much, Nick
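
One possible shorter cut, sketched here under assumptions: read.csv() can take a file path directly, so no pasting (and no Excel row limit) is involved. The temporary file and the names path, df_file and df_text are hypothetical stand-ins for a downloaded Met Office file.

```r
# Hypothetical stand-in for a downloaded text file
path <- tempfile(fileext = ".txt")
writeLines(c("id,value,flag",
             "1,10.5,",
             "2,11.0,B"), path)

# Read straight from disk instead of pasting the contents in
df_file <- read.csv(path)

# The text = route also accepts a character vector of lines,
# so readLines() can stand in for copy and paste
df_text <- read.csv(text = readLines(path))

all.equal(df_file, df_text)
```

For a real file, path would simply be the location of the file on disk.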


Jeff Newmiller

2022-Sep-29 15:31 UTC

"Confusion" is the size of the file. Try specifying the colClasses
argument to nail down the number and type of the columns.
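
A rough sketch of how this might look, with assumptions stated plainly: the data below is invented (3 fields per line, with one late line of 4, standing in for the reported 15 and 16), and count.fields(), col.names and fill = TRUE are additions for illustration that were not suggested in this thread. As far as the read.table documentation describes, the column count is taken from the first five lines unless col.names is longer, which is why col.names is supplied alongside colClasses here.

```r
# Hypothetical data: the extra field only appears on line 6,
# after the first five lines that read.table() inspects.
lines <- c("1,10.5,a",
           "2,11.0,b",
           "3,9.8,c",
           "4,10.1,d",
           "5,10.7,e",
           "6,9.9,f,B")
path <- tempfile(fileext = ".txt")
writeLines(lines, path)

# count.fields() reports how many fields each line has,
# which shows where the longer lines are.
n_fields <- count.fields(path, sep = ",")
max_fields <- max(n_fields)

# col.names fixes the number of columns, colClasses fixes their types,
# and fill = TRUE pads the shorter lines with blank fields.
dat <- read.table(path, sep = ",", header = FALSE,
                  col.names = paste0("V", seq_len(max_fields)),
                  colClasses = c("integer", "numeric",
                                 "character", "character"),
                  fill = TRUE)

str(dat)
```

For the real files the colClasses vector would need one entry per column, chosen to match the actual data.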

-- 
Sent from my phone. Please excuse my brevity.
