thr3ads.net - R help - [R] Reading Text files from UK Met Office into R again... [Oct 2022]

Nick Wray

2022-Oct-09 11:01 UTC

[R] Reading Text files from UK Met Office into R again...

Hello, I've had some invaluable help from folk here about downloading files
from the UK Met Office. Unfortunately I now have another problem which I
can't solve, and I wonder whether anyone's got any ideas.

I'm trying to download hourly weather records from the Met Office

 https://data.ceda.ac.uk/badc/ukmo-midas/data/WH/yearly_files



Up to 2010 everything's fine and dandy: the data is in nice neat columns
and I can download it and filter out what I don't want. But after 2010 the
format changes (the Met Office in fact say in their guidelines that it
changes). It's still a text doc, but instead of columns it seems to be one
long vector. Here is a short sample:



2015-01-01 00:00, 03002, WMO, SYNOP, 1, 12, 1011, 4, 7, 200, 18, 82, , , 8,
, , , , 100, 450, 1005.4, 5, , 102, 4, , 129, , , , , , , , 8.7, 7.5, 8.1,
1003.6, , , , , , , 1, 1, 1, , , 1, , , , , 1, 1, 1, 1, 1, 1, , 1, , 1, 1,
, 1, , , , , , , , 1, , , , , 2014-12-31 23:53, 0, , , , , , , , , , , , K,
, , , , 91.7, A, , , ,
2015-01-01 00:00, 03005, WMO, SYNOP, 1, 9, 1011, 4, 1, 210, 26, 62, 8, 6,
8, 8, , , 8, 30, 700, 1006, 1, 8, 54, 7, 6, 105, , , , , , , , 8.6, 7.3, 8,
996.1, , 01, , , , , 1, 1, 1, 1, 1, 1, 1, , , 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, , , , , , , , 1, , , , , 2014-12-31 23:55, 0, , , , , , , , , ,
, , K, , , , , 91.7, A, , , 0, 1
2015-01-01 00:00, 03006, WMO, SYNOP, 1, 10, 1011, 4, 6, 210, 23, , , , , ,
, , , , , , , , , , , , , , , , , , , , , , , , , , , , , 1, 1, , , , , , ,
, , , , , , , , , , , , , , , , , , , , , , , , , , , 2014-12-31 23:53, 0,
, , , , , , , , , , , , , , , , , A, , , ,
2015-01-01 00:00, 03010, WMO, SYNOP, 1, 17, 1011, 4, 6, 230, 21, , , , , ,
, , , , , 1006.1, , , , , , , , , , , , , , 9.4, 6.2, 7.9, , , , , , , , 1,
1, , , , , , , , , , , 1, 1, 1, 1, , , , , , , , , , , , , , , , , , , ,





If I could read it in I should still be able to use it, as I could identify
each separate record because it starts with a date. The files are very
large (c. 1 GB), and when I start to read one into R it works for a while
and then, after maybe 30 seconds, I get this error message:



Error in read.table("midas_wxhrly_201501-201512.txt", fill = T) :
  duplicate 'row.names' are not allowed


The read.table() call shown in the error message works perfectly well on
files up to 2010, and I can't see where a 'duplicate row name' would come
from in this data anyway.


Is there a way of either downloading the file without getting the error
message or of being able to identify at what point in the file the error
message is being generated so that I could, by hand possibly, take out
whatever the problem is?
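
One way to see where the records become irregular, sketched under the assumption that the fields are comma-separated as the sample above suggests: count.fields() reports the number of fields on each line, so lines whose count differs from the most common one can be listed by line number. The three records below are shortened and made up for illustration; `tmp`, `n_fields`, `usual` and `odd_lines` are hypothetical names.

```r
# Write three made-up records to a temporary file; the second one is short.
tmp <- tempfile(fileext = ".txt")
writeLines(c(
  "2015-01-01 00:00, 03002, WMO, SYNOP, 1",
  "2015-01-01 00:00, 03005, WMO",
  "2015-01-01 00:00, 03006, WMO, SYNOP, 1"
), tmp)

# Number of comma-separated fields on each line of the file.
# Quote and comment handling are switched off so every character is data.
n_fields <- count.fields(tmp, sep = ",", quote = "", comment.char = "")

# Most common field count, and the line numbers that deviate from it.
usual <- as.integer(names(which.max(table(n_fields))))
odd_lines <- which(n_fields != usual)

table(n_fields)
odd_lines
```

On the real file, `tmp` would be replaced by the file name. Blank lines are skipped by default, so the reported positions only match the file's line numbers if it contains no blank lines.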



I've tried putting the downloaded text doc into other formats but nothing
seems to work.

If anyone has any ideas I'd be very grateful. Thanks, Nick Wray


Ivan Krylov

2022-Oct-09 15:50 UTC

On Sun, 9 Oct 2022 12:01:27 +0100
Nick Wray <nickmwray at gmail.com> wrote:
> Error in read.table("midas_wxhrly_201501-201512.txt", fill = T) :
>   duplicate 'row.names' are not allowed

Since you don't pass the `header` argument, I think the automatic header
detection is at play here. This is what ?read.table has to say about row
names:

>> If there is a header and the first row contains one fewer field than
>> the number of columns, the first column in the input is used for the
>> row names.  Otherwise if 'row.names' is missing, the rows are
>> numbered.

Perhaps the "one fewer field in the header than the number of columns"
condition is true for files after 2010? I'm too lazy to sign up for a
CEDA account and I'm not sure I'd be given access to hourly datasets
anyway.

If this is the reason for the failure (first column used as rownames()
and turns out to be non-unique), there's an easy way to fix that:

>> Using 'row.names = NULL' forces row numbering.

I don't see a header in your example. If there's actually no header
containing column names, passing `header = FALSE` will both prevent the
error and avoid eating the first line of the file.
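
A minimal sketch of that suggestion, using two shortened, made-up records in place of the real file. It assumes the post-2010 files are comma-separated with no header line, which the posted sample suggests but which has not been checked against the real data. The arguments besides `header = FALSE` are additions for illustration, and `sample_lines` and `wx` are hypothetical names.

```r
# Two shortened, made-up records standing in for the real file.
sample_lines <- c(
  "2015-01-01 00:00, 03002, WMO, SYNOP, 1, 12, 1011, , 8.7",
  "2015-01-01 00:00, 03005, WMO, SYNOP, 1, 9, 1011, 8, 8.6"
)

wx <- read.table(
  text = sample_lines,
  sep = ",",                # split on commas rather than on whitespace
  header = FALSE,           # no header line: keep the first record as data
  fill = TRUE,              # pad records that have fewer fields
  strip.white = TRUE,       # drop the space that follows each comma
  na.strings = "",          # treat empty fields as missing
  colClasses = "character"  # keep leading zeros in ids such as 03002
)

dim(wx)
wx$V2
```

For the real data, `text = sample_lines` would be replaced by the file name. Reading every column as character is a deliberate choice here to preserve the station ids; the numeric columns can be converted afterwards.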

-- 
Best regards,
Ivan

Dr Eberhard W Lisse

2022-Oct-09 21:30 UTC

Does it say what the new format is?

On 2022-10-09 13:01, Nick Wray wrote:
[...]
> Up to 2010 everything's fine and dandy - the data is in nice neat columns
> and I can download it and filter out what I don't want.  But after 2010 the
> format changes (The Met Office in fact say on their guidelines that it
> changes)  - it's still a text doc but instead of columns it seems to be one
> long vector.  Here is a short sample:
[...]
