arun
2013-May-16 15:58 UTC
[R] Help with how to process multiple column variable in a read.table
Hi,
Try this:
unemp.wy <-
read.table("ftp://ftp.bls.gov/pub/time.series/la/la.data.59.Wyoming",
header=TRUE, sep="\t",stringsAsFactors=FALSE,na.strings="")
dim(unemp.wy)
#[1] 46692     5
head(unemp.wy)
#          series_id year period value footnote_codes
#1 LASST56000003     1976    M01   4.2           <NA>
#2 LASST56000003     1976    M02   4.1           <NA>
#3 LASST56000003     1976    M03   4.0           <NA>
#4 LASST56000003     1976    M04   3.9           <NA>
#5 LASST56000003     1976    M05   3.9           <NA>
#6 LASST56000003     1976    M06   3.9           <NA>
str(unemp.wy)
#'data.frame':    46692 obs. of  5 variables:
# $ series_id     : chr  "LASST56000003    " "LASST56000003    " "LASST56000003    " "LASST56000003    " ...
# $ year          : int  1976 1976 1976 1976 1976 1976 1976 1976 1976 1976 ...
# $ period        : chr  "M01" "M02" "M03" "M04" ...
# $ value         : num  4.2 4.1 4 3.9 3.9 3.9 4 4.1 4.1 4 ...
# $ footnote_codes: chr  NA NA NA NA ...
tail(unemp.wy)
#              series_id year period  value footnote_codes
#46687 LAUST56000006     2012    M11 305820              D
#46688 LAUST56000006     2012    M12 304293              D
#46689 LAUST56000006     2012    M13 306064              D
#46690 LAUST56000006     2013    M01 305150           <NA>
#46691 LAUST56000006     2013    M02 304918           <NA>
#46692 LAUST56000006     2013    M03 305556              P
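
The reason this works, as far as I can tell: with sep="" any run of whitespace is a separator, so an empty footnote field simply disappears and the rows end up with 4 or 5 fields. With sep="\t" the empty field is still counted, and na.strings="" turns it into NA. Here is a small offline sketch of the same idea. The sample lines are made up from the output above, and I am assuming the file is tab-delimited with a trailing tab when the footnote is empty:

```r
# Illustrative sample only: rows copied from the printed output above.
# Assumption: fields are tab-separated and an empty footnote still has its tab.
sample_lines <- c(
  "series_id\tyear\tperiod\tvalue\tfootnote_codes",
  "LASST56000003    \t1976\tM01\t4.2\t",
  "LASST56000003    \t1976\tM02\t4.1\t",
  "LAUST56000006    \t2012\tM12\t304293\tD",
  "LAUST56000006    \t2013\tM03\t305556\tP"
)

# Same arguments as the call above, but reading from the in-memory sample.
demo <- read.table(text = sample_lines, header = TRUE, sep = "\t",
                   stringsAsFactors = FALSE, na.strings = "")

# series_id keeps its trailing padding when sep = "\t", so trim it.
demo$series_id <- trimws(demo$series_id)

str(demo)
```

Passing strip.white=TRUE to read.table should also remove the padding at read time, instead of the trimws() step.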
A.K.
>I am new to R. I am trying to read a table from the BLS FTP site. The
>file has 5 columns, but data is not always present in the 5th column,
>so it is throwing an error. Here is my code:
>
>unemp.wy <-
>read.table("ftp://ftp.bls.gov/pub/time.series/la/la.data.59.Wyoming",
>header=FALSE, sep="", skip=2 )
>
>Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
>  line 384 did not have 4 elements
>
>Here is the structure of the text. At about row 384 the footnote
>column gets added as well. This seems to throw off read.table. Is it
>possible to just read each line as a text string and then parse it, or is
>there a better way to approach this problem?
>
>series_id year period value footnote_codes
>LASST56000003     1976 M01         4.2
>LASST56000003     1976 M02         4.1
>LASST56000003     1976 M03         4.0
>LASST56000003     1976 M04         3.9
>LASST56000003     1976 M05         3.9
>
>Thanks. I am using R after having used SAS for years, so I am
>unsure of the best way to overcome a program vector approach to data
>cleansing.
>
>Thanks
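
On the question of reading each line as a text string and parsing it yourself: that is possible too, though read.table with the right separator is usually less work. A rough sketch, again on a made-up sample and assuming tab-delimited lines (for the real file you would get raw_lines from readLines() on the URL or a downloaded copy):

```r
# Illustrative sample only; in practice raw_lines would come from readLines().
# Assumption: fields are separated by tabs.
raw_lines <- c(
  "series_id\tyear\tperiod\tvalue\tfootnote_codes",
  "LASST56000003    \t1976\tM01\t4.2\t",
  "LAUST56000006    \t2012\tM12\t304293\tD"
)

# Split every data line (header dropped) on tabs.
# strsplit() drops a trailing empty field, so rows have 4 or 5 pieces.
fields <- strsplit(raw_lines[-1], "\t", fixed = TRUE)

# Pad each row to 5 pieces; the missing footnote becomes NA.
n_col <- 5
padded <- lapply(fields, function(x) { length(x) <- n_col; x })
mat <- do.call(rbind, padded)

# Build the data frame and convert each column to its type by hand.
parsed <- data.frame(
  series_id      = trimws(mat[, 1]),
  year           = as.integer(mat[, 2]),
  period         = mat[, 3],
  value          = as.numeric(mat[, 4]),
  footnote_codes = mat[, 5],
  stringsAsFactors = FALSE
)

str(parsed)
```

This is closer to the SAS habit of handling one record at a time, but in R the column-wise conversion at the end is where the work happens.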