arun
2013-May-16 15:58 UTC
[R] Help with how to process multiple column variable in a read.table
Hi,
Try this:

unemp.wy <- read.table("ftp://ftp.bls.gov/pub/time.series/la/la.data.59.Wyoming", header=TRUE, sep="\t", stringsAsFactors=FALSE, na.strings="")
dim(unemp.wy)
#[1] 46692     5
head(unemp.wy)
#          series_id year period value footnote_codes
#1 LASST56000003     1976    M01   4.2           <NA>
#2 LASST56000003     1976    M02   4.1           <NA>
#3 LASST56000003     1976    M03   4.0           <NA>
#4 LASST56000003     1976    M04   3.9           <NA>
#5 LASST56000003     1976    M05   3.9           <NA>
#6 LASST56000003     1976    M06   3.9           <NA>
str(unemp.wy)
#'data.frame':    46692 obs. of  5 variables:
# $ series_id     : chr  "LASST56000003    " "LASST56000003    " "LASST56000003    " "LASST56000003    " ...
# $ year          : int  1976 1976 1976 1976 1976 1976 1976 1976 1976 1976 ...
# $ period        : chr  "M01" "M02" "M03" "M04" ...
# $ value         : num  4.2 4.1 4 3.9 3.9 3.9 4 4.1 4.1 4 ...
# $ footnote_codes: chr  NA NA NA NA ...
tail(unemp.wy)
#              series_id year period  value footnote_codes
#46687 LAUST56000006     2012    M11 305820              D
#46688 LAUST56000006     2012    M12 304293              D
#46689 LAUST56000006     2012    M13 306064              D
#46690 LAUST56000006     2013    M01 305150           <NA>
#46691 LAUST56000006     2013    M02 304918           <NA>
#46692 LAUST56000006     2013    M03 305556              P

A.K.

>I am new to R. I am trying to read a table from the BLS FTP site. The file has 5 columns, but data is not always present in the 5th column,
>so it is throwing an error. Here is my code:
>
>unemp.wy <- read.table("ftp://ftp.bls.gov/pub/time.series/la/la.data.59.Wyoming", header=FALSE, sep="", skip=2 )
>
>Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  :
>  line 384 did not have 4 elements
>
>Here is the structure of the text. At about row 384 the footnote column gets added as well. This seems to throw off read.table. Is it possible to just
>read each line as a text string and then parse it, or is there a better way to approach this problem?
>
>series_id year period value footnote_codes
>LASST56000003     1976 M01         4.2
>LASST56000003     1976 M02         4.1
>LASST56000003     1976 M03         4.0
>LASST56000003     1976 M04         3.9
>LASST56000003     1976 M05         3.9
>
>Thanks. I am using R after having used SAS for years, so I am unsure of the best way to move past a program data vector approach to data cleansing.
>
>Thanks
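
For anyone who wants to see why sep="\t" helps without downloading the file, here is a small offline sketch. The sample lines are made up to mimic the layout shown above, and they assume each row ends with a tab before an empty footnote field (an assumption about the raw file, not something verified here). The names sample_lines, demo, n_ws and n_tab are hypothetical. It counts fields under both separators and then trims the padding visible in series_id in the str() output.

```r
# Hypothetical sample that mimics the tab-delimited layout of the BLS file.
# Assumption: rows without a footnote still end with a tab (empty last field).
sample_lines <- c(
  "series_id\tyear\tperiod\tvalue\tfootnote_codes",
  "LASST56000003    \t1976\tM01\t4.2\t",
  "LASST56000003    \t1976\tM02\t4.1\t",
  "LAUST56000006    \t2012\tM13\t306064\tD"
)

# With sep = "" (any whitespace) the empty trailing field vanishes,
# so rows without a footnote have fewer fields than the header.
tc <- textConnection(sample_lines)
n_ws <- count.fields(tc, sep = "")
close(tc)

# With sep = "\t" the empty field is kept and every row has the same count.
tc <- textConnection(sample_lines)
n_tab <- count.fields(tc, sep = "\t")
close(tc)

# Same call as in the reply, but reading from the inline sample via text=.
demo <- read.table(text = sample_lines, header = TRUE, sep = "\t",
                   stringsAsFactors = FALSE, na.strings = "")

# series_id is padded with trailing spaces; strip them after reading.
demo$series_id <- sub("\\s+$", "", demo$series_id)

print(n_ws)
print(n_tab)
str(demo)
```

If the field counts under sep="\t" were not all equal on the real file, read.table(..., fill=TRUE) would be another thing to try, but the output in the reply suggests that was not needed.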