R-help,

I have the following file I want to import into R (some lines removed):

Calibrated CTD data for station:00280001
Calibrated:23/8 2001, Salinity Unsmoothed, Fluorescence Uncalibrated
Maximum observed depth: 36 m
QUAL has one digit for each of pressure, temp., sal. and fluor.
QUAL=1:Uncal., QUAL=2:OK, QUAL=6:Interp., QUAL=9:No data

DEPTH CTDPRS CTDTMP CTDSAL RAWFLU NUMB. QUAL
M DBAR IPTS-68 PSS-78 OBS.
******* ******* ******* *******
1 1.0 2999
2 2.0 5.9793 35.1629 .107 17 2221
3 3.0 5.9797 35.1631 .101 17 2221
4 4.0 5.9809 35.1631 .118 12 2221
5 5.1 5.9811 35.1629 .115 42 2221
6 6.1 5.9810 35.1631 .116 18 2221
7 7.1 5.9797 35.1631 .116 15 2221
8 8.1 5.9798 35.1630 .102 13 2221
9 9.1 5.9792 35.1629 .113 11 2221
...............
................
.........

If I use

read.table(file, skip = 10)

it works fine, but sometimes the data are missing not only in line number 1 (1 1.0 2999) but in lines 1, 2, 3, ..., and then R fails to import the data file.

How can I fix it? I have tried the arguments strip.white = TRUE, fill = TRUE and blank.lines.skip = TRUE, but I still do not get what I want.

Thanks in advance

> version
_
platform i386-pc-mingw32
arch i386
os mingw32
system i386, mingw32
status
major 2
minor 4.0
year 2006
month 10
day 03
svn rev 39566
language R
version.string R version 2.4.0 (2006-10-03)
Hi,

If the file is tab-delimited, you could try something like this:

a <- read.delim(file, skip = 9, header = FALSE, na.strings = "NA")

Are you sure you want to skip 10 lines? (Is there a blank line somewhere?)

J

-----Original Message-----
From: r-help-bounces at stat.math.ethz.ch [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Luis Ridao Cruz
Sent: 03 November 2006 14:03
To: r-help at stat.math.ethz.ch
Subject: [R] read file problem

______________________________________________
R-help at stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
You should pay attention to the argument na.strings:

na.strings: a character vector of strings which are to be interpreted as 'NA' values. Blank fields are also considered to be missing values in logical, integer, numeric and complex fields.

On 11/3/06, Luis Ridao Cruz <Luisr at frs.fo> wrote:
> If I use :
>
> read.table(file, skip = 10)
>
> it works fine but sometimes the missing data are not only
> in line number 1 ( 1 1.0 2999)
> but in lines 1,2,3,,, and therefore R fails to import the data file
>
> How can I fix it?

--
Ronggui Huang
Department of Sociology
Fudan University, Shanghai, China
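A note on why fill = TRUE and na.strings may not rescue read.table() here: with whitespace as the separator, a run of blank columns cannot be told apart from a single gap, so the values in a short row slide to the left. The sketch below uses three rows typed in with single spaces (the original column spacing is not preserved in this archive, so that layout is an assumption) to show the QUAL value 2999 ending up in the third column.

```r
# Three rows in the spirit of the posted file.
# Single-space separation is an assumption; the archive lost the real spacing.
txt <- c("1 1.0 2999",
         "2 2.0 5.9793 35.1629 .107 17 2221",
         "3 3.0 5.9797 35.1631 .101 17 2221")

# fill = TRUE pads short rows with NA on the right-hand side only,
# so it cannot know that 2999 belongs in the last (QUAL) column.
d <- read.table(text = txt, fill = TRUE)

print(d)
# In row 1 the third column (V3) holds 2999 and the last column (V7) is NA:
d$V3[1]
is.na(d$V7[1])
```

The import succeeds, but the first row is misaligned, which is arguably worse than an error because it is easy to miss.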
On Fri, Nov 03, 2006 at 02:02:35PM +0000, Luis Ridao Cruz wrote:
> DEPTH CTDPRS CTDTMP CTDSAL RAWFLU NUMB. QUAL
> M DBAR IPTS-68 PSS-78 OBS.
> ******* ******* ******* *******
> 1 1.0 2999
> 2 2.0 5.9793 35.1629 .107 17 2221
> 3 3.0 5.9797 35.1631 .101 17 2221
> 4 4.0 5.9809 35.1631 .118 12 2221
> 5 5.1 5.9811 35.1629 .115 42 2221
> 6 6.1 5.9810 35.1631 .116 18 2221
> 7 7.1 5.9797 35.1631 .116 15 2221
> 8 8.1 5.9798 35.1630 .102 13 2221
> 9 9.1 5.9792 35.1629 .113 11 2221
>
> read.table(file, skip = 10)

To me it looks like your data is in a fixed-width format. If that is the case you should use read.fwf() instead of read.table().

cu
Philipp

--
Dr. Philipp Pagel                            Tel. +49-8161-71 2131
Dept. of Genome Oriented Bioinformatics      Fax. +49-8161-71 2186
Technical University of Munich
Science Center Weihenstephan
85350 Freising, Germany
http://mips.gsf.de/staff/pagel
Luis Ridao Cruz wrote:
> How can I fix it?
> I have tried with the arguments
> strip.white = TRUE
> , fill = TRUE
> , blank.lines.skip = TRUE
>
> but still not get what I want

This looks like a job for read.fwf...

?read.fwf

--
Jeff Newmiller
DCN:<jdnewmil at dcn.davis.ca.us>
Research Engineer (Solar/Batteries/Software/Embedded Controllers)
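For readers landing on this thread, here is a minimal sketch of the read.fwf() suggestion. The column widths, the three-line header and the second data row are assumptions made up for illustration (the archived post does not preserve the real spacing, so the widths must be measured on the actual file); only read.fwf() and its widths, skip and col.names arguments come from the utils package.

```r
# Hypothetical column widths; measure them on the real file.
widths <- c(6, 8, 9, 9, 7, 6, 6)
cols   <- c("DEPTH", "CTDPRS", "CTDTMP", "CTDSAL", "RAWFLU", "NUMB", "QUAL")

# Build a small fixed-width sample so the example is self-contained.
row_fmt  <- paste0("%", widths, "s", collapse = "")  # "%6s%8s%9s%9s%7s%6s%6s"
make_row <- function(...) sprintf(row_fmt, ...)

header <- c("DEPTH CTDPRS CTDTMP CTDSAL RAWFLU NUMB. QUAL",
            "M DBAR IPTS-68 PSS-78 OBS.",
            "******* ******* ******* *******")
rows <- c(make_row("1", "1.0", "", "", "", "", "2999"),   # row from the post
          make_row("2", "2.0", "", "", "", "", "2999"),   # hypothetical second gap
          make_row("3", "3.0", "5.9797", "35.1631", ".101", "17", "2221"))

f <- tempfile(fileext = ".txt")
writeLines(c(header, rows), f)

# Columns are cut by position, so blank fields become NA
# and QUAL stays in the last column however many rows have gaps.
ctd <- read.fwf(f, widths = widths, skip = length(header), col.names = cols)
print(ctd)
unlink(f)
```

Because the fields are split by character position rather than by whitespace, the number of incomplete rows no longer matters. On the real file, skip would be the number of header lines before the first data row.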