Hi all,

I am trying to read some text files with the following format:

1377262633.948000 $GPRMC,125708.00,A,5047.66107,N,00603.65528,E,0.203,247.36,230813,,,A*60
1377262633.958000 $GPVTG,247.36,T,,M,0.203,N,0.377,K,A*3B
1377262633.968000 $GPGGA,125708.00,5047.66107,N,00603.65528,E,1,09,0.85,169.3,M,46.5,M,,*52
1377262633.978000 $GPGSA,A,3,29,21,31,25,16,05,06,13,27,,,,1.78,0.85,1.57*0C
1377262633.998000 $GPGSV,3,1,12,03,01,266,,05,16,043,39,06,21,263,43,13,07,330,43*70
1377262634.008000 $GPGSV,3,2,12,16,37,302,45,18,03,149,,21,59,166,33,23,04,304,16*75
1377262634.028000 $GPGSV,3,3,12,25,18,129,21,27,11,260,39,29,45,071,47,31,35,211,47*7C

but read.csv() fails with the following error:

read.csv("sensor_0.log",sep=",")
Error in read.table(file = file, header = header, sep = sep, quote = quote, :
  more columns than column names

I guess the problem is that the number of columns is not consistent from row to row.
What I actually want is to read only the lines that contain the $GPGLL or the $GPGLA entries (in the example they correspond to the 3rd and 4th lines).
How can I do this in R?

Regards
A
Hello,

See the "fill" option of the "read.csv" function. But be careful: it can lead to erroneous results, as explained in the help page.

Also, there is neither $GPGLL nor $GPGLA in your example.

Regards,
Pascal

On Tue, Mar 11, 2014 at 9:00 PM, Alaios <alaios at yahoo.com> wrote:
> What I am trying to do though is to read only the lines that contain the $GPGLL or the $GPGLA entries (in the example they corresponds to 3rd and 4th line).
> How can I do this in R?
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

--
Pascal Oettli
Project Scientist
JAMSTEC
Yokohama, Japan
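One caveat documented in ?read.table is that the number of columns is inferred from the first five lines of input, which can go wrong with fill = TRUE when a later line has more fields. A minimal sketch of one way around this, assuming the sample lines from the original post: count the fields on every line with count.fields() and pass explicit col.names. The variable names (txt, n_fields, dat) are illustrative only.

```r
# Sample lines copied from the original post
txt <- c(
  "1377262633.948000 $GPRMC,125708.00,A,5047.66107,N,00603.65528,E,0.203,247.36,230813,,,A*60",
  "1377262633.958000 $GPVTG,247.36,T,,M,0.203,N,0.377,K,A*3B",
  "1377262633.968000 $GPGGA,125708.00,5047.66107,N,00603.65528,E,1,09,0.85,169.3,M,46.5,M,,*52",
  "1377262633.978000 $GPGSA,A,3,29,21,31,25,16,05,06,13,27,,,,1.78,0.85,1.57*0C",
  "1377262633.998000 $GPGSV,3,1,12,03,01,266,,05,16,043,39,06,21,263,43,13,07,330,43*70",
  "1377262634.008000 $GPGSV,3,2,12,16,37,302,45,18,03,149,,21,59,166,33,23,04,304,16*75",
  "1377262634.028000 $GPGSV,3,3,12,25,18,129,21,27,11,260,39,29,45,071,47,31,35,211,47*7C"
)

# Count the comma-separated fields on every line, not just the first five
con <- textConnection(txt)
n_fields <- count.fields(con, sep = ",")
close(con)

# Give read.table enough column names for the widest line;
# shorter lines are padded because fill = TRUE
dat <- read.table(text = txt, sep = ",", header = FALSE, fill = TRUE,
                  col.names = paste0("V", seq_len(max(n_fields))),
                  stringsAsFactors = FALSE)
str(dat)
```

For a real log, the same two calls should work with the file name (for example "sensor_0.log") in place of the text connection and the text argument.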
Hi,

There are no GPGLL or GPGLA entries in the data. You can use grepl() (see ?grep) to select those lines.

## Modifying the input data so that it contains GPGLA and GPGLL entries
lines1 <- readLines(textConnection("1377262633.948000 $GPRMC,125708.00,A,5047.66107,N,00603.65528,E,0.203,247.36,230813,,,A*60
1377262633.958000 $GPVTG,247.36,T,,M,0.203,N,0.377,K,A*3B
1377262633.968000 $GPGLA,125708.00,5047.66107,N,00603.65528,E,1,09,0.85,169.3,M,46.5,M,,*52
1377262633.978000 $GPGSA,A,3,29,21,31,25,16,05,06,13,27,,,,1.78,0.85,1.57*0C
1377262633.998000 $GPGSV,3,1,12,03,01,266,,05,16,043,39,06,21,263,43,13,07,330,43*70
1377262634.008000 $GPGLL,3,2,12,16,37,302,45,18,03,149,,21,59,166,33,23,04,304,16*75
1377262634.028000 $GPGSV,3,3,12,25,18,129,21,27,11,260,39,29,45,071,47,31,35,211,47*7C"))
## Keep only the matching lines, then parse them
dat <- read.table(text=lines1[grepl("GPGLA|GPGLL",lines1)],header=FALSE,stringsAsFactors=FALSE,sep=",",fill=TRUE)

A.K.
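The original post refers to the 3rd and 4th lines, which in the unmodified sample are $GPGGA and $GPGSA. Assuming those were the intended sentence types (an assumption, not something the poster confirmed), the same readLines/grepl idea can be applied to a file on disk. The sketch below writes the sample to a temporary file so it is self-contained; log_file, keep and wanted are hypothetical names.

```r
# Write the sample lines to a temporary file so the sketch is self-contained;
# in practice this would be the log itself, e.g. "sensor_0.log"
log_file <- tempfile(fileext = ".log")
writeLines(c(
  "1377262633.948000 $GPRMC,125708.00,A,5047.66107,N,00603.65528,E,0.203,247.36,230813,,,A*60",
  "1377262633.958000 $GPVTG,247.36,T,,M,0.203,N,0.377,K,A*3B",
  "1377262633.968000 $GPGGA,125708.00,5047.66107,N,00603.65528,E,1,09,0.85,169.3,M,46.5,M,,*52",
  "1377262633.978000 $GPGSA,A,3,29,21,31,25,16,05,06,13,27,,,,1.78,0.85,1.57*0C",
  "1377262633.998000 $GPGSV,3,1,12,03,01,266,,05,16,043,39,06,21,263,43,13,07,330,43*70",
  "1377262634.008000 $GPGSV,3,2,12,16,37,302,45,18,03,149,,21,59,166,33,23,04,304,16*75",
  "1377262634.028000 $GPGSV,3,3,12,25,18,129,21,27,11,260,39,29,45,071,47,31,35,211,47*7C"
), log_file)

lines1 <- readLines(log_file)

# "\\$" matches a literal dollar sign; the trailing comma avoids partial matches
# (assumed sentence types: GPGGA and GPGSA)
keep <- grepl("\\$GP(GGA|GSA),", lines1)
wanted <- lines1[keep]

# Parse only the selected lines; fill = TRUE pads the shorter sentence
dat <- read.table(text = wanted, header = FALSE, sep = ",",
                  fill = TRUE, stringsAsFactors = FALSE)
str(dat)
```

Filtering the raw lines before parsing means the unwanted sentence types never influence the number of columns in the result.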
Since you don't have a header on the first line, just use 'read.table'.

> x <- read.table(text = "1377262633.948000 $GPRMC,125708.00,A,5047.66107,N,00603.65528,E,0.203,247.36,230813,,,A*60
+ 1377262633.958000 $GPVTG,247.36,T,,M,0.203,N,0.377,K,A*3B
+ 1377262633.968000 $GPGGA,125708.00,5047.66107,N,00603.65528,E,1,09,0.85,169.3,M,46.5,M,,*52
+ 1377262633.978000 $GPGSA,A,3,29,21,31,25,16,05,06,13,27,,,,1.78,0.85,1.57*0C
+ 1377262633.998000 $GPGSV,3,1,12,03,01,266,,05,16,043,39,06,21,263,43,13,07,330,43*70
+ 1377262634.008000 $GPGSV,3,2,12,16,37,302,45,18,03,149,,21,59,166,33,23,04,304,16*75
+ 1377262634.028000 $GPGSV,3,3,12,25,18,129,21,27,11,260,39,29,45,071,47,31,35,211,47*7C"
+ , sep = ','
+ , as.is = TRUE
+ , fill = TRUE
+ )
> str(x)
'data.frame': 7 obs. of 20 variables:
 $ V1 : chr "1377262633.948000 $GPRMC" "1377262633.958000 $GPVTG" "1377262633.968000 $GPGGA" "1377262633.978000 $GPGSA" ...
 $ V2 : chr "125708.00" "247.36" "125708.00" "A" ...
 $ V3 : chr "A" "T" "5047.66107" "3" ...
 $ V4 : chr "5047.66107" "" "N" "29" ...
 $ V5 : chr "N" "M" "00603.65528" "21" ...
 $ V6 : chr "00603.65528" "0.203" "E" "31" ...
 $ V7 : chr "E" "N" "1" "25" ...
 $ V8 : num 0.203 0.377 9 16 NA 45 21
 $ V9 : chr "247.36" "K" "0.85" "05" ...
 $ V10: chr "230813" "A*3B" "169.3" "06" ...
 $ V11: chr "" "" "M" "13" ...
 $ V12: num NA NA 46.5 27 39 NA 39
 $ V13: chr "A*60" "" "M" "" ...
 $ V14: int NA NA NA NA 21 59 45
 $ V15: chr "" "" "*52" "" ...
 $ V16: num NA NA NA 1.78 43 33 47
 $ V17: num NA NA NA 0.85 13 23 31
 $ V18: chr "" "" "" "1.57*0C" ...
 $ V19: int NA NA NA NA 330 304 211
 $ V20: chr "" "" "" "" ...
> # keep only rows with "GPGS"
> y <- subset(x, grepl("GPGS", V1))
> str(y)
'data.frame': 4 obs. of 20 variables:
 $ V1 : chr "1377262633.978000 $GPGSA" "1377262633.998000 $GPGSV" "1377262634.008000 $GPGSV" "1377262634.028000 $GPGSV"
 $ V2 : chr "A" "3" "3" "3"
 $ V3 : chr "3" "1" "2" "3"
 $ V4 : chr "29" "12" "12" "12"
 $ V5 : chr "21" "03" "16" "25"
 $ V6 : chr "31" "01" "37" "18"
 $ V7 : chr "25" "266" "302" "129"
 $ V8 : num 16 NA 45 21
 $ V9 : chr "05" "05" "18" "27"
 $ V10: chr "06" "16" "03" "11"
 $ V11: chr "13" "043" "149" "260"
 $ V12: num 27 39 NA 39
 $ V13: chr "" "06" "21" "29"
 $ V14: int NA 21 59 45
 $ V15: chr "" "263" "166" "071"
 $ V16: num 1.78 43 33 47
 $ V17: num 0.85 13 23 31
 $ V18: chr "1.57*0C" "07" "04" "35"
 $ V19: int NA 330 304 211
 $ V20: chr "" "43*70" "16*75" "47*7C"

Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.
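In the output above, V1 still holds the leading number and the sentence type together, because the two are separated by whitespace rather than a comma. A possible follow-up step is to split that column. This is only a sketch: it assumes the leading number is a Unix epoch timestamp in seconds, which the thread never states, and the names v1, epoch, sentence and time_utc are illustrative.

```r
# A few V1 values as they appear in the str() output
v1 <- c("1377262633.948000 $GPRMC",
        "1377262633.968000 $GPGGA",
        "1377262633.978000 $GPGSA")

# Split each value on whitespace into the number and the sentence type
parts <- strsplit(v1, "[[:space:]]+")
epoch <- as.numeric(vapply(parts, `[`, character(1), 1))
sentence <- sub("^\\$", "", vapply(parts, `[`, character(1), 2))

# ASSUMPTION: the number is seconds since 1970-01-01 UTC
time_utc <- as.POSIXct(epoch, origin = "1970-01-01", tz = "UTC")

out <- data.frame(time_utc, sentence, stringsAsFactors = FALSE)
print(out)
```

With the sentence type in its own column, selecting rows becomes an exact comparison such as sentence %in% c("GPGGA", "GPGSA") instead of a pattern match on the combined string.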