Dear kind R helpers,

I have a vector of runway names in rwy ("31R", "31L", ...; the number of runways is user-selectable). arrgnd is a data frame with data for all flights and all runways, with a Runway column. I am trying to subset arrgnd into a data frame for each selected runway, and then combine them back together using the following code:

    for (j in 1:nr) {   # nr = number of user-selected runways
        ar4rw = arrgnd[arrgnd$Runway == rwy[j], ]
        if (j == 1) {
            arrw = ar4rw
        } else {
            arrw = merge(arrw, ar4rw)
        }
    }

but the merge step gives me a data frame with all NAs. In addition, ar4rw always gets a row with NAs at the start, which I do not understand. There are no rows with all NAs in the arrgnd data frame.

    > ar4rw[1:2,]   # first time through for 31R
            DateTime       Date month hour minute quarter weekday IATA ICAO Flight
    NA          <NA>       <NA>    NA   NA     NA      NA      NA <NA> <NA>   <NA>
    529 1/1/09 21:46 2009-01-01     1   21     46      87       5   TA  TAI TAI570
        AircraftType   Tail  Arrived   STA Runway     FromTo Delay
    NA          <NA>   <NA>     <NA>  <NA>   <NA>       <NA>    NA
    529         A320 N496TA 21:46:58 22:30    31R MSLP /KJFK     0
                           Operator            dq gw
    NA                         <NA>          <NA> NA
    529 TACA INTERNATIONAL AIRLINES 2009-01-01 87  1

    > ar4rw[1:2,]   # second time through for 31L
            DateTime       Date month hour minute quarter weekday IATA ICAO Flight
    NA          <NA>       <NA>    NA   NA     NA      NA      NA <NA> <NA>   <NA>
    552 1/1/09 23:03 2009-01-01     1   23      3      92       5   AA  AAL  AAL22
        AircraftType   Tail  Arrived   STA Runway   FromTo Delay          Operator
    NA          <NA>   <NA>     <NA>  <NA>   <NA>     <NA>    NA              <NA>
    552         B762 N329AA 23:03:35 23:10    31L LAX /JFK     0 AMERICAN AIRLINES
                   dq gw
    NA           <NA> NA
    552 2009-01-01 92  1

But after the merge, I get all NAs. What am I doing wrong?

Thanks,
Jim Rome

On Feb 1, 2010, at 5:16 PM, James Rome wrote:

> I am trying to subset arrgnd into a data frame for each selected
> runway, and then combine them back together using the following code:
>
> for (j in 1:nr) {   # nr = number of user-selected runways

Safer would be:

    for (j in seq_along(rwy)) {

> ar4rw = arrgnd[arrgnd$Runway==rwy[j],]

Clearer would be:

    ar4rw <- subset(arrgnd, Runway == rwy[j])   # and I think the NA lines will also disappear

> if (j == 1) {
>     arrw = ar4rw
> }
> else {
>     arrw = merge(arrw, ar4rw)
> }
> }

You really should give us something like:

    dput(rwy)
    dput(head(arrgnd, 10))

> but, the merge step gives me a data frame with all NAs. In addition,
> ar4rw always gets a row with NAs at the start, which I do not
> understand. There are no rows with all NAs in the arrgnd data frame.
>
> But after the merge, I get all NAs. What am I doing wrong?

The data layout gets mangled and I cannot tell which rows are being matched to which. Use dput to convey an unambiguous and easily replicated example.

> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

David Winsemius, MD
Heritage Laboratories
West Hartford, CT
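
As a rough illustration of the dput advice, here is a minimal sketch using a made-up three-row stand-in for arrgnd (the name arrgnd_toy and its values are hypothetical, not the poster's data). dput prints R code that rebuilds an object, so a reader can paste it into a session and get the same data frame back.

```r
# Hypothetical stand-in for arrgnd; values are invented for illustration.
arrgnd_toy <- data.frame(
  Flight = c("TAI570", "AAL22", "DAL242"),
  Runway = c("31R", "31L", "31L"),
  Delay  = c(0, 0, 0),
  stringsAsFactors = FALSE
)

# Print R code that recreates the first rows; this text is what goes in the email.
dput(head(arrgnd_toy, 2))

# Round trip: capture the printed code, then evaluate it to rebuild the object.
txt  <- paste(capture.output(dput(arrgnd_toy)), collapse = "\n")
copy <- eval(parse(text = txt))

# Check that the rebuilt copy matches the original.
all.equal(copy, arrgnd_toy)
```

Posting the dput output rather than the printed table avoids the column wrapping that made the output above hard to read.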

David,

Now the code is:

    for (j in seq_along(rwy)) {   # subset the data and merge them
        ar4rw <- subset(arrgnd, arrgnd$Runway == rwy[j])
        if (j == 1) {
            arrw = ar4rw
        } else {
            arrw = merge(arrw, ar4rw)
        }
    }

I attach the data. I needed 500 rows to get both runways in rwy. The suggestions did not help much, but they did get rid of the row of NAs in ar4rw. Why?

When I run through the loop for 2 runways, I get:

    # j = 1, Runway = "31L"
    Browse[1]> arrw[1:3,]
            DateTime       Date month hour minute quarter weekday IATA ICAO Flight
    552 1/1/09 23:03 2009-01-01     1   23      3      92       5   AA  AAL  AAL22
    563 1/1/09 23:17 2009-01-01     1   23     17      93       5   DL  DAL DAL242
    565 1/1/09 23:24 2009-01-01     1   23     24      93       5   DL  DAL DAL624
        AircraftType   Tail  Arrived   STA Runway     FromTo Delay
    552         B762 N329AA 23:03:35 23:10    31L   LAX /JFK     0
    563         B763 N1611B 23:17:37 23:46    31L KATL /KJFK     0
    565         B752 N654DL 23:24:04 23:48    31L   LAS /JFK     0
                 Operator            dq gw
    552 AMERICAN AIRLINES 2009-01-01 92  1
    563   DELTA AIR LINES 2009-01-01 93  1
    565   DELTA AIR LINES 2009-01-01 93  1

    # j = 2, Runway = "31R"
    Browse[1]> ar4rw[1:3,]
            DateTime       Date month hour minute quarter weekday IATA ICAO  Flight
    529 1/1/09 21:46 2009-01-01     1   21     46      87       5   TA  TAI  TAI570
    530 1/1/09 21:48 2009-01-01     1   21     48      87       5   AA  AAL AAL2018
    531 1/1/09 21:50 2009-01-01     1   21     50      87       5   BA  BAW  BAW183
        AircraftType   Tail  Arrived   STA Runway     FromTo Delay
    529         A320 N496TA 21:46:58 22:30    31R MSLP /KJFK     0
    530         B752 N621AM 21:48:43 21:50    31R  TLPL /JFK     0
    531         B744 G-CIVI 21:50:26 22:50    31R EGLL /KJFK     0
                           Operator            dq gw
    529 TACA INTERNATIONAL AIRLINES 2009-01-01 87  1
    530           AMERICAN AIRLINES 2009-01-01 87  1
    531             BRITISH AIRWAYS 2009-01-01 87  1

    # But the merge gives all NAs!
    Browse[1]> arrw[1:3,]
         DateTime Date month hour minute quarter weekday IATA ICAO Flight
    NA       <NA> <NA>    NA   NA     NA      NA      NA <NA> <NA>   <NA>
    NA.1     <NA> <NA>    NA   NA     NA      NA      NA <NA> <NA>   <NA>
    NA.2     <NA> <NA>    NA   NA     NA      NA      NA <NA> <NA>   <NA>
         AircraftType Tail Arrived  STA Runway FromTo Delay Operator   dq gw
    NA           <NA> <NA>    <NA> <NA>   <NA>   <NA>    NA     <NA> <NA> NA
    NA.1         <NA> <NA>    <NA> <NA>   <NA>   <NA>    NA     <NA> <NA> NA
    NA.2         <NA> <NA>    <NA> <NA>   <NA>   <NA>    NA     <NA> <NA> NA

Thanks,
Jim Rome

On Feb 1, 2010, at 5:30 PM, David Winsemius wrote:

> Clearer would be:

In the subset() call suggested there, the condition must be written with == (comparison), not a single =:

    ar4rw <- subset(arrgnd, Runway == rwy[j])

David Winsemius, MD
Heritage Laboratories
West Hartford, CT
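
On the question of why the row of NAs went away: one likely explanation, assuming the Runway column contains at least one NA (the posted data cannot be checked here), is that comparing NA with == yields NA, and indexing a data frame with an NA in a logical index returns a row filled with NAs. subset() treats NA in its condition as FALSE, so those rows are dropped. A minimal sketch with invented data (df and its values are hypothetical):

```r
# Invented data: the first flight has no runway recorded.
df <- data.frame(
  Flight = c("F1", "F2", "F3"),
  Runway = c(NA, "31R", "31L"),
  stringsAsFactors = FALSE
)

# The comparison is NA for the first row, TRUE for the second, FALSE for the third.
df$Runway == "31R"

# Bracket indexing keeps the NA position as an all-NA row.
by_bracket <- df[df$Runway == "31R", ]

# subset() treats NA in the condition as FALSE.
by_subset <- subset(df, Runway == "31R")

# which() also drops NA, so bracket indexing can be kept if preferred.
by_which <- df[which(df$Runway == "31R"), ]

nrow(by_bracket)   # two rows: one all-NA row plus the real match
nrow(by_subset)    # one row: the real match only
```

No row of arrgnd needs to be entirely NA for this to happen; a single NA in the Runway column is enough.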

On 2/1/2010 5:51 PM, David Winsemius wrote:

>> But after the merge, I get all NAs. What am I doing wrong?

I finally figured this out. I really believe that the R help write-ups are sorely lacking. As soon as I looked at http://www.statmethods.net/management/merging.html, it was obvious:

    Adding Columns

    To merge two dataframes (datasets) horizontally, use the merge
    function. In most cases, you join two dataframes by one or more
    common key variables (i.e., an inner join).

        # merge two dataframes by ID
        total <- merge(dataframeA, dataframeB, by="ID")

        # merge two dataframes by ID and Country
        total <- merge(dataframeA, dataframeB, by=c("ID","Country"))

    Adding Rows

    To join two dataframes (datasets) vertically, use the rbind
    function. The two dataframes must have the same variables, but
    they do not have to be in the same order.

        total <- rbind(dataframeA, dataframeB)

I needed to add rows, and had to use rbind. If the help for merge had said "To merge two dataframes (datasets) horizontally", I would have known right away that it was the wrong function to use.

Thanks for the help,
Jim Rome
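
To make the resolution concrete, here is a small sketch on invented data (arrgnd_toy, its flights, and the extra runway "22L" are hypothetical). With no by argument, merge joins on every column the two data frames share, so two per-runway subsets with identical columns have no matching rows and the result is empty; asking for rows 1:3 of an empty data frame then prints rows of NAs, which matches the output reported earlier in the thread. rbind stacks the rows instead.

```r
# Invented stand-in for arrgnd.
arrgnd_toy <- data.frame(
  Flight = c("TAI570", "AAL22", "DAL242", "BAW183", "XYZ1"),
  Runway = c("31R", "31L", "31L", "31R", "22L"),
  stringsAsFactors = FALSE
)
rwy <- c("31L", "31R")

left  <- subset(arrgnd_toy, Runway == "31L")
right <- subset(arrgnd_toy, Runway == "31R")

# Inner join on all shared columns (Flight and Runway): nothing matches.
joined <- merge(left, right)
nrow(joined)
joined[1:3, ]   # indexing past the end of an empty data frame shows NA rows

# Stacking rows is what was wanted.
stacked <- rbind(left, right)

# The original loop, rewritten with lapply and a single rbind.
pieces <- lapply(rwy, function(r) subset(arrgnd_toy, Runway == r))
arrw   <- do.call(rbind, pieces)

# If the per-runway pieces are not needed, one filter does the same job.
arrw2 <- arrgnd_toy[arrgnd_toy$Runway %in% rwy, ]
```

The %in% version keeps the rows in their original order, whereas the rbind version groups them by runway; %in% also treats NA as a non-match, so it does not produce all-NA rows.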