Miroslav Lachman
2010-Oct-02 19:58 UTC
is there a bug in AWK on 6.x and 7.x (fixed in 8.x)?
I think there is a bug in AWK in base of FreeBSD 6.x and 7.x (tested on 6.4 i386 and 7.3 i386).

I have this simple test case, where I want 2 columns from the GeoIP CSV file:

awk 'FS="," { print $1"-"$2 }' GeoIPCountryWhois.csv

It should produce output like this:

# awk 'FS="," { print $1"-"$2 }' GeoIPCountryWhois.csv | head -n 5
"1.0.0.0"-"1.7.255.255"
"1.9.0.0"-"1.9.255.255"
"1.10.10.0"-"1.10.10.255"
"1.11.0.0"-"1.11.255.255"
"1.12.0.0"-"1.15.255.255"

(above is taken from FreeBSD 8.1 i386)

On FreeBSD 6.4 and 7.3 it results in a broken first line:

awk 'FS="," { print $1"-"$2 }' GeoIPCountryWhois.csv | head -n 5
"1.0.0.0","1.7.255.255","16777216","17301503","AU","Australia"-
"1.9.0.0"-"1.9.255.255"
"1.10.10.0"-"1.10.10.255"
"1.11.0.0"-"1.11.255.255"
"1.12.0.0"-"1.15.255.255"

There are no errors in the CSV file; it doesn't matter if I delete the affected first line from the file.

It is reproducible with a handmade file:

# cat test.csv
"1.9.0.0","1.9.255.255","17367040","17432575","MY","Malaysia"
"1.10.10.0","1.10.10.255","17435136","17435391","AU","Australia"
"1.11.0.0","1.11.255.255","17498112","17563647","KR","Korea, Republic of"
"1.12.0.0","1.15.255.255","17563648","17825791","CN","China"
"1.16.0.0","1.19.255.255","17825792","18087935","KR","Korea, Republic of"
"1.21.0.0","1.21.255.255","18153472","18219007","JP","Japan"

# awk 'FS="," { print $1"-"$2 }' test.csv
"1.9.0.0","1.9.255.255","17367040","17432575","MY","Malaysia"-
"1.10.10.0"-"1.10.10.255"
"1.11.0.0"-"1.11.255.255"
"1.12.0.0"-"1.15.255.255"
"1.16.0.0"-"1.19.255.255"
"1.21.0.0"-"1.21.255.255"

As it works in 8.1, can it be fixed in 7-STABLE?
(I don't know if it was purposely fixed or if it is a coincidence of the newer version of AWK in 8.x)

Should I file a PR for it?

Miroslav Lachman
Miroslav Lachman
2010-Oct-02 21:21 UTC
is there a bug in AWK on 6.x and 7.x (fixed in 8.x)?
Damian Weber wrote:
> On Sat, 2 Oct 2010, Miroslav Lachman wrote:
>
>> Date: Sat, 02 Oct 2010 21:58:27 +0200
>> From: Miroslav Lachman <000.fbsd@quip.cz>
>> To: freebsd-stable <freebsd-stable@freebsd.org>
>> Subject: is there a bug in AWK on 6.x and 7.x (fixed in 8.x)?
>>
>> I think there is a bug in AWK in base of FreeBSD 6.x and 7.x (tested on 6.4
>> i386 and 7.3 i386)
>>
>> I have this simple test case, where I want 2 columns from GeoIP CSV file:
>>
>> awk 'FS="," { print $1"-"$2 }' GeoIPCountryWhois.csv
>>
>> It should produce output like this:
>>
>> # awk 'FS="," { print $1"-"$2 }' GeoIPCountryWhois.csv | head -n 5
>> "1.0.0.0"-"1.7.255.255"
>> "1.9.0.0"-"1.9.255.255"
>> "1.10.10.0"-"1.10.10.255"
>> "1.11.0.0"-"1.11.255.255"
>> "1.12.0.0"-"1.15.255.255"
>>
>> (above is taken from FreeBSD 8.1 i386)
>>
>> On FreeBSD 6.4 and 7.3 it results in broken first line:
>>
>> awk 'FS="," { print $1"-"$2 }' GeoIPCountryWhois.csv | head -n 5
>> "1.0.0.0","1.7.255.255","16777216","17301503","AU","Australia"-
>> "1.9.0.0"-"1.9.255.255"
>> "1.10.10.0"-"1.10.10.255"
>> "1.11.0.0"-"1.11.255.255"
>> "1.12.0.0"-"1.15.255.255"
>>
>
> Are you sure the command above contains a valid variable assignment?

I am not an AWK expert, so maybe you are right. I just found this difference between 7.x and 8.x. But if it works for the other lines, why doesn't it work for the first line too?

Anyway, thank you for the working examples, I will use them!

Another working example from 6.4 is:

awk -F "," '{ print $1"-"$2 }' GeoIPCountryWhois.csv | head -n 5
"1.0.0.0"-"1.7.255.255"
"1.9.0.0"-"1.9.255.255"
"1.10.10.0"-"1.10.10.255"
"1.11.0.0"-"1.11.255.255"
"1.12.0.0"-"1.15.255.255"

> The following works on both 7.3-STABLE and 8.1-STABLE
>
> $ awk -v FS="," '{ print $1"-"$2; }' GeoIPCountryWhois.csv | head -n 5
> "1.0.0.0"-"1.7.255.255"
> "1.9.0.0"-"1.9.255.255"
> "1.10.10.0"-"1.10.10.255"
> "1.11.0.0"-"1.11.255.255"
> "1.12.0.0"-"1.15.255.255"
>
>
> The following works as well
>
> $ awk '{ print $1"-"$2; }' FS="," GeoIPCountryWhois.csv | head -n 5
> "1.0.0.0"-"1.7.255.255"
> "1.9.0.0"-"1.9.255.255"
> "1.10.10.0"-"1.10.10.255"
> "1.11.0.0"-"1.11.255.255"
> "1.12.0.0"-"1.15.255.255"
>
> Or, using a BEGIN section for assignment...
>
> $ awk 'BEGIN {FS=","} { print $1"-"$2 }' GeoIPCountryWhois.csv | head -n 5
> "1.0.0.0"-"1.7.255.255"
> "1.9.0.0"-"1.9.255.255"
> "1.10.10.0"-"1.10.10.255"
> "1.11.0.0"-"1.11.255.255"
> "1.12.0.0"-"1.15.255.255"
>
> As a side note, gawk shows the following output on 7-STABLE and 8-STABLE
> $ gawk 'FS="," { print $1"-"$2 }' GeoIPCountryWhois.csv | head -n 5
> "1.0.0.0","1.7.255.255","16777216","17301503","AU","Australia"-
> "1.9.0.0"-"1.9.255.255"
> "1.10.10.0"-"1.10.10.255"
> "1.11.0.0"-"1.11.255.255"
> "1.12.0.0"-"1.15.255.255"
>
> ... which means the new behaviour of awk on 8-STABLE seems to break
> compatibility with gawk at that point.
>
> -- Damian
On Sat, 2 Oct 2010, Miroslav Lachman wrote:
> Date: Sat, 02 Oct 2010 21:58:27 +0200
> From: Miroslav Lachman <000.fbsd@quip.cz>
> To: freebsd-stable <freebsd-stable@freebsd.org>
> Subject: is there a bug in AWK on 6.x and 7.x (fixed in 8.x)?
>
> I think there is a bug in AWK in base of FreeBSD 6.x and 7.x (tested on 6.4
> i386 and 7.3 i386)
>
> I have this simple test case, where I want 2 columns from GeoIP CSV file:
>
> awk 'FS="," { print $1"-"$2 }' GeoIPCountryWhois.csv
>
> It should produce output like this:
>
> # awk 'FS="," { print $1"-"$2 }' GeoIPCountryWhois.csv | head -n 5
> "1.0.0.0"-"1.7.255.255"
> "1.9.0.0"-"1.9.255.255"
> "1.10.10.0"-"1.10.10.255"
> "1.11.0.0"-"1.11.255.255"
> "1.12.0.0"-"1.15.255.255"
>
> (above is taken from FreeBSD 8.1 i386)
>
> On FreeBSD 6.4 and 7.3 it results in broken first line:
>
> awk 'FS="," { print $1"-"$2 }' GeoIPCountryWhois.csv | head -n 5
> "1.0.0.0","1.7.255.255","16777216","17301503","AU","Australia"-
> "1.9.0.0"-"1.9.255.255"
> "1.10.10.0"-"1.10.10.255"
> "1.11.0.0"-"1.11.255.255"
> "1.12.0.0"-"1.15.255.255"
>

Are you sure the command above contains a valid variable assignment?

The following works on both 7.3-STABLE and 8.1-STABLE

$ awk -v FS="," '{ print $1"-"$2; }' GeoIPCountryWhois.csv | head -n 5
"1.0.0.0"-"1.7.255.255"
"1.9.0.0"-"1.9.255.255"
"1.10.10.0"-"1.10.10.255"
"1.11.0.0"-"1.11.255.255"
"1.12.0.0"-"1.15.255.255"

The following works as well

$ awk '{ print $1"-"$2; }' FS="," GeoIPCountryWhois.csv | head -n 5
"1.0.0.0"-"1.7.255.255"
"1.9.0.0"-"1.9.255.255"
"1.10.10.0"-"1.10.10.255"
"1.11.0.0"-"1.11.255.255"
"1.12.0.0"-"1.15.255.255"

Or, using a BEGIN section for assignment...

$ awk 'BEGIN {FS=","} { print $1"-"$2 }' GeoIPCountryWhois.csv | head -n 5
"1.0.0.0"-"1.7.255.255"
"1.9.0.0"-"1.9.255.255"
"1.10.10.0"-"1.10.10.255"
"1.11.0.0"-"1.11.255.255"
"1.12.0.0"-"1.15.255.255"

As a side note, gawk shows the following output on 7-STABLE and 8-STABLE

$ gawk 'FS="," { print $1"-"$2 }' GeoIPCountryWhois.csv | head -n 5
"1.0.0.0","1.7.255.255","16777216","17301503","AU","Australia"-
"1.9.0.0"-"1.9.255.255"
"1.10.10.0"-"1.10.10.255"
"1.11.0.0"-"1.11.255.255"
"1.12.0.0"-"1.15.255.255"

... which means the new behaviour of awk on 8-STABLE seems to break compatibility with gawk at that point.

-- Damian
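The four ways of setting the separator that come up in this thread (-F, -v, a BEGIN block, and an FS="," operand before the file name) can be compared side by side with a rough sketch like the one below. The scratch file and the variable names are made up for illustration; the two data rows are copied from the test.csv shown earlier. All four forms set FS before the first record is read, so the first line should be split like every other line.

```shell
#!/bin/sh
# Scratch file (hypothetical name); rows copied from the thread's test.csv.
sample=$(mktemp)
cat > "$sample" <<'EOF'
"1.9.0.0","1.9.255.255","17367040","17432575","MY","Malaysia"
"1.10.10.0","1.10.10.255","17435136","17435391","AU","Australia"
EOF

# 1. -F option on the command line
out_F=$(awk -F "," '{ print $1"-"$2 }' "$sample")

# 2. -v assignment, done before the program starts running
out_v=$(awk -v FS="," '{ print $1"-"$2 }' "$sample")

# 3. BEGIN block, executed before any input is read
out_begin=$(awk 'BEGIN { FS="," } { print $1"-"$2 }' "$sample")

# 4. assignment operand, processed just before the following file is opened
out_operand=$(awk '{ print $1"-"$2 }' FS="," "$sample")

rm -f "$sample"

# Show one of the results; the other three should be identical.
printf '%s\n' "$out_F"
```

If any of the four variables differs from the others on a given system, that would point at the awk implementation rather than at the CSV file.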
Dominic Fandrey
2010-Oct-03 06:18 UTC
is there a bug in AWK on 6.x and 7.x (fixed in 8.x)?
On 02/10/2010 21:58, Miroslav Lachman wrote:
> I think there is a bug in AWK in base of FreeBSD 6.x and 7.x (tested on
> 6.4 i386 and 7.3 i386)
>
> I have this simple test case, where I want 2 columns from GeoIP CSV file:
>
> awk 'FS="," { print $1"-"$2 }' GeoIPCountryWhois.csv

You know that with this syntax FS="," is treated as a condition for running the following block in curly braces? I.e. you assign FS for every line of input.

What seems to have changed is when and how a variable assignment returns true or false in a boolean condition.

Anyway, I doubt it was your intention to conditionally execute the code, so this is a bug in your programming. The correct way to do what you want, if you want to do it in the code, is a BEGIN block.

For your use of head, consider the following:

awk 'BEGIN {FS=","} NR <= 5 { print $1"-"$2 }' GeoIPCountryWhois.csv

That way the code is executed under the condition that the line number is less than or equal to 5.

-- 
A: Because it fouls the order in which people normally read text.
Q: Why is top-posting such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?
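To see the "assignment as a condition" point in isolation, here is a small sketch on a three-line scratch file (the file contents and variable names are invented for the illustration). It relies only on standard awk behaviour: a pattern is an expression, and the action runs when its value is non-zero or a non-empty string. Note that the outputs quoted earlier in the thread show the action running for the first line on both the old and the new awk, so the visible difference appears to lie in whether the freshly assigned FS is applied to the record that is already being processed. For that reason the sketch does not check the first record of the FS="," form, only the later ones.

```shell
#!/bin/sh
# Scratch input (hypothetical data, not from the thread).
tmp=$(mktemp)
printf '%s\n' 'a,b' 'c,d' 'e,f' > "$tmp"

# An assignment used as a pattern is evaluated for every record and its
# value decides whether the action runs.
zero=$(awk 'x = 0 { print }' "$tmp")        # value 0 is false: nothing printed
nonempty=$(awk 'x = "," { print }' "$tmp")  # non-empty string is true: all lines

# FS assigned in the pattern: from the second record on, the input is split
# on commas. The first record is skipped here because implementations
# differ on it.
later=$(awk 'FS = "," { if (NR > 1) print $1 "-" $2 }' "$tmp")

# BEGIN block plus an NR condition, replacing the pipe into head.
first_two=$(awk 'BEGIN { FS = "," } NR <= 2 { print $1 "-" $2 }' "$tmp")

rm -f "$tmp"

printf 'x=0 printed: [%s]\n' "$zero"
printf 'later records:\n%s\n' "$later"
printf 'first two records:\n%s\n' "$first_two"
```

Putting the assignment in BEGIN avoids depending on how a particular awk treats the record that is current when FS changes.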
Is this because the behavior of "FS=" was changed to match the behavior of awk -F?

On 2010-10-02 09:58:27PM +0200, Miroslav Lachman wrote:
> I think there is a bug in AWK in base of FreeBSD 6.x and 7.x (tested on
> 6.4 i386 and 7.3 i386)
>
> I have this simple test case, where I want 2 columns from GeoIP CSV file:
>
> awk 'FS="," { print $1"-"$2 }' GeoIPCountryWhois.csv
>
> It should produce output like this:
>
> # awk 'FS="," { print $1"-"$2 }' GeoIPCountryWhois.csv | head -n 5
> "1.0.0.0"-"1.7.255.255"
> "1.9.0.0"-"1.9.255.255"
> "1.10.10.0"-"1.10.10.255"
> "1.11.0.0"-"1.11.255.255"
> "1.12.0.0"-"1.15.255.255"
>
> (above is taken from FreeBSD 8.1 i386)
>
> On FreeBSD 6.4 and 7.3 it results in broken first line:
>
> awk 'FS="," { print $1"-"$2 }' GeoIPCountryWhois.csv | head -n 5
> "1.0.0.0","1.7.255.255","16777216","17301503","AU","Australia"-
> "1.9.0.0"-"1.9.255.255"
> "1.10.10.0"-"1.10.10.255"
> "1.11.0.0"-"1.11.255.255"
> "1.12.0.0"-"1.15.255.255"
>
> There are no errors in CSV file, it doesn't metter if I delete the
> affected first line from the file.
>
> It is reproducible with handmade file:
>
> # cat test.csv
> "1.9.0.0","1.9.255.255","17367040","17432575","MY","Malaysia"
> "1.10.10.0","1.10.10.255","17435136","17435391","AU","Australia"
> "1.11.0.0","1.11.255.255","17498112","17563647","KR","Korea, Republic of"
> "1.12.0.0","1.15.255.255","17563648","17825791","CN","China"
> "1.16.0.0","1.19.255.255","17825792","18087935","KR","Korea, Republic of"
> "1.21.0.0","1.21.255.255","18153472","18219007","JP","Japan"
>
>
> # awk 'FS="," { print $1"-"$2 }' test.csv
> "1.9.0.0","1.9.255.255","17367040","17432575","MY","Malaysia"-
> "1.10.10.0"-"1.10.10.255"
> "1.11.0.0"-"1.11.255.255"
> "1.12.0.0"-"1.15.255.255"
> "1.16.0.0"-"1.19.255.255"
> "1.21.0.0"-"1.21.255.255"
>
>
> As it works in 8.1, can it be fixed in 7-STABLE?
> (I don't know if it was purposely fixed or if it is coincidence of newer
> version of AWK in 8.x)
>
> Should I file PR for it?
>
> Miroslav Lachman

-- 
Peter C. Lai                 | Bard College at Simon's Rock
Systems Administrator        | 84 Alford Rd.
Information Technology Svcs. | Gt. Barrington, MA 01230 USA
peter AT simons-rock.edu     | (413) 528-7428