Warren Young wrote:
> On Oct 25, 2017, at 10:02 AM, Mark Haney <mark.haney at neonova.net> wrote:
>>
>> I have a file with two columns 'email' and 'total' like this:
>>
>> me@example.com 20
>> me@example.com 40
>> you@domain.com 100
>> you@domain.com 30
>>
>> I need to get the total number of messages for each email address.
>
> This screams out for associative arrays. (Also called hashes,
> dictionaries, maps, etc.)
>
> That does limit you to CentOS 7+, or maybe 6+, as I recall. CentOS 5 is
> definitely out, as that ships Bash 3, which lacks this feature.
<snip>
Associative arrays?

Awk! Awk! (No, I am not a seagull...)

sort file | awk '{ array[$1] += $2;} END { for (i in array) { print i "\t" array[i];}'

        mark "associative arrays, how do I love thee? Let me tot the arrays..."
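For comparison, the Bash associative-array route that Warren mentions could look roughly like the sketch below. It assumes Bash 4 or later (needed for "local -A") and whitespace-separated input with the address in the first column. The function name sum_by_address is made up for illustration and does not come from the thread.

```shell
#!/usr/bin/env bash
# Sketch only: requires Bash 4+ for associative arrays.

# sum_by_address (hypothetical helper name): reads "address count" pairs on
# stdin and prints one "address<TAB>total" line per distinct address.
sum_by_address() {
    local -A totals=()                  # associative array keyed by address
    local address count
    while read -r address count; do
        [[ -n $address ]] || continue   # skip blank lines
        totals[$address]=$(( ${totals[$address]:-0} + count ))
    done
    # Iteration order over the keys is unspecified, so sort afterwards
    # if a stable order matters.
    for address in "${!totals[@]}"; do
        printf '%s\t%s\n' "$address" "${totals[$address]}"
    done
}

# Sample data from the original question.
printf '%s\n' 'me@example.com 20' 'me@example.com 40' \
              'you@domain.com 100' 'you@domain.com 30' |
    sum_by_address | sort
```

With the four sample lines this should print me@example.com with 60 and you@domain.com with 130, tab-separated.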
On 10/25/2017 01:24 PM, m.roth at 5-cent.us wrote:
>>
>> This screams out for associative arrays. (Also called hashes,
>> dictionaries, maps, etc.)
>>
>> That does limit you to CentOS 7+, or maybe 6+, as I recall. CentOS 5 is
>> definitely out, as that ships Bash 3, which lacks this feature.
> <snip>
> Associative arrays?
>
> Awk! Awk! (No, I am not a seagull...)
>
> sort file | awk '{ array[$1] += $2;} END { for (i in array) { print i "\t"
> array[i];}'
>
> mark "associative arrays, how do I love thee? Let me tot the arrays..."
>
Okay, I'm impressed with this one. I use awk for simple stuff when sed starts getting weird, but this is absolutely elegant. No offense to the other examples, they are all awesome, but I had no idea awk could do this with such little effort. Well, I know what I'm studying up on this weekend.

-- 
Mark Haney
Network Engineer at NeoNova
919-460-3330 option 1
mark.haney at neonova.net
www.neonova.net
Mark Haney wrote:
> On 10/25/2017 01:24 PM, m.roth at 5-cent.us wrote:
>>>
>>> This screams out for associative arrays. (Also called hashes,
>>> dictionaries, maps, etc.)
>>>
>>> That does limit you to CentOS 7+, or maybe 6+, as I recall. CentOS 5
>>> is definitely out, as that ships Bash 3, which lacks this feature.
>> <snip>
>> Associative arrays?
>>
>> Awk! Awk! (No, I am not a seagull...)
>>
>> sort file | awk '{ array[$1] += $2;} END { for (i in array) { print i
>> "\t" array[i];}'
>>
>> mark "associative arrays, how do I love thee? Let me tot the
>> arrays..."
>>
> Okay, I'm impressed with this one. I use awk for simple stuff when sed
> starts getting weird, but this is absolutely elegant. No offense to the
> other examples, they are all awesome, but I had no idea awk could do
> this with such little effort. Well, I know what I'm studying up on this
> weekend.
>
The perl script was about the same. It's just, well, I learned awk when I first got into *nix, in '91. Had a project where We were going to be the center and Tell All Agencies The Format of the data they would give us, and we'd load a d/b.... I wrote the d/b loader in C... and then they all said, "sorry, no budget for that, here's the format we've got it in, ya want it or not?" Before that project finished, I had 30 awk scripts, ranging in length from 100-200 lines (yes, really), to reformat and validate the data before feeding it to the loader I'd written.

The other thing - there may be more succinct ways to write it (my manager, these days, uses regular expressions to the point that I have to look up what they're doing). But more than half my career was as a programmer, and I write code such that if I get hit by a car, or take another job, or get called at 16:30 on a Friday, or 02:00, the problem can be fixed without spending hours trying to remember how clever I'd been last year... so I make it easily readable and comprehensible.

awk is just fun.

        mark
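In the spirit of the readability point above, here is a hedged sketch of the same aggregation laid out as a commented, multi-line awk program. The wrapper name sum_totals, the array name total, and the "NF >= 2" guard against short lines are illustrative additions, not part of any script posted in the thread.

```shell
# sum_totals (hypothetical name): prints "address<TAB>total" for each address
# found in the files given as arguments, or on stdin if there are none.
sum_totals() {
    awk '
        # Main rule: runs once per input line that has at least two fields.
        # $1 is the address, $2 the message count for that line.
        NF >= 2 {
            total[$1] += $2
        }

        # END rule: runs once, after the last line has been read.
        # The order of a for-in loop is unspecified in awk.
        END {
            for (address in total) {
                print address "\t" total[address]
            }
        }
    ' "$@"
}

# Sample data from the original question; sort the output for a stable order.
printf '%s\n' 'me@example.com 20' 'me@example.com 40' \
              'you@domain.com 100' 'you@domain.com 30' |
    sum_totals | sort
```

It does the same work as the one-liner; the only trade is a few more lines for names and comments that explain themselves at 02:00.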
hrm.. seems like you were missing a }

sort file | awk '{array[$1] += $2;} END { for (i in array) {print i "\t" array[i];}}'

regards,

Jason

On 10/25/2017 01:24 PM, m.roth at 5-cent.us wrote:
> Warren Young wrote:
>> On Oct 25, 2017, at 10:02 AM, Mark Haney <mark.haney at neonova.net> wrote:
>>> I have a file with two columns 'email' and 'total' like this:
>>>
>>> me@example.com 20
>>> me@example.com 40
>>> you@domain.com 100
>>> you@domain.com 30
>>>
>>> I need to get the total number of messages for each email address.
>> This screams out for associative arrays. (Also called hashes,
>> dictionaries, maps, etc.)
>>
>> That does limit you to CentOS 7+, or maybe 6+, as I recall. CentOS 5 is
>> definitely out, as that ships Bash 3, which lacks this feature.
> <snip>
> Associative arrays?
>
> Awk! Awk! (No, I am not a seagull...)
>
> sort file | awk '{ array[$1] += $2;} END { for (i in array) { print i "\t"
> array[i];}'
>
> mark "associative arrays, how do I love thee? Let me tot the arrays..."
>
>
> _______________________________________________
> CentOS mailing list
> CentOS at centos.org
> https://lists.centos.org/mailman/listinfo/centos
Jason Welsh wrote:
> hrm.. seems like you were missing a }
>
> sort file | awk '{array[$1] += $2;} END { for (i in array) {print i "\t"
> array[i];}}'
>
Oops. Well, it's not vi, it's webmail, so I couldn't check... <g>

Thanks.

        mark

>
> regards,
>
> Jason
>
>
>
> On 10/25/2017 01:24 PM, m.roth at 5-cent.us wrote:
>> Warren Young wrote:
>>> On Oct 25, 2017, at 10:02 AM, Mark Haney <mark.haney at neonova.net>
>>> wrote:
>>>> I have a file with two columns 'email' and 'total' like this:
>>>>
>>>> me@example.com 20
>>>> me@example.com 40
>>>> you@domain.com 100
>>>> you@domain.com 30
>>>>
>>>> I need to get the total number of messages for each email address.
>>> This screams out for associative arrays. (Also called hashes,
>>> dictionaries, maps, etc.)
>>>
>>> That does limit you to CentOS 7+, or maybe 6+, as I recall. CentOS 5
>>> is definitely out, as that ships Bash 3, which lacks this feature.
>> <snip>
>> Associative arrays?
>>
>> Awk! Awk! (No, I am not a seagull...)
>>
>> sort file | awk '{ array[$1] += $2;} END { for (i in array) { print i
>> "\t" array[i];}'
>>
>> mark "associative arrays, how do I love thee? Let me tot the
>> arrays..."
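One way to catch an unbalanced brace before posting, sketched under the assumption that the local awk exits non-zero on a syntax error (gawk and mawk both do): run the program against empty input. The helper name check_awk is hypothetical.

```shell
# The program as first posted (one closing brace short) and the corrected one.
posted='{ array[$1] += $2;} END { for (i in array) { print i "\t" array[i];}'
fixed='{ array[$1] += $2;} END { for (i in array) { print i "\t" array[i];}}'

# check_awk (hypothetical helper): run a program on empty input.
# awk parses the whole program before reading any data, so a syntax error
# yields a non-zero exit status even though no lines are processed.
check_awk() {
    awk "$1" /dev/null >/dev/null 2>&1
}

check_awk "$posted" && echo "posted: ok" || echo "posted: syntax error"
check_awk "$fixed"  && echo "fixed: ok"  || echo "fixed: syntax error"
```

This only checks that the program parses; it says nothing about whether the totals are right.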
In article <b5215baacd93a6e85efc59947f9b8ed9.squirrel at host290.hostmonster.com>,
<m.roth at 5-cent.us> wrote:
> Warren Young wrote:
> > On Oct 25, 2017, at 10:02 AM, Mark Haney <mark.haney at neonova.net> wrote:
> >>
> >> I have a file with two columns 'email' and 'total' like this:
> >>
> >> me@example.com 20
> >> me@example.com 40
> >> you@domain.com 100
> >> you@domain.com 30
> >>
> >> I need to get the total number of messages for each email address.
> >
> > This screams out for associative arrays. (Also called hashes,
> > dictionaries, maps, etc.)
> >
> > That does limit you to CentOS 7+, or maybe 6+, as I recall. CentOS 5 is
> > definitely out, as that ships Bash 3, which lacks this feature.
> <snip>
> Associative arrays?
>
> Awk! Awk! (No, I am not a seagull...)
>
> sort file | awk '{ array[$1] += $2;} END { for (i in array) { print i "\t"
> array[i];}'

Why the sort? It doesn't matter in what order the lines are read.
Wouldn't this give you the same?

awk '{ array[$1] += $2;} END { for (i in array) { print i "\t" array[i];}}' <file

Cheers
Tony
-- 
Tony Mountifield
Work: tony at softins.co.uk - http://www.softins.co.uk
Play: tony at mountifield.org - http://tony.mountifield.org
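A quick way to test the point about input order, as a sketch: feed the same awk program the sample data with and without the leading sort, then compare. The variable names and the interleaved ordering of the sample lines are made up for the demonstration. Since the order of a for-in loop is unspecified in awk, the sort here is applied to the output, which is where it matters if a stable listing is wanted.

```shell
# The corrected program, as posted in this thread.
prog='{ array[$1] += $2;} END { for (i in array) { print i "\t" array[i];}}'

# Sample data from the original question, deliberately interleaved.
data=$(printf '%s\n' 'you@domain.com 30' 'me@example.com 20' \
                     'you@domain.com 100' 'me@example.com 40')

# Same program, input sorted first versus input fed as-is.
# Both outputs are sorted so they can be compared line by line.
with_sort=$(printf '%s\n' "$data" | sort | awk "$prog" | sort)
without_sort=$(printf '%s\n' "$data" | awk "$prog" | sort)

[ "$with_sort" = "$without_sort" ] && echo "same totals either way"
printf '%s\n' "$without_sort"
```

Both runs should report 60 for me@example.com and 130 for you@domain.com.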
Tony Mountifield wrote:
> In article
> <b5215baacd93a6e85efc59947f9b8ed9.squirrel at host290.hostmonster.com>,
> <m.roth at 5-cent.us> wrote:
>> Warren Young wrote:
>> > On Oct 25, 2017, at 10:02 AM, Mark Haney <mark.haney at neonova.net>
>> > wrote:
>> >>
>> >> I have a file with two columns 'email' and 'total' like this:
>> >>
>> >> me@example.com 20
>> >> me@example.com 40
>> >> you@domain.com 100
>> >> you@domain.com 30
>> >>
>> >> I need to get the total number of messages for each email address.
>> >
>> > This screams out for associative arrays. (Also called hashes,
>> > dictionaries, maps, etc.)
>> >
>> > That does limit you to CentOS 7+, or maybe 6+, as I recall. CentOS 5
>> > is definitely out, as that ships Bash 3, which lacks this feature.
>> <snip>
>> Associative arrays?
>>
>> Awk! Awk! (No, I am not a seagull...)
>>
>> sort file | awk '{ array[$1] += $2;} END { for (i in array) { print i
>> "\t" array[i];}'
>
> Why the sort? It doesn't matter in what order the lines are read.
> Wouldn't this give you the same?
>
> awk '{ array[$1] += $2;} END { for (i in array) { print i "\t"
> array[i];}}' <file
>
You're right, not really necessary in this case. I was working with a couple of awk scripts here at work, and it was needed in the middle....

        mark