Hello,

I have a log file with the following input:

X , ID , Date, Time, Y
01,01368,2010-12-02,09:07:00,Pass
01,01368,2010-12-02,10:54:00,Pass
01,01368,2010-12-02,13:07:04,Pass
01,01368,2010-12-02,18:54:01,Pass
01,01368,2010-12-03,09:02:00,Pass
01,01368,2010-12-03,13:53:00,Pass
01,01368,2010-12-03,16:07:00,Pass

My goal is to get the number of times an ID has a TIME after 09:00:00 on each DATE. That would give me two outputs: first, the number of days the ID has been late, and second, the day and time at which the ID was late.

I've started as such:

sort -t ',' -k 3,3 -k 4,4 file.log   # sorts the file by the DATE field, then by the TIME field

I've been stuck for the last 30 minutes trying to find a way to get the first line of each day (logically it will be the earliest, since I've already sorted by date and time). Once I know how to do this, I'll be able to compare the times and proceed.

Can anyone help? I looked into sort -u and uniq -f3, though I didn't get far with either.
lhecking at users.sourceforge.net
2010-Dec-21 17:33 UTC
[CentOS] Text Proccessing script - advice?
> sort -t ',' -k 3,3 -k 4,4 file.log # this will sort the file according to the DATE field as well as the Time field.
> I'm stuck for the last 30 min to find a way to get the first line of each day (logically it'll be the earliest as i've sorted by date/time previously) once i know how to do this, i'll be able to compare time and proceed..

If you're not afraid of Perl, the Date-Manip module allows comparing times and dates, among other things.
Roland RoLaNd wrote:
>
> I have a log file with the following input:
> X , ID , Date, Time, Y
> 01,01368,2010-12-02,09:07:00,Pass
> 01,01368,2010-12-02,10:54:00,Pass
> 01,01368,2010-12-02,13:07:04,Pass
> 01,01368,2010-12-02,18:54:01,Pass
> 01,01368,2010-12-03,09:02:00,Pass
> 01,01368,2010-12-03,13:53:00,Pass
> 01,01368,2010-12-03,16:07:00,Pass
>
> My goal is to get the number of times ID has a TIME that's after 09:00:00
> each DATE.
> That would give me two output. one is the number of days ID has been late,
> and secondly, the day and time this ID has been late .
>

awk 'BEGIN { FS = "," }
# Needs gawk 4.0 or later: arrays of arrays are a gawk extension.
{
    if ($4 > "09:00:00") {
        array[$2][1]++                              # slot 1 holds the count for this ID
        array[$2][array[$2][1] + 1] = $3 "::" $4    # later slots hold date::time
    }
}
END {
    for (j in array) {
        for (k in array[j]) {
            print j, array[j][k]
        }
    }
}' file.log

It's been a while since I needed to do this, but I *think* the nested "for (<var> in array)" will work.

<snip>

       mark
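The sticking point in the question, picking the first line of each day out of sorted output, can be sketched with sort and a one-line awk filter. This is only a sketch under stated assumptions: the function name first_per_day is hypothetical, the file is assumed to have the comma-separated layout shown in the question with a header on line 1, and the ID is added as the leading sort key so that a file holding several IDs is handled as well.

```shell
#!/bin/sh
# first_per_day FILE: print the earliest record for each ID on each date.
# Hypothetical helper name; assumes the layout X,ID,Date,Time,Y from the
# question, with a header on line 1.
first_per_day() {
    tail -n +2 "$1" |                   # drop the header line
        sort -t, -k2,2 -k3,3 -k4,4 |    # order by ID, then date, then time
        awk -F, '!seen[$2, $3]++'       # keep only the first line per ID and date
}
```

Calling first_per_day file.log then leaves one line per ID per day, whose time field can be compared against 09:00:00. The plain string sort is enough here because the dates are in YYYY-MM-DD form and the times are zero-padded.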
On 12/21/2010 11:30 AM, Roland RoLaNd wrote:
>
> Hello,
>
> I have a log file with the following input:
> X , ID , Date, Time, Y
> 01,01368,2010-12-02,09:07:00,Pass
> 01,01368,2010-12-02,10:54:00,Pass
> 01,01368,2010-12-02,13:07:04,Pass
> 01,01368,2010-12-02,18:54:01,Pass
> 01,01368,2010-12-03,09:02:00,Pass
> 01,01368,2010-12-03,13:53:00,Pass
> 01,01368,2010-12-03,16:07:00,Pass
>
> My goal is to get the number of times ID has a TIME that's after 09:00:00 each DATE.
> That would give me two output. one is the number of days ID has been late, and secondly, the day and time this ID has been late .
>
> I've started as such:
>
> sort -t ',' -k 3,3 -k 4,4 file.log # this will sort the file according to the DATE field as well as the Time field.
> I'm stuck for the last 30 min to find a way to get the first line of each day (logically it'll be the earliest as i've sorted by date/time previously) once i know how to do this, i'll be able to compare time and proceed..
>
> Can any one help ?
> i looked into sort - u and uniq -f3 though i didnt get far with it..

Most logs are written in append mode, so ascending date/time order comes naturally. This Perl should list each instance and the count:

my %id_count;
my %id_date;    # last date already reported for each ID

while (<>) {
    chomp;
    my ($x, $id, $date, $time) = split /,/;
    next if $x =~ /^\s*X\s*$/;          # skip header (string match, not ==)
    next if $time le "09:00:00";
    next if exists $id_date{$id} && $id_date{$id} eq $date;
    $id_date{$id} = $date;
    print "$id - $date - $time\n";
    $id_count{$id}++;
}
print "----\n";
while (my ($id, $count) = each %id_count) {
    print "$id late $count days\n";
}

--
  Les Mikesell
   lesmikesell at gmail.com
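For comparison, the same report can be sketched in the sort and awk tooling the question started from. Treat it as a sketch under assumptions rather than a definitive answer: the function name late_report is hypothetical, line 1 of the file is assumed to be a header, and "late" is assumed to mean that the earliest entry of the day is after 09:00:00, so an ID that first appears at 08:55 is not flagged by a later entry that day.

```shell
#!/bin/sh
# late_report FILE: list each late day per ID, then a count of late days per ID.
# Hypothetical helper; assumes "late" means the earliest entry of the day is
# after 09:00:00, and that line 1 of FILE is a header.
late_report() {
    tail -n +2 "$1" | sort -t, -k2,2 -k3,3 -k4,4 | awk -F, '
        seen[$2, $3]++ { next }         # not the first entry of that day
        $4 > "09:00:00" {               # zero-padded times compare as strings
            print $2 " - " $3 " - " $4
            days[$2]++
        }
        END {
            print "----"
            for (id in days) print id " late " days[id] " days"
        }'
}
```

Run as late_report file.log. The output format mirrors the Perl above; with several IDs, the order of the summary lines is not guaranteed, because awk does not define the traversal order of "for (id in days)".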