Hey Listees,

I am trying to write a shell script to sort and compare my blacklist for squidGuard with the nightly updates that come down in a tarball. It should be rather simple, but I'm not too great at this. The script is to run nightly: it will download the latest blacklist tarball, untar it and then add any new entries to the existing blacklist. The blacklists work by having a folder for each filtered category, so the folder "db" contains the subfolders "adult", "gambling", "drugs" etc., and each subfolder has two files, "domains" and "urls" (pretty self-explanatory). I haven't had a chance to test this script yet; I have only gotten as far as writing it. This is what I have so far:

#!/bin/bash
#This will be running from home directory

wget http://www.blacklistsite.com/blacklist.tar
tar -cxf blacklist.tar
cd BL

find ./ -type d -maxdepth 1 | while read FOLDER; do
    SQUIDDB="usr/local/squidGuard/db/$FOLDER"
    sort_db($SQUIDDB)
    comm -3 $SQUIDDB/domains $FOLDER/domains > $SQUIDDB/domains.missing
    comm -3 $SQUIDDB/urls $FOLDER/urls > $SQUIDDB/urls.missing
    cat $SQUIDDB/domains.missing >> $SQUIDDB/domains
    cat $SQUIDDB/urls.missing >> $SQUIDDB/urls
    rm $SQUIDDB/domains.missing
    rm $SQUIDDB/urls.missing
    sort_db($SQUIDDB)
done

sort_db(){
    sort -f $1/domains > $1/domains.sorted
    sort -f $1/urls > $1/urls.sorted
    rm $1/domains
    rm $1/urls
    mv $1/doamins.sorted $1/domains
    mv $1/urls.sorted $1/urls
}

Is it obvious I'm new to this? Hehe. I would also love to hear how people would do this in a more efficient manner, because obviously this is pretty sloppy, and as I said I haven't tested it yet, so it might not even run?!

Thanks, James ;)

-----BEGIN GEEK CODE BLOCK-----
Version: 3.1
GIT/MU/U dpu s: a--> C++>$ U+> L++> B-> P+> E?> W+++>$ N K W++ O M++>$ V-
PS+++ PE++ Y+ PGP t 5 X+ R- tv+ b+> DI D+++ G+ e(+++++) h--(++) r++ z++
------END GEEK CODE BLOCK------

on 5-13-2009 4:21 AM James Bensley spake the following:
> I am trying to write a shell script to sort and compare my blacklist
> for squidGuard with the nightly updates that come down in a tar ball.
> [...]

Are you looking to have a custom blacklist, or do you just want to know what changed?
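
If the answer is "just what changed", a rough sketch along these lines might do. Note that `comm -3`, as used in the posted script, prints lines unique to either file (with those from the second file indented by a tab), whereas `comm -13` prints only the lines unique to the second file. The file names and sample domains here are made up for illustration and are not taken from any real blacklist.

```shell
#!/bin/bash
# Sketch: list entries that appear only in the freshly downloaded list.
# File names and sample domains are hypothetical.
set -eu
export LC_ALL=C   # sort and comm must agree on the collation order

work=$(mktemp -d)
printf '%s\n' example-b.test example-a.test > "$work/current"
printf '%s\n' example-c.test example-b.test > "$work/update"

# comm requires both inputs to be sorted
sort -o "$work/current" "$work/current"
sort -o "$work/update" "$work/update"

# -1 hides lines unique to the first file, -3 hides lines common to both,
# so only lines unique to the update remain
new_entries=$(comm -13 "$work/current" "$work/update")

echo "$new_entries"
rm -rf "$work"
```

Here only `example-c.test` is reported as new, since `example-b.test` is already in the current list.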

> to run nightly, it will download the latest blacklist tarball, un tar
> it and then add any new entries to the existing black list.

If you're already going to the effort of downloading the entire blacklist every night, why not dump the old database and just insert the newly downloaded one?

> tar -cxf blacklist.tar

This will suck your computer into a vortex of doom. I recommend either creating a tarball or extracting one, but not both at the same time. :)

In all honesty, you might be better off targeting this query at squidGuard users, as this may be something they do regularly.
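
To spell out the tar point: `-c` (create) and `-x` (extract) are separate modes of tar, and the nightly script only needs the second. The following is a small sketch that builds a stand-in tarball in a scratch directory and then extracts it. The `BL/adult/domains` layout mirrors the one described in the original post, but the file contents are invented.

```shell
#!/bin/bash
# Sketch: create a tarball with -c, then extract it with -x.
# Everything happens in a throwaway directory; contents are hypothetical.
set -eu

work=$(mktemp -d)
mkdir -p "$work/src/BL/adult"
echo "example-a.test" > "$work/src/BL/adult/domains"

# -c creates an archive (this stands in for the provider's tarball)
tar -cf "$work/blacklist.tar" -C "$work/src" BL

# -x extracts an existing archive; this is the mode the script wants
mkdir "$work/unpacked"
tar -xf "$work/blacklist.tar" -C "$work/unpacked"

extracted=$(cat "$work/unpacked/BL/adult/domains")
echo "$extracted"
rm -rf "$work"
```

In the posted script, the line would then presumably read `tar -xf blacklist.tar`.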

> if you're already going to the effort of downloading the entire
> blacklist every night, why not dump the old database, and just insert
> the newly downloaded one?

Because we also add our own entries to the current blacklist, so we are just adding any new entries from our blacklist provider's nightly updates.

>> tar -cxf blacklist.tar
>
> this will suck your computer into a vortex of doom. I recommend either
> creating a tarball, or extracting one, but not both at the same time. :)

It's OK, the blacklist is text, so it's a 10 MB tarball of text. It takes about 30 seconds to download and it will take about 2 minutes for the script to run ;)

> In all honesty, you might be better targeting this query to squidGuard
> users, as this may be something they do regularly.

It should be simple text manipulation :( Nonetheless, it's a good idea; I will post my question there.

Thanks!
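
Since the aim is to keep locally added entries while picking up new ones from the update, one possible approach is to merge each pair of files with `sort -u` instead of the `comm`/`cat`/re-sort sequence. This is only a sketch under assumptions: the helper names `merge_list` and `merge_db` are hypothetical, the directories are scratch copies, not the real squidGuard paths, and it never removes entries the provider has dropped.

```shell
#!/bin/bash
# Sketch: merge an unpacked update into an existing per-category database.
# merge_list and merge_db are hypothetical helper names.
set -eu
export LC_ALL=C

# merge_list CURRENT UPDATE: add every line of UPDATE to CURRENT, keeping
# existing (including locally added) lines and dropping duplicates.
merge_list() {
    local current=$1 update=$2
    # sort allows the -o output file to be one of its inputs
    sort -u -o "$current" "$current" "$update"
}

# merge_db UPDATE_DIR DB_DIR: walk each category folder in the update
merge_db() {
    local update_dir=$1 db_dir=$2 dir category list
    for dir in "$update_dir"/*/; do
        category=$(basename "$dir")
        for list in domains urls; do
            # Only merge lists that exist on both sides
            if [ -f "$dir$list" ] && [ -f "$db_dir/$category/$list" ]; then
                merge_list "$db_dir/$category/$list" "$dir$list"
            fi
        done
    done
}

# Demo on throwaway directories with invented entries
work=$(mktemp -d)
mkdir -p "$work/db/adult" "$work/BL/adult"
printf '%s\n' local-entry.test example-a.test > "$work/db/adult/domains"
printf '%s\n' example-a.test example-b.test > "$work/BL/adult/domains"

merge_db "$work/BL" "$work/db"
merged=$(cat "$work/db/adult/domains")
echo "$merged"
rm -rf "$work"
```

After the merge, the demo file holds the local entry plus both provider entries, with the shared one listed once. Defining the functions before they are called also sidesteps the ordering problem in the posted script, where `sort_db` is used before it is defined.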