I am going to try an experiment with e-mail aggregation where I expect to receive over 1 million e-mails a day from public lists.

Can anyone shed some light on hard disk space (to retain this e-mail for long periods) and system specs to be able to handle the load?

I am looking to buy a low-end box, but one that can hold lots of RAM and accommodate a fair number of HDs to store the e-mail while I try my experiments.

Can anyone provide some realistic specs while maintaining a small budget?

-Jason

On Wed, 2011-08-03 at 10:53 -0700, Todd wrote:
> I am going to try an experiment with e-mail aggregation where I
> expect to receive over 1 million e-mails a day from public lists.

You're surely not going to read all of them ;-)

--
With best regards,
Paul.
England, EU.

Google is your friend, but here are some things to think about:

1) Distribution of receipt. Don't make the mistake of using the flat average, 1,000,000/(24*60) per minute, to spec your network and I/O capacity. Depending on your taste, a distributed file system using iSCSI or one of the cluster filesystems may be a good idea...
2) Size of e-mails, which may affect
3) inode configuration on the disk
4) DNS lookup times
5) Spam processing load, but maybe you want the spam too

I really need to know what you mean by 'small budget' as well. $100s or $1,000s?

On Wed, 3 Aug 2011, Todd wrote:
> I am going to try an experiment with e-mail aggregation where I expect to
> receive over 1 million e-mails a day from public lists.
> Can anyone shed some light on hard disk space (to retain this e-mail for
> long periods) and system specs to be able to handle the load?
>
> I am looking to buy a low end box, but that can hold lots of RAM and
> accommodate a fair number of HD's to store the e-mail while I try my
> experiments.
>
> Can anyone provide some realistic specs while maintaining a small budget?
>
> -Jason

----------------------------------------------------------------------
Jim Wildman, CISSP, RHCE
jim at rossberry.com
http://www.rossberry.net
"Society in every state is a blessing, but Government, even in its best state, is a necessary evil; in its worst state, an intolerable one." Thomas Paine
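
To put rough numbers on points 2 and 3 in the list above, here is a minimal sizing sketch. It assumes one file per message (maildir-style storage), and the 10 KiB average message size and 365-day retention are placeholders I made up for illustration, not figures from this thread. Swap in your own measurements before trusting the result.

```python
# Rough disk and inode sizing for a mail archive.
# ASSUMPTIONS (hypothetical, not from the thread): one file per message,
# a 10 KiB average message size, and 365 days of retention.

def inodes_needed(messages_per_day: int, retention_days: int) -> int:
    """One inode per stored message when each message is its own file."""
    return messages_per_day * retention_days


def storage_bytes(messages_per_day: int, avg_message_bytes: int,
                  retention_days: int) -> int:
    """Raw payload bytes, ignoring filesystem overhead and RAID redundancy."""
    return messages_per_day * avg_message_bytes * retention_days


if __name__ == "__main__":
    per_day = 1_000_000       # from the original question
    avg_size = 10 * 1024      # assumed average message size in bytes
    days = 365                # assumed retention period

    total = storage_bytes(per_day, avg_size, days)
    print(f"inodes needed: {inodes_needed(per_day, days):,}")
    print(f"raw storage:   {total / 1024**4:.2f} TiB")
```

The inode count matters because a filesystem formatted with default settings may run out of inodes long before it runs out of blocks when the files are this small.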

On 08/03/11 10:53 AM, Todd wrote:
> I am looking to buy a low end box, but that can hold lots of RAM and
> accommodate a fair number of HD's to store the e-mail while I try my
> experiments.

The HP DL180 G6 is a nice box for those requirements. It's a 2U server that can be configured with up to 2 x 6-core Xeon 5600 series processors and up to 96GB of RAM without using really expensive memory (it has 12 memory slots, 6 per CPU socket, so 6 x 8GB gets you 48GB and 12 x 8GB gets you 96GB), and it has either 12 x 3.5" SAS/SATA or 25 x 2.5" SAS/SATA hot-swap drives.

A million e-mails/day is an average of about 12/second, every single second of the day. I'd wager your file system had better be able to handle 3-4 times that so that bursts are handled gracefully. I'd definitely recommend using RAID 10 with a fair number of disks for this, as that's lots of small file creates.

What are you doing with this e-mail when you receive it, beyond just saving it?

--
john r pierce                            N 37, W 122
santa cruz ca                         mid-left coast
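
For anyone who wants to rerun the arithmetic above with their own volume, here is a small sketch. The function names are mine, and the 3x and 4x burst factors are just the rule of thumb suggested in this message, not measured values.

```python
# Average and burst message rates for a given daily volume.
# The burst factors are the 3-4x rule of thumb from the thread,
# not a measurement; function names are illustrative only.

SECONDS_PER_DAY = 24 * 60 * 60


def average_rate(messages_per_day: int) -> float:
    """Messages per second if arrivals were spread evenly over the day."""
    return messages_per_day / SECONDS_PER_DAY


def burst_target(messages_per_day: int, burst_factor: float) -> float:
    """Sustained file creates per second the storage should absorb."""
    return average_rate(messages_per_day) * burst_factor


if __name__ == "__main__":
    per_day = 1_000_000
    print(f"average: {average_rate(per_day):.1f} msgs/s")
    for factor in (3, 4):
        print(f"{factor}x burst: {burst_target(per_day, factor):.1f} msgs/s")
```

Each message is at least one small file create plus the associated metadata writes, which is why the burst figure rather than the average is the number to size the disks against.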