I am setting up a proof-of-concept backup server at my office. The end idea is for a dozen or so of our ~200 workstations to dump images (like PowerQuest DeployCenter, not JPEG) to a 2Tb RAID5 at reasonable speeds. The testbed (whose specs are listed below) is, I admit, grossly lacking in RAM. I still think it should handle at least two or three systems at once without choking. But on to that....

Most of the machines we've used as test clients can dump to the system at very high speeds without a problem. CPU utilization hits ~10%, and the load average stays between 0.1 and 0.3 (generally). RAM never seems to move much; there is always about 3mb free with little or no swap used.

When two systems hit it, CPU doesn't change much, but the load average starts climbing. It /may/ stabilize near 0.8. Three almost always push it over 1.0. Once that happens, it keeps climbing until the clients time out and abort. By then, the load average has hit 6-14 (depending on how many machines were transferring at once).

Some systems can (individually) cause this overload. All the clients are using 100Mbps connections (some switched, some on hubs). No real pattern has emerged. Image files are broken up every 130mb right now, and no verification is done until the whole image is written.

Transfers *from* the testbed use almost no CPU time, and the load average never gets much over 0.2, even with multiple clients.

My only theory so far is that Samba is filling up the write cache/buffers faster than they can be emptied to the HD. On that premise, I've tried to speed up the filesystem and slow down Samba a bit (for instance, by taking "SO_RCVBUF=8192 SO_SNDBUF=8192" out of the "socket options"). That does not seem to have helped noticeably.

I've managed to snag a tcpdump of one particular client box running and then aborting, if that would help. It's a 9mb tgz that I can make available on request.

In contrast: The Microsofties running the department are pushing for an NT5 "server", so we're comparing this testbed to a little 500Mhz recycled workstation (also with 128mb of RAM). On the Linux testbed, all five clients aborted after the connections stalled. The same five clients kept going when talking to the NT box (albeit at a slightly slower speed).

While management doesn't much care about the outcome of this test (they'd want the NT box if it took two days to write ten bytes), I want the best system to win. I just can't believe that a Linux/Samba system is unable to out-perform a runty Windows box.

One nagging question is what would the "real" server's performance be? We have spec'd dual Athlon MP 2200+ CPUs, a 3ware 7506-12 controller with 12 200gb Western Digital drives, and 4gb of RAM. (Whole thing is $6,000!!) Thing is, I don't think the RAID would be much faster (writing) than the existing IDE drive. I'd hate to blow six grand and find out it doesn't perform any better.

Has anyone dealt with a similar problem before? Did I overlook some obvious "DontChokeAtHighSpeeds" option?

System specs:
  Linux 2.4.22 (custom)
  Slackware 9.1
  Samba 3.0.1
  2.2Ghz Intel Celeron
  60gb Maxtor 6Y060L0 on UltraATA/133
  128mb RAM, 256mb swap
    # Will try to add RAM next week
  On-board Intel Pro/1000 (Gigabit) NIC
  All partitions are on LVM except swap

  Path     Type   Size   Free
  /        ext3   6g     5.1g
  /files   ext2   30g    9g
  <Others left off, not relevant>

/etc/samba/smb.conf: (selected highlights)

  security = domain
  socket options = TCP_NODELAY
  max xmit = 8192

  [Backups]
  comment = Backup test storage
  path = /files/Backups
  valid users = @OurDomain+Dept_ComputerRes OurDomain+BackMeUp
  public = yes
  writable = yes
  printable = no
  create mask = 0775
  guest ok = yes

Daniel Johnson
Progman2000@usa.net
http://progman2000.net

> I am setting up a proof-of-concept backup server at my
> office. The end idea is for a dozen or so of our ~200
> workstations to dump images (like PowerQuest
> DeployCenter, not JPEG) to a 2Tb RAID5 at reasonable
> speeds.

Your backup program is a bit less general than the tar that I use, but perhaps you can draw an analogy from my comments below and apply it to your case. Basically, I think your problem is that continuous writing to an SMB share is rather fragile. If your backup program can write its data to stdout, you might pipe it through rsh or rexec into buffering software on the Linux side. See my comments below.

> One nagging question is what would the "real" server's
> performance be? We have spec'd dual Athlon MP 2200+
> CPUs, a 3ware 7506-12 controller with 12 200gb
> Western Digital drives, and 4gb of RAM. (Whole thing
> is $6,000!!) Thing is, I don't think the RAID would
> be much faster (writing) than the existing IDE drive.
> I'd hate to blow six grand and find out it doesn't
> perform any better.

I can only speculate about what a "real" server would do, but I've been doing something like that for a long time with a similar workstation (SuSE 8.2, P4/3G, 2GB RAM, 480 GB 4-way IDE stripe) and never bothered to look at load numbers because it works so smoothly. 25 admin shares are backed up simultaneously every workday without affecting the interactivity of remote sessions. The built-in Gbit NIC takes all 100 Mbps (about 12.5 MB/s) that the switch passes on to it, plus about 20 MB/s from a Samba PDC via a Gbit link, so the aggregate maximum speed is about 32 MB/s. Never any aborts.

The trick is probably the little buffering filter (xt) between the backup tool and the disk. It is more efficient for two reasons: the reading part accepts incoming data without delay, and the writing part only writes data to disk once a high mark is reached, so when it starts writing it flushes the data in one big chunk, which reduces fragmentation.
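
The high-mark behaviour described above can be sketched roughly as follows. This is not the xt source, which is not shown in this thread; it is only a minimal Python illustration of the same idea. The function name buffered_copy, its parameters, and the choice to start flushing when the buffer is half full are all assumptions made for the example.

```python
import threading
from collections import deque

CHUNK = 64 * 1024  # 64 KB chunks, the chunk size mentioned in the thread


def buffered_copy(src, dst, num_chunks=512, high_mark=None, chunk_size=CHUNK):
    """Copy src to dst through a bounded in-memory buffer.

    A reader thread queues input as soon as it arrives. The writer stays
    idle until `high_mark` chunks are queued (or input ends), then writes
    everything queued so far in one burst. Returns the bytes written.
    """
    if high_mark is None:
        high_mark = max(1, num_chunks // 2)  # assumption: flush at half full
    high_mark = min(high_mark, num_chunks)   # a higher mark could never be reached

    queue = deque()
    cond = threading.Condition()
    state = {"eof": False, "error": None}

    def reader():
        try:
            while True:
                data = src.read(chunk_size)
                if not data:  # end of input
                    return
                with cond:
                    while len(queue) >= num_chunks:  # buffer full: wait for a flush
                        cond.wait()
                    queue.append(data)
                    cond.notify_all()
        except Exception as exc:  # hand read errors over to the caller
            state["error"] = exc
        finally:
            with cond:
                state["eof"] = True
                cond.notify_all()

    thread = threading.Thread(target=reader, daemon=True)
    thread.start()

    total = 0
    while True:
        with cond:
            # Sleep until the high mark is reached or the input is finished.
            while len(queue) < high_mark and not state["eof"]:
                cond.wait()
            if not queue and state["eof"]:
                break
            burst = b"".join(queue)
            queue.clear()
            cond.notify_all()  # wake the reader if it was blocked on a full buffer
        dst.write(burst)       # one large write instead of many small ones
        total += len(burst)

    thread.join()
    if state["error"] is not None:
        raise state["error"]
    return total
```

Used as a pipe filter, this would be called as buffered_copy(sys.stdin.buffer, sys.stdout.buffer). With 512 chunks of 64 KB the buffer can hold up to 32 MB, and the join at flush time briefly needs a second copy of whatever is being written, which a shared-memory ring buffer would avoid.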

The downside is that I'm using 32 MB of RAM per backup session, so you need more memory. The buffer size can be set to a number of 64 KB chunks between 10 and (SHMMAX/64KB - 3). 512 (512 x 64 KB = 32 MB) works fine for me, but less would probably work decently too.

I use tar as the backup tool. All shares are smbmount'd under /mnt, so backing up the data is basically

  for share in $(</etc/bkp-shares)
  do
    # one background pipeline per share: tar -> xt buffer -> disk
    ( cd "/mnt/$share" && tar cbf 64 - . | xt -n512 > "/tars/$share" ) &
  done

Well, there's a little more for logging (2>/logs/$share) and incremental backups (find . -mtime -o -ctime | tar -T -...), but I didn't want to clutter the simple example.

The filter xt has optional arguments -i infile, -o outfile, -s KBchunk, -n numchunks, -t sleeptime. The defaults are stdin, stdout, 64 KB, 10, 1.

I also use it to transfer backups to tape. It can read from the stripe at about 130 MB/s and the tape can accept about 80 MB/s if no other I/O takes place, but combining the two reduces the speed to about 35 MB/s, so that on average only about 50 MB/s are obtained. A "real" server not limited to 32-bit/33MHz PCI could probably do a little better.

> System specs:
> Linux 2.4.22 (custom)
> Slackware 9.1
> Samba 3.0.1
> 2.2Ghz Intel Celeron
> 60gb Maxtor 6Y060L0 on UltraATA/133
> 128mb RAM, 256mb swap
> # Will try to add RAM next week
> On-board Intel Pro/1000 (Gigabit) NIC