Before you read this, I want to state (for reasons listed below) that I don't expect an answer (advice is welcomed, but please read this email carefully before answering). I'm sharing this with the community with the hope that better software results from our sad experience...

BACKGROUND

I've been using NT for 4 years, Netware and Linux for 3 years, and Samba for almost 2. I work in the IT department of a medium-sized unit of a global advertising company. We have a Netware and NT environment with a bit of Linux.

We installed a 280GB IDE Samba archive server (rare usage) and a 15GB SCSI Mac/Samba file server (medium usage). We also use Samba for more menial tasks like smbmounts and file transfers. We thought we were comfortable with Samba. We knew we were comfortable with other types of file servers.

OUR SETUP

Going from my tired memory:
Athlon MP 1.8GHz (mem=nopentium)
2GB ECC SDRAM
Tyan S2460 (I think?)
Antec 450W PS
Lots of cooling
5 IBM DeskStar 120GB drives with 8MB caches in RAID 5
3ware 7580 (I think?) 8-port hardware RAID
3ware hot-swappable drive cages
Intel e1000 Gigabit NIC, full duplex, 1000MBit, autonegotiation off
3com Gigabit switch, autonegotiation off
RedHat 7.3
Kernel 2.4.19 with ACL support
ext3 with ACL support
Samba 2.2.5 with ACL support installed from a recompiled SRPM from the samba.org FTP site.
Winbind
NO nfs daemon (I hear it's buggy w/ ACLs)

We have a variety of clients, from DOS and OS/2 to Windows (9x-2000) and Linux. The server acts as a print spooling area (the actual queues are on an NT server) and scratch area for database programmers to manipulate their flat database files. As far as I know, these files are not commonly accessed by more than one user at a time.

THE PROBLEM

For the past year, our heaviest-used Netware server has been under more and more stress.. filling up, running out of licenses, slowing down, etc.
Preliminary tests using Samba on a fast Linux box showed anywhere from 70% to 1000% speed improvements, depending on the task. The decision was made to switch it to Linux; the whole company is migrating away from Netware and we (as a unit, not speaking for the company) don't want to be completely trapped into Windows if we can help it.

The new hardware arrived and more preliminary tests indicated all looked good. We were set to switch last Saturday night. We turned off logins to the Netware box, backed it up, restored it to the new Linux box, set permissions, then made sure the various computers in the building could log in.

Yesterday, our first day, was rough. For most of the day we fought random slow browsing with no explanation. Clients would appear to lock up for several seconds. We found some misconfigurations in smb.conf but the problems reappeared. No errors were seen in any machines' logs on debug level 2. I trimmed the smb.conf to a minimal number of options and that seemed to help with the slowness. Today, however, the problem reappeared a few times with no errors in the logs that we could see.

The printers were missing some of the records sent to them to print, something that had never happened with Netware. Every time the missing records were different. Occasionally, it would work right. Oplocks (kernel, level I and II) were left to defaults (turned on).

THE OUTCOME

Sadly, tonight we are installing a Windows NT server. Installing a brand new server is actually cheaper for us than the 8 or so hours of downtime to back up the server, install NT on it, and restore the data to it. We don't want to revert to Netware because so many clients have been reconfigured to log on only to the domain (DOS, OS/2, etc.) and that would require many more hours reversing those changes. Also, some files have been added since leaving Netware. We also decided to proceed to use NT because it is more proven in this capacity.

CONCLUSION

To be fair, the problems could be related to some misconfiguration. I have pasted the smb.conf below.

I fear it might just be an oplock problem, but it is not clear what would result if more than one user happened to try to write to a file with them disabled. All the advice we found said to leave them on to prevent corruption and to improve performance. We ran out of time to test it, and feared what failure would bring. Running this:
    grep -r -B5 -A5 oplock /var/log/samba/ | grep -B5 -A5 error
produced only 5 of these errors
    oplock_break: receive_smb error (Connection reset by peer)
from the same DOS machine from 2 days' worth of all machines' logs running at debuglevel 1 (some at level 2). I don't know if that is a good indicator of an oplock problem. I can do other greps on request.

Unfortunately, we can't test out your suggestions in production, and our off-production testing apparently can't stress it well enough. So please just take this email as input - I'm not looking for answers here, though advice is appreciated.

The problem could also have been environment or hardware. We should know soon, as we are going to reinstall the original Samba server with NT, and the problems should reappear if the cause is hardware or environment. If we do find that to be true, I will certainly reveal our findings to this mailing list.

And perhaps the problem was with ACLs. We couldn't turn them off in production to test that theory.

It is likely that we will try Samba in this capacity again in the future with a more mature version.

Thanks for listening,
/dev/idal


[global]
    server string
    workgroup = <our domain>
    password server = <our PDC>
    security = domain
    encrypt passwords = yes
    smb passwd file = /etc/samba/smbpasswd
    veto files = /lost+found/
    winbind uid = 10000-20000
    winbind gid = 10000-20000
    winbind separator = +
    create mask = 660
    force create mode = 660
    directory mask = 0770
    force directory mode = 0770
    log file = /var/log/samba/%m.log
    debuglevel = 2

[print]
    path = /share/print
    writeable = yes

__________________________________________________
Do you Yahoo!?
Y! Web Hosting - Let the expert host your web site
http://webhosting.yahoo.com/
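
The grep pipeline in the message above can also be approximated with a per-file count, which makes it easier to see whether the errors really come from a single client. What follows is only a sketch in Python, not something used in the thread: the function names are invented for illustration, and it assumes per-machine log files (the %m.log setting) under /var/log/samba. It counts only lines that mention an oplock and an error on the same line, so it is stricter than the original context grep.

```python
import re
from collections import Counter
from pathlib import Path

# Matches log lines that mention an oplock and an error on the same line,
# e.g. "oplock_break: receive_smb error (Connection reset by peer)".
OPLOCK_ERROR = re.compile(r"oplock.*error", re.IGNORECASE)


def count_oplock_errors(lines):
    """Return how many of the given log lines mention an oplock error."""
    return sum(1 for line in lines if OPLOCK_ERROR.search(line))


def scan_log_dir(log_dir):
    """Map log file names under log_dir to their oplock-error counts.

    Files without any matching line are left out of the result.
    """
    counts = Counter()
    for path in Path(log_dir).rglob("*"):
        if path.is_file():
            # errors="replace" keeps the scan going if a log has odd bytes.
            with path.open(errors="replace") as handle:
                hits = count_oplock_errors(handle)
            if hits:
                counts[path.name] += hits
    return counts
```

Calling scan_log_dir("/var/log/samba").most_common() would then list the noisiest client logs first. A handful of hits from one machine, as reported above, points at that client or its network path rather than at every session.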

err are you asking for help, or just wasting our time?
sounds like you have a big job ahead of you tonight setting up that NT machine
better get that out of the way before telling us your life story like that....

----- Original Message -----
From: "Chris de Vidal" <cdevidal@yahoo.com>
To: <samba@lists.samba.org>
Cc: <samba-technical@lists.samba.org>
Sent: Wednesday, October 23, 2002 4:13 PM
Subject: [Samba] How Samba let us down

> Before you read this, I want to state (for reasons
> listed below) that I don't expect an answer (advice is
> welcomed, but please read this email carefully before
> answering). I'm sharing this with the community with
> the hope that better software results from our sad
> experience...

--- tim smith <tims@cqpl.com.au> wrote:
> err are you asking for help, or just wasting our
> time?

Read the first paragraph of my email, please.

/dev/idal

--- Chris de Vidal <cdevidal@yahoo.com> wrote:
> --- tim smith <tims@cqpl.com.au> wrote:
> > err are you asking for help, or just wasting our
> > time?
>
> Read the first paragraph of my email, please.

It said:
"Before you read this, I want to state (for reasons listed below) that I don't expect an answer (advice is welcomed, but please read this email carefully before answering). I'm sharing this with the community with the hope that better software results from our sad experience..."

I am pro-Samba and am trying to help by sharing a potential problem. Please read the email more carefully before responding next time.

/dev/idal

--- Bartlomiej Solarz-Niesluchowski <B.Solarz-Niesluchowski@wsisiz.edu.pl> wrote:
> At 08:13 2002-10-23, you wrote:
>
> >The printers were missing some of the records sent to
> >them to print, something that had never happened with
> >Netware. Every time the missing records were
> >different. Occasionally, it would work right.
> >Oplocks (kernel, level I and II) were left to defaults
> >(turned on).
>
> This is known bug - on my setup (600 clients/6500
> users/2TB hdd space) all print jobs are printed without
> problems but when you look on print manager it can talk
> that it is printer error on hangup client machine.
>
> As I read you must wait to samba 2.2.6 (it will be in
> some days - currently it is samba 2.2.6 rc4) where this
> bug will be corrected (bug = weird "work" of print
> manager).

The actual queues are on an NT server. This server merely acts as a large spool area. Are you using Samba as the spool area only or using Samba printing support?

Our printouts are not fine (corrupt), and we are not using a Windows print manager but a DOS BARR machine.

We look forward to using Samba again at a later version; this might indeed be a bug that gets fixed then.

/dev/idal

Thank you for responding. You win a gold star for actually reading my email and not jumping to conclusions (-:

--- Tristan Ball <tb@vsl.com.au> wrote:
> I think the 7580 might be a mistake. The card has
> only 2meg of cache (read: f*ck all).

The amount of RAM is not an apples-to-apples comparison. The RAM isn't SDRAM like on other hardware RAID cards but SRAM... no latency, and the controller uses a non-blocking switched fabric.* I'm too tired to remember what that means, but we saw that the 3ware cards did about as well as other RAID cards with much more RAM. I don't recall, however, looking over RAID 5 performance (regarding your next reply), which could have been our mistake. Still, the primary problem is corruption, not performance, but they could be related.

* Something about that here:
http://www.matrixlist.com/pipermail/pc_support/2002-July/001737.html

> Raid 5 writes are _slow_, with 4 physical IO's
> required for every 1 io from the OS or client. That's
> why they try to buffer them up and write full stripes
> at a time, or to keep parity blocks cached in ram. That
> means if your clients are sending lots of small-ish
> random writes (I bet yours would be if they are DB
> developers), that 5 disk array will probably sustain no
> more than about 100-200 writes a sec. Only a scratch
> more than a single disk.

You could be right here. The author in the link above indicated that it might be a problem with small RAID 5 random read/writes. Know how to see I/Os/sec on Linux, by chance? Bonnie++? I'm still learning about Linux through experience, reading, and asking questions (:

The biggest problem is not performance but corruption, but they could be related.

Anyway, if the problem is the card, we should see the same problem when we put NT on the server. I'll let you and the list know.

> You also didn't mention what the CPU utilisation
> looked like, particularly as a
> user/system/io breakdown.
> :-)

Load averages < 0.50 most of the time, free memory - caches + buffers is around 350MB out of 2GB. sysctl.conf has been tuned for a large file server setup using recommendations from "Securing And Optimizing Linux 2.0" (only in hardback from OpenNA.com). I can post a copy of the file if you'd like.

I'm still learning about Linux.. how would a user/system/io breakdown be done? Some flag of ps?

> Actually, while they can improve performance, they
> are an inherently less reliable option than no-oplocks.
> Even on pure MS networks there are special cases where
> they can cause trouble. (it does require other things
> to go wrong to trigger them tho).

So.. it is safe to turn them off?

> I generally find level 3 debugs are the lowest level
> useful for tracing, but that enabling them for all
> processes will massively affect performance -
> particularly if your logs go to that raid-5 volume
> :-)

Separate drive for the OS + logs. I'd heard level 3 was too slow so I didn't go that high. I'll take it up that high on a client basis using your next advice.

> I generally selectively enable logs using smbcontrol
> for particular clients, and use a level of 3-5.

We couldn't determine how to set the debug level individually. Thanks!

> > veto files = /lost+found/
>
> This will slow performance.

Our problem wasn't performance but corruption, but they could be related. I'll take this option out as it doesn't matter if the user sees those directories. Thanks for catching that.

> > debuglevel = 2
>
> Again, this will affect performance.

It was on 1 most of the time but on 2 when I copied it to the list.

> Sorry you had such a rough time of it tho..

Thank you very much! There is still a chance we will use Samba again for this, and I'll take your advice with me when we do.

By chance, do you use ACLs?

/dev/idal
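
Tristan's estimate quoted above (4 physical I/Os for every small write, so roughly 100-200 writes a second from a 5-disk array) can be reproduced with simple arithmetic. The following is a minimal Python sketch of that reasoning. The per-disk figures of 80 to 150 random I/Os per second are an assumption chosen for illustration, not measurements from this thread, and the function name is invented.

```python
def raid5_random_write_iops(disks, iops_per_disk, write_penalty=4):
    """Estimate sustained small random writes per second for a RAID 5 array.

    Each small write costs `write_penalty` physical I/Os (read old data,
    read old parity, write new data, write new parity), and that work is
    spread across all member disks.
    """
    return disks * iops_per_disk / write_penalty


# Assumed (not measured) random I/O rates for a single IDE drive.
for per_disk in (80, 100, 150):
    estimate = raid5_random_write_iops(disks=5, iops_per_disk=per_disk)
    print(f"{per_disk} IOPS per disk -> about {estimate:.0f} writes/s for the array")
```

Under these assumptions the array lands in the same range as a single drive or slightly above it, which matches the "only a scratch more than a single disk" remark. Full-stripe writes and parity caching, which the controller may do, are deliberately left out of this estimate.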

--- Bartlomiej Solarz-Niesluchowski
> >The actual queues are on an NT server. This server
> >merely acts as a large spool area. Are you using
> >Samba as the spool area only or using Samba printing
> >support?
>
> I use only samba printing support (all printers are
> net printers HP4000N/4050N/4100N)

Yours might be different than ours. Our Samba server has no connections to these printers at all; they are just being used as hard drive storage.

> >Our printouts are not fine (corrupt), and we are not
> >using a Windows print manager but a DOS BARR machine.
>
> This looks like cr-LF problem - be sure that is NO
> conversions on unix side for your printouts.....

No conversion. Nothing on Linux is opening it. It is being written from Windows to the spool area like a large hard drive and being read off of the spool area by another client. We're not using any of Samba's print operations.

And sometimes it works fine.

Thanks for responding!

/dev/idal

The new NT server has a bad HD, so we have a reprieve temporarily and perhaps we can still work this problem out and still use Samba (:

--- Mathew McKernan <mathewmckernan@optushome.com.au> wrote:
> By the look of it, the reason why it is so slow is
> the fact that you may not be running a WINS Server. We
> had this problem with NT boxes, yes Windows Servers. We
> installed a Windows NT Server to be our WINS server, it
> increased the speed of the LAN dramatically. We now run
> the WINS Server on a Linux box running Samba.

While this is a great way to increase speed,
A. It's plenty fast on the NT, Netware, and other Samba servers. In fact, the slowness appears to be totally isolated to the new Samba server.
B. The slow browsing is on the hard drive once connected to the server, not cruising network neighborhood where WINS would be most effective.
C. Our primary problem is data corruption, not performance, though they could be related.

The random slowness might actually be our RAID setup or perhaps even oplocks. Installing NT ought to show if we have a RAID problem.

The corruption might be related to oplocks. I'm doing research. Is it safe to disable kernel, regular, and level2 oplocks if we're not doing any linux-side read/writes?

> We have a home drive server which serves about 1800
> users with 400 logged on at one time drawing about
> 30MBps out of it server. This box is a Pentium 4,
> 512MB RAM. 400GB RAID server running Linux and Samba.

What card and type of drives?

> My suggestion:
> Install a WINS Server (simple 400MHz box even)
> running Linux, and if you like run an internal DNS too
> which is synchronised to the WINS database using the
> "wins hook" option in smb.conf. Point all your
> devices' WINS addresses to this new WINS server. You
> will notice a dramatic improvement in performance.

I did try WINS in testing; I made one of the Samba servers a WINS server and pointed my workstation to it.
I didn't see other addresses caching in the Samba WINS database and often I would see "WINS server appears to be down" when using smbclient. However, no other machines were using the WINS server, and the WINS server was not local subnet browse master, so that might have stopped me.

Have you seen better documents on implementing Samba WINS than what is on samba.org or in /usr/share/doc?

/dev/idal

Chris:

First of all let me say that your implementation of Samba is orders of magnitude above mine, however I have seen some problems similar to yours.

I second the advice about the WINS server. When I have not had this set up properly (or nmbd has died for some reason) I see all kinds of odd browsing problems including slow browsing, things coming in and out of the browse list with no explanation, etc.

In addition I thought I'd add that when I added a VPN link to my network to tie in other off-site offices, this noticeably slowed down the browsing for local clients. This is because they are hitting that VPN link (because they have a drive mapped there) and thus it takes time to travel out and back over the link.

Granted none of this addresses your file corruption issues, but I thought it might help for the future. Browsing is generally what generates all my questions from users :)

Good Luck - and don't give up!

James

James W. Beauchamp, P.E.
2121 Newmarket Pkwy. Suite 140
Marietta, GA 30067
phone - 770-690-9552 ext. 227
fax - 770-690-9529
www.gesinc.com

> -----Original Message-----
> From: Bradley W. Langhorst [mailto:brad@langhorst.com]

> i use acls - people like them..
> i wouldn't think there'd be a particular performance hit with them
> though...

I use ACLs. They work fine for me, but then again I've only got about 30 clients.

> another thing to consider - what is your filesystem on those
> machines...
>
> i've had bad luck with reiserfs (a while ago though)
> and nothing but success with xfs
> no experience with ext3
> (you'd be nuts to use ext2 on such a large filesystem)

Heh. Tell me about it...we have a 130 gigabyte ext2 filesystem. I'm going to switch it to ext3 soon. The extra overhead is worth not having 10 minute long fscks during reboots. ;)

I think everyone else has suggested that you upgrade to 2.2.6. I too would recommend this. The company where I work had a Win2K box whose print jobs would get dropped depending on how the printing was set up. Upgrading fixed the problem. I sent an e-mail about it about a week ago. If you've already got the drivers installed on the workstations you might want to go ahead and insert the following into the printer section.

    use client driver = Yes

I believe that the mem=nopentium option is not necessary with the newer kernels.

As far as browsing goes, you probably do want to get WINS set up and make sure that DNS is configured correctly. I noticed that some of the browsing slowness issues went away when I moved from 2.2.1a to 2.2.5. I don't know what browsing looks like after the upgrade to 2.2.6. I only work via ssh, since I'm away at school. I can't actually be there to know what it feels like.

Just my little bit of advice.

James Hubbard

Here at Tricord, we run Samba through some pretty intense tests, as well. Since we are a file system producer, we focus on corruption bugs. We haven't found any in Samba, other than a rather famous Microsoft Word bug that also occurs on Windows servers. I'm not trying to chime in here, but if there was the kind of bug someone would notice within the first few hours of use, we'd have hit it hundreds of times already, just in our testing this week. We've been testing like this for more than two years.

-----Original Message-----
From: John H Terpstra [mailto:jht@samba.org]
Sent: Wednesday, October 23, 2002 1:04 PM
To: Jay Ts
Cc: jra@dp.samba.org; chris@devidal.tv; Mathew McKernan; samba@lists.samba.org; samba-technical@lists.samba.org
Subject: Re: [Samba] Re: How Samba let us down

Jay,

For the record, I thoroughly test samba pre-releases before we ever ship. To the best of my knowledge, NOT ONE version of samba we have released ever CAUSED (or resulted in) file/data corruption. If I sound defensive - that is exactly correct because file corruption is a DEATH issue!

Please note: This does NOT include smbfs, which is not officially part of Samba. I can make NO assertions regarding the integrity of smbfs as I regard this as most undesirable technology. I do NOT test smbfs at all.

Every reported case of file corruption I have looked at has been due to:
1. Bad or defective or low grade ethernet cards
2. Defective HUBs / Ether-Switches
3. Defective Hardware on the Server
4. Incorrect Protocol Stack configuration on the MS Windows client

FWIW: My current testbed consists of:
Tyan 2460 motherboard, 2 X MP1600+ CPUs
1 GB DDR2100 RAM
1 Gigabit Intel Ethernet
2 x Intel EEpro100
1 x 3Ware 7540 IDE RAID - 3 WD 60 GB IDE HDD
2 x IBM 40GB IDE drives (native to system)
Caldera OpenLinux 3.1.1 with 2.4.18 kernel with ACL patch applied.

Test load on system with up to 60 sessions doing full load work. Peak IDE I/O bandwidth is 452 MBytes/sec. Peak network I/O is 117 MBytes/sec.
Samba peak I/O depends on nature of operations.

In other words, I beat the living daisies out of samba during test. Tests done with Samba with Win9X, WinME, Win2K (Pro + Adv Server), WinXPPro. I can vouch for the fact that not one file corruption problem has been detected during the 2.2.x series, nor on any prior series.

Cheers,
John T.

On Wed, 23 Oct 2002, Jay Ts wrote:
> Jeremy Allison (jra@dp.samba.org) wrote:
> > Jay Ts wrote:
> > >
> > > > The corruption might be related to oplocks. I'm doing
>
> Just to keep myself out of more trouble today, I'd like
> to point out that I didn't write the above. ;-)
>
> > File corruption is treated as a drop everything - priority
> > 1 bug in Samba. If this were a generic problem known with
> > 2.2.6 we'd be issuing a patch *immediately*.
>
> I'm really lost at this point (too many replies to too many
> threads while having "one of those days"), but I think I/we
> suggested he _upgrade_ to 2.2.6, if he isn't already running
> a pretty recent release.
>
> I've seen problems in the early 2.2.x releases (when transferring
> large files) that could be perceived as (or called) "file corruption",
> but the problem went away sometime before 2.2.4.
>
> Jay Ts
>

--
John H Terpstra
Email: jht@samba.org
We regularly do large file Copy-Paste tests with files between 30G and 60G. We have yet to see a problem.

Tricord's market is Network Attached Storage, and our product is a file system. Samba is the main interface between our market and our file system. We spend a lot of time making sure the data that goes through Samba and into our file system comes back out in the same shape it went in. We have a whole department devoted to that purpose. Trust me, we'd notice if there was a problem. :)

-----Original Message-----
From: Jay Ts [mailto:jay@jayts.cx]
Sent: Wednesday, October 23, 2002 2:25 PM
To: Esh, Andrew
Cc: 'John H Terpstra'; jra@dp.samba.org; chris@devidal.tv; Mathew McKernan; samba@lists.samba.org; samba-technical@lists.samba.org
Subject: Re: [Samba] Re: How Samba let us down

Esh, Andrew wrote:
> Here at Tricord, we run Samba through some pretty intense tests, as well.
> Since we are a file system producer, we focus on corruption bugs. We haven't
> found any in Samba,

Since I've been curious about this anyway, I might go ahead and check:

Do you (and J. Terpstra, and others) test sending really huge files, such as 700 MB ISO CD-ROM images or bigger, across the net, and then run a cmp on them, going in each direction? That is:

1. Start with a reference huge file on Windows, known to match an existing file on Unix, then copy it over through Samba and do a cmp on them? About how many times is this done in your tests? More than a few hundred?

2. Also run the test the other way, and compare the copies on the Windows side?

> I'm not trying to chime in here, but if there was
> the kind of bug someone would notice within the first few hours of use, we'd
> have hit it hundreds of times already, just in our testing this week. We've
> been testing like this for more than two years.

That's what I thought, too. My take on it was that it was related to Samba, but not necessarily caused by it. In any case, it was all more than a year ago, and isn't an issue to me at all.
Jay Ts
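
For readers who want to try the round-trip check described above, here is a minimal sketch of a cmp-style comparison in Python. It is not the test harness used by anyone in this thread. The function name `first_difference` and the chunk size are assumptions made for illustration, and it expects two regular files (for example, the reference file and the copy that came back through a Samba share).

```python
CHUNK = 1024 * 1024  # read 1 MiB at a time so huge files never sit in memory


def first_difference(path_a, path_b, chunk_size=CHUNK):
    """Return the offset of the first differing byte, or None if identical.

    Similar in spirit to cmp: if one file is a strict prefix of the other,
    the difference is reported at the offset where the shorter file ends.
    Hypothetical helper, written for regular files only.
    """
    offset = 0
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a = fa.read(chunk_size)
            b = fb.read(chunk_size)
            if a != b:
                # Locate the exact byte inside the mismatching chunk.
                for i, (x, y) in enumerate(zip(a, b)):
                    if x != y:
                        return offset + i
                # No differing byte in the overlap: one file ended early.
                return offset + min(len(a), len(b))
            if not a:
                # Both reads hit end-of-file together: files are identical.
                return None
            offset += len(a)
```

Running it in both directions (copy to the server and compare, then copy back and compare again) and repeating the loop a few hundred times would approximate the test asked about in the message.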
I tend to focus on absurdities. They lead to interesting results.

-----Original Message-----
From: Philip Burrow [mailto:phil.burrow@blueyonder.co.uk]
Sent: Wednesday, October 23, 2002 5:22 PM
Cc: samba@lists.samba.org; samba-technical@lists.samba.org
Subject: Re: [Samba] Re: How Samba let us down

----- Original Message -----
From: Most of you
Cc: <samba@lists.samba.org>; <samba-technical@lists.samba.org>
Sent: Wednesday, October 23, 2002 10:14 PM
Subject: Re: [Samba] Re: How Samba let us down

> etc etc

Well this one certainly roused you all.

Must it be the case that you all jump in to reply to this unhelpful garbage yet when someone posts a 'simple' query they often don't get any replies.

Phil.
> Philip Burrow wrote:
>
> ----- Original Message -----
> From: Most of you
> Cc: <samba@lists.samba.org>; <samba-technical@lists.samba.org>
> Sent: Wednesday, October 23, 2002 10:14 PM
> Subject: Re: [Samba] Re: How Samba let us down
>
> > etc etc
>
> Well this one certainly roused you all.
>
> Must it be the case that you all jump in to reply to this
> unhelpful garbage yet when someone posts a 'simple' query
> they often don't get any replies.

The main reason 'simple' queries don't get replies is that the answers can be found in so many places. These resources should be utilised before posting queries. In the time that it takes to compose the 'simple' query, the answer could have been found, complete with circles and arrows and a paragraph on the back 'a each one....

Given a manual and a telephone, many, many people would pick up the phone and call tech support before cracking the manual open. It's the same with tech lists, I guess.

RTFM
RTFAQ
Google
RTListArchives
RTOnLineDocs(SWAT)
ThinkAboutItAWhile
Repeat if necessary
Then Post...

Jim
Somebody please drag this dead horse off the server.....

PV

-----Original Message-----
From: Van Sickler, Jim [mailto:vansickj-eodc@Kaman.com]
Sent: Thursday, October 24, 2002 1:10 PM
To: Samba-L (E-mail)
Subject: RE: [Samba] Re: How Samba let us down

> Philip Burrow wrote:
>
> ----- Original Message -----
> From: Most of you
> Cc: <samba@lists.samba.org>; <samba-technical@lists.samba.org>
> Sent: Wednesday, October 23, 2002 10:14 PM
> Subject: Re: [Samba] Re: How Samba let us down
>
> > etc etc
>
> Well this one certainly roused you all.
>
> Must it be the case that you all jump in to reply to this unhelpful
> garbage yet when someone posts a 'simple' query they often don't get
> any replies.

The main reason 'simple' queries don't get replies is that the answers can be found in so many places. These resources should be utilised before posting queries. In the time that it takes to compose the 'simple' query, the answer could have been found, complete with circles and arrows and a paragraph on the back 'a each one....

Given a manual and a telephone, many, many people would pick up the phone and call tech support before cracking the manual open. It's the same with tech lists, I guess.

RTFM
RTFAQ
Google
RTListArchives
RTOnLineDocs(SWAT)
ThinkAboutItAWhile
Repeat if necessary
Then Post...

Jim

--
To unsubscribe from this list go to the following URL and read the instructions: http://lists.samba.org/mailman/listinfo/samba
Reading through Jeremy's eagerly awaited discourse on oplocks/share modes/locking, I read this bit:

> ... if you need simultaneous
> file access from a Windows and UNIX client you *must* have an
> application that is written to lock records correctly on both
> sides. Few applications are written like this, and even fewer
> are cross platform (UNIX and Windows) so in practice this isn't
> much of a problem.

but my brain kept stumbling over "isn't much of a problem" (;-) .... surely that should say "isn't much of a solution" ?

I only mention it in the interests of honing the discourse as it heads towards the docs.

Cheers
Nick Boyce
EDS Southwest Solution Centre, Bristol, UK
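
As an illustration of what "lock records correctly" might mean on the UNIX side, here is a rough sketch using POSIX advisory byte-range locks through Python's `fcntl.lockf`. It is not taken from Jeremy's text. The fixed record length `RECORD_SIZE` and the helper `update_record` are hypothetical, the code is UNIX-only, and advisory locks only protect against other programs that take the same locks.

```python
import fcntl
import os

RECORD_SIZE = 128  # hypothetical fixed record length, for illustration only


def update_record(path, index, payload):
    """Overwrite one fixed-size record while holding an exclusive lock on it."""
    if len(payload) != RECORD_SIZE:
        raise ValueError("payload must be exactly RECORD_SIZE bytes")
    start = index * RECORD_SIZE
    fd = os.open(path, os.O_RDWR)
    try:
        # Block until no other process holds a conflicting lock on this range.
        fcntl.lockf(fd, fcntl.LOCK_EX, RECORD_SIZE, start, os.SEEK_SET)
        try:
            os.pwrite(fd, payload, start)
        finally:
            # Release only the byte range that was locked above.
            fcntl.lockf(fd, fcntl.LOCK_UN, RECORD_SIZE, start, os.SEEK_SET)
    finally:
        os.close(fd)
```

Locking only the record being changed, rather than the whole file, is what lets several writers work on different records of the same flat database file at once.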