Hi, all:
   I want to check small files' properties (such as date, path, and so on)
frequently. The files are stored on a network drive and their sizes
vary from 2KB to 5KB.

   I found that Windows 2K outperforms Linux/Samba by a wide margin when I
compared the benchmark results. I am confused by this. Can anyone
explain it?
The computers' configurations are as follows:
1. PC Client 
	It runs the following VB program to time each step of checking a
	file's properties
	Operating System:
	  Windows 2000 Professional
	' ...
	' thisnow is defined in the elided code; it appears to return the current time in ms
	Set objFSO = CreateObject("Scripting.FileSystemObject")
	thistime = thisnow
	If objFSO.FileExists(fn) Then
		totle = totle & "Check file time " & CStr(thisnow - thistime) & " ms" & vbCrLf
		thistime = thisnow
		Set objFile = objFSO.GetFile(fn)
		totle = totle & "Get object time " & CStr(thisnow - thistime) & " ms" & vbCrLf
		thistime = thisnow
		temp = DateValue(CStr(objFile.DateLastModified))
		totle = totle & "Get date time " & CStr(thisnow - thistime) & " ms" & vbCrLf
	End If
2. Linux PC Server:	
	It provides a Linux/Samba shared directory for the client
	(1) Operating System
	  kernel = 2.6.6
	  file system = xfs
	  nic = intel pro 100
	  Samba 2.2.8a (I have also tried Samba 3.0.4, and the result is similar)
	(2) smb.conf
	  [global]
	    encrypt passwords = yes
	    socket options = TCP_NODELAY SO_RCVBUF=4096 SO_SNDBUF=4096
	    max xmit = 4096
	    read raw = No
	    wide links = No
	  [pub]
	    path = /pub
	    guest ok = no
	    write list = test
	    create mode = 0664
	    directory mode = 0775
3. Windows PC Server:	
	It provides a Windows shared directory for the client
	Operating System:
	  Windows 2000 Professional
4. Bench results
	CHECK = FileExists, OBJECT = GetFile, DATE = DateLastModified

	Windows                     Linux/Samba
	CHECK   OBJECT  DATE        CHECK   OBJECT  DATE
	-----   ------  ----        -----   ------  ----
	760     10      120         10      0       203
	750     10      80          20      0       363
	780     10      50          20      761     632
	750     10      50          10      0       173
	800     0       10          40      0       711

	90      0       140         240     212     871
	60      40      90          240     212     821
	90      0       50          210     30      162
	60      50      10          20      220     150
	10      30      20          30      30      160

	0       50      80          741     50      412
	10      40      110         781     10      412
	20      0       70          20      781     381
	10      0       70          10      791     81
	20      20      0           50      21      691
   NOTE: all times are in ms.
Best Regards!
Jacky Kim
.
Sorry, I left out the following information:
1. There are 100,000 files in the shared directory.
2. The data device on the Linux server was formatted with mkfs.xfs and
   is mounted with the noatime option.
Jacky Kim
.
Here, I have seen sub 700MHz P3 systems with IDE disks blow away a dual
900MHz 2K system with SCSI drives in every manner.

I would recommend trying reiserfs, it is superior at handling small
files. This is what we are using.

On Wed, 2004-07-07 at 04:31, Jacky Kim wrote:
> I found that Windows 2K outperforms Linux/Samba by a wide margin when I
> compared the benchmark results. I am confused by this. Can anyone
> explain it?
Gerald (Jerry) Carter
2004-Jul-07  13:41 UTC
[Samba] Windows 2K outperform Linux/Samba very much?
Jacky Kim wrote:
| Sorry, I miss the follow informations:
| 1. There are 100,000 files in shared directory.

This is currently a bad test case for Samba, but we are working on a
solution to address it. The problem is that smbd must perform
case-insensitive file name lookups on a case-sensitive file system.

cheers, jerry
----------------------------------------------------------------------
Hewlett-Packard ------------------------- http://www.hp.com
SAMBA Team ---------------------- http://www.samba.org
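
The cost Jerry describes can be illustrated with a minimal C sketch. This
is not Samba's code: `lookup_ignore_case` is a hypothetical helper written
only to show why a case-insensitive lookup on a case-sensitive file system
ends up walking the directory. A name that does not exist in any case can
only be reported after every entry has been compared, which is the slow
path when the directory holds 100,000 files.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <strings.h>   /* strcasecmp */

/*
 * Hypothetical helper (not part of Samba): look up `wanted` in `dirpath`
 * ignoring case.
 * Returns  1 and copies the on-disk spelling into `found` on a match,
 *          0 if no entry matches,
 *         -1 if the directory cannot be opened.
 */
int lookup_ignore_case(const char *dirpath, const char *wanted,
                       char *found, size_t found_len)
{
    assert(dirpath != NULL && wanted != NULL && found != NULL);

    DIR *dir = opendir(dirpath);
    if (dir == NULL)
        return -1;

    int result = 0;
    struct dirent *entry;

    /* The file system cannot answer "does any spelling of this name
     * exist?", so each entry is compared in turn. A miss costs a scan
     * of the whole directory. */
    while ((entry = readdir(dir)) != NULL) {
        if (strcasecmp(entry->d_name, wanted) == 0) {
            snprintf(found, found_len, "%s", entry->d_name);
            result = 1;
            break;
        }
    }

    closedir(dir);
    return result;
}
```

Calling `lookup_ignore_case("/pub", "REPORT.TXT", buf, sizeof buf)` would
return the stored spelling if one exists. The work grows with the number of
entries in the directory rather than staying constant per lookup.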
I only want to read the small files' properties, NOT their content.
I have just compared XFS and ReiserFS, and the results are very similar.

>Here, I have seen sub 700MHz P3 systems with IDE disks blow away a dual
>900MHz 2K system with SCSI drives in every manner.
>
>I would recommend trying reiserfs, it is superior at handling small
>files. This is what we are using.
>
>>
>> I want to check small files' properties (such as date, path, and so on)
>> frequently. The files are stored on a network drive and their sizes
>> vary from 2KB to 5KB.
>>
>> Sorry, I miss the follow informations:
>> 1. There are 100,000 files in shared directory.
>
>That's the problem. Because Samba has to present a case insensitive
>view of a case-sensitive filesystem we have to scan the entire
>directory to ensure a file doesn't exist.
>
>The answer to the "why is Samba slow" question in this case is
>"don't do that" (put 100,000 files in a directory).

I tried 20,000 files in a directory too, and found the same result:
the Windows share is about 10 times faster than the Linux/Samba one
when getting a small file's properties (NOT content).

Jacky Kim
.
Here's a thought - If ACLs/journalling aren't needed, how well does
Samba work on FAT/VFAT partitions? Could the sheer simplicity of the
filesystem help there?

Thanks,

Mark Lidstone
IT and Network Support Administrator
BMT SeaTech Ltd

-----Original Message-----
From: Malcolm Baldridge [mailto:google@paypc.com]
Sent: 08 July 2004 10:43
To: samba
Subject: Re: Re: [Samba] Windows 2K outperform Linux/Samba very much?

Jacky Kim <jcy_2008@163.com> wrote:
> I trid 20,000 files in a directory too, and found the same result:
> Windows's share is about 10 times faster than Linux/samba's one when
> get small file's property(NOT content).

Jacky,

Not all Linux filesystems are created equal, especially for this kind
of file access method. Ext2/Ext3 is probably the slowest filesystem for
this kind of thing. I have seen some glimpses of directory hashing
being retrofitted into ext2/ext3, but this requires a format-time
option with very new tools, with new mount/kernels, etc.

You'd be MUCH better off with reiserfs. I've had 500,000 files in a
single directory without a significant decrease in performance. I've
never managed to get Windows 2000 to manage this without really tanking
in performance [I've given up the test harness long before it got that
far].

I don't think you'll ever see samba outperforming Windows in this
though, because of the case-insensitivity issue, though it should at
least match the performance. Reiserfs may provide other benefits
(superior access locality) which MIGHT boost performance a bit towards
Linux/Samba, but I'd not hold my breath.

=MB
On Thu, Jul 08, 2004 at 05:26:08PM +0800, Jacky Kim wrote:
> >> Sorry, I miss the follow informations:
> >> 1. There are 100,000 files in shared directory.
> >
> >That's the problem. Because Samba has to present a case insensitive
> >view of a case-sensitive filesystem we have to scan the entire
> >directory to ensure a file doesn't exist.
> >
> >The answer to the "why is Samba slow" question in this case is
> >"don't do that" (put 100,000 files in a directory).
>
> I trid 20,000 files in a directory too, and found the same result:
> Windows's share is about 10 times faster than Linux/samba's one
> when get small file's property(NOT content).

Yes, as soon as smbd has to scan the entire directory to check for
the non-existence of the name, things will slow down. With sub
microsecond timestamps there are some tricks we can play with directory
caching, but these are not immediate fixes.

Jeremy.
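
Jeremy does not spell out the caching tricks, so the following is only one
possible reading, sketched in C under stated assumptions and not taken from
Samba. The idea: remember a directory's modification time when its listing
is cached, and trust the cached listing only while that time is unchanged.
With coarse timestamps two changes inside the same tick would look
identical, which is presumably why fine-grained timestamps matter here.
`struct dir_snapshot` and both functions are hypothetical names, and the
`st_mtim` field assumes a POSIX.1-2008 system such as Linux.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/stat.h>
#include <time.h>

/* Hypothetical record: the directory's mtime at the moment its listing
 * was cached. A real cache would also hold the entry names. */
struct dir_snapshot {
    struct timespec mtime;
};

/* Record the directory's current mtime. Returns 0 on success, -1 if the
 * directory cannot be stat()ed (the snapshot is then left untouched). */
int dir_snapshot_take(const char *dirpath, struct dir_snapshot *snap)
{
    assert(dirpath != NULL && snap != NULL);

    struct stat st;
    if (stat(dirpath, &st) != 0)
        return -1;
    snap->mtime = st.st_mtim;
    return 0;
}

/* Returns 1 if the directory's mtime still equals the snapshot (cached
 * listing may be reused), 0 if it differs (rescan needed), -1 on error. */
int dir_snapshot_is_current(const char *dirpath,
                            const struct dir_snapshot *snap)
{
    assert(dirpath != NULL && snap != NULL);

    struct stat st;
    if (stat(dirpath, &st) != 0)
        return -1;
    return st.st_mtim.tv_sec == snap->mtime.tv_sec &&
           st.st_mtim.tv_nsec == snap->mtime.tv_nsec;
}
```

A server using this approach would pay one stat() per lookup instead of a
full directory scan, as long as the directory has not changed since the
snapshot was taken.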
Thanks, Mark.

Yes, setting 'case sensitive=yes' gives about a 30% performance gain.
But the Windows share is still about 8 times faster than the
Linux/Samba one.

After several tests, I find that XFS performs very close to ReiserFS
when checking files' properties. So I think the main problem is caused
by Samba, not the file system. Is there any other config option I can
set?

Best Regards!
Jacky Kim
.

>You can use the configuration options to force filenames to a particular
>case, and then bypass the slow case-insensitive matching by forcing case
>sensitivity:
>
>	mangle case = yes
>	case sensitive = yes
>	default case = lower
>	preserve case = no
>	short preserve case = no
>
>
>However, on sharing a VFAT filesystem, perhaps you can preserve case,
>and gain the speed increase by just setting 'case sensitive=yes'.
>
Thanks, Mark.
I have tried another test program on Linux. It calls two C library
functions (open and fstat) on the small files, and I measure the time
they take with the gettimeofday system call.
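	// C code ... 
	ret = gettimeofday(&before, &zone);
	fd = open(argv[1], O_RDONLY);
	ret = gettimeofday(&after, &zone);
	// include tv_sec so the result stays correct across a second boundary
	interval1 = (after.tv_sec - before.tv_sec) * 1000000L
	          + (after.tv_usec - before.tv_usec);
	ret = fstat(fd, &buf);
	ret = gettimeofday(&before, &zone);	// 'before' now holds the time after fstat
	interval2 = (before.tv_sec - after.tv_sec) * 1000000L
	          + (before.tv_usec - after.tv_usec);
	// C code ...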
Linux test results (times in usec):
	case      open      fstat
	----      ----      -----
	1         800       240
	1         757       239
	2         18175     245
	2         17240     246
Case 1: smbmount of the Linux/Samba share that contains 20,000 small files,
        with 'case sensitive = Yes' set in smb.conf
Case 2: smbmount of the Windows share that contains 20,000 small files
Test analysis:
1. Linux/Samba performs much better than Windows for the open system call.
2. The two operating systems perform the same for the fstat system call.
But when the VB test program on Windows is used, Windows performs much
better than Linux/Samba
	' VB code ...
	Set objFSO = CreateObject("Scripting.FileSystemObject")
	thistime = thisnow
	If objFSO.FileExists(fn) Then
		totle = totle & "Check file time " & CStr(thisnow - thistime) & " ms" & vbCrLf
		thistime = thisnow
		Set objFile = objFSO.GetFile(fn)
		totle = totle & "Get object time " & CStr(thisnow - thistime) & " ms" & vbCrLf
		thistime = thisnow
		temp = DateValue(CStr(objFile.DateLastModified))
		totle = totle & "Get date time " & CStr(thisnow - thistime) & " ms" & vbCrLf
	End If
	' VB code ...
Windows test results (times in msec):
	case   FileExists  GetFile  DateLastModified
	----   ----------  -------  ----------------
	1         20       0        100
	1         0        0        120
	1         0        0        61
	2         30       10       0
	2         30       0        0
	2         10       0        0
Case 1: mapped drive to the Linux/Samba share that contains 20,000 small files,
        with 'case sensitive = Yes' set in smb.conf
Case 2: mapped drive to the Windows share that contains 20,000 small files
Can we draw the following conclusions about checking small files' properties?
1. A Windows client gets much better performance from a Windows share than
from other shares.
2. A Linux client gets much better performance from a Linux/Samba share
than from other shares.
Best Regards!
Jacky Kim
.
>Hi,
>
>The next step is probably to capture some network traces of your
>benchmarking test and compare and contrast the NT and Samba backends.
>
>Playing with the negotiated protocol level might win you some more
>performance by simplifying the transactions.  It may lose you a lot too
>though.  (See max protocol option)
>
>There are the max xmit/socket options/stat cache tweakables.  If you
>have logging wound up on the linux box, that can affect performance
>greatly too.
>
>How does performance of the filesystem on the linux machine vary with
>number of files in the directory ?  Ie, take samba out of the equation
>and see if you're running into underlying issues with your file layout.
>
>Splitting your directory into a hierarchical structure is certainly more
>scalable than a flat directory.
>
>Mark
>
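
Mark's last suggestion, splitting the flat directory into a hierarchy, can
be sketched in C as follows. This is an illustration under assumptions, not
something from the thread: `bucket_for` and `sharded_path` are hypothetical
helpers, the 256-bucket layout and the djb2-style hash are arbitrary
choices, and `/pub` is simply the share path from the smb.conf above. With
100,000 files spread evenly over 256 subdirectories, each directory would
hold roughly 390 entries, so any per-directory scan becomes far shorter.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: map a file name to one of 256 buckets using a
 * djb2-style string hash (an arbitrary choice for this sketch). */
unsigned bucket_for(const char *name)
{
    assert(name != NULL);

    uint32_t hash = 5381;
    for (const unsigned char *p = (const unsigned char *)name; *p; p++)
        hash = hash * 33u + *p;          /* unsigned wrap-around is fine */
    return (unsigned)(hash % 256u);
}

/* Hypothetical helper: write "<root>/<xx>/<name>" into `out`, where xx is
 * the bucket as two hex digits. Returns 0 on success, -1 if `out` is too
 * small to hold the whole path. */
int sharded_path(const char *root, const char *name,
                 char *out, size_t out_len)
{
    assert(root != NULL && name != NULL && out != NULL);

    int n = snprintf(out, out_len, "%s/%02x/%s",
                     root, bucket_for(name), name);
    return (n >= 0 && (size_t)n < out_len) ? 0 : -1;
}
```

The client would need the same mapping to find a file, so this only helps
when the application controls how file paths are built. The subdirectories
themselves still have to be created before files are stored in them.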