All,

A major performance regression was introduced to the CAM subsystem in
FreeBSD 7.1.  The following configurations are known to be affected:

    VMWare ESX
    VMWare Fusion
        (using bt or lsilogic controller options)
    HP CISS RAID
    Some MPT-SAS combinations with SATA drives attached
        (Includes Dell SAS5/ir, but not PERC5/PERC6).

Pure SCSI and SAS subsystems likely are NOT affected.  Any hardware
that uses the 'ata' driver is also definitely NOT affected.  To
determine if your installation is affected, run the following command as
root:

    camcontrol tags da0

Substitute 'da0' with another appropriate drive device number, if
needed.  Note that this ONLY AFFECTS 'da' DEVICES.  If your disks are
'ad' devices, they are NOT affected.

The result from running this command should be an output similar to the
following:

    (pass0:mpt0:0:8:0): device openings: 255

If, instead, it reports a value of '1', you are likely affected.  Note
that it may be normal for USB memory devices to report a low number.
Also, many legacy SCSI disks, and devices that are not disks, may also
be expected to report a low number.

The effect of this problem is that only one I/O command will be issued
to the controller and disk at a time, instead of overlapping multiple
commands in parallel.  This causes significantly higher latency in
servicing moderate and heavy I/O workloads, leading to very poor
performance.  Performance can be easily compared by downgrading to
FreeBSD 7.0.

I have committed a fix for this problem for FreeBSD 8-CURRENT as of SVN
revision 188570.  FreeBSD 7-STABLE will be updated with the fix in a few
days once I've gotten confirmation that the fix works and doesn't cause
any adverse side-effects.  Anyone wanting to help in this validation
effort should apply the attached patch to their kernel source tree and
recompile.  Please contact me directly by email to report if the problem
is fixed for you.

If the validation process goes smoothly, I will work with the release
engineering team to turn this fix into an official errata update for
FreeBSD 7.1.

Thanks in advance for your help.

Scott

-------------- next part --------------
A non-text attachment was scrubbed...
Name: cam_tags.diff
Type: text/x-diff
Size: 429 bytes
Desc: not available
Url : http://lists.freebsd.org/pipermail/freebsd-stable/attachments/20090213/ae382247/cam_tags.bin
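The check described in the announcement can be automated. What follows is only a minimal sketch, not part of Scott's patch: it assumes that `camcontrol tags <device>` prints a line in the format quoted in this thread, and the helper names (`parse_openings`, `looks_affected`, `check_device`) are hypothetical. An openings value of 1 is treated as "likely affected", which, as the announcement notes, can also be normal for USB memory devices, legacy SCSI disks, and non-disk devices.

```python
import re
import subprocess

# Matches lines such as "(pass0:mpt0:0:8:0): device openings: 255".
# The format is taken from the output quoted in this thread (assumption).
OPENINGS_RE = re.compile(
    r"^\((?P<periph>\w+):(?P<sim>[\w-]+):\d+:\d+:\d+\):\s*"
    r"device openings:\s*(?P<openings>\d+)\s*$"
)


def parse_openings(line):
    """Return (controller, openings) from one output line, or None if the
    line is not a 'device openings' line."""
    match = OPENINGS_RE.match(line.strip())
    if match is None:
        return None
    return match.group("sim"), int(match.group("openings"))


def looks_affected(openings):
    """A value of 1 means commands are issued one at a time. This is only a
    hint: some devices legitimately report a low number."""
    return openings == 1


def check_device(device="da0"):
    """Run 'camcontrol tags <device>' (needs root on FreeBSD) and return
    (controller, openings), or None if no matching line was printed."""
    result = subprocess.run(
        ["camcontrol", "tags", device],
        capture_output=True, text=True, check=True,
    )
    for line in result.stdout.splitlines():
        parsed = parse_openings(line)
        if parsed is not None:
            return parsed
    return None
```

On an affected machine, `check_device("da0")` run as root would be expected to return a pair whose second element is 1; passing that value to `looks_affected` then flags the device for a closer look.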
On Fri, 2009-02-13 at 03:55 -0700, Scott Long wrote:
> [...]
> If, instead, it reports a value of '1', you are likely affected.  Note
> that it may be normal for USB memory devices to report a low number.
> Also, many legacy SCSI disks, and devices that are not disks, may also
> be expected to report a low number.
> [...]

Hi Scott

I have one da0 device, a USB attached hard disk:

umass0: <LaCie LaCie Hard Drive USB, class 0/0, rev 2.00/0.00, addr 2> on uhub6
da0 at umass-sim0 bus 0 target 0 lun 0
da0: <SAMSUNG SP2514N VF10> Fixed Direct Access SCSI-2 device
da0: 40.000MB/s transfers
da0: 238475MB (488397168 512 byte sectors: 255H 63S/T 30401C)

camcontrol shows:

$ sudo camcontrol tags da0
(pass0:umass-sim0:0:0:0): device openings: 1

Is that to be expected?

This is RELENG_7 from October '08:

FreeBSD strangepork.mintel.co.uk 7.1-PRERELEASE FreeBSD 7.1-PRERELEASE #0: Wed Oct 22 02:25:56 BST 2008 root@sweetpork.pc.mintel.co.uk:/usr/FreeBSD/RELENG_7/obj/usr/FreeBSD/RELENG_7/src/sys/STRANGEPORK i386

Thanks
Tom
Ivan Voras wrote:
> Scott Long wrote:
>
>> I have committed a fix for this problem for FreeBSD 8-CURRENT as of SVN
>> revision 188570.  FreeBSD 7-STABLE will be updated with the fix in a few
>> days once I've gotten confirmation that the fix works and doesn't cause
>> any adverse side-effects.  Anyone wanting to help in this validation
>> effort should apply the attached patch to their kernel source tree and
>> recompile.  Please contact me directly by email to report if the problem
>> is fixed for you.
>
> I notice that write performance on an ESXi 3.5 hosted system is doubled,
> but read performance remains the same (in bonnie++).
> On a CISS system there is no significant change.

bonnie is an unreliable tool for measuring performance.

Scott
Svein Skogen (listmail account)
2009-Feb-14 09:31 UTC
HEADS UP: Major CAM performance regression
Scott Long wrote:
> [...]
> The effect of this problem is that only one I/O command will be issued
> to the controller and disk at a time, instead of overlapping multiple
> commands in parallel.  This causes significantly higher latency in
> servicing moderate and heavy I/O workloads, leading to very poor
> performance.  Performance can be easily compared by downgrading to
> FreeBSD 7.0.

Any estimate on when this will be MFC'ed down to RELENG_7 yet?

//Svein
Hi Scott,

I just tried this on 7.1-p2 with an Areca (arcmsr) controller with SATA
drives attached to see if it fixed the performance problem I noticed
back in December 2008.  See:

http://docs.freebsd.org/cgi/getmsg.cgi?fetch=43971+0+archive/2008/freebsd-stable/20081207.freebsd-stable

The performance is still terrible.

Interestingly, running your camcontrol command returns "device openings"
of 1 on 7.1, and 255 on 6.4, so it seems to be the same underlying
problem.

I am happy to try other patches.

Thanks,

Jan Mikkelsen
At 05:55 AM 2/13/2009, Scott Long wrote:
>If, instead, it reports a value of '1', you are likely affected.  Note
>that it may be normal for USB memory devices to report a low number.
>Also, many legacy SCSI disks, and devices that are not disks, may
>also be expected to report a low number.

Hi Scott,
I tested with the patch on my Areca controller, and it still reports 1
post patch.  (On RELENG_6, it shows 255 with the same controller.)

---Mike
Hello.

Scott Long wrote:
> The following configurations are known to be affected:
>
> VMWare ESX
> VMWare Fusion
>     (using bt or lsilogic controller options)
> HP CISS RAID
> Some MPT-SAS combinations with SATA drives attached
>     (Includes Dell SAS5/ir, but not PERC5/PERC6).

Does it hold for any of these?  Or do you require a combination of
factors?

I ask because I have two identical HP machines, one running
7.1p2/amd64, the other still at 7.1-PRERELEASE/amd64, and on both I get:

# camcontrol tags da0
(pass1:ciss0:0:0:0): device openings: 254

So it looks like I'm not affected, although I have a ciss RAID.

bye & Thanks
av.
Mike Tancsa wrote:
> At 05:55 AM 2/13/2009, Scott Long wrote:
>
>> If, instead, it reports a value of '1', you are likely affected.  Note
>> that it may be normal for USB memory devices to report a low number.
>> Also, many legacy SCSI disks, and devices that are not disks, may also
>> be expected to report a low number.
>
> Hi Scott,
> I tested with the patch on my areca controller, and it still
> reports 1 post patch.  (On RELENG_6, it shows 255 with the same controller)

I can report a "metoo" on a 7.0-RELEASE-p3 machine with an Areca ARC-1212
card and SATA drives.  "camcontrol tags da0" reports:

(pass0:arcmsr0:0:0:0): device openings: 1

The machine is just a dog sometimes.  Haven't tried the patch though.

Barry