Slawa Olhovchenkov
2016-Mar-06 21:27 UTC
kernel: mps0: Out of chain frames, consider increasing hw.mps.max_chains.
On Sun, Mar 06, 2016 at 01:10:42PM -0800, Scott Long wrote:
> Hi,
>
> The message is harmless; it's a reminder that you should tune the kernel for your workload. When the message is triggered, it means that a command was deferred, likely for only a few microseconds, and then everything moved on as normal.
>
> A command uses anywhere from 0 to a few dozen chain frames per I/O, depending on the size of the I/O. The chain frame memory is allocated at boot so that it's always available, not allocated on the fly. When I wrote this driver, I felt it would be wasteful to reserve memory by default for the worst-case scenario of all large I/Os, so I put in this deferral system with a console reminder for tuning.
>
> Yes, you actually do have 900 I/Os outstanding. The controller buffers the I/O requests and allows the system to queue up much more than SATA disks would allow on their own. It's debatable whether this is good or bad, but it's tunable as well.
>
> Anyway, the messages should not cause alarm. Either tune up the chain frame count, or tune down the max I/O count.

I don't know if it's related, but I see a dramatic performance drop at the time of these messages. How can I calculate the buffer numbers? I have very heavy I/O. Is this allocated once for all controllers, or per controller?

> > On Mar 6, 2016, at 11:45 AM, Slawa Olhovchenkov <slw@zxy.spb.ru> wrote:
> >
> > I use 10-STABLE r295539 and LSI SAS2008.
> > mps0: <Avago Technologies (LSI) SAS2008> port 0x8000-0x80ff mem 0xdfc00000-0xdfc03fff,0xdfb80000-0xdfbbffff irq 32 at device 0.0 on pci2
> > mps0: Firmware: 15.00.00.00, Driver: 20.00.00.00-fbsd
> > mps0: IOCCapabilities: 185c<ScsiTaskFull,DiagTrace,SnapBuf,EEDP,TransRetry,IR>
> > mps1: <Avago Technologies (LSI) SAS2008> port 0x7000-0x70ff mem 0xdf600000-0xdf603fff,0xdf580000-0xdf5bffff irq 34 at device 0.0 on pci3
> > mps1: Firmware: 17.00.01.00, Driver: 20.00.00.00-fbsd
> > mps1: IOCCapabilities: 185c<ScsiTaskFull,DiagTrace,SnapBuf,EEDP,TransRetry,IR>
> > mps2: <Avago Technologies (LSI) SAS2008> port 0xf000-0xf0ff mem 0xfba00000-0xfba03fff,0xfb980000-0xfb9bffff irq 50 at device 0.0 on pci129
> > mps2: Firmware: 15.00.00.00, Driver: 20.00.00.00-fbsd
> > mps2: IOCCapabilities: 185c<ScsiTaskFull,DiagTrace,SnapBuf,EEDP,TransRetry,IR>
> > mps3: <Avago Technologies (LSI) SAS2008> port 0xe000-0xe0ff mem 0xfb400000-0xfb403fff,0xfb380000-0xfb3bffff irq 56 at device 0.0 on pci130
> > mps3: Firmware: 15.00.00.00, Driver: 20.00.00.00-fbsd
> > mps3: IOCCapabilities: 185c<ScsiTaskFull,DiagTrace,SnapBuf,EEDP,TransRetry,IR>
> >
> > Some time ago I started seeing messages like this in the log:
> >
> > Mar 6 22:28:27 edge02 kernel: mps3: Out of chain frames, consider increasing hw.mps.max_chains.
> > Mar 6 22:28:28 edge02 kernel: mps1: Out of chain frames, consider increasing hw.mps.max_chains.
> > Mar 6 22:28:28 edge02 kernel: mps0: Out of chain frames, consider increasing hw.mps.max_chains.
> > Mar 6 22:29:39 edge02 kernel: mps0: Out of chain frames, consider increasing hw.mps.max_chains.
> > Mar 6 22:30:07 edge02 kernel: mps3: Out of chain frames, consider increasing hw.mps.max_chains.
> > Mar 6 22:30:09 edge02 kernel: mps1: Out of chain frames, consider increasing hw.mps.max_chains.
> >
> > This is peak hour.
> > I tried monitoring it:
> >
> > root@edge02:/ # sysctl dev.mps | grep -e chain_free: -e io_cmds_active
> > dev.mps.3.chain_free: 70
> > dev.mps.3.io_cmds_active: 901
> > dev.mps.2.chain_free: 504
> > dev.mps.2.io_cmds_active: 725
> > dev.mps.1.chain_free: 416
> > dev.mps.1.io_cmds_active: 896
> > dev.mps.0.chain_free: 39
> > dev.mps.0.io_cmds_active: 12
> > root@edge02:/ # sysctl dev.mps | grep -e chain_free: -e io_cmds_active
> > dev.mps.3.chain_free: 412
> > dev.mps.3.io_cmds_active: 572
> > dev.mps.2.chain_free: 718
> > dev.mps.2.io_cmds_active: 687
> > dev.mps.1.chain_free: 211
> > dev.mps.1.io_cmds_active: 906
> > dev.mps.0.chain_free: 65
> > dev.mps.0.io_cmds_active: 144
> > root@edge02:/ # sysctl dev.mps | grep -e chain_free: -e io_cmds_active
> > dev.mps.3.chain_free: 500
> > dev.mps.3.io_cmds_active: 629
> > dev.mps.2.chain_free: 623
> > dev.mps.2.io_cmds_active: 676
> > dev.mps.1.chain_free: 251
> > dev.mps.1.io_cmds_active: 907
> > dev.mps.0.chain_free: 139
> > dev.mps.0.io_cmds_active: 144
> >
> > [...]
> >
> > root@edge02:/ # sysctl dev.mps | grep -e chain_free: -e io_cmds_active
> > dev.mps.3.chain_free: 1874
> > dev.mps.3.io_cmds_active: 78
> > dev.mps.2.chain_free: 1888
> > dev.mps.2.io_cmds_active: 64
> > dev.mps.1.chain_free: 1922
> > dev.mps.1.io_cmds_active: 42
> > dev.mps.0.chain_free: 1936
> > dev.mps.0.io_cmds_active: 48
> > root@edge02:/ # sysctl dev.mps | grep -e chain_free: -e io_cmds_active
> > dev.mps.3.chain_free: 1890
> > dev.mps.3.io_cmds_active: 78
> > dev.mps.2.chain_free: 1890
> > dev.mps.2.io_cmds_active: 82
> > dev.mps.1.chain_free: 1729
> > dev.mps.1.io_cmds_active: 150
> > dev.mps.0.chain_free: 1893
> > dev.mps.0.io_cmds_active: 57
> >
> > What does this mean? Why, with 2048 chains allocated per controller, do I see 65 free and 144 allocated?
> > How did I get 976 active commands? I use SATA HDDs that support only 32 NCQ tags; with 8 ports
> > that is at most 256 outstanding commands per controller.
> > How can I resolve this issue?
> > _______________________________________________
> > freebsd-stable@freebsd.org mailing list
> > https://lists.freebsd.org/mailman/listinfo/freebsd-stable
> > To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org"
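For a rough sense of why a 2048-chain pool can run dry here, Scott's per-I/O sizing description can be turned into a back-of-the-envelope check. This is only a sketch: the page size and the SGEs-per-chain-frame capacity below are illustrative assumptions, not driver constants (`mpsutil show iocfacts` or a verbose boot reports the real geometry).

```shell
#!/bin/sh
# Back-of-the-envelope worst-case chain frame demand for one controller.
# SGES_PER_CHAIN is an assumed value for illustration, not a driver constant.
IO_BYTES=$((128 * 1024))   # one large I/O (128 KiB, the common MAXPHYS default)
PAGE=4096                  # bytes covered by one scatter/gather entry
SGES_PER_CHAIN=16          # assumed SGE capacity of a single chain frame
OUTSTANDING=900            # concurrent commands, as seen in io_cmds_active

sges=$((IO_BYTES / PAGE))                                  # SGEs per large I/O
chains=$(( (sges + SGES_PER_CHAIN - 1) / SGES_PER_CHAIN )) # chain frames per I/O
echo "chain frames per large I/O: $chains"
echo "worst case for $OUTSTANDING outstanding: $((chains * OUTSTANDING))"
```

Under these assumptions, ~900 outstanding large I/Os want well over 1000 chain frames, so a burst of big requests can plausibly exhaust a 2048-frame pool even though the steady-state `chain_free` numbers look comfortable.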
Scott Long
2016-Mar-07 02:20 UTC
kernel: mps0: Out of chain frames, consider increasing hw.mps.max_chains.
> On Mar 6, 2016, at 1:27 PM, Slawa Olhovchenkov <slw@zxy.spb.ru> wrote:
>
> On Sun, Mar 06, 2016 at 01:10:42PM -0800, Scott Long wrote:
>
>> Hi,
>>
>> The message is harmless; it's a reminder that you should tune the kernel for your workload. When the message is triggered, it means that a command was deferred, likely for only a few microseconds, and then everything moved on as normal.
>>
>> A command uses anywhere from 0 to a few dozen chain frames per I/O, depending on the size of the I/O. The chain frame memory is allocated at boot so that it's always available, not allocated on the fly. When I wrote this driver, I felt it would be wasteful to reserve memory by default for the worst-case scenario of all large I/Os, so I put in this deferral system with a console reminder for tuning.
>>
>> Yes, you actually do have 900 I/Os outstanding. The controller buffers the I/O requests and allows the system to queue up much more than SATA disks would allow on their own. It's debatable whether this is good or bad, but it's tunable as well.
>>
>> Anyway, the messages should not cause alarm. Either tune up the chain frame count, or tune down the max I/O count.
>
> I don't know if it's related, but I see a dramatic performance drop
> at the time of these messages.

Good to know. Part of the performance drop might be because of the slowness of printing to the console.

> How can I calculate the buffer numbers?

If your system is new enough to have mpsutil, please run "mpsutil show iocfacts". If not, then boot your system with bootverbose and send me the output.

> I have very heavy I/O.

Out of curiosity, do you redefine MAXPHYS/DFLTPHYS in your kernel config?

> Is this allocated once for all controllers, or per controller?

It's per-controller. I've thought about making the tuning dynamic at runtime. I implemented similar dynamic tuning for other drivers, but it seemed overly complex for low benefit.
Implementing it for this driver would be possible, but would require some significant code changes.

Scott
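As a concrete example of the tuning discussed in this thread: `hw.mps.max_chains` is a boot-time loader tunable, so the per-controller chain pool can be enlarged from /boot/loader.conf. The value below is an illustrative starting point sized against the ~900 outstanding commands reported above, not a recommendation; the pool is allocated once at boot, so the change needs a reboot.

```shell
# /boot/loader.conf -- enlarge the mps(4) per-controller chain frame pool.
# The thread reports a default of 2048 chains per controller; 4096 here is
# an example value, doubling the pool for a heavy-I/O workload.
hw.mps.max_chains="4096"
```

After rebooting, watching `sysctl dev.mps | grep chain_free` under peak load (as done earlier in the thread) shows whether the enlarged pool still bottoms out or the "Out of chain frames" messages stop.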