Hi,

On Wed, Feb 22, 2012 at 2:56 PM, Jack Vogel <jfvogel@gmail.com> wrote:
> Using igb and/or ixgbe on a reasonably powered server requires 1K mbuf
> clusters per MSIX vector, that's how many are in a ring. Either driver
> will configure 8 queues on a system with that many or more cores, so 8K
> clusters per port...
>
> My test engineer has a system with 2 igb ports, and 2 10G ixgbe, this is
> hardly heavy duty, and yet this exceeds the default mbuf pool on the
> installed kernel (1024 + maxusers * 64).
>
> Now, this can be immediately fixed by a sysadmin after that first boot,
> but it does result in the second driver that gets started to complain
> about inadequate buffers.
>
> I think the default calculation is dated and should be changed, but am
> not sure the best way, so are there suggestions/opinions about this, and
> might we get it fixed before 8.3 is baked?

Get rid of the limit once and for all; it is pointless.

- Arnaud
Using igb and/or ixgbe on a reasonably powered server requires 1K mbuf clusters per MSIX vector; that's how many are in a ring. Either driver will configure 8 queues on a system with that many or more cores, so 8K clusters per port...

My test engineer has a system with 2 igb ports and 2 10G ixgbe. This is hardly heavy duty, and yet it exceeds the default mbuf pool on the installed kernel (1024 + maxusers * 64).

Now, this can be immediately fixed by a sysadmin after that first boot, but it does result in the second driver that gets started complaining about inadequate buffers.

I think the default calculation is dated and should be changed, but am not sure of the best way, so are there suggestions/opinions about this, and might we get it fixed before 8.3 is baked?

Cheers,

Jack
On Wed, Feb 22, 2012 at 11:56:29AM -0800, Jack Vogel wrote:
> Using igb and/or ixgbe on a reasonably powered server requires 1K mbuf
> clusters per MSIX vector, that's how many are in a ring. Either driver
> will configure 8 queues on a system with that many or more cores, so 8K
> clusters per port...
>
> My test engineer has a system with 2 igb ports, and 2 10G ixgbe, this is
> hardly heavy duty, and yet this exceeds the default mbuf pool on the
> installed kernel (1024 + maxusers * 64).
>
> Now, this can be immediately fixed by a sysadmin after that first boot,
> but it does result in the second driver that gets started to complain
> about inadequate buffers.
>
> I think the default calculation is dated and should be changed, but am
> not sure the best way, so are there suggestions/opinions about this, and
> might we get it fixed before 8.3 is baked?

I have hit this problem recently, too. Maybe the issue mostly/only exists on 32-bit systems.

Here is a possible approach:

1. nmbclusters consume kernel virtual address space, so there must be
   some upper limit, say

       VM_LIMIT = 256000    (translates to 512MB of address space)

2. Also, you don't want the clusters to take up too much of the
   available memory. This one would only trigger for minimal-memory
   systems, or virtual machines, but still...

       MEM_LIMIT = (physical_ram / 2) / 2048

3. One may try to set a suitably large, desirable number of buffers:

       TARGET_CLUSTERS = 128000

4. And finally we could use the current default as the absolute minimum:

       MIN_CLUSTERS = 1024 + maxusers * 64

Then at boot the system could say:

       nmbclusters = min(TARGET_CLUSTERS, VM_LIMIT, MEM_LIMIT)
       nmbclusters = max(nmbclusters, MIN_CLUSTERS)

In turn, I believe interfaces should do their part and by default never try to allocate more than a fraction of the total number of buffers, if necessary reducing the number of active queues.

What do people think?

cheers
luigi
On Wed, Feb 22, 2012 at 09:09:46PM +0000, Ben Hutchings wrote:
> On Wed, 2012-02-22 at 21:52 +0100, Luigi Rizzo wrote:
...
> > I have hit this problem recently, too.
> > Maybe the issue mostly/only exists on 32-bit systems.
>
> No, we kept hitting mbuf pool limits on 64-bit systems when we started
> working on FreeBSD support.

OK, never mind then; the mechanism would be the same, though the limits (especially VM_LIMIT) would be different.

> > Here is a possible approach:
> >
> > 1. nmbclusters consume the kernel virtual address space so there
> >    must be some upper limit, say
> >
> >    VM_LIMIT = 256000 (translates to 512MB of address space)
> >
> > 2. also you don't want the clusters to take up too much of the available
> >    memory. This one would only trigger for minimal-memory systems,
> >    or virtual machines, but still...
> >
> >    MEM_LIMIT = (physical_ram / 2) / 2048
> >
> > 3. one may try to set a suitably large, desirable number of buffers
> >
> >    TARGET_CLUSTERS = 128000
> >
> > 4. and finally we could use the current default as the absolute minimum
> >
> >    MIN_CLUSTERS = 1024 + maxusers*64
> >
> > Then at boot the system could say
> >
> >    nmbclusters = min(TARGET_CLUSTERS, VM_LIMIT, MEM_LIMIT)
> >
> >    nmbclusters = max(nmbclusters, MIN_CLUSTERS)
> >
> > In turn, i believe interfaces should do their part and by default
> > never try to allocate more than a fraction of the total number
> > of buffers,
>
> Well what fraction should that be? It surely depends on how many
> interfaces are in the system and how many queues the other interfaces
> have.

> > if necessary reducing the number of active queues.
>
> So now I have too few queues on my interface even after I increase the
> limit.
>
> There ought to be a standard way to configure numbers of queues and
> default queue lengths.

Jack raised the problem that there is a poorly chosen default for nmbclusters, causing one interface to consume all the buffers.
If the user explicitly overrides the value, then the number of clusters should be what the user asks for (memory permitting).

The next step is on devices: if there are no overrides, the default for a driver is to be lean. I would say that capping the request at between 1/4 and 1/8 of the total buffers is surely better than the current situation. Of course, if there is an explicit override, then use it whatever happens to the others.

cheers
luigi