The following error is being logged in /var/log/messages on FreeBSD 5.4:

  Jul 21 09:58:44 arwen kernel: pid 615 (postgres), uid 1001
  inumber 6166128 on /data0: filesystem full

However, this does not appear to be a case of being out of disk space, or of running out of inodes:

  ttyp2$ df -hi
  Filesystem     Size  Used  Avail  Capacity  iused    ifree    %iused  Mounted on
  /dev/amrd0s1f  54G   44G   5.4G   89%       4104458  3257972  56%     /data0

Nor does it appear to be a file limit:

  ttyp2$ sysctl kern.maxfiles kern.openfiles
  kern.maxfiles: 20000
  kern.openfiles: 3582

These readings were not taken at exactly the same time as the error occurred, but close to it.

Here's the head of dumpfs:

  magic   19540119 (UFS2)  time    Fri Jul 21 09:38:40 2006
  superblock location     65536   id      [ 42446884 99703062 ]
  ncg     693     size    29360128        blocks  28434238
  bsize   8192    shift   13      mask    0xffffe000
  fsize   2048    shift   11      mask    0xfffff800
  frag    4       shift   2       fsbtodb 2
  minfree 8%      optim   time    symlinklen 120
  maxbsize 8192   maxbpg  1024    maxcontig 16    contigsumsize 16
  nbfree  563891  ndir    495168  nifree  3245588 nffree  19898
  bpg     10597   fpg     42388   ipg     10624
  nindir  1024    inopb   32      maxfilesize     8804691443711
  sbsize  2048    cgsize  8192    csaddr  1372    cssize  12288
  sblkno  36      cblkno  40      iblkno  44      dblkno  1372
  cgrotor 322     fmod    0       ronly   0       clean   0
  avgfpdir 64     avgfilesize 16384
  flags   soft-updates
  fsmnt   /data0
  volname         swuid   0

Now the server's main function in life is running postgres. I first noticed this error during a maintenance run which sequentially dumps and vacuums each individual database. There are currently 117 databases, most of which are no more than 20M in size, but there are a few outliers, the largest of which is 792M. The bulk of that is stored in a single 500+M file, so I can't see it consuming all my inodes, even if soft-updates weren't cleaning up; perhaps I'm wrong. It has since been happening outside of those runs as well.

I have searched through various forums and list archives, and while I have found a few references to this error, I have not been able to find a cause and subsequent solution posted.

Looking through the source, the error is being logged by ffs_fserr() in sys/ufs/ffs/ffs_alloc.c. It is called either by ffs_alloc() or by ffs_realloccg() after one of the following conditions:

  ffs_alloc {
          ...
  retry:
          if (size == fs->fs_bsize && fs->fs_cstotal.cs_nbfree == 0)
                  goto nospace;
          if (freespace(fs, fs->fs_minfree) - numfrags(fs, size) < 0)
                  goto nospace;
          ...
  nospace:
          if (fs->fs_pendingblocks > 0 && reclaimed == 0) {
                  reclaimed = 1;
                  softdep_request_cleanup(fs, ITOV(ip));
                  goto retry;
          }
          ffs_fserr(fs, ip->i_number, "filesystem full");
  }

My uninformed and uneducated reading of this is that it does not think there are enough blocks free, yet that does not tally with what df is telling me.

Looking again at dumpfs, it appears that this file system was formatted with a block size of 8K and a fragment size of 2K, but tuning(7) says:

     FreeBSD performs best when using 8K or 16K file system block sizes.  The
     default file system block size is 16K, which provides best performance
     for most applications, with the exception of those that perform random
     access on large files (such as database server software).  Such
     applications tend to perform better with a smaller block size, although
     modern disk characteristics are such that the performance gain from
     using a smaller block size may not be worth consideration.  Using a
     block size larger than 16K can cause fragmentation of the buffer cache
     and lead to lower performance.

     The defaults may be unsuitable for a file system that requires a very
     large number of i-nodes or is intended to hold a large number of very
     small files.  Such a file system should be created with an 8K or 4K
     block size.  This also requires you to specify a smaller fragment size.
     We recommend always using a fragment size that is 1/8 the block size
     (less testing has been done on other fragment size factors).

Reading this makes me think that when this server was installed, the block size was dropped from the 16K default to 8K for performance reasons, but the fragment size was not modified accordingly.

Would this be the root of my problem? If so, is my only option to back everything up and newfs the disk, or is there something else I can do that will minimise my downtime?

Any help and advice would be greatly appreciated.

-Feargal.

--
Feargal Reilly, Chief Techie, FBI.
PGP Key: 0x105D7168 (expires: 2006-11-30)
Web: http://www.fbi.ie/ | Tel: +353.14988588 | Fax: +353.14988489
Communications House, 11 Sallymount Avenue, Ranelagh, Dublin 6.
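[A back-of-envelope check, not part of the original post: the second nospace test above can be reproduced from the dumpfs counters, assuming the standard FFS definition of freespace() (whole-block count converted to fragments, plus free fragments, minus the minfree reserve). The numbers below come straight from the dumpfs output quoted in the post.]

```python
# Reproduce ffs_alloc()'s freespace() headroom check, assuming
# freespace(fs, minfree) = blkstofrags(nbfree) + nffree
#                          - dsize * minfree / 100
# with all quantities measured in fragments.
# Counters are from the dumpfs output in the post above.
nbfree = 563891      # free whole blocks
nffree = 19898       # free fragments
frag = 4             # fragments per block
dsize = 28434238     # data size in fragments ("blocks" in dumpfs)
fsize = 2048         # fragment size in bytes
minfree = 8          # reserved percentage

free_frags = nbfree * frag + nffree   # total free space, in fragments
reserve = dsize * minfree // 100      # the 8% minfree reserve
headroom = free_frags - reserve       # what freespace() effectively sees

print(free_frags, reserve, headroom, headroom * fsize)
```

By the on-disk superblock summary, headroom works out to 723 fragments, about 1.4 MB, so the kernel refusing ordinary-user allocations would be unsurprising. That the same df run shows 5.4G available is itself suspicious: statfs and the superblock summary should agree, and a large disagreement is consistent with the stale summary counters that a full fsck is later reported to fix in this thread.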
Nobody else has answered so far, so I'll try to give it a shot ...

Feargal Reilly <feargal@fbi.ie> wrote:
 > The following error is being logged in /var/log/messages on
 > FreeBSD 5.4:
 >
 > Jul 21 09:58:44 arwen kernel: pid 615 (postgres), uid 1001
 > inumber 6166128 on /data0: filesystem full
 >
 > However, this does not appear to be a case of being out of disk
 > space, or running out of inodes:

The "filesystem full" error can happen in three cases:

1.  The file system is running out of data space.
2.  The file system is running out of inodes.
3.  The file system is running out of non-fragmented blocks.

The third case can only happen on extremely fragmented file systems, which is very rare, but maybe it's a possible cause of your problem.

 > kern.maxfiles: 20000
 > kern.openfiles: 3582

Those have nothing to do with "filesystem full".

 > Looking again at dumpfs, it appears to say that this is formatted
 > with a block size of 8K, and a fragment size of 2K, but
 > tuning(7) says: [...]
 >
 > Reading this makes me think that when this server was installed,
 > the block size was dropped from the 16K default to 8K for
 > performance reasons, but the fragment size was not modified
 > accordingly.
 >
 > Would this be the root of my problem?

I think a bsize/fsize ratio of 4/1 _should_ work, but it's not widely used, so there might be bugs hidden somewhere.

 > If so, is my only option
 > to back everything up and newfs the disk, or is there something
 > else I can do that will minimise my downtime?

If you need to change bsize and/or fsize, then you will have to back up and newfs, I'm afraid.

Best regards
   Oliver

--
Oliver Fromme, secnetix GmbH & Co. KG, Marktplatz 29, 85567 Grafing
Dienstleistungen mit Schwerpunkt FreeBSD: http://www.secnetix.de/bsd
Any opinions expressed in this message may be personal to the author
and may not necessarily reflect the opinions of secnetix in any way.
"UNIX was not designed to stop you from doing stupid things, because that would also stop you from doing clever things." -- Doug Gwyn
Peter Jeremy presumably uttered the following on 07/26/06 15:00:

 > On Wed, 2006-Jul-26 13:07:19 -0400, Sven Willenberger wrote:
 >> One of my machines that I recently upgraded to 6.1 (6.1-RELEASE-p3) is also
 >> exhibiting df reporting wrong data usage numbers.
 >
 > What did you upgrade from?
 > Is this UFS1 or UFS2?
 > Does a full fsck fix the problem?

This was an upgrade from a 5.x system (UFS2); a full fsck did in fact fix the problem (for now).

Thanks,

Sven
Sven Willenberger wrote:
 > This was an upgrade from a 5.x system (UFS2); a full fsck did in fact fix the
 > problem (for now).

Because of past experience, I recommend that you disable background fsck (it has a switch in /etc/rc.conf). There are failure scenarios with background fsck that can lead to symptoms similar to what you have experienced.

Best regards
   Oliver

--
Oliver Fromme, secnetix GmbH & Co. KG, Marktplatz 29, 85567 Grafing
Dienstleistungen mit Schwerpunkt FreeBSD: http://www.secnetix.de/bsd
Any opinions expressed in this message may be personal to the author
and may not necessarily reflect the opinions of secnetix in any way.

"C++ is the only current language making COBOL look good."
        -- Bertrand Meyer
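[For reference: the switch Oliver mentions is the background_fsck knob documented in rc.conf(5); the change is a single line.]

```shell
# /etc/rc.conf
# Disable background fsck so that, after an unclean shutdown, the
# system runs a full foreground fsck before mounting read-write.
background_fsck="NO"
```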