CentOS 6.4 x86_64
Kernel: 2.6.32-358.14.1.el6.x86_64

I have been noticing repeatedly that after a couple of weeks of uptime my NFS server starts to generate the following error:

------------[ cut here ]------------
WARNING: at lib/list_debug.c:26 __list_add+0x6d/0xa0() (Tainted: G W ---------------)
Hardware name: ProLiant DL380 G7
list_add corruption. next->prev should be prev (ffff88031930bde8), but was ffff8803e9401888. (next=ffff8803e9401888).
Modules linked in: bridge mptctl mptbase nfsd lockd nfs_acl auth_rpcgss exportfs dlm configfs sunrpc ipmi_devintf cpufreq_ondemand freq_table pcc_cpufreq bonding 8021q garp stp llc ipv6 power_meter sg microcode serio_raw iTCO_wdt iTCO_vendor_support hpilo hpwdt bnx2 i7core_edac edac_core shpchp ext4 mbcache jbd2 sd_mod crc_t10dif hpsa radeon ttm drm_kms_helper drm i2c_algo_bit i2c_core dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan]
Pid: 3269, comm: nfsd Tainted: G W --------------- 2.6.32-358.14.1.el6.x86_64 #1
Call Trace:
 [<ffffffff8106e307>] ? warn_slowpath_common+0x87/0xc0
 [<ffffffff8106e3f6>] ? warn_slowpath_fmt+0x46/0x50
 [<ffffffffa01e249f>] ? ext4_mark_iloc_dirty+0x35f/0x5a0 [ext4]
 [<ffffffff812890fd>] ? __list_add+0x6d/0xa0
 [<ffffffffa01eb77f>] ? ext4_orphan_add+0x13f/0x1f0 [ext4]
 [<ffffffffa01eefbb>] ? ext4_rename+0x58b/0x750 [ext4]
 [<ffffffff8119009b>] ? vfs_rename+0x3ab/0x440
 [<ffffffffa03e3aca>] ? nfsd_rename+0x47a/0x4d0 [nfsd]
 [<ffffffffa03eb4f1>] ? nfsd3_proc_rename+0xd1/0x1a0 [nfsd]
 [<ffffffffa03ecb85>] ? decode_fh+0x55/0x80 [nfsd]
 [<ffffffffa03ecddc>] ? decode_filename+0x1c/0x70 [nfsd]
 [<ffffffffa03dd43e>] ? nfsd_dispatch+0xfe/0x240 [nfsd]
 [<ffffffffa033b614>] ? svc_process_common+0x344/0x640 [sunrpc]
 [<ffffffff81063330>] ? default_wake_function+0x0/0x20
 [<ffffffffa033bc50>] ? svc_process+0x110/0x160 [sunrpc]
 [<ffffffffa03ddb62>] ? nfsd+0xc2/0x160 [nfsd]
 [<ffffffffa03ddaa0>] ? nfsd+0x0/0x160 [nfsd]
 [<ffffffff81096956>] ? kthread+0x96/0xa0
 [<ffffffff8100c0ca>] ? child_rip+0xa/0x20
 [<ffffffff810968c0>] ? kthread+0x0/0xa0
 [<ffffffff8100c0c0>] ? child_rip+0x0/0x20
---[ end trace 9af8b68bfc4bdd49 ]---

After a reboot all is okay. Searched, but could not find anything recent that would apply to the errors that I am seeing.

Does anyone have a clue as to what could be causing this error?

Thank you,
Sajesh
> -----Original Message-----
> From: Barbara Krasovec [mailto:barbarak at arnes.si]
> Sent: Friday, September 13, 2013 8:48 AM
> To: Sajesh Singh
> Subject: Re: [CentOS] Errors on NFS server
>
> We had similar errors, but installed 3.8.10 kernel from elrepo on the
> machine (also HP Proliant DL380 G7). NFS seems to work much better on
> that kernel. We see no such errors.
>
> Cheers,
> Barbara

Barbara,

Thank you for the info. Did you have any issues running the HP drivers on the server after installing the 3.8.10 kernel?

-Sajesh-