thr3ads.net - CentOS - [CentOS] DRBD 8.2 crashes CentOS 5.2 on rsync from remote host [Aug 2008]

Chris Miller

2008-Aug-13 21:54 UTC

[CentOS] DRBD 8.2 crashes CentOS 5.2 on rsync from remote host

I've got a pair of HA servers I'm trying to get into production.
Here are some specs :

Xeon X3210 Quad Core (aka Core 2 Quad) 2.13GHz (four logical
processors, no Hyper-Threading)
4GB memory
Hardware (3ware) RAID 1 mirror, 2 x Seagate 750GB SATA2
650GB DRBD partition running on top of an LVM2 logical volume.

CentOS 5.2 2.6.18-92.1.6.el5.centos.plus
DRBD 8.2 (drbd82-8.2.6-1.el5.centos)
Kernel Module kmod-drbd82-8.2.6-1.2.6.18_92.1.6.el5.centos.plus

I've been trying to rsync data from a remote server, and the machine
has crashed a couple of times now. It does not happen immediately,
but over time. I connected a serial console and captured the panic
message below. The last file copied was ~1GB in size, but files of up
to 4GB had been copied before that. I do not have kernel core dumping
enabled, but I can set it up if needed. I'm not sure if this is a bug
or is caused by something I've done. This isn't my first DRBD install
(although it is my first on top of LVM), and I believe I've set
everything up correctly. I did have a full sync rate (110M) enabled
over GbE, if that's relevant. Thoughts?

Regards,
	Chris
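
As a rough sanity check on the 110M sync rate over GbE mentioned above, the sketch below compares the configured rate with the raw line rate of a 1 Gbit/s link. It assumes the "M" suffix means MiB/s and ignores protocol overhead. The 30% share is a commonly cited rule of thumb for leaving headroom for normal replication traffic, used here only for illustration. Nothing in this thread establishes that the sync rate is related to the panic.

```python
# Rough comparison of a DRBD sync rate with the raw GbE line rate.
# Assumptions (not from the thread): "M" means MiB/s, the link runs at
# exactly 10^9 bits/s, and protocol overhead is ignored.

GBE_BITS_PER_SECOND = 1_000_000_000
MIB = 1024 * 1024


def line_rate_bytes_per_second(bits_per_second: int) -> float:
    """Raw link capacity in bytes per second."""
    return bits_per_second / 8


def sync_rate_fraction(rate_mib: float, link_bits_per_second: int) -> float:
    """Fraction of the raw link capacity consumed by a sync rate in MiB/s."""
    return (rate_mib * MIB) / line_rate_bytes_per_second(link_bits_per_second)


configured_mib = 110  # the "110M" from the post
fraction = sync_rate_fraction(configured_mib, GBE_BITS_PER_SECOND)
print(f"{configured_mib}M is {fraction:.1%} of the raw GbE line rate")

# Illustrative rule-of-thumb share of the link reserved for resync traffic.
RULE_OF_THUMB_SHARE = 0.30
suggested_mib = (
    RULE_OF_THUMB_SHARE * line_rate_bytes_per_second(GBE_BITS_PER_SECOND) / MIB
)
print(f"A {RULE_OF_THUMB_SHARE:.0%} share would be about {suggested_mib:.0f} MiB/s")
```

Under these assumptions the configured rate leaves very little of the link for anything other than resynchronization, which may be worth keeping in mind independently of the crash.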

[root@haws2 ~]# pvscan
   PV /dev/sda2   VG VolGroup00   lvm2 [698.28 GB / 0    free]
   Total: 1 [698.28 GB] / in use: 1 [698.28 GB] / in no VG: 0 [0   ]

[root@haws2 ~]# lvscan
   ACTIVE            '/dev/VolGroup00/LogVol00' [39.06 GB] inherit
   ACTIVE            '/dev/VolGroup00/LogVol02' [658.72 GB] inherit
   ACTIVE            '/dev/VolGroup00/LogVol01' [512.00 MB] inherit

[root@haws2 ~]# df
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/mapper/VolGroup00-LogVol00
                       39676508   1938880  35689628   6% /
/dev/sda1               194442     23650    160753  13% /boot
tmpfs                  1684156         0   1684156   0% /dev/shm
/dev/drbd0           679824572 113321224 531970212  18% /home


[root@haws1 ~]# BUG: unable to handle kernel paging request at virtual address c
  printing eip:
c04e9291
*pde = 00000000
Oops: 0000 [#1]
SMP
last sysfs file: /devices/pci0000:00/0000:00:00.0/irq
Modules linked in: softdog drbd(U) autofs4 hidp rfcomm l2cap
bluetooth sunrpc id
CPU:    0
EIP:    0060:[<c04e9291>]    Tainted: G      VLI
EFLAGS: 00010046   (2.6.18-92.1.6.el5.centos.plus #1)
EIP is at list_del+0x25/0x5c
eax: fe187128   ebx: f04a6ab8   ecx: f04a6a8c   edx: f04a6a8c
esi: fe187128   edi: f4e355a0   ebp: f426c800   esp: f385df3c
ds: 007b   es: 007b   ss: 0068
Process drbd0_asender (pid: 2900, ti=f385d000 task=f4932000 task.ti=f385d000)
Stack: 000000e6 f8d1953b 00000000 f04a6a8c 000000e6 00000001 ee187b14 00000046
       f49e7bc0 f04a6ab8 f04a6a8c f426c800 fe187128 f4e355a0 0000349f f8d24805
       00000800 f426c800 f426c800 00000008 f426c9f4 f8d14d47 f385dfbc f8d15fbc
Call Trace:
  [<f8d1953b>] _req_may_be_done+0x4ea/0x710 [drbd]
  [<f8d24805>] tl_release+0x35/0x172 [drbd]
  [<f8d14d47>] got_BarrierAck+0x10/0x6b [drbd]
  [<f8d15fbc>] drbd_asender+0x3b1/0x4e7 [drbd]
  [<f8d24a53>] drbd_thread_setup+0x0/0x14e [drbd]
  [<f8d24adb>] drbd_thread_setup+0x88/0x14e [drbd]
  [<f8d24a53>] drbd_thread_setup+0x0/0x14e [drbd]
  [<c0405c3b>] kernel_thread_helper+0x7/0x10
  ======================
Code: 89 c3 eb eb 90 90 53 89 c3 8b 40 04 8b 00 39 d8 74 17 50 53 68 9b 9a 63 c
EIP: [<c04e9291>] list_del+0x25/0x5c SS:ESP 0068:f385df3c
  <0>Kernel panic - not syncing: Fatal exception
  BUG: warning at arch/i386/kernel/smp.c:550/smp_call_function() (Tainted: G    )
  [<c0417ae0>] stop_this_cpu+0x0/0x33
  [<c04178cf>] smp_call_function+0x57/0xc3
  [<c0426682>] printk+0x18/0x8e
  [<c041794e>] smp_send_stop+0x13/0x1c
  [<c0425c53>] panic+0x4c/0x16d
  [<c04064dd>] die+0x25d/0x291
  [<c060c48b>] do_page_fault+0x3ea/0x4b8
  [<c060c0a1>] do_page_fault+0x0/0x4b8
  [<c0405a71>] error_code+0x39/0x40
  [<c04e9291>] list_del+0x25/0x5c
  [<f8d1953b>] _req_may_be_done+0x4ea/0x710 [drbd]
  [<f8d24805>] tl_release+0x35/0x172 [drbd]
  [<f8d14d47>] got_BarrierAck+0x10/0x6b [drbd]
  [<f8d15fbc>] drbd_asender+0x3b1/0x4e7 [drbd]
  [<f8d24a53>] drbd_thread_setup+0x0/0x14e [drbd]
  [<f8d24adb>] drbd_thread_setup+0x88/0x14e [drbd]
  [<f8d24a53>] drbd_thread_setup+0x0/0x14e [drbd]
  [<c0405c3b>] kernel_thread_helper+0x7/0x10
  =======================
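
The call trace above can be read mechanically. The following is a minimal sketch, not part of the original post, that parses the `[<address>] symbol+0xoffset/0xsize [module]` lines and reports which frames belong to the drbd module. The regular expression and the `Frame` type are illustrative helpers; the trace text is copied from the first "Call Trace:" section of the panic.

```python
import re
from typing import List, NamedTuple, Optional


class Frame(NamedTuple):
    address: int           # return address printed inside [<...>]
    symbol: str            # function name
    offset: int            # offset into the function
    size: int              # total size of the function
    module: Optional[str]  # module name, or None for the core kernel


# Matches lines such as: [<f8d1953b>] _req_may_be_done+0x4ea/0x710 [drbd]
FRAME_RE = re.compile(
    r"\[<([0-9a-f]+)>\]\s+(\w+)\+0x([0-9a-f]+)/0x([0-9a-f]+)(?:\s+\[(\w+)\])?"
)


def parse_call_trace(text: str) -> List[Frame]:
    """Extract stack frames from the text of a 2.6-era kernel oops."""
    frames = []
    for match in FRAME_RE.finditer(text):
        addr, symbol, offset, size, module = match.groups()
        frames.append(
            Frame(int(addr, 16), symbol, int(offset, 16), int(size, 16), module)
        )
    return frames


TRACE = """
  [<f8d1953b>] _req_may_be_done+0x4ea/0x710 [drbd]
  [<f8d24805>] tl_release+0x35/0x172 [drbd]
  [<f8d14d47>] got_BarrierAck+0x10/0x6b [drbd]
  [<f8d15fbc>] drbd_asender+0x3b1/0x4e7 [drbd]
  [<f8d24a53>] drbd_thread_setup+0x0/0x14e [drbd]
  [<f8d24adb>] drbd_thread_setup+0x88/0x14e [drbd]
  [<f8d24a53>] drbd_thread_setup+0x0/0x14e [drbd]
  [<c0405c3b>] kernel_thread_helper+0x7/0x10
"""

frames = parse_call_trace(TRACE)
for frame in frames:
    origin = frame.module or "kernel"
    print(f"{origin:>6}  {frame.symbol}+{frame.offset:#x}/{frame.size:#x}")

drbd_frames = [f for f in frames if f.module == "drbd"]
print(f"{len(drbd_frames)} of {len(frames)} frames are in the drbd module")
```

Read this way, the faulting `list_del` call is reached from `_req_may_be_done` in drbd, on the `got_BarrierAck` path of the `drbd0_asender` thread. That shows where the fault occurred, not what caused it.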

nate

2008-Aug-13 22:47 UTC

[CentOS] DRBD 8.2 crashes CentOS 5.2 on rsync from remote host

Chris Miller wrote:
> I've got a pair of HA servers I'm trying to get into production.
> Here are some specs :
>
[..]
> [root@haws1 ~]# BUG: unable to handle kernel paging request at
> virtual address c

This typically means bad RAM

nate

Toshaan Bharvani

2008-Aug-18 13:28 UTC

[CentOS] Re: DRBD 8.2 crashes CentOS 5.2 on rsync from remote host

Scott Silva wrote:
> on 8-14-2008 12:55 AM Chris Miller spake the following:
>> nate wrote:
>>> Chris Miller wrote:
>>>> I've got a pair of HA servers I'm trying to get into production.
>>>> Here are some specs :
>>>>
>>>
>>> [..]
>>>> [root@haws1 ~]# BUG: unable to handle kernel paging request at
>>>> virtual address c
>>>
>>> This typically means bad RAM
>>
>> While I won't rule this out, my local hardware vendor does a 48 hour
>> burn-in including a full gamut of tests (including memory) before
>> handing over the servers. These servers are less than two weeks old...
>>
>> Seems like this is a common type of error in some situations. I tried
>> to boot in kexec/kdump mode (CentOS 5 replacement for diskdumputils),
>> but the e1000 driver isn't seeing the NICs after a reboot via the
>> "capture kernel", so I can't replicate the (rsync induced) problem
>> and perform kernel debugging. I'll explore this more tomorrow.
>>
>> Chris
> When the servers are shipped to you, do you open them and make sure
> all modules are seated completely, and haven't been dislodged by the
> shipping?
Why not try a memtest? You can download a bootable CD/USB image and run a check.



nightduke

2008-Aug-18 13:34 UTC

[CentOS] Re: DRBD 8.2 crashes CentOS 5.2 on rsync from remote host

What servers are these? IBM, HP, Dell?

2008/8/18 Toshaan Bharvani <toshaan at hotmail.com>:
> Scott Silva wrote:
>> on 8-14-2008 12:55 AM Chris Miller spake the following:
>>> nate wrote:
>>>> Chris Miller wrote:
>>>>> I've got a pair of HA servers I'm trying to get into production.
>>>>> Here are some specs :
>>>>>
>>>>
>>>> [..]
>>>>> [root@haws1 ~]# BUG: unable to handle kernel paging request at
>>>>> virtual address c
>>>>
>>>> This typically means bad RAM
>>>
>>> While I won't rule this out, my local hardware vendor does a 48 hour
>>> burn-in including a full gamut of tests (including memory) before
>>> handing over the servers. These servers are less than two weeks old...
>>>
>>> Seems like this is a common type of error in some situations. I tried
>>> to boot in kexec/kdump mode (CentOS 5 replacement for diskdumputils),
>>> but the e1000 driver isn't seeing the NICs after a reboot via the
>>> "capture kernel", so I can't replicate the (rsync induced) problem
>>> and perform kernel debugging. I'll explore this more tomorrow.
>>>
>>> Chris
>> When the servers are shipped to you, do you open them and make sure
>> all modules are seated completely, and haven't been dislodged by the
>> shipping?
> why not try a memtest, you can download a bootable cd/usb and do a check