Hi,

We recently upgraded ocfs2 from 1.2.3 to 1.2.8 on our 4-node RAC production systems. On one of the nodes, we see the following in the logs:

Jun 18 02:00:57 db0 kernel: (6327,7):dlm_send_remote_convert_request:398 ERROR: status = -107
Jun 18 02:00:57 db0 kernel: (6327,7):dlm_wait_for_node_death:365 2CED57AE61DE47BA8D2EECE680EFFA6C: waiting 5000ms for notification of death of node 2
Jun 18 02:01:02 db0 kernel: (6327,7):dlm_send_remote_convert_request:398 ERROR: status = -107
Jun 18 02:01:02 db0 kernel: (6327,7):dlm_wait_for_node_death:365 2CED57AE61DE47BA8D2EECE680EFFA6C: waiting 5000ms for notification of death of node 2
Jun 18 02:01:07 db0 kernel: (6327,7):dlm_send_remote_convert_request:398 ERROR: status = -107
Jun 18 02:01:07 db0 kernel: (6327,7):dlm_wait_for_node_death:365 2CED57AE61DE47BA8D2EECE680EFFA6C: waiting 5000ms for notification of death of node 2
Jun 18 02:01:12 db0 kernel: (6327,7):dlm_send_remote_convert_request:398 ERROR: status = -107
Jun 18 02:01:12 db0 kernel: (6327,7):dlm_wait_for_node_death:365 2CED57AE61DE47BA8D2EECE680EFFA6C: waiting 5000ms for notification of death of node 2
Jun 18 02:01:17 db0 kernel: (6327,7):dlm_send_remote_convert_request:398 ERROR: status = -107
Jun 18 02:01:17 db0 kernel: (6327,7):dlm_wait_for_node_death:365 2CED57AE61DE47BA8D2EECE680EFFA6C: waiting 5000ms for notification of death of node 2
Jun 18 02:01:22 db0 kernel: (6327,7):dlm_send_remote_convert_request:398 ERROR: status = -107
Jun 18 02:01:22 db0 kernel: (6327,7):dlm_wait_for_node_death:365 2CED57AE61DE47BA8D2EECE680EFFA6C: waiting 5000ms for notification of death of node 2
Jun 18 02:01:28 db0 kernel: (6327,7):dlm_send_remote_convert_request:398 ERROR: status = -107
Jun 18 02:01:28 db0 kernel: (6327,7):dlm_wait_for_node_death:365 2CED57AE61DE47BA8D2EECE680EFFA6C: waiting 5000ms for notification of death of node 2
Jun 18 02:01:33 db0 kernel: (6327,7):dlm_send_remote_convert_request:398 ERROR: status = -107
Jun 18 02:01:33 db0 kernel: (6327,7):dlm_wait_for_node_death:365 2CED57AE61DE47BA8D2EECE680EFFA6C: waiting 5000ms for notification of death of node 2
Jun 18 02:01:38 db0 kernel: (6327,7):dlm_send_remote_convert_request:398 ERROR: status = -107
Jun 18 02:01:38 db0 kernel: (6327,7):dlm_wait_for_node_death:365 2CED57AE61DE47BA8D2EECE680EFFA6C: waiting 5000ms for notification of death of node 2
Jun 18 02:01:43 db0 kernel: (6327,7):dlm_send_remote_convert_request:398 ERROR: status = -107
Jun 18 02:01:43 db0 kernel: (6327,7):dlm_wait_for_node_death:365 2CED57AE61DE47BA8D2EECE680EFFA6C: waiting 5000ms for notification of death of node 2
Jun 18 00:09:00 db0 kernel: (15652,1):dlm_drop_lockres_ref:2284 ERROR: status = -107
Jun 18 00:09:00 db0 kernel: (15652,1):dlm_purge_lockres:189 ERROR: status = -107
Jun 18 00:09:00 db0 kernel: (15652,1):dlm_drop_lockres_ref:2284 ERROR: status = -107
Jun 18 00:09:00 db0 kernel: (15652,1):dlm_purge_lockres:189 ERROR: status = -107
Jun 18 00:09:00 db0 kernel: (15652,1):dlm_drop_lockres_ref:2284 ERROR: status = -107
Jun 18 00:09:00 db0 kernel: (15652,1):dlm_purge_lockres:189 ERROR: status = -107
Jun 18 00:09:00 db0 kernel: (15652,1):dlm_drop_lockres_ref:2284 ERROR: status = -107
Jun 18 00:09:00 db0 kernel: (15652,1):dlm_purge_lockres:189 ERROR: status = -107
Jun 18 00:09:00 db0 kernel: (15652,1):dlm_drop_lockres_ref:2284 ERROR: status = -107
Jun 18 00:09:00 db0 kernel: (15652,1):dlm_purge_lockres:189 ERROR: status = -107
Jun 18 00:09:00 db0 kernel: (15652,1):dlm_drop_lockres_ref:2284 ERROR: status = -107
Jun 18 00:09:00 db0 kernel: (15652,1):dlm_purge_lockres:189 ERROR: status = -107
Jun 18 00:09:00 db0 kernel: (15652,1):dlm_drop_lockres_ref:2284 ERROR: status = -107
Jun 18 00:09:00 db0 kernel: (15652,1):dlm_purge_lockres:189 ERROR: status = -107
Jun 18 00:09:00 db0 kernel: (15652,1):dlm_drop_lockres_ref:2284 ERROR: status = -107
Jun 18 00:09:00 db0 kernel: (15652,1):dlm_purge_lockres:189 ERROR: status = -107

We suspect that a backup scheduled to run right around 2 am did not complete as a result of these errors. The backup process is hung and still appears in the process list. We are also unable to access the /orabackup folder (ocfs2 mounted) from any of the nodes.

Right now we see the following in the logs:

Jun 18 14:56:27 db0 kernel: (6327,3):dlm_send_remote_convert_request:398 ERROR: status = -107
Jun 18 14:56:27 db0 kernel: (6327,3):dlm_wait_for_node_death:365 2CED57AE61DE47BA8D2EECE680EFFA6C: waiting 5000ms for notification of death of node 2
Jun 18 14:56:32 db0 kernel: (6327,3):dlm_send_remote_convert_request:398 ERROR: status = -107
Jun 18 14:56:32 db0 kernel: (6327,3):dlm_wait_for_node_death:365 2CED57AE61DE47BA8D2EECE680EFFA6C: waiting 5000ms for notification of death of node 2
Jun 18 14:56:37 db0 kernel: (6327,3):dlm_send_remote_convert_request:398 ERROR: status = -107
Jun 18 14:56:37 db0 kernel: (6327,3):dlm_wait_for_node_death:365 2CED57AE61DE47BA8D2EECE680EFFA6C: waiting 5000ms for notification of death of node 2
Jun 18 14:56:42 db0 kernel: (6327,3):dlm_send_remote_convert_request:398 ERROR: status = -107
Jun 18 14:56:42 db0 kernel: (6327,3):dlm_wait_for_node_death:365 2CED57AE61DE47BA8D2EECE680EFFA6C: waiting 5000ms for notification of death of node 2
Jun 18 14:56:48 db0 kernel: (6327,3):dlm_send_remote_convert_request:398 ERROR: status = -107
Jun 18 14:56:48 db0 kernel: (6327,3):dlm_wait_for_node_death:365 2CED57AE61DE47BA8D2EECE680EFFA6C: waiting 5000ms for notification of death of node 2
Jun 18 14:56:53 db0 kernel: (6327,3):dlm_send_remote_convert_request:398 ERROR: status = -107
Jun 18 14:56:53 db0 kernel: (6327,3):dlm_wait_for_node_death:365 2CED57AE61DE47BA8D2EECE680EFFA6C: waiting 5000ms for notification of death of node 2
Jun 18 14:56:58 db0 kernel: (6327,3):dlm_send_remote_convert_request:398 ERROR: status = -107
Jun 18 14:56:58 db0 kernel: (6327,3):dlm_wait_for_node_death:365 2CED57AE61DE47BA8D2EECE680EFFA6C: waiting 5000ms for notification of death of node 2
Jun 18 14:57:03 db0 kernel: (6327,3):dlm_send_remote_convert_request:398 ERROR: status = -107
Jun 18 14:57:03 db0 kernel: (6327,3):dlm_wait_for_node_death:365 2CED57AE61DE47BA8D2EECE680EFFA6C: waiting 5000ms for notification of death of node 2

We need to fix this issue before the backup runs again at 2 am. Please advise what we should do to fix this.

Thanks,
Sincerely,
Saranya
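[For context: status = -107 is -ENOTCONN, i.e. this node has lost its o2net connection to node 2 for the lock domain named by that UUID. A minimal sketch, assuming the ocfs2-tools package is installed, for mapping the UUID in these messages back to a device and seeing which nodes have the volume mounted:]

    # Run as root on any cluster node; requires ocfs2-tools.

    # Quick detect: prints Device / UUID / Label for each ocfs2 volume,
    # so the UUID from the dlm messages can be matched to /orabackup.
    mounted.ocfs2 -d

    # Full detect: additionally lists which cluster nodes currently
    # have each volume mounted.
    mounted.ocfs2 -f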
http://oss.oracle.com/projects/ocfs2/news/article_18.html

This is oss bugzilla #919, which has been fixed in 1.2.9-1.

Saranya Sivakumar wrote:
> [...]
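[A sketch for confirming which ocfs2 release each node is actually running before and after moving to 1.2.9-1; exact package names depend on your kernel and distribution:]

    # Installed ocfs2 and ocfs2-tools packages on this node
    rpm -qa | grep -i ocfs2

    # Version of the ocfs2 module on disk (if the module exports one)
    modinfo ocfs2 | grep -i '^version'

    # Version banner printed when the module was loaded
    dmesg | grep -i 'OCFS2 1\.2'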
Hi,

Thanks for the quick response. Would rebooting the nodes solve the issue temporarily before we schedule an upgrade?

Thanks,
Sincerely,
Saranya Sivakumar

----- Original Message ----
From: Sunil Mushran <Sunil.Mushran at oracle.com>
To: Saranya Sivakumar <sarlavk at yahoo.com>
Cc: ocfs2-users at oss.oracle.com
Sent: Wednesday, June 18, 2008 3:54:31 PM
Subject: Re: [Ocfs2-users] ocfs2 1.2.8 issues

> [...]
No. You have to upgrade the fs. But that should be quicker than rebooting.

Saranya Sivakumar wrote:
> [...]
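[Rough per-node sketch of such an in-place module upgrade, assuming an RPM-based install and an /etc/fstab entry for /orabackup; the package file names are illustrative, so substitute the builds that match `uname -r` and your distribution:]

    # Unmount ocfs2 volumes on this node. If a hung process still holds
    # the mount, it may need to be cleared before the umount succeeds.
    umount /orabackup

    # Take the cluster stack offline and unload the old modules
    /etc/init.d/o2cb offline
    /etc/init.d/o2cb unload

    # Install the kernel module and tools packages for the new release
    # (example file names only)
    rpm -Uvh ocfs2-$(uname -r)-1.2.9-1.*.rpm ocfs2-tools-1.2.*.rpm

    # Reload the new modules, bring the stack back online, and remount
    /etc/init.d/o2cb load
    /etc/init.d/o2cb online
    mount /orabackup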
Hi,

We have a shared backup storage volume that resides on EMC storage and is mounted with ocfs2 1.2.3 on a physical standby database in production. We are in the process of adding another physical standby database and need to mount the backup storage on the new standby as well, to be able to recover a backup that resides on it. The new standby, however, has ocfs2 1.2.9 installed.

Eventually we will remove the current physical standby from the configuration, but for a period of time both physical standbys may be using the same backup storage. Is it OK to mount the shared storage using different ocfs2 versions on different machines? Please advise.

Thanks,
Sincerely,
Saranya Sivakumar
ocfs2 1.2.3 is 3 years old. Suggest you upgrade that to 1.2.9.

1.2.3 and 1.2.9 are not network compatible. The mount will fail.

Saranya Sivakumar wrote:
> [...]
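[A quick pre-mount check, with hypothetical hostnames, to confirm that every node that will mount the shared volume is on the same release line; if the module does not export a version string, fall back to the installed package version:]

    # standby-old / standby-new are placeholders for the two standby hosts
    for host in standby-old standby-new; do
        echo "== $host =="
        ssh "$host" 'rpm -qa | grep -i ocfs2; modinfo ocfs2 | grep -i ^version'
    done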