Hi,

We recently upgraded ocfs2 from 1.2.3 to 1.2.8 on our 4-node RAC production systems. On one of the nodes, we see the following in the logs:

Jun 18 02:00:57 db0 kernel: (6327,7):dlm_send_remote_convert_request:398 ERROR: status = -107
Jun 18 02:00:57 db0 kernel: (6327,7):dlm_wait_for_node_death:365 2CED57AE61DE47BA8D2EECE680EFFA6C: waiting 5000ms for notification of death of node 2
Jun 18 02:01:02 db0 kernel: (6327,7):dlm_send_remote_convert_request:398 ERROR: status = -107
Jun 18 02:01:02 db0 kernel: (6327,7):dlm_wait_for_node_death:365 2CED57AE61DE47BA8D2EECE680EFFA6C: waiting 5000ms for notification of death of node 2
Jun 18 02:01:07 db0 kernel: (6327,7):dlm_send_remote_convert_request:398 ERROR: status = -107
Jun 18 02:01:07 db0 kernel: (6327,7):dlm_wait_for_node_death:365 2CED57AE61DE47BA8D2EECE680EFFA6C: waiting 5000ms for notification of death of node 2
Jun 18 02:01:12 db0 kernel: (6327,7):dlm_send_remote_convert_request:398 ERROR: status = -107
Jun 18 02:01:12 db0 kernel: (6327,7):dlm_wait_for_node_death:365 2CED57AE61DE47BA8D2EECE680EFFA6C: waiting 5000ms for notification of death of node 2
Jun 18 02:01:17 db0 kernel: (6327,7):dlm_send_remote_convert_request:398 ERROR: status = -107
Jun 18 02:01:17 db0 kernel: (6327,7):dlm_wait_for_node_death:365 2CED57AE61DE47BA8D2EECE680EFFA6C: waiting 5000ms for notification of death of node 2
Jun 18 02:01:22 db0 kernel: (6327,7):dlm_send_remote_convert_request:398 ERROR: status = -107
Jun 18 02:01:22 db0 kernel: (6327,7):dlm_wait_for_node_death:365 2CED57AE61DE47BA8D2EECE680EFFA6C: waiting 5000ms for notification of death of node 2
Jun 18 02:01:28 db0 kernel: (6327,7):dlm_send_remote_convert_request:398 ERROR: status = -107
Jun 18 02:01:28 db0 kernel: (6327,7):dlm_wait_for_node_death:365 2CED57AE61DE47BA8D2EECE680EFFA6C: waiting 5000ms for notification of death of node 2
Jun 18 02:01:33 db0 kernel: (6327,7):dlm_send_remote_convert_request:398 ERROR: status = -107
Jun 18 02:01:33 db0 kernel: (6327,7):dlm_wait_for_node_death:365 2CED57AE61DE47BA8D2EECE680EFFA6C: waiting 5000ms for notification of death of node 2
Jun 18 02:01:38 db0 kernel: (6327,7):dlm_send_remote_convert_request:398 ERROR: status = -107
Jun 18 02:01:38 db0 kernel: (6327,7):dlm_wait_for_node_death:365 2CED57AE61DE47BA8D2EECE680EFFA6C: waiting 5000ms for notification of death of node 2
Jun 18 02:01:43 db0 kernel: (6327,7):dlm_send_remote_convert_request:398 ERROR: status = -107
Jun 18 02:01:43 db0 kernel: (6327,7):dlm_wait_for_node_death:365 2CED57AE61DE47BA8D2EECE680EFFA6C: waiting 5000ms for notification of death of node 2
Jun 18 00:09:00 db0 kernel: (15652,1):dlm_drop_lockres_ref:2284 ERROR: status = -107
Jun 18 00:09:00 db0 kernel: (15652,1):dlm_purge_lockres:189 ERROR: status = -107
Jun 18 00:09:00 db0 kernel: (15652,1):dlm_drop_lockres_ref:2284 ERROR: status = -107
Jun 18 00:09:00 db0 kernel: (15652,1):dlm_purge_lockres:189 ERROR: status = -107
Jun 18 00:09:00 db0 kernel: (15652,1):dlm_drop_lockres_ref:2284 ERROR: status = -107
Jun 18 00:09:00 db0 kernel: (15652,1):dlm_purge_lockres:189 ERROR: status = -107
Jun 18 00:09:00 db0 kernel: (15652,1):dlm_drop_lockres_ref:2284 ERROR: status = -107
Jun 18 00:09:00 db0 kernel: (15652,1):dlm_purge_lockres:189 ERROR: status = -107
Jun 18 00:09:00 db0 kernel: (15652,1):dlm_drop_lockres_ref:2284 ERROR: status = -107
Jun 18 00:09:00 db0 kernel: (15652,1):dlm_purge_lockres:189 ERROR: status = -107
Jun 18 00:09:00 db0 kernel: (15652,1):dlm_drop_lockres_ref:2284 ERROR: status = -107
Jun 18 00:09:00 db0 kernel: (15652,1):dlm_purge_lockres:189 ERROR: status = -107
Jun 18 00:09:00 db0 kernel: (15652,1):dlm_drop_lockres_ref:2284 ERROR: status = -107
Jun 18 00:09:00 db0 kernel: (15652,1):dlm_purge_lockres:189 ERROR: status = -107
Jun 18 00:09:00 db0 kernel: (15652,1):dlm_drop_lockres_ref:2284 ERROR: status = -107
Jun 18 00:09:00 db0 kernel: (15652,1):dlm_purge_lockres:189 ERROR: status = -107

We suspect that a backup scheduled to run right around 2 am did not complete as a result of these errors. The backup process is hung and still appears in the process list. We are also unable to access the /orabackup folder (ocfs2 mounted) from any of the nodes.

Right now we see the following in the logs:

Jun 18 14:56:27 db0 kernel: (6327,3):dlm_send_remote_convert_request:398 ERROR: status = -107
Jun 18 14:56:27 db0 kernel: (6327,3):dlm_wait_for_node_death:365 2CED57AE61DE47BA8D2EECE680EFFA6C: waiting 5000ms for notification of death of node 2
Jun 18 14:56:32 db0 kernel: (6327,3):dlm_send_remote_convert_request:398 ERROR: status = -107
Jun 18 14:56:32 db0 kernel: (6327,3):dlm_wait_for_node_death:365 2CED57AE61DE47BA8D2EECE680EFFA6C: waiting 5000ms for notification of death of node 2
Jun 18 14:56:37 db0 kernel: (6327,3):dlm_send_remote_convert_request:398 ERROR: status = -107
Jun 18 14:56:37 db0 kernel: (6327,3):dlm_wait_for_node_death:365 2CED57AE61DE47BA8D2EECE680EFFA6C: waiting 5000ms for notification of death of node 2
Jun 18 14:56:42 db0 kernel: (6327,3):dlm_send_remote_convert_request:398 ERROR: status = -107
Jun 18 14:56:42 db0 kernel: (6327,3):dlm_wait_for_node_death:365 2CED57AE61DE47BA8D2EECE680EFFA6C: waiting 5000ms for notification of death of node 2
Jun 18 14:56:48 db0 kernel: (6327,3):dlm_send_remote_convert_request:398 ERROR: status = -107
Jun 18 14:56:48 db0 kernel: (6327,3):dlm_wait_for_node_death:365 2CED57AE61DE47BA8D2EECE680EFFA6C: waiting 5000ms for notification of death of node 2
Jun 18 14:56:53 db0 kernel: (6327,3):dlm_send_remote_convert_request:398 ERROR: status = -107
Jun 18 14:56:53 db0 kernel: (6327,3):dlm_wait_for_node_death:365 2CED57AE61DE47BA8D2EECE680EFFA6C: waiting 5000ms for notification of death of node 2
Jun 18 14:56:58 db0 kernel: (6327,3):dlm_send_remote_convert_request:398 ERROR: status = -107
Jun 18 14:56:58 db0 kernel: (6327,3):dlm_wait_for_node_death:365 2CED57AE61DE47BA8D2EECE680EFFA6C: waiting 5000ms for notification of death of node 2
Jun 18 14:57:03 db0 kernel: (6327,3):dlm_send_remote_convert_request:398 ERROR: status = -107
Jun 18 14:57:03 db0 kernel: (6327,3):dlm_wait_for_node_death:365 2CED57AE61DE47BA8D2EECE680EFFA6C: waiting 5000ms for notification of death of node 2

We need to fix this issue before the backup runs again at 2 am. Please advise what we should do to fix this.

Thanks,
Sincerely,
Saranya
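[For context: status = -107 is -ENOTCONN, i.e. this node has lost its o2net connection to node 2 for the lock domain named by that UUID. A minimal sketch, assuming the ocfs2-tools package is installed, for mapping the UUID in these messages back to a device and seeing which nodes have the volume mounted:]

    # Run as root on any cluster node; requires ocfs2-tools.

    # Quick detect: prints Device / UUID / Label for each ocfs2 volume,
    # so the UUID from the dlm messages can be matched to /orabackup.
    mounted.ocfs2 -d

    # Full detect: additionally lists which cluster nodes currently
    # have each volume mounted.
    mounted.ocfs2 -f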
http://oss.oracle.com/projects/ocfs2/news/article_18.html

This is oss bugzilla #919, which has been fixed in 1.2.9-1.

Saranya Sivakumar wrote:
> [...]
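[A sketch for confirming which ocfs2 release each node is actually running before and after moving to 1.2.9-1; exact package names depend on your kernel and distribution:]

    # Installed ocfs2 and ocfs2-tools packages on this node
    rpm -qa | grep -i ocfs2

    # Version of the ocfs2 module on disk (if the module exports one)
    modinfo ocfs2 | grep -i '^version'

    # Version banner printed when the module was loaded
    dmesg | grep -i 'OCFS2 1\.2'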
Hi,

Thanks for the quick response. Would rebooting the nodes solve the issue temporarily before we schedule an upgrade?

Thanks,
Sincerely,
Saranya Sivakumar

----- Original Message ----
From: Sunil Mushran <Sunil.Mushran at oracle.com>
To: Saranya Sivakumar <sarlavk at yahoo.com>
Cc: ocfs2-users at oss.oracle.com
Sent: Wednesday, June 18, 2008 3:54:31 PM
Subject: Re: [Ocfs2-users] ocfs2 1.2.8 issues

> [...]
No. You have to upgrade the fs. But that should be quicker than rebooting.

Saranya Sivakumar wrote:
> [...]
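[Rough per-node sketch of such an in-place module upgrade, assuming an RPM-based install and an /etc/fstab entry for /orabackup; the package file names are illustrative, so substitute the builds that match `uname -r` and your distribution:]

    # Unmount ocfs2 volumes on this node. If a hung process still holds
    # the mount, it may need to be cleared before the umount succeeds.
    umount /orabackup

    # Take the cluster stack offline and unload the old modules
    /etc/init.d/o2cb offline
    /etc/init.d/o2cb unload

    # Install the kernel module and tools packages for the new release
    # (example file names only)
    rpm -Uvh ocfs2-$(uname -r)-1.2.9-1.*.rpm ocfs2-tools-1.2.*.rpm

    # Reload the new modules, bring the stack back online, and remount
    /etc/init.d/o2cb load
    /etc/init.d/o2cb online
    mount /orabackup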
Hi,

We have a shared backup storage volume that resides on EMC storage and is mounted with ocfs2 1.2.3 on a physical standby database in production. We are in the process of adding another physical standby database and need to mount the backup storage on the new standby as well, to be able to recover a backup that resides on it. The new standby, however, has ocfs2 1.2.9 installed.

Eventually we will remove the current physical standby from the configuration, but for a period of time both physical standbys may be using the same backup storage. Is it OK to mount the shared storage using different ocfs2 versions on different machines? Please advise.

Thanks,
Sincerely,
Saranya Sivakumar
ocfs2 1.2.3 is 3 years old. Suggest you upgrade that to 1.2.9.

1.2.3 and 1.2.9 are not network compatible. The mount will fail.

Saranya Sivakumar wrote:
> [...]
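[A quick pre-mount check, with hypothetical hostnames, to confirm that every node that will mount the shared volume is on the same release line; if the module does not export a version string, fall back to the installed package version:]

    # standby-old / standby-new are placeholders for the two standby hosts
    for host in standby-old standby-new; do
        echo "== $host =="
        ssh "$host" 'rpm -qa | grep -i ocfs2; modinfo ocfs2 | grep -i ^version'
    done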