Displaying 20 results from an estimated 25 matches for "ocfs2_replay_journal".
2009 Sep 24
1
strange fencing behavior
...mation on journal
Sep 24 14:07:57 storage0 kernel: [650683.566388] kjournald starting.
Commit interval 5 seconds
Sep 24 14:07:57 storage0 kernel: [650683.566388] ocfs2: Mounting device
(8,18) on (node 0, slot 10) with ordered data mode.
Sep 24 14:07:57 storage0 kernel: [650683.566388]
(12231,1):ocfs2_replay_journal:1149 Recovering node 10 from slot 0 on
device (8,18)
Sep 24 14:08:00 storage0 kernel: [650687.138110] kjournald starting.
Commit interval 5 seconds
Sep 24 14:08:00 storage0 kernel: [650687.268898]
(12231,1):ocfs2_replay_journal:1149 Recovering node 2 from slot 1 on
device (8,18)
Sep 24 14:08:0...
2013 Nov 01
1
How to break out the unstop loop in the recovery thread? Thanks a lot.
...ng on to the storage.
But the last one does not restart, and it still write error message into syslog as below:
Oct 30 02:01:01 server177 kernel: [25786.227598] (ocfs2rec,14787,13):ocfs2_read_journal_inode:1463 ERROR: status = -5
Oct 30 02:01:01 server177 kernel: [25786.227615] (ocfs2rec,14787,13):ocfs2_replay_journal:1496 ERROR: status = -5
Oct 30 02:01:01 server177 kernel: [25786.227631] (ocfs2rec,14787,13):ocfs2_recover_node:1652 ERROR: status = -5
Oct 30 02:01:01 server177 kernel: [25786.227648] (ocfs2rec,14787,13):__ocfs2_recovery_thread:1358 ERROR: Error -5 recovering node 2 on device (8,32)!
Oct 30 02:01:...
2006 Apr 18
1
Self-fencing issues (RHEL4)
...:45 rac1/rac1 (2903,0):o2net_set_nn_state:411 no longer
connected to node rac2 (num 1) at 10.0.1.2:7777
Apr 18 15:56:45 rac1/rac1 (2897,1):dlm_send_proxy_ast_msg:448 ERROR:
status = -107
Apr 18 15:56:45 rac1/rac1 (2897,1):dlm_flush_asts:556 ERROR: status = -107
Apr 18 15:56:46 rac1/rac1 (19545,0):ocfs2_replay_journal:1172 Recovering
node 1 from slot 1 on device (8,41)
Apr 18 15:56:46 rac1/rac1 (19544,0):ocfs2_replay_journal:1172 Recovering
node 1 from slot 1 on device (8,37)
Apr 18 15:56:51 rac2/rac2 <0>Rebooting in 60
seconds..<5>(3,0):o2net_idle_timer:1310 connection to node rac1 (num 0)
at 10...
2009 May 12
2
add error check for ocfs2_read_locked_inode() call
After upgrading from 2.6.28.10 to 2.6.29.3 I've seen the following new errors
in the kernel log:
May 12 14:46:41 falcon-cl5
May 12 14:46:41 falcon-cl5 (6757,7):ocfs2_read_locked_inode:466 ERROR:
status = -22
Only one node in the cluster has the volumes mounted:
/dev/sde on /home/apache/users/D1 type ocfs2
(rw,_netdev,noatime,heartbeat=local)
/dev/sdd on /home/apache/users/D2 type ocfs2
2006 Jul 10
1
2 Node cluster crashing
...e_change:512 connection to node
rac2.globoforce.com num 1 at 198.87.235.246:7777 has been idle for 10
seconds,
shutting it down.
Jul 7 14:56:23 rac1 kernel: (10042,0):o2net_set_nn_state:414 no longer
connected to node rac2.globoforce.com at 198.87.235.246:7777
Jul 7 14:56:56 rac1 kernel: (14410,3):ocfs2_replay_journal:1123 Recovering
node 1 from slot 1 on device (8,65)
rac2:
Jul 7 14:56:24 rac2 kernel: (0,0):o2net_state_change:512 connection to node
rac1.globoforce.com num 0 at 198.87.235.244:7777 has been idle for 10
seconds,
shutting it down.
Jul 7 14:56:24 rac2 kernel: (10201,0):o2net_set_nn_state:414 no l...
2009 Feb 04
1
Strange dmesg messages
...ore lock mastery can begin
(6968,7):dlm_get_lock_resource:947 F59B45831EEA41F384BADE6C4B7A932B:
recovery map is not empty, but must master $RECOVERY lock now
(6968,7):dlm_do_recovery:524 (6968) Node 1 is the Recovery Master for
the Dead Node 0 for Domain F59B45831EEA41F384BADE6C4B7A932B
(12281,2):ocfs2_replay_journal:1004 Recovering node 0 from slot 0 on
device (8,33)
(fs/jbd/recovery.c, 255): journal_recover: JBD: recovery, exit status 0,
recovered transactions 66251376 to 66251415
(fs/jbd/recovery.c, 257): journal_recover: JBD: Replayed 3176 and
revoked 0/0 blocks
kjournald starting. Commit interval 5 sec...
2011 Apr 01
1
Node Recovery locks I/O in two-node OCFS2 cluster (DRBD 8.3.8 / Ubuntu 10.10)
...ly way I've been able to successfully regain I/O within the cluster is
to bring back up the other node. While monitoring the logs, it seems that it
is OCFS2 that's establishing the lock/unlock and not DRBD at all.
>
>
> Apr 1 12:07:19 ubu10a kernel: [ 1352.739777]
> (ocfs2rec,3643,0):ocfs2_replay_journal:1605 Recovering node 1124116672 from
> slot 1 on device (147,0)
> Apr 1 12:07:19 ubu10a kernel: [ 1352.900874]
> (ocfs2rec,3643,0):ocfs2_begin_quota_recovery:407 Beginning quota recovery in
> slot 1
> Apr 1 12:07:19 ubu10a kernel: [ 1352.902509]
> (ocfs2_wq,1213,0):ocfs2_finish_...
2007 Nov 29
1
Troubles with two node
...54FF88030591B1210C560:$RECOVERY: at least one node (0)
to recover before lock mastery can begin
Nov 22 18:14:54 web-ha2 kernel: (3550,0):dlm_get_lock_resource:876
86472C5C33A54FF88030591B1210C560: recovery map is not empty, but must
master $RECOVERY lock now
Nov 22 18:14:54 web-ha2 kernel: (17893,0):ocfs2_replay_journal:1184
Recovering node 0 from slot 0 on device (8,17)
Nov 22 18:14:55 web-ha2 kernel: (17803,1):dlm_restart_lock_mastery:1215
ERROR: node down! 0
Nov 22 18:14:55 web-ha2 kernel: (17803,1):dlm_wait_for_lock_mastery:1036
ERROR: status = -11
Nov 22 18:14:55 web-ha2 kernel: (17602,0):dlm_restart_lock_mas...
2009 Mar 04
2
[PATCH 1/1] Patch to recover orphans in offline slots during recovery and mount
During recovery, a node recovers orphans in its own slot and in the dead
node(s)' slots. But if the dead nodes were holding orphans in offline slots,
those orphans are left unrecovered.
If the dead node is the last one to die while holding orphans in other slots,
and is the first one to mount again, it recovers only its own slot, which
leaves orphans in the offline slots.
This patch queues complete_recovery
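The gap the patch description identifies can be stated as a schematic sketch. This is illustrative pseudologic only, not the kernel code; the function and parameter names are invented for the example:

```python
# Schematic sketch (NOT kernel code) of which slots get orphan recovery.
# Names here are illustrative, not taken from fs/ocfs2.
def slots_to_recover(my_slot, dead_slots, offline_slots, patched=False):
    """Return the set of slots whose orphans this node will recover."""
    slots = {my_slot} | set(dead_slots)   # pre-patch: own slot + dead nodes
    if patched:
        slots |= set(offline_slots)       # patch also queues offline slots
    return slots

# Last node to die remounts first: without the patch, orphans left in the
# offline slots (1 and 2 here) are never recovered.
print(sorted(slots_to_recover(0, [], [1, 2])))        # -> [0]
print(sorted(slots_to_recover(0, [], [1, 2], True)))  # -> [0, 1, 2]
```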
2006 Mar 14
1
problems with ocfs2
An HTML attachment was scrubbed...
URL: http://oss.oracle.com/pipermail/ocfs2-users/attachments/20060314/b38f73eb/attachment.html
2006 Sep 21
0
ocfs2 reboot
...ow :
o2net_idle_timer:1309 here are some times that might help debug the
situation: (tmr 1158758358.807993 now 1158758368.805980 dr 1158758358.807964adv
1158758358.808000:1158758358.808001 func (23633ca3:504) 1158757938.878265:
1158757938.878271)
Sep 20 15:20:02 src-rac-duplicati1 kernel:
(10047,0):ocfs2_replay_journal:1174 Recovering node 1 from slot 0 on device
(104,1)
Sep 20 15:20:05 src-rac-duplicati1 kernel:
(2062,1):dlm_get_lock_resource:847
6AEF3479C4784E9895BDE697EFCAC035:$RECOVERY: at least one node (1) to recover
before lock mastery can begin
Sep 20 15:20:05 src-rac-duplicati1 kernel:
(2062,1):dlm_get_lo...
2011 Dec 20
8
ocfs2 - Kernel panic on many write/read from both
Sorry, I didn't copy everything:
TEST-MAIL1# echo "ls //orphan_dir:0000"|debugfs.ocfs2 /dev/dm-0|wc
debugfs.ocfs2 1.6.4
5239722 26198604 246266859
TEST-MAIL1# echo "ls //orphan_dir:0001"|debugfs.ocfs2 /dev/dm-0|wc
debugfs.ocfs2 1.6.4
6074335 30371669 285493670
TEST-MAIL2 ~ # echo "ls //orphan_dir:0000"|debugfs.ocfs2 /dev/dm-0|wc
debugfs.ocfs2 1.6.4
5239722 26198604
2010 Jan 14
1
another fencing question
Hi,
periodically one of on my two nodes cluster is fenced here are the logs:
Jan 14 07:01:44 nvr1-rc kernel: o2net: no longer connected to node nvr2-
rc.minint.it (num 0) at 1.1.1.6:7777
Jan 14 07:01:44 nvr1-rc kernel: (21534,1):dlm_do_master_request:1334 ERROR:
link to 0 went down!
Jan 14 07:01:44 nvr1-rc kernel: (4007,4):dlm_send_proxy_ast_msg:458 ERROR:
status = -112
Jan 14 07:01:44
2007 Mar 08
4
ocfs2 cluster becomes unresponsive
...958DB:$RECOVERY: at least one node (2) to recover before lock mastery can begin
Mar 8 07:23:40 groupwise-1-mht kernel: (28613,2):dlm_get_lock_resource:874 B6ECAF5A668A4573AF763908F26958DB: recovery map is not empty, but must master $RECOVERY lock now
Mar 8 07:23:41 groupwise-1-mht kernel: (4432,0):ocfs2_replay_journal:1176 Recovering node 2 from slot 1 on device (253,1)
Mar 8 07:23:41 groupwise-1-mht kernel: (4192,0):dlm_restart_lock_mastery:1214 ERROR: node down! 2
Mar 8 07:23:41 groupwise-1-mht kernel: (4192,0):dlm_wait_for_lock_mastery:1035 ERROR: status = -11
Mar 8 07:23:41 groupwise-1-mht kernel: (929,1)...
2009 Apr 07
1
Backport to 1.4 of patch that recovers orphans from offline slots
The following patch is a backport of the patch that recovers orphans from
offline slots. It is being backported from mainline to 1.4.
mainline patch: 0001-Patch-to-recover-orphans-in-offline-slots-during-rec.patch
Thanks,
--Srini
2009 Mar 06
0
[PATCH 1/1] ocfs2: recover orphans in offline slots during recovery and mount
...bail:
mutex_lock(&osb->recovery_lock);
@@ -1314,6 +1414,7 @@ bail:
goto restart;
}
+ ocfs2_free_replay_slots(osb);
osb->recovery_thread_task = NULL;
mb(); /* sync with ocfs2_recovery_thread_running */
wake_up(&osb->recovery_event);
@@ -1465,6 +1566,9 @@ static int ocfs2_replay_journal(struct ocfs2_super *osb,
goto done;
}
+ /* we need to run complete recovery for offline orphan slots */
+ ocfs2_replay_map_set_state(osb, REPLAY_NEEDED);
+
mlog(ML_NOTICE, "Recovering node %d from slot %d on device (%u,%u)\n",
node_num, slot_num,
MAJOR(osb->sb-&g...
2009 Mar 06
1
[PATCH 1/1] Patch to recover orphans in offline slots during recovery and mount (revised)
...bail:
mutex_lock(&osb->recovery_lock);
@@ -1314,6 +1414,7 @@ bail:
goto restart;
}
+ ocfs2_free_replay_slots(osb);
osb->recovery_thread_task = NULL;
mb(); /* sync with ocfs2_recovery_thread_running */
wake_up(&osb->recovery_event);
@@ -1465,6 +1566,9 @@ static int ocfs2_replay_journal(struct ocfs2_super *osb,
goto done;
}
+ /* we need to run complete recovery for offline orphan slots */
+ ocfs2_replay_map_set_state(osb, REPLAY_NEEDED);
+
mlog(ML_NOTICE, "Recovering node %d from slot %d on device (%u,%u)\n",
node_num, slot_num,
MAJOR(osb->sb-&g...
2007 Oct 08
2
OCF2 and LVM
Does anybody know if there is a certified procedure to
back up a RAC DB 10.2.0.3 based on OCFS2
via split-mirror or snapshot technology?
Using Linux LVM and OCFS2, does anybody know whether it is
possible to dynamically extend an OCFS2 filesystem
once the underlying LVM volume has been extended?
Thanks in advance
Riccardo Paganini
2008 Oct 22
2
Another node is heartbeating in our slot! errors with LUN removal/addition
...1745 File
system was not unmounted cleanly, recovering volume.
Oct 22 03:16:30 ausracdb03 kernel: kjournald starting. Commit interval
5 seconds
Oct 22 03:16:30 ausracdb03 kernel: ocfs2: Mounting device (253,28) on
(node 2, slot 0) with ordered data mode.
Oct 22 03:16:30 ausracdb03 kernel: (9939,1):ocfs2_replay_journal:1076
Recovering node 0 from slot 3 on device (253,28)
Oct 22 03:16:32 ausracdb03 kernel: (9861,2):o2hb_do_disk_heartbeat:770
ERROR: Device "dm-28": another node is heartbeating in our slot!
Oct 22 03:16:34 ausracdb03 kernel: (9861,2):o2hb_do_disk_heartbeat:770
ERROR: Device "dm-28&qu...
2008 Sep 04
4
[PATCH 0/3] ocfs2: Switch over to JBD2.
ocfs2 currently uses the Journaled Block Device (JBD) for its
journaling. This is a very stable and well-tested codebase. However, JBD
is architecturally limited to 32-bit block numbers. This means an ocfs2
filesystem is limited to 2^32 blocks. With a 4K blocksize, that's 16TB.
People want larger volumes.
Fortunately, there is now JBD2. JBD2 adds 64-bit block number support
and some other
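The 16TB figure above follows directly from the block-number width; a quick arithmetic check (the helper name is just for illustration):

```python
# Check the size limit implied by the journal's block-number width:
# JBD uses 32-bit block numbers, JBD2 extends them to 64 bits.
def max_fs_bytes(block_bits: int, block_size: int) -> int:
    """Largest filesystem addressable with `block_bits`-bit block numbers."""
    return (2 ** block_bits) * block_size

TIB = 2 ** 40
# With a 4 KiB block size, 32-bit block numbers cap the filesystem at 16 TiB.
print(max_fs_bytes(32, 4096) // TIB)  # -> 16
```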