Mark Schloss
2008-Jul-21 08:20 UTC
[Ocfs2-users] OCFS processes active after a umount [SEC=UNOFFICIAL]
Hello,

I have two OCFS file systems mounted at /ocfs_1 and /ocfs_2. I have unmounted both OCFS file systems and was then trying to offline and unload OCFS. The offline command failed with -

    # ./o2cb offline
    Stopping O2CB cluster ocfs2: Failed
    Unable to stop cluster as heartbeat region still active

Looking at the processes on this box shows a number of OCFS processes are still active -

    ps -ef |grep ocf
    root  4704    23 0 Jul19 ?     00:00:00 [ocfs2_wq]
    root  4705    23 0 Jul19 ?     00:00:00 [ocfs2vote-0]
    root  4710    23 0 Jul19 ?     00:00:00 [ocfs2cmt-0]
    root  4730    23 0 Jul19 ?     00:00:00 [ocfs2vote-1]
    root  4735    23 0 Jul19 ?     00:00:00 [ocfs2cmt-1]
    root 10214  3485 0 18:12 pts/2 00:00:00 grep ocf

According to the FAQ, the ocfs2vote and ocfs2cmt processes should have gone at the umount.

Mount shows that there are no OCFS file systems mounted -

    # mount |grep -i ocfs
    ocfs2_dlmfs on /dlm type ocfs2_dlmfs (rw)

What can I do to offline/unload OCFS? (offline force fails with the same message as above)

Regards

Mark Schloss

**********************************************************************
Please note that your email address is known to AUSTRAC for the purposes of communicating with you. The information transmitted in this e-mail is for the use of the intended recipient only and may contain confidential and/or legally privileged material. If you have received this information in error you must not disseminate, copy or take any action on it and we request that you delete all copies of this transmission together with attachments and notify the sender.

This footnote also confirms that this email message has been swept for the presence of computer viruses.
**********************************************************************
Sunil Mushran
2008-Jul-21 17:43 UTC
[Ocfs2-users] OCFS processes active after a umount [SEC=UNOFFICIAL]
That is strange.

Next time double check the mounts with:
$ cat /proc/mounts

The mount command prints the entries in /etc/mtab while the /proc/mounts dumps the information from the kernel. If those threads are there, it means the volume is still mounted. Two in this case.

The entries in mtab are added by mount.ocfs2 and removed by umount. There is a chance that mount.ocfs2 was unable to add the entries in that file. Or, maybe one used the -n option to force that behavior.

Which version/kernel is this?

Sunil
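The difference between what /etc/mtab lists and what the kernel reports in /proc/mounts can be checked mechanically. What follows is a minimal sketch and not something posted in the thread: the function name kernel_only_mounts is made up for illustration, and it assumes both files use the usual six-field layout (device, mount point, type, options, dump, pass). It prints ocfs2 mount points that the kernel still reports but mtab does not list.

```shell
#!/bin/sh
# Hypothetical helper (not part of ocfs2-tools): print ocfs2 mount points
# that appear in the kernel's mount table but are missing from mtab.
#   $1 = mtab-format file as seen by mount(8)   (default: /etc/mtab)
#   $2 = mount table as reported by the kernel  (default: /proc/mounts)
kernel_only_mounts() {
    mtab="${1:-/etc/mtab}"
    kern="${2:-/proc/mounts}"
    # Field 2 is the mount point, field 3 the filesystem type.
    # Lines from the first file are remembered; lines from the second
    # file are printed only if their mount point was not remembered.
    awk '$3 == "ocfs2" {
             if (FILENAME == ARGV[1]) listed[$2] = 1
             else if (!($2 in listed)) print $2
         }' "$mtab" "$kern"
}
```

Called with no arguments, kernel_only_mounts reads the live files. Empty output means the two views agree for ocfs2. On newer distributions /etc/mtab is often a symlink into /proc, in which case the two views cannot differ and the check is only meaningful where mtab is a regular file.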
Mark Schloss
2008-Jul-22 04:22 UTC
[Ocfs2-users] OCFS processes active after a umount [SEC=UNOFFICIAL]
Hello Sunil,

Thanks for the reply. The version is 1.2.9-1 and kernel is 2.6.18-92.1.1.el5.

We have managed to reproduce the problem and it appears to be related to multipathing. We recently moved to multipathing the OCFS volumes. On the weekend we tested removing one path to the OCFS volumes and all was OK. When the path was restored the box still showed it as being unavailable. We found this to be caused by the multipath daemon not running, as it had not been set to start automatically. After the daemon was started both paths to the OCFS volumes were shown as available; however, when trying to umount and offline OCFS we see the behaviour outlined originally. This is reproducible as follows -

1. Start with both paths to the OCFS volumes available
2. Shut down the multipath daemon
3. Remove one path by disconnecting a cable to the switch
4. Restore the path by reconnecting the cable
5. Start the multipath daemon
6. Check both paths are available
7. Umount the OCFS file system (this returns immediately without the usual few second delay)
8. Offline OCFS - error is received

Under normal circumstances, i.e. when the multipath daemon is continuously available, losing a path, restoring a path, umount and offline all work as expected.

Regards

Mark

Mark Schloss | Oracle DBA | Information Technology | x0013
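The symptom reported after step 7 (no OCFS2 volume listed as mounted, yet per-volume kernel threads still present) can be turned into a quick check. This is an illustrative sketch only: the function names are hypothetical, and it assumes the per-volume threads are named ocfs2vote-N and ocfs2cmt-N as in the ps output earlier in the thread. ocfs2_wq is deliberately ignored, on the assumption that it belongs to the loaded module rather than to a mounted volume.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not part of ocfs2-tools.

# stdin: one command name per line (with or without [brackets]).
# stdout: only the per-volume OCFS2 kernel threads.
lingering_ocfs2_threads() {
    tr -d '[]' | grep -E '^ocfs2(vote|cmt)' || true
}

# stdin: command names, e.g. from `ps -e -o comm=`
# $1   : mount table to inspect (default: /proc/mounts)
# Returns 1 when no ocfs2 volume is listed as mounted but
# per-volume threads are still running.
check_unmount_state() {
    mounts="${1:-/proc/mounts}"
    threads=$(lingering_ocfs2_threads)
    mounted=$(awk '$3 == "ocfs2" { print $2 }' "$mounts")
    if [ -z "$mounted" ] && [ -n "$threads" ]; then
        echo "inconsistent: no ocfs2 mounts listed, but threads remain:"
        echo "$threads"
        return 1
    fi
    echo "consistent"
}
```

A possible invocation after the umount is `ps -e -o comm= | check_unmount_state`. A non-zero exit status at that point would match the state in which the offline command is reported to fail.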
Sunil Mushran
2008-Jul-23 02:12 UTC
[Ocfs2-users] OCFS processes active after a umount [SEC=UNOFFICIAL]
Did you monitor /proc/mounts as I had suggested?

_______________________________________________
Ocfs2-users mailing list
Ocfs2-users at oss.oracle.com
http://oss.oracle.com/mailman/listinfo/ocfs2-users
Mark Schloss
2008-Jul-24 04:32 UTC
[Ocfs2-users] OCFS processes active after a umount [SEC=UNOFFICIAL]
Hello Sunil,

Yes, we monitored /proc/mounts and /etc/mtab and both look OK. I have attached the output of both of these at various stages in the test outlined in my previous message. Also, the -n option is not used on the mount.

Regards

Mark Schloss

Mark Schloss | Oracle DBA | Information Technology | x0013

Name: OCFS umount.txt
Url: http://oss.oracle.com/pipermail/ocfs2-users/attachments/20080724/22df53e2/attachment.txt
Sunil Mushran
2008-Jul-24 18:15 UTC
[Ocfs2-users] OCFS processes active after a umount [SEC=UNOFFICIAL]
Ok. So this is a bug with multipathing. As in, replace ocfs2 with ext3 and you are going to have the same issue, though it may expose itself differently.

From the fs perspective, at the end of the sequence of events you described, the umount command is not being sent to the fs. Probably because there is still an active reference on the superblock.

If you are using EL, file a bug with Oracle. If RHEL, then Red Hat.

Sunil