Heinzmann, Robert
2008-Feb-19 07:29 UTC
[Ocfs2-users] DLMFS on OracleVM 2.1 (OEL5.0 based)
Hi List,

I want to use DLMFS of OCFS2 to avoid starting the same virtual machine more than once on OracleVM. I want to use a wrapper around xm that spawns a daemon that keeps a file open in /dlm/DOMAIN. I played around a bit and followed the procedure for DLMFS in the document http://oss.oracle.com/projects/ocfs2/src/branches/ocfs2-1.2/dlmfs.txt.

There's one problem: the "O_NONBLOCK" option is not working. I have 2 nodes in my setup that have both mounted /dlm as dlmfs and both have mounted a 5 GB OCFS2 test volume on FS SAN.

I run the getlocks script below without O_NONBLOCK on node1 with a sleep of - e.g. - 30. Then, while the script is running on node1, I run the same script on node2. The input prompt appears on node2 as soon as the script has finished on node1. That's what I expected.

If I run the getlocks script with O_NONBLOCK enabled, node1 successfully sleeps e.g. 30 sec. While the script is running I execute the getlocks script on node2. The following output appears - as expected:

--- OUTPUT ----
Traceback (most recent call last):
  File "/INSTALL/getlocks.py", line 12, in ?
    fd=os.open(dir + "/" + mach, os.O_RDWR|os.O_CREAT|os.O_NONBLOCK)
OSError: [Errno 26] Text file busy: '/dlm/xen/TermServ'
--- OUTPUT ----

That's what I expected. But when the getlocks script has finished on node1 and I run getlocks on node2 again, the ERROR above also appears.

Shouldn't it be possible for node2 to acquire the lock once node1 releases the FH? Is there something wrong in the OCFS2 DLM code, or did I get something wrong here?

Regards,
Robert

--- getlocks test script ---
#!/usr/bin/python

import os
import time

dir="/dlm/xen"
mach="TermServ"

if ( not os.path.isdir(dir)):
    os.mkdir(dir)
try:
    # fd=os.open(dir + "/" + mach, os.O_RDWR|os.O_CREAT)
    fd=os.open(dir + "/" + mach, os.O_RDWR|os.O_CREAT|os.O_NONBLOCK)
    sleeptime=raw_input("How long to sleep ? ")
    print "Sleeping " + sleeptime + " seconds"
    time.sleep(float(sleeptime))
    os.close(fd)
except IOError, e:
    print "ERROR: could not open file\n"
    raise Exception, "%s [%d]" % (e.strerror, e.errno)
--- end getlocks test script ---

Does anyone have a hint here?

I already wrote an e-mail to Mark Fasheh (listed in http://oss.oracle.com/projects/ocfs2/src/branches/ocfs2-1.2/dlmfs.txt), but he has not answered yet.

Robert
The behavior is expected. A trylock is just that: it does not trigger a downconvert request to the holder. The holder is only sent a downconvert request when another node attempts to lock the resource without O_NONBLOCK. The downconvert request itself is honored only after the holder closes the descriptor with close(fd).

If you want to use O_NONBLOCK, you will have to remove the lock (using rm) on the node after the close().

I guess we could add functionality for an explicit downconvert by overloading yet another open flag. If you want that functionality added, file a bugzilla with an enhancement request. That way we will be able to monitor the interest level.

Sunil
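The workaround described in the reply (take the lock with O_NONBLOCK, then remove the lock file after close()) could be sketched roughly as below. This is only a sketch and has not been tested against a real dlmfs mount. It is written for Python 3 rather than the Python 2 of the original script, the helper names try_lock and release_lock are made up for illustration, and the assumption that a busy lock shows up as errno ETXTBSY is taken from the "Errno 26" traceback in the original post.

```python
import errno
import os


def try_lock(domain_dir, name):
    """Try to take the lock file without blocking (hypothetical helper).

    Returns an open file descriptor on success, or None if the lock is
    reported busy. Any other error is passed on to the caller.
    """
    # Create the domain directory if it is missing, as the original script does.
    os.makedirs(domain_dir, exist_ok=True)
    path = os.path.join(domain_dir, name)
    try:
        # O_NONBLOCK makes this a trylock according to the thread.
        return os.open(path, os.O_RDWR | os.O_CREAT | os.O_NONBLOCK)
    except OSError as e:
        # os.open() raises OSError, not IOError; errno 26 (ETXTBSY) is
        # the "Text file busy" error from the traceback in the post.
        if e.errno == errno.ETXTBSY:
            return None
        raise


def release_lock(domain_dir, name, fd):
    """Close the descriptor, then remove the lock file (the 'rm' step)."""
    os.close(fd)
    try:
        os.unlink(os.path.join(domain_dir, name))
    except FileNotFoundError:
        # Already removed, nothing left to do.
        pass
```

A wrapper would call try_lock("/dlm/xen", "TermServ"), refuse to start the VM if it returns None, and call release_lock() with the returned descriptor when the VM has stopped. The unlink after the close is the important part: per the reply, closing the descriptor alone leaves the lock on the node, so a later trylock from the other node still fails.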