Heinzmann, Robert
2008-Feb-19 07:29 UTC
[Ocfs2-users] DLMFS on OracleVM 2.1 (OEL5.0 based)
Hi List,
I want to use the DLMFS of OCFS2 to prevent a virtual machine from being
started more than once on OracleVM. I want to use a wrapper around xm that
spawns a daemon that keeps a file open in /dlm/DOMAIN. I played around a bit
and followed the procedure in the document
http://oss.oracle.com/projects/ocfs2/src/branches/ocfs2-1.2/dlmfs.txt
for DLMFS.
There is one problem: the "O_NONBLOCK" option is not working. I have 2
nodes in my setup; both have mounted /dlm as dlmfs and both have
mounted a 5 GB OCFS2 test volume on FS SAN.
I run the getlocks script below without O_NONBLOCK on node1 with a sleep
of, e.g., 30 seconds. While the script is running on node1, I run the
same script on node2. The input prompt appears on node2 as soon as the
script has finished on node1. That's what I expected.
If I run the getlocks script with O_NONBLOCK enabled, node1 successfully
sleeps for, e.g., 30 seconds. While the script is running I execute the
getlocks script on node2. The following output appears, as expected:
--- OUTPUT ----
Traceback (most recent call last):
  File "/INSTALL/getlocks.py", line 12, in ?
    fd=os.open(dir + "/" + mach, os.O_RDWR|os.O_CREAT|os.O_NONBLOCK)
OSError: [Errno 26] Text file busy: '/dlm/xen/TermServ'
--- OUTPUT ----
That's what I expected. But when the getlocks script has finished on node1
and I run getlocks on node2 again, the error above also appears.
Shouldn't node2 be able to acquire the lock once node1 releases the
file handle? Is there something wrong in the OCFS2 DLM code, or
did I get something wrong here?
Regards,
Robert
--- getlocks test script ---
#!/usr/bin/python
import os
import time

dir="/dlm/xen"
mach="TermServ"

if ( not os.path.isdir(dir)):
    os.mkdir(dir)
try:
    # Blocking variant:
    # fd=os.open(dir + "/" + mach, os.O_RDWR|os.O_CREAT)
    # Non-blocking (trylock) variant:
    fd=os.open(dir + "/" + mach,
               os.O_RDWR|os.O_CREAT|os.O_NONBLOCK)
    sleeptime=raw_input("How long to sleep ? ")
    print "Sleeping " + sleeptime + " seconds"
    time.sleep(float(sleeptime))
    os.close(fd)
# Note: os.open() raises OSError, not IOError, so this handler does not
# catch the "Text file busy" failure; that is why the raw traceback
# appears in the output above.
except IOError, e:
    print "ERROR: could not open file\n"
    raise Exception, "%s [%d]" % (e.strerror, e.errno)
--- end getlocks test script ---
Does anyone have a hint here?
I already wrote an e-mail to Mark Fasheh (listed in
http://oss.oracle.com/projects/ocfs2/src/branches/ocfs2-1.2/dlmfs.txt),
but he has not answered yet.
Robert
The behavior is expected. A trylock is just that: it does not trigger a downconvert request to the holder. The holder is only sent a downconvert request when another node attempts to lock the resource without O_NONBLOCK. The downconvert request from another node is honored only after the holder closes the descriptor with close(fd).
If you want to use O_NONBLOCK, you will have to remove the lock (using rm) on the node after the close().
I guess we could add functionality for an explicit downconvert by overloading yet another open flag. File a bugzilla with an enhancement request if you would like that functionality to be added. That way we will be able to monitor the interest level.
Sunil
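The workaround described in the reply (trylock, hold, close, then remove the lock file) could be sketched roughly as follows. This is only an illustration, not code from the OCFS2 project: the helper names try_lock and release_lock are made up here, it is written for a current Python 3 rather than the Python 2 of the original script, and it assumes that unlinking the file on dlmfs behaves like the rm mentioned above.

```python
import errno
import os


def try_lock(path):
    """Try to take the lock without blocking (hypothetical helper).

    Returns an open file descriptor on success, or None if the open
    fails with ETXTBSY ("Text file busy"), which is the error shown
    in the traceback from node2.
    """
    try:
        return os.open(path, os.O_RDWR | os.O_CREAT | os.O_NONBLOCK)
    except OSError as e:
        if e.errno == errno.ETXTBSY:
            return None
        raise


def release_lock(fd, path):
    """Close the descriptor, then remove the lock file (hypothetical helper).

    The unlink is the 'rm after close()' step from the reply. Without it,
    later trylocks from other nodes keep failing, because a trylock never
    asks this node to downconvert.
    """
    os.close(fd)
    try:
        os.unlink(path)
    except OSError as e:
        # Ignore the case where the file is already gone.
        if e.errno != errno.ENOENT:
            raise
```

A wrapper around xm might call try_lock("/dlm/xen/TermServ") before starting the domain, refuse to start it when the result is None, and call release_lock(fd, "/dlm/xen/TermServ") once the domain has stopped.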