Hello all,

For four months now I have been using OCFS2 in an environment with 7 partitions of 14 TB each, running Oracle Linux 6.2, and until last week everything was fine. Now, however, we are running into severe performance problems when doing simple copies.

I have one of the 7 partitions mounted RW on one server and RO on 4 servers. I did a simple cp of various files on the RW server; during that copy the process went into D state and a simple df, for instance, blocked. It took minutes for something that should be immediate. This is happening on any of those partitions.

The 3 servers that are RO are web servers running Nginx and serving those shared files.

I really would like to give you more information, and I have already tried searching Google for OCFS2 problems, but so far no luck.

Any help would be greatly appreciated.

Best Regards
Adelino José Monteiro

--
Cumprimentos / Best Regards
Adelino Monteiro
I have a similar blocking/hanging/stall issue, on Oracle Linux 6.2/x64.

We are running OCFS2 on 3 partitions of about 14-15 TB each. One of the partitions has been running extremely slowly too. They are all running on the same hardware: LSI HW RAID cards and enterprise drives. I am running over drbd and nfsd.

I am getting really slow write performance. The process [jbd2/drbd2-18] seems to be stuck for a long time (about 2 minutes) before it finally commits. While it is stuck, this jbd2 thread prevents any other writes from happening, which stalls the system.

This is from "ps auxr":

USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root      6374  0.0  0.0      0     0 ?        D    Apr03   0:28 [jbd2/drbd2-18]
root      6876  0.0  0.0      0     0 ?        D    Apr06   0:11 [nfsd]
root      6884  0.0  0.0      0     0 ?        D    Apr06   0:09 [nfsd]
root      6999  0.0  0.0      0     0 ?        D    Apr06   0:45 [nfsd]
root      7046  0.0  0.0      0     0 ?        D    Apr06   0:09 [nfsd]
root      7053  0.0  0.0      0     0 ?        D    Apr06   0:09 [nfsd]
root      7054  0.0  0.0      0     0 ?        D    Apr06   0:08 [nfsd]

Running "scan_locks2" shows nothing. Nothing is held up locking-wise.

It seems to happen more when copying files of about 1 MB or larger. It only happens for me on my second partition, not on the other 2. Writes are extremely slow; reads are fast.

I hope to find a solution quickly too. I wonder if it is because we have very large partitions.

Thanks,
Jay
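Both reports above hinge on spotting tasks stuck in uninterruptible sleep (state D). As a minimal sketch, assuming a Linux host with a standard /proc layout, the same information that the ps listing shows can be collected by reading /proc/<pid>/stat directly. The function names parse_stat_line and uninterruptible_tasks are made up for this illustration and are not part of any OCFS2 tooling.

```python
import os


def parse_stat_line(line):
    """Split one /proc/<pid>/stat line into (pid, comm, state)."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so use the first '(' and the last ')' as delimiters.
    lparen = line.index("(")
    rparen = line.rindex(")")
    pid = int(line[:lparen].strip())
    comm = line[lparen + 1:rparen]
    # The single-letter process state is the first field after the name.
    state = line[rparen + 1:].split()[0]
    return pid, comm, state


def uninterruptible_tasks(proc_root="/proc"):
    """Return a sorted list of (pid, comm) for tasks currently in state D."""
    tasks = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as 'meminfo'
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat_line(f.read())
        except (OSError, ValueError, IndexError):
            continue  # the process exited or the file was unreadable
        if state == "D":
            tasks.append((pid, comm))
    return sorted(tasks)


if __name__ == "__main__":
    for pid, comm in uninterruptible_tasks():
        print(f"{pid:>7}  {comm}")
```

Running this in a loop while a slow copy is in progress would show whether the journal thread and the nfsd threads enter and leave D state together, which is the pattern described above.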
On Thu, Apr 05, 2012 at 05:11:59PM +0100, Adelino Monteiro wrote:
> Hello all,
>
> For 4 month now I'm using OCFS in an environment with 7 partitions
> each with 14 Tb running Oracle Linux 6.2 and until last week
> everything was fine.
> Now however we're running into severe performance problems when doing
> simple copies.
>
> I have one of the 7 partitions mounted as RW on one server and 4
> servers with RO. I did a simple cp of various files on the RW server
> and during that copy the process got into D state and a simple df for
> instance blocked. It took minutes for something that should be
> immediate. This is happening on any of those partitions.

Hey Adelino,

I'd love to understand your problems. You say you've been running these systems for four months. Are the slowdowns new? Was anything happening on the RO servers at the time, especially anything touching the same files or directories? How full are the filesystems? How much change do they have (that is, are the files long-lived or constantly being deleted and created)?

Joel

--

"Also, all of life's big problems include the words 'indictment' or
 'inoperable.' Everything else is small stuff."
	- Alton Brown

http://www.jlbec.org/
jlbec at evilplan.org
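For the "how full are the filesystems" question, a rough sketch of gathering block and inode usage from Python follows. It assumes a POSIX system where os.statvfs is available; the mount point passed in the example is a placeholder and must be replaced with the real OCFS2 mount points, which the thread does not name. The percentage is computed as used / (used + available), similar in spirit to the Use% column of df, without df's rounding. Note that statvfs on a stalled filesystem may block in the same way df did.

```python
import os


def percent_used(total, free, avail):
    """Percent of space in use, computed as used / (used + avail)."""
    used = total - free
    denom = used + avail
    if denom <= 0:
        return 0.0  # pseudo-filesystems can report zero blocks
    return 100.0 * used / denom


def usage_report(mount_point):
    """Return (block_pct, inode_pct) for the filesystem at mount_point."""
    st = os.statvfs(mount_point)  # may block if the filesystem is hung
    block_pct = percent_used(st.f_blocks, st.f_bfree, st.f_bavail)
    inode_pct = percent_used(st.f_files, st.f_ffree, st.f_favail)
    return block_pct, inode_pct


if __name__ == "__main__":
    # "/" is only a placeholder; substitute the actual mount points.
    for mp in ["/"]:
        blocks, inodes = usage_report(mp)
        print(f"{mp}: {blocks:.1f}% of blocks used, {inodes:.1f}% of inodes used")
```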