Dear Sir,

I am getting the error messages below continuously on the OSS1 node. As a result, the SGE service intermittently stops running.

Feb 5 04:03:37 oss1 kernel: LustreError: 9193:0:(filter_io_26.c:693:filter_commitrw_write()) error starting transaction: rc = -30
Feb 5 04:03:47 oss1 kernel: LustreError: 9164:0:(filter_io_26.c:693:filter_commitrw_write()) error starting transaction: rc = -30
Feb 5 04:03:47 oss1 kernel: LustreError: 28420:0:(filter_io_26.c:693:filter_commitrw_write()) error starting transaction: rc = -30
Feb 5 04:03:48 oss1 kernel: LustreError: 9266:0:(filter_io_26.c:693:filter_commitrw_write()) error starting transaction: rc = -30
Feb 5 04:03:50 oss1 kernel: LustreError: 9200:0:(filter_io_26.c:693:filter_commitrw_write()) error starting transaction: rc = -30
Feb 5 04:03:53 oss1 kernel: LustreError: 9230:0:(filter_io_26.c:693:filter_commitrw_write()) error starting transaction: rc = -30
Feb 5 04:03:57 oss1 kernel: LustreError: 9212:0:(filter_io_26.c:693:filter_commitrw_write()) error starting transaction: rc = -30
Feb 5 04:04:03 oss1 kernel: LustreError: 9262:0:(filter_io_26.c:693:filter_commitrw_write()) error starting transaction: rc = -30
Feb 5 04:04:08 oss1 kernel: LustreError: 9162:0:(filter_io_26.c:693:filter_commitrw_write()) error starting transaction: rc = -30
Feb 5 04:04:15 oss1 kernel: LustreError: 9271:0:(filter_io_26.c:693:filter_commitrw_write()) error starting transaction: rc = -30
Feb 5 04:04:23 oss1 kernel: LustreError: 9191:0:(filter_io_26.c:693:filter_commitrw_write()) error starting transaction: rc = -30
Feb 5 04:04:32 oss1 kernel: LustreError: 9242:0:(filter_io_26.c:693:filter_commitrw_write()) error starting transaction: rc = -30

I have attached the detailed log information. The attached file contains continuous /var/log/messages logs, separated by *.

Kindly give me a solution for this issue.

Thanks & Regards
VIJESH E K

-------------- next part --------------
A non-text attachment was scrubbed...
Name: newlogs.rtf
Type: application/rtf
Size: 62690 bytes
Desc: not available
Url : http://lists.lustre.org/pipermail/lustre-discuss/attachments/20120209/ca2f4ca0/attachment-0001.rtf
Dear All,

Kindly provide a solution for the issue below.

Thanks & Regards
VIJESH E K

On Thu, Feb 9, 2012 at 3:26 PM, VIJESH EK <ekvijesh at gmail.com> wrote:
> I am getting below mentioned error messages continuously in OSS1 Node,it
> causes that sge service is not running intermittently.......
>
> Feb 5 04:03:37 oss1 kernel: LustreError:
> 9193:0:(filter_io_26.c:693:filter_commitrw_write()) error starting
> transaction: rc = -30
> [...]
Hi Vijesh,

Are you running the SGE master spooling on Lustre? What about the exec node spooling?

I strongly recommend that you do not run the master spooling on Lustre and, if possible, that you use local spooling on local disk for the exec nodes.

SGE (at least until version 6.2u7) is known to become unstable when spooling runs on Lustre.

Carlos

On Feb 10, 2012, at 1:18 AM, "VIJESH EK" <ekvijesh at gmail.com> wrote:
> Kindly get a solution for these below issue...........
>
> On Thu, Feb 9, 2012 at 3:26 PM, VIJESH EK <ekvijesh at gmail.com> wrote:
>> I am getting below mentioned error messages continuously in OSS1 Node,it
>> causes that sge service is not running intermittently.......
>>
>> Feb 5 04:03:37 oss1 kernel: LustreError:
>> 9193:0:(filter_io_26.c:693:filter_commitrw_write()) error starting
>> transaction: rc = -30
>> [...]
Errno 30 is EROFS (read-only file system). Perhaps there is an issue further up in the logs indicating that the OST went read-only?

Kevin

On Feb 10, 2012, at 12:17 AM, VIJESH EK wrote:
> Kindly get a solution for these below issue...........
>
> On Thu, Feb 9, 2012 at 3:26 PM, VIJESH EK <ekvijesh at gmail.com> wrote:
>> I am getting below mentioned error messages continuously in OSS1 Node,it
>> causes that sge service is not running intermittently.......
>>
>> Feb 5 04:03:37 oss1 kernel: LustreError:
>> 9193:0:(filter_io_26.c:693:filter_commitrw_write()) error starting
>> transaction: rc = -30
>> [...]
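The errno mapping mentioned in the reply above can be checked directly. The following is a minimal Python sketch, assuming a Linux host (where errno 30 is EROFS); the helper name `describe_rc` is illustrative and not part of Lustre or SGE.

```python
import errno
import os


def describe_rc(rc):
    """Map a negative kernel-style return code to its errno name and message."""
    code = abs(rc)
    # errno.errorcode maps numeric codes to symbolic names on this platform.
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


# The Lustre log lines in this thread report "rc = -30".
name, message = describe_rc(-30)
print(name, "-", message)
```

The kernel reports errors as negative errno values, so the sign is dropped before the lookup.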
Dear All,

We have made the following change on the exec nodes, but we are still getting the same errors in /var/log/messages.

1. We changed the exec nodes' spool directory to a local directory by editing the file /home/appl/sge-root/default/common/configuration and changing the parameter execd_spool_dir.

Even after this change, the error below still appears on the OSS1 node. It is generated only on the OSS1 node.

Feb 6 18:32:10 oss1 kernel: LustreError: 9362:0:(filter_io_26.c:693:filter_commitrw_write()) error starting transaction: rc = -30
Feb 6 18:32:05 oss1 kernel: LustreError: 9422:0:(filter_io_26.c:693:filter_commitrw_write()) error starting transaction: rc = -30
Feb 6 18:32:06 oss1 kernel: LustreError: 9432:0:(filter_io_26.c:693:filter_commitrw_write()) error starting transaction: rc = -30
Feb 6 18:32:07 oss1 kernel: LustreError: 9369:0:(filter_io_26.c:693:filter_commitrw_write()) error starting transaction: rc = -30
Feb 6 18:32:10 oss1 kernel: LustreError: 9362:0:(filter_io_26.c:693:filter_commitrw_write()) error starting transaction: rc = -30

Can you tell me how to change the master spool directory?
Is it possible to change the directory in live mode?

Kindly explain briefly so that we can proceed to the next step.

Thanks and Regards
VIJESH

On Fri, Feb 10, 2012 at 1:19 PM, Carlos Thomaz <cthomaz at ddn.com> wrote:
> Are you running the SGE master spooling on lustre?!?! What about the exec
> nodes spooling?!
>
> I strongly recommend you to do not run the master spooling on lustre. And
> if possible use local spooling on local disk for the exec nodes.
>
> SGE (at least until version 6.2u7) is known to get unstable when running
> the spooling on lustre.
> [...]
Hi,

Your OST has become read-only; that is the reason. Generally, this is related to the underlying storage: for example, the storage hardware has a fault, or your ldiskfs file system is corrupted. You had better check your storage and run e2fsck on the OST.

On Tue, Feb 21, 2012 at 2:52 PM, VIJESH EK <ekvijesh at gmail.com> wrote:
> After changing this also the same error, i.e below mentioned error is coming
> in OSS1 Node. This error is generating only in the OSS1 Node.
>
> Feb 6 18:32:10 oss1 kernel: LustreError:
> 9362:0:(filter_io_26.c:693:filter_commitrw_write()) error starting
> transaction: rc = -30
> [...]
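To find out when the OST went read-only, one approach is to locate the first `rc = -30` line in /var/log/messages and read the kernel messages just before it. The following is a minimal Python sketch under stated assumptions: the regular expression is derived only from the log lines quoted in this thread, the function name `first_error_with_context` is hypothetical, and the first sample line is a placeholder rather than real log output.

```python
import re

# Matches the LustreError lines quoted in this thread, for example:
# "Feb 5 04:03:37 oss1 kernel: LustreError: 9193:0:(file.c:693:func()) msg: rc = -30"
LUSTRE_ERROR = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+(?P<host>\S+)\s+kernel:\s+"
    r"LustreError:\s+(?P<pid>\d+):\d+:\((?P<where>\S+\(\))\)\s+"
    r"(?P<msg>.*?):\s+rc = (?P<rc>-?\d+)\s*$"
)


def first_error_with_context(lines, rc=-30, context=5):
    """Return (index, preceding_lines) for the first LustreError with this rc.

    Returns (None, []) when no line matches.
    """
    for index, line in enumerate(lines):
        match = LUSTRE_ERROR.match(line)
        if match and int(match.group("rc")) == rc:
            return index, lines[max(0, index - context):index]
    return None, []


# Two error lines from the thread, preceded by a placeholder that stands in
# for whatever the real log contains just before the first error.
sample = [
    "Feb 5 04:03:30 oss1 kernel: <placeholder for an earlier kernel message>",
    "Feb 5 04:03:37 oss1 kernel: LustreError: 9193:0:(filter_io_26.c:693:filter_commitrw_write()) error starting transaction: rc = -30",
    "Feb 5 04:03:47 oss1 kernel: LustreError: 9164:0:(filter_io_26.c:693:filter_commitrw_write()) error starting transaction: rc = -30",
]

index, before = first_error_with_context(sample)
print("first rc = -30 line at index", index)
for line in before:
    print("  before:", line)
```

On the OSS itself, the same function could be given the lines of /var/log/messages (for instance via `open("/var/log/messages").read().splitlines()`); the lines it returns are the place to look for storage or file system errors.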
Dear Sir,

Thanks for your immediate response.

I have checked the OST permissions; the OST is in read-write mode, and no hard disk has failed. In the storage console, all disks show online, working status.
I have attached the detailed log information. Kindly go through the logs and get back to me.

Thanks & Regards
VIJESH

On Tue, Feb 21, 2012 at 1:13 PM, Larry <tsrjzq at gmail.com> wrote:
> Your OST becomes read-only, that's the reason. Generally, it has
> relationship with your hardware, for example, your storage is broken,
> or your ldiskfs file system is broken.
> You'd better check your storage and e2fsck the OST.
> [...]

-------------- next part --------------
A non-text attachment was scrubbed...
Name: newoss1messages
Type: application/octet-stream
Size: 4037261 bytes
Desc: not available
Url : http://lists.lustre.org/pipermail/lustre-discuss/attachments/20120221/93125cf0/attachment-0001.obj
I have checked your logs. There may be several OSTs on your oss1, and at least one of them must be read-only. This has nothing to do with permissions. Running e2fsck on your OST device is recommended to resolve the "rc=-30" problem.

On Tue, Feb 21, 2012 at 4:00 PM, VIJESH EK <ekvijesh at gmail.com> wrote:
> I have checked the OST permission, it is in read write mode only and no
> hard disk is failed in the storage console all are in on-line working
> status.
> Herewith i have attached the detailed log information , you kindly go
> through the logs and get back me
> [...]
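When an OSS serves several OSTs, it helps to see which mounted target is currently read-only. The following Python sketch parses text in /proc/mounts format and lists read-only entries. It rests on assumptions: that the read-only state shows up as the `ro` mount option on this kernel, and that the targets appear with file system type `lustre` or `ldiskfs`. The device names and mount points in the sample are hypothetical, and the kernel log remains the authoritative source.

```python
def read_only_mounts(mounts_text, fstypes=("lustre", "ldiskfs")):
    """Return (device, mountpoint) pairs mounted read-only.

    mounts_text uses the /proc/mounts layout:
    device mountpoint fstype options dump pass
    """
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        device, mountpoint, fstype, options = fields[:4]
        if fstype in fstypes and "ro" in options.split(","):
            found.append((device, mountpoint))
    return found


# Hypothetical sample; device names and mount points are made up.
sample_mounts = """\
/dev/sda1 / ext3 rw 0 0
/dev/sdb /mnt/ost0 lustre ro 0 0
/dev/sdc /mnt/ost1 lustre rw 0 0
"""

for device, mountpoint in read_only_mounts(sample_mounts):
    print(device, "is mounted read-only at", mountpoint)
```

On the OSS, the function could be called with `open("/proc/mounts").read()`. Any device it reports is a candidate for the e2fsck run recommended above, after the target has been unmounted.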
We are waiting for your feedback.

Thanks & Regards
VIJESH E K

On Tue, Feb 21, 2012 at 12:22 PM, VIJESH EK <ekvijesh at gmail.com> wrote:
> Dear All,
>
> We have made the following change on the exec nodes, but we are still
> getting the same errors in /var/log/messages.
>
> 1. We changed the exec nodes' spool directory to a local directory by
>    editing the file /home/appl/sge-root/default/common/configuration
>    and changing the parameter execd_spool_dir.
>
> Even after this change, the error below still appears on the OSS1 node.
> It is generated only on the OSS1 node.
>
> Feb 6 18:32:10 oss1 kernel: LustreError: 9362:0:(filter_io_26.c:693:filter_commitrw_write()) error starting transaction: rc = -30
> Feb 6 18:32:05 oss1 kernel: LustreError: 9422:0:(filter_io_26.c:693:filter_commitrw_write()) error starting transaction: rc = -30
> Feb 6 18:32:06 oss1 kernel: LustreError: 9432:0:(filter_io_26.c:693:filter_commitrw_write()) error starting transaction: rc = -30
> Feb 6 18:32:07 oss1 kernel: LustreError: 9369:0:(filter_io_26.c:693:filter_commitrw_write()) error starting transaction: rc = -30
> Feb 6 18:32:10 oss1 kernel: LustreError: 9362:0:(filter_io_26.c:693:filter_commitrw_write()) error starting transaction: rc = -30
>
> Can you tell me how to change the master spool directory?
> Is it possible to change the directory while the system is live?
>
> Kindly explain briefly, so that we can proceed to the next step.
>
> Thanks and Regards
> VIJESH
>
> On Fri, Feb 10, 2012 at 1:19 PM, Carlos Thomaz <cthomaz at ddn.com> wrote:
>> Hi Vijesh.
>>
>> Are you running the SGE master spooling on Lustre?! What about the exec
>> nodes spooling?!
>>
>> I strongly recommend that you do not run the master spooling on Lustre.
>> And if possible, use local spooling on local disk for the exec nodes.
>>
>> SGE (at least until version 6.2u7) is known to get unstable when running
>> the spooling on Lustre.
>>
>> Carlos
>>
>> On Feb 10, 2012, at 1:18 AM, "VIJESH EK" <ekvijesh at gmail.com> wrote:
>>> Dear All,
>>>
>>> Kindly provide a solution for the issue below.
>>>
>>> Thanks & Regards
>>> VIJESH E K
This is not the correct list for help with SGE.

That being said, the real issue (as has been mentioned by several people) is that an OST has gone read-only due to some issue. The file system will not function properly until this is resolved, irrespective of where you put SGE.

You will need to check the logs on oss1 to find the initial issue, stop the bad OST, and take corrective action (the details of which depend on the issue).

Kevin

On Feb 21, 2012, at 3:23 AM, "VIJESH EK" <ekvijesh at gmail.com> wrote:
> We are waiting for your feedback.
>
> Thanks & Regards
> VIJESH E K

Confidentiality Notice: This e-mail message, its contents and any attachments to it are confidential to the intended recipient, and may contain information that is privileged and/or exempt from disclosure under applicable law. If you are not the intended recipient, please immediately notify the sender and destroy the original e-mail message and any attachments (and any copies that may have been made) from your system or otherwise. Any unauthorized use, copying, disclosure or distribution of this information is strictly prohibited. Email addresses that end with a "-c" identify the sender as a Fusion-io contractor.
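The "rc = -30" in these messages is a negated Linux errno value; 30 is EROFS ("Read-only file system"), which is consistent with an OST having gone read-only. A minimal sketch, assuming a Linux host and syslog lines in the format quoted in this thread (the helper names and the regular expression are illustrative, not part of Lustre), that decodes the code and picks out the first matching line of a log excerpt:

```python
import errno
import os
import re

# Pattern inferred only from the LustreError samples quoted in this thread.
RC_PATTERN = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+ \d\d:\d\d:\d\d) .*LustreError: .*rc = (?P<rc>-?\d+)"
)

def decode_rc(rc):
    """Return (errno name, message) for a negative kernel return code."""
    code = abs(rc)
    return errno.errorcode.get(code, "UNKNOWN"), os.strerror(code)

def first_error(lines, rc=-30):
    """Return the first line, in file order, that reports the given rc."""
    for line in lines:
        match = RC_PATTERN.match(line)
        if match and int(match.group("rc")) == rc:
            return line
    return None

# Two lines copied from the thread, used here as sample input.
sample = [
    "Feb 5 04:03:37 oss1 kernel: LustreError: 9193:0:(filter_io_26.c:693:filter_commitrw_write()) error starting transaction: rc = -30",
    "Feb 5 04:03:47 oss1 kernel: LustreError: 9164:0:(filter_io_26.c:693:filter_commitrw_write()) error starting transaction: rc = -30",
]

print(decode_rc(-30))
print(first_error(sample))
```

The lines just before the first such entry in /var/log/messages are where the initial cause would be expected to show up.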
Dear Kevin,

I have attached /var/log/messages herewith. Kindly go through the logs and give me a solution for this immediately.

Can you tell me how to run e2fsck on an OST? Please give the exact command, with switches, to run e2fsck without affecting the data.

We are waiting for your reply.

Thanks & Regards
VIJESH E K

On Tue, Feb 21, 2012 at 8:38 PM, Kevin Van Maren <KVanMaren at fusionio.com> wrote:
> This is not the correct list for help with SGE.
>
> That being said, the real issue (as has been mentioned by several people)
> is that an OST has gone read-only due to some issue. The file system will
> not function properly until this is resolved, irrespective of where you put
> SGE.
>
> You will need to check the logs on oss1 to find the initial issue, stop
> the bad OST, and take corrective action (the details of which depend on the
> issue).
>
> Kevin

-------------- next part --------------
A non-text attachment was scrubbed...
Name: newoss1messages
Type: application/octet-stream
Size: 4037261 bytes
Desc: not available
Url : http://lists.lustre.org/pipermail/lustre-discuss/attachments/20120222/e958f206/attachment-0001.obj
The logs you attached start sometime after the issue began. To tell what happened, you need to find the error in the logs from before you started getting these errors:

Feb 5 04:03:13 oss1 kernel: LustreError: 9222:0:(filter_io_26.c:693:filter_commitrw_write()) error starting transaction: rc = -30

It looks like you rebooted the server, OST0 and OST1 were mounted, and you are NOT getting those errors any more, but both OSTs reported errors on mount. So unmount the OSTs, and run:

    e2fsck /dev/dm-0
    e2fsck /dev/dm-1

I don't know how mangled your OSTs are, so I don't know what e2fsck will report.

See also http://wiki.lustre.org/index.php/Handling_File_System_Errors

Kevin

On Feb 21, 2012, at 10:43 PM, VIJESH EK wrote:
> Dear Kevin,
>
> I have attached /var/log/messages herewith. Kindly go through the logs
> and give me a solution for this immediately.
>
> Can you tell me how to run e2fsck on an OST? Please give the exact
> command, with switches, to run e2fsck without affecting the data.
>
> We are waiting for your reply.
>
> Thanks & Regards
> VIJESH E K
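On the request for a check that does not touch the data: e2fsck has an -n option that opens the file system read-only and answers "no" to every question, and -f forces a check even when the file system is marked clean. A minimal sketch, assuming the device names from Kevin's message and that the OSTs are already unmounted; it only builds and prints the command lines rather than running them, and the helper name is hypothetical:

```python
def e2fsck_dry_run(device):
    """Build an e2fsck command line that inspects `device` without changing it.

    -f forces a check even if the file system is marked clean;
    -n opens it read-only and answers "no" to all questions.
    """
    return ["e2fsck", "-f", "-n", device]

# Device names taken from Kevin's message; adjust to the actual OST devices.
for dev in ("/dev/dm-0", "/dev/dm-1"):
    print(" ".join(e2fsck_dry_run(dev)))
```

A repair pass would drop -n; what it changes depends on how damaged the OSTs are, which, as Kevin says, cannot be known in advance.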
Dear all,

While testing Lustre on an OST, we get the following error:

Feb 6 23:25:09 localhost kernel: LustreError: 28483:0:(osc_request.c:716:osc_announce_cached()) dirty 33673216 > dirty_max 33554432

I was wondering whether this problem is caused by too many dirty pages in memory, i.e. whether the pdflush thread had not written back the dirty pages in time.

Could you give me some advice about this issue? Many thanks.

Best Regards
feng
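For scale, the two numbers in that message can be worked out directly: dirty_max 33554432 bytes is exactly 32 MiB, and the reported dirty value exceeds it by 118784 bytes. A minimal sketch of the arithmetic, assuming 4 KiB pages (an assumption; the page size is not stated in the message):

```python
PAGE_SIZE = 4096  # assumed page size in bytes, not given in the log

dirty = 33673216      # "dirty" value from the log line
dirty_max = 33554432  # "dirty_max" value from the log line

excess = dirty - dirty_max

print("dirty_max in MiB:", dirty_max / 2**20)
print("excess in bytes:", excess)
print("excess in pages:", excess / PAGE_SIZE)
```

Under that assumption the overshoot is a whole number of pages, small relative to the limit itself.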