Hi, Mitch

I think you can configure ftrace to trace only the function calls of btrfs.ko, which will save a lot of trace buffer space. See the command below:

# echo ':mod:btrfs' > /sys/kernel/debug/tracing/set_ftrace_filter

And please send out the full ftrace log again.

Another helpful piece of information would be the strace log of the wmldbcreate process. It will show us the I/O pattern of this command.

Thanks a lot for your help!
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

2011/3/1 Xin Zhong <thierryzhong@hotmail.com>:
> Hi, Mitch
> I think you can config ftrace to just trace function calls of btrfs.ko
> which will save a lot of trace buffer space. See below command:
> # echo ':mod:btrfs' > /sys/kernel/debug/tracing/set_ftrace_filter
> And please send out the full ftrace log again.
>
> Another helpful information might be the strace log of the wmldbcreate
> process. It will show us the io pattern of this command.

I manually ran an strace around the build command (wmldbcreate) that is causing my problem, and I am attaching the strace for that.

Please note that wmldbcreate does not seem to care when an error is returned, and continues on. So the error occurs somewhat silently in the middle, and isn't the last item. The error is probably associated with one of the 12288-byte writes.

I have re-run an ftrace following the conditions above, and have hosted that file (~1.1MB compressed) on my local server at:

http://dontpanic.dyndns.org/trace-openmotif-btrfs-v15.gz

Please note I am still using some debugging modifications of my own to file.c. They serve the purpose of:

(1) Avoiding an infinite loop by identifying when the problem is occurring, and exiting with an error after 256 loops.
(2) Stopping the trace after exiting to keep from flooding the ftrace buffer.
(3) Providing debugging comments (all prefaced with "TPK:" in the trace).

Let me know if you want me to change any of the conditions.

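As an aside on the point that wmldbcreate carries on after a failed write: a caller that checks every write() result would report the failure where it happens instead of continuing silently. The following is only a minimal sketch in C, not code from wmldbcreate or Openmotif. The name write_all is hypothetical, and treating a zero-byte return as EIO is an assumption made here so that the loop cannot spin.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper (not taken from wmldbcreate): write exactly len
 * bytes to fd, retrying after short writes and EINTR.
 * Returns 0 on success, -1 on failure with errno set.
 */
int write_all(int fd, const void *buf, size_t len)
{
	const char *p = buf;

	assert(buf != NULL || len == 0);

	while (len > 0) {
		ssize_t n = write(fd, p, len);

		if (n < 0) {
			if (errno == EINTR)
				continue;	/* interrupted, nothing written: retry */
			return -1;		/* real error: report it to the caller */
		}
		if (n == 0) {
			errno = EIO;		/* assumption: no progress counts as an error */
			return -1;
		}
		p += n;				/* short write: advance and keep going */
		len -= (size_t)n;
	}
	return 0;
}
```

A build tool using something like this would stop, or at least log, at the first failing 12288-byte write rather than producing a silently damaged output file.
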
I downloaded openmotif, ran the command Mitch mentioned, and was able to recreate the problem locally. I then reduced the command to a very simple program that reproduces the problem easily. See the code below:

    #include <sys/types.h>
    #include <sys/stat.h>
    #include <fcntl.h>
    #include <stdlib.h>    /* exit() */
    #include <unistd.h>    /* write() */

    static char a[4096*3];

    int main()
    {
        int fd = open("out", O_WRONLY|O_CREAT|O_TRUNC, 0666);
        /* source address is offset by one byte; length is two pages */
        write(fd, a+1, 4096*2);
        exit(0);
    }

It seems that if we give an unaligned address to a btrfs write and the buffer resides on more than 2 pages, it will trigger this bug. If we give an aligned address to a btrfs write, it works well no matter how many pages are given.

I used ftrace to observe it. It seems iov_iter_fault_in_readable does not trigger page fault handling when the address is not aligned. I do not quite understand the reason behind it. But the solution should be to process the pages one by one, which is also what the generic file write routine does.

Any suggestions are welcome. Thanks!

-----Original Message-----
From: linux-btrfs-owner@vger.kernel.org [mailto:linux-btrfs-owner@vger.kernel.org] On Behalf Of Mitch Harder
Sent: Wednesday, March 02, 2011 5:09 AM
To: Xin Zhong
Cc: linux-btrfs@vger.kernel.org
Subject: Re: [PATCH] btrfs file write debugging patch

[...]

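To make the alignment observation in the message above concrete: how many pages a user buffer touches depends on its start address as well as its length. Below is a minimal sketch, assuming 4096-byte pages as in the test program. The names pages_spanned and PAGE_SZ are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SZ 4096u	/* assumed page size, matching the 4096 in the test program */

/* Number of pages touched by the byte range [addr, addr + len). */
size_t pages_spanned(uintptr_t addr, size_t len)
{
	assert(len > 0);

	uintptr_t first = addr / PAGE_SZ;		/* page holding the first byte */
	uintptr_t last = (addr + len - 1) / PAGE_SZ;	/* page holding the last byte */

	return (size_t)(last - first + 1);
}
```

With a page-aligned start, 8192 bytes cover exactly two pages. Shifting the start by a single byte makes the same 8192 bytes touch three pages, which is the situation the test program sets up when `a` happens to start on a page boundary.
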
Sorry, I forgot to mention that you need to revert the commit below in btrfs-unstable to recreate the problem:

Btrfs: fix fiemap bugs with delalloc (+224/-42)

Otherwise, it will run into an ENOSPC error. I am not sure if it's the same problem.

----------------------------------------
> From: xin.zhong@intel.com
> To: mitch.harder@sabayonlinux.org; thierryzhong@hotmail.com
> CC: linux-btrfs@vger.kernel.org
> Date: Wed, 2 Mar 2011 18:58:49 +0800
> Subject: RE: [PATCH] btrfs file write debugging patch
>
> I downloaded openmotif and run the command as Mitch mentioned and was
> able to recreate the problem locally.
> [...]

Excerpts from Zhong, Xin's message of 2011-03-02 05:58:49 -0500:
> [...]
> It seems that if we give an unaligned address to btrfs write and the
> buffer reside on more than 2 pages. It will trigger this bug.
> If we give an aligned address to btrfs write, it works well no matter
> how many pages are given.
>
> I use ftrace to observe it. It seems iov_iter_fault_in_readable do not
> trigger pagefault handling when the address is not aligned. I do not
> quite understand the reason behind it. But the solution should be to
> process the page one by one. And that's also what generic file write
> routine does.
>
> Any suggestion are welcomed. Thanks!

Great job guys. I'm using this on top of my debugging patch. It passes the unaligned test but I'll give it a real run tonight and look for other problems.

(This is almost entirely untested, please don't use it quite yet)

-chris

diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index 89a6a26..6a44add 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -1039,6 +1038,14 @@ static ssize_t btrfs_file_aio_write(struct kiocb *iocb,
 
 		copied = btrfs_copy_from_user(pos, num_pages,
 					   write_bytes, pages, &i);
+
+		/*
+		 * if we have trouble faulting in the pages, fall
+		 * back to one page at a time
+		 */
+		if (copied < write_bytes)
+			nrptrs = 1;
+
 		if (copied == 0)
 			dirty_pages = 0;
 		else

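The fallback idea in the patch above can be illustrated in user space. What follows is a sketch under stated assumptions, not the kernel code. The names copy_in_batches, copy_fn, copy_single_page_only, PAGE_SZ and max_stalls are all hypothetical. The toy copy routine simply refuses any request that crosses a page boundary, keyed on the file position, which is a stand-in for a short copy and not a model of the real fault behavior. The stall limit is an addition of this sketch so that it cannot loop forever.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SZ 4096u	/* assumed page size */

/* Stand-in for a copy routine that may copy fewer bytes than requested. */
typedef size_t (*copy_fn)(size_t pos, size_t want);

/*
 * Copy `total` bytes in batches of up to `nrptrs` pages. After any short
 * copy, drop to one page per batch, mirroring the `nrptrs = 1` fallback.
 * Returns the number of bytes copied. Gives up after `max_stalls`
 * consecutive zero-byte copies.
 */
size_t copy_in_batches(size_t total, size_t nrptrs, copy_fn copy,
		       unsigned max_stalls)
{
	size_t pos = 0;
	unsigned stalls = 0;

	assert(nrptrs > 0);

	while (pos < total && stalls < max_stalls) {
		size_t want = total - pos;
		/* a batch ends at the end of the nrptrs-th page */
		size_t limit = nrptrs * PAGE_SZ - pos % PAGE_SZ;

		if (want > limit)
			want = limit;

		size_t copied = copy(pos, want);

		if (copied < want)
			nrptrs = 1;	/* trouble: fall back to one page at a time */

		stalls = (copied == 0) ? stalls + 1 : 0;
		pos += copied;
	}
	return pos;
}

/* Toy copy routine: refuses any request that crosses a page boundary. */
size_t copy_single_page_only(size_t pos, size_t want)
{
	size_t room = PAGE_SZ - pos % PAGE_SZ;

	return want <= room ? want : 0;
}
```

Calling copy_in_batches(8192, 8, copy_single_page_only, 4) first attempts the whole 8192 bytes, gets nothing back, drops to single-page batches, and then completes in two 4096-byte steps. Without the fallback the same request would be retried unchanged, which is the kind of loop the stall limit guards against here.
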
On Thu, Mar 03, 2011 at 08:51:55PM -0500, Chris Mason wrote:
> [...]
> +		/*
> +		 * if we have trouble faulting in the pages, fall
> +		 * back to one page at a time
> +		 */
> +		if (copied < write_bytes)
> +			nrptrs = 1;
> +
>  		if (copied == 0)
>  			dirty_pages = 0;
>  		else

Btw this situation is taken care of in my write path rewrite patch: if copied == 0 we switch to one segment at a time.

Thanks,

Josef

On Fri, Mar 04, 2011 at 10:42:46AM +0800, Zhong, Xin wrote:
> Where can I find your patch? Thanks!

It's in my btrfs-work git tree. It's based on the latest git pull from Linus, so you can just pull it onto a Linus tree and you should be good to go. The specific patch is

Btrfs: simplify our write path

Thanks,

Josef

Where can I find your patch? Thanks!

-----Original Message-----
From: Josef Bacik [mailto:josef@redhat.com]
Sent: Friday, March 04, 2011 10:32 AM
To: Chris Mason
Cc: Zhong, Xin; Mitch Harder; Xin Zhong; linux-btrfs@vger.kernel.org
Subject: Re: [PATCH] btrfs file write debugging patch

[...]
Btw this situation is taken care of in my write path rewrite patch, if copied =0 we switch to one segment at a time.

Thanks,

Josef

Looks good to me. Thanks!

Another mysterious thing is that this problem can only be recreated on a 32-bit x86 system. I cannot recreate it on an x86_64 system using my test case. Does anyone have any idea why?

-----Original Message-----
From: Josef Bacik [mailto:josef@redhat.com]
Sent: Friday, March 04, 2011 10:41 AM
To: Zhong, Xin
Cc: Josef Bacik; Chris Mason; Mitch Harder; Xin Zhong; linux-btrfs@vger.kernel.org
Subject: Re: [PATCH] btrfs file write debugging patch

[...]
The specific patch is

Btrfs: simplify our write path

Excerpts from Chris Mason's message of 2011-03-03 20:51:55 -0500:
> [...]
> +		/*
> +		 * if we have trouble faulting in the pages, fall
> +		 * back to one page at a time
> +		 */
> +		if (copied < write_bytes)
> +			nrptrs = 1;
> +
>  		if (copied == 0)
>  			dirty_pages = 0;
>  		else

Ok, this is working well for me. Anyone see any problems with it?

It works well for me too.

----------------------------------------
> From: chris.mason@oracle.com
> To: chris.mason@oracle.com
> CC: xin.zhong@intel.com; mitch.harder@sabayonlinux.org; thierryzhong@hotmail.com; linux-btrfs@vger.kernel.org
> Subject: RE: [PATCH] btrfs file write debugging patch
> Date: Fri, 4 Mar 2011 07:19:39 -0500
>
> [...]
> Ok, this is working well for me. Anyone see any problems with it?

2011/3/4 Xin Zhong <thierryzhong@hotmail.com>:
> It works well for me too.
> [...]

I've applied this patch on top of the debugging patch at the head of the thread, and I'm having trouble building gcc now.

When building gcc-4.4.5, I get errors like the following:

Comparing stages 2 and 3
Bootstrap comparison failure!
./cp/call.o differs
./cp/decl.o differs
./cp/pt.o differs
./cp/class.o differs
./cp/decl2.o differs
<....snip.....>
./matrix-reorg.o differs
./tree-inline.o differs
./gcc.o differs
./gcc-options.o differs
make[2]: *** [compare] Error 1
make[1]: *** [stage3-bubble] Error 2
make: *** [bootstrap-lean] Error 2
emake failed

I went back and rebuilt my kernel without these two debugging patches, and gcc-4.4.5 builds without error on that kernel.

I haven't yet tested building gcc-4.4.5 with just the debugging patch at the head of the thread, so I'll test that and report back.

But I was wondering if anybody else can replicate this issue.

BTW, I've been doing most of my testing on an x86 system. My x86_64 systems haven't had as much trouble, but I haven't been rigorously checking my x86_64 systems for these issues.

I noticed that page fault handling differs by architecture.

On Fri, Mar 4, 2011 at 9:33 AM, Mitch Harder <mitch.harder@sabayonlinux.org> wrote:
> [...]
> I haven't yet tested building gcc-4.4.5 with just the debugging patch
> at the head of the thread, so I'll test that, and report back.
> [...]

Some followup...

I'm encountering this issue with "Bootstrap comparison failure!" in a gcc-4.4.5 build when only the patch at the head of the thread is applied (leaving out the recent patch that limits pages to one-by-one on page fault).

I just hadn't run across this issue until I started playing with patches to limit the pages to one-by-one on page fault errors.

So it may not be associated with the last patch.

So it works well now with the two patches from Chris on your system. Am I right? ----------------------------------------> Date: Fri, 4 Mar 2011 11:21:36 -0600 > Subject: Re: [PATCH] btrfs file write debugging patch > From: mitch.harder@sabayonlinux.org > To: thierryzhong@hotmail.com > CC: chris.mason@oracle.com; xin.zhong@intel.com; linux-btrfs@vger.kernel.org > > On Fri, Mar 4, 2011 at 9:33 AM, Mitch Harder > wrote: > > 2011/3/4 Xin Zhong : > >> > >> It works well for me too. > >> > >> ---------------------------------------- > >>> From: chris.mason@oracle.com > >>> To: chris.mason@oracle.com > >>> CC: xin.zhong@intel.com; mitch.harder@sabayonlinux.org; thierryzhong@hotmail.com; linux-btrfs@vger.kernel.org > >>> Subject: RE: [PATCH] btrfs file write debugging patch > >>> Date: Fri, 4 Mar 2011 07:19:39 -0500 > >>> > >>> Excerpts from Chris Mason''s message of 2011-03-03 20:51:55 -0500: > >>> > Excerpts from Zhong, Xin''s message of 2011-03-02 05:58:49 -0500: > >>> > > It seems that if we give an unaligned address to btrfs write and the buffer reside on more than 2 pages. It will trigger this bug. > >>> > > If we give an aligned address to btrfs write, it works well no matter how many pages are given. > >>> > > > >>> > > I use ftrace to observe it. It seems iov_iter_fault_in_readable do not trigger pagefault handling when the address is not aligned. I do not quite understand the reason behind it. But the solution should be to process the page one by one. And that''s also what generic file write routine does. > >>> > > > >>> > > Any suggestion are welcomed. Thanks! > >>> > > >>> > Great job guys. I''m using this on top of my debugging patch. It passes > >>> > the unaligned test but I''ll give it a real run tonight and look for > >>> > other problems. 
> >>> > > >>> > (This is almost entirely untested, please don''t use it quite yet) > >>> > >>> > > >>> > -chris > >>> > > >>> > diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c > >>> > index 89a6a26..6a44add 100644 > >>> > --- a/fs/btrfs/file.c > >>> > +++ b/fs/btrfs/file.c > >>> > @@ -1039,6 +1038,14 @@ static ssize_t btrfs_file_aio_write(struct kiocb *iocb, > >>> > > >>> > copied = btrfs_copy_from_user(pos, num_pages, > >>> > write_bytes, pages, &i); > >>> > + > >>> > + /* > >>> > + * if we have trouble faulting in the pages, fall > >>> > + * back to one page at a time > >>> > + */ > >>> > + if (copied < write_bytes) > >>> > + nrptrs = 1; > >>> > + > >>> > if (copied == 0) > >>> > dirty_pages = 0; > >>> > else > >>> > >>> Ok, this is working well for me. Anyone see any problems with it? > >>> -- > >>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in > >>> the body of a message to majordomo@vger.kernel.org > >>> More majordomo info at http://vger.kernel.org/majordomo-info.html > >> > >> > > > > I''ve applied this patch on top of the debugging patch at the head of > > the thread, and I''m having trouble building gcc now. > > > > When building gcc-4.4.5, I get errors like the following: > > > > Comparing stages 2 and 3 > > Bootstrap comparison failure! > > ./cp/call.o differs > > ./cp/decl.o differs > > ./cp/pt.o differs > > ./cp/class.o differs > > ./cp/decl2.o differs > > <....snip.....> > > ./matrix-reorg.o differs > > ./tree-inline.o differs > > ./gcc.o differs > > ./gcc-options.o differs > > make[2]: *** [compare] Error 1 > > make[1]: *** [stage3-bubble] Error 2 > > make: *** [bootstrap-lean] Error 2 > > emake failed > > > > I''ve went back and rebuilt my kernel without these two debugging > > patches, and gcc-4.4.5 builds without error on that kernel. > > > > I haven''t yet tested building gcc-4.4.5 with just the debugging patch > > at the head of the thread, so I''ll test that, and report back. 
> >
> > But I was wondering if anybody else can replicate this issue.
> >
> > BTW, I've been doing most of my testing on an x86 system. My x86_64
> > systems haven't had as much trouble, but I haven't been robustingly
> > checking my x86_64 systems for these issues.
> >
> > I noticed that page fault handling is different by architecture.
> >
>
> Some followup...
>
> I'm encountering this issue with "Bootstrap comparison failure!" in a
> gcc-4.4.5 build when only the patch at the head of the thread is
> applied (leaving the recent patch to limit pages to one-by-one on page
> fault out).
>
> I just hadn't run across this issue until I started playing with
> patches to limit the pages to one-by-one on page fault errors.
>
> So it may not be associated with the last patch.
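For reference, the write pattern described in the quoted report (a write whose user buffer starts at an unaligned address and spans more than two pages) can be sketched in userspace C. This is only an illustration: `write_unaligned` is a hypothetical helper, not code from wmldbcreate or from this thread, and because the buffer is filled with memset() before the write, its pages are already resident, so it may well not trigger the page-fault problem at all. It just shows the shape of the I/O.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the thread): write `len` bytes to `path`
 * from a user buffer that starts `misalign` bytes past a page boundary.
 * Returns the number of bytes written, or -1 on error.
 */
static ssize_t write_unaligned(const char *path, size_t len, size_t misalign)
{
	long page = sysconf(_SC_PAGESIZE);
	void *base = NULL;
	ssize_t total = 0;
	int fd;

	assert(path != NULL);
	if (page <= 0 || misalign >= (size_t)page)
		return -1;

	/* Page-aligned allocation, so base + misalign is deliberately unaligned. */
	if (posix_memalign(&base, (size_t)page, len + (size_t)page) != 0)
		return -1;
	memset(base, 0xab, len + (size_t)page);

	fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
	if (fd < 0) {
		free(base);
		return -1;
	}

	/* write() may be short, so keep going until everything is out. */
	while ((size_t)total < len) {
		ssize_t n = write(fd, (char *)base + misalign + total,
				  len - (size_t)total);
		if (n <= 0) {
			total = -1;
			break;
		}
		total += n;
	}

	close(fd);
	free(base);
	return total;
}
```

With 4 KiB pages, `write_unaligned(path, 12288, 1)` hands the kernel a 12288-byte buffer that touches four user pages, which matches the "unaligned, more than 2 pages" condition described above.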
2011/3/4 Xin Zhong <thierryzhong@hotmail.com>:
>
> So it works well now with the two patches from Chris on your system. Am I right?
>

No. I am getting errors building gcc-4.4.5 with the two patches from Chris.
I've constructed a test patch that is currently addressing all the
issues on my system.

The portion of Openmotif that was having issues with page faults works
correctly with this patch, and gcc-4.4.5 builds without issue.

I extracted only the portion of the first patch that corrects the
handling of dirty_pages when copied==0, and incorporated the second
patch that falls back to one-page-at-a-time if there are troubles with
page faults.

--- fs/btrfs/file.c	2011-03-05 07:34:43.025131607 -0600
+++ /usr/src/linux/fs/btrfs/file.c	2011-03-05 07:41:45.001260294 -0600
@@ -1023,8 +1023,20 @@

 		copied = btrfs_copy_from_user(pos, num_pages,
 					   write_bytes, pages, &i);
-		dirty_pages = (copied + offset + PAGE_CACHE_SIZE - 1) >>
-				PAGE_CACHE_SHIFT;
+
+		/*
+		 * if we have trouble faulting in the pages, fall
+		 * back to one page at a time
+		 */
+		if (copied < write_bytes)
+			nrptrs = 1;
+
+		if (copied == 0)
+			dirty_pages = 0;
+		else
+			dirty_pages = (copied + offset +
+				       PAGE_CACHE_SIZE - 1) >>
+				       PAGE_CACHE_SHIFT;

 		if (num_pages > dirty_pages) {
 			if (copied > 0)
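To see why the copied==0 case needs special handling, the arithmetic can be checked in userspace. The sketch below assumes 4 KiB pages, and the names (`dirty_pages_old`, `dirty_pages_new`, the `SKETCH_*` macros) are made up for illustration; only the two formulas are taken from the patch above.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values for illustration: 4 KiB pages, as on x86. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  ((size_t)1 << SKETCH_PAGE_SHIFT)

/* Original calculation: rounds (copied + offset) up to whole pages. */
static size_t dirty_pages_old(size_t copied, size_t offset)
{
	/* offset is the position within the first page */
	assert(offset < SKETCH_PAGE_SIZE);
	return (copied + offset + SKETCH_PAGE_SIZE - 1) >> SKETCH_PAGE_SHIFT;
}

/* Patched calculation: nothing copied means nothing dirtied. */
static size_t dirty_pages_new(size_t copied, size_t offset)
{
	if (copied == 0)
		return 0;
	return dirty_pages_old(copied, offset);
}
```

With a non-zero in-page offset and copied == 0, the old formula still rounds up to one page, so a page would be treated as dirty even though no data reached it. For example, `dirty_pages_old(0, 100)` gives 1 while `dirty_pages_new(0, 100)` gives 0; for a full 12288-byte copy at offset 100 both give 4.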
On Thu, Mar 3, 2011 at 8:41 PM, Josef Bacik <josef@redhat.com> wrote:
> On Fri, Mar 04, 2011 at 10:42:46AM +0800, Zhong, Xin wrote:
>> Where can I find your patch? Thanks!
>>
>
> It's in my btrfs-work git tree, it's based on the latest git pull from linus so
> you can just pull it onto a linus tree and you should be good to go. The
> specific patch is
>
> Btrfs: simplify our write path
>
> Thanks,
>
> Josef
>

Josef:

I've been testing the kernel from your git tree (as of commit 5555f192
Btrfs: add a comment explaining what btrfs_cont_expand does).

I am still running into issues when running the portion of Openmotif
that has been generating the problem with page faults when given an
unaligned address.
I think Josef's patch only addresses the one-by-one page processing issue. It does not address the issue that dirty_pages should be set to 0 when copied is 0.

----------------------------------------
> Date: Sat, 5 Mar 2011 10:56:52 -0600
> Subject: Re: [PATCH] btrfs file write debugging patch
> From: mitch.harder@sabayonlinux.org
> To: josef@redhat.com
> CC: xin.zhong@intel.com; chris.mason@oracle.com; thierryzhong@hotmail.com; linux-btrfs@vger.kernel.org
>
> On Thu, Mar 3, 2011 at 8:41 PM, Josef Bacik wrote:
> > On Fri, Mar 04, 2011 at 10:42:46AM +0800, Zhong, Xin wrote:
> >> Where can I find your patch? Thanks!
> >>
> >
> > It's in my btrfs-work git tree, it's based on the latest git pull from linus so
> > you can just pull it onto a linus tree and you should be good to go. The
> > specific patch is
> >
> > Btrfs: simplify our write path
> >
> > Thanks,
> >
> > Josef
> >
>
> Josef:
>
> I've been testing the kernel from you git tree (as of commit 5555f192
> Btrfs: add a comment explaining what btrfs_cont_expand does).
>
> I am still running into issues when running the portion of Openmotif
> that has been generating the problem with page faults when given an
> unaligned address.
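The one-by-one fallback mentioned here can be sketched as a small userspace loop. Everything in this block is hypothetical scaffolding (the `sketch_copy_fn` callback, the toy `copy_single_page_only` function, the 4 KiB page size and the 256-iteration stall guard); the only parts modelled on the patches in this thread are "if copied < write_bytes then nrptrs = 1" and "a zero-byte copy makes no progress and is retried".

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE  4096u	/* assumed page size */
#define SKETCH_MAX_STALLS 256u	/* arbitrary guard against looping forever */

/* Hypothetical stand-in for the user copy: returns bytes actually copied. */
typedef size_t (*sketch_copy_fn)(size_t pos, size_t want);

/* Toy copy that only succeeds when the request stays within one page. */
static size_t copy_single_page_only(size_t pos, size_t want)
{
	if (want == 0)
		return 0;
	if (pos / SKETCH_PAGE_SIZE != (pos + want - 1) / SKETCH_PAGE_SIZE)
		return 0;
	return want;
}

/* Returns the total number of bytes the copy callback accepted. */
static size_t write_with_fallback(size_t pos, size_t count, size_t nrptrs,
				  sketch_copy_fn copy)
{
	size_t total = 0;
	unsigned int stalls = 0;

	assert(nrptrs >= 1);
	while (count > 0) {
		size_t offset = pos & (SKETCH_PAGE_SIZE - 1);
		size_t write_bytes = nrptrs * SKETCH_PAGE_SIZE - offset;
		size_t copied;

		if (write_bytes > count)
			write_bytes = count;

		copied = copy(pos, write_bytes);

		/* Short copy: fall back to one page at a time. */
		if (copied < write_bytes)
			nrptrs = 1;

		/* No progress: retry, but give up after too many stalls. */
		if (copied == 0) {
			if (++stalls >= SKETCH_MAX_STALLS)
				break;
			continue;
		}

		stalls = 0;
		pos += copied;
		count -= copied;
		total += copied;
	}
	return total;
}
```

Starting at pos 100 with 12288 bytes and nrptrs of 8, the first multi-page attempt copies nothing, nrptrs drops to 1, and the remaining data goes through in four single-page steps (3996, 4096, 4096 and 100 bytes). Without the fallback, the same multi-page request would be retried until the stall guard trips.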
Excerpts from Mitch Harder's message of 2011-03-05 11:50:14 -0500:
> I've constructed a test patch that is currently addressing all the
> issues on my system.
>
> The portion of Openmotif that was having issues with page faults works
> correctly with this patch, and gcc-4.4.5 builds without issue.
>
> I extracted only the portion of the first patch that corrects the
> handling of dirty_pages when copied==0, and incorporated the second
> patch that falls back to one-page-at-a-time if there are troubles with
> page faults.

Just to make sure I understand, could you please post the full combined
path that was giving you trouble with gcc? We do need to make sure the
pages are properly up to date if we fall back to partial writes.

-chris
Excerpts from Chris Mason's message of 2011-03-06 13:00:27 -0500:
> Excerpts from Mitch Harder's message of 2011-03-05 11:50:14 -0500:
> > I've constructed a test patch that is currently addressing all the
> > issues on my system.
> >
> > The portion of Openmotif that was having issues with page faults works
> > correctly with this patch, and gcc-4.4.5 builds without issue.
> >
> > I extracted only the portion of the first patch that corrects the
> > handling of dirty_pages when copied==0, and incorporated the second
> > patch that falls back to one-page-at-a-time if there are troubles with
> > page faults.
>
> Just to make sure I understand, could you please post the full combined
> path that was giving you trouble with gcc? We do need to make sure the
> pages are properly up to date if we fall back to partial writes.

Ok, I was able to reproduce this easily with fsx. The problem is that I
wasn't making sure the last partial page in the write was up to date
when it was also the first page in the write.

Here is the updated patch, it has all the fixes we've found so far:

diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index 7084140..5986ac7 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -763,6 +763,27 @@ out:
 }

 /*
+ * on error we return an unlocked page and the error value
+ * on success we return a locked page and 0
+ */
+static int prepare_uptodate_page(struct page *page, u64 pos)
+{
+	int ret = 0;
+
+	if ((pos & (PAGE_CACHE_SIZE - 1)) && !PageUptodate(page)) {
+		ret = btrfs_readpage(NULL, page);
+		if (ret)
+			return ret;
+		lock_page(page);
+		if (!PageUptodate(page)) {
+			unlock_page(page);
+			return -EIO;
+		}
+	}
+	return 0;
+}
+
+/*
  * this gets pages into the page cache and locks them down, it also properly
  * waits for data=ordered extents to finish before allowing the pages to be
  * modified.
@@ -777,6 +798,7 @@ static noinline int prepare_pages(struct btrfs_root *root, struct file *file,
 	unsigned long index = pos >> PAGE_CACHE_SHIFT;
 	struct inode *inode = fdentry(file)->d_inode;
 	int err = 0;
+	int faili = 0;
 	u64 start_pos;
 	u64 last_pos;

@@ -794,15 +816,24 @@ again:
 	for (i = 0; i < num_pages; i++) {
 		pages[i] = grab_cache_page(inode->i_mapping, index + i);
 		if (!pages[i]) {
-			int c;
-			for (c = i - 1; c >= 0; c--) {
-				unlock_page(pages[c]);
-				page_cache_release(pages[c]);
-			}
-			return -ENOMEM;
+			faili = i - 1;
+			err = -ENOMEM;
+			goto fail;
+		}
+
+		if (i == 0)
+			err = prepare_uptodate_page(pages[i], pos);
+		if (i == num_pages - 1)
+			err = prepare_uptodate_page(pages[i],
+						    pos + write_bytes);
+		if (err) {
+			page_cache_release(pages[i]);
+			faili = i - 1;
+			goto fail;
 		}
 		wait_on_page_writeback(pages[i]);
 	}
+	err = 0;
 	if (start_pos < inode->i_size) {
 		struct btrfs_ordered_extent *ordered;
 		lock_extent_bits(&BTRFS_I(inode)->io_tree,
@@ -842,6 +873,14 @@ again:
 		WARN_ON(!PageLocked(pages[i]));
 	}
 	return 0;
+fail:
+	while (faili >= 0) {
+		unlock_page(pages[faili]);
+		page_cache_release(pages[faili]);
+		faili--;
+	}
+	return err;
+
 }

 static ssize_t btrfs_file_aio_write(struct kiocb *iocb,
@@ -851,7 +890,6 @@ static ssize_t btrfs_file_aio_write(struct kiocb *iocb,
 	struct file *file = iocb->ki_filp;
 	struct inode *inode = fdentry(file)->d_inode;
 	struct btrfs_root *root = BTRFS_I(inode)->root;
-	struct page *pinned[2];
 	struct page **pages = NULL;
 	struct iov_iter i;
 	loff_t *ppos = &iocb->ki_pos;
@@ -872,9 +910,6 @@ static ssize_t btrfs_file_aio_write(struct kiocb *iocb,
 	will_write = ((file->f_flags & O_DSYNC) || IS_SYNC(inode) ||
 		      (file->f_flags & O_DIRECT));

-	pinned[0] = NULL;
-	pinned[1] = NULL;
-
 	start_pos = pos;

 	vfs_check_frozen(inode->i_sb, SB_FREEZE_WRITE);
@@ -962,32 +997,6 @@ static ssize_t btrfs_file_aio_write(struct kiocb *iocb,
 	first_index = pos >> PAGE_CACHE_SHIFT;
 	last_index = (pos + iov_iter_count(&i)) >> PAGE_CACHE_SHIFT;

-	/*
-	 * there are lots of better ways to do this, but this code
-	 * makes sure the first and last page in the file range are
-	 * up to date and ready for cow
-	 */
-	if ((pos & (PAGE_CACHE_SIZE - 1))) {
-		pinned[0] = grab_cache_page(inode->i_mapping, first_index);
-		if (!PageUptodate(pinned[0])) {
-			ret = btrfs_readpage(NULL, pinned[0]);
-			BUG_ON(ret);
-			wait_on_page_locked(pinned[0]);
-		} else {
-			unlock_page(pinned[0]);
-		}
-	}
-	if ((pos + iov_iter_count(&i)) & (PAGE_CACHE_SIZE - 1)) {
-		pinned[1] = grab_cache_page(inode->i_mapping, last_index);
-		if (!PageUptodate(pinned[1])) {
-			ret = btrfs_readpage(NULL, pinned[1]);
-			BUG_ON(ret);
-			wait_on_page_locked(pinned[1]);
-		} else {
-			unlock_page(pinned[1]);
-		}
-	}
-
 	while (iov_iter_count(&i) > 0) {
 		size_t offset = pos & (PAGE_CACHE_SIZE - 1);
 		size_t write_bytes = min(iov_iter_count(&i),
@@ -1024,8 +1033,20 @@ static ssize_t btrfs_file_aio_write(struct kiocb *iocb,

 		copied = btrfs_copy_from_user(pos, num_pages,
 					   write_bytes, pages, &i);
-		dirty_pages = (copied + offset + PAGE_CACHE_SIZE - 1) >>
-				PAGE_CACHE_SHIFT;
+
+		/*
+		 * if we have trouble faulting in the pages, fall
+		 * back to one page at a time
+		 */
+		if (copied < write_bytes)
+			nrptrs = 1;
+
+		if (copied == 0)
+			dirty_pages = 0;
+		else
+			dirty_pages = (copied + offset +
+				       PAGE_CACHE_SIZE - 1) >>
+				       PAGE_CACHE_SHIFT;

 		if (num_pages > dirty_pages) {
 			if (copied > 0)
@@ -1069,10 +1090,6 @@ out:
 		err = ret;

 	kfree(pages);
-	if (pinned[0])
-		page_cache_release(pinned[0]);
-	if (pinned[1])
-		page_cache_release(pinned[1]);
 	*ppos = pos;

 	/*
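The alignment test that prepare_uptodate_page() applies to the first and last page of each chunk can be tried out in userspace. The sketch below assumes 4 KiB pages and uses made-up names (`struct boundary_reads`, `boundary_reads_needed`); it models only the `pos & (PAGE_CACHE_SIZE - 1)` checks from the patch and ignores the PageUptodate() state, which in the kernel also has to be false before a read is issued.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size */

/* Which boundary pages of a write would need to be read in first. */
struct boundary_reads {
	bool first_page;	/* write starts in the middle of a page */
	bool last_page;		/* write ends in the middle of a page */
};

static struct boundary_reads boundary_reads_needed(uint64_t pos,
						   size_t write_bytes)
{
	struct boundary_reads r;

	assert(write_bytes > 0);
	r.first_page = (pos & (SKETCH_PAGE_SIZE - 1)) != 0;
	r.last_page = ((pos + write_bytes) & (SKETCH_PAGE_SIZE - 1)) != 0;
	return r;
}
```

A 100-byte write at pos 0 is the case described in the message above: the write covers a single page that is both first and last, the start is aligned, but the end is not, so the page still needs to be up to date before the partial write. Here `boundary_reads_needed(0, 100)` reports `first_page == false` and `last_page == true`.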
On Sun, Mar 6, 2011 at 6:58 PM, Chris Mason <chris.mason@oracle.com> wrote:
> Excerpts from Chris Mason's message of 2011-03-06 13:00:27 -0500:
>> Excerpts from Mitch Harder's message of 2011-03-05 11:50:14 -0500:
>> > I've constructed a test patch that is currently addressing all the
>> > issues on my system.
>> >
>> > The portion of Openmotif that was having issues with page faults works
>> > correctly with this patch, and gcc-4.4.5 builds without issue.
>> >
>> > I extracted only the portion of the first patch that corrects the
>> > handling of dirty_pages when copied==0, and incorporated the second
>> > patch that falls back to one-page-at-a-time if there are troubles with
>> > page faults.
>>
>> Just to make sure I understand, could you please post the full combined
>> path that was giving you trouble with gcc? We do need to make sure the
>> pages are properly up to date if we fall back to partial writes.
>
> Ok, I was able to reproduce this easily with fsx. The problem is that I
> wasn't making sure the last partial page in the write was up to date
> when it was also the first page in the write.
>
> Here is the updated patch, it has all the fixes we've found so far:
>

This latest patch that Chris has sent out fixes the issues I've been
encountering.

I can build gcc-4.4.5 without problems.

Also, the portion of Openmotif that was having issues with page faults
is working correctly.

Let me know if you still would like to see the path names for the
portions of the gcc-4.4.5 build that were giving me issues. I didn't
save that information, but I can regenerate it. But it sounds like
it's irrelevant now.
That's great! :-)

-----Original Message-----
From: Mitch Harder [mailto:mitch.harder@sabayonlinux.org]
Sent: Monday, March 07, 2011 2:08 PM
To: Chris Mason
Cc: Xin Zhong; Zhong, Xin; linux-btrfs
Subject: Re: [PATCH] btrfs file write debugging patch

On Sun, Mar 6, 2011 at 6:58 PM, Chris Mason <chris.mason@oracle.com> wrote:
> Excerpts from Chris Mason's message of 2011-03-06 13:00:27 -0500:
>> Excerpts from Mitch Harder's message of 2011-03-05 11:50:14 -0500:
>> > I've constructed a test patch that is currently addressing all the
>> > issues on my system.
>> >
>> > The portion of Openmotif that was having issues with page faults works
>> > correctly with this patch, and gcc-4.4.5 builds without issue.
>> >
>> > I extracted only the portion of the first patch that corrects the
>> > handling of dirty_pages when copied==0, and incorporated the second
>> > patch that falls back to one-page-at-a-time if there are troubles with
>> > page faults.
>>
>> Just to make sure I understand, could you please post the full combined
>> path that was giving you trouble with gcc? We do need to make sure the
>> pages are properly up to date if we fall back to partial writes.
>
> Ok, I was able to reproduce this easily with fsx. The problem is that I
> wasn't making sure the last partial page in the write was up to date
> when it was also the first page in the write.
>
> Here is the updated patch, it has all the fixes we've found so far:
>

This latest patch that Chris has sent out fixes the issues I've been
encountering.

I can build gcc-4.4.5 without problems.

Also, the portion of Openmotif that was having issues with page faults
is working correctly.

Let me know if you still would like to see the path names for the
portions of the gcc-4.4.5 build that were giving me issues. I didn't
save that information, but I can regenerate it. But it sounds like
it's irrelevant now.
On Mon, 2011-03-07 at 00:07 -0600, Mitch Harder wrote:
> On Sun, Mar 6, 2011 at 6:58 PM, Chris Mason <chris.mason@oracle.com> wrote:
> > Excerpts from Chris Mason's message of 2011-03-06 13:00:27 -0500:
> >> Excerpts from Mitch Harder's message of 2011-03-05 11:50:14 -0500:
> >> > I've constructed a test patch that is currently addressing all the
> >> > issues on my system.
> >> >
> >> > The portion of Openmotif that was having issues with page faults works
> >> > correctly with this patch, and gcc-4.4.5 builds without issue.
> >> >
> >> > I extracted only the portion of the first patch that corrects the
> >> > handling of dirty_pages when copied==0, and incorporated the second
> >> > patch that falls back to one-page-at-a-time if there are troubles with
> >> > page faults.
> >>
> >> Just to make sure I understand, could you please post the full combined
> >> path that was giving you trouble with gcc? We do need to make sure the
> >> pages are properly up to date if we fall back to partial writes.
> >
> > Ok, I was able to reproduce this easily with fsx. The problem is that I
> > wasn't making sure the last partial page in the write was up to date
> > when it was also the first page in the write.
> >
> > Here is the updated patch, it has all the fixes we've found so far:
> >
>
> This latest patch that Chris has sent out fixes the issues I've been
> encountering.
>
> I can build gcc-4.4.5 without problems.
>
> Also, the portion of Openmotif that was having issues with page faults
> is working correctly.
>
> Let me know if you still would like to see the path names for the
> portions of the gcc-4.4.5 build that were giving me issues. I didn't
> save that information, but I can regenerate it. But it sounds like
> it's irrelevant now.

With the patch I can compile libgcrypt without any problem, so it solves
my problems too.
// Maria
On Monday 07 March 2011 20:56:50 Maria Wikström wrote:
> On Mon, 2011-03-07 at 00:07 -0600, Mitch Harder wrote:
> > On Sun, Mar 6, 2011 at 6:58 PM, Chris Mason <chris.mason@oracle.com> wrote:
> > > Excerpts from Chris Mason's message of 2011-03-06 13:00:27 -0500:
> > >> Excerpts from Mitch Harder's message of 2011-03-05 11:50:14 -0500:
> > >> > I've constructed a test patch that is currently addressing all the
> > >> > issues on my system.
> > >> >
> > >> > The portion of Openmotif that was having issues with page faults
> > >> > works correctly with this patch, and gcc-4.4.5 builds without
> > >> > issue.
> > >> >
> > >> > I extracted only the portion of the first patch that corrects the
> > >> > handling of dirty_pages when copied==0, and incorporated the second
> > >> > patch that falls back to one-page-at-a-time if there are troubles
> > >> > with page faults.
> > >>
> > >> Just to make sure I understand, could you please post the full
> > >> combined path that was giving you trouble with gcc? We do need to
> > >> make sure the pages are properly up to date if we fall back to
> > >> partial writes.
> > >
> > > Ok, I was able to reproduce this easily with fsx. The problem is that
> > > I wasn't making sure the last partial page in the write was up to date
> > > when it was also the first page in the write.
> > >
> > > Here is the updated patch, it has all the fixes we've found so far:
> >
> > This latest patch that Chris has sent out fixes the issues I've been
> > encountering.
> >
> > I can build gcc-4.4.5 without problems.
> >
> > Also, the portion of Openmotif that was having issues with page faults
> > is working correctly.
> >
> > Let me know if you still would like to see the path names for the
> > portions of the gcc-4.4.5 build that were giving me issues. I didn't
> > save that information, but I can regenerate it. But it sounds like
> > it's irrelevant now.
>
> With the patch I can compile libgcrypt without any problem, so it solves
> my problems to.

Can confirm this. And the bug seems to be hardware-related. On my
Pentium 4 system it was 100% reproducible; on my Atom-based system I
couldn't trigger it.
In my experience, only 32-bit x86 kernels have this problem; 64-bit kernels do not. However, Atom-based systems do not show it even though they are running a 32-bit kernel.

-----Original Message-----
From: Johannes Hirte [mailto:johannes.hirte@fem.tu-ilmenau.de]
Sent: Tuesday, March 08, 2011 6:12 AM
To: Maria Wikström
Cc: Mitch Harder; Chris Mason; Xin Zhong; Zhong, Xin; linux-btrfs
Subject: Re: [PATCH] btrfs file write debugging patch

On Monday 07 March 2011 20:56:50 Maria Wikström wrote:
> On Mon, 2011-03-07 at 00:07 -0600, Mitch Harder wrote:
> > On Sun, Mar 6, 2011 at 6:58 PM, Chris Mason <chris.mason@oracle.com> wrote:
> > > Excerpts from Chris Mason's message of 2011-03-06 13:00:27 -0500:
> > >> Excerpts from Mitch Harder's message of 2011-03-05 11:50:14 -0500:
> > >> > I've constructed a test patch that is currently addressing all the
> > >> > issues on my system.
> > >> >
> > >> > The portion of Openmotif that was having issues with page faults
> > >> > works correctly with this patch, and gcc-4.4.5 builds without
> > >> > issue.
> > >> >
> > >> > I extracted only the portion of the first patch that corrects the
> > >> > handling of dirty_pages when copied==0, and incorporated the second
> > >> > patch that falls back to one-page-at-a-time if there are troubles
> > >> > with page faults.
> > >>
> > >> Just to make sure I understand, could you please post the full
> > >> combined path that was giving you trouble with gcc? We do need to
> > >> make sure the pages are properly up to date if we fall back to
> > >> partial writes.
> > >
> > > Ok, I was able to reproduce this easily with fsx. The problem is that
> > > I wasn't making sure the last partial page in the write was up to date
> > > when it was also the first page in the write.
> > >
> > > Here is the updated patch, it has all the fixes we've found so far:
> >
> > This latest patch that Chris has sent out fixes the issues I've been
> > encountering.
> >
> > I can build gcc-4.4.5 without problems.
> >
> > Also, the portion of Openmotif that was having issues with page faults
> > is working correctly.
> >
> > Let me know if you still would like to see the path names for the
> > portions of the gcc-4.4.5 build that were giving me issues. I didn't
> > save that information, but I can regenerate it. But it sounds like
> > it's irrelevant now.
>
> With the patch I can compile libgcrypt without any problem, so it solves
> my problems to.

Can confirm this. And the bug seems to be hardware-related. On my Pentium4 system it was 100% reproducible, on my Atom-based system I couldn't trigger it.