Hi folks,

Someone on the OpenMoko community list commented recently about having created a swap file on the SD card of their OpenMoko Neo phone and said that they'd been lazy as they'd not made a swap partition.

My thought was that with an SSD aware filesystem like btrfs a swapfile would actually be a smarter move than a swap partition because it lets the filesystem try and even the wear generated by access to it which a swap partition will not have the freedom to do.

To me that makes logical sense but given the complexity of the kernel and btrfs is it a fair comment to make and, also, would that be the case with btrfs at present?

cheers!
Chris
--
Chris Samuel : http://www.csamuel.org/ : Melbourne, VIC

This email may come with a PGP signature as a file. Do not panic.
For more info see: http://en.wikipedia.org/wiki/OpenPGP
On Sat, 2009-01-17 at 11:10 +1100, Chris Samuel wrote:
> Hi folks,
>
> Someone on the OpenMoko community list commented recently about having created
> a swap file on the SD card of their OpenMoko Neo phone and said that they'd
> been lazy as they'd not made a swap partition.
>
> My thought was that with an SSD aware filesystem like btrfs a swapfile would
> actually be a smarter move than a swap partition because it lets the
> filesystem try and even the wear generated by access to it which a swap
> partition will not have the freedom to do.

It has actually been a while since I read through the swap-on-file code, but setup_swap_extents() makes me think it is making its own map of the blocks in use by the FS.

This doesn't quite play nicely with btrfs and should lead to all kinds of problems... I'm looking into how to disable swapfiles completely.

> To me that makes logical sense but given the complexity of the kernel and
> btrfs is it a fair comment to make and, also, would that be the case with
> btrfs at present?

In general, the btrfs cow will be more wear leveling friendly but this is the kind of thing that I'd expect the ssd to do for us ;)

<insert David Woodhouse's long standing debate with me about where wear leveling should live here>

-chris

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Hey Chris,

Chris Mason wrote:
> This doesn't quite play nicely with btrfs and should lead to all kinds
> of problems... I'm looking into how to disable swapfiles completely.

Please try to support swapfiles. I know their drawbacks and still use them quite often.

Cheers
Kaspar
On Tue, 2009-01-20 at 11:41 +0100, Kaspar Schleiser wrote:
> Hey Chris,
>
> Chris Mason wrote:
> > This doesn't quite play nicely with btrfs and should lead to all kinds
> > of problems... I'm looking into how to disable swapfiles completely.
> Please try to support swapfiles. I know their drawbacks and still use
> them quite often.

There are patches to support swap over NFS that might make it safe to use on btrfs. At any rate, it is a fixable problem.

But today, swapfiles on btrfs will corrupt other file data.

-chris
On Wed, Jan 21, 2009 at 12:02 AM, Chris Mason <chris.mason@oracle.com> wrote:
> There are patches to support swap over NFS that might make it safe to
> use on btrfs. At any rate, it is a fixable problem.

FreeBSD has been able to run swap over NFS for as long as I can remember, what is different in Linux that makes it especially difficult?

I've read that swap over non-trivial filesystems is hazardous as it may lead to a situation in which memory allocation can fail in the swap/FS code that was meant to make allocation possible again.

If btrfs is to take the role of a RAID and volume manager, it would certainly be very useful to be able to run swap on it, since that frees up other volumes from an administrative standpoint.

--
Dmitri Nikulin

Centre for Synchrotron Science
Monash University
Victoria 3800, Australia
On Wed, 2009-01-21 at 00:51 +1100, Dmitri Nikulin wrote:
> On Wed, Jan 21, 2009 at 12:02 AM, Chris Mason <chris.mason@oracle.com> wrote:
> > There are patches to support swap over NFS that might make it safe to
> > use on btrfs. At any rate, it is a fixable problem.
>
> FreeBSD has been able to run swap over NFS for as long as I can
> remember, what is different in Linux that makes it especially
> difficult?

There are two sides to this. First is the part where writing to NFS can cause memory allocations which can cause problems when you're trying to swap so you can do memory allocations. The swap over NFS patches add code to deal with that.

The second is an implementation detail of the Linux swap file code. It expects filesystems don't move blocks around, and takes a mapping of the blocks in the FS once.

This doesn't work with btrfs because we do move blocks around all the time.

-chris
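To make the second point concrete, here is a toy Python model. It is not kernel code, and ToyCowFS and everything in it are hypothetical names. It assumes the swap layer records the swapfile's physical blocks once and later writes straight to them, while a copy-on-write filesystem relocates blocks and reuses the freed ones.

```python
# Toy model only: a copy-on-write "filesystem" and a swap layer that records
# the swapfile's physical blocks once. All names here are hypothetical.

class ToyCowFS:
    def __init__(self, nblocks):
        self.disk = [None] * nblocks     # contents of each physical block
        self.in_use = [False] * nblocks  # allocation bitmap
        self.maps = {}                   # file name -> list of physical blocks

    def _alloc(self):
        blk = self.in_use.index(False)   # first free physical block
        self.in_use[blk] = True
        return blk

    def create(self, name, chunks):
        self.maps[name] = []
        for chunk in chunks:
            blk = self._alloc()
            self.disk[blk] = chunk
            self.maps[name].append(blk)

    def cow_write(self, name, index, chunk):
        """Write to a new block, repoint the file, then free the old block."""
        old = self.maps[name][index]
        new = self._alloc()
        self.disk[new] = chunk
        self.maps[name][index] = new
        self.in_use[old] = False

    def read(self, name):
        return [self.disk[blk] for blk in self.maps[name]]


fs = ToyCowFS(nblocks=8)
fs.create("swapfile", ["swap0", "swap1"])

# "swapon": the swap layer copies the block map once and never asks again.
stale_map = list(fs.maps["swapfile"])

# The filesystem relocates the first swapfile block (COW, defrag, ...).
fs.cow_write("swapfile", 0, "swap0")

# The freed block is handed to an unrelated file.
fs.create("notes", ["important data"])

# Swap-out goes straight to the recorded physical block, bypassing the FS.
fs.disk[stale_map[0]] = "swapped-out page"

print(stale_map, fs.maps["swapfile"], fs.read("notes"))
# prints: [0, 1] [2, 1] ['swapped-out page']
```

In this sketch the swapfile's own map has moved on to block 2, but the swap layer still writes to block 0, which now belongs to "notes". That is the "corrupt other file data" failure in miniature.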
> The second is an implementation detail of the linux swap file code. It
> expects filesystems don't move blocks around, and takes a mapping of the
> blocks in the FS once.
>
> This doesn't work with btrfs because we do move blocks around all the
> time.

That's interesting. I have a few questions:

- Is creating a loopback device from the file any different, or does that lead to the same problems?

- Would mounting a filesystem image via loopback device cause similar problems?

- Would this be viable if using a dedicated nodatacow subvolume, or is that still too risky because of the odd case where you do cow?

- Does online defragmentation hurt this as well?

Cheers,

-Anthony
On Tue, 2009-01-20 at 09:35 -0700, Anthony Roberts wrote:
> > The second is an implementation detail of the linux swap file code. It
> > expects filesystems don't move blocks around, and takes a mapping of the
> > blocks in the FS once.
> >
> > This doesn't work with btrfs because we do move blocks around all the
> > time.
>
> That's interesting. I have a few questions:
>
> -Is creating a loopback device from the file any different, or does that
> lead to the same problems?

The loopback device would probably work. At least it would cover blocks that move around.

> -Would mounting a filesystem image via loopback device cause similar
> problems?

Loopback goes through safer APIs.

> -Would this be viable if using a dedicated nodatacow subvolume, or is that
> still too risky because of the odd case where you do cow?

nodatacow is allowed to COW when there are snapshots or clones. I wouldn't recommend swapfiles on it just because people can easily forget about the swapfile restriction.

I plan on sending a patch to at least disable swapfiles for btrfs in this kernel cycle. Later on we can work out the swap bmapping apis with the VM maintainers.

> -Does online defragmentation hurt this as well?

Yes. Online defrag in XFS may have problems with this too, I'm asking the xfs people if they have worked around this.

-chris
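The nodatacow caveat can be sketched with a small Python model. This is only an illustration under stated assumptions: ToyVolume, its per-block reference counts, and write_nodatacow are hypothetical and do not mirror btrfs internals. The only behaviour taken from the thread is that a nodatacow write may still COW once a snapshot or clone shares the block.

```python
# Toy model only: "nodatacow" overwrites in place while a block has a single
# owner, but must copy once a snapshot shares it. Names are hypothetical.

class ToyVolume:
    def __init__(self):
        self.blocks = {}    # physical block number -> data
        self.refs = {}      # physical block number -> reference count
        self.next_free = 0

    def _alloc(self, data):
        blk = self.next_free
        self.next_free += 1
        self.blocks[blk] = data
        self.refs[blk] = 1
        return blk

    def create(self, chunks):
        """Return a new file's block map (a list of physical block numbers)."""
        return [self._alloc(chunk) for chunk in chunks]

    def snapshot(self, block_map):
        """Share every block of the file with a snapshot."""
        for blk in block_map:
            self.refs[blk] += 1
        return list(block_map)

    def write_nodatacow(self, block_map, index, data):
        """Overwrite in place if unshared, otherwise fall back to COW."""
        blk = block_map[index]
        if self.refs[blk] == 1:
            self.blocks[blk] = data                # in place, block stays put
        else:
            self.refs[blk] -= 1                    # snapshot keeps the old block
            block_map[index] = self._alloc(data)   # file moves to a new block
        return block_map[index]


vol = ToyVolume()
swapfile = vol.create(["a", "b"])

before = vol.write_nodatacow(swapfile, 0, "a2")   # no snapshot yet
snap = vol.snapshot(swapfile)
after = vol.write_nodatacow(swapfile, 0, "a3")    # block is now shared

print(before, after, snap, vol.blocks[0])
# prints: 0 2 [0, 1] a2
```

The first write leaves the file on block 0. After the snapshot, the same write moves the file to block 2, so anything that cached "block 0" as the swapfile's location is out of date again.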
Dmitri Nikulin <dnikulin@gmail.com> writes:

> On Wed, Jan 21, 2009 at 12:02 AM, Chris Mason <chris.mason@oracle.com> wrote:
>> There are patches to support swap over NFS that might make it safe to
>> use on btrfs. At any rate, it is a fixable problem.
>
> FreeBSD has been able to run swap over NFS for as long as I can
> remember, what is different in Linux that makes it especially
> difficult?

One big traditional difference is that FreeBSD uses fixed isolated pools for their networking buffers (that is why you had to tune most systems for higher network workloads), while Linux has fully unified[1] memory management including the network stack. Now I believe recent BSD also moved to more unified network management and it wouldn't surprise me if they had trouble with this now too.

[1] at least for now, there are unfortunately some tendencies to move back to fixed pools too.

> I've read that swap over non-trivial filesystems is hazardous as it
> may lead to a situation in which memory allocation can fail in the
> swap/FS code that was meant to make allocation possible again.

A lot of this has been fixed in the 2.6 timeframe (e.g. there's now a better enforced global dirty limit), but there are likely still corner cases that could run into difficulties, so no one is really declaring it 100% safe yet.

> If btrfs is to take the role of a RAID and volume manager, it would
> certainly be very useful to be able to run swap on it, since that
> frees up other volumes from an administrative standpoint.

The fixed extent mapping of the swap files is really a different problem, independent of the memory allocation issue.

In general the memory allocation problem on writeout has to be solved in any case (even if you don't support swap files), because any dirty mmap'ed file effectively acts like a swap file.

-Andi

--
ak@linux.intel.com -- Speaking for myself only.
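The memory allocation side can be illustrated with a toy Python allocator. This is a sketch under assumptions, not a description of what Linux, FreeBSD, or the swap over NFS patches actually do: ToyAllocator, its reserve, and swap_out are hypothetical names. It only shows why writeout that itself needs memory can get stuck, and how a pool set aside for the writeout path avoids that.

```python
# Toy model only: writeout needs a few pages to do its I/O. If ordinary
# allocations can use every page, swap-out cannot start. Names are hypothetical.

class ToyAllocator:
    """Page allocator with a reserve that only the writeout path may use."""

    def __init__(self, total_pages, reserve_pages):
        self.free = total_pages
        self.reserve = reserve_pages

    def alloc(self, pages, for_writeout=False):
        # Ordinary callers must leave the reserve untouched.
        floor = 0 if for_writeout else self.reserve
        if self.free - pages < floor:
            return False
        self.free -= pages
        return True

    def release(self, pages):
        self.free += pages


def swap_out(allocator, pages_needed_for_io, pages_freed):
    """Try to write pages out; the I/O itself needs memory first."""
    if not allocator.alloc(pages_needed_for_io, for_writeout=True):
        return False  # stuck: cannot allocate in order to free memory
    # I/O done: give back the I/O buffers plus the pages that were swapped out.
    allocator.release(pages_needed_for_io + pages_freed)
    return True


with_reserve = ToyAllocator(total_pages=10, reserve_pages=2)
filled = with_reserve.alloc(8)          # ordinary use takes all but the reserve
refused = with_reserve.alloc(1)         # ordinary caller may not dip into it
ok_reserved = swap_out(with_reserve, pages_needed_for_io=2, pages_freed=4)

no_reserve = ToyAllocator(total_pages=10, reserve_pages=0)
no_reserve.alloc(10)                    # ordinary use takes everything
ok_unreserved = swap_out(no_reserve, pages_needed_for_io=2, pages_freed=4)

print(filled, refused, ok_reserved, with_reserve.free, ok_unreserved)
# prints: True False True 6 False
```

With the reserve, swap-out can always make progress and ends with 6 free pages. Without it, the allocator is empty and swap-out fails at the very moment it is needed.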