Hello,

Yesterday I realized that the algorithm for nodatacow is broken: it can't
reliably detect whether a given extent is referenced by only one snapshot.
Let me use the attached picture to describe the issue.

Figure (1) shows the initial tree structure. There is only one fs tree, A.

Figure (2) shows the tree structure after we create a snapshot of fs tree A.
The new snapshot's root node is B.

Figure (3) shows the situation after we modify leaf node L, starting from
the state shown in figure (1).

Figure (4) shows the situation after we modify leaf node L while snapshot B
exists.

In the figures, the color of a rectangle distinguishes tree nodes that
belong to different owners (the owner field in the tree node header). Node
A' is the shadow copy of node A, and leaf L' is the shadow copy of L.

When the nodatacow option is enabled, btrfs_count_snapshots_in_path is used
to detect whether a given extent is referenced by only one snapshot. It uses
the backref info for the tree blocks in the btrfs_path and for the file
extent to do the complex work. In the examples shown in figure (3) and
figure (4), the backref info for node A', leaf L' and the file extent is
used. The backref info used in the figure (3) case is the same as in the
figure (4) case. But in figure (3) the file extent is referenced by one
snapshot, while in figure (4) it is referenced by two snapshots. In both
cases, btrfs_count_snapshots_in_path returns 1.

Regards
YZ

On Wed, 2008-07-16 at 15:41 +0800, Yan Zheng wrote:
> Hello,
>
> Yesterday, I realized the algorithm for nodatacow is broken, it can't
> reliably detect whether a given extent is referenced by only one
> snapshot.

I had to change around nodatacow back in May because it was definitely
broken in the way you describe. I agree the special case I added to allow
nodatacow when one of the references comes from the running transaction is
broken.

But, we should be able to fix it by extending the reference count checks up
the tree. Any reference > 1 not held by the running transaction on any block
should force a cow.

-chris

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

On Wed, 2008-07-16 at 13:15 +0000, Chris Mason wrote:
> On Wed, 2008-07-16 at 15:41 +0800, Yan Zheng wrote:
> > Hello,
> >
> > Yesterday, I realized the algorithm for nodatacow is broken, it can't
> > reliably detect whether a given extent is referenced by only one
> > snapshot.
>
> I had to change around nodatacow back in May because it was definitely
> broken in the way you describe. I agree the special case I added to
> allow nodatacow when one of the references comes from the running
> transaction is broken.
>
> But, we should be able to fix it by extending the reference count checks
> up the tree. Any reference > 1 not held by the running transaction on
> any block should force a cow.

Actually the existing code assumes any reference held by a different
generation of the current root is safe. As Yan's picture shows, that isn't
quite true.

So, we can make it safe by walking down that root and checking for snapshots
as well.

-chris
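
The rule proposed earlier in the thread (any block on the path with more
than one reference not held by the running transaction should force a cow)
can be sketched in user-space C as below. This is only a toy model under
stated assumptions, not btrfs code: struct toy_block, its fields and
must_cow() are hypothetical names, and "more than one reference not held by
the running transaction" is read here as refs - running_refs > 1. It does
not cover the follow-up point about walking down a different generation of
the current root.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model of one block on the path; not a btrfs structure. */
struct toy_block {
	const char *name;          /* label for readability only */
	unsigned int refs;         /* total references to this block */
	unsigned int running_refs; /* references held by the running transaction */
};

/*
 * path[0] is the file extent and path[n - 1] is the root node.
 * Returns true if any block on the path has more than one reference that
 * is not held by the running transaction, i.e. the write must be COWed.
 */
static bool must_cow(const struct toy_block *path, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		/* Assumed invariant of the toy model. */
		assert(path[i].running_refs <= path[i].refs);

		unsigned int outside = path[i].refs - path[i].running_refs;

		if (outside > 1)
			return true; /* possibly shared with another snapshot */
	}
	return false; /* every block has at most one outside reference */
}
```

With this reading, a path on which some block has two references and
neither is held by the running transaction forces a cow, while a path whose
only extra references come from the running transaction does not.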