Anand Jain
2014-Jan-06 16:56 UTC
[bug] it's messy when a missing device reappears after it's been replaced in RAID1
Test case: make a disk disappear, replace the disappeared disk (RAID1), and then make the disappeared disk reappear.

----
mkfs.btrfs -f -m raid1 -d raid1 /dev/sdc /dev/sdd
mount /dev/sdc /btrfs
dd if=/dev/zero of=/btrfs/tf1 count=1
btrfs fi sync /btrfs
---

devmgt[1] helps to attach or detach a disk easily.
--
devmgt show
devmgt detach /dev/sdc
--

btrfs is still unaware that the device is missing.
--
btrfs fi show -m
Label: none  uuid: 5dc0aaf4-4683-4050-b2d6-5ebe5f5cd120
        Total devices 2 FS bytes used 32.00KiB
        devid    1 size 958.94MiB used 115.88MiB path /dev/sdc   <--
        devid    2 size 958.94MiB used 103.88MiB path /dev/sdd

btrfs rep start -f 1 /dev/sde /btrfs
Label: none  uuid: 5dc0aaf4-4683-4050-b2d6-5ebe5f5cd120
        Total devices 2 FS bytes used 32.00KiB
        devid    1 size 958.94MiB used 115.88MiB path /dev/sde
        devid    2 size 958.94MiB used 103.88MiB path /dev/sdd
--

So far so good. Now the missing /dev/sdc comes back.
---
devmgt attach host2

btrfs fi show -m shows sdc:
Label: none  uuid: 5dc0aaf4-4683-4050-b2d6-5ebe5f5cd120
        Total devices 2 FS bytes used 32.00KiB
        devid    1 size 958.94MiB used 115.88MiB path /dev/sdc   <- Wrong.
        devid    2 size 958.94MiB used 103.88MiB path /dev/sdd
---

This is wrong; it should be sde. It happens because, when the disk comes back, device_list_add() is called, and it unconditionally replaces the existing disk with any given disk that has the same fsid/devid. The actual IO, however, is still going to sde, not to sdc.

Further, when we start fresh (modprobe -r btrfs), unless the devices are carefully managed using btrfs dev scan <dev>, the filesystem may pair with the wrong disk.

Need your review of the following proposed fix. This patch compares the trans id (superblock generation) of the two disks before the disk is substituted.
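The rule the fix aims for can be illustrated outside the kernel. The following is a minimal user-space sketch, not the kernel code: struct toy_device and toy_maybe_replace_path() are hypothetical names, and the current generation is kept in a plain field here, whereas the patch reads it from the superblock of the currently bound disk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, simplified stand-in for the kernel's struct btrfs_device. */
struct toy_device {
	char path[64];     /* path currently bound to this fsid/devid */
	uint64_t transid;  /* generation of the superblock on that path */
	bool missing;      /* true if no disk is currently bound */
};

/*
 * Rebind the device to a newly scanned path only if the scanned disk
 * carries a newer generation than the disk currently bound. A missing
 * device counts as generation 0, so a scanned disk with any non-zero
 * generation can fill the slot. Returns true if the path was replaced.
 */
static bool toy_maybe_replace_path(struct toy_device *dev, const char *path,
				   uint64_t found_transid)
{
	uint64_t cur_transid;

	assert(dev != NULL && path != NULL);

	if (strcmp(dev->path, path) == 0)
		return false;  /* same path, nothing to substitute */

	cur_transid = dev->missing ? 0 : dev->transid;
	if (found_transid <= cur_transid)
		return false;  /* stale or equal copy: keep the current disk */

	/* Log before overwriting so that the old path is still printed. */
	printf("%s tran %llu replaced %s tran %llu\n", path,
	       (unsigned long long)found_transid, dev->path,
	       (unsigned long long)cur_transid);

	snprintf(dev->path, sizeof(dev->path), "%s", path);
	dev->transid = found_transid;
	dev->missing = false;
	return true;
}
```

For example, with devid 1 bound to /dev/sde at generation 10 (made-up numbers), a rescan of the stale /dev/sdc at generation 7 would be ignored, while a disk reporting generation 12 would take over the slot.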
----------------------------------------------------
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 2ca91fc..b226284 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -496,14 +496,39 @@ static noinline int device_list_add(const char *path,
 		device->fs_devices = fs_devices;
 	} else if (!device->name || strcmp(device->name->str, path)) {
-		name = rcu_string_strdup(path, GFP_NOFS);
-		if (!name)
-			return -ENOMEM;
-		rcu_string_free(device->name);
-		rcu_assign_pointer(device->name, name);
-		if (device->missing) {
-			fs_devices->missing_devices--;
-			device->missing = 0;
+
+		struct buffer_head *bh;
+		struct btrfs_super_block *cur_disk_super;
+		u64 cur_transid;
+
+		if (!device->missing) {
+			bh = btrfs_read_dev_super(device->bdev);
+			if (!bh)
+				return -EINVAL;
+
+			cur_disk_super = (struct btrfs_super_block *)
+						bh->b_data;
+			cur_transid = btrfs_super_generation(cur_disk_super);
+		} else
+			cur_transid = 0;
+
+		if (found_transid > cur_transid) {
+
+			name = rcu_string_strdup(path, GFP_NOFS);
+			if (!name)
+				return -ENOMEM;
+
+			rcu_string_free(device->name);
+			rcu_assign_pointer(device->name, name);
+
+			if (device->missing) {
+				fs_devices->missing_devices--;
+				device->missing = 0;
+			}
+
+			printk_in_rcu(KERN_INFO "%s tran %llu replaced %s tran %llu\n",
+				path, found_transid,
+				rcu_str_deref(device->name), cur_transid);
 		}
 	}
---------------------------------------

Thanks
Anand

[1] github.com/anajain/devmgt.git