Thus is recounted my quest for LVM snapshots in Xen:

As I mentioned earlier, I've attempted to get snapshots working with
LVM2 in Xen.  I haven't yet been successful, but it almost works...

What I've done so far: I'm using a recent version of Xen 2.0.
(Currently testing with the xen-unstable.tgz snapshot from last night.)
I'm running XenoLinux 2.4.26 in domain 0, with the device mapper patches
from http://sources.redhat.com/dm/ applied to the kernel.  Domain 0 is
running a mostly up-to-date Debian testing installation with the LVM2
tools installed.

I've been successfully using this setup to create and destroy LVM
logical volumes for the other guest domains, so the basics of LVM are
working.  Snapshots aren't, however.

I attempted to create a snapshot of an existing logical volume.  The
command I used was

    lvcreate -L 256M -s -n xen1-snap vg1/xen1

and it terminates with a segmentation fault.  The system logs show:

Aug 9 17:45:17 localhost kernel: Unable to handle kernel NULL pointer dereference at virtual address 00000018
Aug 9 17:45:17 localhost kernel: printing eip:
Aug 9 17:45:17 localhost kernel: c02c8db3
Aug 9 17:45:17 localhost kernel: *pde=00000000(55555000)
Aug 9 17:45:17 localhost kernel: Oops: 0002
Aug 9 17:45:17 localhost kernel: CPU: 0
Aug 9 17:45:17 localhost kernel: EIP: 0819:[alloc_area+99/128] Not tainted
Aug 9 17:45:17 localhost kernel: EFLAGS: 00213246
Aug 9 17:45:17 localhost kernel: eax: 00000000 ebx: 00000000 ecx: 00000204 edx: 011f5000
Aug 9 17:45:17 localhost kernel: esi: 00000002 edi: 00000000 ebp: c3267860 esp: c2325e88
Aug 9 17:45:17 localhost kernel: ds: 0821 es: 0821 ss: 0821
Aug 9 17:45:17 localhost kernel: Process lvcreate (pid: 339, stackpage=c2325000)<1>
Aug 9 17:45:17 localhost kernel: Stack: c4481000 000001f2 00000063 c3267860 fffffff4 00000010 c2ba390c c02c95bb
Aug 9 17:45:17 localhost kernel:        c3267860 000001f0 c2ba390c 00000200 c2ba38c0 c4418164 c02c7c6b c2ba390c
Aug 9 17:45:17 localhost kernel:        00000010 00000000 00000000 00000003 c2ba38d4 50325ee4 c2ba38c0 c4418170
Aug 9 17:45:17 localhost kernel: Call Trace: [dm_create_persistent+155/320] [snapshot_ctr+923/1072] [dm_table_add_target+249/336] [populate_table+130/224] [table_load+104/304]
Aug 9 17:45:17 localhost kernel: [ctl_ioctl+235/336] [table_load+0/304] [sys_ioctl+201/592] [system_call+47/51]
Aug 9 17:45:17 localhost kernel:

The offending line of code is in the alloc_area function (in
drivers/md/dm-exception-store.c):

    int r = -ENOMEM;
    size_t i, len, nr_pages;
    struct page *page, *last = NULL;

    len = ps->chunk_size << SECTOR_SHIFT;

    /*
     * Allocate the chunk_size block of memory that will hold
     * a single metadata area.
     */
    ps->area = vmalloc(len);
    if (!ps->area)
        return r;

    nr_pages = sectors_to_pages(ps->chunk_size);

    /*
     * We lock the pages for ps->area into memory since
     * they'll be doing a lot of io.  We also chain them
     * together ready for dm-io.
     */
    for (i = 0; i < nr_pages; i++) {
        page = vmalloc_to_page(ps->area + (i * PAGE_SIZE));
        LockPage(page);
        if (last)
            last->list.next = &page->list;
        last = page;
    }

LockPage is the line causing the fault, so it appears that
vmalloc_to_page is returning NULL.

I've had success using a similarly-patched 2.4.26 kernel running on the
bare hardware, not in Xen--this is where I was doing my testing of
snapshot functionality last week.

I don't know the cause of the trouble with snapshots under Xen.  It
wouldn't surprise me if it were a bug in the device mapper code that is
somehow being triggered when running under Xen.  I'm posting here in the
hope that maybe someone does have an idea what the problem is, or
failing that, to at least give an update on my attempts to use LVM
snapshots with Xen.

I'm hoping that the device mapper snapshot functionality in Linux 2.6.8,
when it is released, will work.  But I'd like to get snapshots working
with 2.4 if possible.  If I do get this working, I'll send a post with
instructions for doing so.

Is anyone else using LVM2 and Xen?

--Michael Vrable

-------------------------------------------------------
SF.Net email is sponsored by Shop4tech.com-Lowest price on Blank Media
100pk Sonic DVD-R 4x for only $29 -100pk Sonic DVD+R for only $33
Save 50% off Retail on Ink & Toner - Free Shipping and Free Gift.
http://www.shop4tech.com/z/Inkjet_Cartridges/9_108_r285
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/xen-devel
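A note on the inference that vmalloc_to_page is returning NULL: the fault address 00000018 in the oops fits a write to a field at offset 0x18 from a NULL struct page pointer, and "Oops: 0002" on x86 indicates a kernel-mode write to a not-present page. The sketch below checks the arithmetic in userspace. The field order in page_model is an assumption written from memory of the 2.4 i386 struct page (list, mapping, index, next_hash, count, flags), not copied from the patched tree, and page_model, list_head_model and flags_address are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * ASSUMPTION: userspace model of the leading fields of the 2.4-era i386
 * struct page. Pointers and longs are modelled as 32-bit values so the
 * offsets match an i386 kernel even when built on a 64-bit host.
 */
struct list_head_model {
    uint32_t next;
    uint32_t prev;
};

struct page_model {
    struct list_head_model list; /* 0x00 */
    uint32_t mapping;            /* 0x08 */
    uint32_t index;              /* 0x0c */
    uint32_t next_hash;          /* 0x10 */
    uint32_t count;              /* 0x14 */
    uint32_t flags;              /* 0x18: the word a page-lock bit lives in */
};

/* Compile-time check that the modelled flags word sits at offset 0x18. */
static_assert(offsetof(struct page_model, flags) == 0x18,
              "flags offset in the model");

/* Address that is written when the flags word of a page at page_addr is set. */
static uint32_t flags_address(uint32_t page_addr)
{
    return page_addr + (uint32_t)offsetof(struct page_model, flags);
}
```

Under that assumed layout, flags_address(0) equals the faulting address from the log, which supports (but does not prove) the reading that LockPage was handed a NULL page.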
Please try adding 'XEN_flush_page_update_queue()' to
mm/vmalloc.c:__vmalloc_area_pages, just before 'return 0':

    .....
    spin_unlock(&init_mm.page_table_lock);
    flush_cache_all();
    XEN_flush_page_update_queue();
    return 0;
 err:
    .....

 -- Keir
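One way to picture why the suggested flush could matter: if page-table writes are batched in a queue instead of being applied immediately, a page-table walk that runs before the queue is flushed sees no mapping yet. The thread does not spell this out, so treat it as an interpretation of the fix. The toy model below runs entirely in userspace; page_table, queue_pte_update, flush_update_queue and lookup_pte are hypothetical names and do not correspond to real Xen or Linux functions.

```c
#include <assert.h>
#include <stddef.h>

#define NR_PTES   8
#define QUEUE_MAX 4

/* Toy "page table": an entry of 0 means "not present". */
static unsigned long page_table[NR_PTES];

/* Updates that have been requested but not yet applied to page_table. */
struct pending_update {
    size_t index;
    unsigned long value;
};

static struct pending_update queue[QUEUE_MAX];
static size_t queue_len;

/* Apply every queued update to the table, then empty the queue. */
static void flush_update_queue(void)
{
    for (size_t i = 0; i < queue_len; i++)
        page_table[queue[i].index] = queue[i].value;
    queue_len = 0;
}

/* Record an update without applying it; flush first if the queue is full. */
static void queue_pte_update(size_t index, unsigned long value)
{
    assert(index < NR_PTES);
    if (queue_len == QUEUE_MAX)
        flush_update_queue();
    queue[queue_len].index = index;
    queue[queue_len].value = value;
    queue_len++;
}

/* Read the table directly, as a page-table walk would; 0 means no mapping. */
static unsigned long lookup_pte(size_t index)
{
    assert(index < NR_PTES);
    return page_table[index];
}
```

In this model, calling queue_pte_update(3, 0x1000) and then lookup_pte(3) returns 0 until flush_update_queue() runs, after which the lookup returns 0x1000. That mirrors the shape of the suggested change: flush pending updates before returning from the mapping routine, so a later lookup sees them.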
On Mon, Aug 09, 2004 at 08:35:01PM +0100, Keir Fraser wrote:
> Please try adding 'XEN_flush_page_update_queue()' to
> mm/vmalloc.c:__vmalloc_area_pages, just before 'return 0':
>
>     .....
>     spin_unlock(&init_mm.page_table_lock);
>     flush_cache_all();
>     XEN_flush_page_update_queue();
>     return 0;
>  err:
>     .....

I decided to try this before attempting snapshots under Linux 2.6, since
this was quicker.  I made the change you specified and recompiled.  This
seems to fix the problem.  I'll do a bit more testing in a bit.

Thanks!

--Michael Vrable