thr3ads.net - Virtualization - [PATCH v7 00/12] Support non-lru page migration [Jun 2016]


Minchan Kim

2016-Jun-16 02:58 UTC

[PATCH v7 00/12] Support non-lru page migration

On Thu, Jun 16, 2016 at 11:48:27AM +0900, Sergey Senozhatsky wrote:
> Hi,
> 
> On (06/16/16 08:12), Minchan Kim wrote:
> > > [  315.146533] kasan: CONFIG_KASAN_INLINE enabled
> > > [  315.146538] kasan: GPF could be caused by NULL-ptr deref or user memory access
> > > [  315.146546] general protection fault: 0000 [#1] PREEMPT SMP KASAN
> > > [  315.146576] Modules linked in: lzo zram zsmalloc mousedev coretemp hwmon crc32c_intel r8169 i2c_i801 mii snd_hda_codec_realtek snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hda_core acpi_cpufreq snd_pcm snd_timer snd soundcore lpc_ich mfd_core processor sch_fq_codel sd_mod hid_generic usbhid hid ahci libahci libata ehci_pci ehci_hcd scsi_mod usbcore usb_common
> > > [  315.146785] CPU: 3 PID: 38 Comm: khugepaged Not tainted 4.7.0-rc3-next-20160614-dbg-00004-ga1c2cbc-dirty #488
> > > [  315.146841] task: ffff8800bfaf2900 ti: ffff880112468000 task.ti: ffff880112468000
> > > [  315.146859] RIP: 0010:[<ffffffffa02c413d>]  [<ffffffffa02c413d>] zs_page_migrate+0x355/0xaa0 [zsmalloc]
> > 
> > Thanks for the report!
> > 
> > zs_page_migrate+0x355? Could you tell me what line is it?
> > 
> > It seems to be related to obj_to_head.
> 
> reproduced. a bit different call stack this time. but the problem is
> still the same.
> 
> zs_compact()
> ...
>     6371:       e8 00 00 00 00          callq  6376 <zs_compact+0x22b>
>     6376:       0f 0b                   ud2    
>     6378:       48 8b 95 a8 fe ff ff    mov    -0x158(%rbp),%rdx
>     637f:       4d 8d 74 24 78          lea    0x78(%r12),%r14
>     6384:       4c 89 ee                mov    %r13,%rsi
>     6387:       4c 89 e7                mov    %r12,%rdi
>     638a:       e8 86 c7 ff ff          callq  2b15 <get_first_obj_offset>
>     638f:       41 89 c5                mov    %eax,%r13d
>     6392:       4c 89 f0                mov    %r14,%rax
>     6395:       48 c1 e8 03             shr    $0x3,%rax
>     6399:       8a 04 18                mov    (%rax,%rbx,1),%al
>     639c:       84 c0                   test   %al,%al
>     639e:       0f 85 f2 02 00 00       jne    6696 <zs_compact+0x54b>
>     63a4:       41 8b 44 24 78          mov    0x78(%r12),%eax
>     63a9:       41 0f af c7             imul   %r15d,%eax
>     63ad:       41 01 c5                add    %eax,%r13d
>     63b0:       4c 89 f0                mov    %r14,%rax
>     63b3:       48 c1 e8 03             shr    $0x3,%rax
>     63b7:       48 01 d8                add    %rbx,%rax
>     63ba:       48 89 85 88 fe ff ff    mov    %rax,-0x178(%rbp)
>     63c1:       41 81 fd ff 0f 00 00    cmp    $0xfff,%r13d
>     63c8:       0f 87 1a 03 00 00       ja     66e8 <zs_compact+0x59d>
>     63ce:       49 63 f5                movslq %r13d,%rsi
>     63d1:       48 03 b5 98 fe ff ff    add    -0x168(%rbp),%rsi
>     63d8:       48 8b bd a8 fe ff ff    mov    -0x158(%rbp),%rdi
>     63df:       e8 67 d9 ff ff          callq  3d4b <obj_to_head>
>     63e4:       a8 01                   test   $0x1,%al
>     63e6:       0f 84 d9 02 00 00       je     66c5 <zs_compact+0x57a>
>     63ec:       48 83 e0 fe             and    $0xfffffffffffffffe,%rax
>     63f0:       bf 01 00 00 00          mov    $0x1,%edi
>     63f5:       48 89 85 b0 fe ff ff    mov    %rax,-0x150(%rbp)
>     63fc:       e8 00 00 00 00          callq  6401 <zs_compact+0x2b6>
>     6401:       48 8b 85 b0 fe ff ff    mov    -0x150(%rbp),%rax

RAX: 2065676162726166, so rax is total garbage, I think.
It means obj_to_head returns garbage because the offset from
get_first_obj_offset is utter crap: (page_idx / class->pages_per_zspage)
was totally wrong.

> 					^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>     6408:       f0 0f ba 28 00          lock btsl $0x0,(%rax) 
<snip>
> > Could you test with [zsmalloc: keep first object offset in struct page]
> > in mmotm?
> 
> sure, I can.  will it help, tho? we have a race condition here I think.

I guess the root cause is get_first_obj_offset.
Please test with it.

Thanks!
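
A quick way to sanity-check the "rax is garbage" reading is to look at the quoted RAX value as bytes. The sketch below is only an illustration, not part of the original exchange: it assumes the value was loaded from memory on little-endian x86-64, and that the low bit was cleared by the `and $0xfffffffffffffffe,%rax` in the listing after `test $0x1,%al` passed. `as_ascii` is a hypothetical helper name; `OBJ_ALLOCATED_TAG` is used here simply as a label for that low bit.

```python
import struct

OBJ_ALLOCATED_TAG = 1      # low bit checked by `test $0x1,%al` in the listing

rax = 0x2065676162726166   # RAX as quoted from the oops


def as_ascii(word):
    """Return the 8 bytes of a 64-bit word in little-endian (memory) order."""
    return struct.pack("<Q", word).decode("ascii")


# Value used by the faulting `lock btsl $0x0,(%rax)`: tag bit already cleared.
handle_text = as_ascii(rax)

# Value obj_to_head would have returned before the tag bit was masked off.
head_text = as_ascii(rax | OBJ_ALLOCATED_TAG)

print(repr(handle_text))   # 'farbage '
print(repr(head_text))     # 'garbage '
```

If those assumptions hold, the "head" word was plain ASCII text rather than a handle, which fits the theory that the scan started at a wrong offset and read object payload.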

Sergey Senozhatsky

2016-Jun-16 04:23 UTC


[PATCH v7 00/12] Support non-lru page migration

On (06/16/16 11:58), Minchan Kim wrote:
[..]
> RAX: 2065676162726166, so rax is total garbage, I think.
> It means obj_to_head returns garbage because the offset from
> get_first_obj_offset is utter crap: (page_idx / class->pages_per_zspage)
> was totally wrong.
> 
> > 					^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> >     6408:       f0 0f ba 28 00          lock btsl $0x0,(%rax)
>  
> <snip>
> 
> > > Could you test with [zsmalloc: keep first object offset in struct page]
> > > in mmotm?
> > 
> > sure, I can.  will it help, tho? we have a race condition here I think.
> 
> I guess the root cause is get_first_obj_offset.

sounds reasonable.

> Please test with it.
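
To make the get_first_obj_offset theory concrete, here is a toy model of the scan that the disassembly performs (first offset, plus class size times index, then obj_to_head on each slot). It is a sketch under stated assumptions, not the zsmalloc code: objects are packed back to back across the pages of a zspage, each starts with an 8-byte head word whose low bit marks it allocated, and the class size, page count, handle values and ASCII filler payload are all made up. `first_obj_offset`, `build_zspage` and `heads_in_page` are hypothetical names.

```python
import struct

PAGE_SIZE = 4096
OBJ_ALLOCATED_TAG = 1   # low bit of the head word marks an allocated object
SIZE = 48               # hypothetical class size
PAGES = 3               # hypothetical pages per zspage (3 * 4096 == 256 * 48)


def first_obj_offset(page_idx, size=SIZE):
    """Offset inside page `page_idx` of the first object that starts there."""
    return (-page_idx * PAGE_SIZE) % size


def build_zspage():
    """Pack objects back to back: 8-byte head word, then filler payload."""
    buf = bytearray()
    for n in range(PAGES * PAGE_SIZE // SIZE):
        handle = 0x1000 + n * 16                 # made-up, even handle values
        buf += struct.pack("<Q", handle | OBJ_ALLOCATED_TAG)
        buf += b"garbage " * ((SIZE - 8) // 8)   # arbitrary ASCII payload
    return bytes(buf)


def heads_in_page(zspage, page_idx, offset):
    """Read one head word every SIZE bytes, starting at `offset` in the page."""
    base = page_idx * PAGE_SIZE
    heads = []
    while offset + 8 <= PAGE_SIZE:
        (head,) = struct.unpack_from("<Q", zspage, base + offset)
        heads.append(head)
        offset += SIZE
    return heads


zspage = build_zspage()
good = heads_in_page(zspage, 1, first_obj_offset(1))   # correct start: 32
bad = heads_in_page(zspage, 1, 0)                      # only right for page 0

print(hex(good[0]))   # 0x1561
print(hex(bad[0]))    # 0x2065676162726167
```

With the right starting offset every word read is a real head. With the wrong one, every read lands inside a payload, and whenever that word happens to have its low bit set it is treated as an allocated object's handle and dereferenced.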

this is what I'm getting with [zsmalloc: keep first object offset in struct page]
applied: "count:0 mapcount:-127", which may not be related to zsmalloc at this point.

kernel: BUG: Bad page state in process khugepaged  pfn:101db8
kernel: page:ffffea0004076e00 count:0 mapcount:-127 mapping:          (null) index:0x1
kernel: flags: 0x8000000000000000()
kernel: page dumped because: nonzero mapcount
kernel: Modules linked in: lzo zram zsmalloc mousedev coretemp hwmon crc32c_intel snd_hda_codec_realtek i2c_i801 snd_hda_codec_generic r8169 mii snd_hda_intel snd_hda_codec snd_hda_core acpi_cpufreq snd_pcm snd_timer snd soundcore lpc_ich processor mfd_core sch_fq_codel sd_mod hid_generic usb
kernel: CPU: 3 PID: 38 Comm: khugepaged Not tainted 4.7.0-rc3-next-20160615-dbg-00005-gfd11984-dirty #491
kernel:  0000000000000000 ffff8801124c73f8 ffffffff814d69b0 ffffea0004076e00
kernel:  ffffffff81e658a0 ffff8801124c7420 ffffffff811e9b63 0000000000000000
kernel:  ffffea0004076e00 ffffffff81e658a0 ffff8801124c7440 ffffffff811e9ca9
kernel: Call Trace:
kernel:  [<ffffffff814d69b0>] dump_stack+0x68/0x92
kernel:  [<ffffffff811e9b63>] bad_page+0x158/0x1a2
kernel:  [<ffffffff811e9ca9>] free_pages_check_bad+0xfc/0x101
kernel:  [<ffffffff811ee516>] free_hot_cold_page+0x135/0x5de
kernel:  [<ffffffff811eea26>] __free_pages+0x67/0x72
kernel:  [<ffffffff81227c63>] release_freepages+0x13a/0x191
kernel:  [<ffffffff8122b3c2>] compact_zone+0x845/0x1155
kernel:  [<ffffffff8122ab7d>] ? compaction_suitable+0x76/0x76
kernel:  [<ffffffff8122bdb2>] compact_zone_order+0xe0/0x167
kernel:  [<ffffffff8122bcd2>] ? compact_zone+0x1155/0x1155
kernel:  [<ffffffff8122ce88>] try_to_compact_pages+0x2f1/0x648
kernel:  [<ffffffff8122ce88>] ? try_to_compact_pages+0x2f1/0x648
kernel:  [<ffffffff8122cb97>] ? compaction_zonelist_suitable+0x3a6/0x3a6
kernel:  [<ffffffff811ef1ea>] ? get_page_from_freelist+0x2c0/0x133c
kernel:  [<ffffffff811f0350>] __alloc_pages_direct_compact+0xea/0x30d
kernel:  [<ffffffff811f0266>] ? get_page_from_freelist+0x133c/0x133c
kernel:  [<ffffffff811ee3b2>] ? drain_all_pages+0x1d6/0x205
kernel:  [<ffffffff811f21a8>] __alloc_pages_nodemask+0x143d/0x16b6
kernel:  [<ffffffff8111f405>] ? debug_show_all_locks+0x226/0x226
kernel:  [<ffffffff811f0d6b>] ? warn_alloc_failed+0x24c/0x24c
kernel:  [<ffffffff81110ffc>] ? finish_wait+0x1a4/0x1b0
kernel:  [<ffffffff81122faf>] ? lock_acquire+0xec/0x147
kernel:  [<ffffffff81d32ed0>] ? _raw_spin_unlock_irqrestore+0x3b/0x5c
kernel:  [<ffffffff81d32edc>] ? _raw_spin_unlock_irqrestore+0x47/0x5c
kernel:  [<ffffffff81110ffc>] ? finish_wait+0x1a4/0x1b0
kernel:  [<ffffffff8128f73a>] khugepaged+0x1d4/0x484f
kernel:  [<ffffffff8128f566>] ? hugepage_vma_revalidate+0xef/0xef
kernel:  [<ffffffff810d5bcc>] ? finish_task_switch+0x3de/0x484
kernel:  [<ffffffff81d32f18>] ? _raw_spin_unlock_irq+0x27/0x45
kernel:  [<ffffffff8111d13f>] ? trace_hardirqs_on_caller+0x3d2/0x492
kernel:  [<ffffffff81111487>] ? prepare_to_wait_event+0x3f7/0x3f7
kernel:  [<ffffffff81d28bf5>] ? __schedule+0xa4d/0xd16
kernel:  [<ffffffff810cd0de>] kthread+0x252/0x261
kernel:  [<ffffffff8128f566>] ? hugepage_vma_revalidate+0xef/0xef
kernel:  [<ffffffff810cce8c>] ? kthread_create_on_node+0x377/0x377
kernel:  [<ffffffff81d3387f>] ret_from_fork+0x1f/0x40
kernel:  [<ffffffff810cce8c>] ? kthread_create_on_node+0x377/0x377
-- Reboot --

	-ss

Minchan Kim

2016-Jun-16 04:47 UTC


[PATCH v7 00/12] Support non-lru page migration

On Thu, Jun 16, 2016 at 01:23:43PM +0900, Sergey Senozhatsky wrote:
> On (06/16/16 11:58), Minchan Kim wrote:
> [..]
> > RAX: 2065676162726166, so rax is total garbage, I think.
> > It means obj_to_head returns garbage because the offset from
> > get_first_obj_offset is utter crap: (page_idx / class->pages_per_zspage)
> > was totally wrong.
> > 
> > > 					^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > >     6408:       f0 0f ba 28 00          lock btsl $0x0,(%rax)
> >  
> > <snip>
> > 
> > > > Could you test with [zsmalloc: keep first object offset in struct page]
> > > > in mmotm?
> > > 
> > > sure, I can.  will it help, tho? we have a race condition here I think.
> > 
> > I guess the root cause is get_first_obj_offset.
> 
> sounds reasonable.
> 
> > Please test with it.
> 
> 
> this is what I'm getting with [zsmalloc: keep first object offset in struct page]
> applied: "count:0 mapcount:-127", which may not be related to zsmalloc at this point.
> 
> kernel: BUG: Bad page state in process khugepaged  pfn:101db8
> kernel: page:ffffea0004076e00 count:0 mapcount:-127 mapping:          (null) index:0x1

Hm, it looks like a double free.

It doesn't happen if you disable zram? IOW, it seems to be related to
zsmalloc migration?

How easily can you reproduce it? Could you bisect it?

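
For readers wondering why "mapcount:-127" reads as a double free, the arithmetic can be sketched as follows. This is an illustration based on my recollection of kernels from this period, not something stated in the thread: the page dump prints the raw `_mapcount` field plus one, and free pages sitting in the buddy allocator are marked by storing `PAGE_BUDDY_MAPCOUNT_VALUE` (-128) in that field. The helper name `raw_from_printed` is hypothetical.

```python
# Assumed marker stored in page->_mapcount while a page is free in the buddy
# allocator (value recalled from kernels of this era, not from the thread).
PAGE_BUDDY_MAPCOUNT_VALUE = -128


def raw_from_printed(printed_mapcount):
    """Undo the +1 applied when the page dump prints the mapcount."""
    return printed_mapcount - 1


raw = raw_from_printed(-127)    # "mapcount:-127" from the Bad page state dump
looks_like_buddy_page = raw == PAGE_BUDDY_MAPCOUNT_VALUE

print(raw)                      # -128
print(looks_like_buddy_page)    # True
```

Under those assumptions the page handed to the free path already carried the "free buddy page" marker, which would mean it is being freed a second time.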
> kernel: flags: 0x8000000000000000()
> kernel: page dumped because: nonzero mapcount
> kernel: Modules linked in: lzo zram zsmalloc mousedev coretemp hwmon crc32c_intel snd_hda_codec_realtek i2c_i801 snd_hda_codec_generic r8169 mii snd_hda_intel snd_hda_codec snd_hda_core acpi_cpufreq snd_pcm snd_timer snd soundcore lpc_ich processor mfd_core sch_fq_codel sd_mod hid_generic usb
> kernel: CPU: 3 PID: 38 Comm: khugepaged Not tainted 4.7.0-rc3-next-20160615-dbg-00005-gfd11984-dirty #491
> kernel:  0000000000000000 ffff8801124c73f8 ffffffff814d69b0 ffffea0004076e00
> kernel:  ffffffff81e658a0 ffff8801124c7420 ffffffff811e9b63 0000000000000000
> kernel:  ffffea0004076e00 ffffffff81e658a0 ffff8801124c7440 ffffffff811e9ca9
> kernel: Call Trace:
> kernel:  [<ffffffff814d69b0>] dump_stack+0x68/0x92
> kernel:  [<ffffffff811e9b63>] bad_page+0x158/0x1a2
> kernel:  [<ffffffff811e9ca9>] free_pages_check_bad+0xfc/0x101
> kernel:  [<ffffffff811ee516>] free_hot_cold_page+0x135/0x5de
> kernel:  [<ffffffff811eea26>] __free_pages+0x67/0x72
> kernel:  [<ffffffff81227c63>] release_freepages+0x13a/0x191
> kernel:  [<ffffffff8122b3c2>] compact_zone+0x845/0x1155
> kernel:  [<ffffffff8122ab7d>] ? compaction_suitable+0x76/0x76
> kernel:  [<ffffffff8122bdb2>] compact_zone_order+0xe0/0x167
> kernel:  [<ffffffff8122bcd2>] ? compact_zone+0x1155/0x1155
> kernel:  [<ffffffff8122ce88>] try_to_compact_pages+0x2f1/0x648
> kernel:  [<ffffffff8122ce88>] ? try_to_compact_pages+0x2f1/0x648
> kernel:  [<ffffffff8122cb97>] ? compaction_zonelist_suitable+0x3a6/0x3a6
> kernel:  [<ffffffff811ef1ea>] ? get_page_from_freelist+0x2c0/0x133c
> kernel:  [<ffffffff811f0350>] __alloc_pages_direct_compact+0xea/0x30d
> kernel:  [<ffffffff811f0266>] ? get_page_from_freelist+0x133c/0x133c
> kernel:  [<ffffffff811ee3b2>] ? drain_all_pages+0x1d6/0x205
> kernel:  [<ffffffff811f21a8>] __alloc_pages_nodemask+0x143d/0x16b6
> kernel:  [<ffffffff8111f405>] ? debug_show_all_locks+0x226/0x226
> kernel:  [<ffffffff811f0d6b>] ? warn_alloc_failed+0x24c/0x24c
> kernel:  [<ffffffff81110ffc>] ? finish_wait+0x1a4/0x1b0
> kernel:  [<ffffffff81122faf>] ? lock_acquire+0xec/0x147
> kernel:  [<ffffffff81d32ed0>] ? _raw_spin_unlock_irqrestore+0x3b/0x5c
> kernel:  [<ffffffff81d32edc>] ? _raw_spin_unlock_irqrestore+0x47/0x5c
> kernel:  [<ffffffff81110ffc>] ? finish_wait+0x1a4/0x1b0
> kernel:  [<ffffffff8128f73a>] khugepaged+0x1d4/0x484f
> kernel:  [<ffffffff8128f566>] ? hugepage_vma_revalidate+0xef/0xef
> kernel:  [<ffffffff810d5bcc>] ? finish_task_switch+0x3de/0x484
> kernel:  [<ffffffff81d32f18>] ? _raw_spin_unlock_irq+0x27/0x45
> kernel:  [<ffffffff8111d13f>] ? trace_hardirqs_on_caller+0x3d2/0x492
> kernel:  [<ffffffff81111487>] ? prepare_to_wait_event+0x3f7/0x3f7
> kernel:  [<ffffffff81d28bf5>] ? __schedule+0xa4d/0xd16
> kernel:  [<ffffffff810cd0de>] kthread+0x252/0x261
> kernel:  [<ffffffff8128f566>] ? hugepage_vma_revalidate+0xef/0xef
> kernel:  [<ffffffff810cce8c>] ? kthread_create_on_node+0x377/0x377
> kernel:  [<ffffffff81d3387f>] ret_from_fork+0x1f/0x40
> kernel:  [<ffffffff810cce8c>] ? kthread_create_on_node+0x377/0x377
> -- Reboot --
> 
> 	-ss
