I'm trying to debug an issue on an older laptop (Toshiba Satellite A505) that has an i3 processor (M330) and Intel graphics. It is running Xen-unstable with a 3.7-rc4 pvops kernel, but the problem can also be reproduced with kernels as old as 3.2.23 and hypervisors as old as 4.0.4.

(I have cross-posted here because I am not yet sure whether this is a Xen, pvops, or i915 issue, and would appreciate opinions in sorting it out.)

The symptom is a lagging mouse in X11 after resume when using the trackpad. After digging in a bit, the problem seems more insidious: the i915 kworker responsible for hotplug detection (i915_hotplug_work_func) seems to be triggered for every PS/2 IRQ. Every trackpad movement and every keypress produces tracing (with drm.debug=0x06) like the following:

Nov 6 21:41:58 rusting kernel: [ 263.924454] [drm:i915_hotplug_work_func], running encoder hotplug functions
Nov 6 21:41:58 rusting kernel: [ 263.924468] [drm:intel_ironlake_crt_detect_hotplug], ironlake hotplug adpa=0xf40018, result 0
Nov 6 21:41:58 rusting kernel: [ 263.924472] [drm:intel_crt_detect], CRT not detected via hotplug
Nov 6 21:41:58 rusting kernel: [ 263.924475] [drm:output_poll_execute], [CONNECTOR:11:VGA-1] status updated from 2 to 2
Nov 6 21:41:58 rusting kernel: [ 263.926771] [drm:drm_do_probe_ddc_edid], drm: skipping non-existent adapter i915 gmbus dpc
Nov 6 21:41:58 rusting kernel: [ 263.926775] [drm:output_poll_execute], [CONNECTOR:14:HDMI-A-1] status updated from 2 to 2
Nov 6 21:41:58 rusting kernel: [ 263.927291] [drm:intel_dp_aux_ch], dp_aux_ch timeout status 0x5145003e
Nov 6 21:41:58 rusting kernel: [ 263.944207] [drm:intel_dp_aux_ch], dp_aux_ch timeout status 0x5145003e
Nov 6 21:41:58 rusting kernel: [ 263.964201] [drm:intel_dp_aux_ch], dp_aux_ch timeout status 0x5145003e
Nov 6 21:41:58 rusting kernel: [ 263.983694] [drm:intel_dp_detect], DPCD: 0000000000000000
Nov 6 21:41:58 rusting kernel: [ 263.983704] [drm:output_poll_execute], [CONNECTOR:17:DP-1] status updated from 2 to 2

Additionally, this same trace sequence is printed at a regular 10s interval after resume, whereas before resuming from S3 it is printed only once, at boot time.

This kworker consumes a significant portion of the CPU and essentially grinds Xorg to a halt until the probing can catch up with the user moving the cursor.

There seems to be a mismatch in IRQ delivery, or at least the behavior resembles such a problem.

Does anyone have any thoughts as to where in the software stack I should start digging? Any opinions on which component likely contains the issue are appreciated.

/btg

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel
Ben Guthro
2012-Nov-07 19:43 UTC
Re: S3 causing IRQ delivery mismatch - i915 hotplug storm
On Wed, Nov 7, 2012 at 11:22 AM, Ben Guthro <ben@guthro.net> wrote:

> I'm trying to debug an issue on an older laptop (Toshiba Satellite A505) -
> that has an i3 processor (M330) - and intel graphics.
> This is running under Xen-unstable, and a 3.7-rc4 pvops kernel - but can
> also be reproduced using kernels as old as 3.2.23 - and hypervisors as old
> as 4.0.4
>
> (I have cross posted here, because I am not yet sure if this is a Xen,
> pvops, or i915 issue - and would appreciate opinions in sorting it out.)

After some additional debugging, this now appears to be unrelated to Xen / pvops, and looks like an issue with the i915 driver on older hardware. I'll remove xen-devel and Konrad from future replies to this thread.

> Additionally, this same trace stack is printed out at a regular 10s
> interval, after resume - where prior to resuming from S3 it is printed out
> once at boot time.

10*HZ seems to be the normal hotplug interval when an interrupt doesn't fire.

> There seems to be a mismatch for these IRQ delivery - or at least exhibits
> the behavior similar to such a problem.

I was mistaken here. The i8042 IRQ just kicks off the IRQ handling, but the i915 driver always thinks it has pending work, and schedules it.

It seems that the hotplug mask is not getting cleared in pch_iir (in i915_irq.c). Manually clearing this bit with

    pch_iir = pch_iir ^ hotplug_mask;

in the ironlake_irq_handler() function seems to resolve the issue: I no longer get the flurry of hotplug work bogging down the system.

...but is this disabling hotplug detection entirely?
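To make the question above concrete, here is a minimal, self-contained sketch in C of the behaviour being described. It is not the i915 code: struct fake_iir, HOTPLUG_MASK, OTHER_EVENT, handle_irq() and the helper names are all hypothetical, the bit positions are made up, and the model assumes the identity register is write-1-to-clear. It illustrates two points: a pending bit that is never acknowledged causes work to be queued on every later interrupt, and XOR only clears a bit that is already set (it sets a bit that was clear), whereas AND-NOT clears unconditionally.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit positions, for illustration only (not real i915 values). */
#define HOTPLUG_MASK (1u << 8)
#define OTHER_EVENT  (1u << 3)

/* Toy model of an interrupt identity register (assumed write-1-to-clear). */
struct fake_iir {
    uint32_t pending;
};

static uint32_t iir_read(const struct fake_iir *r)
{
    assert(r != NULL);
    return r->pending;
}

/* Writing a 1 to a bit acknowledges (clears) it; 0 bits are left alone. */
static void iir_ack(struct fake_iir *r, uint32_t bits)
{
    assert(r != NULL);
    r->pending &= ~bits;
}

/* Would a handler that tests the hotplug bit queue hotplug work? */
static bool hotplug_pending(uint32_t iir)
{
    return (iir & HOTPLUG_MASK) != 0;
}

/* XOR toggles the masked bits: set bits are cleared, clear bits are set. */
static uint32_t drop_with_xor(uint32_t iir, uint32_t mask)
{
    return iir ^ mask;
}

/* AND-NOT clears the masked bits regardless of their previous state. */
static uint32_t drop_with_andnot(uint32_t iir, uint32_t mask)
{
    return iir & ~mask;
}

/*
 * One pass of the toy handler. Returns true if hotplug work would be
 * queued. The bit is tested first and acknowledged afterwards, and only
 * when ack_hotplug is true.
 */
static bool handle_irq(struct fake_iir *r, bool ack_hotplug)
{
    uint32_t iir = iir_read(r);
    bool queued = hotplug_pending(iir);

    if (ack_hotplug)
        iir_ack(r, iir & HOTPLUG_MASK);
    return queued;
}
```

In this toy model, acknowledging the bit after it has been tested stops the storm without hiding new events: a later hotplug sets the bit again and is queued once. Whether the one-line change in the post behaves the same way in the real handler depends on whether it runs before or after the point where the driver tests pch_iir against hotplug_mask, which the post does not say.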