On 2021-Mar-15, at 14:57, Kevin Oberman <rkoberman at gmail.com> wrote:
> Responses in-line.
>
> On Sun, Mar 14, 2021 at 3:09 PM Mark Millard <marklmi at yahoo.com>
wrote:
>
>> On 2021-Mar-14, at 11:09, Kevin Oberman <rkoberman at gmail.com>
wrote:
>>
>> > . . .
>> >
>> > Seems to only occur on large r/w operations from/to the same disk.
"sp
>> > big-file /other/file/on/same/disk" or tar/untar operations on
large files.
>> > Hit this today updating firefox.
>> >
>> > I/O starts at >40MB/s. Dropped to about 1.5MB/s. If I tried
doing other
>> > things while it was running slowly, the disk would appear to lock
up. E.g.
>> > pwd(1) seemed to completely lock up the system, but I could still
ping it
>> > and, after about 30 seconds, things came back to life. It was also
not
>> > instantaneous. Disc activity dropped to <1MB/s for a few
seconds before
>> > everything froze.
>> >
>> > During the untar of firefox, I saw; this several times. I also
looked at my
>> > console where I found these errors during :
>> > swap_pager: indefinite wait buffer: bufobj: 0, blkno: 55043, size:
8192
>> > swap_pager: indefinite wait buffer: bufobj: 0, blkno: 51572, size:
4096
>>
>> Does anyone know:
>> Are those messages normal "reading is taking a rather long
>> time" notices or is their presence more useful information
>> in some way about the type of problem or context for the
>> problem?
>>
> As for the tests:
> Are these messages always present when near a time frame
> when the problem occurs? Never present in a near time
> frame to a period when the problem does not occur?
> In a large number of test, these errors have not repeated. They baffle me
for another reason. This system has 20G or RAM. Tyically, all swap space is
unused. ATM I see 16384M free out of 16384. Not sure that I have ever seen it
used, though it might have been while building rust. I have not built rust for a
month.
>
> It appears that the messages are associated with reading
> the disk(s), not directly with writing them, where the
> reads take more than "hz * 20" time units to complete.
> (I'm looking at main (14) code.) What might contribute
> to the time taken for the pending read(s)?
> The reference to hz * 20 woke up a few sleeping memory cells. I forgot that
I cleaned up my loader.conf. It was largely a copy of the one on my decade-old
T520. I commented out "kern.hz=100". I don't recall the details,
but I think it was actually from an even older system, my T42 from before I
retired.
>
> In any case, restoring this setting has greatly improved the situation. I
now have really bad disk I/O performance on large disk to disk activity
(untarring the firefox distro) instead of terrible performance and the system
freezes have vanished, though I do see pauses in response to clicks or text
entry, but the display remains active and the pauses are short... 1 to 15
seconds, I'd guess. No, I have no idea what this indicates.
Interesting.
> I'm still not seeing the performance I was seeing back in February when
40 MB/s for extended intervals was common and I once untarred firefox.tar.gz2 in
under a minute and performance seldom dropped below 1.4 MB/s.
>
>>> /*
>>> * swap_pager_getpages() - bring pages in from swap
>>> *
>>> * Attempt to page in the pages in array "ma" of
length "count". The
>>> * caller may optionally specify that additional pages
preceding and
>>> * succeeding the specified range be paged in. The number of
such pages
>>> * is returned in the "rbehind" and
"rahead" parameters, and they will
>>> * be in the inactive queue upon return.
>>> *
>>> * The pages in "ma" must be busied and will remain
busied upon return.
>>> */
>>> static int
>>> swap_pager_getpages_locked(vm_object_t object, vm_page_t *ma, int
count,
>>> int *rbehind, int *rahead)
>>> {
>>> . . .
>>> /*
>>> * Wait for the pages we want to complete. VPO_SWAPINPROG
is always
>>> * cleared on completion. If an I/O error occurs,
SWAPBLK_NONE
>>> * is set in the metadata for each page in the request.
>>> */
>>> VM_OBJECT_WLOCK(object);
>>> /* This could be implemented more efficiently with aflags
*/
>>> while ((ma[0]->oflags & VPO_SWAPINPROG) != 0) {
>>> ma[0]->oflags |= VPO_SWAPSLEEP;
>>> VM_CNT_INC(v_intrans);
>>> if (VM_OBJECT_SLEEP(object, &object->handle,
PSWP,
>>> "swread", hz * 20)) {
>>> printf(
>>> "swap_pager: indefinite wait buffer: bufobj: %p, blkno: %jd,
size: %ld\n",
>>> bp->b_bufobj,
(intmax_t)bp->b_blkno, bp->b_bcount);
>>> }
>>> }
>>> VM_OBJECT_WUNLOCK(object);
>>> . . .
>>>
>>> where:
>>>
>>> #define VM_OBJECT_SLEEP(object, wchan, pri, wmesg, timo)
\
>>> rw_sleep((wchan), &(object)->lock, (pri), (wmesg),
(timo))
>>>
>>> and:
>>>
>>> #define rw_sleep(chan, rw, pri, wmesg, timo)
\
>>> _sleep((chan), &(rw)->lock_object, (pri), (wmesg),
\
>>> tick_sbt * (timo), 0, C_HARDCLOCK)
>>>
>>> (I do not claim to be able to interpret the implications
>>> of the code that leads to the messages. But seeing some
>>> of the code might prompt a thought by someone that knows
>>> the code's context and operation.)
>>>
>>> > . . .
>>> > Backing off to Mar. 4 was not an improvement. My untar did
seem better for a couple of minutes, but then the display froze again for 30
seconds and disk performance dropped to <1M.
>>
>> You were able to see the disk performance drop while
>> the display was frozen?
>>
>
> No, but it refreshed the display when it un-froze and the refreshed display
showed a one-minute history that showed that data was still transferring during
the pause.
>
>> It might not be the best for monitoring but I'll ask
>> this in terms of top output: Does Inact, Laundry,
>> Wired, Free, or other such show anything fairly unique
>> for around the problematical time frame(s)?
>
> These all look pretty normal. Free memory stays at 13G throughout hte
operatioin.
>
>>> > then things got really bad and behaved in a manner that was
baffling to me. The screen froze again, but stayed frozen after half a minute. I
clicked on a couple of buttons in Firefox to no effect and then hit ctrl-q to
quit. After the long pause, I pressed the power button to try to force a
shutdown. Suddenly, it started unwinding everything I had done during the
freeze. My browser did the updates from my mouse clicks including quitting. It
then switched to a different workspace from ctrl-alt-right and did a clean
shutdown. ????
>>> >
>>> > Do I also have a graphics issue? Examining log files show no
indication that anything was happening. SMART shows no errors and reasonable
values for everything. No indication of a HW problem. The system performs well
unless I do something that tries a bulk disk data move. Building world takes
about 75 minutes. I just have a very hard time building big ports.
>>
>> Almost like things were stuck-sleeping and then the
>> sleep(s) finished?
> Exactly!
Multi-socket (and possibly multi-core) PowerMacs have
not historically had the times used for controlling
sleeping that can be used across FreeBSD's cpus well
matched in any FreeBSD build without extra patches.
They suffer threads being stuck-sleeping for periods
not intended. This leads to fans running wild and
obvious problems during shutdown having timeouts,
though there are more consequences than are so
obvious.
That is where I got the idea for the question: some
similarity to operational problems that I've seen
when not using my patches that provide a work around
matching the times better in my contexts.
(I'm told the type of issue is not limited to PowerMacs,
but PowerMacs are the only PowerPCish machines I've
had access to. Doing the most accurate time matching
gets into platform specific operations, no general
solution for such accuracy. Similarly, only platform
specifics might scale to lots of sockets/cores well,
even without trying to be as well matched. My workaround
is generic to the range of PowerMacs that I've had
access to but is not as accurate about matching the
times.)
For your context: how many sockets? Cores per socket?
Any other information that might be relevant to
matching times across sockets/cores? I suppose that
the board matters, not just the processor(s) in the
sockets. But what all would be appropriate information?
I do not know.
I'm not sure if the kern.hz=100 results fit with this
idea or not. (Such was never involved in my PowerMac
experiments.)
It is only somewhat suggestive evidence as stands. But
time mismatches across socket/cores might be a
direction for investigation? (Not that I've a great
idea for how to investigate such.)
==Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)