On Fri, 2021-07-23 at 13:56 -0600, Chris Murphy wrote:
On Fri, Jul 23, 2021 at 12:48 PM Patrick O'Callaghan
<pocallaghan(a)gmail.com> wrote:
>
> On Fri, 2021-07-23 at 21:09 +0300, jarmo wrote:
> > Something went wrong. I got an update to a new kernel,
> > kernel-5.13.4-200.fc34.x86_64.
> >
> > Now my HP laptop won't start. I can type into the login window
> > when it asks for the password, but after that everything hangs.
> > With kernel 5.12.15-300.fc34.x86_64 it works. I have the system in
> > legacy mode, no LVM, no UEFI.
> > The only partitions are swap and /; the FS is EXT4.
> > On my wife's Toshiba Satellite, I disabled /dev/zram0, because from
> > boot.log I could see that it tried to create swap on /dev/zram0.
> > That Toshiba also has only swap and / partitions, EXT4, no LVM,
> > no UEFI.
> > After disabling it, the machine boots, but it takes a terribly long
> > time: the "Fedora circle" stops, and XFCE4 takes a long time to
> > start.
>
> I have that kernel and had to hard-reset my system after the load
> average went to over 30, apparently due to a BTRFS cache flush
> process. May or may not be related. It has never happened before.
If you can find the boot this happened in and file a bug against the
kernel, that would be great.

This can help find the boot it happened with:

journalctl --list-boots

I also use this method to iterate through boots, with the kernel
version appearing in the first line:

journalctl -b-1
journalctl -b-2

and so on, going back one boot each time.
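That iteration can be sketched as a small loop; this is only a sketch, assuming bash and a systemd journal are available, and `kernel_of_boot` is a hypothetical helper name, not an existing tool:

```shell
#!/usr/bin/env bash
# Sketch: print the kernel release each recent boot ran, so the boot
# with the problem kernel can be spotted quickly.

kernel_of_boot() {
  # $1 = boot offset (1 = previous boot, 2 = two boots back, ...).
  # The first kernel message of a boot contains "Linux version <release>".
  journalctl -b "-$1" -k -o cat 2>/dev/null |
    sed -n 's/.*Linux version \([^ ]*\).*/\1/p' | head -n1
}

for i in 1 2 3; do
  echo "boot -$i: $(kernel_of_boot "$i")"
done
```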
For the log to attach, you can filter just for kernel messages with:

journalctl -b-3 -k -o short-monotonic --no-hostname > dmesg.txt

So that's the 3rd boot back, kernel messages only, monotonic
timestamps, and no hostname, redirected to a file that you can attach
to the bug.
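Before attaching the file, it can be worth confirming it actually captured the suspicious messages; a minimal check, assuming the dmesg.txt file name from the command above:

```shell
# Quick sanity check on the captured log: list the lines that matter
# for the bug report (oops traces, bad-page and rss-counter warnings),
# with their line numbers.
grep -nE 'Call Trace|bad_page|Bad rss-counter|BUG:' dmesg.txt
```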
Before posting to BZ, I noticed this at the end of dmesg.txt (captured
as you described):
[104791.682109] kernel: CPU: 7 PID: 150724 Comm: ThreadPoolSingl Tainted: P B OE 5.13.4-200.fc34.x86_64 #1
[104791.682111] kernel: Hardware name: MSI MS-7808/B75MA-E33 (MS-7808), BIOS V1.7 09/30/2013
[104791.682112] kernel: Call Trace:
[104791.682113] kernel: dump_stack+0x76/0x94
[104791.682116] kernel: bad_page.cold+0x9b/0xa0
[104791.682118] kernel: free_pcp_prepare+0x185/0x1c0
[104791.682121] kernel: free_unref_page_list+0xc1/0x1e0
[104791.682124] kernel: release_pages+0xe0/0x520
[104791.682129] kernel: tlb_flush_mmu+0x4b/0x150
[104791.682132] kernel: unmap_page_range+0xa5b/0xe30
[104791.682137] kernel: unmap_vmas+0x6a/0xd0
[104791.682140] kernel: exit_mmap+0x8e/0x1a0
[104791.682143] kernel: mmput+0x61/0x140
[104791.682146] kernel: begin_new_exec+0x4be/0xa30
[104791.682150] kernel: load_elf_binary+0x6f1/0x1640
[104791.682152] kernel: ? __kernel_read+0x18f/0x2b0
[104791.682154] kernel: ? __kernel_read+0x18f/0x2b0
[104791.682156] kernel: bprm_execve+0x26a/0x630
[104791.682159] kernel: do_execveat_common.isra.0+0x184/0x1c0
[104791.682162] kernel: __x64_sys_execve+0x33/0x40
[104791.682165] kernel: do_syscall_64+0x40/0x80
[104791.682167] kernel: entry_SYSCALL_64_after_hwframe+0x44/0xae
[104791.682170] kernel: RIP: 0033:0x7fc32667e04b
[104791.682173] kernel: Code: Unable to access opcode bytes at RIP 0x7fc32667e021.
[104791.682174] kernel: RSP: 002b:00007fc30d3e4be8 EFLAGS: 00000206 ORIG_RAX: 000000000000003b
[104791.682176] kernel: RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007fc32667e04b
[104791.682177] kernel: RDX: 000006cc0060cfc0 RSI: 000006cc0b87b2c0 RDI: 000006cc02226700
[104791.682178] kernel: RBP: 00007fc30d3e4c50 R08: 0000000000000000 R09: 00007fc30d3e64d0
[104791.682179] kernel: R10: 0000000000000000 R11: 0000000000000206 R12: 000006cc0b87b2c0
[104791.682180] kernel: R13: 000006cc0060cfc0 R14: 000006cc02226700 R15: 000006cc0b556720
[104791.697740] kernel: BUG: Bad rss-counter state mm:00000000625f9e74 type:MM_FILEPAGES val:-1
[104791.697756] kernel: BUG: Bad rss-counter state mm:00000000625f9e74 type:MM_ANONPAGES val:1
Do you think that indicates a hardware problem?
Even better is if you can reproduce the high load and try to capture
one or more of the following: sysrq+l, which will dump the result into
dmesg and end up in the journal, which you can then filter similarly:

journalctl -b -k -o short-monotonic --no-hostname > dmesg-cpustack.txt
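For reference, sysrq+l can also be triggered without a keyboard by writing to /proc/sysrq-trigger. A sketch, assuming root and that sysrq is enabled (kernel.sysrq=1); the file-path parameter is only there so the function can be exercised without root:

```shell
# Sketch: dump all active CPUs' backtraces into the kernel log.
# On a real system this needs root, and sysrq functions must be
# enabled first (e.g. sysctl kernel.sysrq=1).
trigger_sysrq_l() {
  # $1: trigger file; defaults to the real one. The parameter exists
  # only so this sketch can be tried against a scratch file.
  target="${1:-/proc/sysrq-trigger}"
  echo l > "$target"   # 'l' = show backtrace of all active CPUs
}
```

On a real system that amounts to `sudo sh -c 'echo l > /proc/sysrq-trigger'`, after which the backtraces show up in dmesg and the journal.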
I've no idea what caused the high load. I wasn't doing anything out of
the ordinary.
poc