Hi,
I just installed a current kernel (6.1.0-0.rc1.20221018gitbb1a1146467a.16). This kernel will not boot. It appears to hang near the execution of dracut-pre-mount processing. This process seems to be looking for a disk by uuid but it NEVER completes. I let it go for over an hour but the system was still effectively hung.
My fallback kernel is the release 5.18.0-0.rc7.54.fc37.x86_64. This situation happened during early releases of this kernel series but then it just started working.
I have no ability (that I know of) to get a log of the boot attempt.
Is this a known problem? Is there a workaround?
Please help.
Best regards,
George...
On Fri, 21 Oct 2022 21:16:48 +0000 (UTC) George R Goffe via test test@lists.fedoraproject.org wrote:
> Hi,
> I just installed a current kernel (6.1.0-0.rc1.20221018gitbb1a1146467a.16). This kernel will not boot. It appears to hang near the execution of dracut-pre-mount processing. This process seems to be looking for a disk by uuid but it NEVER completes. I let it go for over an hour but the system was still effectively hung.
> My fallback kernel is the release 5.18.0-0.rc7.54.fc37.x86_64. This situation happened during early releases of this kernel series but then it just started working.
> I have no ability (that I know of) to get a log of the boot attempt.
> Is this a known problem? Is there a workaround?
> Please help.
I don't know if it is related, but I had, and continue to have, a problem on F37 with the 6.0 kernels when I build them locally. The problem is that dracut doesn't put all the libraries needed by/for systemd into the initramfs when I install the kernel. This also happens with Fedora stock kernels. I had to write a parse program that scans the missing libraries and finds their latest installed version and then creates a dracut.conf that forces them to be included when I run dracut manually. I have not built a 6.1 kernel yet, as I usually wait for a few iterations (rc3 or so) to let things settle from all the changes, and create a rescue kernel from the newly released kernel.
You can immediately tell if this is your issue by going into /boot and doing an ls -n. The initramfs for the failing kernel will be about half the size of that for the successful kernel, because of all the missing libraries.
I could find no reason for this to happen. Everything was fully up to date, and the default dracut instructions to build the initramfs seemed to be the same as when it worked. They directed that the libraries be installed; it just wasn't happening. So, since it wasn't a general problem (no one else was hitting it), I assumed it was some kind of corner case hitting my system, and created the workaround I describe above.
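A minimal sketch of the kind of workaround described above: turning a list of library paths into a dracut configuration line. It relies on the install_items option from dracut.conf(5). The function name, the example library paths, and the output file name are hypothetical, and this is not the actual program used in the thread.

```shell
# Sketch: build a dracut.conf line that forces extra files into the
# initramfs. install_items is a dracut.conf(5) option; the function
# name is an assumption made for illustration.
dracut_conf_for() {
    # Reads absolute paths on stdin, one per line, and prints a single
    # install_items+=" ... " line. dracut expects the value to start
    # and end with a space, which the format below provides.
    paths=$(tr '\n' ' ')
    printf 'install_items+=" %s"\n' "$paths"
}

# Usage (hypothetical file names):
#   printf '%s\n' /usr/lib64/libfoo.so.1 /usr/lib64/libbar.so.2 \
#       | dracut_conf_for > /etc/dracut.conf.d/99-force-libs.conf
```

The list of paths could come from ldd, for example by piping the output of ldd on one of the systemd libraries through awk '$2 == "=>" && $3 ~ /^\// {print $3}' before handing it to the function.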
On Sat, 22 Oct 2022 09:06:03 -0700 stan via test test@lists.fedoraproject.org wrote:
Duh, forgot to mention that the symptoms were identical to those you are seeing. It seemed to stop at exactly the moment when systemd was being fired up, and it seemed to happen because it was unable to find the root filesystem. I, too, found no way to capture the early messages. I tried the dracut debug instructions, which are supposed to bring up a terminal with the errors logged, but they had no effect. I think this problem slips between the cracks. dracut isn't failing; it is doing exactly what it is supposed to do. It is just that when the system is transitioning from dracut to systemd, the root filesystem isn't found because of the missing libraries. Once I add them manually to the initramfs, it works just like it should.
If this seems to be your problem, you can use the program lsinitrd to examine the initramfs (it is usually compressed). Pipe it to less and check that libsystemd, libsystemd-core, and libsystemd-shared are there.
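The check described above can be sketched as a small filter over lsinitrd output. The function name is an assumption, and the matching rule (a library name followed by ".so" or by a version suffix) is a guess at how these files are usually named, so treat it as a starting point rather than a definitive test.

```shell
# Sketch: report which of the three libraries named above are absent
# from an initramfs listing. The function name is hypothetical.
missing_systemd_libs() {
    # Reads a file listing (for example lsinitrd output) on stdin and
    # prints each expected library name that never appears in it.
    listing=$(cat)
    for lib in libsystemd libsystemd-core libsystemd-shared; do
        # Require ".so" or "-<digit>" right after the name so that
        # plain libsystemd does not also match libsystemd-core or
        # libsystemd-shared.
        if ! printf '%s\n' "$listing" | grep -Eq "/${lib}(\.so|-[0-9])"; then
            printf '%s\n' "$lib"
        fi
    done
}

# Usage (as root, since the images in /boot are readable by root only):
#   lsinitrd "/boot/initramfs-$(uname -r).img" | missing_systemd_libs
```

No output means all three names were found in the listing.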
Stan,
Thanks for responding to this request for help.
Here's my "ls -n" results. The 5.18 kernels all work and the 6.1 kernel fails. The size doesn't necessarily mean that there aren't libraries missing though.
The smaller 5.18 initramfs file was created by my manual invocation of dracut to build it but it STILL BOOTS.
As I stated, my "hang" is in dracut-pre-mount hooks processing. I don't know enough about dracut to troubleshoot this any further. I'm eager to learn though. Pointers to docs would be helpful. Is there an ldd lookalike for initramfs? I could unpack the working and the failing initramfs files and compare the libraries in both. This might give a clue as to what's going on.
Around the point of this failure, it looks like something is writing a script... just a few lines, but zfs is referenced.
I don't have this problem with a F38 VM (VirtualBox). Also, I do NOT see any strange messages during system upgrade although dnf apparently runs something that uses ALL memory or somehow triggers an OOM kill event that kills the window manager and logs me off the system.
I have only one system here... sigh... Is there no other way to get a console that one could copy the output into a regular file?
Any thoughts or suggestions would be appreciated.
Regards,
George...
-rw-------. 1 0 0  90418168 Jul 10  2021 initramfs-0-rescue-e2edb6843ecc4ced9cb7a03f8590963d.img
-rw-------  1 0 0  38459141 May 17 15:58 initramfs-5.18.0-0.rc7.54.fc37.x86_64.img
-rw-------  1 0 0 105695183 Jun 13 02:37 initramfs-5.18.0-0.rc7.54.fc37.x86_64.img.dist
-rw-------  1 0 0 114688965 Oct 20 18:36 initramfs-6.1.0-0.rc1.20221018gitbb1a1146467a.16.fc38.x86_64.img
On Sat, 22 Oct 2022 21:51:07 +0000 (UTC) George R Goffe via test test@lists.fedoraproject.org wrote:
> Stan,
> Thanks for responding to this request for help.
> Here's my "ls -n" results. The 5.18 kernels all work and the 6.1 kernel fails. The size doesn't necessarily mean that there aren't libraries missing though.
> The smaller 5.18 initramfs file was created by my manual invocation of dracut to build it but it STILL BOOTS.
I suspect that the compression algorithm for the system created initramfs is different than the compression algorithm of the one you created manually. For what it is worth, I turned off compression of the initramfs to remove a variable during troubleshooting. And left it off as it only saves a few megs of drive space. Once, that was important, but in the days of terabytes, well, I don't think that is so important. Maybe for a raspberry pi? Or android? Internet of things?
> As I stated, my "hang" is in dracut-pre-mount hooks processing. I don't know enough about dracut to troubleshoot this any further. I'm eager to learn though. Pointers to docs would be helpful. Is there an ldd lookalike for initramfs? I could unpack the working and the failing initramfs files and compare the libraries in both. This might give a clue as to what's going on.
If you look at man dracut, there are instructions on how to have dracut put out debug information. As I said, that produced nothing in my case because there wasn't a dracut error, just an initramfs build error.
I'm not aware of an ldd for the initramfs. What I did is run ldd on libsystemd, libsystemd-core, and libsystemd-shared to find what libraries were essential in order for systemd to start. Once systemd starts, it has the installed libraries of the system to use, and so it doesn't need anything else from the initramfs.
If you want a quick and dirty method, just run a grep as follows on the two initramfs you are comparing to create two files, and then edit them side by side. Should be all on one line, email client wrapped it.
lsinitrd initramfs[specification] | grep -e 'lib' > initramfs[specification]_libs.txt
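As a rough alternative to editing the two files side by side, the comparison can be sketched with comm. This assumes the two listings were saved from lsinitrd beforehand. The function names and file names are hypothetical.

```shell
# Sketch: list shared-library names present in one initramfs listing
# but absent from another. Function names are hypothetical.
lib_names() {
    # Keep only the last path component of entries that end in .so or
    # .so.N (any number of numeric suffixes), one per line, sorted.
    grep -Eo '[^/ ]+\.so(\.[0-9]+)*$' "$1" | sort -u
}

libs_only_in_first() {
    # $1 and $2 are text files holding lsinitrd output for two images.
    a=$(mktemp) && b=$(mktemp) || return 1
    lib_names "$1" > "$a"
    lib_names "$2" > "$b"
    comm -23 "$a" "$b"    # lines that appear only in the first list
    rm -f "$a" "$b"
}

# Usage (as root; good.txt and bad.txt are hypothetical names):
#   lsinitrd /boot/initramfs-<working version>.img > good.txt
#   lsinitrd /boot/initramfs-<failing version>.img > bad.txt
#   libs_only_in_first good.txt bad.txt
```

Libraries whose file names embed a version number will show up as differences between kernels built against different package versions, so the output still needs reading by eye.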
> Around the point of this failure, it looks like something is writing a script... just a few lines, but zfs is referenced.
If you get debug output from dracut, that might illuminate the cause of that.
> I don't have this problem with a F38 VM (VirtualBox). Also, I do NOT see any strange messages during system upgrade although dnf apparently runs something that uses ALL memory or somehow triggers an OOM kill event that kills the window manager and logs me off the system.
If you are using F38, I wonder whether dnf5 is involved. I've been seeing that dnf5, the new version of dnf written in C++(?), might be there. So it could be that you are hitting a bug.
> I have only one system here... sigh... Is there no other way to get a console that one could copy the output into a regular file?
The debug instructions for dracut should accomplish this.
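One way to try those debug instructions is to add the rd.debug and rd.shell parameters, both described in dracut.cmdline(7), to the boot entry of the failing kernel. The sketch below uses grubby for that. The function name and the DRY_RUN switch are assumptions for illustration, not part of any tool. If the boot then drops to dracut's emergency shell, the report that dracut normally writes to /run/initramfs/rdsosreport.txt can be copied to /boot or a USB stick from there. Removing "quiet" from the command line also makes the boot messages visible.

```shell
# Sketch: add dracut's debugging parameters to one kernel's boot entry.
# The function name and DRY_RUN are hypothetical helpers.
enable_dracut_debug() {
    # $1 is the path to the kernel image, e.g. /boot/vmlinuz-<version>.
    set -- grubby --update-kernel="$1" --args="rd.debug rd.shell"
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "$*"    # print the command instead of running it
    else
        "$@"                  # needs root
    fi
}

# Usage:
#   DRY_RUN=1; enable_dracut_debug /boot/vmlinuz-<version>
```

The same grubby call with --remove-args in place of --args should undo the change once the problem is found.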
> Any thoughts or suggestions would be appreciated.
It is for this reason that I always have two versions of Fedora installed: the current version and the previous one. When things like this happen, I have recourse to the previous version to debug the issue, and not lose access to a working system for other things.
Stan,
I found the cause of my particular problem.
A few days ago I reran the "mkswap /dev/sda3" command... which, apparently, regenerated the UUID of that device. I did not put the new UUID into the GRUB defaults file (/etc/default/grub, the input to grub2-mkconfig), which has a line with this content:
GRUB_CMDLINE_LINUX="resume=UUID=28cc21c5-3545-43c3-821a-653244b9493a nomodeset quiet net.ifnames=0 biosdevname=0 kernel.task_delayacct=1"
Apparently dracut et al. don't handle this situation very well, or the message gets lost in the "clutter" the boot process generates.
I fixed the file and reran the "grub2-mkconfig -o /boot/grub2/grub.cfg" command... then booted the system.
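A stale resume UUID like this one can be spotted from a running system before rebooting. The sketch below pulls the resume=UUID= value out of the kernel command line and checks it against /dev/disk/by-uuid. The function names are hypothetical, and it assumes the resume device is given in the UUID= form shown above.

```shell
# Sketch: check that the resume=UUID=... value on the kernel command
# line matches an existing block device. Function names are hypothetical.
resume_uuid() {
    # $1 is a kernel command line; prints the UUID, or nothing if the
    # parameter is absent. The unquoted $1 is split into words on purpose.
    for word in $1; do
        case $word in
            resume=UUID=*) printf '%s\n' "${word#resume=UUID=}" ;;
        esac
    done
}

check_resume_uuid() {
    # Checks the command line of the currently running kernel.
    uuid=$(resume_uuid "$(cat /proc/cmdline)")
    if [ -z "$uuid" ]; then
        echo "no resume=UUID= on the command line"
    elif [ -e "/dev/disk/by-uuid/$uuid" ]; then
        echo "resume device found: $uuid"
    else
        echo "resume UUID $uuid does not match any device"
        return 1
    fi
}
```

Running check_resume_uuid after any mkswap would flag the mismatch, since mkswap writes a new UUID unless one is passed with its -U option.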
Poof! It worked.
It's strange to note that the 5.18 system booted just fine... even with this error condition present... apparently no strange messages or behavior either...
It seems to me that this kind of error should get a lot more attention than it does currently.
I appreciate your help and your time so THANKS for that.
Best regards and STAY SAFE!
George...
On Sunday, October 23, 2022 at 08:01:24 AM PDT, stan upaitag@zoho.com wrote: