Gentoo Forums
Newer kernels fail to boot
rickj
Guru
Joined: 06 Feb 2003
Posts: 386
Location: Calgary, Alberta, Canada

Posted: Sat Apr 11, 2020 12:07 am    Post subject: Newer kernels fail to boot

On a Pentium 4 machine using gentoo-sources, kernels up to 5.1.10 boot via GRUB with no problems. Kernels 5.5.9 and 5.6.2 fail to boot: the boot process stops without any error message at a random point between 0.5 and 1.5 seconds in. I have tried compiling kernels with no new options, and with all security options disabled, without improvement. As far as I can see, the only change is that the compiler is now gcc-9.2.0. On several other machines, both x86 and amd64, I have had no trouble with updates.

Can anyone see what I am missing here?
Zucca
Veteran
Joined: 14 Jun 2007
Posts: 1775
Location: KUUSANKOSKI, Finland

Posted: Sat Apr 11, 2020 2:09 pm    Post subject: Re: Newer kernels fail to boot

rickj wrote:
The boot process stops without any error message at a random point between 0.5 and 1.5 seconds.
Has init been started at this point?
I have stumbled into a similar problem. I think it's a lack of entropy (in /dev/random), because the boot didn't hang if I hit some keys during boot.
Later I installed haveged and made it start as early as possible. The booting problems seem to have gone now... Fingers crossed...

As to why this suddenly happens: the kernel options CONFIG_RANDOM_TRUST_CPU and CONFIG_RANDOM_TRUST_BOOTLOADER were introduced at some point (at which?). If both are disabled, you only get entropy from your input devices (keyboard and mouse mainly), which is quite a slow way to generate entropy.

Now... Try to boot and hit alt and ctrl (they don't mess up the boot messages) rapidly during boot. ;) If you manage to boot successfully, then my theory is on more solid ground. ;)
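
To see how the two options are set in a particular kernel build, one could scan its .config. The following is only a minimal Python sketch and was not part of the original post; the helper name `random_trust_settings` and the sample config text are illustrative assumptions.

```python
# Hypothetical helper: report how the two entropy-trust options named
# above are set in the text of a kernel .config file.
OPTIONS = ("CONFIG_RANDOM_TRUST_CPU", "CONFIG_RANDOM_TRUST_BOOTLOADER")


def random_trust_settings(config_text):
    """Return {option: value}, where value is 'y', 'n' or 'absent'."""
    settings = {name: "absent" for name in OPTIONS}
    for line in config_text.splitlines():
        line = line.strip()
        for name in OPTIONS:
            if line == f"# {name} is not set":
                # Kconfig writes disabled options as a comment line.
                settings[name] = "n"
            elif line.startswith(name + "="):
                settings[name] = line.split("=", 1)[1]
    return settings


# Made-up sample text; on a real system the text would come from the
# .config of the kernel being tested.
sample = """\
CONFIG_RANDOM_TRUST_CPU=y
# CONFIG_RANDOM_TRUST_BOOTLOADER is not set
"""
print(random_trust_settings(sample))
```

An option reported as 'absent' would suggest that the kernel version in question does not offer it at all.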
_________________
..: Zucca :..

Code:
ERROR: '--failure' is not an option. Aborting...
DONAHUE
Watchman
Joined: 09 Dec 2006
Posts: 7602
Location: Goose Creek SC

Posted: Sat Apr 11, 2020 3:25 pm

As to when, per the Linux Kernel Driver DataBase:
https://cateee.net/lkddb/web-lkddb/RANDOM_TRUST_CPU.html says
option found in kernel versions 4.19-4.20, then reappeared in 5.4 until present
https://cateee.net/lkddb/web-lkddb/RANDOM_TRUST_BOOTLOADER.html says
option found in kernel versions 5.4 until present

Theory looking good.
_________________
Defund the FCC.
rickj
Guru
Joined: 06 Feb 2003
Posts: 386
Location: Calgary, Alberta, Canada

Posted: Sat Apr 11, 2020 4:25 pm

Unfortunately, it seems that no amount of key pressing during boot prevents the stall.

This happens well before init has started. At present the last lines on the screen are:
Code:
fuse init (API version 7.31)
*** VALIDATE fuseblk ***

which sounds like a process which may need some entropy.

I will try enabling the CONFIG_RANDOM_TRUST_* options and see if it helps.
rickj
Guru
Joined: 06 Feb 2003
Posts: 386
Location: Calgary, Alberta, Canada

Posted: Sat Apr 11, 2020 4:56 pm

With both CONFIG_RANDOM_TRUST_CPU and CONFIG_RANDOM_TRUST_BOOTLOADER set, the situation is unchanged. No amount of key pressing seems to help before or after setting these options.
rickj
Guru
Joined: 06 Feb 2003
Posts: 386
Location: Calgary, Alberta, Canada

Posted: Sat Apr 11, 2020 5:11 pm

I have installed haveged, but it does not help. It is hard to see how it could be started as early as 0.5 seconds into the boot.
rickj
Guru
Joined: 06 Feb 2003
Posts: 386
Location: Calgary, Alberta, Canada

Posted: Fri Apr 17, 2020 3:59 pm

On the basis of "if in doubt, thrash about", I tried removing all modules and module support from the kernel. This had absolutely no effect.
Memtest86+ runs 20 passes on the system with no errors.
Zucca
Veteran
Joined: 14 Jun 2007
Posts: 1775
Location: KUUSANKOSKI, Finland

Posted: Fri Apr 17, 2020 7:14 pm

I didn't quite get rid of the booting problems, so I downgraded to the latest stable gentoo-sources, which is 5.4.28.
It has been working ever since, with no boot problems or crashes.
One thing that did stop working, however, was sway's screensaver, swayidle. I doubt this is a kernel-related problem...

Have you tried a 5.4-series kernel or earlier?

Also, you might want to add debug to the kernel command line in the hope of getting more detailed error messages.
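
As a rough illustration of that suggestion, the Python sketch below appends debugging parameters to an existing kernel command line without duplicating any that are already there. Only `debug` comes from the post above; `ignore_loglevel` and `initcall_debug` are further standard kernel parameters added here as an assumption that they might help (the latter logs each initcall as it runs, which may show which one stalls). The helper name and the sample root device are made up.

```python
# Hypothetical helper: extend a kernel command line with debugging
# parameters, skipping any that are already present.
DEBUG_PARAMS = ("debug", "ignore_loglevel", "initcall_debug")


def add_debug_params(cmdline, extra=DEBUG_PARAMS):
    """Return cmdline with each parameter in extra appended at most once."""
    params = cmdline.split()
    for param in extra:
        if param not in params:
            params.append(param)
    return " ".join(params)


# "root=/dev/sda3" is a made-up example; substitute the real root device.
print(add_debug_params("root=/dev/sda3 debug"))
```

The resulting string would then be placed on the kernel line of the boot loader configuration.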
rickj
Guru
Joined: 06 Feb 2003
Posts: 386
Location: Calgary, Alberta, Canada

Posted: Sat Apr 18, 2020 2:41 am

I have now tried the 5.4.28 kernel, and it fails to boot. The failure is later, at a little over 1.5s:
Code:
PCI_DMA: Using software bounce buffering for IO (SWIOTLB)
software IO TLB: mapped [mem 0x32ac0000-0x36ac0000] (64MB)


Next I will try to get more detailed error messages.
rickj
Guru
Joined: 06 Feb 2003
Posts: 386
Location: Calgary, Alberta, Canada

Posted: Sat Apr 18, 2020 6:42 pm

I have also tried the 5.4.28 kernel, compiled with gcc-8.2.0. It still does not boot, although it fails in a different place.
rickj
Guru
Joined: 06 Feb 2003
Posts: 386
Location: Calgary, Alberta, Canada

Posted: Thu Apr 30, 2020 2:37 am

In the hope that the trouble was in the boot manager, GRUB, I just installed and configured LILO.

This had no effect: booting fails in the same way, so the problem is somehow in the kernel.
fturco
Veteran
Joined: 08 Dec 2010
Posts: 1051
Location: Italy

Posted: Thu Apr 30, 2020 9:08 am

@rickj: You may try to do a kernel git bisect: https://wiki.gentoo.org/wiki/Kernel_git-bisect
You just need to know the last good kernel version and the first bad one, and use the same config file.
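
The idea behind a bisect can be sketched as a binary search between the last good and the first bad version. The Python below is only an illustration over an ordered list of labels, not real commits: 5.1.10 (good) and 5.4.28 (bad) come from this thread, while the intermediate labels, the pretend culprit "5.3" and the name `first_bad` are invented for the example.

```python
# Sketch of the binary search that a bisect performs, over an ordered
# list of version labels instead of real commits.
def first_bad(versions, is_bad):
    """Assume versions[0] is good and versions[-1] is bad.

    Return (first bad version, list of versions that had to be tested).
    """
    good, bad = 0, len(versions) - 1
    tested = []
    while bad - good > 1:
        mid = (good + bad) // 2
        tested.append(versions[mid])
        if is_bad(versions[mid]):
            bad = mid   # the regression is at mid or earlier
        else:
            good = mid  # the regression is after mid
    return versions[bad], tested


# Hypothetical history; in practice is_bad means "build this version with
# the same config file, boot it, and see whether it stalls".
history = ["5.1.10", "5.2", "5.3", "5.4", "5.4.28"]
culprit, tested = first_bad(history, lambda v: history.index(v) >= 2)
print(culprit, tested)
```

Because the range is halved at every step, even the thousands of commits between two kernel releases need only about a dozen test boots.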
All times are GMT