Gentoo Forums
Kernel crash after initial reboot after install
evull
n00b

Joined: 15 Nov 2019
Posts: 4

Posted: Fri Nov 15, 2019 3:06 am    Post subject: Kernel crash after initial reboot after install

Greetings,

I have run into some trouble that has me a bit stumped while trying to get Gentoo installed on a new system. It seems like it might be AMD Vega10 related, but I did follow the AMDGPU guide in the wiki, so I am not sure whether that is a red herring. I am only guessing, since that is what was in the output right before the reboot.

The setup itself is not too complex. It has LVM but does not have LUKS. No initramfs. Below are all the particulars. It should be noted that the 0.713056 line in the screenshot is the last line shown and then the machine immediately reboots.

The kernel version is 4.19.82 and the linux-firmware version is 20191108.

Any suggestions would be greatly appreciated.

screenshot at time of panic
dmesg from LiveCD
lspci from LiveCD
kernel config

Code:

Filesystem              Type      Size  Used Avail Use% Mounted on
/dev/mapper/system-root ext4       30G  3.3G   25G  12% /
/dev/mapper/system-var  ext4       40G  1.3G   36G   4% /var
/dev/mapper/system-home ext4      295G   65M  295G   1% /home

jburns
Veteran

Joined: 18 Jan 2007
Posts: 1079
Location: Massachusetts USA

Posted: Fri Nov 15, 2019 6:40 am

Try using a 5.3 kernel

evull
n00b

Joined: 15 Nov 2019
Posts: 4

Posted: Fri Nov 15, 2019 3:45 pm

A reasonable suggestion since a lot of this hardware is pretty new. I gave that a try. I just did make oldconfig against the same config pasted above. The only driver config change related to my system is that 5.3 seems to have direct support for my Realtek 8822BE WiFi device so I enabled that. Compiled, installed, rebooted and it is still not working... but it's not working in a different way this time which is potentially good. Instead of panicking and rebooting immediately after the FB line like before, it's getting past that point and instead freezing (without reboot) right after initializing HD-Audio components (see screenshot). I am fairly certain I have the correct HD Audio device configured though.

Screenshot

This is kernel 5.3.10.

NeddySeagoon
Administrator

Joined: 05 Jul 2003
Posts: 45383
Location: 56N 3W

Posted: Fri Nov 15, 2019 6:56 pm

evull,

evull wrote:
It has LVM but does not have LUKS. No initramfs.


You can't do that. You need the userspace tool vgchange to start your logical volumes before the kernel can mount /dev/mapper/system-root as its root filesystem.
That has to go into an initrd.
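
As a rough illustration of what that initrd has to do, here is a minimal sketch of the root-mounting steps in a hand-written /init script. The names mount_lvm_root and run, the DRY_RUN switch and the /mnt/root mount point are assumptions made up for this example; they are not something genkernel or dracut provide. It also assumes vgchange and mount are present inside the initramfs.

```shell
#!/bin/sh
# Sketch only: the root-mounting steps of a hand-written initramfs /init.
# Set DRY_RUN=1 to print the commands instead of running them.

run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

mount_lvm_root() {
    root_dev=$1    # e.g. /dev/mapper/system-root
    new_root=$2    # hypothetical mount point inside the initramfs

    # Kernel interfaces the LVM tools rely on.
    run mount -t proc none /proc
    run mount -t sysfs none /sys
    run mount -t devtmpfs none /dev

    # The userspace step the kernel cannot do on its own.
    run vgchange -ay

    # Only now does the root device node exist, so it can be mounted.
    run mount -o ro "$root_dev" "$new_root"
}
```

With DRY_RUN=1, calling mount_lvm_root /dev/mapper/system-root /mnt/root only prints the five commands. A real init would follow the final mount with exec switch_root /mnt/root /sbin/init.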

You have
Code:
CONFIG_PANIC_TIMEOUT=0
so the system should panic and hang, not reboot.
The reboot suggests that the kernel has been built for the wrong CPU and is getting an exception that it can't handle.

That seems unlikely given
Code:
# CONFIG_MK8 is not set
# CONFIG_MPSC is not set
# CONFIG_MCORE2 is not set
# CONFIG_MATOM is not set
CONFIG_GENERIC_CPU=y

_________________
Regards,

NeddySeagoon

Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.

evull
n00b

Joined: 15 Nov 2019
Posts: 4

Posted: Fri Nov 15, 2019 7:32 pm

Thanks for the response. Ah, when I was reading the GRUB2 docs it gave me the impression that it supported LVM. Which, well it does I suppose since /boot itself with the kernel is on /dev/mapper/system-root. But I guess that is separate to mounting it afterward?

Yes, I was not expecting the reboot either, and it was rather inconvenient doing camera gymnastics while trying to get a screenshot as a result. But that symptom has gone away after changing from kernel 4.19.82 to 5.3.10. Now it hangs as expected with the text still on the screen. I want to blame AMDGPU for it on 4.19.82 but that's just a gut feeling, no evidence.

Regarding the processor type in config, I suppose maybe I should have that set to "Opteron/Athlon64/Hammer/K8" since it matches AMD but none of those specific types are "Zen" so I left it generic. In my Portage make.conf I do have -march=znver1 set however.

NeddySeagoon
Administrator

Joined: 05 Jul 2003
Posts: 45383
Location: 56N 3W

Posted: Fri Nov 15, 2019 7:57 pm

evull,

The text on the screen just stopping makes me think the kernel has swapped consoles and the boot continues, or it panics somewhere else, because there is no panic message.

grub2 can read LVM. It has to make its own arrangements for that because the kernel filesystem tree does not exist until the kernel mounts root and the localmount service processes /etc/fstab.

grub2 loads the kernel and initrd, and that combination has to mount the root filesystem and run the init script.
The kernel alone cannot start logical volumes, nor read from raid sets.
Raid autodetect only works with raid Ver 0.9 superblocks. The default for a long time has been raid Ver 1.2 superblocks.

Have you added rootwait to your kernel command line?
That makes the kernel wait forever if root never appears. It would prevent the "can't mount root" panic message from appearing.
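
To check a command line for rootwait without eyeballing it, something like the following could be used. This is only a sketch: cmdline_has is a hypothetical helper name, and it matches whole space-separated words, so root_trim=yes is not mistaken for root=.

```shell
#!/bin/sh
# Succeed if the kernel command line in $1 contains the exact word $2.
cmdline_has() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *) return 1 ;;
    esac
}
```

On a system that does boot, cmdline_has "$(cat /proc/cmdline)" rootwait tells you whether the flag is active. For a system that does not boot, paste the arguments from the linux line of the boot entry as the first argument instead.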

evull
n00b

Joined: 15 Nov 2019
Posts: 4

Posted: Sat Nov 16, 2019 4:59 am

So I went ahead and built initramfs to see if that would fix it. Unfortunately, same result more or less. The ordering of the items at the bottom changed a little but still frozen.

Screenshot

Oh, I had already snapped that photo but one more line popped up on the screen afterwards. It says this:

Code:
[  332.760467] kworker/dying (7) used greatest stack depth: 13736 bytes left


I think that would rule out the "swapped consoles" theory. Also no, there is no rootwait on the kernel command line. Kernel command line is this:

Code:
linux /boot/kernel-genkernel-x86_64-5.3.10-gentoo-initial root=/dev/mapper/system-root ro root_trim=yes dolvm

NeddySeagoon
Administrator

Joined: 05 Jul 2003
Posts: 45383
Location: 56N 3W

Posted: Sat Nov 16, 2019 10:40 am

evull,

What command did you use to build the initrd?
LVM support is an optional extra.

There is still no sign of a kernel panic.

Code:
CONFIG_MSDOS_PARTITION=y
CONFIG_EFI_PARTITION=y

That's all you need on a PC.

Code:
CONFIG_BINFMT_SCRIPT=y

Good, or your init script won't run.

Code:
CONFIG_BLOCK=y


Code:
# CONFIG_IDE is not set
CONFIG_SCSI=y
CONFIG_SCSI_DMA=y
CONFIG_BLK_DEV_SD=y
CONFIG_BLK_DEV_SR=y
CONFIG_BLK_DEV_SR_VENDOR=y
CONFIG_CHR_DEV_SG=y

That's the top layer of the SCSI cake, correct.

Code:
CONFIG_ATA_VERBOSE_ERROR=y
is only logspam. Turn it off at your next kernel rebuild.

Code:
CONFIG_SATA_AHCI=y
takes care of your SATA hardware but ...
some designs need
Code:
# CONFIG_SATA_AHCI_PLATFORM is not set
on and some don't.
If root is on NVMe, it won't matter if your SATA doesn't work.

You can turn off
Code:
CONFIG_ATA_BMDMA=y
That's a menu full of low level HDD controllers you don't have.
It's harmless, just kernel bloat.

Code:
CONFIG_MD=y
CONFIG_BLK_DEV_MD=y
Good.
Code:
CONFIG_MD_AUTODETECT=y
is obsolete. Unless you take care when setting up mdadm raid, it won't work.
That's the raid superblock version I posted about earlier.

Code:
CONFIG_BLK_DEV_DM=y
Required for the lvm2 userspace tool.

Code:
CONFIG_DM_CRYPT=y
CONFIG_DM_MIRROR=y
CONFIG_DM_RAID=y
CONFIG_DM_ZERO=y
are only needed if you have a use for them.

Code:
CONFIG_NVME_CORE=y
CONFIG_BLK_DEV_NVME=y
looks good for NVMe users.

Oh ... You have initrd support too.
Code:
CONFIG_BLK_DEV_INITRD=y


How did you modify grub.cfg to load your initrd?
Are you sure that you are using that boot entry?
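
One way to answer both questions is to list every menuentry in grub.cfg and note whether it carries an initrd line. The following is a sketch under assumptions: grub_initrd_report is a made-up helper name, and it expects the usual layout where each menuentry line is followed by its own linux and initrd lines.

```shell
#!/bin/sh
# List each menuentry in the grub.cfg given as $1 and say whether an
# initrd line appears before the next menuentry (or the end of the file).
grub_initrd_report() {
    awk '
        /^[ \t]*menuentry[ \t]/ {
            if (title != "") print title ": " (has_initrd ? "initrd" : "NO initrd")
            title = $0
            sub(/^[ \t]*menuentry[ \t]+/, "", title)   # drop the keyword
            sub(/[ \t]*[{][ \t]*$/, "", title)         # drop the opening brace
            has_initrd = 0
            next
        }
        /^[ \t]*initrd[ \t]/ { has_initrd = 1 }
        END {
            if (title != "") print title ": " (has_initrd ? "initrd" : "NO initrd")
        }
    ' "$1"
}
```

Run it as grub_initrd_report /boot/grub/grub.cfg (the common default path, which may differ on your system). Any entry reported as "NO initrd" boots the kernel alone, whatever is sitting in /boot.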

That's all you need to get to a kernel panic because root is not found.

Having said all that, there is no sign that your kernel tries to mount root. It all looks good, it just doesn't work.
A kernel panic message would be a step forward. At least a panic with an initrd would drop you to the busybox shell, so you could poke about.

If you think the problem is the amdgpu driver, build a new kernel with CONFIG_DRM_AMDGPU turned off.

You have
Code:
CONFIG_FB_VESA=y
CONFIG_FB_EFI=y
CONFIG_FB_SIMPLE=y
so you should still have a console. Xorg won't run but that won't matter right now.
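
After a rebuild it can help to confirm what the .config really contains rather than trusting the menu. A small sketch, with hypothetical helper names config_state and config_report; the path /usr/src/linux/.config mentioned below is the customary Gentoo location and is an assumption here.

```shell
#!/bin/sh
# Print the value of option $2 in the kernel .config file $1,
# or "n" when the option is commented out or missing.
config_state() {
    value=$(grep "^$2=" "$1" | head -n 1 | cut -d= -f2-)
    printf '%s\n' "${value:-n}"
}

# Print OPTION=value for every option named after the file argument.
config_report() {
    file=$1
    shift
    for opt in "$@"; do
        printf '%s=%s\n' "$opt" "$(config_state "$file" "$opt")"
    done
}
```

For example, config_report /usr/src/linux/.config CONFIG_DRM_AMDGPU CONFIG_BLK_DEV_DM CONFIG_BLK_DEV_INITRD CONFIG_FB_EFI shows at a glance whether amdgpu is really off while device-mapper, initrd support and a fallback framebuffer are still on.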

OldTango
l33t

Joined: 21 Feb 2004
Posts: 630

Posted: Sat Nov 16, 2019 10:28 pm

evull wrote:
I want to blame AMDGPU for it on 4.19.82 but that's just a gut feeling, no evidence.

I doubt that combination is causing the problems. It's more likely a kernel config or GRUB issue. From your dmesg it appears that the console is handed off to the AMDGPU driver successfully. I can't help you as to whether it is configured correctly or whether all the proper firmware gets loaded; I use nvidia graphics cards. As Neddy says, if you think it is causing issues, turn it off and worry about it later, after you get a booting kernel. Also, as Neddy has said, your kernel is bloated. You have options set that your system can't or will never use.

I just completed a 4.19.82 kernel on a x399 chipset for the gen-1 AMD Threadripper and it booted on the first attempt.

evull wrote:
Regarding the processor type in config, I suppose maybe I should have that set to "Opteron/Athlon64/Hammer/K8" since it matches AMD but none of those specific types are "Zen" so I left it generic. In my Portage make.conf I do have -march=znver1 set however.
If you want to get support for your specific processor you would need to emerge the kernel sources with the "experimental" use flag set. That will get you a larger list of processors to pick from. Add something like
Code:
>=sys-kernel/gentoo-sources-4.16.5 experimental
to your "/etc/portage/package.use" file.

Best Tango..... :)