Gentoo Forums
System stuck during boot for ~5 minutes hanging at random
chrisk2305
Tux's lil' helper


Joined: 05 Sep 2007
Posts: 110

Posted: Sun Apr 05, 2020 3:44 pm    Post subject: System stuck during boot for ~5 minutes hanging at random

Hi,

My system hangs during boot for quite some time; the kernel's "random: fast init" and "random: crng init" steps seem to stall:

Code:
[    3.104177] clocksource: Switched to clocksource tsc
[   54.950502] random: fast init done
[  149.704022] random: crng init done
[  423.106828] smartpqi 0000:0c:00.0:


I searched the forums and already emerged:

Code:

USE="jitterentropy" emerge sys-apps/rng-tools
emerge haveged


I enabled both services, but that did not do the trick (I guess because they are userspace tools and my issue happens at the very beginning of booting).
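One way to see whether the enabled services have any effect once userspace is up is to read the kernel's entropy estimate. The snippet below is only a minimal sketch: it assumes a Linux system that exposes /proc/sys/kernel/random/entropy_avail, and the function names are made up for illustration. It says nothing about the early-boot phase itself, since it can only run after userspace has started.

```python
from pathlib import Path

# Standard procfs location of the kernel's entropy estimate (assumed to exist on Linux).
ENTROPY_AVAIL = "/proc/sys/kernel/random/entropy_avail"


def parse_entropy(text):
    """Convert the file content (a single integer, in bits) to an int, or None if malformed."""
    try:
        return int(text.strip())
    except ValueError:
        return None


def read_entropy(path=ENTROPY_AVAIL):
    """Return the current entropy estimate in bits, or None if the file cannot be read."""
    try:
        return parse_entropy(Path(path).read_text())
    except OSError:
        return None


bits = read_entropy()
print("entropy_avail:", bits if bits is not None else "not readable on this system")
```

Running it once with haveged/rngd stopped and once with them started should show whether the daemons are feeding the pool at all.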

The kernel version I am using is 5.4.3.

Any help is much appreciated!

Chris
fedeliallalinea
Bodhisattva


Joined: 08 Mar 2003
Posts: 24067
Location: here

Posted: Sun Apr 05, 2020 4:06 pm

Have you also tried some of the other approaches proposed by toralf? (I use haveged.)
_________________
Questions are guaranteed in life; Answers aren't.
mike155
Advocate


Joined: 17 Sep 2010
Posts: 2126
Location: Frankfurt, Germany

Posted: Sun Apr 05, 2020 4:37 pm

Please install haveged AND enable it.
chrisk2305
Tux's lil' helper


Joined: 05 Sep 2007
Posts: 110

Posted: Mon Apr 06, 2020 5:06 am

Hi,

Please see my initial post. I installed and enabled haveged. Did not do the trick.
mike155
Advocate


Joined: 17 Sep 2010
Posts: 2126
Location: Frankfurt, Germany

Posted: Mon Apr 06, 2020 12:36 pm

Quote:
I installed and enabled haveged. Did not do the trick.

Sorry! I overlooked that.
  1. Please upgrade to the latest kernel of the 5.4 series (5.4.30) and retry.

  2. Please post the output of
    Code:
    emerge --info

  3. Please post the output of 'dmesg' directly after booting

  4. Please post your kernel config using wgetpaste.
chrisk2305
Tux's lil' helper


Joined: 05 Sep 2007
Posts: 110

Posted: Mon Apr 06, 2020 2:30 pm

1. done - no change

2.

https://pastebin.com/LuFrWWsW

3.

https://pastebin.com/VjpY6fWV

4.

http://dpaste.com/1QT2PES
mike155
Advocate


Joined: 17 Sep 2010
Posts: 2126
Location: Frankfurt, Germany

Posted: Mon Apr 06, 2020 3:54 pm

dmesg shows 2 issues, which are most probably independent of one another:
  1. it takes 150 seconds to initialize the random number generator:
    Code:
    [  149.762497] random: crng init done

  2. it takes 430 seconds to initialize the Adaptec SmartRAID driver:
    Code:
    [  434.259870] smartpqi 0000:0c:00.0: Online Firmware Activation enabled

The second issue is the one that hurts you. Let's try to solve that first.
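To spot stalls like these without reading the whole log by eye, the timestamps can be compared line by line. The following is a rough sketch, not part of the original diagnosis: the function name find_gaps and the 10-second threshold are arbitrary choices, and the sample lines are the ones quoted in the first post of this thread. It assumes the default dmesg format with a "[seconds.microseconds]" prefix.

```python
import re

# Matches the "[  149.704022] message" prefix that dmesg prints by default.
TIMESTAMP = re.compile(r"^\[\s*(\d+\.\d+)\]\s?(.*)$")


def find_gaps(lines, threshold=10.0):
    """Return (gap_seconds, previous_message, next_message) for every pause
    between consecutive timestamped lines that is longer than `threshold`."""
    gaps = []
    previous = None
    for line in lines:
        match = TIMESTAMP.match(line.strip())
        if not match:
            continue  # skip lines that carry no timestamp
        current = (float(match.group(1)), match.group(2))
        if previous is not None and current[0] - previous[0] > threshold:
            gaps.append((round(current[0] - previous[0], 3), previous[1], current[1]))
        previous = current
    return gaps


# Lines taken from the first post; feed the full dmesg output in practice.
sample = """\
[    3.104177] clocksource: Switched to clocksource tsc
[   54.950502] random: fast init done
[  149.704022] random: crng init done
""".splitlines()

for seconds, before, after in find_gaps(sample):
    print(f"{seconds:9.3f}s between '{before}' and '{after}'")
```

On a real system the input could come from a saved copy of the dmesg output, read with open(...).read().splitlines(). The message printed right after a gap only marks where the log resumed, so the culprit may be whatever ran silently in between.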
chrisk2305
Tux's lil' helper


Joined: 05 Sep 2007
Posts: 110

Posted: Mon Apr 06, 2020 4:24 pm

Hmm... you are probably right. At first I thought the Adaptec HBA was the issue, but then I concentrated on random (which also takes far too long :) )

In the meantime I updated the HBA's firmware to the latest version, but that did not change anything in the boot behaviour. Not sure what else I can try.
mike155
Advocate


Joined: 17 Sep 2010
Posts: 2126
Location: Frankfurt, Germany

Posted: Mon Apr 06, 2020 4:29 pm

Some of your kernel options look suspicious. AMD ACPI2Platform is disabled, NUMA is disabled, IOMMU is disabled, etc.

Please read https://wiki.gentoo.org/wiki/Ryzen and adjust your kernel settings. It might also be necessary to increase CONFIG_NR_CPUS to 32.
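A quick way to review such settings is to parse the kernel .config and list the symbols in question. This is only a sketch: the mapping from the option names above to CONFIG_X86_AMD_PLATFORM_DEVICE, CONFIG_NUMA and CONFIG_AMD_IOMMU is my assumption and should be checked against menuconfig, and the sample text is made up for illustration, not taken from the posted config.

```python
def parse_kconfig(text):
    """Parse kernel .config text into {symbol: value}; '# ... is not set' maps to 'n'."""
    options = {}
    suffix = " is not set"
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("# CONFIG_") and line.endswith(suffix):
            options[line[2:-len(suffix)]] = "n"
        elif line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            options[name] = value.strip('"')
    return options


# Symbols to review; these names are assumed equivalents of the options mentioned above.
WANTED = [
    "CONFIG_X86_AMD_PLATFORM_DEVICE",
    "CONFIG_NUMA",
    "CONFIG_AMD_IOMMU",
    "CONFIG_NR_CPUS",
]


def report(options, wanted=WANTED):
    """Return the value of each wanted symbol, or 'missing' if it does not appear at all."""
    return {name: options.get(name, "missing") for name in wanted}


# Hypothetical excerpt, for illustration only.
sample = """\
# CONFIG_NUMA is not set
# CONFIG_AMD_IOMMU is not set
CONFIG_NR_CPUS=16
"""

for name, value in report(parse_kconfig(sample)).items():
    print(f"{name} = {value}")
```

For a real check, pass the content of /usr/src/linux/.config (or the decompressed /proc/config.gz, if available) to parse_kconfig instead of the sample.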

EDIT (not related to your boot issues): don't use -O3 in your CFLAGS! -O3 is known to break several packages and it's not recommended.


Last edited by mike155 on Mon Apr 06, 2020 4:36 pm; edited 1 time in total
chrisk2305
Tux's lil' helper


Joined: 05 Sep 2007
Posts: 110

Posted: Mon Apr 06, 2020 4:34 pm

Are these options also necessary, as this is a virtual machine? As additional info: the Adaptec HBA is passed through to the VM.
mike155
Advocate


Joined: 17 Sep 2010
Posts: 2126
Location: Frankfurt, Germany

Posted: Mon Apr 06, 2020 4:52 pm

You're talking about a virtual machine? Really?

Your output of dmesg doesn't look like a virtual machine! 16 CPUs, multiple ethernet adapters, many scsi devices, ... What are you doing?

I'm afraid I can't help you.
chrisk2305
Tux's lil' helper


Joined: 05 Sep 2007
Posts: 110

Posted: Mon Apr 06, 2020 5:23 pm

Yes I am.

This is quite a special setup. I have multiple VMs running on that host (16C/32T Epyc2 CPU, 128GB RAM). I am passing through 2 NVMe disks and the Adaptec HBA; there is nothing more special than that.

Neither of those can explain my issues.
toralf
Developer


Joined: 01 Feb 2004
Posts: 3769
Location: Hamburg

Posted: Mon Apr 06, 2020 7:06 pm

chrisk2305 wrote:
Are these options also necessary, as this is a virtual machine?
Gah, and didn't you enable CONFIG_HW_RANDOM_VIRTIO, so that a hardware RNG can be passed to the virtual machine, to avoid hangs of virtual machines during boot?
chrisk2305
Tux's lil' helper


Joined: 05 Sep 2007
Posts: 110

Posted: Tue Apr 07, 2020 5:55 am

@toralf: The hypervisor used here is ESXi 6.7. As far as I know, VMware does not pass through a hardware random number generator.
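Whether the guest kernel sees any hardware RNG at all can be checked from inside the VM. The sketch below assumes the hw_random sysfs files rng_available and rng_current under /sys/class/misc/hw_random, as described in the kernel's hw_random documentation; the function name is made up, and on kernels built without CONFIG_HW_RANDOM the directory will simply be missing.

```python
from pathlib import Path

# sysfs directory of the kernel's hw_random core (assumed location).
HW_RANDOM_DIR = "/sys/class/misc/hw_random"


def hw_rng_status(base=HW_RANDOM_DIR):
    """Return (current, available): the active hardware RNG name and the list of
    registered ones. Missing or unreadable files yield '' and [] respectively."""
    def read(name):
        try:
            return (Path(base) / name).read_text().strip()
        except OSError:
            return ""

    return read("rng_current"), read("rng_available").split()


current, available = hw_rng_status()
print("current hardware RNG:", current or "none reported")
print("available hardware RNGs:", ", ".join(available) or "none reported")
```

An empty list here would be consistent with the hypervisor not exposing a hardware RNG device to the guest.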
toralf
Developer


Joined: 01 Feb 2004
Posts: 3769
Location: Hamburg

Posted: Tue Apr 07, 2020 7:17 am

chrisk2305 wrote:
@toralf: The hypervisor used here is ESXi 6.7. As far as I know, VMware does not pass through a hardware random number generator.
Understood - you should take a look at the LKML archive. The problem of virtual machines hanging because the RNG is not initialized was a hot topic a while ago - I'm pretty sure you'll find the answer there.
chrisk2305
Tux's lil' helper


Joined: 05 Sep 2007
Posts: 110

Posted: Wed Apr 08, 2020 7:31 am

All the threads I find on the internet suggest installing haveged and/or rng-tools. I also tried the "Trust CPU for rng" kernel option, but the issue persists. I am stuck here.
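As far as I understand it, trusting the CPU for the RNG can only help if the guest CPU actually exposes the RDRAND/RDSEED instructions, and a hypervisor may mask them. The following sketch checks the feature flags the guest reports; it assumes an x86 Linux guest with the usual "flags" line in /proc/cpuinfo, and the helper names are made up for illustration.

```python
def cpu_flags(cpuinfo_text):
    """Return the set of feature flags from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def read_cpuinfo(path="/proc/cpuinfo"):
    """Return the file content, or an empty string if it cannot be read."""
    try:
        with open(path) as handle:
            return handle.read()
    except OSError:
        return ""


flags = cpu_flags(read_cpuinfo())
for name in ("rdrand", "rdseed"):
    print(f"{name}: {'present' if name in flags else 'not reported'}")
```

If neither flag is reported inside the VM, the kernel option has no CPU source to draw from, which would match the behaviour described above.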