Gentoo Forums
My VPS won't boot with kernel >=4.15.0
Elleni
l33t
Joined: 23 May 2006
Posts: 858

Posted: Mon Feb 05, 2018 10:22 pm    Post subject: My VPS won't boot with kernel >=4.15.0

My virtual private server is set up with an encrypted partition, including boot. I used the following guide to set it up:
http://blog.guya.de/linux-gentoo-encrypted-boot-partition/

For more than a year, building updated kernels was no problem. I just copied my .config from the previous kernel and then continued with make menuconfig && make -j9 && make install && genkernel --luks initramfs && grub-mkconfig -o /boot/grub/grub.cfg

But for some reason, this no longer works with the 4.15.0 kernel. Very early in the boot process the VNC window closes, and on my hoster's web interface I can see that the server is offline. Does anyone have an idea what could have changed in kernel >=4.15.0 that would explain this behaviour? Could it be related to some new kernel options, e.g. retpoline or PTI, and/or is there a problem with the way cryptsetup with a key is set up according to the link above?

Here are two short videos of how it looks now and how it looked before (up to kernel 4.14.x). Currently I am using 4.9.76-gentoo-r1, the latest stable kernel.

Failed boot:
https://videobin.org/+po7/u7q.html

Booting ok:
https://videobin.org/+po8/u7r.html

I don't even know where to start troubleshooting this, as it happens so early in the boot process that I have no clue where I could find any useful error messages. Does anyone have a hint for where to start?

Elleni
l33t
Joined: 23 May 2006
Posts: 858

Posted: Wed Feb 07, 2018 8:34 pm

Please give me something to work with as a starting point to find the issue. I can't find anything in /var/log/messages, and I think this is because the error happens too early in the boot process. So how do I debug it?

NeddySeagoon
Administrator
Joined: 05 Jul 2003
Posts: 43183
Location: 56N 3W

Posted: Wed Feb 07, 2018 8:45 pm

Elleni,

Wild guess, as there is little to go on.
It's an Intel CPU and you are using a hardened profile.

If that's not true, we need to do some analysis, starting with your lspci output, your kernel .config (put that on a pastebin) and a description of how you migrate from one kernel to another.

Using an encrypted disk on a VPS is fairly pointless. It only protects against data theft when the system is off.
That makes it useful for portable equipment and less so for fixed systems.

-- edit --
"It's an Intel CPU and you are using a hardened profile" was a feature of 4.14.x.
It's fixed in 4.15.0.
_________________
Regards,

NeddySeagoon

Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.

grumblebear
Tux's lil' helper
Joined: 26 Feb 2008
Posts: 141

Posted: Wed Feb 07, 2018 8:49 pm

Just copying the old .config will rarely work. At least run "make oldconfig" before building the new kernel.

szatox
Veteran
Joined: 27 Aug 2013
Posts: 1746

Posted: Wed Feb 07, 2018 8:49 pm

That video is unreadable for me.
It seems to have loaded the kernel; perhaps you're missing some modules, either built in or in the initramfs. It's a shot in the dark; I just can't read anything besides the loading dots. Can you just boot the old kernel again and then try to rebuild the new one?
You could reuse the config from the working kernel: just copy it and repair it with oldconfig.
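
The copy-and-repair approach can be sketched as a small shell function. This is only a minimal sketch, not a script anyone in the thread actually uses: the function name carry_config and the MAKE override variable are made up for illustration, while "make oldconfig" is the standard kernel build target that asks about options the old .config does not mention.

```shell
# carry_config OLD_SRC NEW_SRC
# Copy .config from a kernel tree known to boot into a new tree, then let
# "make oldconfig" ask about every option the old file does not mention.
# (Hypothetical helper; MAKE can be overridden, e.g. MAKE=true for a dry run.)
carry_config() {
    old_src=$1
    new_src=$2
    if [ ! -f "$old_src/.config" ]; then
        echo "no .config found in $old_src" >&2
        return 1
    fi
    cp "$old_src/.config" "$new_src/.config" &&
        "${MAKE:-make}" -C "$new_src" oldconfig
}
```

A call might look like carry_config /usr/src/linux-4.14.20-gentoo /usr/src/linux-4.15.4-gentoo, where both paths are examples and depend on the installed sources.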

Elleni
l33t
Joined: 23 May 2006
Posts: 858

Posted: Wed Feb 07, 2018 10:16 pm

NeddySeagoon wrote:
Elleni,

Wild guess, as there is little to go on.
It's an Intel CPU and you are using a hardened profile.

If that's not true, we need to do some analysis, starting with your lspci output, your kernel .config (put that on a pastebin) and a description of how you migrate from one kernel to another.

Using an encrypted disk on a VPS is fairly pointless. It only protects against data theft when the system is off.
That makes it useful for portable equipment and less so for fixed systems.

-- edit --
"It's an Intel CPU and you are using a hardened profile" was a feature of 4.14.x.
It's fixed in 4.15.0.


Hello Neddy, yes, you are correct. It's a hardened profile and a virtual Intel CPU. But what does that mean? I build the kernel with:
Code:
make menuconfig && make -j9 && make install && genkernel --luks initramfs && grub-mkconfig -o /boot/grub/grub.cfg


grumblebear, part of what make runs after menuconfig is
Code:
scripts/kconfig/conf  --silentoldconfig Kconfig
so the kernel options from the previous working config are carried over.

Using an encrypted disk was more of a learning experience. Maybe it protects the data when the provider sells old disks on Amazon; I know that it's not really very meaningful. But this procedure has worked for more than a year, and suddenly it no longer does with kernel 4.15 and newer.

szatox, this .config works with older kernels before 4.15. The video is only meant to show how early the VNC window closes and the server goes offline, even before the first services come up.


Last edited by Elleni on Wed Feb 07, 2018 10:22 pm; edited 1 time in total

Elleni
l33t
Joined: 23 May 2006
Posts: 858

Posted: Wed Feb 07, 2018 10:22 pm

Neddy, does that mean changing profile from

Code:
default/linux/amd64/17.0/no-multilib/hardened

to
Code:
default/linux/amd64/17.0/no-multilib


will do the trick?

I'll try. Thanks.

Edit: Will emerge -uDNav --with-bdeps=yes be enough after the profile switch, or should I do emerge -e system followed by emerge -e world? And should I rebuild the kernel after the profile switch?

NeddySeagoon
Administrator
Joined: 05 Jul 2003
Posts: 43183
Location: 56N 3W

Posted: Wed Feb 07, 2018 10:30 pm

Elleni,

There was a problem with the Gentoo hardened profile and the 4.14 kernel that prevented the kernel from booting on Intel CPUs.
That's why the 4.14 stable Gentoo kernel was masked while it was investigated.
The most obvious symptom was an early panic. It's been fixed now by a change to the CFLAGS used for building the kernel, and only 4.14 was affected as far as I know.
Whatever, it's fixed in 4.15, so it's not that unless you are editing the kernel build-time CFLAGS.

Now, what of your lspci and kernel .config?

There is at least one new option added to the kernel recently that was defaulted to off. It was a very bad thing, as it disabled USB interfaces connected to a PCI bus.
That's fixed now too. It may not matter on a VPS.
_________________
Regards,

NeddySeagoon

Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.

Elleni
l33t
Joined: 23 May 2006
Posts: 858

Posted: Wed Feb 07, 2018 10:36 pm

Oh, ok, I am doing the profile switch and emerge -uDNav --with-bdeps=y anyways

lspci
Code:
00:00.0 Host bridge: Intel Corporation 82P965/G965 Memory Controller Hub (rev 02)
00:01.0 PCI bridge: Intel Corporation 82G35 Express PCI Express Root Port (rev 02)
00:03.0 Unassigned class [ff00]: Parallels, Inc. Virtual Machine Communication Interface
00:05.0 Ethernet controller: Intel Corporation 82545EM Gigabit Ethernet Controller (Copper)
00:0a.0 PCI bridge: Digital Equipment Corporation DECchip 21150
00:0e.0 RAM memory: Red Hat, Inc Virtio memory balloon
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev f2)
00:1f.0 ISA bridge: Intel Corporation 82801HB/HR (ICH8/R) LPC Interface Controller (rev 02)
00:1f.1 IDE interface: Intel Corporation 82801BA IDE U100 Controller (rev 05)
00:1f.2 SATA controller: Intel Corporation 82801HR/HO/HH (ICH8R/DO/DH) 6 port SATA Controller [AHCI mode] (rev 02)
01:00.0 VGA compatible controller: Parallels, Inc. Accelerated Virtual Video Adapter

cat /usr/src/linux/.config |wgetpaste
https://paste.pound-python.org/show/UNKxgnItCabka7K4G2Hs/

emerge --info
Code:
Portage 2.3.19 (python 2.7.14-final-0, default/linux/amd64/17.0/no-multilib, gcc-6.4.0, glibc-2.25-r9, 4.9.76-gentoo-r1 x86_64)
=================================================================
System uname: Linux-4.9.76-gentoo-r1-x86_64-Intel-R-_Xeon-R-_CPU_E5-2620_v3_@_2.40GHz-with-gentoo-2.4.1
KiB Mem:     6116180 total,   2555680 free
KiB Swap:    4194300 total,   4194300 free
Timestamp of repository gentoo: Wed, 07 Feb 2018 22:00:01 +0000
Head commit of repository gentoo: 471d2fa0870254bcc6557cef8f429d85cc512e71
sh bash 4.4_p12
ld GNU ld (Gentoo 2.29.1 p3) 2.29.1
app-shells/bash:          4.4_p12::gentoo
dev-lang/perl:            5.24.3::gentoo
dev-lang/python:          2.7.14-r1::gentoo, 3.5.4-r1::gentoo
dev-util/cmake:           3.9.6::gentoo
dev-util/pkgconfig:       0.29.2::gentoo
sys-apps/baselayout:      2.4.1-r2::gentoo
sys-apps/openrc:          0.34.11::gentoo
sys-apps/sandbox:         2.12::gentoo
sys-devel/autoconf:       2.69-r4::gentoo
sys-devel/automake:       1.15.1-r1::gentoo
sys-devel/binutils:       2.29.1-r1::gentoo
sys-devel/gcc:            6.4.0-r1::gentoo
sys-devel/gcc-config:     1.8-r1::gentoo
sys-devel/libtool:        2.4.6-r3::gentoo
sys-devel/make:           4.2.1::gentoo
sys-kernel/linux-headers: 4.13::gentoo (virtual/os-headers)
sys-libs/glibc:           2.25-r9::gentoo
Repositories:

gentoo
    location: /usr/portage
    sync-type: rsync
    sync-uri: rsync://rsync.gentoo.org/gentoo-portage
    priority: -1000
    sync-rsync-extra-opts:

x-portage
    location: /usr/local/portage
    masters: gentoo
    priority: 0

ACCEPT_KEYWORDS="amd64"
ACCEPT_LICENSE="* -@EULA"
CBUILD="x86_64-pc-linux-gnu"
CFLAGS="-march=native -O2 -pipe"
CHOST="x86_64-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/share/easy-rsa /usr/share/gnupg/qualified.txt"
CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/env.d /etc/fonts/fonts.conf /etc/gconf /etc/gentoo-release /etc/php/apache2-php5.6/ext-active/ /etc/php/apache2-php7.1/ext-active/ /etc/php/cgi-php5.6/ext-active/ /etc/php/cgi-php7.1/ext-active/ /etc/php/cli-php5.6/ext-active/ /etc/php/cli-php7.1/ext-active/ /etc/revdep-rebuild /etc/sandbox.d /etc/terminfo"
CXXFLAGS="-march=native -O2 -pipe"
DISTDIR="/usr/portage/distfiles"
FCFLAGS="-O2 -pipe"
FEATURES="assume-digests binpkg-logs candy config-protect-if-modified distlocks ebuild-locks fixlafiles merge-sync multilib-strict news parallel-fetch preserve-libs protect-owned sandbox sfperms strict unknown-features-warn unmerge-logs unmerge-orphans userfetch userpriv usersandbox usersync xattr"
FFLAGS="-O2 -pipe"
GENTOO_MIRRORS="http://distfiles.gentoo.org"
LANG="de_CH.utf8"
LDFLAGS="-Wl,-O1 -Wl,--as-needed"
LINGUAS="de el en fr it tr"
MAKEOPTS="-j5"
PKGDIR="/usr/portage/packages"
PORTAGE_CONFIGROOT="/"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --omit-dir-times --compress --force --whole-file --delete --stats --human-readable --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages --exclude=/.git"
PORTAGE_TMPDIR="/var/tmp"
USE="3dnow 3dnowext acl amd64 apache2 authdaemond berkdb bzip2 cgi clamav clamdtop cli crypt cryptsetup curl cxx device-mapper dkim dovecot-sasl dri fam fontconfig fortran fpm gd gdbm geoip iconv imap jpeg libmysqlclient maildir mmx mmxext modules mysql mysqli ncurses nls nptl openmp pam pcntl pcre pdo png popcnt readline seccomp sockets spamassassin spell sqlite sse sse2 sse3 sse4_1 sse4a ssl symlink tcpd truetype unicode vhosts xattr xmlwriter zip zlib" ABI_X86="64" ALSA_CARDS="ali5451 als4000 atiixp atiixp-modem bt87x ca0106 cmipci emu10k1x ens1370 ens1371 es1938 es1968 fm801 hda-intel intel8x0 intel8x0m maestro3 trident usb-audio via82xx via82xx-modem ymfpci" APACHE2_MODULES="actions alias auth_basic authn_alias authn_anon authn_core authn_dbm authn_file authz_core authz_dbm authz_groupfile authz_host authz_owner authz_user autoindex cache cgi cgid dav dav_fs dav_lock deflate env expires ext_filter file_cache filter headers include info log_config logio mime mime_magic negotiation proxy proxy_http proxy_wstunnel rewrite setenvif socache_shmcb speling status unique_id userdir usertrack vhost_alias" CALLIGRA_FEATURES="kexi words flow plan sheets stage tables krita karbon braindump author" COLLECTD_PLUGINS="df interface irq load memory rrdtool swap syslog" CPU_FLAGS_X86="aes avx mmx mmxext pclmul popcnt sse sse2 sse3 sse4_1 sse4_2 ssse3" ELIBC="glibc" GPSD_PROTOCOLS="ashtech aivdm earthmate evermore fv18 garmin garmintxt gpsclock isync itrax mtk3301 nmea ntrip navcom oceanserver oldstyle oncore rtcm104v2 rtcm104v3 sirf skytraq superstar2 timing tsip tripmate tnt ublox ubx" INPUT_DEVICES="evdev" KERNEL="linux" L10N="de" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text" LIBREOFFICE_EXTENSIONS="presenter-console presenter-minimizer" OFFICE_IMPLEMENTATION="libreoffice" PHP_TARGETS="php7-1" POSTGRES_TARGETS="postgres9_5" PYTHON_SINGLE_TARGET="python3_5" PYTHON_TARGETS="python2_7 python3_5" RUBY_TARGETS="ruby22 ruby23" USERLAND="GNU" 
VIDEO_CARDS="parallels vesa vga" XTABLES_ADDONS="quota2 psd pknock lscan length2 ipv4options ipset ipp2p iface geoip fuzzy condition tee tarpit sysrq steal rawnat logmark ipmark dhcpmac delude chaos account"
Unset:  CC, CPPFLAGS, CTARGET, CXX, EMERGE_DEFAULT_OPTS, INSTALL_MASK, LC_ALL, PORTAGE_BUNZIP2_COMMAND, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS

Elleni
l33t
Joined: 23 May 2006
Posts: 858

Posted: Thu Feb 08, 2018 12:25 am

Still the same situation after switching the profile and running emerge world -uDNav --with-bdeps=y, followed by a kernel rebuild.

Elleni
l33t
Joined: 23 May 2006
Posts: 858

Posted: Fri Feb 09, 2018 9:13 pm

Is there anything else I could provide to help find a solution?

Elleni
l33t
Joined: 23 May 2006
Posts: 858

Posted: Mon Feb 12, 2018 8:18 pm

Could this be a limitation on the host side in the end? I have no clue what to do here to find out more :cry:

Hu
Moderator
Joined: 06 Mar 2007
Posts: 13831

Posted: Tue Feb 13, 2018 2:33 am

Linus has a zero-regressions policy. That does not prevent regressions from happening, but it does generally require a very compelling reason to allow a known regression to remain. If your old kernel worked, and you did not misconfigure the new kernel, then the new kernel should work. If it does not, that sounds like a possible regression. We would need to rule out configuration error before declaring it to be a regression. Can you find the specific patch that breaks it? A bisection is likely to be required. Unfortunately, the videos are unusable for me, so I cannot help with your actual problem.

Elleni
l33t
Joined: 23 May 2006
Posts: 858

Posted: Sun Feb 18, 2018 1:09 am

Hello Hu, and thank you for your reply. I am willing to provide anything needed to find this error. I can confirm that it only happens with kernel >=4.15.0: I have just booted gentoo-sources-4.14.20, which I have just compiled using the same procedure as always. Once misconfiguration is ruled out, I am willing to record a better-quality video if necessary, or to provide whatever information might help to find out what's going on. The procedure is:

- emerge gentoo-sources-version
- menuconfig, then toggle a random kernel option and undo it immediately, to make sure the config file is saved when menuconfig exits
- make -j9
- make install
- genkernel --luks initramfs
- grub-mkconfig -o /boot/grub/grub.cfg

With 4.14.20 this boots successfully. I tried the same procedure with the new 4.15.4 and it does not boot, so I made new videos, which can be seen here:
https://cloud.tsarouchas.ch/index.php/s/J9Yfe6kG7LtdnXw
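
The procedure listed above could be collected into one script, so that every kernel version is built in exactly the same way and a failing step stops the chain. The following is a rough sketch only: the commands are the ones from the list, but the run and rebuild_kernel helpers and the DRY_RUN variable are hypothetical names introduced here. By default the script only prints what it would do.

```shell
# run CMD...: print the command in dry-run mode (the default), else execute it.
# DRY_RUN and both function names are illustrative, not from the thread.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# rebuild_kernel [SRC]: configure, build and install the kernel in SRC
# (assumed default /usr/src/linux), then rebuild the initramfs and grub.cfg.
# Each step runs only if the previous one succeeded.
rebuild_kernel() {
    src=${1:-/usr/src/linux}
    run make -C "$src" menuconfig &&
    run make -C "$src" -j9 &&
    run make -C "$src" install &&
    run genkernel --luks initramfs &&
    run grub-mkconfig -o /boot/grub/grub.cfg
}
```

Calling rebuild_kernel prints the five commands; setting DRY_RUN=0 first would actually execute them, which needs root and an installed kernel source tree.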

Thanks in advance for any help :D

bunder
Bodhisattva
Joined: 10 Apr 2004
Posts: 5839

Posted: Mon Feb 19, 2018 4:26 am

is it possible to set your hypervisor to not close the window when the system panics? all i see is normal boot then the window closing.

or better yet, could we get a screenshot of the panic?
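
If the console closes too quickly to read, one thing that might help is asking the kernel to print as early and as verbosely as possible. The sketch below appends the kernel parameters earlyprintk=vga and ignore_loglevel to the GRUB_CMDLINE_LINUX line of a grub defaults file. The helper name add_debug_params is made up, the choice of parameters is an assumption (earlyprintk needs CONFIG_EARLY_PRINTK in the kernel), and none of this helps if the hypervisor tears the machine down before anything is printed.

```shell
# add_debug_params FILE
# Append early-boot debugging parameters to the GRUB_CMDLINE_LINUX="..." line
# of FILE (normally /etc/default/grub). Hypothetical helper for illustration.
add_debug_params() {
    file=$1
    params="earlyprintk=vga ignore_loglevel"
    tmp=$(mktemp) || return 1
    # Keep the existing value (\1) and add the new parameters after it.
    sed "s/^GRUB_CMDLINE_LINUX=\"\(.*\)\"\$/GRUB_CMDLINE_LINUX=\"\1 $params\"/" \
        "$file" > "$tmp" && cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return $status
}
```

After editing the real file, grub.cfg has to be regenerated with grub-mkconfig -o /boot/grub/grub.cfg for the change to take effect.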

Elleni
l33t
Joined: 23 May 2006
Posts: 858

Posted: Mon Feb 19, 2018 10:26 pm

I have asked the hoster's support about it and will see what they say.

Elleni
l33t
Joined: 23 May 2006
Posts: 858

Posted: Tue Feb 20, 2018 6:55 pm

The hoster's support person was willing to take a screenshot, but tells me that he also only sees the booting kernel for a short moment before his hypervisor software reports the virtual machine as down :(

Could this be a problem with the Parallels hypervisor? I asked them to confirm that they are able to boot 4.15 kernels and am waiting for an answer. 8O

Would it be worth trying to boot a kernel from sources other than gentoo-sources?


Last edited by Elleni on Wed Feb 21, 2018 12:00 am; edited 1 time in total

Elleni
l33t
Joined: 23 May 2006
Posts: 858

Posted: Tue Feb 20, 2018 11:59 pm

Out of curiosity I tried to boot vanilla-sources-4.15.4 and git-sources-4.16_rc2, with the same result, so I am really out of ideas. One thing that came to mind: I had tried to update intel-microcode on this virtual server. I followed the wiki variant "New method without initram-fs/disk", but verification with dmesg | grep microcode showed nothing, so I thought that maybe this is not intended to work on a virtual server anyway. When compiling git-sources I removed the corresponding kernel options:

Code:
Processor type and features  --->
    <*> CPU microcode loading support
    [*]   Intel microcode loading support

Device Drivers  --->
  Generic Driver Options  --->
    [*]   Include in-kernel firmware blobs in kernel binary
    (intel-ucode/06-3c-03) External firmware blobs to build into the kernel binary
    (/lib/firmware) Firmware blobs root directory


But that did not do the trick, so I am afraid that I have to stick with the 4.14 kernel series. Again, could this possibly be a problem with the Parallels hypervisor?

What else can I try? genkernel all? I have never used that, though.

[Moderator edit: added [code] tags to preserve output layout. -Hu]

Hu
Moderator
Joined: 06 Mar 2007
Posts: 13831

Posted: Wed Feb 21, 2018 2:22 am

If it is a kernel regression, and it is present in v4.15 and in v4.16-rc2, then it is likely that no one has reported it to the correct maintainer. Please try a bisection to identify the specific commit which breaks v4.15 for you.
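
For readers who have not done a bisection before: git bisect performs a binary search through the commit history between a known good and a known bad version, so only about log2(N) tests are needed for N commits. The toy script below shows the mechanics on a throwaway repository rather than on the kernel. Everything in it (the repository, the eight commits, the file named broken that stands in for the regression) is invented for the demonstration.

```shell
# Toy demonstration of git bisect; nothing here touches a real kernel tree.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .

# Create 8 commits; commit 6 introduces the "regression" (a file named broken).
n=1
while [ "$n" -le 8 ]; do
    echo "$n" > version
    if [ "$n" -eq 6 ]; then
        touch broken
    fi
    git add -A
    git -c user.name=demo -c user.email=demo@example.invalid \
        commit -q -m "commit $n"
    n=$((n + 1))
done

# The newest commit is known bad, the very first commit is known good.
good=$(git rev-list --max-parents=0 HEAD)
git bisect start HEAD "$good" >/dev/null

# The test command exits 0 for "good" and 1 for "bad" at each step.
git bisect run sh -c '! test -e broken' >/dev/null

culprit=$(git log -1 --format=%s refs/bisect/bad)
git bisect reset >/dev/null 2>&1
echo "first bad commit: $culprit"
```

On a clone of the mainline kernel the same idea would start with git bisect start, git bisect bad v4.15 and git bisect good v4.14. Since the test is "does it boot on the VPS", each step would be a manual build, install and boot, followed by git bisect good or git bisect bad, instead of git bisect run.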

Elleni
l33t
Joined: 23 May 2006
Posts: 858

Posted: Mon Feb 26, 2018 12:16 am

I don't know what a bisection is, and thus am not able to do one.

Elleni
l33t
Joined: 23 May 2006
Posts: 858

Posted: Tue Mar 06, 2018 7:30 pm

I will ask my hoster for a test VM to do a new minimal Gentoo setup, and I will also ask them to set up a Fedora 27 or openSUSE Tumbleweed VM to see whether their 4.15.x kernel is able to boot on their Parallels hypervisor and, if so, to provide me with its kernel config. The hoster's support has already told me that they are planning to upgrade their hypervisor within the next two weeks and asks for patience. Maybe this will enable my kernel to boot.

Is there a way to set up the 4.15 kernel with the same settings as the 4.14.x kernel that boots successfully, and to deactivate any kernel options that are new in 4.15 and did not exist in the older kernel? That way I could prove that one of the new kernel options is causing this problem.

I will also ask them whether they can provide a VM on KVM/QEMU instead of Parallels.

If nothing helps, I will probably migrate to another provider that runs its VMs on a QEMU/KVM hypervisor.
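
On the question of which options are new in 4.15: one way to see them is to compare the set of CONFIG_ symbols in the old and the new .config. The sketch below does this with plain text tools. The function names config_symbols and new_symbols are invented here, and the approach only lists candidate symbols; it does not switch anything off by itself.

```shell
# config_symbols FILE: print every CONFIG_ symbol mentioned in FILE, whether
# it is set ("CONFIG_FOO=y") or explicitly unset ("# CONFIG_FOO is not set").
config_symbols() {
    sed -n 's/^\(# \)\{0,1\}\(CONFIG_[A-Za-z0-9_]*\)[= ].*/\2/p' "$1" | sort -u
}

# new_symbols OLD NEW: print symbols that appear in NEW but not in OLD.
# (Both helper names are hypothetical.)
new_symbols() {
    old_list=$(mktemp) || return 1
    new_list=$(mktemp) || return 1
    config_symbols "$1" > "$old_list"
    config_symbols "$2" > "$new_list"
    comm -13 "$old_list" "$new_list"   # lines unique to the second list
    rm -f "$old_list" "$new_list"
}
```

The kernel build system offers related targets of its own: after copying the old .config into the new tree, make listnewconfig lists the options the old file does not know about, and make olddefconfig fills them in with their defaults without prompting. The symbols found either way can then be turned off one at a time in menuconfig to test whether one of them is responsible.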

gengreen
Tux's lil' helper
Joined: 23 Dec 2017
Posts: 84

Posted: Thu Mar 08, 2018 1:25 am

NeddySeagoon wrote:
Elleni,

There was a problem with the Gentoo hardened profile and the 4.14 kernel that prevented the kernel from booting on Intel CPUs.
That's why the 4.14 stable Gentoo kernel was masked while it was investigated.
The most obvious symptom was an early panic. It's been fixed now by a change to the CFLAGS used for building the kernel, and only 4.14 was affected as far as I know.
Whatever, it's fixed in 4.15, so it's not that unless you are editing the kernel build-time CFLAGS.

Now, what of your lspci and kernel .config?

There is at least one new option added to the kernel recently that was defaulted to off. It was a very bad thing, as it disabled USB interfaces connected to a PCI bus.
That's fixed now too. It may not matter on a VPS.


Hello,

I'm not familiar with every kind of VPS, but don't they have a bootloader?

Using an encrypted partition on a dedicated server (and on any personal computer, I would say) is a must-have; it can prevent a single-user boot...

Elleni
l33t
Joined: 23 May 2006
Posts: 858

Posted: Fri Apr 06, 2018 11:02 pm

Just for information: my provider informed me that kernels 4.15 and newer are not supported, so apparently it is not a problem with the kernel compilation but with the host system, which does not support them. Thank you to everyone who helped with this issue.

All times are GMT