Gentoo Forums
BUG: Bad page map in process xz
Simdol
n00b


Joined: 27 Mar 2016
Posts: 11

Posted: Mon Mar 28, 2016 10:08 am    Post subject: BUG: Bad page map in process xz

Hello,

I've been seeing this message in journalctl every single second (literally), and I'd like to know what causes it and what I should do to resolve it. Is it safe to ignore, or is it something I should be concerned about? Apart from this message appearing in journalctl, I don't see any apparent issue with my installation. Before anything else, I should note that I am currently using pf-sources as my kernel source, and the issue persists with the latest stable gentoo-sources. The same kernel seems to run fine on my laptop without any notable issues, but my desktop seems not to like Linux in general.

Code:
Mar 28 11:34:54  kernel: BUG: Bad page map in process xz  pte:26db71025 pmd:113434067
Mar 28 11:34:55  kernel: page:ffffea0009b6dc40 count:62 mapcount:-197 mapping:ffff8801ccc36ad0 index:0x15e
Mar 28 11:34:55  kernel: flags: 0x20000000000086c(referenced|uptodate|lru|active|private)
Mar 28 11:34:55  kernel: page dumped because: bad pte
Mar 28 11:34:55  kernel: addr:00007fdca96de000 vm_flags:00000075 anon_vma:          (null) mapping:ffff8801ccc36ad0 index:15e
Mar 28 11:34:55  kernel: file:libc-2.21.so fault:filemap_fault mmap:btrfs_file_mmap readpage:btrfs_readpage
Mar 28 11:34:55  kernel: CPU: 0 PID: 19769 Comm: xz Tainted: P    B      O    4.4.0-pf6 #15
Mar 28 11:34:55  kernel: Hardware name: Gigabyte Technology Co., Ltd. To be filled by O.E.M./X99-Gaming 5, BIOS F20 01/12/2016
Mar 28 11:34:55  kernel:  0000000000000000 ffff88011b0cfc70 ffffffff81689b78 00007fdca96de000
Mar 28 11:34:55  kernel:  ffff880254b132e0 ffff88011b0cfcc0 ffffffff81215fad 00000001e09f8025
Mar 28 11:34:55  kernel:  00003ffffffff000 00000001236f7067 00007fdca9711000 00007fdca96de000
Mar 28 11:34:55  kernel: Call Trace:
Mar 28 11:34:55  kernel:  [<ffffffff81689b78>] dump_stack+0x4d/0x65
Mar 28 11:34:55  kernel:  [<ffffffff81215fad>] print_bad_pte+0x1bd/0x280
Mar 28 11:34:55  kernel:  [<ffffffff81217ca5>] unmap_single_vma+0x735/0x790
Mar 28 11:34:55  kernel:  [<ffffffff81218525>] unmap_vmas+0x45/0xa0
Mar 28 11:34:55  kernel:  [<ffffffff81220825>] exit_mmap+0xc5/0x180
Mar 28 11:34:55  kernel:  [<ffffffff81123288>] mmput+0x38/0xd0
Mar 28 11:34:55  kernel:  [<ffffffff81127761>] do_exit+0x2f1/0xb60
Mar 28 11:34:55  kernel:  [<ffffffff8113f776>] ? task_work_run+0x76/0x90
Mar 28 11:34:55  kernel:  [<ffffffff81128dc0>] do_group_exit+0x40/0xa0
Mar 28 11:34:55  kernel:  [<ffffffff81128e2f>] SyS_exit_group+0xf/0x10
Mar 28 11:34:55  kernel:  [<ffffffff81c84857>] entry_SYSCALL_64_fastpath+0x12/0x6a
systemd[1]: Looping too fast. Throttling execution a little.
systemd[1]: Looping too fast. Throttling execution a little.
systemd[1]: Looping too fast. Throttling execution a little.
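
To see whether the message is tied to one program or spread across many, the log lines can be tallied per process name. The following is only a minimal sketch: the function name bad_page_map_by_process is made up for illustration, and it assumes nothing beyond the "BUG: Bad page map in process NAME" line format shown in the log above.

```shell
# Hypothetical helper (not from the thread): tally "Bad page map" reports
# per process name, reading kernel log lines on standard input.
bad_page_map_by_process() {
    awk '/BUG: Bad page map in process/ {
             # The process name is the field right after the word "process".
             for (i = 1; i < NF; i++)
                 if ($i == "process") { count[$(i + 1)]++; break }
         }
         END { for (name in count) print count[name], name }' | sort -rn
}

# Example use on a systemd machine (kernel messages only):
#   journalctl -k --no-pager | bad_page_map_by_process
```

If only one or two process names show up, that points towards a specific corrupted file mapping; a wide spread of names would be more consistent with a general memory or CPU stability problem.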


I am running stable Gentoo without the '~amd64' keyword, with the exception of a few packages that need it (genkernel) due to systemd. The weird thing is that once I re-emerge libc, the error message seems to disappear after a while. Here is some information that may be useful:

Hardware Specification
Code:

Intel X99 platform (Gigabyte X99 Gaming 5)
Intel Core i7 Haswell-E 5820k
NVIDIA GTX 970
Micron Crucial DDR4 2133Mhz



make.conf
Code:

CHOST="x86_64-pc-linux-gnu"
CFLAGS="-O2 -pipe -march=haswell -mmmx -mno-3dnow"
CXXFLAGS="${CFLAGS}"
MAKEOPTS="-j13"
INPUT_DEVICES="evdev" #synaptics"
VIDEO_CARDS="nvidia"
GRUB_PLATFORMS="efi-64"
USE="-qt4 -kde X dbus gtk gnome vdpau vaapi xvmc samba"
PORTDIR="/usr/portage"
DISTDIR="${PORTDIR}/distfiles"
PKGDIR="${PORTDIR}/packages"


Kernel Config: http://pastebin.com/BDiurVRb

Thank you.
eccerr0r
Watchman


Joined: 01 Jul 2004
Posts: 7281
Location: almost Mile High in the USA

Posted: Mon Mar 28, 2016 6:50 pm

Indeed, this is not normal. It looks like a hardware problem, but it could be a software-hardware interaction.

Is the kernel on your laptop the same binary, or just built from the same source?

I guess this came up recently too: did you update the firmware on the desktop? Are you overclocking your desktop?
_________________
Intel Core i7 2700K@ 4.1GHz/HD3000 graphics/8GB DDR3/180GB SSD
What am I supposed watching?
Simdol
n00b


Joined: 27 Mar 2016
Posts: 11

Posted: Mon Mar 28, 2016 11:01 pm

eccerr0r wrote:
Indeed, this is not normal. It looks like a hardware problem, but it could be a software-hardware interaction.

Is the kernel on your laptop the same binary, or just built from the same source?

I guess this came up recently too: did you update the firmware on the desktop? Are you overclocking your desktop?


I've recently been switching over to Gentoo from Arch Linux, and I never had an issue like this on Arch. I compiled both kernels from source, from scratch, using the configuration posted on the pastebin above. I am using the latest stable firmware version for my desktop's motherboard, taken directly from the motherboard vendor, Gigabyte. Indeed, I've been overclocking my desktop, and I am sure it is stable at this point, as I've been using this overclocked configuration for decades (even on Arch Linux) now. If you think the overclocked settings may have caused this issue, please let me know; I will test with default BIOS settings.
eccerr0r
Watchman


Joined: 01 Jul 2004
Posts: 7281
Location: almost Mile High in the USA

Posted: Tue Mar 29, 2016 12:54 am

If another distribution works, perhaps it was compiled with different CFLAGS and is possibly not as cycle-efficient as Gentoo, or perhaps it was compiled with a different compiler or compiler version.

But you're likely hitting the limits of your CPU. There's no way for me to tell, though; you have to test it experimentally. Ideally there would be a flag on bug reports to make it easy to weed out overclockers, as it's too easy to mistake a software problem for a hardware problem.

Yes, time will tell whether a system is stable, but how could you have been running your Haswell for decades? It hasn't been out that long. My Sandy Bridge is just about 5 years old. Not to mention that the wires in a CPU wear out, so the maximum stable speed goes down as the part ages.

\|/ yes I have a K unlocked chip too... though I've stopped overclocking as I ended up choosing longevity and stability over speed (cpu gets too warm when overclocking.) I should go fix my sig though the chip still runs fine at 4.1/stockvoltage/stockheatsinkfan...
Simdol
n00b


Joined: 27 Mar 2016
Posts: 11

Posted: Tue Mar 29, 2016 1:07 am

eccerr0r wrote:
If another distribution works, perhaps it was compiled with different CFLAGS and is possibly not as cycle-efficient as Gentoo, or perhaps it was compiled with a different compiler or compiler version.

But you're likely hitting the limits of your CPU. There's no way for me to tell, though; you have to test it experimentally. Ideally there would be a flag on bug reports to make it easy to weed out overclockers, as it's too easy to mistake a software problem for a hardware problem.

Yes, time will tell whether a system is stable, but how could you have been running your Haswell for decades? It hasn't been out that long. My Sandy Bridge is almost 5 years old... Not to mention that the wires in a CPU wear out, so the maximum stable speed goes down as the part ages.


Apologies for my poor wording. I've had this overclocked configuration since December 2014 and have not changed it, as it seemed stable after 2 days of stress testing. I am not sure whether my CPU has degraded over time, but I don't see how it could degrade quickly enough for the system to fail, given that the CPU temperature at max load was reasonable and the Vcore voltage supplied to the CPU wasn't excessive. The other possibility I can think of is that a poor choice of CFLAGS caused this issue. Could you take a look?

My current CFLAGS setting is '-O2 -pipe -march=native'. It used to be '-O2 -pipe -march=haswell -mmmx -mno-3dnow', but since that was failing to compile numerous packages, including systemd with 32-bit support, I had to change it recently. I've followed the Safe CFLAGS guide, but I am unsure what to change at this point.

Here is the output of 'diff march.s native.s' after following the guide at https://wiki.gentoo.org/wiki/Safe_CFLAGS
Code:

18,19c18,19
< # -m128bit-long-double -m64 -m80387 -maes -malign-stringops -mavx -mavx2
< # -mbmi -mbmi2 -mcx16 -mf16c -mfancy-math-387 -mfma -mfp-ret-in-387
---
> # -m128bit-long-double -m64 -m80387 -mabm -maes -malign-stringops -mavx
> # -mavx2 -mbmi -mbmi2 -mcx16 -mf16c -mfancy-math-387 -mfma -mfp-ret-in-387


Thank you,
eccerr0r
Watchman


Joined: 01 Jul 2004
Posts: 7281
Location: almost Mile High in the USA

Posted: Tue Mar 29, 2016 1:26 am

Find out what Arch uses and see if you can use the same gcc compiler as Arch since it works.

You may have to also recompile your kernel with the same gcc as Arch.

You may also need to try with generic x86_64 to see if it has any effect.
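
A generic x86_64 test build could look roughly like the make.conf fragment below. This is a sketch only: the exact values are an assumption rather than something posted in this thread, with -march=x86-64 and -mtune=generic being GCC's baseline 64-bit target and tuning options.

```shell
# /etc/portage/make.conf (sketch; values are assumptions, not from the thread)
# Target the baseline x86-64 instruction set instead of a specific CPU.
CFLAGS="-O2 -pipe -march=x86-64 -mtune=generic"
CXXFLAGS="${CFLAGS}"
```

The new flags only affect packages built afterwards, so the affected packages (for example via 'emerge --emptytree @world') and the kernel would have to be rebuilt before the comparison means anything. Running 'gcc --version' on both distributions shows whether the compiler versions differ.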
_________________
Intel Core i7 2700K@ 4.1GHz/HD3000 graphics/8GB DDR3/180GB SSD
What am I supposed watching?
Simdol
n00b


Joined: 27 Mar 2016
Posts: 11

Posted: Mon Apr 04, 2016 11:18 am

eccerr0r wrote:
Find out what Arch uses and see if you can use the same gcc compiler as Arch since it works.

You may have to also recompile your kernel with the same gcc as Arch.

You may also need to try with generic x86_64 to see if it has any effect.


Thank you very much. After recompiling most of the base system with '-march=native', the recurring message in 'dmesg' and 'journalctl' seems to be gone! It seems that -march=haswell somehow doesn't match my CPU (Haswell-E 5820K), either missing an instruction or enabling an instruction the CPU cannot use. If you are having the same issue with the 5820K, try compiling your packages with CFLAGS of '-O2 -pipe -march=native', as it did the trick for me.
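
To check what '-march=native' actually selects on a given machine, GCC can be asked to print its resolved target options with '-Q --help=target'. The sketch below pulls the -march= value out of that listing. The helper name resolved_march is made up for illustration, and it assumes the listing puts the option name and its value on one line separated by whitespace.

```shell
# Hypothetical helper (not from the thread): read the output of
# "gcc -Q --help=target" on standard input and print the -march= value.
resolved_march() {
    # Assumes a line of the form "  -march=   <value>" in the listing.
    awk '$1 == "-march=" { print $2; exit }'
}

# Example use (requires gcc to be installed):
#   gcc -march=native -Q --help=target | resolved_march
```

If this reports the same architecture name that was previously set by hand, the two settings differ only in the individual feature flags, which can be compared with the Safe CFLAGS procedure mentioned earlier in the thread.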

Thank you.
eccerr0r
Watchman


Joined: 01 Jul 2004
Posts: 7281
Location: almost Mile High in the USA

Posted: Mon Apr 04, 2016 4:16 pm

I still wonder if it works without overclocking.

And is there anyone with a Haswell that works with -march=haswell? If nobody has one, gcc needs to fix it, but the errors seem too random to be a gcc bug rather than a hardware issue.
_________________
Intel Core i7 2700K@ 4.1GHz/HD3000 graphics/8GB DDR3/180GB SSD
What am I supposed watching?
All times are GMT