Gentoo Forums
GPU fault detected, eventually system freeze

squeegily
n00b

Joined: 17 Apr 2016
Posts: 57

Posted: Wed Mar 08, 2017 6:24 pm    Post subject: GPU fault detected, eventually system freeze

My computer crashes (the screen freezes, and I can't change TTY or do anything but unplug the PC) after playing CS:GO for around 30 minutes. TF2 triggers this bug within a few minutes of playing; Minecraft has never triggered a system crash.

Leading up to the crash, dmesg gradually begins printing more and more of these:
Code:
[Mar 8 12:06] amdgpu 0000:04:00.0: GPU fault detected: 147 0x0006c802
[  +0.000003] amdgpu 0000:04:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x00000000
[  +0.000001] amdgpu 0000:04:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x060C8002
[  +0.000002] amdgpu 0000:04:00.0: VM fault (0x02, vmid 3) at page 0, read from '' (0x00000000) (200)
[  +0.000253] amdgpu 0000:04:00.0: GPU fault detected: 147 0x0006c802
[  +0.000001] amdgpu 0000:04:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x00000000
[  +0.000001] amdgpu 0000:04:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x060C8002
[  +0.000001] amdgpu 0000:04:00.0: VM fault (0x02, vmid 3) at page 0, read from '' (0x00000000) (200)
[  +0.001155] amdgpu 0000:04:00.0: GPU fault detected: 147 0x0002c802
[  +0.000002] amdgpu 0000:04:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x00000000
[  +0.000002] amdgpu 0000:04:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x020C8002
[  +0.000003] amdgpu 0000:04:00.0: VM fault (0x02, vmid 1) at page 0, read from '' (0x00000000) (200)

GPU: Radeon HD 7450 (Pitcairn)

I have experienced this bug with various combinations of xf86-video-amdgpu, mesa, and xorg-* versions (stable, testing, and live), both with and without video_cards_radeon (which I can now confirm is an unneeded USE flag for PITCAIRN+AMDGPU).
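
For readers trying to interpret the log above: the fields printed in the "VM fault" line appear to be packed into the VM_CONTEXT1_PROTECTION_FAULT_STATUS value. The following is a rough sketch, not the driver's own code. The bit positions are an assumption inferred from the values logged in this thread (they reproduce the "vmid 3" / "vmid 1", "read", and "(200)" shown above), and decode_fault_status is a made-up helper name.

```shell
# Decode an amdgpu VM_CONTEXT1_PROTECTION_FAULT_STATUS value.
# ASSUMPTION: the field positions below are inferred from the logged
# values in this thread, not copied from the kernel headers.
decode_fault_status() {
    status=$(( $1 ))                      # accepts hex, e.g. 0x060C8002
    protections=$(( status & 0xff ))      # bits 0-7: protection fault flags
    mc_id=$(( (status >> 12) & 0xff ))    # bits 12-19: memory client ID
    rw_bit=$(( (status >> 24) & 0x1 ))    # bit 24: 0 = read, 1 = write
    vmid=$(( (status >> 25) & 0xf ))      # bits 25-28: VM context ID
    if [ "$rw_bit" -eq 1 ]; then rw=write; else rw=read; fi
    printf 'protections=0x%02x vmid=%d rw=%s mc_id=%d\n' \
        "$protections" "$vmid" "$rw" "$mc_id"
}
```

Calling `decode_fault_status 0x060C8002` yields vmid 3 and memory client 200, matching the second "VM fault" line in the log. Only the vmid differs between the faults shown.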

khayyam
Watchman

Joined: 07 Jun 2012
Posts: 6228
Location: Room 101

Posted: Wed Mar 08, 2017 9:12 pm

squeegily ...

difficult to say exactly without the versions of mesa, llvm, etc, but it may be this bug.

best ... khay

squeegily
n00b

Joined: 17 Apr 2016
Posts: 57

Posted: Thu Mar 09, 2017 4:40 am

khayyam wrote:
difficult to say exactly without the versions of mesa, llvm, etc, but it may be this bug.

best ... khay

That bug could be resolved by downgrading to llvm <3.9.1; however, I just did a test with
  • llvm-3.9.0-r1
  • clang-3.9.0-r100
  • clang-runtime-3.9.0
  • libomp-3.9.0
  • mesa-13.0.5
(and then rebuilding xf86-video-amdgpu-9999), and the bug is still occurring.

So it seems to be a different one. :/ Thanks for the link, though. It was worth a shot.

squeegily
n00b

Joined: 17 Apr 2016
Posts: 57

Posted: Fri Mar 10, 2017 1:29 am

OK, so:
  • Downgrade to Clang 3.9.0 does NOT fix it
  • Disabling DRI3 does NOT fix it
  • mesa and libdrm live ebuilds do NOT fix it

What else can I try?

khayyam
Watchman

Joined: 07 Jun 2012
Posts: 6228
Location: Room 101

Posted: Fri Mar 10, 2017 9:03 am

squeegily ...

after downgrading llvm you rebuilt mesa? Other than that you might try some of the suggestions (GALLIUM_DDEBUG="1000", GALLIUM_DDEBUG="1000 noflush", GALLIUM_DDEBUG="pipelined 1000", LIBGL_DRI3_DISABLE=1) made in this bug.

HTH & best ... khay
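
The settings listed above are environment variables, so each one can be tried by placing it in front of the game's launch command. A small sketch that only prints one candidate command line per setting and runs nothing itself; print_ddebug_trials is a hypothetical helper name, and the command passed to it is whatever normally starts the game.

```shell
# Print one launch command per debug setting suggested in this post.
# print_ddebug_trials is a hypothetical helper; it only prints text.
print_ddebug_trials() {
    for setting in \
        'GALLIUM_DDEBUG="1000"' \
        'GALLIUM_DDEBUG="1000 noflush"' \
        'GALLIUM_DDEBUG="pipelined 1000"' \
        'LIBGL_DRI3_DISABLE=1'
    do
        # "$*" is the launch command given by the caller.
        printf '%s %s\n' "$setting" "$*"
    done
}
```

For example, `print_ddebug_trials steam` prints four lines, one per setting, each ending in `steam`. Trying them one at a time makes it easier to tell which setting changes the behaviour.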

squeegily
n00b

Joined: 17 Apr 2016
Posts: 57

Posted: Fri Mar 10, 2017 6:49 pm

khayyam wrote:
squeegily ...

after downgrading llvm you rebuilt mesa? Other than that you might try some of the suggestions (GALLIUM_DDEBUG="1000", GALLIUM_DDEBUG="1000 noflush", GALLIUM_DDEBUG="pipelined 1000", LIBGL_DRI3_DISABLE=1) made in this bug.

HTH & best ... khay

I rebuilt mesa with the 3.9.0 Clang buildchain (then restarted X); that didn't fix it.

I also tried using Kernel 4.11; no luck there.

GALLIUM_DDEBUG="pipelined 1000" breaks TF2 and CS:GO; however, removing the "1000" lets them launch. (I've replaced runs of repetitive lines with "[CUT x<num>]".)
Code:
$ GALLIUM_DDEBUG="pipelined" LIBGL_DRI3_DISABLE=1 steam steam://rungameid/440     
Running Steam on gentoo 2.2 64-bit
STEAM_RUNTIME is disabled by the user
Installing breakpad exception handler for appid(steam)/version(1489101908)
Installing breakpad exception handler for appid(steam)/version(1489101908)
Installing breakpad exception handler for appid(steam)/version(1489101908)
Installing breakpad exception handler for appid(steam)/version(1489101908)
../vgui_surfacelib/FontManager.cpp (276) : Assertion Failed: descs.Count() >= 1
Assert( Assertion Failed: descs.Count() >= 1 ):../vgui_surfacelib/FontManager.cpp:276

Installing breakpad exception handler for appid(steam)/version(1489101908)
crash_20170310123750_6.dmp[6518]: Uploading dump (out-of-process)
/tmp/dumps/crash_20170310123750_6.dmp
../vgui_surfacelib/FontManager.cpp (276) : Assertion Failed: descs.Count() >= 1
../vgui_surfacelib/FontManager.cpp (276) : Assertion Failed: descs.Count() >= 1
../vgui_surfacelib/FontManager.cpp (276) : Assertion Failed: descs.Count() >= 1
Installing breakpad exception handler for appid(steam)/version(1489101908)
Installing breakpad exception handler for appid(steam)/version(1489101908)
Installing breakpad exception handler for appid(steam)/version(1489101908)
Installing breakpad exception handler for appid(steam)/version(1489101908)
Installing breakpad exception handler for appid(steam)/version(1489101908)
Installing breakpad exception handler for appid(steam)/version(1489101908)
Installing breakpad exception handler for appid(steam)/version(1489101908)
Installing breakpad exception handler for appid(steam)/version(1489101908)

** (steam:6480): WARNING **: Could not initialize NMClient /org/freedesktop/NetworkManager: The name org.freedesktop.NetworkManager was not provided by any .service files
crash_20170310123750_6.dmp[6518]: Finished uploading minidump (out-of-process): success = yes
crash_20170310123750_6.dmp[6518]: response: CrashID=bp-977e6f14-0537-4195-bc23-1e1532170310
crash_20170310123750_6.dmp[6518]: file ''/tmp/dumps/crash_20170310123750_6.dmp'', upload yes: ''CrashID=bp-977e6f14-0537-4195-bc23-1e1532170310''
Installing breakpad exception handler for appid(steam)/version(1489101908)
Generating new string page texture 2: 48x256, total string texture memory is 49.15 KB
Generating new string page texture 3: 384x256, total string texture memory is 442.37 KB
Installing breakpad exception handler for appid(steam)/version(1489101908)
Installing breakpad exception handler for appid(steam)/version(1489101908)
roaming config store loaded successfully - 1817 bytes.
migrating temporary roaming config store
Installing breakpad exception handler for appid(steam)/version(1489101908)
Failed to init SteamVR because it isn't installed
sh: lspci: command not found
ExecCommandLine: ""/home/james/.local/share/Steam/ubuntu12_32/steam" "steam://rungameid/440" "
ExecSteamURL: "steam://rungameid/440"
Installing breakpad exception handler for appid(steam)/version(1489101908)
System startup time: 13.65 seconds
Generating new string page texture 71: 256x256, total string texture memory is 704.51 KB
Game update: AppID 440 "Team Fortress 2", ProcID 6543, IP 0.0.0.0:0
>>> Adding process 6543 for game ID 440
ERROR: ld.so: object '/home/james/.local/share/Steam/ubuntu12_32/gameoverlayrenderer.so' from LD_PRELOAD cannot be preloaded (wrong ELF class: ELFCLASS32): ignored.
ERROR: ld.so: object '/home/james/.local/share/Steam/ubuntu12_32/gameoverlayrenderer.so' from LD_PRELOAD cannot be preloaded (wrong ELF class: ELFCLASS32): ignored.
pid 6545 != 6544, skipping destruction (fork without exec?)
ERROR: ld.so: object '/home/james/.local/share/Steam/ubuntu12_32/gameoverlayrenderer.so' from LD_PRELOAD cannot be preloaded (wrong ELF class: ELFCLASS32): ignored.
ERROR: ld.so: object '/home/james/.local/share/Steam/ubuntu12_64/gameoverlayrenderer.so' from LD_PRELOAD cannot be preloaded (wrong ELF class: ELFCLASS64): ignored.
>>> Adding process 6544 for game ID 440
>>> Adding process 6546 for game ID 440
>>> Adding process 6547 for game ID 440
SDL video target is 'x11'
SDL video target is 'x11'
Installing breakpad exception handler for appid(gameoverlayui)/version(20170309222622)
Installing breakpad exception handler for appid(gameoverlayui)/version(1.0)
Installing breakpad exception handler for appid(gameoverlayui)/version(1.0)
Installing breakpad exception handler for appid(gameoverlayui)/version(1.0)
Using Breakpad minidump system. Version: 3833195 AppID: 440
Setting breakpad minidump AppID = 440
Using breakpad crash handler
Forcing breakpad minidump interfaces to load
Looking up breakpad interfaces from steamclient
Calling BreakpadMiniDumpSystemInit
Looking up breakpad interfaces from steamclient
Calling BreakpadMiniDumpSystemInit
Steam_SetMinidumpSteamID:  Caching Steam ID:  76561198090193328 [API loaded yes]
Steam_SetMinidumpSteamID:  Setting Steam ID:  76561198090193328
No cached sticky mapping in Get*ActionHandle. Native Steam Controller support won't work.
[CUT x11]
No cached sticky mapping in Get*ActionHandle. Native Steam Controller support won't work.
No cached sticky mapping in GetActionSetHandle. Native Steam Controller support won't work.
Did not detect any valid joysticks.

 ##### CTexture::LoadTextureBitsFromFile couldn't find materials/models/player/items/taunts/loot_crate/mannco_crate.vtf
 
 ##### CTexture::LoadTextureBitsFromFile couldn't find materials/models/player/items/taunts/scooter/pauling_moped.vtf
 
 [CUT x223]
 
 ##### CTexture::LoadTextureBitsFromFile couldn't find materials/models/workshop_partner/weapons/c_models/c_tw_eagle/c_tw_eagle.vtf

 ##### CTexture::LoadTextureBitsFromFile couldn't find materials/models/workshop_partner/weapons/v_models/v_hm_watch/v_hm_watch.vtf
This system supports the OpenGL extension GL_EXT_framebuffer_object.
This system supports the OpenGL extension GL_EXT_framebuffer_blit.
This system supports the OpenGL extension GL_EXT_framebuffer_multisample.
This system DOES NOT support the OpenGL extension GL_APPLE_fence.
This system DOES NOT support the OpenGL extension GL_NV_fence.
This system supports the OpenGL extension GL_ARB_sync.
This system supports the OpenGL extension GL_EXT_draw_buffers2.
This system DOES NOT support the OpenGL extension GL_EXT_bindable_uniform.
This system DOES NOT support the OpenGL extension GL_APPLE_flush_buffer_range.
This system supports the OpenGL extension GL_ARB_map_buffer_range.
This system supports the OpenGL extension GL_ARB_vertex_buffer_object.
This system supports the OpenGL extension GL_ARB_occlusion_query.
This system DOES NOT support the OpenGL extension GL_APPLE_texture_range.
This system DOES NOT support the OpenGL extension GL_APPLE_client_storage.
This system DOES NOT support the OpenGL extension GL_ARB_uniform_buffer.
This system supports the OpenGL extension GL_ARB_vertex_array_bgra.
This system supports the OpenGL extension GL_EXT_vertex_array_bgra.
This system supports the OpenGL extension GL_ARB_framebuffer_object.
This system DOES NOT support the OpenGL extension GL_GREMEDY_string_marker.
This system supports the OpenGL extension GL_ARB_debug_output.
This system DOES NOT support the OpenGL extension GL_EXT_direct_state_access.
This system DOES NOT support the OpenGL extension GL_NV_bindless_texture.
This system supports the OpenGL extension GL_AMD_pinned_memory.
This system supports the OpenGL extension GL_EXT_framebuffer_multisample_blit_scaled.
This system supports the OpenGL extension GL_EXT_texture_sRGB_decode.
This system supports the OpenGL extension GL_NVX_gpu_memory_info.
This system supports the OpenGL extension GL_ATI_meminfo.
This system supports the OpenGL extension GL_EXT_texture_compression_s3tc.
This system supports the OpenGL extension GL_EXT_texture_compression_dxt1.
This system supports the OpenGL extension GL_ANGLE_texture_compression_dxt3.
This system supports the OpenGL extension GL_ANGLE_texture_compression_dxt5.
This system supports the OpenGL extension GL_ARB_buffer_storage.
This system DOES NOT support the OpenGL extension GLX_EXT_swap_control_tear.
OpenGL: Gallium 0.4 on AMD PITCAIRN (DRM 3.9.0 / 4.11.0-rc1, LLVM 3.9.0) 3.0 Mesa 17.0.1 (3.0.0)
GL_NV_bindless_texture: DISABLED
GL_AMD_pinned_memory: DISABLED
GL_ARB_buffer_storage: AVAILABLE
GL_EXT_texture_sRGB_decode: AVAILABLE
Timed out waiting for game mapping!
GL_NVX_gpu_memory_info: AVAILABLE
GL_ATI_meminfo: AVAILABLE
GL_NVX_gpu_memory_info: Total Dedicated: 2075876, Total Avail: 4171896, Current Avail: 2072672
GL_MAX_SAMPLES_EXT: 8
CShaderDeviceMgrBase::GetRecommendedConfigurationInfo: CPU speed: 2668 MHz, Processor: GenuineIntel
GlobalMemoryStatus: 4294967295
CShaderDeviceMgrBase::GetRecommendedConfigurationInfo: CPU speed: 2668 MHz, Processor: GenuineIntel
GlobalMemoryStatus: 4294967295
IDirect3DDevice9::Create: BackBufWidth: 1440, BackBufHeight: 900, D3DFMT: 3, BackBufCount: 1, MultisampleType: 0, MultisampleQuality: 0
GL sampler object usage: DISABLED

 ##### swap interval = 0     swap limit = 1 #####
Installing breakpad exception handler for appid(steam)/version(1489101908)
Shader 'shaders\fxc\skin_ps20b.vcs' - Couldn't load combo 860160 of shader (dyn=160)
Shader 'shaders\fxc\vertexlit_and_unlit_generic_ps20b.vcs' - Couldn't load combo 3833862 of shader (dyn=24)
Shader 'shaders\fxc\skin_ps20b.vcs' - Couldn't load combo 860240 of shader (dyn=160)
Shader 'shaders\fxc\skin_ps20b.vcs' - Couldn't load combo 1998080 of shader (dyn=160)
Shader 'shaders\fxc\skin_ps20b.vcs' - Couldn't load combo 145200 of shader (dyn=160)
Shader 'shaders\fxc\vertexlit_and_unlit_generic_ps20b.vcs' - Couldn't load combo 11796864 of shader (dyn=24)
Shader 'shaders\fxc\skin_ps20b.vcs' - Couldn't load combo 71680 of shader (dyn=160)
Shader 'shaders\fxc\vertexlit_and_unlit_generic_ps20b.vcs' - Couldn't load combo 884742 of shader (dyn=24)
Shader 'shaders\fxc\vertexlit_and_unlit_generic_ps20b.vcs' - Couldn't load combo 1867782 of shader (dyn=24)
Shader 'shaders\fxc\vertexlit_and_unlit_generic_ps20b.vcs' - Couldn't load combo 1597440 of shader (dyn=24)
Loaded program cache file "glbaseshaders.cfg", total keyvalues: 266, total successfully linked: 227
Loaded program cache file "glshaders.cfg", total keyvalues: 230, total successfully linked: 230
Precache: Took 16237 ms, Vertex 2338, Pixel 2361
server.so loaded for "Team Fortress"
No cached sticky mapping in Get*ActionHandle. Native Steam Controller support won't work.
[CUT x29]
No cached sticky mapping in Get*ActionHandle. Native Steam Controller support won't work.

The above is the log from a successful launch; however, it did cause dmesg to start printing several of these:
Code:
[ +35.533977] amdgpu 0000:04:00.0: GPU fault detected: 147 0x000ec802
[  +0.000002] amdgpu 0000:04:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x00000000
[  +0.000001] amdgpu 0000:04:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x0E0C8002
[  +0.000002] amdgpu 0000:04:00.0: VM fault (0x02, vmid 7) at page 0, read from '' (0x00000000) (200)

and so I went ahead and closed the game before it crashed my system.
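
Since the faults pile up gradually, a per-vmid count can show how quickly they are accumulating. A minimal sketch, assuming the "VM fault (..., vmid N)" line format shown in the dmesg excerpts in this thread; count_faults_by_vmid is a hypothetical helper name, and it only reads text from standard input.

```shell
# Summarise "VM fault" lines by vmid. Reads dmesg-style text on stdin.
# count_faults_by_vmid is a hypothetical helper name.
count_faults_by_vmid() {
    # Keep only the vmid number from each "VM fault" line, then tally.
    sed -n 's/.*VM fault (0x[0-9a-fA-F]*, vmid \([0-9][0-9]*\)).*/\1/p' |
        sort -n | uniq -c |
        awk '{ printf "vmid %s: %s\n", $2, $1 }'
}
```

It would typically be fed from the kernel log, e.g. `dmesg | count_faults_by_vmid`, and prints one "vmid N: count" line per vmid seen.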

P.S., that bug you linked, I believe, is a separate issue: it was caused by an outdated libxcb.so.1, whereas mine is up-to-date.
(You probably already knew it was a different bug, but I wanted to make that explicit to anyone stumbling across this thread—we're now a front-page result for "GPU fault detected" on Google, Bing, and Yahoo)


…looking at some of those other results, there are some reports of people only experiencing that problem when programs use OpenCL.
Imma recompile mesa with USE=-opencl real quick and get back with results.

squeegily
n00b

Joined: 17 Apr 2016
Posts: 57

Posted: Fri Mar 10, 2017 7:15 pm

squeegily wrote:
…looking at some of those other results, there are some reports of people only experiencing that problem when programs use OpenCL.
Imma recompile mesa with USE=-opencl real quick and get back with results.

Alas, no dice. I recompiled mesa, restarted X, relaunched Steam and dmesg… and it began to spit out the dreaded "GPU fault detected" once the game spun up. :cry:

squeegily
n00b

Joined: 17 Apr 2016
Posts: 57

Posted: Sun Mar 12, 2017 8:23 am

Aha, it seems to be specific to the AMDGPU driver! I just tried it with Radeon and everything seems to be working!

EDIT: …might not have fixed it. I'm still getting the system freeze, just in the absence of the dmesg warnings. (Though the freezes are less frequent, even rare now. Games are actually PLAYABLE, unlike when on the AMDGPU driver.)
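
When switching between amdgpu and radeon, it can help to confirm which kernel driver is actually bound to the card. A minimal sketch, assuming the usual sysfs layout in which each PCI device directory carries a `driver` symlink; bound_driver is a hypothetical helper name, and the PCI address in the usage note is the one from the dmesg output earlier in this thread.

```shell
# Print the kernel driver bound to a PCI device, given its sysfs directory.
# bound_driver is a hypothetical helper name.
bound_driver() {
    if [ -L "$1/driver" ]; then
        # The symlink target ends in the driver's name.
        basename "$(readlink "$1/driver")"
    else
        echo "none"
    fi
}
```

For the card in this thread that would be `bound_driver /sys/bus/pci/devices/0000:04:00.0`, which should print the driver name (or "none" if nothing is bound).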

TigerJr
Guru

Joined: 19 Jun 2007
Posts: 504
Location: /dev/x0

Posted: Tue Jun 04, 2019 8:54 am

Code:
Jun  4 04:50:13 localhost kernel: amdgpu 0000:05:00.0: [gfxhub] VMC page fault
Jun  4 04:50:13 localhost kernel: amdgpu 0000:05:00.0:   in page starting at address 0x000000058ea00000 from 27
Jun  4 04:50:13 localhost kernel: amdgpu 0000:05:00.0: VM_L2_PROTECTION_FAULT_STATUS:0x00541154


Interesting thing:

kernel gentoo-sources-5.0.18 noacpi with
amdgpu-pro-18.50-756341
on a Sapphire AMD Radeon VII.
Is it a hardware problem?
_________________

Do not update portage without hotdog!

Xenogentooway?

All times are GMT