[ubuntu] Ubuntu 18.04 – System freezing. Nouveau crash in kern.log – nVidia K2200 – Dell T5500

This issue is happening on an old Dell T5500 I brought home from work when we cleaned out old systems. It was intended to be my dedicated rig when I’m on call. Now it’s my all-day rig since we all work from home during the current pandemic.

Sometimes, as often as once or twice a day, my system will crash. All 3 displays and all applications freeze, my mouse cursor stops responding, and I cannot switch virtual consoles. My only options are a hard reset with the power button on the PC or the magic SysRq keys to sync, unmount, and reboot.

I have had this issue since the fresh install, which was somewhere between 6 months and a little less than a year ago. It’s always been Ubuntu 18.04. I tried changing to the proprietary nVidia drivers at some point, but that caused a whole host of other issues, with the desktop not displaying and displays not being detected. I gave up on that and moved back to nouveau. I just need this system to work so I can support our customers. Corporate budgets are locked down due to the pandemic, so I can’t just ask for a new workstation like I normally would. Besides that, a new workstation wouldn’t have all the cores and memory I have now. I scavenged other PCs that were being scrapped to add a CPU and memory to this one. I do a lot of testing in VMs, so the extra hardware really helps.

I’ve struggled with this for a while so I’m finally reaching out. I do still have the proprietary nVidia drivers present and I’m not sure if that’s exacerbated my problem or not. That is all shown in the output below anyway.

I have an old AMD Radeon 6670, nVidia NVS 510, and nVidia Quadro K2000D that I can swap in to troubleshoot the hardware. I haven’t done this yet but can if it will help.

Nouveau errors from the most recent crashes in kern.log:

Code:

Jul 22 16:05:30 DSHOMEUBUNTU kernel: [117409.626941] nouveau 0000:03:00.0: fifo: SCHED_ERROR 0a [CTXSW_TIMEOUT]
Jul 22 16:05:30 DSHOMEUBUNTU kernel: [117409.626954] nouveau 0000:03:00.0: fifo: runlist 0: scheduled for recovery
Jul 22 16:05:30 DSHOMEUBUNTU kernel: [117409.626969] nouveau 0000:03:00.0: fifo: channel 13: killed
Jul 22 16:05:30 DSHOMEUBUNTU kernel: [117409.626975] nouveau 0000:03:00.0: fifo: engine 0: scheduled for recovery
Jul 22 16:05:30 DSHOMEUBUNTU kernel: [117409.627610] nouveau 0000:03:00.0: Xorg[2079]: channel 13 killed!
...
Jul 23 10:13:54 DSHOMEUBUNTU kernel: [ 9642.074759] nouveau 0000:03:00.0: gr: TRAP ch 13 [00fe089000 Xorg[5804]]
Jul 23 10:13:54 DSHOMEUBUNTU kernel: [ 9642.074773] nouveau 0000:03:00.0: gr: GPC0/TPC0/MP trap: global 00000000 [] warp 3e0009 [ILLEGAL_INSTR_ENCODING]
Jul 23 10:14:10 DSHOMEUBUNTU kernel: [ 9657.536749] nouveau 0000:03:00.0: fifo: fault 00 [READ] at 0000004d01c26000 engine 00 [GR] client 08 [GPC0/PE_2] reason 00 [PDE] on channel 13 [00fe089000 Xorg[5804]]
Jul 23 10:14:10 DSHOMEUBUNTU kernel: [ 9657.536763] nouveau 0000:03:00.0: fifo: channel 13: killed
Jul 23 10:14:10 DSHOMEUBUNTU kernel: [ 9657.536765] nouveau 0000:03:00.0: fifo: runlist 0: scheduled for recovery
Jul 23 10:14:10 DSHOMEUBUNTU kernel: [ 9657.536769] nouveau 0000:03:00.0: fifo: engine 0: scheduled for recovery
Jul 23 10:14:10 DSHOMEUBUNTU kernel: [ 9657.536774] nouveau 0000:03:00.0: fifo: engine 5: scheduled for recovery
Jul 23 10:14:10 DSHOMEUBUNTU kernel: [ 9657.536792] nouveau 0000:03:00.0: Xorg[5804]: channel 13 killed!
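
In case it helps anyone compare their own log against mine, here is a rough Python sketch that tallies the bracketed tags (such as CTXSW_TIMEOUT or ILLEGAL_INSTR_ENCODING) on nouveau lines. It assumes the kern.log line layout shown above. The function names and the regular expression are my own choices for illustration, not part of any tool, and the tally only counts messages; it does not diagnose anything.

```python
import re
from collections import Counter

# Bracketed upper-case tags, e.g. [CTXSW_TIMEOUT], [READ], [GPC0/PE_2].
# Timestamps like [117409.626941] and PIDs like [5804] start with a digit
# or a space, so they are not matched. This pattern is an assumption based
# on the log excerpt above, not a documented nouveau format.
TAG_RE = re.compile(r"\[([A-Z][A-Z0-9_/]*)\]")


def count_nouveau_tags(lines):
    """Count bracketed tags on log lines that mention nouveau."""
    counts = Counter()
    for line in lines:
        if "nouveau" not in line:
            continue  # ignore messages from other drivers
        counts.update(TAG_RE.findall(line))
    return counts


def report(path="/var/log/kern.log"):
    """Print the tag counts for a log file, most frequent first."""
    with open(path, errors="replace") as log:
        for tag, n in count_nouveau_tags(log).most_common():
            print(f"{n:5d}  {tag}")
```

Calling report() with no argument reads /var/log/kern.log, which usually requires membership in the adm group or sudo on Ubuntu. Pass another path to check a rotated log that has already been decompressed.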


The pastebin link below has the output from the “diagnostic” command recommended at https://help.ubuntu.com/community/Gr…otingProcedure
https://pastebin.com/u8V9f2rc

 
