Constant BSOD crashing causing PC to be unusable

Hi GWJ'ers,

Over the past week I've been tackling a problem of my PC constantly crashing making it practically unusable. I've been consulting with fellow GWJ'er Wayfarer who has been generously helping me out with some troubleshooting steps.

I was wondering if anyone on the boards here also has any potential insights to move forward on solving this problem.

PC Setup

  • AMD Ryzen 5 5600X 3.7 GHz 6-Core Processor
  • ARCTIC Freezer 34 eSports DUO CPU Cooler
  • MSI MPG B550 GAMING PLUS
  • Crucial Ballistix 16 GB (2 x 8 GB) DDR4-3600 CL16 Memory
  • Team MP33 1 TB M.2-2280 NVME Solid State Drive
  • RTX 3080 Founders Edition
  • Lian Li Lancool II Mesh
  • ADATA XPG CORE Reactor 750 W
  • Windows 10 Home

All items were purchased April 2020 except for the GPU which was purchased by a friend of mine on release and was resold to me.

Symptoms

When the PC boots up, after a certain amount of time (ranging from 30 seconds to approx. 20 minutes) Windows will freeze for a few moments, then the system will crash to the BSOD. The error message within the BSOD varies, but the most common I've seen is "WHEA Uncorrectable error", although I've seen "System service exception" on occasion.

While it is at the BSOD, it will attempt to do the Windows crash log dump, but it will be stuck at 0%, time out after 2-3 minutes, and then auto reboot to the BIOS.

In this state the BOOT LED indicator on the motherboard will then light up.

IMAGE(https://storage-asset.msi.com/global/picture/about/FAQ/mb/boot-no-display-5.jpg)

If you close the BIOS and the system auto reboots, instead of going into Windows it will return to the BIOS with the BOOT LED indicator still lit up (as if it can't detect my NVMe SSD).

I can then do a hard power down, start the PC back up again, and it will return to Windows (until the next BSOD).

I can let it sit in the BIOS menu for a considerable amount of time and it does not crash out. At this time it seems like the symptoms occur only when it hits Windows.

I was trying to see if there was a particular trigger that causes the BSOD (e.g. doing specific actions), and initially I thought it was heavy reads/writes to the SSD, but as an experiment I let the computer sit on the Windows desktop after a reboot without touching anything and it crashed regardless.

Troubleshooting

Trying to find issues via software

  • To rule out CPU overheating, I downloaded a temperature monitor and watched it before the next BSOD. Right before the BSOD it was at about 40C, which from my understanding is well within tolerances.
  • I then downloaded Teamgroup's S.M.A.R.T tool to check the health of the NVMe SSD (which was frustrating as the system kept BSOD'ing before I could download/start the application) and it came out as "healthy".
  • I then ran "chkdsk c: /f" to look for errors but Windows did not find any.
  • I then ran the "Windows Memory Diagnostic" from Control Panel which came out as no errors.
  • As a last resort to rule out any software issues with Windows drivers, I did a full reimage of the PC with a fresh install of Windows 10.

After a full reimage of Windows, the system seemingly worked, but after a few hours the symptoms began occurring again. This made me feel like the issue was hardware.
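One more software-side check that could sit alongside the list above: a "WHEA Uncorrectable error" may leave a record in the Windows System event log, and those records sometimes name the component that reported the error. The following is a minimal Python sketch, assuming the built-in `wevtutil` tool and the `Microsoft-Windows-WHEA-Logger` event provider are present on the machine. The function names are made up for illustration, and if the system dies before anything gets written, the log may simply be empty.

```python
import subprocess
import sys

# Event provider that Windows uses for hardware error (WHEA) records.
WHEA_PROVIDER = "Microsoft-Windows-WHEA-Logger"


def build_whea_query(max_events=10):
    """Build a wevtutil command that reads the newest WHEA events from the System log."""
    xpath = f"*[System[Provider[@Name='{WHEA_PROVIDER}']]]"
    return [
        "wevtutil", "qe", "System",
        f"/q:{xpath}",        # only events from the WHEA provider
        f"/c:{max_events}",   # limit the number of events returned
        "/rd:true",           # newest events first
        "/f:text",            # human-readable output
    ]


def read_whea_events(max_events=10):
    """Run the query and return whatever text wevtutil printed (may be empty)."""
    result = subprocess.run(
        build_whea_query(max_events),
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout


if __name__ == "__main__":
    if sys.platform == "win32":
        print(read_whea_events() or "No WHEA events found.")
    else:
        print("This check only works on Windows.")
```

Run it from a terminal right after a reboot following a crash. If entries show up, the reported error source is worth noting before swapping any more parts.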

Replacing NVMe SSD, issue still occurs with slight change in behaviour

At this point I was fairly sure it was a bad NVMe SSD. I've never had to send in a computer component for RMA before, so when I looked up the process it was more of a headache than expected, with me having to cover a bunch of the costs of shipping it to Taiwan and back.

I was fairly confident the issue was a bad NVMe SSD and not wanting to go weeks/months without a gaming PC I went out and purchased a Samsung 970 Evo Plus NVMe and swapped out my old presumably dead one.

After 4 days of smooth sailing, I thought I was out of the woods.

Unfortunately, last night it BSOD'd again and started to exhibit very similar symptoms (system crashing to BSOD in intervals of about 30 seconds to 15 minutes from boot).

The only difference is that when it hits the BSOD, rather than waiting at the BSOD log gathering screen stuck at 0%, it will immediately reboot the system and return to Windows. The system will then BSOD again within 10-20 seconds, and then after the second reboot return to the BIOS. From what I've seen, the BOOT LED on the motherboard does not light up.

Lastly I updated the BIOS of my motherboard to the latest version but no change in behaviour.

Potential Next Steps?

At this point, I'm starting to run out of potential troubleshooting steps I can think of.
My next guess is that the faulty component is the Motherboard and not the NVMe SSD after all?

I'm thinking my next two options are:

  • Research MSI's warranty procedures hoping it's not a massive hassle/expense to get it RMA'd since it's within warranty, and then pray like hell that's the issue.
  • Walk down to the computer component store a few blocks away from my condo and just straight up buy another motherboard... and then pray like hell that's the issue.

Does anyone have any other potential next steps that I haven't thought of? Are there any other ways I can potentially pinpoint what the problem component is?

Thank you for your help!

You've done the things I would have done. A couple of thoughts:

1. In order of likelihood from least to most, I think the potential points of failure are the motherboard, PSU, weird USB conflict.

2. Weird USB conflict is easy enough to test. Turn on the PC with nothing other than the power cord and monitor cable plugged in. If it doesn't crash, plug things in one at a time, waiting between each one for a crash.

3. PSU. There was news about some PSUs not liking the higher end RTX cards. Don't think that applies to yours, but worth trying a different PSU if you have one available.

4. Try the m.2 drive in another slot. Looks like your mobo has 2?

5. If none of those things help, including number 4, then an RMA for the motherboard seems like the right choice.
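On point 3, a rough power-budget check can at least show whether the PSU is undersized on paper. This is a minimal Python sketch: the 750 W figure comes from the parts list, but the per-component draws are assumptions (rated board power for the RTX 3080, rated TDP for the 5600X, and a guess for everything else), not measurements from this system.

```python
PSU_WATTS = 750  # ADATA XPG CORE Reactor rating, from the parts list

# Assumed sustained draws in watts; these are rated or guessed values,
# not readings taken from this PC.
ASSUMED_DRAW_WATTS = {
    "RTX 3080 (rated board power)": 320,
    "Ryzen 5 5600X (rated TDP)": 65,
    "motherboard, RAM, SSD, fans (rough guess)": 75,
}


def headroom(psu_watts, draws):
    """Return (spare watts, fraction of the PSU rating in use)."""
    total = sum(draws.values())
    return psu_watts - total, total / psu_watts


spare, load = headroom(PSU_WATTS, ASSUMED_DRAW_WATTS)
print(f"Estimated sustained load: {PSU_WATTS - spare} W ({load:.0%} of PSU rating)")
print(f"Headroom: {spare} W")
```

Sustained numbers like these say nothing about brief power spikes, so swapping in a known-good PSU is still the real test.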

Thanks for the advice!!!

So it looks like RMA is the way to go which sucks since as someone relatively new to hardware, every time I have to touch a component (remove/put back) I feel like I run the risk of my finger slipping and me destroying something, so I'm not looking forward to tearing down my PC to get out the mobo.

I tried the m.2 drive in the secondary (slower) slot. It worked for a few hours but then BSOD'd again with the exact same symptoms.

If replacing the motherboard doesn't work, I'm guessing the next thing to try replacing is the PSU (unfortunately I don't have a spare one).

A question though about removing the CPU. I pulled out all of my old boxes from storage and realized I'm missing the plastic holder for the CPU. Without the plastic, what's the best way to store the CPU while the motherboard is being repaired? My understanding is that the pins are ultra fragile.

I was thinking the procedure would be

  • take off the heatsink
  • carefully clean off any thermal paste with a paper towel (and a q-tip with rubbing alcohol for anything stuck)
  • very carefully take out the CPU, and lay it in the anti static bag pin-side up
  • put it somewhere safe until the new motherboard comes in

Would this sound correct?

Very frustrating and I can only imagine how much time you've spent on this.

I am curious what you have the RAM running at? Looks like your mb only supports 3600 in XMP mode (technically overclock).

Have you tried unplugging all the psu cables and then putting them back in? Sometimes you don't get a good connection and redoing them works.

Neither suggestion explains the Boot LED turning on, though.

garion333 wrote:

I am curious what you have the RAM running at? Looks like your mb only supports 3600 in XMP mode (technically overclock).

Not sure what I should be looking at for this, I took a picture of the BIOS screen. I'm guessing the "XMP 3600mhz" in the center right is telling me it is.

IMAGE(https://i.imgur.com/MZlONZi.jpg)

garion333 wrote:

Have you tried unplugging all the psu cables and then putting them back in? Sometimes you don't get a good connection and redoing them works.

Hmm... I haven't yet, but I'll give it a shot tmrw. Just unplug everything and then replug it.

Looks like your motherboard has two profiles set to get your RAM up to 3600 MHz, but since it's set to 2666 MHz you don't have either active. This kind of rules out the memory to a degree, at least on configuration. It should be stable at 2666 MHz, so unless the RAM is just faulty that's probably not your issue. Looks like your DIMMs are in A2 and B2, which is exactly what the manual shows for two sticks, so that's good. Once you figure out the issue, you'll want to set your XMP profile to actually utilize the 3600 MHz configuration because... well, there's no point in having fast RAM and setting it to default speed.
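To put a rough number on what running at the default speed costs, here's a back-of-the-envelope Python sketch. It assumes the two sticks run in dual-channel mode with a standard 64-bit (8-byte) bus per channel, and it only gives theoretical peak figures; real-world gains are smaller. The function names are just for illustration.

```python
def peak_bandwidth_gb_s(mt_per_s, channels=2, bus_bytes=8):
    """Theoretical peak memory bandwidth in GB/s.

    mt_per_s is the transfer rate in MT/s (the "3600" in DDR4-3600).
    Dual channel and an 8-byte bus per channel are assumed defaults.
    """
    return mt_per_s * bus_bytes * channels / 1000


def cas_latency_ns(cl, mt_per_s):
    """First-word CAS latency in nanoseconds.

    DDR transfers twice per clock, so one clock cycle lasts 2000 / MT/s ns.
    """
    return cl * 2000 / mt_per_s


default_bw = peak_bandwidth_gb_s(2666)
xmp_bw = peak_bandwidth_gb_s(3600)

print(f"Peak bandwidth at 2666 MT/s: {default_bw:.1f} GB/s")
print(f"Peak bandwidth at 3600 MT/s: {xmp_bw:.1f} GB/s")
print(f"Ratio: {xmp_bw / default_bw:.2f}x")
print(f"CL16 at 3600 MT/s: {cas_latency_ns(16, 3600):.2f} ns")
```

The latency at 2666 isn't computed because the CAS timing the board picks at default speed isn't shown in the thread.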

Outside of stuff that's been mentioned, I just wanted to say that you should check your case's standoffs when you remove the motherboard. If an extra one is installed it could be intermittently shorting the board, and that can cause all kinds of headaches.

Gotcha thanks guys!

So last night I tried Garion's advice and reseated all power related plugs along with the memory and SSD (returning it to its original "primary" slot).

After bootup it seems to be relatively stable today (but as a naturally paranoid person I trust nothing) with the symptoms not happening as of yet.

If the symptoms do reoccur, I was looking at the RMA process for my motherboard with MSI and it's a hassle, and it takes like 20-30 days (not including shipping time) for them to get it done. Since I spend so much time on my PC, I'm thinking of just buying a new motherboard rather than going through the headache and worrying about it getting damaged in transit through Canada Post.

If I eventually go down that route, and I can't get a hold of a motherboard that's an exact replacement of my current one (MSI MPG B550 GAMING PLUS), are there any motherboards out there you guys recommend? It looks like "B550" seems to be a standard, so should I just look for any motherboard from a reputable manufacturer (MSI, ASUS, etc.) with B550 in the name?

I keep waiting for the inevitable hyperlink... maybe tomorrow? I wonder what the account's name will be.

merphle wrote:

I keep waiting for the inevitable hyperlink... maybe tomorrow? I wonder what the account's name will be.

I'm just enjoying observing the bots interact in their natural habitat.

Chairman_Mao wrote:
merphle wrote:

I keep waiting for the inevitable hyperlink... maybe tomorrow? I wonder what the account's name will be.

I'm just enjoying observing the bots interact in their natural habitat.

They are certainly more polite than your average meatspace interactions these days.

Chairman_Mao wrote:
merphle wrote:

I keep waiting for the inevitable hyperlink... maybe tomorrow? I wonder what the account's name will be.

I'm just enjoying observing the bots interact in their natural habitat.

But I'm not a bot lol

liverongug wrote:
Chairman_Mao wrote:
merphle wrote:

I keep waiting for the inevitable hyperlink... maybe tomorrow? I wonder what the account's name will be.

I'm just enjoying observing the bots interact in their natural habitat.

But I'm not a bot lol

I never said you were a bot. Welcome to the site!

But just to be safe, please click below to confirm:

IMAGE(https://www.cividesk.com/sites/cividesk.com/files/recapatcha.png)

merphle wrote:

I keep waiting for the inevitable hyperlink... maybe tomorrow? I wonder what the account's name will be.

Indeed.

MOD

Thanks for reporting when things look sus, folks; we're keeping an eye on things.

merphle wrote:
liverongug wrote:
Chairman_Mao wrote:
merphle wrote:

I keep waiting for the inevitable hyperlink... maybe tomorrow? I wonder what the account's name will be.

I'm just enjoying observing the bots interact in their natural habitat.

But I'm not a bot lol

I never said you were a bot. Welcome to the site!

But just to be safe, please click below to confirm:

IMAGE(https://www.cividesk.com/sites/cividesk.com/files/recapatcha.png)

Funny

You may not be a bot, but you're a spammer nonetheless.

*mod*

A link was edited into the post. Removed and banned. Nice catch merphle.