03-04-2017
Mysticial
Ryzen 1800X - Instant system crash when running sequence of FMA3 instructions. Request for verification.

One of my internal benchmark applications is insta-hard-freezing on Ryzen.
  • Ryzen 7 1800X
  • Asus Prime B350M-A (BIOS 0502)
  • 4 x 8GB Corsair CMK32GX4M4A2400C14 @ 2133 MHz
  • Nothing is overclocked. Everything is stock.
  • Windows 10 Anniversary Update

When I run the Haswell binary from here: https://github.com/Mysticial/Flops/t...naries-windows

The entire system usually freezes when it gets to:

Quote:
Single-Precision - 128-bit FMA3 - Fused Multiply Add:
Sometimes, it will make it past that, but it usually ends up crashing/freezing later on in the test anyway.

For those who don't trust the binary, the program is completely open-sourced in that GitHub repo. If you have Visual Studio installed: Open the project, build the x64 Haswell binary, and run.

For me this always hard freezes the computer:
  • At all clock speeds.
  • When running single-threaded, it happens to any core that I pin it to.
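The post doesn't say how the pinning was done. One way to do it from inside a program, as a minimal sketch: build a one-bit affinity mask and hand it to the Win32 `SetThreadAffinityMask` call. The helper names `single_core_mask` and `pin_current_thread` are made up for illustration and are not part of the benchmark.

```cpp
#include <cassert>
#include <cstdint>

#ifdef _WIN32
#include <windows.h>
#endif

// Build an affinity mask that selects exactly one logical core (0-based).
// A 64-bit mask is enough for the 16 logical cores of an 1800X.
std::uint64_t single_core_mask(unsigned core) {
    assert(core < 64 && "mask only has 64 bits");
    return std::uint64_t{1} << core;
}

// Pin the calling thread to one logical core. Returns true on success.
// Assumption: on non-Windows builds this is only a stub that reports failure.
bool pin_current_thread(unsigned core) {
#ifdef _WIN32
    DWORD_PTR mask = static_cast<DWORD_PTR>(single_core_mask(core));
    // SetThreadAffinityMask returns the previous mask, or 0 on failure.
    return SetThreadAffinityMask(GetCurrentThread(), mask) != 0;
#else
    (void)core;
    return false;
#endif
}
```

Calling `pin_current_thread(n)` at the start of a single-threaded run, once for each `n`, would be one way to check whether the freeze follows a particular core.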

The questions that I want to answer are:
  1. Is this specific to my setup? No - Confirmed by multiple other people.
  2. Is this specific to Asus mobos or an immature BIOS? If so, can it be fixed with a later BIOS?
  3. Is this an issue with Windows? The crash does not seem to happen in Linux, but that is with slightly different code due to differing compilers.
  4. Is this a CPU erratum? (I hope not - however unlikely it might be.)

---------------------------

Current Testing Status:

All of these are running Windows, and are at stock settings or underclocked.

Confirmed Crashes:
  1. 1800X + Asus Prime B350M-A (BIOS 0502)
  2. 1700 + Asus Prime B350M-A (BIOS ???)
  3. 1700 + Asus Crosshair VI Hero
  4. 1700 + Asus Crosshair VI Hero (BIOS 5803) (two sets of memory G.Skill + Kingston - also fails with overvolted SOC)
  5. 1800X + Asus Crosshair VI Hero (Windows 7) - One pass, mostly failures.

Confirmed No-Crash:
  1. none yet


For those interested in the technical details, I'm getting hard freezes for all types of FMAs (128-bit, 256-bit, single and double precision). But for some reason, it only affects this particular benchmark. Other programs (like prime95 and y-cruncher) aren't affected despite using FMAs.
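For anyone unfamiliar with the instruction: a fused multiply-add computes a*b+c with a single rounding step. A rough sketch of a dependent FMA chain, using the standard `std::fma` rather than intrinsics, is below. This is an illustration only. It is not the benchmark's kernel (that lives in the GitHub repo), the function name `fma_chain` is made up, and there is no reason to expect it to reproduce the freeze. Whether the compiler emits actual FMA3 instructions for it depends on the build flags (for example `-mfma` on GCC/Clang or `/arch:AVX2` on Visual Studio).

```cpp
#include <cmath>
#include <cstddef>

// Run `iterations` dependent fused multiply-adds: x = x * mul + add.
// Each step depends on the previous result, so the chain is latency-bound.
// std::fma rounds once per step, matching what a hardware FMA computes.
double fma_chain(double x, double mul, double add, std::size_t iterations) {
    for (std::size_t i = 0; i < iterations; ++i) {
        x = std::fma(x, mul, add);
    }
    return x;
}
```

A throughput-oriented benchmark would typically run many independent chains like this at once to keep the FMA units busy.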

---------------------------

Update 3/16/2017:

As little as I expected this to be the case, this appears to have been confirmed as an erratum in the AMD Zen processor. In other words, the last item on my list (and the most serious). Fortunately, it's one that is fixable with a microcode update and will not result in something catastrophic like a recall or the disabling of features.

To everyone pouring in from the various news sites:
  • The important part is that a user-mode program should not be able to hard freeze the entire system. Because if it can (as is the case here), it makes it possible to perform denial-of-service (DoS) attacks. IOW, this erratum is a security issue.
  • Don't be fooled by the "Haswell binary". The benchmark is 5 years old and I've largely neglected it for the last 3, so I haven't updated it for Zen yet. Any processor can run any of the binaries as long as it supports the underlying instruction sets. If it doesn't, the program merely crashes with an "illegal instruction" error. Under no circumstances should a user-mode application be able to bring down an entire system.
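To check up front whether a processor advertises FMA3 instead of waiting for an "illegal instruction" crash, a program can query CPUID. A minimal sketch, assuming GCC or Clang on x86 (where `<cpuid.h>` provides `__get_cpuid`); the helper names `ecx_has_fma3` and `cpu_has_fma3` are made up for illustration and the benchmark itself does not do this:

```cpp
#include <cstdint>

#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
#include <cpuid.h>
#define SKETCH_HAVE_GNU_CPUID 1
#endif

// CPUID leaf 1 reports FMA3 support in ECX bit 12.
constexpr bool ecx_has_fma3(std::uint32_t ecx) {
    return ((ecx >> 12) & 1u) != 0;
}

// Returns true if the CPU advertises FMA3.
// Assumption: falls back to false on compilers/targets without <cpuid.h>.
bool cpu_has_fma3() {
#ifdef SKETCH_HAVE_GNU_CPUID
    unsigned eax = 0, ebx = 0, ecx = 0, edx = 0;
    if (__get_cpuid(1, &eax, &ebx, &ecx, &edx) == 0) {
        return false;  // leaf 1 not supported
    }
    return ecx_has_fma3(ecx);
#else
    return false;
#endif
}
```

This only reads the CPU's feature bit. A complete check would also confirm that the OS saves the AVX register state (the OSXSAVE bit plus XGETBV), and a Visual Studio build would use the `__cpuid` intrinsic from `<intrin.h>` instead.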
Last edited by Mysticial; 03-16-2017 at 18:15.