Supercomputer Frontier faces errors almost daily due to AMD and HPE solutions

In supercomputer design, both hardware and software have to be top-notch, yet Frontier suffers hardware failures every day. The system uses 9,472 AMD 64-core EPYC "Trento" processors, 37,888 AMD Instinct MI250X GPUs, and the Slingshot interconnect (12.8 terabits/second bandwidth) from HPE (Hewlett Packard Enterprise). HPE designed and built the supercomputer on its Cray EX architecture. Together, these components give the system a rated performance of 1.685 FP64 exaFLOPS.

Stability, however, is a major problem. Although the hardware has been installed and the system has been set up, Frontier is not yet being used for research because of the problems encountered. Not a day goes by at Oak Ridge National Laboratory without numerous hardware failures, and the resulting outages last for hours each day. The software can only work if the hardware runs smoothly, so these failures leave the installed software components corrupted, and they must be constantly reinstalled or updated.

This suggests that the architecture used in the system's design is not yet mature enough to cope with such complex hardware. Because the integrated circuits run on very high-frequency clock signals, it is also possible that the tolerances of the relevant circuit components were set too loosely. It was previously stated that the supercomputer would come online in 2022, but for now the answer to when it will be fully operational is "time will tell".

Source:
https://www.donanimhaber.com/dunyanin-en-hizli-super-bilgisayari-her-gun-hatayla-karsilasiyor--154515