Why Multi-Socket Systems Don't Always Perform Well on Y-Cruncher

If you check the leading scores for most multi-threaded benchmarks, you tend to find that the systems with the most threads win out. That stands to reason, of course. It also makes sense that submissions made on systems with more than one CPU are likely to score higher: more CPUs means more cores, more threads and therefore more performance. Or so you would think.

Y-Cruncher is a multi-threaded benchmark that does not follow this pattern. The Y-Cruncher Pi-25m score rankings page on HWBOT reveals that beyond octa-core systems, there is virtually no improvement in score. The record 48x CPU system scores 7sec 648ms. Compare that to the record score of 0sec 766ms set by Sandalo using a single octa-core Intel Core i7 5960X.

Y-Cruncher is supported in beta on HWBOT. It is an interesting benchmark program that can use multiple threads for a near-linear boost in calculation speed. It can also use and stress unlimited amounts of system memory, and it takes advantage of ISA extensions such as SSE and AVX that are available on most modern processors. HWBOT member Mystical is the creator of the newly updated version of Y-Cruncher. Luckily for us, he took some time to explain why multi-socket systems don’t always perform as they should. It turns out that it’s pretty much all about memory:

“On multi-socket systems, each processor socket has its own set of memory banks. A processor has fast access to its own set of memory. But if it needs to access memory that's elsewhere (on a different socket), it needs to go over the interconnect to get it from the other processor. So it's a lot slower. In other words, the assumption that is critical to y-cruncher's performance is no longer valid. Some memory is faster, and some memory is really slow - hence "Non-Uniform Memory Access". If you have two sockets, half the memory will be fast and the other half slow. If you have a lot of sockets, then the vast majority of the memory will be slow with respect to each individual processor.”
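The arithmetic behind that last point can be sketched in a few lines of Python. This is only a rough illustration, not a model of how Y-Cruncher actually behaves: it assumes memory is split evenly across sockets and that each processor touches all of it equally often. The function names and the latency figures (80 ns local, 140 ns remote) are made-up placeholders, not measurements.

```python
def remote_fraction(sockets: int) -> float:
    """Share of total memory that is remote to any one socket.

    Assumes memory is divided evenly between sockets, so each
    processor owns 1/sockets of it and the rest is remote.
    """
    if sockets < 1:
        raise ValueError("sockets must be at least 1")
    return (sockets - 1) / sockets


def average_latency_ns(sockets: int,
                       local_ns: float = 80.0,
                       remote_ns: float = 140.0) -> float:
    """Average access latency if accesses are spread evenly over all memory.

    local_ns and remote_ns are hypothetical placeholder values.
    """
    remote = remote_fraction(sockets)
    return (1.0 - remote) * local_ns + remote * remote_ns


if __name__ == "__main__":
    for sockets in (1, 2, 4, 8):
        share = remote_fraction(sockets)
        latency = average_latency_ns(sockets)
        print(f"{sockets} socket(s): {share:.0%} remote, ~{latency:.0f} ns average")
```

With two sockets, half the memory is remote; with eight, seven eighths of it is, so under these assumptions the average access cost creeps toward the slow remote figure as sockets are added.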

To understand more about this issue, go ahead and read the full post from Mystical here on the HWBOT forum.
