3DMark Time Spy-Gate: In Summary

What do you get if you take a pair of new GPU architectures, add a new API and a new benchmark? Answer: a whole load of debate and plenty of discordant noise. It all started when some clued-up people on the Overclock.net forums began debating the relative pros and cons of Futuremark's decision not to implement any vendor-specific architectural optimizations in its new Time Spy benchmark, which rolled out earlier this week.

The AMD camp fiercely points out that AMD has invested a great deal in making sure its architecture is optimized for the arrival of DX12, and more specifically for 'asynchronous compute'. Asynchronous compute in DX12 allows applications to submit graphics and compute work to separate queues that the GPU can execute concurrently. This means that a 3D video game application can a) know which GPU is being used and b) make workload and queuing decisions to optimize the experience.
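To make that concrete, here is a minimal sketch in D3D12-style C++ (an illustration of the mechanism, not taken from Time Spy or any actual game) of how an application sets up a compute queue alongside its graphics queue. Whether work on the two queues actually overlaps is left entirely to the driver and hardware:

```cpp
#include <d3d12.h>
#include <wrl/client.h>

using Microsoft::WRL::ComPtr;

// Minimal sketch: create a direct (graphics) queue and a separate
// compute queue on the same device. Whether work submitted to the
// compute queue actually runs concurrently with graphics work is
// entirely up to the driver and hardware.
void CreateQueues(ID3D12Device* device,
                  ComPtr<ID3D12CommandQueue>& graphicsQueue,
                  ComPtr<ID3D12CommandQueue>& computeQueue)
{
    D3D12_COMMAND_QUEUE_DESC desc = {};

    desc.Type = D3D12_COMMAND_LIST_TYPE_DIRECT;   // graphics + compute + copy
    device->CreateCommandQueue(&desc, IID_PPV_ARGS(&graphicsQueue));

    desc.Type = D3D12_COMMAND_LIST_TYPE_COMPUTE;  // compute + copy only
    device->CreateCommandQueue(&desc, IID_PPV_ARGS(&computeQueue));
}
```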

One example used to illustrate how AMD has worked hard, and succeeded, at optimizing its GPUs for a specific title is the massively improved experience in Doom when using the Vulkan API (an alternative to DX12 which also supports asynchronous compute). According to German tech site ComputerBase, a true implementation of asynchronous compute would give AMD a significant performance boost, whereas Nvidia would see significantly less improvement.

So why did Futuremark decide not to implement vendor-specific optimizations for asynchronous compute? After all, it is a key feature of DX12, and Time Spy is billed as Futuremark's first DX12 benchmark. A statement released by the company explains:

“Asynchronous compute is one of the most interesting new features in DirectX 12…. The implementation is the same regardless of the underlying hardware. In the benchmark at large, there are no vendor specific optimizations in order to ensure that all hardware performs the same amount of work. This makes benchmark results from all vendors comparable across multiple generations of hardware. Whether work placed in the COMPUTE queue is executed in parallel or in serial is ultimately the decision of the underlying driver.”

So Futuremark are clearly trying to dispel any accusations of bias, arguing instead that vendor-specific optimizations would in fact be unfair. Users commenting on a reddit thread on the subject tend to disagree, however:

“…they just confirmed it's not a proper DX12 benchmark due to it not utilizing the benefits of DX12 low level optimization, all in the sake of "fairness" they used a single path... the path that fits Pascal architecture capabilites.”

Reading the forum thread on OC.net and other comments around the web, it's clear that emotions between the green and red camps can certainly run high. My view is that the customer should ultimately have the choice. If I want to assess how well a GPU vendor is doing in terms of low-level optimizations that 'get closer to the metal' of a GPU, why shouldn't a benchmark app provide me with that opportunity? Likewise, if I am of the opinion that a single, common code path is fairer, perhaps I should have that option too.

Yup. I vote for an on/off switch. Please add your thoughts in the forum thread below.



sdougal says:

Looking forward to all your views and opinions....

K404 says:

PhysX is not allowed for Vantage (HWBot), tessellation cannot be disabled for 3DM11 (FM) (am I right in saying that?), and now this.

It's a synthetic bench; the normal rules of what's appropriate don't really come into it. If people want a better 3DMark score, they should buy the card that gives it, as the rules stand. Or, if they're going to play actual games, they should buy the card that, overall, gives better performance in the games that they play.

I don't see what the fuss is about. This is only important until the next cards come out, or a new driver comes out with improvement X.

A lot of us use LN2 and voltmods. People aren't lining up to defend those as part of an equal playing field :p


First-world problems. "Boo hoo. I'm not allowed to tick/untick the box in my e-peen waving campaign"

FM_Jarnis says:

Sooo... let's see: hypothetical "AMD optimized" and "Nvidia optimized" switches... who decides when they are "properly optimized"? Who programs the code paths? If Futuremark does, you do know that as soon as the numbers are in, one or both "teams" will yell that their code path is not "properly optimized", right?

AMD, NVIDIA and Intel engineers have all seen the source and visited the Futuremark offices in the past six months, discussing the implementation, offering their optimizations, etc., and anything that helps while not hindering any other architecture is generally accepted. We still strongly doubt that game developers will spend the time and money to do vendor-specific code paths (with the exception of sponsored games). Also, AMD, NVIDIA, Intel and Microsoft have all indicated that they do not want vendor-specific paths in 3DMark, as it would devalue it as a useful, neutral benchmark to them.

How about using games to figure out who has the best team of engineers optimizing for their architecture (see the green or red logo for the team responsible for each game), and 3DMark as the "neutral ground"?

Massman says:

How about each "team" commits their own code path? The source code is available to all BDP members, so any trickery can be spotted. Wouldn't the situation be similar to PCMark 8, where you have the choice between a default and an accelerated benchmark? In this case the Time Spy benchmark can be used to show how fast the GPUs are without optimization and how fast they are with optimization. That doesn't sound like a bad situation for the end-user?

Massman says:

Oh, also, thumbs up for joining the PCPer podcast :) [ythd]8OrHZPYYY9g[/ythd]

FM_Jarnis says:

Massman said: How about each "team" commits their own code path? The source code is available to all BDP members, so any trickery can be spotted. Wouldn't the situation be similar to PCMark 8, where you have the choice between a default and an accelerated benchmark? In this case the Time Spy benchmark can be used to show how fast the GPUs are without optimization and how fast they are with optimization.

That doesn't sound like a bad situation for the end-user?


Uh, any vendor-specific optimizations would, by definition, be "trickery". The biggest issue is that as soon as you have separate paths, how do you truly enforce that both do the exact same work? You really can't.

Most vendor-specific optimizations in games involve actually changing the work subtly, to trade off some slight differences in image quality, or to use a different algorithm for similar-but-not-identical output. That's fine for games, when all you really care about is the framerate and that the output looks acceptable visually.

In a benchmark that makes an honest effort to be reproducible and fair, doing vendor-specific optimizations would rapidly turn into two benchmarks (then we'd have those mythical beasts, "AMDmark" and "NVIDIAmark") and results from neither would be directly comparable.

PCMark 8 is very different: there the difference between Conventional and Accelerated is "do we use OpenCL for compute?". This would be comparable to a very, very early benchmark offering "software renderer or DirectX or OpenGL?". Scores from those two are not comparable against each other.
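(As a loose illustration of that kind of split, and emphatically not PCMark 8's actual code, a "Conventional vs. Accelerated" decision boils down to a capability check along these lines:)

```cpp
#include <CL/cl.h>

// Sketch of the kind of capability check behind a "Conventional vs.
// Accelerated" split: is there any OpenCL-capable GPU device at all?
// Illustration only, not PCMark 8 source code.
bool HasOpenCLGpu()
{
    cl_uint numPlatforms = 0;
    if (clGetPlatformIDs(0, nullptr, &numPlatforms) != CL_SUCCESS || numPlatforms == 0)
        return false;  // no OpenCL runtime: fall back to the conventional path

    cl_platform_id platform = nullptr;
    clGetPlatformIDs(1, &platform, nullptr);

    cl_uint numDevices = 0;
    clGetDeviceIDs(platform, CL_DEVICE_TYPE_GPU, 0, nullptr, &numDevices);
    return numDevices > 0;  // GPU device present: the accelerated path is possible
}
```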

3DMark was never designed to be the one that squeezes every single vendor-specific processor cycle out of each GPU. Game developers in the real world, where schedules and budgets say you can't spend a year to gain 3% on one vendor, tend to agree as well. It is quite well optimized, but in a generic way: optimizations that benefit various hardware but do not harm performance on others.

I know the HWBot "use case" is very, very different from normal benchmarking, but even here the fact that each GPU from each vendor is actually pushed to do the same work should definitely matter.

speed.fastest says:

FM_Jarnis said: In a benchmark that makes an honest effort to be reproducible and fair, doing vendor-specific optimizations would rapidly turn into two benchmarks (then we'd have those mythical beasts, "AMDmark" and "NVIDIAmark") and results from neither would be directly comparable.


Maybe I'm blind, but in real-world DirectX 12 or Vulkan gaming the RX 480 is faster than the GTX 1060, even at stock clocks. Yet in 3DMark Time Spy the GTX 1060 is faster. I don't know whether Futuremark created 3DMark to emulate the current gaming situation, or to make sure 3DMark is not biased into an "AMDmark" or "Nvidiamark". But who am I :)

FM_Jarnis says:

speed.fastest said: Maybe I'm blind, but in real-world DirectX 12 or Vulkan gaming the RX 480 is faster than the GTX 1060, even at stock clocks. Yet in 3DMark Time Spy the GTX 1060 is faster. I don't know whether Futuremark created 3DMark to emulate the current gaming situation, or to make sure 3DMark is not biased into an "AMDmark" or "Nvidiamark". But who am I :)


Have you considered how many of those DX12 games are actual DX12 engines, built DX12-first, and how many are DX11 games with a DX12 renderer fitted in after the fact?

Unfortunately there is really no way to "prove" this either way until we have considerably more DX12 games. Could we take another look, say, 12 months from now? I'm venturing an educated guess, with the backing of our engine team, that Time Spy will track the performance of DX12 titles over a larger sample size quite well.

(Adding Vulkan to DX12 comparisons gets an "Objection, relevance!" from me, since it is a different API; plus the only Vulkan game out there has an AMD-optimized code path only at this very moment, so using it for AMD vs. NV comparisons is not the greatest idea.)

speed.fastest says:

Maybe those are not "really DX12 engines", but from this review all I can say is that even in Nvidia-sponsored DX12 titles and "not true DX12" games, the RX 480 is faster than the GTX 1060. The source I used for comparison: HARDOCP - Introduction - NVIDIA GeForce GTX 1060 Founders Edition Review

And I hope Time Spy is the next Sky Diver, not the next Fire Strike ;)

buildzoid says:

If games use different code paths and vendor-specific optimizations, benchmarks should do that too, especially with low-level APIs like DX12 or Vulkan, which were created to allow programmers to play to the strengths of each architecture. The way I see it, 3DMark should have an Nvidia path and an AMD path, and implement all available speed-increasing tricks in each path, so long as the tricks don't impact visual quality.

dabar_Solta says:

So they say they don't want to damage Nvidia scores by activating a DX12 feature in the way it should work, but damaging AMD does not matter, even if it means disabling async in the way it should work. So in the end it is "Nvidiamark" for now.

FM_Jarnis says:

dabar_Solta said: So they say they don't want to damage Nvidia scores by activating a DX12 feature in the way it should work, but damaging AMD does not matter, even if it means disabling async in the way it should work. So in the end it is "Nvidiamark" for now.


I really thought the forums here had more knowledgeable people than this.

There is no "activating a DX12 feature in the way it should work" that *can* be done.

Time Spy submits a compute queue, flagged for asynchronous processing, and the rest is up to the drivers to handle. Do not believe the garbage that some people have posted on the internet about the subject. Yes, Maxwell drivers ignore the request and process everything sequentially anyway, getting zero gains. That is perfectly fine by the DX12 spec.
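(To put that in concrete terms, here is a hedged sketch of what such a submission looks like at the D3D12 API level; the pass names are hypothetical and this is not Time Spy source code. Note there is no switch that forces parallel execution, only a fence that tells the driver where the dependency is:)

```cpp
#include <d3d12.h>

// Hedged sketch, not Time Spy source. The pass names (shadowPass,
// particleSim, lightingPass) are hypothetical examples.
void SubmitFrame(ID3D12CommandQueue* graphicsQueue,
                 ID3D12CommandQueue* computeQueue,
                 ID3D12Fence* fence, UINT64& fenceValue,
                 ID3D12CommandList* shadowPass,
                 ID3D12CommandList* particleSim,
                 ID3D12CommandList* lightingPass)
{
    // Independent graphics work: free to overlap with the compute queue.
    graphicsQueue->ExecuteCommandLists(1, &shadowPass);

    // "Async" compute is just work on a COMPUTE-type queue plus a fence
    // signal marking its completion.
    computeQueue->ExecuteCommandLists(1, &particleSim);
    computeQueue->Signal(fence, ++fenceValue);

    // Work that consumes the compute results waits on the fence first.
    // Between the Signal and the Wait, the driver may overlap the queues
    // (GCN) or serialize them (Maxwell); both are valid per the DX12 spec.
    graphicsQueue->Wait(fence, fenceValue);
    graphicsQueue->ExecuteCommandLists(1, &lightingPass);
}
```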

Besides, if it were "Nvidiamark" as you suggest, why would AMD promote the benchmark on their site? Both AMD and NVIDIA are perfectly fine with the test being unbiased and fair.

speed.fastest says:

As far as I know, with async on in Time Spy the RX 480 gets a bigger improvement than the GTX 1060 in percentage terms. All that matters is the score, and I don't know which DX12 game Time Spy is supposed to be based on. I see no DX12 game where the GTX 1060 is faster than the RX 480; at best they are on par in Nvidia titles, even in games that are not fully DX12, like Rise of the Tomb Raider.

FM_Jarnis says:

speed.fastest said: As far as I know, with async on in Time Spy the RX 480 gets a bigger improvement than the GTX 1060 in percentage terms. All that matters is the score, and I don't know which DX12 game Time Spy is supposed to be based on. I see no DX12 game where the GTX 1060 is faster than the RX 480; at best they are on par in Nvidia titles, even in games that are not fully DX12, like Rise of the Tomb Raider.


Because the sample size of DX12 games is still non-existent.

Please come again when we have, let's say, 20 different DX12 games.

Pretty much every DX12 game today, with the possible exception of Ashes of the Singularity, is a DX11 game with a DX12 renderer bolted on.

dabar_Solta says:

@fm_jarnis
Actually, there is info somewhere that Nvidia dedicates individual CUs to compute only, instead of running graphics and compute together, so it is async when you look at the chip as a whole, but there is no hardware in Nvidia GPUs for real async: 100% of the GPU running graphics plus, let's say, 30% running compute without a performance hit on that 100% of graphics. Time Spy is made with this in mind, and that just plain sucks and favours the Nvidia "solution".
