Intel Haswell Overclocking Fully Disclosed - Theory For Core i7 4770K!

  • News, Editorials, Articles, Intel
  • HWBOT

Author: Pieter-Jan Plaisier

As I am writing this, I am flying back from Beijing to Taipei. Tired, but intellectually challenged by the speakers at IDF 2013 Beijing, I am composing this article to disclose all the information I have picked up at the event. On the first day, we (the IDF attendees) had two one-hour introductions to the new Haswell micro-architecture, but the most valuable and interesting information came on the second day. In the technical seminar about overclocking, featuring GIGABYTE’s Hicookie demoing a 7GHz Core i7 3770K, Intel representatives disclosed a lot of valuable information on overclocking the upcoming Haswell. Check out the article for the full disclosure – hope you enjoy it!


Haswell Novelties In Short – What is New?


As the technical readers amongst you already know, Haswell is a new Tock in Intel’s micro-architectural evolution. A Tock is Intel’s nomenclature for implementing a new micro-architecture on an already known and working silicon process, in this case 22nm. Previous Tocks were Sandy Bridge and Nehalem. The next generation of Intel products will be a Tick and brings the Haswell micro-architecture to 14nm. We know this fifth generation of Intel Core products as Broadwell. The next Tock, after Broadwell in 2014, is Skylake, which is scheduled for a 2015 release.

So what is new about Haswell? From a practical overclocker point of view, there is not that much new. However, from an engineering and focus-on-overclocking point of view a lot has changed.

The most dramatic novelty in Haswell is moving a large chunk of the VRM from the motherboard into the CPU package. This is quite an impressive feat from a design perspective! Many people, among them enthusiasts such as you and me, feared that Intel taking control of the VRM design would limit overclocking in the same way that integrating the clock generator limited overclocking on Sandy Bridge. Luckily, that is not the case.

Apart from the VRM implementation, Haswell introduces a couple of new instruction sets, an on-die eDRAM and a variety of improvements for overclocking.


Haswell Frequency Control – The Basics


As the main structural parts of the Haswell micro-architecture are quite similar to those of Ivy Bridge, we can distinguish mostly similar sections on the CPU die:

  • Core(s)
  • L1, L2 and L3 cache
  • Ring bus
  • System Agent
  • Integrated Memory Controller
  • Integrated Graphics Processor
  • eDRAM IC

To put it in simple terms, the Ring bus interconnects the various sections of the architecture, just like on Sandy Bridge. The L1 and L2 caches are still exclusive to the CPU cores, meaning each CPU core has its own L1 and L2 cache, and the larger L3 cache is shared between all cores as well as the integrated graphics processor. The ring bus serves as the data bus that transfers data around the CPU die. New in Haswell is an in-house designed on-die IC called eDRAM, which according to Intel serves as an additional level of cache to boost GT3 performance, as it provides significantly more bandwidth and lower access latency than regular DDR3. The eDRAM is said to be large enough to hold frame buffers.

Let’s talk overclocking!


Haswell Frequency Controls – The Details.


As for the Core frequency, Intel added register bits for the turbo multipliers, increasing the theoretical maximum multiplier from 63x to 80x (!). The BCLK frequency is still the standard 100MHz and the voltage is programmable via the iVR (more on that later). Note that opening up the CPU ratio register to 80x is no guarantee that the CPUs are capable of operating anywhere near 80×100MHz. As Intel puts it, its engineers look for ways to give users as many tools as possible to find the physical limits of the design, rather than be arbitrarily limited.
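To make the arithmetic concrete, here is a minimal Python sketch of the ratio × BCLK relationship. The function and constant names are my own illustration, not an Intel interface, and the result is only the theoretical ceiling of the register, not a frequency any CPU is promised to reach.

```python
# Illustrative sketch only: names are assumptions, not an Intel API.
MAX_RATIO_IVY_BRIDGE = 63   # previous theoretical maximum multiplier
MAX_RATIO_HASWELL = 80      # new theoretical maximum multiplier
DEFAULT_BCLK_MHZ = 100.0    # standard BCLK frequency


def core_frequency_mhz(ratio: int, bclk_mhz: float = DEFAULT_BCLK_MHZ) -> float:
    """Core frequency is the CPU ratio multiplied by the base clock."""
    if not 1 <= ratio <= MAX_RATIO_HASWELL:
        raise ValueError(f"ratio must be between 1 and {MAX_RATIO_HASWELL}")
    return ratio * bclk_mhz


if __name__ == "__main__":
    # Theoretical register ceilings, not achievable clocks.
    print("63x ceiling:", core_frequency_mhz(MAX_RATIO_IVY_BRIDGE), "MHz")
    print("80x ceiling:", core_frequency_mhz(MAX_RATIO_HASWELL), "MHz")
```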

Regarding the BCLK frequency, we now have at our disposal the same BCLK gear ratios that exist on X79 but were not present on Ivy Bridge. In case you did not know: about a year ago, at a technology session at CeBIT 2012, I asked Intel why these gear ratios were not available on Ivy Bridge. They explained that there was simply not enough time to add them to the Ivy Bridge micro-architecture. It is not just a matter of “adding the options”. The gear ratios require an additional DB1200 clock multiplier on the CPU package, which is not as simple as it sounds!

The gear ratio selection includes 1.00x, 1.25x and 1.67x as BCLK “multipliers”, giving you a base of 100MHz, 125MHz or 167MHz. For those who wonder, the BCLK gear ratios seem to be fully functional with the S3 sleep state enabled. This means your Haswell system is able to resume from S3 while running a non-default BCLK gear ratio. For those who wonder if Haswell supports the 2.50x gear ratio that was also available on the early X79 motherboards, the answer is yes and no. Yes, theoretically, it might be possible to use a 5:2 PEG/DMI ratio, which is the 2.50x BCLK gear ratio, but implementation depends on the motherboard vendor. We will find out when the products launch.

The range of BCLK overclocking, for each gear ratio, is still the same as on Ivy Bridge: about five to seven percent (5-7%). On Ivy Bridge, the highest BCLK frequency we have seen so far is 117MHz (+17%) and there is no reason to assume this will be worse on Haswell. So perhaps we might see 167MHz x 1.17 = 195MHz BCLK frequency?
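The gear ratio arithmetic can be sketched in a few lines of Python. The helper names are hypothetical, and the assumption that the headroom scales proportionally with the gear ratio is mine, based only on the percentages quoted above.

```python
# Illustrative sketch only: helper names are assumptions.
REFERENCE_BCLK_MHZ = 100.0
GEAR_RATIOS = (1.00, 1.25, 1.67)  # the selectable BCLK "multipliers"


def strap_mhz(gear_ratio: float) -> float:
    """Nominal BCLK for a given gear ratio."""
    return REFERENCE_BCLK_MHZ * gear_ratio


def max_bclk_mhz(gear_ratio: float, headroom: float) -> float:
    """Upper BCLK estimate, assuming headroom is a fraction of the strap."""
    return strap_mhz(gear_ratio) * (1.0 + headroom)


if __name__ == "__main__":
    for ratio in GEAR_RATIOS:
        typical = max_bclk_mhz(ratio, 0.07)   # typical 5-7% range
        best = max_bclk_mhz(ratio, 0.17)      # best Ivy Bridge result (+17%)
        print(f"{ratio:.2f}x: strap {strap_mhz(ratio):.0f}MHz, "
              f"typical up to {typical:.1f}MHz, best case {best:.1f}MHz")
```

With the 1.67x ratio and the +17% best case, this lands at roughly 195MHz, matching the estimate above.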

Completely new for Haswell is gaining control over the Ring Bus frequency. In rough terms, you can see this ring bus frequency as the new “uncore frequency” – although that is technically not correct. The ring bus is adjustable up to 80x ratio as well, just like the CPU, and is typically not clocked higher than the CPU core frequency. As Intel stated at IDF 2013 Beijing, they have not seen much performance increase from overclocking the ring bus frequency, but did hint at the possibility of the ring bus frequency affecting overall system stability. It will be interesting to see what overclockers (read: you!) will come up with in this respect.

As we know, Ivy Bridge featured reasonably mediocre memory overclocking, at least compared to AMD’s record-breaking memory overclocking capabilities. DRAM frequency overclocking seems to have improved on the new Haswell CPUs, as Intel now officially supports ratios all the way up to DDR3-2933. Although a bunch of Z77 motherboards already support the DDR3-2933 ratio, it is not part of the official Intel specification. Having this ratio supported by Intel is in fact a step forward, as it means it passed Intel’s internal validation and qualification at this rated speed. Nice!

Last but not least, the IGP clock frequency is also still unlocked. Intel provides up to a 60x ratio, in 50MHz steps. This means a theoretical maximum overclock of 3GHz core frequency. Given that the GT2 of Ivy Bridge can currently reach about 2GHz under extreme cooling, we do not expect the more complex GT3e to reach that 3GHz, but who knows? Overclocking has never been an exact science.
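As a quick sanity check on those IGP numbers, the following Python sketch works backwards from a target frequency to the ratio you would have to set. The function name is hypothetical. Only the 50MHz step and the 60x limit come from the presentation.

```python
# Illustrative sketch only: the function name is an assumption.
IGP_STEP_MHZ = 50    # IGP frequency per ratio step
IGP_MAX_RATIO = 60   # highest ratio Intel provides


def igp_ratio_for(target_mhz: int) -> int:
    """Smallest IGP ratio that reaches at least target_mhz."""
    ratio = -(-target_mhz // IGP_STEP_MHZ)  # ceiling division on integers
    if ratio > IGP_MAX_RATIO:
        raise ValueError(f"{target_mhz}MHz needs more than {IGP_MAX_RATIO}x")
    return ratio


if __name__ == "__main__":
    print("2GHz needs ratio", igp_ratio_for(2000))
    print("3GHz needs ratio", igp_ratio_for(3000))
```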

As far as the eDram frequency goes, no information has been given on clock domains or control for that matter.


Haswell Voltage Control – The iVR Options.


The most significant change in Haswell, at least for tech enthusiasts, is of course the integration of the voltage regulation. On any of the architectures before Haswell, the CPU had a number of external voltage regulators, each powering one of the various parts of the CPU. There was a separate VR input for the Core voltage (Vcore), IO (Vio), graphics processor (Vgfx), System Agent (Vsa) and the PLL clock generator (Vpll). With Haswell, this changes. Instead of having separate voltage input rails, Intel has merged them all into a single input.

This one input goes by the name Vccin and serves as the input voltage for the integrated voltage regulator (iVR). It supports up to 3.04V input, which means a lot of current can be driven into the CPU. The iVR uses the input to distribute voltage to the various parts inside the CPU. Unlike what many feared, Intel has not limited the voltage options for overclocking. You can still deliver all the way up to 2.0V to the cores, the ring and the integrated graphics, and it supports a 500mV offset for the System Agent and the IO. The rules for overvoltage on Haswell are similar to Ivy Bridge with one very important exception:

As a rule of thumb, the Vccin should be at least 400mV higher than the Vcore. In other words: Vccin >= Vcore + 400mV.
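Expressed as code, the rule of thumb and the two limits mentioned above could be checked like this. This is a sketch under my own assumptions: the function name is hypothetical, the values are kept in millivolts to avoid floating-point rounding, and real BIOS implementations may apply different or additional checks.

```python
# Illustrative sketch only: names are assumptions, limits are from the text.
from typing import List

VCCIN_MAX_MV = 3040        # maximum supported Vccin input
VCORE_MAX_MV = 2000        # maximum voltage deliverable to the cores
MIN_HEADROOM_MV = 400      # rule of thumb: Vccin >= Vcore + 400mV


def vccin_issues(vccin_mv: int, vcore_mv: int) -> List[str]:
    """Return a list of rule violations; an empty list means none found."""
    issues = []
    if vccin_mv > VCCIN_MAX_MV:
        issues.append(f"Vccin {vccin_mv}mV exceeds the {VCCIN_MAX_MV}mV limit")
    if vcore_mv > VCORE_MAX_MV:
        issues.append(f"Vcore {vcore_mv}mV exceeds the {VCORE_MAX_MV}mV limit")
    if vccin_mv < vcore_mv + MIN_HEADROOM_MV:
        needed = vcore_mv + MIN_HEADROOM_MV
        issues.append(f"Vccin should be at least {needed}mV for this Vcore")
    return issues


if __name__ == "__main__":
    print(vccin_issues(vccin_mv=1900, vcore_mv=1350))  # enough headroom
    print(vccin_issues(vccin_mv=1600, vcore_mv=1350))  # too little headroom
```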


Haswell Voltage Control – The Overvoltage Options.


One aspect of overvolting and overclocking the Haswell micro-architecture that the speakers were very proud of is the range of possibilities for fine-tuning your system overclock. In fact, the possibilities for fine-tuning Haswell are practically the same as for Ivy Bridge which, given the complexity of the iVR, Intel representatives see as a major achievement.

First, let us quickly go over the options for overclocking again. All of the options are based on (re-)configuring the Turbo mode feature. As you know, Intel allows any processor that supports Turbo Mode to increase the clock frequency automatically. A couple of parameters define the maximum Turbo Mode overclock:

  • CPU Specification: the highest clock frequency is determined by the CPU SKU – some products have more options than others.
  • Workload: depending on the type of workload, all cores can be overclocked or one core can be overclocked while others remain at default.
  • Temperature: the CPU is overclocked unless the critical temperature threshold is reached.
  • Current: the CPU is overclocked unless the critical current threshold is reached.

Based on these four parameters, the CPU is automatically overclocked via turbo mode. The turbo modes are defined by so-called P-states, where P0 represents the highest performance state. In other words, P0 runs at the highest defined turbo multiplier.

Note that all of the overclocking on Sandy Bridge, Ivy Bridge and Haswell happens through the turbo multipliers. Effectively, it means that when overclocking any of these platforms, you are always configuring the turbo mode. Even when you apparently “disable” turbo mode! Actually, by disabling turbo mode in the BIOS you force the P-state to always be P0. Therefore, maybe a bit ironically, by disabling the turbo, you are actually enabling it permanently.
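To illustrate how the four parameters interact, here is a heavily simplified Python sketch. Everything in it is an assumption for illustration: the class, the function, the ratio table and the thresholds are made-up example values, not the specification of any real SKU, and real hardware steps down gradually rather than dropping straight to the base ratio.

```python
# Illustrative sketch only: all names and numbers are hypothetical examples.
from dataclasses import dataclass
from typing import Dict


@dataclass
class TurboConfig:
    base_ratio: int                  # non-turbo ratio
    turbo_ratios: Dict[int, int]     # active core count -> max turbo ratio (SKU)
    temp_limit_c: float              # critical temperature threshold
    current_limit_a: float           # critical current threshold


def select_ratio(cfg: TurboConfig, active_cores: int,
                 temp_c: float, current_a: float) -> int:
    """Pick the turbo ratio unless a temperature or current limit is hit."""
    if temp_c >= cfg.temp_limit_c or current_a >= cfg.current_limit_a:
        return cfg.base_ratio
    # Workload: fewer active cores may allow a higher ratio.
    return cfg.turbo_ratios.get(active_cores, cfg.base_ratio)


# Hypothetical example configuration, not a real product specification.
EXAMPLE = TurboConfig(
    base_ratio=35,
    turbo_ratios={1: 39, 2: 39, 3: 38, 4: 37},
    temp_limit_c=100.0,
    current_limit_a=95.0,
)

if __name__ == "__main__":
    print(select_ratio(EXAMPLE, active_cores=1, temp_c=60.0, current_a=50.0))
    print(select_ratio(EXAMPLE, active_cores=4, temp_c=105.0, current_a=50.0))
```

Raising the values in the ratio table corresponds to "increasing the max turbo ratio", which is why overclocking on these platforms always comes down to reconfiguring turbo mode.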

Anyway, these are the four types of overclocking as provided by Intel.

  • Default turbo configuration: you have the processor overclock automatically according to the default Intel specifications.
  • Increase max turbo ratio: you increase the maximum turbo frequency, but let the CPU determine the required voltage.
  • Increase Voffset: you increase the voltage offset for your processor, allowing for possible higher ratio bins via automatic overclocking.
  • Manual override: f*ck the system! I do everything myself!

As you can see on the graphs above, each of the four possibilities for overclocking is still supported on Haswell. You will be able to use the default overclocking options (duh!) as well as the manual override mode (as discussed before). Additionally, you may also configure the turbo mode behaviour of your system by manually tuning the maximum turbo ratio and adjusting the voltage offset. In future overclocking articles we’ll go deeper into how exactly to fine-tune your Haswell system.


The Haswell – HWBOT Connection: Extreme Tuning Utility


Last but not least, Haswell has full support for the new Intel Extreme Tuning Utility (“XTU”), which features a smooth integration with your HWBOT website. I have been part of this project since the absolute beginning in January 2012 and I am very happy to see it finally come to completion. When Haswell launches, the new XTU software will allow you to:

  • Upload and download overclocking settings
  • Export and import the overclocking settings into your XTU
  • Compare benchmark scores, overclocks and configurations on-site
  • Link and feed information to other benchmark scores (e.g. SuperPI)
  • Participate in HWBOT competitions via XTU
  • Analyse your system configuration and get suitable OC suggestions

Although the technical session contained more information than just the one slide, this information did not get incorporated into the final PDF.

The Intel XTU integration will be available to any brand and motherboard that has XTU support implemented via the BIOS.


In Conclusion.


Without having shown a Haswell system overclocked (in public), Intel fully disclosed how to overclock a Haswell-based system at IDF 2013 Beijing. As usual, it’s quite difficult to predict exactly how far the new micro-architecture can be pushed, although there are some leaks on the internet already. But as Intel continues to work on the overclocking aspect of its products, we can rest assured that overclocking on Haswell seems to be working just fine. We still have to wait until early June to know every detail about Haswell in terms of performance and clock frequency ranges, but at least now we know the theory. And it looks exciting!

With the re-introduction of the data bus frequency control (Ring Bus) and BCLK gear ratios, Haswell provides us with a couple of extra knobs to play with compared to Ivy Bridge. As the performance should go up by 5-7%, based on performance leaks on the internet, it will be interesting to see if Haswell can break all the overclocking records or not. Additionally, I’m pleased to see the cooperation between Intel and HWBOT on integrating overclocking software into the website come to completion, and I hope it boosts the social aspect of overclocking through the sharing of overclocking profiles. In any case, I look forward to Computex 2013!

I would like to thank Mike Moen and Joachim Algstam from Intel for the interesting technical session. Also congratulations to Hicookie from Gigabyte for showing 7GHz in a live demo.


// Pieter out.




Belgium Massman says:

You can download the .pdf slides as shown at IDF here: https://intel.activeevents.com/bj13/scheduler/catalog.do (just search for "overclocking"). There's a bunch of other interesting material there as well.

All the information in this article should be public knowledge, no NDAs broken.

India sumonpathak says:

link has a small typo...

Germany der8auer says:

Thanks for all that information PJ!

Indonesia LSlowmotion says:

thanks a lot! :D

NoMS says:

If BCLK OC means that the times of fun overclockable cheap chips (Haswell based Celeron's/Pentium's) like the ones of "775 era" is back, the only thing I have to say about this platform is:




:D

Belgium leeghoofd says:

Why do we get tips to bench Ivy ? so confused by this...

United States Splave.ROM says:

NoM$_YesLinux said: If BCLK OC means that the times of fun overclockable cheap chips (Haswell based Celeron's/Pentium's) like the ones of "775 era" is back, the only thing I have to say about this platform is:




:D


there are probably steps like x79 100, 125, etc for the bclk and they might not be as flexible straying away from those steps. I assume they would lock that down on cheap chips..just conjecture though :)

NoMS says:

Splave said: there are probably steps like x79 100, 125, etc for the bclk and they might not be as flexible straying away from those steps. I assume they would lock that down on cheap chips..just conjecture though :)


Sure, but it's always better than actual 100+/-10% BCLK OC that we have with 1155 Sandy/Ivy and that would make the cheap chips oc able. But being somewhat realistic, I don't expect to see Intel allowing BCLK OC on these cheap chips too...:(

But well... The hope is the last thing to die! :D

Germany der8auer says:

I guess only the K-suffix CPUs will have the option to change the bclk.

TaPaKaH says:

overclockable or not, low-end CPUs are only good for hardware points

Belgium Massman says:

Overclocking has become just another product feature that Intel can charge for. Just like - hyperthreading - amount of cores - amount of cache - type of IGP - virtualisation - dram ratios And so on. There has been no confirmed (or leaked) product information regarding the correct configuration of the SKUs, just guesses, so we can only make assumptions here. But it seems fairly obvious that the non K-sku processors will not have the option to play with PEG:DMI (bclk gear ratio) settings.

United States Bobnova says:

The 3820 has gear ratios despite being a non-K, so some of 'em might. Almost certainly not the low end stuff though.

Belgium Massman says:

Bobnova said: The 3820 has gear ratios despite being a non-K, so some of 'em might. Almost certainly not the low end stuff though.


Hah, same response I had :D.

I was made aware of the fact that the Core i7 3820 was part of the most high-end desktop platform and that Haswell is a replacement for the mainstream platform. Between the lines, it probably meant that all non-K sku CPUs built for a high-end X-chipset will have additional overclocking options compared to those for a mainstream Z-chipset.

Greece crustytheclown says:

thanks for the info Peter

Germany websmile says:

Interesting info, thanks for sharing, PJ - maybe decision to skip Haswell was made bit too soon :D

FlanK3r says:

K-CPUs, better binned :)

Poland G.Foyle says:

IDF 2012 docs say that L3 cache is on ring frequency and power domain. IDF2013 China docs say L3 is on core domain. I think the former is correct?

BTW, what you call eDRAM is not on-die (it's a separate die on the same package), also the small e for embedded is not exactly correct, since it's a separate die :)

FlanK3r says:

but.. I thought eDRAM will be only mobile CPUs (GT3), or not?

Belgium Massman says:

xoqolatl said: IDF 2012 docs say that L3 cache is on ring frequency and power domain. IDF2013 China docs say L3 is on core domain. I think the former is correct?

BTW, what you call eDRAM is not on-die (it's a separate die on the same package), also the small e for embedded is not exactly correct, since it's a separate die :)


Thanks for the corrections. L3 must be on Ring frequency as it's shared between all cores.

There wasn't that much information on the eDRAM - even behind-the-scenes. All the info we got was "additional cache", "in-house design" and "super dooper fast".

United States Bobnova says:

SB/IB the L3 runs at core speed, doesn't it? It could be anywhere!

United States Splave.ROM says:

200 BCLK ftw

Belgium Massman says:

Shhht :p

Puerto Rico chispy says:

Very nice article , thanks for sharing Pieter.

Italy Gigioracing says:

8 gigahertz cpu !!!

Belgium Massman says:

gigioracing said: 8 gigahertz cpu !!!


No.

United States Splave.ROM says:

6.7ghz ftw! so about 7.1 - 7.2 ish ivy speed 32m/PF I hope :D

Hiwa says:

everyone will bench pedro benchmark again.

Belgium leeghoofd says:

and AM3 Hiwa, bandwith whores !!!

Croatia stasio says:

Splave said: 200 BCLK ftw





Btw,
CPU-Z 1.64 display CPU VRM voltage,instead of CPU Vcore.

Belgium Massman says:

Oh, only 188MHz ?

Croatia stasio says:

Massman said: Oh, only 188MHz ?

Published........atm.

United States xxbassplayerxx says:

For some reason BCLK close to 200 just seems right... 1366 days :)

Belgium Massman says:

Wonder who'll get the 2.5x gear ratio working :)

Denmark zzolio says:

I never got it to work on X79

Belgium Massman says:

That's because the CPUs were just not capable of running anything near that :). Maybe they are now ... not sure about this, actually.

Denmark zzolio says:

16*250=4000 should have been working

Belgium Massman says:

The CPUs were just not capable of doing much more than 170MHz BCLK. To fully use the 2.5x ratio, you'd need to use a BCLK of 68MHz with 2.5x gear ratio. Anything below 95MHz BCLK on SB-E was already a struggle.

United States sin0822 says:

yea but you thin you can run the PCI-E less than 90mhz?

Netherlands rsnubje says:

It's possible for Haswell to go below 90 bclk.

United States sin0822 says:

but can the PCi-E bus itself not the CPU, i am talking about the GPU go below? Maybe, idk. I have one GPu i used to do that 117mh BCLK, so i guess you can go the other direction, but i always felt it could go up more than down.

Germany Hyperhorn says:

Any new thoughts about "is L3 cache running at core or ring frequency" thing? What would you guys think if core AND ring frequency boost L3 cache performance at a 2/3 (core) and 1/3 (ring) relationship? Wouldn't it be a nice idea that L3-cache runs at core frequency and ringbus only boosts the interface between the cores and the L3-cache? Given L3-cache and ring share one clock domain in fact and run at the same frequency - for what reasons would the core frequency affect synthetic L3 performance values much more than the ring frequency or e. g. CPU NB clock frequency of AMD CPUs?

Belgium Massman says:

If I recall the IDF presentation correctly, the Intel engineers mentioned that "even though the Ring Bus frequency can be adjusted, they have not seen major performance gains in the labs". And then they asked the reviewers and power users to test it themselves :D

Romania Alex@ro says:

Aside from superpi and couple of 2d benchmarks+2001,gains are not so big as you'd expect to,pretty much below 5%....

Austria basco says:

can someone plz advise me how to play with rtl-settings on maximus? if i change one setting one step up or down i get 55 no matter what volts.... i get non consistent rtl=changes after reboots on auto and i think this is only setting effecting 3d11mark pysx score.(maybe interesting for massman in x79 pysx). thanks in advance

Poland Xtreme Addict says:

lower bclk for instance to 98,6 mhz and try to boot, you have to pass memory training first before going full mhz

Kazakhstan TerraRaptor says:

Bascom my way to adjust RTLs @M6E is as follows: 1. Set all RTLs & RTL-IOs to desired values iat the same time. (eg. 39/39/40/40 4/4/4/4 - do not change only one parameter). 2. Start adjusting RTL initial value in small steps from 63 to smth like 40 (63, 61, 59 etc) trying to post/train with a given RTL initial value. For my PSC kit @2600 I was able to adjust rtls from 42/42/47/45 4/4/9/7 to 41/41/42/42 4/4/4/4 with RTL initial value of 44. I beleive RTL initial value is a key for successful post.

Austria basco says:

thx very much xtreme addict.
its better then before now,but i am still not able to put values manuell.
best score i get is with 39\40\40\41 and this is only sometimes after boots if i am lucky
only rtl initial value i can do manuell.

does more ramvolt help you with lower rtl?-me is not getting lower values.

thx terraraptor-will try and report.

ex:my last 3 boots on rtl-auto:39\39\40\41 ; 39\39\40\40 : 39\40\41\40

my sys:max6gene 1002-17600cl7 pi-1:9 2424mhz cl8-11-8

Austria basco says:

thanks a lot Terraraptor & Xtremeaddict
with your help i can finally boot with manuell rtl.
2424mhz rtlinit-44 \ 39\39\40\40
