Nvidia’s Green Light Program Overclocking Limitations – Origins and Implications
A new controversy has appeared in the highly emotional world of performance graphics cards. EVGA has just announced that it will no longer support the EVBot overvoltage tool on its highest-end cards. We dug around and found the reason why: to comply with Nvidia's Green Light program. Here we investigate the origins of the program and some of its implications.
Back in June 2007, Nvidia launched its GPGPU adventure very modestly, in a briefing with maybe 10 journalists and 2-3 analysts in a building across from the now-iconic headquarters in Santa Clara. It was a hot summer day, but one of the questions on my mind was how the company was going to cope with the supply demands for its GeForce and Quadro cards if Tesla really picked up.
After all, the G80 die the company used was the largest die ever manufactured by TSMC at the time, built on the 90nm process. The rollout was slow, but the Tesla lineup started to receive design wins and is on its way to becoming a sizeable business. However, semiconductor manufacturing did not go as well as the company had hoped. TSMC had a nice run with the 90, 80, 65 and 55nm process nodes, but ran into numerous issues with the 40nm process and then canceled the 32nm process altogether. In May 2008, Nvidia repeated the event with the GT200-based Teslas, and the question of allocation remained. The current process is the 28nm node, and we've heard AMD, Nvidia, Qualcomm and Apple all discussing yield issues with it, which are responsible for the supply constraints on GPUs, SoCs, APUs and baseband chips. In short, everything 28nm is in short supply.
With Fermi (GTX 400 and 500 Series), we heard that Nvidia implemented a stricter die allocation program, with the sole goal of achieving as high a revenue per die as possible. With the Kepler GPU architecture and the 28nm allocation being the way it is, it was quite obvious that in terms of revenue per die, Tesla would take first place, Quadro second and GeForce last. Other semiconductor companies follow the same policy: Intel will sell every Xeon-certified die it can lay its hands on.
When Kepler launched, we heard partners mentioning the Green Light program and the alleged consequences that Nvidia's own partners could face. We spoke with Nvidia representatives and with members of no fewer than four companies that manufacture Nvidia-based graphics cards. We'll present both sides and leave you to draw your own conclusions.
In our discussion with Nvidia, which was very quick to react to our inquiry, the company stated that the Green Light program in no way affects the allocation of GPUs and boards for its ecosystem. We take that statement at face value and see no reason to doubt it. After all, both sides are trying to make as much money as they can.
However, in our discussions with the AIC (Add-in Card) vendors, we were told that not adhering to the Green Light program can indeed affect chip allocation, and that at least two board partners "felt the pinch" for not adhering to it. There are two sides to every story, and this one is no exception.
The AIC Side of Things
First of all, Nvidia does indeed have its "darlings", preferred board partners that get first pick of the allocation. You can easily guess who these partners are, but we've known for years that working with Nvidia is tougher than, for instance, working with AMD or even Intel. In fact, there are Nvidia executives who don't have a clue about a project, walk in, do the damage and ultimately allow companies like AMD, Intel or Freescale to sweep up the deals. Then again, the same is true of most companies.
Of the four AICs we spoke with, two said they experienced allocation issues with the GTX 600 Series and that they believe they are being punished for offering GTX 500 and 600 boards with unsanctioned voltage modifications, whether through BIOS, software utilities or hardware mods. Some AICs even went public with the issues they experienced, as in the recent MSI and TweakTown vs. Nvidia escapade.
Our sources told us that they are not the only ones who experienced allocation issues. As one of them put it:
"NV is constantly thinking about Quadros and Teslas. They no longer want to sell or support high-end GeForces with modifications. In the past, we could do whatever we want with the GPU and they would honor the RMA for as long as you return the board with the original heatsink. Thus all of our modified boards with 3rd party air or liquid cooling had their original heatsinks stored in boxes and in case of RMA we would just unmod the GeForce card, put the heatsink back on and ship it back."
This policy had been known for years, and we visited several storage facilities that held hundreds of boxes containing tens of thousands of original reference heatsinks. With the Fermi refresh, partners were told that the company would only accept RMAs on custom PCBs and would not allow overvoltage on boards built on the regular PCBs. Furthermore, one source said the following:
"When we started manufacturing AMD boards, Nvidia told us that they will cut our allocation and they did. But they were too dependent on us and came back humbled offering us level playing field. However, good part of our GF110 chips was directed to EVGA, especially in Q4 2011 and Q1 2012. Fourth quarter really hurt us. Nvidia doesn’t think AMD is their competitor anymore and with Teslas they're no longer dependent on us."
All of this, combined with the fact that special boards like the EVGA Classified, ASUS DC2-4GD5 and MSI Lightning all had to have their voltage mods removed, only goes to show that there is a limited number of dies available and that Nvidia probably does not want to share more than it has to.
Unfortunately for the AICs and the gaming and hardware enthusiast ecosystem, even selling a GTX 680 at $800 per board (it usually retails for $499) does not change the picture: Nvidia knows it needs each and every Fermi and Kepler die for its Tesla x20xx and K10 boards, which are selling like hotcakes in the professional segment. We recently learned of Amazon acquiring over 10,000 Tesla K10 boards at a price of just $1500-1800 each, with a mandatory $500/board annual subscription for overnight replacement. Thus, GK104 dies are in very hot demand, and with 28nm yields being the way they are (read: poor), Nvidia will not do anything for enthusiasts that could damage its premium sales.
After all, it is the success of the Tesla and Quadro lineups that has allowed GeForce boards to be priced so affordably. GeForce cards of today feature up to seven billion transistors on a single board (GTX 690) for less money than you'd need to spend on an Intel Core i7-3960X, for example, which offers significantly less compute performance. For a price-parity comparison, pairing the $999 i7-3960X with a $100 graphics card will result in a significantly weaker experience than pairing a $999 GTX 690 board with a $100 CPU.
This is the answer we got from Nvidia, which we publish in full:
"Not sure why you think Green Light is a bad thing for consumers and/or AICs. It’s not. It’s actually an awesome program designed to ensure a set level of quality criteria are met for all GTX products in the field.
Green Light was created to help ensure that all of the GTX boards in the market all have great acoustics, temperatures, and mechanicals. This helps to ensure our GTX customers get the highest quality product that runs quiet, cool, and fits in their PC. GTX is a measurable brand, and Green Light is a promise to ensure that the brand remains as strong as possible by making sure the products brought to market meet our highest quality requirements.
Reducing RMAs has never been a focus of Green Light.
We support overvoltaging up to a limit on our products, but have a maximum reliability spec that is intended to protect the life of the product. We don’t want to see customers disappointed when their card dies in a year or two because the voltage was raised too high.
Regarding overvoltaging above our max spec, we offer AICs two choices:
· Ensure the GPU stays within our operating specs and have a full warranty from NVIDIA.
· Allow the GPU to be manually operated outside specs in which case NVIDIA provides no warranty.
We prefer AICs ensure the GPU stays within spec and encourage this through warranty support, but it’s ultimately up to the AIC what they want to do. Their choice does not affect allocation. And this has no bearing on the end user warranty provided by the AIC. It is simply a warranty between NVIDIA and the AIC.
With Green Light, we don’t really go out of the way to look for ways that AICs enable manual OV. As I stated, this isn’t the core purpose of the program. Yes, you’ve seen some cases of boards getting out into the market with OV features only to have them disabled later. This is due to the fact that AICs decided later that they would prefer to have a warranty. This is simply a choice the AICs each need to make for themselves. How, or when they make this decision, is entirely up to them.
With regards to your MSI comment below, we gave MSI the same choice I referenced above — change their SW to disable OV above our reliability limit or not obtain a warranty. They simply chose to change their software in lieu of the warranty. Their choice. It is not ours to make, and we don’t influence them one way or the other.
Remember, we introduced a whole new set of API functionality for AICs to integrate, that they have never seen or worked with before. We wanted to ensure that everyone could have a great app, and EVGA was simply one of the first folks trying to line up their software introduction to our accelerated launch window.
In short, Green Light is an especially important program for a major, new product introduction like Kepler, where our AICs don’t have a lot of experience building and working with our new technologies, but also extends the flexibility to AICs who provide a design that can operate outside of the reliability limits of the board. And, if you look at the products in the market today, there is obviously evidence of differentiation. You only need to look at the large assortment of high quality Kepler boards available today, including standard and overclocked editions."
If we look at the market today, it is noticeable that the choices are not as vast as they used to be. Still, Nvidia is the company that makes the silicon, and its reference designs now account for the majority of GeForce-based offerings. This is something it implemented a long time ago, when it hired Flextronics to manufacture all GeForce 8800 GTX (we come full circle to the G80) and later GeForce 8800 Ultra boards, with some manufacturers offering custom 9800 GTX, GTX 275/280/285 and GTX 480/580 boards.
But with everyone being pressured on margins, the fact of the matter is that Nvidia has to protect itself and offer more of an "à la carte" pricing model, akin to U.S. airlines (pay for this, pay for that). The time of romance has ended and the room for growth now lies elsewhere.
Naturally, if TSMC "gets their s*** together" and delivers on the 28nm node in 2013 and, more importantly, delivers a solid Gate-Last 20nm process at the tail end of 2013, Nvidia should find it easier to be more flexible towards its partners. Until then, Green Light it is. Now, where's AMD with parts that will blow Kepler and Kepler Refresh out of the water?