Nvidia Fermi – Arriving in Q1 2010
Following Nvidia CEO Jen-Hsun Huang’s keynote speech, details about Nvidia’s next-gen architecture, Fermi, are finally available, putting to rest months of speculation.
We reported most of the key specifications previously, but now most of the gaps have been filled in.
The first thing worth pointing out is that Nvidia sees clear potential in High Performance Computing and GPU stream computing, perhaps even more than in gaming, and believes there is a multi-billion dollar opportunity in the HPC industry, which is currently dominated by CPUs that are far more expensive and offer less parallel throughput. As a result, Fermi is the closest a GPU has ever come to resembling a CPU, with greater programmability, a multi-level cache hierarchy and significantly improved double precision performance. Accordingly, today’s event and whitepaper concentrate on stream computing, with little mention of gaming.
That said, GF100 is still a GPU, and a monster at that. Packing 3 billion transistors on a 40nm process, GF100 sports 512 shader cores (or CUDA cores) spread over 16 shader clusters (or Streaming Multiprocessors, as Nvidia calls them), which works out to 32 cores per SM. Each SM contains 64KB of configurable shared memory / L1 cache, and a unified 768KB L2 cache serves all 512 cores. There are 48 ROPs and a 384-bit memory interface mated to GDDR5 RAM. On the gaming side of things, DirectX 11 is of course supported, though tessellation appears to be handled in software through the CUDA cores. Clock targets are expected to be around 650 / 1700 / 4800 (core/shader/memory). It remains to be seen how close to these targets Nvidia can get.
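To put those numbers in perspective, here is a rough back-of-the-envelope sketch of what they would imply if the clock targets hold. It assumes each CUDA core retires one fused multiply-add (counted as 2 FLOPs) per shader clock and that the 4800 figure is the effective GDDR5 data rate in MT/s; both are our assumptions rather than anything Nvidia has confirmed, and the function names are purely illustrative.

```python
# Figures quoted in the article; the clocks are rumored targets, not final specs.
CUDA_CORES = 512
SM_COUNT = 16
SHADER_CLOCK_MHZ = 1700
MEMORY_BUS_BITS = 384
MEMORY_RATE_MTPS = 4800  # assumption: effective GDDR5 data rate in MT/s


def cores_per_sm(cores, sms):
    """CUDA cores in each shader cluster (SM)."""
    return cores // sms


def peak_sp_gflops(cores, shader_mhz, flops_per_core_per_clock=2):
    """Theoretical single precision peak.

    Assumption: one fused multiply-add (2 FLOPs) per core per shader clock.
    """
    return cores * flops_per_core_per_clock * shader_mhz / 1000.0


def memory_bandwidth_gbs(bus_bits, rate_mtps):
    """Theoretical memory bandwidth in GB/s: bytes per transfer times transfer rate."""
    return bus_bits / 8 * rate_mtps / 1000.0


if __name__ == "__main__":
    print(cores_per_sm(CUDA_CORES, SM_COUNT))                       # 32
    print(peak_sp_gflops(CUDA_CORES, SHADER_CLOCK_MHZ))             # 1740.8
    print(memory_bandwidth_gbs(MEMORY_BUS_BITS, MEMORY_RATE_MTPS))  # 230.4
```

These are theoretical peaks only. Real-world throughput depends on final clocks and on how well a workload keeps the cores and memory bus busy.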
Of course, at 3 billion transistors, GF100 will be massive and hot. Assuming a transistor density similar to Cypress on the same process (RV770 had a higher density than GT200), the die approaches 500 mm². In DirectX/OpenGL gaming applications, we expect GF100 to end up comfortably faster than HD 5870, something Nvidia confirms (though they refuse to show benchmarks at this point). However, it is unknown how GF100 will perform against Hemlock.
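The die size estimate can be reproduced with a simple density scaling. The sketch below assumes Cypress at roughly 2.15 billion transistors on a 334 mm² die (AMD's published figures, not taken from Nvidia's material) and assumes GF100 matches that density exactly, which is the optimistic case given the RV770 versus GT200 precedent.

```python
# Reference figures for Cypress (assumption: AMD's published numbers).
CYPRESS_TRANSISTORS_BN = 2.15
CYPRESS_DIE_MM2 = 334.0

# Transistor count quoted for GF100.
GF100_TRANSISTORS_BN = 3.0


def estimate_die_area_mm2(transistors_bn, ref_transistors_bn, ref_area_mm2):
    """Scale die area linearly with transistor count at the reference density."""
    density_bn_per_mm2 = ref_transistors_bn / ref_area_mm2
    return transistors_bn / density_bn_per_mm2


if __name__ == "__main__":
    area = estimate_die_area_mm2(
        GF100_TRANSISTORS_BN, CYPRESS_TRANSISTORS_BN, CYPRESS_DIE_MM2
    )
    print(round(area))  # 466
```

At equal density this lands at roughly 466 mm². If Nvidia's density trails AMD's as it did last generation, the figure moves closer to, or beyond, 500 mm².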
Products based on the Fermi architecture will only reach retail stores in Q1 2010, which is a rather long time away. This lengthy delay, together with yields and costs, could be a major problem for Nvidia. While there is no doubt Fermi/GF100 is shaping up to be a strong architecture/GPU, it will be costlier to produce than Cypress. We have already heard horror stories about 40nm yields, which, if true, are something Nvidia will surely fix before the product hits retail. However, this takes time, and Nvidia’s next-gen is thus 3-6 months away. By then, AMD will have an entire range of next-gen products, most of them matured, and would perhaps be well on its way to die shrinks, especially for Cypress, which might be half a year old by that time. There is no information about pricing either, although we can expect the monster that is GF100 to end up quite expensive. Nothing is known about more economical versions of Fermi at this point, which might mean the mainstream Juniper will go unchallenged for many months.
If you are in the market for a GPU today – we don’t see any point in holding out for Nvidia’s GF100. However, if you are satisfied with your current GPU or looking forward to much improved stream computing – Fermi/GF100 might just be what you are after.
In the meantime, we can expect price cuts and entry-level 40nm products from Nvidia. With all that has transpired today (it is HD 5850 release day as well), there is one conclusion: the ATI Radeon HD 5850 does seem like the GPU to get. If you are on a tighter budget, Juniper might have something for you soon.