# NVIDIA GeForce 7800 GTX Review

**8 Vertex Shader Pipelines**

Zooming in on the geometry pipelines, G70 has two more vertex shaders than the NV40/45, for a total of 8 vertex shaders handling geometry processing. This is meant to speed up geometry processing in polygon-intensive scenes. NVIDIA has re-architected the vertex shader unit so that each one can process more ops per clock than on the NV40/45. To be precise, that is 8 FP MADD (multiply-add) operations per clock, each in a single cycle. According to NVIDIA, the single-cycle MADDs can improve scalar math performance by up to 30%. NVIDIA also mentioned that texture fetch efficiency is improved significantly, especially for large textures, thanks to new hardware algorithms and better caching that speed up filtering and blending operations.
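
As a rough illustration of what the two extra units mean on paper, the sketch below compares vertex unit counts only. The NV40/45 count of 6 is derived from the "two more" statement above, and the function name is hypothetical. It assumes equal clocks and ignores the per-unit improvements NVIDIA describes, so it is a lower-bound style estimate rather than a performance prediction.

```python
# Vertex shader unit counts taken from the text:
# G70 has 8 units, which is two more than NV40/45.
G70_VERTEX_UNITS = 8
NV40_VERTEX_UNITS = G70_VERTEX_UNITS - 2


def vertex_unit_scaling(new_units: int, old_units: int) -> float:
    """Ratio of unit counts (assumption: equal clocks, identical units)."""
    return new_units / old_units


scaling = vertex_unit_scaling(G70_VERTEX_UNITS, NV40_VERTEX_UNITS)
print(f"NV40/45: {NV40_VERTEX_UNITS} units, G70: {G70_VERTEX_UNITS} units")
print(f"Unit-count scaling at equal clocks: {scaling:.2f}x")
```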

**24 Pixel Shader Pipelines**

[Diagram panels: NV40/45 and G70]

G70 is based on a SIMD (single-instruction, multiple-data) architecture that delivers massive parallelism by executing shaders in parallel across all 24 of its pixel pipelines. On the G70, the 24 pixel pipelines are divided into 6 quads, each with 4 pixel shader units. Pipelines clearly play an important role in the performance of today’s games, so NVIDIA and ATI will keep adding more pipes to their next-generation GPUs. ATI has evidently taken the road to unified shaders, as seen in the Xenon GPU with its 48 shader pipes: there are no discrete pixel or vertex shader units, which are instead combined into a set of general execution units that serve either pixel shader or vertex shader instructions. The advantage of a unified shader approach is still unproven, but in general it should benefit geometry processing more, since vertex shader units are always fewer in number than pixel shader units: 24 pixel shader units versus 8 vertex shader units in the case of G70.
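
To make the unified-versus-dedicated argument concrete, here is a toy model, not a description of either vendor's hardware. It assumes every unit has the same throughput, that work divides perfectly, and that a dedicated design finishes a frame only when its slower stage finishes. The function names and workload numbers are hypothetical. Only the 6 x 4 quad layout and the 24:8 split come from the text.

```python
QUADS = 6
PIXEL_UNITS_PER_QUAD = 4
PIXEL_UNITS = QUADS * PIXEL_UNITS_PER_QUAD  # 24 pixel shader units on G70
VERTEX_UNITS = 8                            # 8 vertex shader units on G70


def frame_time_dedicated(vertex_work: float, pixel_work: float) -> float:
    """Fixed 8/24 split: the busier stage determines the frame time."""
    return max(vertex_work / VERTEX_UNITS, pixel_work / PIXEL_UNITS)


def frame_time_unified(vertex_work: float, pixel_work: float) -> float:
    """Hypothetical pool of 32 identical units shared by both kinds of work."""
    return (vertex_work + pixel_work) / (VERTEX_UNITS + PIXEL_UNITS)


# A workload matching the 8:24 split gains nothing from unification,
# while a geometry-heavy workload leaves dedicated pixel units idle.
for vertex_work, pixel_work in [(8, 24), (16, 16)]:
    dedicated = frame_time_dedicated(vertex_work, pixel_work)
    unified = frame_time_unified(vertex_work, pixel_work)
    print(f"vertex={vertex_work} pixel={pixel_work}: "
          f"dedicated={dedicated:.2f} unified={unified:.2f}")
```

In this model the unified pool only helps when the workload mix departs from the fixed split, which is the geometry-heavy case the paragraph above has in mind.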

NVIDIA claims to have modeled 1300 common shader algorithms used in games to determine the best usage model for the shader units on G70, so as to spot and eliminate bottlenecks. As a result, G70 is equipped with a new pixel shader unit design that delivers twice as many FP ops and much more math than the NV40/45. On the NV40, the pixel shader pipeline consists of two shader units with a texture unit in between, and each shader unit can process 4 ops per pixel, which translates to up to 8 ops per pixel. On the G70, by comparison, each shader unit can process 10 ops, which translates to 20 ops per pixel. As such, each pixel pipeline in the 7800 GTX is optimized to deliver 50% more efficiency clock for clock against the NV40/45. G70 can perform two four-component MADDs per fragment per clock, compared to NV40's one four-component MADD per fragment per clock.
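
The per-pixel figures above are simple products, and the sketch below just reproduces that arithmetic. The class and method names are hypothetical. It assumes G70 keeps two shader units per pipeline (implied by 10 ops becoming 20), and it takes NV40's 16 pixel pipelines from the ROP discussion in this review. The per-clock MADD totals are a derived illustration, not a figure quoted by NVIDIA here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PixelPipeline:
    name: str
    pipelines: int        # number of pixel pipelines on the chip
    shader_units: int     # shader units per pipeline (assumed 2 for G70)
    ops_per_unit: int     # ops each shader unit can process per pixel
    madds_per_clock: int  # four-component MADDs per fragment per clock

    def ops_per_pixel(self) -> int:
        return self.shader_units * self.ops_per_unit

    def madd_components_per_clock(self) -> int:
        # Each MADD operates on four components.
        return self.pipelines * self.madds_per_clock * 4


NV40 = PixelPipeline("NV40", pipelines=16, shader_units=2,
                     ops_per_unit=4, madds_per_clock=1)
G70 = PixelPipeline("G70", pipelines=24, shader_units=2,
                    ops_per_unit=10, madds_per_clock=2)

for chip in (NV40, G70):
    print(f"{chip.name}: {chip.ops_per_pixel()} ops/pixel, "
          f"{chip.madd_components_per_clock()} MADD components/clock")
```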

**16 ROP Pixel Pipelines**

[Diagram panels: NV40/45 and G70]

The task of the ROPs (Raster Operators) is to convert fragments into pixels, apply multisample AA if need be, perform color and Z compression, and send the completed pixel data to the frame buffer. In NV40, there are 16 ROPs to 16 pixel shaders; each ROP contains a Z ROP and a C ROP and is capable of writing one color + Z pixel or two Z/stencil values per clock cycle. In G70, however, there are 16 ROPs to 24 pixel shaders, so perhaps NVIDIA realized that fewer ROPs do not cause a bottleneck, as in the case of the 6600 GT, which has 4 ROPs to 8 pixel shaders. Double-speed Z is especially useful in shadow calculations, which is partly why NV40 excels in Doom 3.
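
The ROP-to-shader ratios and the double-speed Z rate can be tabulated with a few lines of arithmetic. This is a minimal sketch using only the counts quoted above. The dictionary and function names are hypothetical, and the per-clock write rates are computed for NV40 only, since that is the only chip for which the text gives per-ROP rates.

```python
from fractions import Fraction

# (ROPs, pixel shaders) as quoted in the text.
CONFIGS = {
    "NV40": (16, 16),
    "G70": (16, 24),
    "6600GT": (4, 8),
}


def shaders_per_rop(rops: int, pixel_shaders: int) -> Fraction:
    """How many pixel shaders feed each ROP."""
    return Fraction(pixel_shaders, rops)


def nv40_writes_per_clock(rops: int, z_only: bool) -> int:
    """NV40 ROP rate: 1 color + Z pixel, or 2 Z/stencil values, per clock."""
    per_rop = 2 if z_only else 1
    return rops * per_rop


for name, (rops, shaders) in CONFIGS.items():
    print(f"{name}: {shaders_per_rop(rops, shaders)} pixel shaders per ROP")

nv40_rops = CONFIGS["NV40"][0]
print("NV40 color + Z writes/clock:", nv40_writes_per_clock(nv40_rops, False))
print("NV40 Z/stencil writes/clock:", nv40_writes_per_clock(nv40_rops, True))
```

G70's ratio of 3/2 falls between NV40's 1 and the 6600 GT's 2, which is consistent with the point that the smaller chip already ran with fewer ROPs than pixel shaders.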