NVIDIA GeForce 8800 GT (G92)
Part 2: Features, Synthetic Tests
Direct3D 9: Pixel Shaders Tests
The first group of pixel shaders reviewed here is too simple
for modern GPUs. It includes pixel programs of relatively low
complexity, written for versions 1.1, 1.4, and 2.0.

We can see that the tests are too easy for modern architectures and
fail to reveal their true capacity. Performance in the simple tests is
limited by texture lookups and fill rate, as the low results of the
RADEON HD 2900 XT show. Results get more interesting in the more complex
PS 2.0 tests: the GeForce 8800 GT always outperforms the GTS and is
only slightly slower than the top GTX card, in full agreement
with the theory.
The GeForce 8600 GTS is nowhere near the 8800 GT: the previous
mid-range solution is outperformed more than twofold. Its performance
is limited primarily by fill rate and texture lookups.
Let's have a look at results in more complex pixel programs of intermediate
versions:

The water test depends on the texel rate: it uses dependent texture
lookups of high nesting depth, so the RADEON lags far behind the NVIDIA
solutions. The GeForce 8600 GTS is again noticeably slower than the
GeForce 8800 GT. The AMD card gains a lot of ground in the second, more
compute-intensive test, a task that suits its architecture with its
larger number of unified processors. The differences between the
GeForce 8800 GT and the GTS/GTX cards stem from the differing
performance of their shader units and TMUs. These results agree well
with the theory.
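To illustrate why dependent texture lookups put pressure on the texture units, here is a minimal Python sketch. It is not the water shader from the test: the tiny offset maps and the names `sample_nearest` and `dependent_chain` are hypothetical and exist only for illustration. The point is that the coordinates of each fetch come from the result of the previous one, so the fetches in a chain cannot be issued in parallel.

```python
# Hypothetical illustration of a dependent texture read chain.
# Textures are tiny 2D grids of (du, dv) offsets; a real shader samples GPU textures.

def sample_nearest(texture, u, v):
    """Nearest-neighbour fetch with wrap-around addressing; u, v in [0, 1)."""
    height = len(texture)
    width = len(texture[0])
    x = int(u * width) % width
    y = int(v * height) % height
    return texture[y][x]

def dependent_chain(textures, u, v):
    """Run one fetch per texture; each result perturbs the next coordinates."""
    fetches = 0
    for texture in textures:
        du, dv = sample_nearest(texture, u, v)
        fetches += 1
        # The next lookup cannot start until this result is known.
        u = (u + du) % 1.0
        v = (v + dv) % 1.0
    return (u, v), fetches

# Two 2x2 offset maps (assumed data, chosen only to make the chain visible).
offsets_a = [[(0.5, 0.0), (0.0, 0.5)],
             [(0.25, 0.25), (0.0, 0.0)]]
offsets_b = [[(0.0, 0.0), (0.5, 0.5)],
             [(0.0, 0.25), (0.25, 0.0)]]

# Nesting depth of three: a -> b -> a.
final_uv, fetch_count = dependent_chain([offsets_a, offsets_b, offsets_a], 0.0, 0.0)
```

The deeper the nesting, the longer the serial chain of fetches per pixel, which is why such a test tracks texture unit throughput rather than ALU power.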
Direct3D 9: New Pixel Shaders Tests
These tests of DirectX 9 pixel shaders are even more complex. They are divided into two categories, and we'll start with the easier SM 2.0 shaders:
- Parallax Mapping - a texturing method used in many games, which is described in detail in our article Modern 3D Graphics Terms
- Frozen Glass - a complex procedural texture that visualizes frozen glass with adjustable parameters
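As a rough illustration of the Parallax Mapping technique listed above, the following Python sketch shades a single pixel. It is a simplified, offset-limited variant and not the shader used in the test; `fetch`, `parallax_shade`, the `scale` parameter, and the 2x2 maps are all hypothetical. It shows the extra height-map read that precedes the colour read.

```python
# Hypothetical sketch of basic parallax mapping for a single pixel.

def fetch(texture, u, v):
    """Nearest-neighbour fetch with clamped addressing; u, v in [0, 1]."""
    height = len(texture)
    width = len(texture[0])
    x = min(int(u * width), width - 1)
    y = min(int(v * height), height - 1)
    return texture[y][x]

def parallax_shade(height_map, color_map, u, v, view_x, view_y, scale=0.5):
    """Return the colour after shifting (u, v) by the sampled height.

    (view_x, view_y) is the tangent-space view direction projected onto
    the surface; scale is an assumed artistic parameter.
    """
    # Extra lookup: the height map is read before the colour map.
    h = fetch(height_map, u, v)
    # Shift the coordinates along the view direction, clamped to [0, 1].
    shifted_u = min(max(u + h * scale * view_x, 0.0), 1.0)
    shifted_v = min(max(v + h * scale * view_y, 0.0), 1.0)
    # Dependent lookup: the colour fetch uses the shifted coordinates.
    return fetch(color_map, shifted_u, shifted_v)

# Assumed 2x2 data: one raised texel in the top-right corner.
height_map = [[0.0, 1.0],
              [0.0, 0.0]]
color_map = [["red", "green"],
             ["blue", "white"]]

raised = parallax_shade(height_map, color_map, 0.75, 0.25, 0.0, 1.0)
level = parallax_shade(height_map, color_map, 0.25, 0.25, 0.0, 1.0)
```

Where the height is zero the colour lookup is unchanged, and where the surface is raised the lookup lands on a different texel. In both cases the pixel costs two texture fetches instead of one.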
There are two modifications of these shaders: arithmetic-intensive and
texture-sampling-intensive. Let's analyze the arithmetic-intensive
modifications first, as they are more promising from the point of view
of future applications:

The situation with the NVIDIA cards in the Frozen Glass test is similar
to that in the previous group of tests. The GeForce 8600 GTS is still
outperformed by the 8800 GT more than twofold, and the latter keeps very
close to the 8800 GTX. NVIDIA cards based on the G80 and G92 outperform
the HD 2900 XT, which suggests that performance in this test is
limited by the texel rate.
Although the HD 2900 XT leads in the Parallax Mapping test (the second test),
the GeForce 8800 GT is only a little slower and even outperforms the
GeForce 8800 GTX! Apparently, that's the effect of the improved TMUs,
as parallax mapping requires an additional texture lookup. Let's analyze
the results obtained in the texture-sampling-intensive tests, where the
GeForce 8800 GT may perform even better:

The situation changes quite radically. Performance is limited by
the speed of the texture units more than ever, so the GeForce 8800 GT
is faster than the GeForce 8800 GTX by almost one third! The RADEON
HD 2900 XT is even outperformed by the GeForce 8800 cards in the
Parallax Mapping test, where AMD cards have always been very strong.
Bear in mind that the situation in real applications will be different,
because trilinear and/or anisotropic filtering is almost always enabled
on such powerful graphics cards. In that case the GeForce 8800 GT will
most likely be slower than the GTX card.
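The filtering caveat can be sketched with a deliberately simplified peak texel-rate model. Everything in it is an assumption rather than data from our tests: the unit counts and core clocks are the commonly cited specifications of the two cards, the names `TextureConfig` and `texel_rate_mtexels` are hypothetical, and trilinear filtering is modelled as two bilinear filter operations per sample. Real results also depend on caches, memory bandwidth, and shader load.

```python
# Simplified peak texel-rate model.
# Unit counts and clocks are assumed, commonly cited specs, not measured values.

from dataclasses import dataclass

@dataclass(frozen=True)
class TextureConfig:
    name: str
    address_units: int   # texture address units
    filter_units: int    # bilinear filter units
    core_mhz: int        # core clock in MHz

def texel_rate_mtexels(cfg, filter_ops_per_sample=1):
    """Peak filtered samples per second, in millions.

    filter_ops_per_sample: 1 for bilinear, 2 for trilinear (assumed model).
    """
    per_clock = min(cfg.address_units, cfg.filter_units // filter_ops_per_sample)
    return per_clock * cfg.core_mhz

GT = TextureConfig("GeForce 8800 GT", 56, 56, 600)
GTX = TextureConfig("GeForce 8800 GTX", 32, 64, 575)

bilinear = {c.name: texel_rate_mtexels(c, 1) for c in (GT, GTX)}
trilinear = {c.name: texel_rate_mtexels(c, 2) for c in (GT, GTX)}
```

Under these assumptions the GT has the higher bilinear peak, while the GTX comes out ahead once each sample costs two filter operations. The model gives upper bounds only, so the measured gap in the chart is smaller than the bilinear peaks alone would suggest.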
As usual, arithmetic-intensive shaders run faster on all graphics cards. Texture-sampling-intensive shaders are a poor fit for modern GPU architectures: new products from AMD and NVIDIA favor arithmetic operations over texturing.
Let's have a look at results of another two pixel shader tests - SM 3.0. They are the most complex of all our tests for Direct3D 9 pixel shaders. The tests load ALUs and texture units heavily. Both shader programs are complex, long, and include a lot of branches:
- Steep Parallax Mapping is a much heavier modification of parallax mapping, which is also described in the article Modern 3D Graphics Terms
- Fur - a procedural shader that visualizes fur
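To show why a shader such as Steep Parallax Mapping is both long and branch-heavy, here is a minimal Python sketch of the layer search for one pixel. It works on a 1D depth profile for brevity and is not the shader from the test; `fetch_depth`, `steep_parallax`, `num_layers`, and the sample data are hypothetical.

```python
# Hypothetical sketch of a steep parallax mapping search for one pixel.

def fetch_depth(depth_map, u):
    """Nearest-neighbour fetch from a 1D depth profile with clamped addressing."""
    index = min(max(int(u * len(depth_map)), 0), len(depth_map) - 1)
    return depth_map[index]

def steep_parallax(depth_map, u, step_u, num_layers=8):
    """March along the view ray until it drops below the surface.

    Returns the final coordinate and the number of texture lookups performed.
    """
    layer_depth = 1.0 / num_layers
    ray_depth = 0.0
    lookups = 0
    for _ in range(num_layers):
        surface_depth = fetch_depth(depth_map, u)
        lookups += 1
        # Data-dependent branch: neighbouring pixels may exit at different steps.
        if ray_depth >= surface_depth:
            break
        u += step_u
        ray_depth += layer_depth
    return u, lookups

# Assumed data: depth below the surface plane, 0.0 = top, 1.0 = deepest.
depth_profile = [0.0, 0.25, 0.5, 0.75]

hit_early = steep_parallax(depth_profile, 0.0, 0.125)   # surface at the top
hit_late = steep_parallax(depth_profile, 0.75, 0.0)     # deep texel, straight-down ray
```

One pixel finishes after a single lookup and another needs seven, so the cost per pixel varies with the data. That combination of a loop, a texture fetch per iteration, and an early exit loads ALUs, texture units, and branching hardware at once.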

The load in these two tests is rather heavy even for
such powerful GPUs as the R600 and G80, and the G80 outperforms the G84
more than twofold. Although the R600 apparently executes complex
Pixel Shaders 3.0 with a lot of branches more efficiently than the
G80, its advantage over the new G92 almost disappears in our synthetic
tests. Interestingly, the GeForce 8800 GT again performs noticeably
better than the GeForce 8800 GTX in both tests. This speedup
relative to the G80 can be explained only by faster bilinear texture
lookups, because the new GPU has no other theoretical advantage of
20-40% over the G80.