Intel Pentium 4 and i850 chipset reviewed
Sections
The Pentium 4 Platform
Inside the NetBurst™ Micro-Architecture
Architectural Features of Intel Pentium 4 Processor
Intel 850 Chipset & D850GB Motherboard
Performance
Our Verdict

Inside the NetBurst™ Micro-Architecture

Intel promotes its NetBurst Micro-architecture as paving the way for an advanced class of next-generation computers. In many ways this is true, although in its current form the claim can be challenged, as the clock speed at which the Pentium 4 has launched does little to show off its superior architecture. Two questions then spring to mind: first, whether it is worth jumping on the bandwagon now, and second, whether Intel's NetBurst Micro-architecture will indeed stand the test of time when higher-frequency processors are demanded and performance is the name of the game. We need to clarify what the NetBurst Micro-architecture brings to us as new technology. As is usual in our industry, buzzwords surround every new launch manufacturers make, and NetBurst is one of them. The NetBurst Micro-architecture consists of eight features:
Hyper-pipelined technology, Rapid Execution Engine, Execution Trace Cache, Advanced transfer cache, Advanced dynamic execution, Enhanced floating point/multimedia, Streaming SIMD extensions 2 and a new 400MHz system bus.
These features are described in detail in the chapter that follows. At this stage, however, we wish to discuss a topic close to all our hearts: performance. Do we truly know which factors determine true processor performance?

P4 Breakdown

What Factors Determine True Processor Performance?
In a day and age when performance is demanded, and manufacturers like Intel and AMD provide us with ever higher clock frequencies, it is easy to be thrown into confusion when we discover that our latest "highest clocked" processor is not as quick as we anticipated. We have all been there at one stage. This is the outcome of a distinct lack of any explanation or standard that measures performance against given criteria. Intel, however, has a definition, which measures performance by the time it takes to execute a given application. It states that true performance is a combination of both clock frequency (MHz) and IPC (instructions per cycle): Performance = MHz x IPC. This highlights that performance can be improved by increasing frequency, IPC or both. Frequency is a function of both process technology and micro-architecture; at a given clock frequency, IPC is a function of the processor micro-architecture and the specific task being executed.

In addition to the two methods described above, it is also possible to increase performance by reducing the number of instructions it takes to execute the task being measured. Single Instruction Multiple Data (SIMD) is one such technique, first introduced by Intel in 1996 as 64-bit integer SIMD on the Pentium processor with MMX technology, and subsequently extended to 128-bit single-precision floating point SIMD (SSE) on the Pentium III processor.

Applications can be broadly divided into two categories: integer/basic office productivity applications and floating point/multimedia applications. The IPC achievable by these applications varies greatly, depending on the number of branches the application code typically takes and the predictability of those branches. The more hard-to-predict branches are taken, the higher the chance of mis-predicting and performing non-productive work.
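The Performance = MHz x IPC relationship described above can be illustrated with a minimal Python sketch. The function names and every figure below are hypothetical, chosen only to make the arithmetic easy to follow; they are not measurements of any real processor.

```python
def relative_performance(clock_mhz, ipc):
    """Intel's stated relationship: Performance = MHz x IPC."""
    return clock_mhz * ipc


def execution_time_seconds(instruction_count, clock_mhz, ipc):
    """Time to finish a task: instructions / (instructions executed per second)."""
    instructions_per_second = clock_mhz * 1_000_000 * ipc
    return instruction_count / instructions_per_second


# Hypothetical task of one billion instructions on a 1000MHz part with an IPC of 1.0.
baseline = execution_time_seconds(1_000_000_000, clock_mhz=1000, ipc=1.0)

# Route 1: raise the clock frequency, same IPC and same instruction count.
faster_clock = execution_time_seconds(1_000_000_000, clock_mhz=1500, ipc=1.0)

# Route 2: SIMD-style reduction in instruction count (assumed 4x fewer
# instructions for the same work), same clock and same IPC.
fewer_instructions = execution_time_seconds(250_000_000, clock_mhz=1000, ipc=1.0)

print(baseline, faster_clock, fewer_instructions)
```

Both routes shorten the execution time, which is why reducing the instruction count sits alongside frequency and IPC as a third lever on performance.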

Integer and basic office productivity applications, such as word processing and spreadsheets, tend to have many branches in the code that are difficult to predict, which reduces overall IPC potential. As a result, these applications are the most affected by micro-architectural changes such as deeper pipelines. In addition, a significant increase in performance on this type of application does little to improve the user's experience, as these applications only need to keep pace with the user's own reading and typing speed, and today's higher-end Pentium III and similar processors would suffice. Floating point and multimedia applications are much easier to deal with, as their branches are very predictable and they therefore have a higher IPC potential. As a result, these types of applications scale very well with frequency and are inclined to benefit from deeper pipelines. These applications also tend to demand abundant processing power: the more performance, the better the user's experience. The Intel Pentium 4's NetBurst Micro-architecture is lower on IPC, but according to Intel the increase in frequency capability more than makes up for this and delivers higher overall performance to the end user. This was achieved in the NetBurst Micro-architecture by implementing Hyper Pipelined Technology, in which the depth of the pipeline was doubled from that of the P6.

Pipeline

Hyper Pipelined Technology
The most common problem Intel has faced in the past has been increasing the clock speed of its processors, and history is littered with examples. The Pentium Classic and Pentium MMX reached as far as 233MHz before they fizzled out. The Pentium Pro, a P6-generation processor, reached its maturity at 200MHz. It was then replaced by the Pentium II, which moved the L2 cache off-die and, after a die shrink, managed to reach a grand 450MHz. Die shrinking is another way of gaining clock speed, but one which in the long term does not yield enough to make investing in a new manufacturing plant profitable. Continuing the P6 theme, we all saw the disaster Intel faced when it attempted to push the Pentium III over the 1GHz barrier, which ended with a recall of all its 1.13GHz processors. It was an easy mistake to make, as the market pressure on Intel was too great not to take a 'stab at it'. This gamble was purely down to its 0.13-micron fabrication process, which was and still is some way in the future (anticipated Q3 2001).

This brings us to another way of increasing clock speeds. As opposed to shrinking the die, one could make the processor do less in each clock cycle. Intel achieves this with what it calls Hyper Pipelined Technology. In simple terms, this means increasing the number of stages in the processor's pipeline: the deeper the pipeline, the more stages an instruction has to pass through to reach the end, and thus the less is achieved per clock. The trade-off is that the processor can be ramped up to higher clock rates and, if it is ramped far enough, it ends up achieving more than was given up.
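How far the clock has to be ramped before the trade-off pays for itself follows directly from Performance = MHz x IPC. The sketch below is a rough illustration only: `break_even_clock_mhz` is a hypothetical helper, and the 20% IPC loss is an assumed figure, not one published by Intel.

```python
def break_even_clock_mhz(old_clock_mhz, old_ipc, new_ipc):
    """Clock the deeper-pipelined design needs so that MHz x IPC matches the old design."""
    if new_ipc <= 0:
        raise ValueError("new_ipc must be positive")
    return old_clock_mhz * old_ipc / new_ipc


# Assumed example: a 1000MHz design with an IPC of 1.0 is replaced by a
# deeper-pipelined design that achieves 20% less work per clock (IPC 0.8).
required = break_even_clock_mhz(1000, old_ipc=1.0, new_ipc=0.8)
print(f"Break-even clock: about {required:.0f}MHz")
```

Under these assumed numbers, any clock speed above the break-even point is a net gain for the deeper pipeline; below it, the shallower design is still faster.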
The original Pentium (P5) featured only a 5-stage pipeline, which is small by today's standards. This was increased with the introduction of the Pentium Pro, Pentium II and Pentium III, which featured a 10-stage pipeline, doubling the P5 pipeline and restricting the clock speed to 1GHz. The Intel Pentium 4 doubles the length of the pipeline again, to 20 stages. This brings us back to branch prediction and the risk of mis-predicting. Although branch prediction algorithms are highly accurate, they are not 100% accurate. If the processor mis-predicts a branch, the pipeline has to start all over again, and with 20 stages this results in a longer recovery time, which makes for a lower IPC. To minimise this, the NetBurst Micro-architecture implements an Advanced Dynamic Execution engine and an Execution Trace Cache (see the Architectural Features of Intel Pentium 4 Processor section).
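The effect of pipeline depth on recovery time can be sketched with a simple textbook-style model in which each mis-predicted branch adds a fixed number of stall cycles. This is a simplification under stated assumptions: the penalty is taken to equal the pipeline depth, and the branch fraction and mis-predict rates are hypothetical values, not figures for any real processor.

```python
def effective_ipc(base_ipc, branch_fraction, mispredict_rate, penalty_cycles):
    """IPC after adding average stall cycles per instruction from branch mis-predicts."""
    base_cpi = 1.0 / base_ipc  # cycles per instruction with perfect prediction
    stall_cpi = branch_fraction * mispredict_rate * penalty_cycles
    return 1.0 / (base_cpi + stall_cpi)


# Assumed workload: 20% of instructions are branches, 5% of those are mis-predicted.
ipc_10_stage = effective_ipc(1.0, 0.20, 0.05, penalty_cycles=10)
ipc_20_stage = effective_ipc(1.0, 0.20, 0.05, penalty_cycles=20)

# Same 20-stage pipeline, but with an assumed better predictor (2% mis-predicts).
ipc_20_stage_better_predictor = effective_ipc(1.0, 0.20, 0.02, penalty_cycles=20)

print(f"10-stage: {ipc_10_stage:.3f}")
print(f"20-stage: {ipc_20_stage:.3f}")
print(f"20-stage, better prediction: {ipc_20_stage_better_predictor:.3f}")
```

In this model, doubling the penalty lowers IPC for the same workload, while reducing the mis-predict rate recovers much of the loss, which is the motivation for pairing a deep pipeline with stronger branch prediction.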
