Architecture Summary
The following features summarise the architecture of the AMD Athlon Thunderbird processor:
Multiple x86 instruction decoders - implemented by internally decoding x86 instructions into fixed-length
"Macro-Ops" for higher instruction throughput and increased processing power.
Three out-of-order, superscalar, pipelined address calculation units
Three out-of-order, superscalar, pipelined integer units
Three out-of-order, superscalar, pipelined floating-point execution units - which execute all MMX, 3DNow!
Technology, and x87 floating-point instructions.
Enhanced 3DNow! Technology - with new instructions to enable improved integer math calculations for speech or video
encoding and improved data movement for Internet plug-ins and other streaming applications.
200-MHz System Bus for x86 Platforms - (100MHz DDR, 200MHz effective, EV6) - this is the fastest front-side bus
implementation in any x86 platform currently available. AMD has made this possible by licensing the Alpha EV6 interface from
Digital Equipment Corp., and has answered an industry challenge by delivering a high-speed system interface for x86 platforms
without sacrificing legacy x86 compatibility. The AMD Athlon processor's front-side bus is scalable to operate beyond 400MHz.
At 200MHz it delivers a peak data transfer rate of 1.6 Gbytes/sec and provides 64-byte burst transfers, well above
what the Intel Pentium III's front-side bus offers.
High-Performance Cache Design - The most important change of all: the cache architecture features an integrated
128-Kbyte L1 cache and a 16-way set-associative, on-die 256-Kbyte L2 cache that operates at full core clock speed, providing a
total of 384Kbytes of on-chip cache. Integrating the L2 cache on-die increases the size of the 0.18-micron Athlon core
slightly, yet the result is still smaller than the original 0.25-micron Athlon die: the Thunderbird processor's
37-million transistor die measures 120mm², compared with 184mm² for the original Athlon. As discussed in our
previous Duron article, the Thunderbird adopts the same 'exclusive' cache architecture instead of the conventional 'inclusive'
cache architecture adopted in the current Intel Celeron and Pentium III Coppermine processors. What this basically means is that
data stored in the L1 cache is not duplicated in the L2 cache, hence AMD's claim of 384Kbytes of on-chip cache is technically correct.
The L2 cache on Thunderbird features a 64-bit data path, while the Pentium III Coppermine features a 256-bit data path, giving
the Intel Coppermine four times the L2 bandwidth per clock of the Thunderbird. Overall though, it is the L1 and L2 architecture
on the Thunderbird processors that seems to win the day for AMD.
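
The figures quoted above can be checked with some quick arithmetic. The sketch below is only an illustration: the function names are made up for this example, the 64-bit width of the front-side bus is an assumption (the text gives only the clock and the 1.6 Gbytes/sec peak rate), and the cache model is deliberately simplified to "exclusive capacities add, inclusive capacity is bounded by the larger level".

```python
# Back-of-the-envelope checks for the bus and cache figures in the summary.
# All names here are hypothetical helpers, not part of any AMD tool or API.

def peak_bandwidth_bytes(clock_hz: int, transfers_per_clock: int, bus_width_bits: int) -> int:
    """Peak transfer rate in bytes per second for a simple parallel bus."""
    return clock_hz * transfers_per_clock * bus_width_bits // 8

def unique_cache_kb(l1_kb: int, l2_kb: int, exclusive: bool) -> int:
    """Simplified capacity model.

    Exclusive: L1 and L2 hold different data, so the capacities add.
    Inclusive: L1 contents are duplicated in L2, so the unique capacity
    is limited to the larger of the two levels.
    """
    return l1_kb + l2_kb if exclusive else max(l1_kb, l2_kb)

# 100MHz clock, double data rate, assumed 64-bit data bus.
fsb_bytes_per_sec = peak_bandwidth_bytes(100_000_000, 2, 64)

# Thunderbird's 128-Kbyte L1 and 256-Kbyte L2 under each policy.
exclusive_kb = unique_cache_kb(128, 256, exclusive=True)
inclusive_kb = unique_cache_kb(128, 256, exclusive=False)

# Ratio of the L2 data-path widths (Coppermine 256-bit vs Thunderbird 64-bit).
l2_path_ratio = 256 // 64

print(fsb_bytes_per_sec / 1e9, "Gbytes/sec")
print(exclusive_kb, "Kbytes exclusive vs", inclusive_kb, "Kbytes inclusive")
print(l2_path_ratio, "x wider L2 data path on Coppermine")
```

Under these assumptions the bus works out to 1.6 Gbytes/sec, and the same two cache sizes give 384Kbytes of unique data when exclusive but only 256Kbytes if they were inclusive, which is why the exclusive design matters for AMD's total.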