Following Intel's "tick-tock" methodology of introducing a new processor architecture every two years and then refining it the following year, the Core i7 (code-named Nehalem) is the first processor built on the brand-new architecture Intel introduced in November of 2008. This means that new features and performance levels physically impossible for previous-generation chips are now within your reach. But aside from the regular marketing hype, what tangible benefits does the new Core i7 bring to you? Read on to get the facts about whether the new computing platform is right for you!
While many of the internal computational units of Core i7 are based on the previous-generation Core 2 architecture (because it is so efficient), it’s the slew of brand-new features that justifies the next-generation claim. The features are:
1. Removal of the traditional Front Side Bus (FSB) in favor of QPI. In previous-generation Intel designs, the processor “talked” to the rest of the system, as well as to memory, through its FSB. If you look at the X48 platform diagram below, you will see that the maximum bandwidth was around 12.8 Gigabytes per second (GB/s). The new Quick Path Interconnect (QPI) provides 25.6GB/s, thus doubling the processor’s communication bandwidth to the rest of the machine.
2. Integrated memory controller: on previous-generation designs, a processor needing some data from system memory (RAM) first had to follow the FSB path to the Northbridge and only then get to memory (see X48 diagram). So not only did it have to take the relatively narrow FSB, but there was also a “middleman” in the form of the Northbridge, introducing some extra latency. Nehalem shuns this design deficiency and goes directly to memory, the same way DPC goes to the customer! This means 25.6GB/s of pure unobstructed memory bandwidth (see X58 diagram). An attentive reader may ask: ‘Wait, doesn’t the dual-channel DDR3 memory in the older design provide 2 x 12.8 = 25.6GB/s of bandwidth?’ This may be so, but while the bandwidth from RAM to the Northbridge is 25.6GB/s, the speed at which the Northbridge can then relay the data to the processor is a mere half of that (12.8GB/s)! Talk about a bottleneck!
3. Hyper-Threading: this technology makes every one of Core i7’s 4 cores appear as two cores to the operating system. While the technology was implemented with varying degrees of success in the Pentium 4 generation, back then software was not really designed to take advantage of multiple work threads. Today, programs capable of multi-threading (such as video editing and encoding) see an average performance increase of 33% with the technology enabled! (see task manager screenshot)
4. SSE 4.2 support: short for Streaming SIMD (single instruction, multiple data) Extensions, SSE is essentially a set of “shortcut” instructions for software to use. For a simplified example, rather than asking the processor to “add a number, move to the next one, add a number, move to the next one, …” for 1000 numbers, an SSE-aware program will “tell” an SSE-supporting processor to “take each number, add to it, and repeat 1000 times”. This makes certain operations execute extremely fast, and SSE 4.2 is the latest iteration of the SSE family.
5. Power control unit: this technology enables the processor to slow down or “down-clock” unused cores when possible, to ensure a higher performance-per-watt ratio. What is new in the Core i7 family is the ability to completely shut down unused cores! They become available again in a split second as soon as duty calls.
6. Turbo mode: despite noticeable progress, most of the software out there (especially games) has barely learned how to use dual cores, let alone 4 cores with Hyper-Threading. So Intel engineers have found an ingenious way to boost performance in single-threaded operations: once a single-threaded app is detected, the one or two cores assigned to it get their frequency increased (over-clocked) by 133 or 266MHz! What’s more, should the thermal conditions allow, all 4 cores get dynamically over-clocked by 133MHz in multi-threaded tasks as well! Having over-clocked dual and quad-cores for the past few years, we can tell you that the performance increase from a CPU speed increase is pretty much linear in most scenarios.
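For readers who like to check the math, the bandwidth and Turbo figures from the list above can be roughly reproduced with a few lines of Python. This is only a back-of-the-envelope sketch: the transfer rates and bus widths (1600 million transfers per second on an 8-byte FSB, 6.4 billion transfers per second at 2 bytes each way for QPI) and the 2660MHz stock clock are assumptions picked to match the quoted numbers, not figures taken from the diagrams.

```python
# Back-of-the-envelope bandwidth figures in GB/s (1 GB = 10**9 bytes).
# Transfer rates and bus widths are ASSUMPTIONS chosen to reproduce the
# 12.8 and 25.6 GB/s figures quoted in the article.

def bandwidth_gbs(transfers_per_sec, bytes_per_transfer, links=1):
    """Peak bandwidth = transfer rate x width x number of links."""
    return transfers_per_sec * bytes_per_transfer * links / 1e9

# Front Side Bus: assumed 1600 million transfers/s on a 64-bit (8-byte) bus.
fsb = bandwidth_gbs(1600e6, 8)

# QPI: assumed 6.4 billion transfers/s, 2 bytes each, both directions counted.
qpi = bandwidth_gbs(6.4e9, 2, links=2)

# Older platform: dual-channel DDR3 at 12.8 GB/s per channel (from the article).
old_ram = 2 * 12.8

# The processor can only pull data as fast as the slowest hop on the path.
old_effective = min(old_ram, fsb)

print(f"FSB: {fsb:.1f} GB/s, QPI: {qpi:.1f} GB/s")
print(f"Old platform, RAM to CPU: {old_effective:.1f} of {old_ram:.1f} GB/s")

# Turbo mode: the article quotes steps of 133 MHz.
# The 2660 MHz stock clock is a hypothetical example value.
STEP_MHZ = 133
stock_mhz = 2660
turbo_clocks = [stock_mhz + steps * STEP_MHZ for steps in (1, 2)]
for clock in turbo_clocks:
    print(f"{clock} MHz is {clock / stock_mhz - 1:.1%} above stock")
```

The `min()` line is the whole bottleneck argument in miniature: however fast the memory is, the older design cannot deliver more than the FSB lets through.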
Additional features of the new architecture include a re-arrangement of the cache hierarchy. Cache is high-speed memory built into the processor itself, and that’s where the processor looks for information first. Imagine this scenario: a craftsman requires a certain tool for the job at hand. First, he will check whether it is in his hand (L1 cache), then whether it’s on the work bench (L2 cache), then whether it’s in the drawer (L3 cache). If it is not found in the immediate vicinity, he may have to go to another part of the shop to find it (RAM). If it still is not there, he may have to go out and buy it (Hard Drive), or even order it in (DVD). So the larger the immediate workspace is, the better the chance of finding the needed tool immediately available!
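The craftsman analogy can be sketched in a few lines of Python. This is purely an illustration of the "check the closest place first" idea, not a model of how the hardware actually works; the tool names and the `find_tool` helper are made up for the example.

```python
# Storage levels in the order they are checked, closest (fastest) first.
# The contents of each level are made-up illustrative data.
LEVELS = [
    ("L1 cache (in hand)", {"hammer"}),
    ("L2 cache (work bench)", {"chisel"}),
    ("L3 cache (drawer)", {"saw"}),
    ("RAM (elsewhere in the shop)", {"drill"}),
    ("Hard drive (the store)", {"lathe"}),
]

def find_tool(tool):
    """Return (level name, places checked) for the first level holding the tool.

    If no level has it, return (None, total number of levels checked).
    """
    for checked, (name, contents) in enumerate(LEVELS, start=1):
        if tool in contents:
            return name, checked
    return None, len(LEVELS)

for tool in ("hammer", "saw", "drill"):
    where, checked = find_tool(tool)
    print(f"{tool}: found in {where} after checking {checked} place(s)")
```

The further down the list a tool sits, the more places have to be checked first, which is why a hit in the nearest levels saves so much time.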
The second-generation Core 2 architecture (Wolfdale Core 2 Duo E8--- series and Yorkfield Core 2 Quad Q9--- series) has 32KB instructions + 32KB data = 64KB of L1 cache per core and 6MB of shared L2 cache per pair of cores. Core i7 Nehalem has 32KB data + 32KB instructions of L1 cache and 256KB of L2 cache per core, plus 8MB of L3 cache that all 4 cores share. This rearrangement is due more to architectural specifics than to anything else, so comparing it to Core 2’s cache hierarchy would be apples-to-oranges. It’s just something to keep in mind for those of you who read technical briefs for the sake of enriching your understanding of how computers work.
Also, Core i7 uses 45nm (nanometer, a billionth of a meter) technology in order to fit its 731 million transistors on its 263mm² die. To compare, the width of an average human hair is about 80,000nm. So the technology used to run your applications is pretty much on the nano-machinery level, as each 45nm feature is only about 180 atoms across. Imagine that!
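To put those numbers in perspective, here is a quick sketch of the arithmetic in Python. It assumes the transistor count is 731 million; the remaining figures are the ones quoted above.

```python
# Figures from the article; the transistor count is taken to be 731 million.
FEATURE_NM = 45              # process technology, in nanometers
HAIR_NM = 80_000             # width of an average human hair
TRANSISTORS = 731_000_000    # assumed: 731 million
DIE_MM2 = 263                # die area in square millimeters
ATOMS_PER_FEATURE = 180      # the article's "about 180 atoms"

# How many 45nm features fit side by side across one hair.
features_per_hair = HAIR_NM / FEATURE_NM

# Average number of transistors packed into each square millimeter.
density_per_mm2 = TRANSISTORS / DIE_MM2

# Atom spacing implied by "180 atoms in 45nm".
implied_atom_nm = FEATURE_NM / ATOMS_PER_FEATURE

print(f"Features across one hair: about {features_per_hair:,.0f}")
print(f"Transistors per mm²: about {density_per_mm2:,.0f}")
print(f"Implied size per atom: {implied_atom_nm} nm")
```

In other words, well over a thousand of these features would fit across a single hair, and each square millimeter of the die holds a few million transistors.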
All in all, the new architecture Core i7 brings to you is as good as it gets today. While the overall platform cost (mainboard and DDR3 triple-channel memory) comes at a premium due to the novelty of the technology as well as limited production, you do get what you pay for (like with all things at DPC). It is now available in our 900-class Eclipse UHD Extreme Performance PC, nicely equipped starting at 2499USD with no fine print.