Performance and Power Consumption Control Features in Intel Pentium 4 and Intel Xeon Processors – Different Names, the Same Principles

 

We have not analyzed the "thermal" technologies of modern processors for a long time. Our last article on the subject, Testing Thermal Throttling in Pentium 4 CPUs with Northwood and Prescott Cores, is already over 8 months old. Meanwhile, power saving processor technologies have recently been gaining popularity no less than the performance race. The power consumption control functions of processors fall into two clear categories: protection from overheating in emergencies, and reduction of power dissipation in normal operation when the processor is idle. It's no secret that processors spend most of their time idle, both in modern personal computers and in servers. Intel acknowledged this fact by introducing Enhanced SpeedStep technology for Xeon server processors. This article reviews the most important performance and power consumption control technologies implemented in Intel Pentium 4 series processors and briefly compares them with other similar technologies.

Testbed configurations

Testbed 1

  • CPU: 3.4 GHz Intel Pentium 4 (Prescott core)
  • Chipset: Intel 865PE
  • Motherboard: Gigabyte GA-8IPE1000 Pro2
  • Memory: 2x256 MB KingMax DDR-433
  • Video: ATI Radeon 9800 Pro
  • HDD: WD Raptor WD360, SATA, 10000 rpm, 36Gb

Testbed 2

  • CPU: Intel Xeon 3.4 GHz (Nocona core)
  • Chipset: Intel E7525
  • Motherboard: Supermicro X6DA8-G
  • Video: Leadtek PX350 TDH, NVIDIA PCX5900
  • HDD: WD Raptor WD360, SATA, 10000 rpm, 36Gb

Software

Test methods

In this article you will often come across three notions: Actual CPU Clock, CPU Load, and Throttled CPU Clock. Though some of them may seem self-evident, we should still dwell on the underlying principles and the measurement method for each of these parameters.

Actual CPU Clock is the number of cycles that the clock generator delivers to the processor core via its internal bus per unit time (with a unit time of 1 second, this gives the usual frequency in hertz). In effect, it is the frequency of the clock signal as reflected in the Time Stamp Counter (TSC), which can be read at any time with the RDTSC instruction of the IA-32 (x86) instruction set. The difference between two TSC readings divided by the time elapsed between them therefore gives the Actual CPU Clock. This standard method is used for real-time CPU clock measurement in most utilities that provide CPU information (CPU-Z, WCPUID, as well as the RightMark utilities RMMA, RMSpy, and RMClock).
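The TSC-difference method can be sketched in a few lines. This is only an illustration and not RMClock's code: Python cannot execute RDTSC directly, so the two counter readings are passed in as plain integers, and the function name `actual_cpu_clock_hz` and the sample readings are assumptions made for the example.

```python
def actual_cpu_clock_hz(tsc_start: int, tsc_end: int, interval_s: float) -> float:
    """Estimate the actual CPU clock from two TSC readings taken interval_s apart."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    # The TSC is a 64-bit counter; the mask handles a wrap-around between readings.
    elapsed_cycles = (tsc_end - tsc_start) & 0xFFFFFFFFFFFFFFFF
    return elapsed_cycles / interval_s


# Hypothetical readings: 340,000,000 ticks in 100 ms corresponds to 3.4 GHz.
clock = actual_cpu_clock_hz(1_000_000_000, 1_340_000_000, 0.1)
print(f"{clock / 1e9:.2f} GHz")
```

In a real tool the interval would come from an independent timer, and its accuracy limits the accuracy of the result.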

CPU Load is the number of duty CPU cycles divided by the total number of cycles per unit time. The definition is trivial, but measuring CPU load is not. First of all, note that CPU Load is not a differential but an integral quantity. To put it simply, it cannot be meaningfully measured over an infinitesimal period of time. (Technically it can, even with single-cycle precision, but the result carries no information: it is either zero for an "idle" cycle or one (100%) for an "effective" cycle.) That's why it makes sense to measure CPU Load over a relatively long period, for example 100 ms or even 1 second. The longer the period, the higher the measurement accuracy, but the slower the response: if the integration period is too long, the curves will be too smooth and abrupt changes in CPU Load will be lost. The necessary accuracy is dictated by the task at hand; in practice, 1 second is quite enough for precise measurements.
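The trade-off between integration period and responsiveness can be illustrated with a toy trace. The sketch below is an assumption-laden model, not a measurement: each list element stands for one sampled slice of time that was either fully busy (1) or fully idle (0), and `windowed_load` is a hypothetical helper name.

```python
from typing import List


def windowed_load(busy_flags: List[int], window: int) -> List[float]:
    """Average per-sample busy flags (1 = duty, 0 = idle) over fixed windows."""
    if window <= 0:
        raise ValueError("window must be positive")
    loads = []
    for start in range(0, len(busy_flags) - window + 1, window):
        chunk = busy_flags[start:start + window]
        loads.append(sum(chunk) / window)
    return loads


# Hypothetical trace: idle, then a short burst of full load, then idle again.
trace = [0] * 8 + [1] * 4 + [0] * 4

print(windowed_load(trace, 4))   # short window: the burst is clearly visible
print(windowed_load(trace, 16))  # long window: the burst is averaged away
```

With the short window the burst shows up as one window at full load; with the long window the whole trace collapses into a single 25% reading.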

We should say a few words about the CPU Load measurement itself. You can, of course, rely on the readings provided by the operating system, as most system application developers do. A significant disadvantage of this method is its relatively low accuracy: the system counter resolution is about 15 ms at best, so a 100 ms period yields no more than 6-7 counts, and the measurement error is on the order of 15-16%. Besides, the resulting quantity is largely conventional, above all with respect to the real device, the CPU, because only the developer knows what the operating system measures and how. In this research we use a new utility, RMClock, which is the first to use a fundamentally different method based on reading the counters of the CPU itself. This gives the measurements a quite tangible (or physical, as scientists put it) meaning.
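The arithmetic behind these figures can be checked quickly. The sketch below is our reading of the numbers quoted above, under the assumption that the error is dominated by one counter tick per measurement period; the helper name `os_counter_granularity` is hypothetical.

```python
def os_counter_granularity(window_ms: float, resolution_ms: float):
    """Return (counts per window, smallest load step as a fraction of the window)."""
    if window_ms <= 0 or resolution_ms <= 0:
        raise ValueError("window_ms and resolution_ms must be positive")
    counts = window_ms / resolution_ms
    step = resolution_ms / window_ms
    return counts, step


counts, step = os_counter_granularity(window_ms=100.0, resolution_ms=15.0)
print(f"{counts:.1f} counts per window, load step of about {step:.0%}")
```

A 15 ms tick in a 100 ms period gives between 6 and 7 counts and a step of 15%. If only six whole ticks fit, the step grows to one sixth, or roughly 16.7%, which is consistent with the 15-16% range quoted above.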

Going back to the definition, duty cycles are those spent by the physical CPU on executing the code of user applications (user mode) and of the operating system (kernel and user mode). We emphasize "physical" because Pentium 4 processors now feature Hyper-Threading technology, which presents one physical processor as two logical ones, and we are interested in readings from the real device as a whole. All other cycles are considered "ineffective" or "idle": during them the operating system puts the CPU to sleep by executing HLT instructions. The sum of duty and idle cycles must obviously equal the total number of cycles, so CPU Load can take values from 0% to 100% inclusive.

Now for the last and most involved quantity, Throttled CPU Clock. Measuring it is an exclusive feature of our new utility RMClock; as far as we know, no other SysInfo software can do it so far. To understand this quantity, we should run a few steps ahead and briefly explain what CPU throttling actually is. Its principle is quite simple: it consists in modulating the CPU clock, that is, a part of the total number of CPU cycles is forced "idle". We have put this notion third on purpose, because it is closely related to the first two.





CPU Clock Modulation. Sample Modulation with 25% Duty Cycle. (Source: IA-32 Intel(R) Architecture Software Developer’s Manual, Volume 3: System Programming Guide)

Clock modulation has one important effect: once it starts, the real CPU Load may drop below 100% even at full load. A part of the cycles in throttling mode are forced "idle", so the processor cannot spend 100% of its cycles on executing code, and its real load is lower. This drop in real CPU Load as clock modulation increases will not be detected by the operating system or by any utility based on OS methods. That is why we mentioned above the conventional character of CPU Load readings taken from the OS.
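The gap between the OS view and the real load can be expressed as a simplified model. This is a sketch under stated assumptions, not a description of any operating system's accounting: we assume the OS reports only whether runnable work was scheduled, and that the modulated clock passes a fixed fraction of cycles. The name `loads_under_modulation` and its parameters are hypothetical.

```python
def loads_under_modulation(total_cycles: int, duty_cycle: float, demand: float = 1.0):
    """Return (OS-style load, real load) for a simplified clock-modulation model.

    duty_cycle is the fraction of cycles the modulated clock lets through;
    demand is the fraction of time the OS has runnable work (1.0 = full load).
    """
    if total_cycles <= 0:
        raise ValueError("total_cycles must be positive")
    if not 0.0 < duty_cycle <= 1.0 or not 0.0 <= demand <= 1.0:
        raise ValueError("duty_cycle must be in (0, 1], demand in [0, 1]")
    # Assumption: the OS only sees whether a thread was scheduled, not stopped clocks.
    os_load = demand
    # Only cycles passed by the modulated clock can execute code.
    executed_cycles = total_cycles * duty_cycle * demand
    real_load = executed_cycles / total_cycles
    return os_load, real_load


os_load, real_load = loads_under_modulation(total_cycles=1_000_000, duty_cycle=0.25)
print(f"OS-reported load: {os_load:.0%}, real load: {real_load:.0%}")
```

With the 25% duty cycle from the figure, the model reports a full load from the OS side while only a quarter of the cycles do real work.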

With this in mind, RMClock determines Throttled CPU Clock as follows: it loads the CPU to 100% for a relatively short period of time (to minimize the influence on the total CPU Load) and counts both the total number of cycles and the number of duty cycles in that period. The ratio of duty cycles to total cycles is the throttling level. Multiplied by the Actual CPU Clock, it gives the required quantity, Throttled CPU Clock. The latter can obviously range from 0 (only theoretically) to the Actual CPU Clock, which it cannot exceed.
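The calculation described here reduces to one ratio and one multiplication. The sketch below only restates that arithmetic; it does not reproduce how RMClock obtains the cycle counts, and the function name and sample values are assumptions.

```python
def throttled_cpu_clock_hz(duty_cycles: int, total_cycles: int, actual_clock_hz: float) -> float:
    """Throttled CPU Clock = (duty cycles / total cycles) * Actual CPU Clock."""
    if total_cycles <= 0:
        raise ValueError("total_cycles must be positive")
    if not 0 <= duty_cycles <= total_cycles:
        raise ValueError("duty_cycles must lie between 0 and total_cycles")
    throttling_level = duty_cycles / total_cycles
    return throttling_level * actual_clock_hz


# Hypothetical probe on a 3.4 GHz CPU: half of the counted cycles did real work.
throttled = throttled_cpu_clock_hz(
    duty_cycles=500_000, total_cycles=1_000_000, actual_clock_hz=3.4e9
)
print(f"{throttled / 1e9:.2f} GHz")
```

The range check mirrors the statement that the result can never exceed the Actual CPU Clock.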

Performance and power consumption control technologies

This detailed explanation concludes the methodological chapter. Let's proceed to more interesting issues – the analysis of performance and power consumption control technologies in Intel Pentium 4 and Intel Xeon series processors. Intel processors currently use the following technologies of this kind:

  • Emergency overheating detector
  • Automatic thermal monitoring mechanism – Thermal Monitor 1 and Thermal Monitor 2
  • Software controlled on-demand clock modulation
  • Enhanced Intel(R) SpeedStep

Let's examine each of these functions.

Emergency overheating detector

This is the simplest mechanism. It is completely automatic: it cannot be controlled, and its presence in a processor cannot even be detected. It first appeared in P6 series processors and is also implemented in Pentium 4, Xeon and Pentium M processors. The idea is simple: on reaching a certain thermal threshold (set at the production stage), the processor just suspends execution until RESET# (as claimed in the manufacturer's documentation). In practice, at about 100°C, Pentium 4 and Xeon based systems actually power off and stay off until the PWRGOOD signal, that is, until the next power-on.

 

Dmitri Besedin (dmitri_b@ixbt.com)
February 9, 2005






Copyright © by Digit-Life.com, 1997-2008. Produced by iXBT.com
Design by Explosion