April 2003
If there's one thing constant in the field of digital video, it's that you can never have too much processing power. "When was the last time you heard someone say, 'The thing I don't like about this application is that it's too fast, too powerful?'" asks Ioannis Katsavounidis, software manager for InterVideo, makers of WinDVD and WinDVD Creator. "It doesn't happen." Whether it's capture, transcoding, effects preview, rendering, or formatting a DVD disc image, there's always a need to do it faster and better, and hopefully at lower cost.
For years, the standard approach to satisfying this appetite for power, at least for the professional video market, was to handle processor-intensive tasks with specialized hardware, using the host CPU for less-demanding tasks such as user interface and control. Meanwhile, for those who couldn't afford hardware-based systems, software-based processing meant doing without some processes and waiting a long time for the results of others. As CPU throughput has accelerated, however, software-only systems have become increasingly capable, particularly when run on dual-processor machines.
The latest ammunition for the software-only insurgency is Intel's introduction of "hyper-threading" processors for desktop PCs, which potentially boost performance by up to 25%. The key word, of course, is "potentially." The promise of hyper-threading is real, but the extent to which end-users will see the difference in their daily video routines depends not only on what tasks they are trying to accomplish, but also on how each task is implemented in their toolset.
While hyper-threading (dubbed "HT" for short) isn't exactly new, it has previously been available only on Intel's Xeon processors, used primarily in high-end servers. Late last year, Intel started enabling the technology in Pentium 4 CPUs, specifically 3.06GHz chips aimed at top-of-the-line desktop machines. Requirements for using the P4's HT capabilities include an HT-enabled chipset and BIOS and an operating system that includes HT optimizations, either Microsoft Windows XP Professional or certain versions of Linux. A new "Intel Pentium 4 Processor with HT Technology" logo is being used to indicate systems that are HT-ready.
As described by Intel, the impetus behind HT is the realization that "clock speed is only half the story. Faster clock speeds are an important way to deliver more computing power...[but] the other route to higher performance is to accomplish more work on each clock cycle." To do this, HT takes advantage of multi-threading, which is the ability of software to subdivide a process into "threads," meaning tasks that can be scheduled and run independently. "A thread," Intel explains, "shares code and data with the parent process but has its own unique stack and architectural state." The resources needed to execute a thread are allocated by the operating system when the thread is scheduled, then freed when the thread is completed.
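To make the idea of a thread concrete, here is a minimal Python sketch of two threads that share their parent process's data while each keeps its own local working state. The names (worker, shared_results, the split of the range) are invented for illustration and do not come from Intel or InterVideo.

```python
import threading

# Data owned by the process: every thread sees the same dictionary.
shared_results = {}
lock = threading.Lock()

def worker(name, start, stop):
    # 'total' is a local variable, so it lives on this thread's own stack
    # and is invisible to the other thread.
    total = 0
    for n in range(start, stop):
        total += n
    # Writes to shared data are guarded so the two updates cannot collide.
    with lock:
        shared_results[name] = total

threads = [
    threading.Thread(target=worker, args=("first_half", 0, 500)),
    threading.Thread(target=worker, args=("second_half", 500, 1000)),
]
for t in threads:
    t.start()   # hand each thread to the scheduler
for t in threads:
    t.join()    # wait until both have finished

grand_total = sum(shared_results.values())
print(grand_total)  # sum of 0..999
```

Each call to worker runs independently once started, which is the property a multiprocessor or HT-enabled system can exploit.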
Multi-threading really comes into its own on a multiprocessor system, where the threads can be executed on different physical processors. Without multiprocessing, a PC is like a one-lane road. If one thread is waiting for I/O to complete or for another task to provide information it needs to move forward, other threads can get backed up behind it. When threads can run in parallel, on the other hand, traffic in one lane can keep moving even when there's a slowdown in the other. Thus, multiprocessing can maximize the overall throughput of multi-threaded processes by minimizing the impact of bottlenecks.
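The two-lane picture can be sketched in a few lines of Python, assuming a hypothetical pair of tasks: one stalled on simulated I/O (a threading.Event stands in for the slow device) and one doing pure computation. In standard CPython the threads take turns on the interpreter, so this illustrates independent scheduling rather than true hardware parallelism.

```python
import threading

finished = []                    # order in which the threads complete
io_done = threading.Event()      # stands in for a slow I/O operation

def io_bound():
    io_done.wait()               # blocked until the simulated I/O completes
    finished.append("io")

def compute_bound():
    sum(range(100_000))          # work that needs no I/O at all
    finished.append("compute")

io_thread = threading.Thread(target=io_bound)
cpu_thread = threading.Thread(target=compute_bound)
io_thread.start()
cpu_thread.start()

cpu_thread.join()                      # the compute thread runs to completion
still_waiting = io_thread.is_alive()   # while the I/O thread is still stalled
io_done.set()                          # now let the simulated I/O finish
io_thread.join()

print(finished, still_waiting)
```

The compute thread finishes while the other is still blocked: the slowdown in one lane does not hold up traffic in the other.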
Intel has integrated HT into the Pentium 4 to provide the benefits of multi-threading even on machines with only a single physical processor. "HT technology allows a single Pentium 4 processor to function as two virtual or logical processors," Intel says. "The processor can execute two threads simultaneously, use resources that otherwise would sit idle, and get more work done in the same amount of time."
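From the software side, logical processors are what the operating system exposes. As a small illustration, Python's standard library reports the number of logical CPUs the system presents. It does not say how many physical processors lie behind that number, so on an HT-enabled machine the figure would be expected to exceed the physical count.

```python
import os

logical = os.cpu_count()   # None if the count cannot be determined
if logical is None:
    print("Logical processor count could not be determined")
else:
    print(f"The operating system reports {logical} logical processor(s)")
```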
Intel doesn't achieve HT's "two-in-one" trick by squeezing two physical processors into a single chip. Instead, the chip has the ability to simultaneously handle two "architectural states," each of which constitutes a logical processor. Each state has its own set of general-purpose registers, control registers, and other elements needed to track the flow of a program or thread. Other than architectural state, however, all processor execution resources—the units on the processor that perform work such as addition and multiplication—are either shared or partitioned between the logical processors. The performance boost comes because HT can keep those execution resources busier than in a standard processor.
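A toy model can show why sharing execution resources between two architectural states keeps them busier. The sketch below is purely illustrative: it assumes a single execution unit, and the instruction streams ("W" for a cycle of work, "S" for a stall such as waiting on memory) are invented rather than measured on any Pentium 4. The cycle counts it prints are artifacts of that invented pattern, not a prediction of real-world gains.

```python
def run_alone(thread):
    # One architectural state: every slot, work or stall, costs a cycle.
    return len(thread)

def run_shared(threads):
    """Cycles needed when all threads share a single execution unit."""
    position = [0] * len(threads)
    cycles = 0
    while any(p < len(t) for p, t in zip(position, threads)):
        cycles += 1
        unit_free = True
        for i, t in enumerate(threads):
            if position[i] >= len(t):
                continue                 # this thread has finished
            if t[position[i]] == "S":
                position[i] += 1         # stalls elapse without the unit
            elif unit_free:
                unit_free = False        # this thread gets the unit
                position[i] += 1
            # otherwise the thread waits for the unit this cycle
    return cycles

a = "WSSWWSSW"   # hypothetical stream: work, two stalls, work, ...
b = "WSSWWSSW"
work = (a + b).count("W")
back_to_back = run_alone(a) + run_alone(b)
interleaved = run_shared([a, b])

print("utilization, one state at a time:", work / back_to_back)
print("utilization, two states sharing :", work / interleaved)
```

In this model the second thread fills cycles the first would have spent stalled, so the same work completes in fewer total cycles. The two threads also sometimes contend for the unit, which is one reason a real gain is well short of doubling.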