The DNA of DNE
Fundamentally, Apple's Distributed Network Encoding (DNE) is not like the similarly named capabilities found in other NLEs or in high-end motion graphics and 3D tools. DNE is designed to be used independently of other applications through batch processing. It works with Compressor, which ships with DVD Studio Pro and functions as a standalone batch encoder. Specifically, the DNE option installs with DVD Studio Pro as the Qmaster component.
One or more files are submitted to be encoded to MPEG-2 (for DVD authoring), MPEG-4 (for web streaming and download), mobile device-compatible formats, audio formats, HD DVD formats, or the newer H.264 codec; Compressor also handles advanced format conversions.
The encoding process can be modified by adjusting the data rate, frame size, speed, aspect ratio, and so on, thereby yielding virtually unlimited choices for customized encoding. Compressor is the Swiss Army knife of audio and video compression, as long as the output format is QuickTime-readable. You can also use Compressor directly to output video from Final Cut Pro.
Distributed rendering or encoding is a process in which one computer breaks a large file into smaller segments and sends them to other computers, each of which renders its portion. The controlling computer then marries these segments back together into your final video file. The benefits include decreased encoding time from using more than one computer and the ability to offload your rendering onto other machines.
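The workflow just described can be sketched in a few lines of Python. Everything here is illustrative: encode_segment() is a hypothetical stand-in for the real per-node encoder, and the splitting logic is a guess at the general approach, not Qmaster's actual algorithm.

```python
from concurrent.futures import ThreadPoolExecutor

def split_into_segments(total_frames, num_nodes):
    """Divide a clip into one contiguous frame range per node."""
    size, extra = divmod(total_frames, num_nodes)
    segments, start = [], 0
    for i in range(num_nodes):
        end = start + size + (1 if i < extra else 0)
        segments.append((start, end))
        start = end
    return segments

def encode_segment(segment):
    """Hypothetical stand-in for one node encoding its frame range."""
    start, end = segment
    return f"encoded[{start}:{end}]"

def distributed_encode(total_frames, num_nodes):
    segments = split_into_segments(total_frames, num_nodes)
    # Each "node" encodes its segment in parallel...
    with ThreadPoolExecutor(max_workers=num_nodes) as pool:
        pieces = list(pool.map(encode_segment, segments))
    # ...and the controller stitches the results back together in order.
    return pieces
```

The key properties are the ones the article relies on: the segments cover the whole file with no overlap, and the controller reassembles them in their original order.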
Apple's DNE is designed to work with many different workflows and configurations, which is a testament to the quality of the software's design. You can offload your export file onto another computer to encode on a cluster of your choosing. You can also use your local Mac and group it with the cluster to speed up your encoding.
When a job is submitted, the controller breaks up the job file depending on the number of computers it has available. It routes each job to the machine best suited for the task. The size of the file and type of encoding will affect how small or large the individual segments are. When running speed tests for MPEG compression with three or four G5s, for example, the controller would typically divide the job into three or four segments. On the same network with H.264, a more compression-intensive codec, the controller would break the file into a greater number of smaller segments to avoid overloading any single computer.
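The controller behavior observed above can be approximated with a simple heuristic. The per-codec multipliers below are assumptions chosen to mirror the observed pattern (roughly one segment per machine for MPEG-2, several smaller segments per machine for H.264); they are not Qmaster's real scheduling logic.

```python
def plan_segments(num_nodes, codec):
    """Rough guess at how many segments a controller might create.

    The multipliers are illustrative assumptions: heavier codecs get
    more, smaller segments so no single machine is overloaded.
    """
    multipliers = {"mpeg2": 1, "h264": 3}
    return num_nodes * multipliers.get(codec, 1)
```

With four nodes, this sketch yields four MPEG-2 segments but twelve smaller H.264 segments, matching the behavior seen in the speed tests.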
Sidebar: Rendering vs. Encoding
To understand how Apple's DNE works, it may be helpful to clarify the difference between rendering and encoding. Most NLEs and motion graphics applications use "rendering" to describe processing your files for real-time playback, exporting, effects, and so on.
Since Compressor (usually) creates a new file in a different format, Apple considers this "encoding." With this meaning in mind, we refer to the process of Compressor reading and writing new files as encoding.
Unlike Sony Vegas, Compressor is designed solely for output purposes. With Sony Vegas, you can offload the rendering of effects, motion, audio, etc. for your timeline. In Final Cut Pro (FCP), you can't. Interestingly, Apple's DNE can also be used to encode Maya or "generic" files.
For the purpose of answering our performance questions, and because "generic" is used primarily to refer to Unix-based rendering, we're going to stick with the process of using Compressor for distributed encoding of files created in tools from Apple's Final Cut Studio such as FCP, DVD Studio Pro, and Motion.
While it might have been nice to connect a rack of servers to a rack of drives to see some amazing performance, that's not the real world that most of us work in. The speed tests that follow were designed for a "real-world" postproduction facility with two to five Macs.
The figure shown here (left) illustrates the type of setup we used for testing.
In an effort to answer my original questions about performance boosts using multiple workstations, I eagerly ran the speed tests at the postproduction center where I work (the television production unit at University of California-Santa Barbara). We currently have four Dual 2.5GHz G5s, one Dual 2.0GHz G5, and one 1GHz Apple Xserve. The variety of systems available allowed for multiple speed-test options. Going into the testing, I couldn't see how a small post facility would benefit from using the DNE option if it did not gain a significant speed boost by clustering five G5s together.
For the Vegas test, Drenk was able to run the distributed rendering on three PCs from three different manufacturers with three different operating systems and three different sets of hardware specs. From conception, Vegas's distributed rendering is designed to spread the load over a few PCs. Apple's design lets you use as few as two Macs, or hundreds more.
In our lab, we link all the Macs through a gigabit router, connecting via the Ethernet port. After reading the Vegas article and running my own tests, I would strongly recommend moving to gigabit speeds before trying to set up a distributed rendering/encoding environment (Drenk used a 100Mbps, or 100BaseT, Ethernet network). Compressor, like Vegas, copies the footage being rendered to a temporary directory before it starts processing, so if you have a large file, it can take a lot of time just to transfer the media.
It took some time to figure out how best to use the new technology in our postproduction environment. Its importance became apparent when we considered our need to output to MPEG-2; we needed it both as a broadly compatible format for client copies and as a distribution format for satellite broadcast. We wanted faster digital file encoding to a single output file, so it seemed like the perfect time to tap into the power of Apple's DNE.
As the tests show, using one G5 to compress a DV-sourced project to MPEG-2 is slower than realtime. In fact, it will usually take two to three times longer than realtime to encode a program to MPEG-2. What I was really looking for in my DNE testing was some way to achieve real-time performance when encoding to MPEG-2 straight out of my FCP sequence. PCs have always had the edge over Macs when encoding to MPEG-2. When rendering files on our in-house PC, I can get near-real-time performance without any additional hardware or upgrades. Because I prefer to do my postproduction work in FCP, I'd be eager to devise a setup that allows me to get this kind of encoding performance in a Mac-based editing environment.
While configuring your DNE, one of the first challenges you need to address is data access and management. As the tests show, one of the biggest variables in network encoding is how fast the computers can access the data. If you're reading from and writing back to the same hard drive, you create a bottleneck: the encoding cluster is trying to read from the same drive onto which it's saving the movie file. When three to five computers all hit that drive at the same time, it creates a logjam.
The rate at which each machine can access the media also directly affects rendering speed. For example, in one speed test with five G5s, the media was stored locally on a FireWire drive attached to one of the G5s. The G5 attached to the FireWire drive encoded its files three to four times faster than the others! It was not a faster machine; it got the job done more quickly because it had faster access through the local FireWire cable than did the G5s that had to reach the drive over the network.
It complicates matters even more when you export your file to Compressor directly out of the FCP sequence, because the entire encoding cluster must have access to the drive(s) that hold the original media. In other words, when working with DNE from an FCP sequence, you will need to "mount" the local hard drive (the one that contains the media files) onto each of the cluster computers through your network.
It quickly became clear that Apple designed its DNE solution to work optimally in a high-bandwidth network environment, with particularly high reliance on connection speed. The fastest available option would be a Fibre Channel network with dedicated servers to work as the controller for the distributed encoding. This way, the network would have near-instant access to read and write the data, and an Xserve would act as the traffic cop for all of the information going back and forth.
Configuring Your Distributed Network
When installing DVD Studio Pro, make sure that you check the Qmaster and Qadministrator check boxes.
Next, go into System Preferences to configure your computer so the rest of the network can see and use your computer, as shown in the figure (left).
The Qmaster pane works in much the same way as your Sharing preferences. You can choose from QuickCluster with services, Services and cluster controller, and Services only.
QuickCluster is the easiest way to get started with distributed encoding.
In our network, I wanted the ability to use any of the G5s as a controller and/or as a node, so Services and cluster controller was the best choice for running the speed tests (shown as selected in the image on the left). QuickCluster automates the configuring and administering for your cluster. The Services and cluster controller lets the computer act as a controller and/or a service node. Services only lets the computer work as a cluster node. When you've chosen the configuration that's best-suited to your network setup, click Start Sharing in the lower right, and you're done.
After you choose your setting in the Qmaster panel, you probably won't need to open it again unless you need to make changes to your initial settings. Most of the customization options available with this panel are intended for advanced setups.
You'll repeat the same Qmaster process with all of the Macs that you want to use, to define each Mac's role in the encoding setup. The figure at left shows how, once you've created your settings in the Qmaster panel, each computer acts as a client and can submit jobs to the cluster.
For Qadministrator to see and use your Macs for distributed encoding, they all need to reside on the same subnet. After you've assigned all your computers in the Qmaster panel, open Qadministrator from the Applications folder.
At this point, you should see a list of the computers shared with Qmaster (see figure, left). Create a new cluster by clicking on the plus box (#1 in the figure), and title your cluster as desired. Next choose your controller, either the local computer or another networked Mac. Then simply drag and drop the computers that you want to add to the cluster. You can create multiple clusters, but each computer can belong to only one cluster. If you are networking just a few computers together, you can probably get by with one cluster.
For the speed tests, I created two clusters, one for each controller I wanted to use. We won't go into the other tabs or settings, since they aren't required for basic distributed encoding. After you apply the settings for your cluster, it's ready to go! The next time you submit a job through Compressor, you'll have the choice of either "this computer" or your predefined cluster.
When deciding whether or not your cluster will work as is on future jobs, keep two questions in mind: where is your source file and where are you saving to? When using direct export from a sequence in FCP, you will need to mount the drive that contains any source footage that you will be compressing. With our DV projects we use FireWire drives, so when exporting from FCP, we use Apple File Share to connect to the appropriate FireWire drive for all the cluster nodes.
When using a self-contained file, you can skip this step. When encoding self-contained files with Compressor, the controller has all the information and sends it out to the appropriate computer as needed. But if you're exporting directly from FCP, each of the nodes needs to process the data the same way the local computer would: from within FCP. When you submit a file to be encoded, each node will automatically launch FCP to do its share of the work.
The challenge is to make sure that all of your computers are ready. Often, a node will need to reset a scratch disk or check the connection for the FireWire port. Either of these delays will make a node hang and fail to encode the file that has been sent to it.
It's a good idea to launch FCP and make sure you've mounted any necessary drives on every node before you submit a distributed encoding file. Once a file or project is submitted, each node will need to complete the job it's been given; otherwise the job will fail. After you submit a job for encoding, you can monitor the process through the batch monitor by choosing the cluster.
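A simple pre-flight check along these lines can catch a missing mount before a node hangs. This is a sketch, not part of Qmaster: the volume names are hypothetical examples, and on Mac OS X a mounted FireWire or network volume under /Volumes is a mount point, which is what os.path.ismount detects.

```python
import os

# Volumes every cluster node must see before a distributed FCP export;
# these names are hypothetical examples.
REQUIRED_VOLUMES = ["/Volumes/DV_Media", "/Volumes/Render_Scratch"]

def missing_volumes(required, is_mounted=os.path.ismount):
    """Return the subset of required volumes that are not mounted."""
    return [path for path in required if not is_mounted(path)]

missing = missing_volumes(REQUIRED_VOLUMES)
if missing:
    print("Do not submit yet - mount these first:", ", ".join(missing))
```

Running a check like this on each node before submitting avoids the failure mode described above, where one unprepared node makes the whole job fail.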
For the testing done in this article, I really needed to see which computers were running and which weren't. By opening Qadministrator and choosing the cluster in question, you can see if a computer is processing a job and which part of the job it's doing. This is one easy way to make sure that all your computers are performing correctly. Unfortunately, the batch monitor doesn't give a lot of information if you are trying to troubleshoot a problem, so keep in mind the point about data access.
For our DNE test, I encoded a 20-minute DV file to two popular encoding formats: MPEG-2, at Apple's default "best 90min" (for DVD) quality, and H.264, at Apple's default QuickTime 7 setting of 300Kbps. Incidentally, H.264 gives you great-looking footage at a fraction of the size of other codecs, but encoding to the format is very processor-intensive. I tried to encode H.264 files overnight and found them still encoding in the morning.
The tests yielded some interesting results. As expected, the more G5s we used, the shorter the encoding time. What surprised me was how little the time decreased: comparing a four-G5 cluster to a five-G5 setup, the difference was only three to six minutes. (Of course, the absolute time savings would grow with larger files.)
The most dramatic finding was related to data access speed. For most of the testing, I used a G5 as the client as well as a render node. All four of the G5s encoded the data at approximately the same speed, except the G5 that was directly connected to the FireWire drive. It encoded at double or triple the speed of the other computers. This was presumably the result of having the client G5 connected at a faster data rate than the other machines.
With a little effort, you can imagine how much faster the encoding would have been with a Fibre-Channel setup, as shown in the figure on the left.
The biggest surprise (not reflected in the tests) was that while encoding a file that was self-contained (the two columns on the right in the table), the controller would copy the file into a Temp directory. Due to the bandwidth limitations of the network, it took almost 11 minutes to copy the file before it started to encode.
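That copy is a fixed, serial cost that adding nodes cannot reduce, which also helps explain the modest gain from a fifth G5. A back-of-the-envelope model makes this concrete; the minute figures below are illustrative, not measured results.

```python
def estimated_wall_time(encode_minutes, num_nodes, transfer_minutes):
    """Fixed transfer cost plus the parallelizable encode split across nodes.

    A simplified Amdahl's-law-style model, assuming the encode divides
    evenly and the up-front copy cannot be parallelized.
    """
    return transfer_minutes + encode_minutes / num_nodes

# Illustrative numbers: a job that takes 60 minutes on one machine,
# with an 11-minute up-front copy over the network.
one_node = estimated_wall_time(60, 1, 11)    # 71.0 minutes
four_nodes = estimated_wall_time(60, 4, 11)  # 26.0 minutes
five_nodes = estimated_wall_time(60, 5, 11)  # 23.0 minutes
```

In this model, going from four nodes to five saves only three minutes, in line with the small gains seen in the tests, while shrinking the transfer cost (say, with a faster network) shortens every configuration by the same amount.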
Again, with a near-instant-access network, the times for these files would have dropped by ten minutes or more.
The truly shocking results were revealed in the bottom row. At first glance they don't make a lot of sense, but if you've been paying attention, the reasons are clear. The big difference behind these remarkable numbers was that this test used one G5 as the client and controller but not as a node (see figure, left). Even more importantly, the media was accessed through the local FireWire drive, whereas in the other tests the controller reached it over the network.
It's been professionally (and personally) very rewarding to test and push a G5 network to the limit. Hopefully you will be inspired to set up your own distributed encoding scheme. As the test results show, there is definitely a performance boost from clustering several G5s, but how you cluster them and how you share the data play just as critical a role in overall performance. With the new Apple Intel workstations on the way, you can bet they, too, will affect encoding speed. We'll test and discover how much when the pieces fall into place.
After hours and hours of rigorous testing, I have come to the conclusion that Compressor is a great product made even better with the integration of Apple's Distributed Network Encoding. While the feature is not right for every situation, it has proven itself to be a good solution for many postproduction environments.