Basic Issues in Parallel MPEG-2 Encoding

A Parallel MPEG-2 Encoder:
Basic Issues We Need To Tackle
(Why This Is a Hard Project)

Problems

There are two major reasons why fast MPEG-2 encoding is difficult:

Computation. MPEG-2 encoding, like MPEG-1 encoding (to which it is very similar), involves a great deal of computation. By far the most time-consuming part of the process is motion vector search, in which a given block of the image is compared against many other blocks to find the closest match. On the NOW workstations (HP 735s, 32 Sun SPARCstation 10s and 20s, and UltraSPARCs), you can encode close to XXX macroblocks per second (reading and writing the macroblocks from memory). The only way to solve this problem is to "go parallel".
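To make the cost of motion vector search concrete, here is a minimal sketch of an exhaustive block-matching search. It is an illustration only, not the encoder's actual search: the function names, the sum-of-absolute-differences cost, and the default window of 7 pixels are all assumptions. With a 16x16 block and a window of +/-7 pixels, this search evaluates 15 x 15 = 225 candidate positions of 256 pixels each, or 57,600 pixel comparisons per macroblock.

```python
def sad(frame_a, ax, ay, frame_b, bx, by, size):
    """Sum of absolute differences between two size x size blocks.

    Frames are lists of rows of luminance values; (ax, ay) and (bx, by)
    are the top-left corners (column, row) of the two blocks.
    """
    total = 0
    for dy in range(size):
        for dx in range(size):
            total += abs(frame_a[ay + dy][ax + dx] - frame_b[by + dy][bx + dx])
    return total


def full_search(cur, ref, x, y, size=16, window=7):
    """Exhaustively search ref within +/-window pixels of (x, y).

    Returns (dx, dy, cost) for the best-matching block, i.e. the
    motion vector and its matching cost.
    """
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-window, window + 1):
        for dx in range(-window, window + 1):
            rx, ry = x + dx, y + dy
            # Skip candidates that fall outside the reference frame.
            if rx < 0 or ry < 0 or rx + size > width or ry + size > height:
                continue
            cost = sad(cur, x, y, ref, rx, ry, size)
            if best is None or cost < best[2]:
                best = (dx, dy, cost)
    return best
```

The cost grows with the square of the search window, which is why the window size is one of the most direct ways to trade quality for speed.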

Input. Raw digital video is simply enormous; we plan to work with frames of 1920x1152 pixels, which gives about 26.5 Mbits per frame (12 bits per pixel, YUV). As the objective is to encode video in real time (30 fps), we need to move around 0.8 Gbits/second. This data rate will completely overwhelm traditional Ethernet (10 Mbps) or even Fast Ethernet (100 Mbps); Gigabit Ethernet comes closer, but is still unlikely to give us the bandwidth we need. The network used by the Berkeley NOW project is Myrinet, whose rate is close to what we need but may still not provide adequate bandwidth, so this problem remains to be solved.
As a result, our input clearly cannot come from disk, since one-tenth of the required rate is already considered very fast for a disk. High-end digital cameras and VTRs, however, are more than capable of providing these data rates, so we will sidestep this problem by simulating the required throughput at the disk.
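The input figures can be checked with a few lines of arithmetic. The sketch below uses only the frame size, bit depth, and frame rate given in the text; the function names are hypothetical, and the link speeds passed to link_utilization are nominal rates that ignore protocol overhead.

```python
WIDTH, HEIGHT = 1920, 1152   # frame size in pixels
BITS_PER_PIXEL = 12          # 12 bits per pixel, YUV
FPS = 30                     # real-time target


def bits_per_frame(width=WIDTH, height=HEIGHT, bpp=BITS_PER_PIXEL):
    """Size of one raw frame in bits."""
    return width * height * bpp


def required_bits_per_second(fps=FPS):
    """Raw input data rate needed for real-time encoding."""
    return bits_per_frame() * fps


def link_utilization(link_bits_per_second):
    """Fraction of a link's nominal rate consumed by the raw input."""
    return required_bits_per_second() / link_bits_per_second


print(bits_per_frame())            # 26542080 bits, about 26.5 Mbits
print(required_bits_per_second())  # 796262400 bits/s, about 0.8 Gbits/s
print(link_utilization(100e6))     # Fast Ethernet: about 8x oversubscribed
```

Even a nominal 1 Gbit/s link would be roughly 80% full with raw input alone, before any protocol overhead or encoded output is counted.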

Parallelization

Of course, the whole point of this project is to parallelize the encoding process to speed it up. There are basically four levels at which it can be parallelized:

Macroblock level. Each MPEG frame is divided into 16x16-pixel groups called macroblocks, each of which is encoded separately. This is the finest level of parallelism. Its problem is that each macroblock must encode its DC coefficient as a difference from the DC coefficient of the I macroblock to its left, which creates a dependency between neighboring macroblocks and makes this approach very inefficient.

Slice level. Each MPEG frame is divided into rows of macroblocks (slices), each of which is encoded separately. This is an intermediate level, and a form of spatial parallelism.

Frame level. Each processor encodes a different frame. This approach has a basic flaw: frames in MPEG movies can be I frames, which are standalone JPEG-like frame encodings, or P and B frames, which are encoded using references to frames in the past or the future. Because of this, an encoder compressing a P or B frame needs access not only to the current frame but also to one or two more (called penalty frames). This limits the degree of temporal parallelism: since network capacity is one of the worst problems in MPEG encoding, you do not want to burden the system with the overhead of penalty frames, so splitting up frames arbitrarily is not advisable.

Group of Pictures level. MPEG encoding exploits the temporal redundancy between images to save bits, by searching for motion vectors in past and future frames. For that purpose, MPEG movies are divided into sets of frames, typically 4 to 20 frames each, called Groups of Pictures (GOPs). Frames in a single GOP contain at most one reference to frames in other GOPs, which reduces the penalty frames to a minimum (in fact, with some frame patterns, such as IBBPBBP, it will be zero). This is the coarsest approach.
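The difference in granularity between these levels can be sketched as follows. This is a hypothetical illustration, not the project's scheduler: macroblock_grid, split_into_gops, and assign_round_robin are made-up names, and round-robin assignment is just one simple way to distribute GOPs among processors.

```python
def macroblock_grid(width, height, mb_size=16):
    """Number of macroblock columns and rows in a frame."""
    return width // mb_size, height // mb_size


def split_into_gops(num_frames, gop_size):
    """Split frame indices 0..num_frames-1 into consecutive GOPs."""
    return [list(range(start, min(start + gop_size, num_frames)))
            for start in range(0, num_frames, gop_size)]


def assign_round_robin(gops, num_workers):
    """Give each worker every num_workers-th GOP."""
    schedule = [[] for _ in range(num_workers)]
    for i, gop in enumerate(gops):
        schedule[i % num_workers].append(gop)
    return schedule


cols, rows = macroblock_grid(1920, 1152)
print(cols * rows)  # macroblocks per frame (finest-grained work units)
print(rows)         # macroblock rows per frame (one slice per row)
print(assign_round_robin(split_into_gops(10, 4), 2))
```

For a 1920x1152 frame this gives 120 x 72 = 8640 macroblocks and 72 macroblock rows per frame, whereas at the GOP level a single work unit spans several whole frames.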

Blackboard Architecture

One of the main ideas of this project is to change the parameters of the encoder on the fly. Basically, three main parameters define the quality of an MPEG encoder:

Compression rate (bits used per pixel)
Quality (perhaps PSNR)
Speed (frames encoded per second)

Ideally all three would be as good as possible, but improving one of them will degrade one or both of the others. Several knobs let the user trade these three parameters against each other:

Quantization factor (Q). Affects compression rate and quality.
Search window. Affects compression rate, quality, and speed.
P-frame search. Affects compression rate, quality, and speed.
B-frame search. Affects compression rate, quality, and speed.
XXXXXXXXXXXXXX. Affects xxx, xxx and xxx.

In our implementation of the MPEG-2 encoder, we will use a blackboard architecture in which the user can easily define a dynamic policy over these knobs: the input to the policy will be the values of the three parameters for the previously encoded frames, and its output will be modifications to the knobs.
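A policy of this kind might look like the sketch below. Everything in it is an assumption made for illustration: the FrameStats and Knobs types, the simple_policy function, the target values, and the choice of only two knobs are hypothetical and do not describe the actual blackboard interface. The upper quantizer limit of 31 is assumed from the 5-bit quantiser scale code used in MPEG-2.

```python
from dataclasses import dataclass


@dataclass
class FrameStats:
    """Measured parameters for one encoded frame (hypothetical type)."""
    bits_per_pixel: float
    psnr_db: float
    frames_per_second: float


@dataclass
class Knobs:
    """Encoder settings the policy may adjust (hypothetical type)."""
    quantizer: int       # larger value = coarser quantization, fewer bits
    search_window: int   # motion search range in pixels


def simple_policy(history, knobs, target_fps=30.0, target_bpp=0.5):
    """Return adjusted knobs based on the most recent frame's statistics."""
    if not history:
        return knobs
    last = history[-1]
    quantizer, window = knobs.quantizer, knobs.search_window
    # Too slow: shrink the search window to save computation.
    if last.frames_per_second < target_fps and window > 1:
        window -= 1
    # Too many bits: quantize more coarsely (assumed upper limit of 31).
    if last.bits_per_pixel > target_bpp and quantizer < 31:
        quantizer += 1
    return Knobs(quantizer, window)


stats = [FrameStats(bits_per_pixel=0.8, psnr_db=35.0, frames_per_second=20.0)]
print(simple_policy(stats, Knobs(quantizer=8, search_window=7)))
```

A real policy would presumably also use the measured quality and a longer history; this one only shows the shape of the mapping from past measurements to knob changes.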
