A Parallel MPEG-2 Encoder:
Basic Issues We Need To Tackle
(Why This Is a Hard Project)
Problems
There are two major reasons why MPEG-2 encoding is difficult to do at
high speed:
- Computation. MPEG-2 encoding, like MPEG-1 encoding (to which
it is very similar), involves a great deal of computation. By far the
most time-consuming part of the process is motion vector search, in
which a given block of the image is compared against many other blocks
to find the closest match. On the workstations of NOW (HP 735's, 32 Sun
SPARCstation 10's and 20's, and UltraSPARC's), you can get values close
to XXX macroblocks encoded per second (reading and writing the MBs from
memory). The only way to solve this problem is to ``go parallel''.
- Input. Raw digital video is simply enormous; we plan
to work with 1920x1152 pixel frames, which take 26.5 Mbits per
frame (12 bits per pixel, YUV). Since the objective is to encode video
in real time (30 fps), we need to move about 0.8 Gbits/second. This
data rate will completely overwhelm traditional Ethernet (10 Mbps) or
even Fast Ethernet (100 Mbps); Gigabit Ethernet comes closer, but is
still unlikely to give us the bandwidth we need. The network used by
the Berkeley NOW project is Myrinet, whose rate is close to what is
needed but may still fall short, so we have to solve this problem.
- As a result, our input clearly cannot come from disk at full rate,
since one-tenth of the required speed is already considered very fast
for a disk. High-end digital cameras and VTRs, however, are more than
capable of providing these data rates, so we want to set this problem
aside: we will simulate the enormous throughput from the disk.
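
The input figures above can be checked with a few lines of arithmetic.
This is only a sketch: the frame size, bit depth, frame rate and nominal
link rates come from the text, while the variable names are ours.

```python
# Frame geometry and rates quoted in the text.
WIDTH, HEIGHT = 1920, 1152
BITS_PER_PIXEL = 12   # "12 bits YUV", as stated in the text
FPS = 30              # real-time target

bits_per_frame = WIDTH * HEIGHT * BITS_PER_PIXEL
bits_per_second = bits_per_frame * FPS

print(f"{bits_per_frame / 1e6:.1f} Mbits per frame")
print(f"{bits_per_second / 1e9:.2f} Gbits/second")

# Nominal link rates mentioned in the text, in bits per second.
links = {
    "Ethernet": 10e6,
    "Fast Ethernet": 100e6,
    "Gigabit Ethernet": 1e9,
}
for name, rate in links.items():
    # A ratio above 1 means the link cannot carry the stream even in theory.
    print(f"{name}: stream is {bits_per_second / rate:.2f}x the nominal rate")
```

Note that the Gigabit Ethernet ratio is below 1 only for the nominal rate;
protocol overhead and the traffic of the encoder itself are not counted.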
Parallelization
Of course, the whole point of this project is to parallelize this process
to make it easier. There are basically four different approaches to the
parallelization:
- Macroblock level. Each MPEG frame is divided into 16x16-pixel
blocks called macroblocks, each of which is encoded separately.
This is the finest level of parallelism.
The problem with it is that each macroblock has to encode the value of
its DC coefficient as a difference from the DC coefficient of the I
macroblock to its left; this dependency between neighbors makes the
approach very inefficient.
- Slice level. Each MPEG frame is divided into rows of macroblocks
(slices), each of which is encoded separately. This is a
medium-grained approach that exploits spatial parallelism.
- Frame level. Each processor encodes a different frame.
This approach has a basic flaw:
frames in MPEG movies can be I frames, which are standalone JPEG-like
frame encodings, or P and B frames, which are encoded using references
to frames that fall in the past or the future. Because of this, the encoder,
when compressing a B or a P frame, needs access not only to the current
frame but also to one or two more (called penalty frames). This
limits the degree of temporal parallelism: since one of the worst problems
in MPEG encoding is network capacity, you do not want to burden
the system with penalty frames, so splitting up frames arbitrarily is
not advisable.
- Group of Pictures level. When encoding MPEG videos, you try
to exploit the temporal redundancy of the images to save bits. This is
achieved by searching for motion vectors in past/future frames. For that
purpose, MPEG movies are divided into sets of frames, typically
4 to 20 frames each, called Groups of Pictures. Frames in a
single Group of Pictures (GOP) contain at most one reference to frames
in other GOPs, which reduces the penalty frames to a minimum (in fact,
with some frame patterns like IBBPBBP, it will be zero). This is the
coarsest approach.
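
As an illustration of the coarsest level, the sketch below cuts a frame
sequence into GOPs and hands them out to processors. It is not the
scheduler of this project: the round-robin assignment, the function names
and the example numbers (90 frames, 15-frame GOPs, 3 workers) are all
assumptions made for the example.

```python
def split_into_gops(num_frames, gop_size):
    """Return a list of (start, end) frame ranges, with end exclusive."""
    return [(start, min(start + gop_size, num_frames))
            for start in range(0, num_frames, gop_size)]


def assign_round_robin(gops, num_workers):
    """Map each worker id to the list of GOPs it will encode."""
    schedule = {worker: [] for worker in range(num_workers)}
    for index, gop in enumerate(gops):
        schedule[index % num_workers].append(gop)
    return schedule


gops = split_into_gops(num_frames=90, gop_size=15)
schedule = assign_round_robin(gops, num_workers=3)
for worker, work in schedule.items():
    print(f"worker {worker}: frames {work}")
```

Each worker only needs the frames of its own GOPs (plus any penalty frame
the chosen frame pattern requires), which is what keeps network traffic
low at this level.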
Blackboard Architecture
One of the main ideas of this project is to change the
parameters of the encoder on the fly. Basically, three main
parameters in MPEG encoding define the quality of an
encoder:
- Compression rate (bits used per pixel)
- Quality (perhaps PSNR)
- Speed (frames encoded per second)
Each of these three factors should be as good as possible, but improving
one of them will degrade one or both of the others.
Several knobs let the user play with
these three parameters. They are:
- Quantization factor (Q). Affects compression rate and quality.
- Search window. Affects compression rate, quality, and speed.
- P-frame search. Affects compression rate, quality, and speed.
- B-frame search. Affects compression rate, quality, and speed.
- XXXXXXXXXXXXXX. Affects xxx, xxx and xxx.
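
To see why the search window knob affects speed, consider the sketch of
motion vector search below. It assumes an exhaustive (full) search and the
sum of absolute differences (SAD) as the matching criterion; neither is
stated in this document, and the function names and the tiny synthetic
frames are made up for the example.

```python
def sad(cur, ref, cx, cy, rx, ry, n):
    """Sum of absolute differences between two n x n blocks."""
    return sum(abs(cur[cy + j][cx + i] - ref[ry + j][rx + i])
               for j in range(n) for i in range(n))


def full_search(cur, ref, cx, cy, n, window):
    """Test every displacement within +/-window pixels of (cx, cy).

    Returns (best_dx, best_dy, best_sad, candidates_tested).
    """
    height, width = len(ref), len(ref[0])
    best = None
    tested = 0
    for dy in range(-window, window + 1):
        for dx in range(-window, window + 1):
            rx, ry = cx + dx, cy + dy
            if rx < 0 or ry < 0 or rx + n > width or ry + n > height:
                continue  # candidate block falls outside the reference frame
            tested += 1
            cost = sad(cur, ref, cx, cy, rx, ry, n)
            if best is None or cost < best[2]:
                best = (dx, dy, cost)
    return best + (tested,)


# Synthetic 12x12 reference frame; the current frame is the same picture
# displaced so that the true motion vector is (dx, dy) = (2, 1).
SIZE = 12
ref = [[16 * y + x for x in range(SIZE)] for y in range(SIZE)]
cur = [[ref[y + 1][x + 2] if y + 1 < SIZE and x + 2 < SIZE else 0
        for x in range(SIZE)] for y in range(SIZE)]

for window in (1, 2):
    print(window, full_search(cur, ref, cx=4, cy=4, n=4, window=window))
```

The number of candidates grows as (2 * window + 1) squared, so a larger
window costs speed; in this toy case a window of 1 cannot reach the true
displacement, which is how a small window can hurt compression and quality.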
In our implementation of the MPEG-2 encoder, we will use a blackboard
architecture in which the user will be able to easily define a
dynamic policy over these knobs: the input will be the
values of the three parameters for the previously encoded frames, and
the output will be modifications to the knobs.
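
One possible reading of such a policy is sketched below. Everything in it
is hypothetical: the class and field names, the choice of knobs, the
thresholds and the "speed first" rule are ours and only illustrate the
shape of a function from past measurements to knob changes.

```python
from dataclasses import dataclass


@dataclass
class Measurements:
    bits_per_pixel: float  # compression rate of an encoded frame
    psnr_db: float         # quality of an encoded frame
    fps: float             # encoding speed when the frame was produced


@dataclass
class Knobs:
    q: int                 # quantization factor
    search_window: int     # motion search range, in pixels


def speed_first_policy(history, knobs, target_fps=30.0):
    """Hypothetical policy: trade search range for speed."""
    if not history:
        return knobs
    recent = history[-3:]
    avg_fps = sum(m.fps for m in recent) / len(recent)
    if avg_fps < target_fps:
        # Too slow: halve the search window (never below 1 pixel).
        return Knobs(knobs.q, max(1, knobs.search_window // 2))
    if avg_fps > 1.2 * target_fps:
        # Comfortable margin: spend it on a wider search (capped at 64).
        return Knobs(knobs.q, min(64, knobs.search_window * 2))
    return knobs


class Blackboard:
    """Shared state: past measurements in, current knob settings out."""

    def __init__(self, knobs, policy):
        self.knobs = knobs
        self.policy = policy
        self.history = []

    def post(self, measurement):
        self.history.append(measurement)
        self.knobs = self.policy(self.history, self.knobs)


board = Blackboard(Knobs(q=8, search_window=16), speed_first_policy)
board.post(Measurements(bits_per_pixel=0.5, psnr_db=34.0, fps=20.0))
print(board.knobs)
```

A user would swap in a different policy function to favor quality or
compression rate instead of speed.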