I've put some more work into getting P and B frames rendering correctly and things are proceeding very well. Currently texturing from the reference frame works for P frames. All I have to do is add the differentials, which is a little tricky. The problem is that differentials are 9 bits, which means that in an A8L8 texture we get 8 bits in the L channel and 1 bit in the A channel. This shouldn't be too hard, just a bit of arithmetic in the pixel shader code. A more interesting problem is dealing with field-based surfaces, both when rendering and when using them in motion prediction. There's no straightforward way to render to even/odd scanlines on conventional hardware, so this will require some special attention. Currently I'm thinking I will have to render line lists instead of triangle lists when a macroblock uses field-based motion prediction and for rendering even/odd scanlines.
Here are some images from mpeg2play_accel, which I've been using as a test program:
Initial I-frame of the video.
Next frame, only P macroblocks using frame-based motion prediction are currently displayed, the rest are skipped.
Next frame, more macroblocks are rendered, and it looks mostly correct, except for the fine details. This is because the differentials are not taken into account yet.
Next frame, a few more unhandled macroblocks in this one.