I reached a nice milestone today: working playback on my GeForce 6200. Most of the work went into the winsys layer, with some bug fixes and workarounds in other places, but everything is up and running now. Unfortunately the output isn't perfect; there is some slight corruption here and there. I'm guessing it has to do with some dodgy assumptions I made about shader arithmetic (rounding, saturation, etc.) that SoftPipe went along with but the GPU didn't. The other issue is a severe slowdown whenever any 2D drawing happens on the rest of the desktop. I'm guessing this may be due to locking when copying the backbuffer to the window, or maybe I'm completely soaking up the CPU.
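To give an idea of the kind of mismatch I mean, here is a tiny C sketch. It is not the actual shader code, the function names are made up for illustration, and I haven't confirmed that this is what causes the corruption. It only shows how two reasonable ways of turning a float into an 8-bit value can disagree by one step.

```c
#include <stdint.h>

/* Hypothetical helper: saturate to [0, 1], scale to 0..255, round to nearest. */
uint8_t to_unorm8_round(float v)
{
    if (v < 0.0f) v = 0.0f;   /* saturate low */
    if (v > 1.0f) v = 1.0f;   /* saturate high */
    return (uint8_t)(v * 255.0f + 0.5f);
}

/* Hypothetical helper: same saturation, but truncate instead of rounding. */
uint8_t to_unorm8_truncate(float v)
{
    if (v < 0.0f) v = 0.0f;
    if (v > 1.0f) v = 1.0f;
    return (uint8_t)(v * 255.0f);
}
```

For an input of 0.5 the first version gives 128 and the second gives 127. If a software rasterizer does one and the hardware does the other, values that are summed or compared later can drift apart.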
Currently nothing is optimized. I'm not even turning on compiler optimization, and I have a really slow prototype IDCT implementation running on the CPU in place of the hardware version, so I'm sure I'm eating up a lot more CPU time than I will be by the end of the summer. I have a lot of ideas for optimizations that will target CPU usage and GPU fillrate usage, and given that I get almost full speed playback currently, I'm pretty confident that I'll be able to get HD playback by the end of SoC.
As far as the winsys goes, I was able to use most of the current Nouveau winsys. Unfortunately the DRI stuff is buried within Mesa right now, so I had to extract a lot of things and create a standalone library to handle screens, drawables, the SAREA, etc. to be able to use DRI without including and linking with half of Mesa. The winsys interface is also simpler than Mesa's: there are only a few client calls, the backbuffer is handled in the state tracker, and the winsys doesn't have to create or call into the state tracker. It took me a while to realize why the Mesa winsys was set up the way it was, and that I could simplify things on my end.
Here are some screen grabs: