About the project
Google Summer of Code 2011
Hardware Accelerated VP8 Video Decoding for Gallium3D
The goal of this project is to write a Gallium3D state tracker capable of hardware-accelerated VP8 video decoding through the VDPAU API. This would allow every graphics card with a Gallium3D driver (the primary targets are the r300g, r600g and Nouveau drivers) to decode VP8 videos, and every VDPAU-enabled multimedia application to play them.
Hardware acceleration will be built upon the graphics card’s shader units, which can take over heavy computations such as motion compensation, intra prediction, iDCT and the deblocking filter.
Benefits to Community
Video decoding can be a heavy task even for today’s hardware. High-definition videos need a great deal of computational power to decode smoothly, and low-end CPUs are sometimes unable to play them at a decent speed. When a CPU is not powerful enough, the usual solution is to offload the video decoding process to a dedicated chip on board the GPU. Unfortunately, free software drivers cannot use these dedicated chips because no documentation is available for them, so the only solution left is pure CPU-based decoding. Bringing generic hardware-accelerated video decoding to free software drivers would be a great feature, allowing users to fully enjoy their videos.
VP8 has been chosen for this project because of its straightforward design and its openness. Unlike with most existing video formats, no patent problems are expected when the time comes to merge the project into Mesa master.
VDPAU has been chosen because the API is widely used by multimedia applications and has a solid reputation.
Implementation plan, with two major steps:
- Get a purely software implementation of a VP8 decoder running inside a proper Gallium state tracker, and use it to play videos through various “VDPAU ready” applications.
  - The goal of this project is not to build a VP8 decoder from scratch, as that task would be an entire GSoC project on its own, but rather to work on GPU optimizations, so the first step will be to port an existing VP8 decoder into a state tracker. My first choice is to work with libvpx, the VP8 reference implementation.
  - Since VP8 is currently not supported by the official NVIDIA VDPAU implementation, multimedia software will need to be slightly patched in order to recognize the VP8 decoding ability of the state tracker.
  - This step also includes making the state tracker ready to host more video formats in the near future.
- Start implementing shader-based optimizations where they are useful.
This part of the project will consist of optimizing the most shader-friendly parts of the decoding process, including motion compensation, the deblocking filter, intra prediction and iDCT, by moving them from the CPU to the GPU shaders. These shader programs will be written in TGSI (Tungsten Graphics Shader Infrastructure), the intermediate representation (IR) used by Gallium3D.
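To give an idea of the kind of computation that would move to the shaders, here is a minimal CPU-side sketch of motion compensation for a single block, written in Python for readability. It is not taken from libvpx and is not TGSI: the function names, the whole-pixel motion vector and the edge handling are assumptions made for illustration, and the sub-pixel interpolation that a real VP8 decoder needs is left out.

```python
def clamp(value, low, high):
    """Clamp value to [low, high] using min/max rather than explicit branches."""
    return max(low, min(high, value))


def motion_compensate_block(ref, x0, y0, mv, residual):
    """Predict one block from `ref` displaced by `mv`, then add the residual.

    ref      : reference frame, a list of rows of 8-bit samples
    x0, y0   : top-left corner of the block in the current frame
    mv       : (dx, dy) whole-pixel motion vector (an assumption of this sketch)
    residual : block of signed differences, a list of rows
    """
    height, width = len(ref), len(ref[0])
    dx, dy = mv
    out = []
    for j, res_row in enumerate(residual):
        # Clamping the coordinates repeats edge pixels for out-of-frame references.
        sy = clamp(y0 + j + dy, 0, height - 1)
        row = []
        for i, r in enumerate(res_row):
            sx = clamp(x0 + i + dx, 0, width - 1)
            # Saturate the reconstructed sample to the 8-bit range.
            row.append(clamp(ref[sy][sx] + r, 0, 255))
        out.append(row)
    return out


# Small synthetic reference frame and a 2x2 block of residuals.
ref = [[16 * y + x for x in range(8)] for y in range(8)]
residual = [[1, -1], [0, 300]]
block = motion_compensate_block(ref, 2, 2, (1, -1), residual)
```

Each output sample depends only on the reference frame and on its own residual, which is what makes this step a good candidate for a shader.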
The main advantage of using the GPU shaders for video decoding is that they provide more computational power, accessible in a generic way across many different graphics cards and operating systems. This raw power can be dedicated to video decoding tasks, significantly offloading the CPU and leaving it free to perform other common tasks.
The trick is to properly manage the power of these shaders. They perform very well on vectorized arithmetic operations, and very poorly on logical operations (branches, loops, …). They don’t have proper cache memory and have very slow access to the main memory. Instead, they can access memory areas stored on the video card and read from or write to these areas (but cannot both read from and write to the same memory area).
In order to achieve a significant speed gain, the different decoding tasks must be carefully divided into small independent and repetitive tasks operating on large sets of data, which can be simultaneously fed to several shader units.
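The decomposition described above can be sketched on the CPU as follows. This is only an illustration under stated assumptions: the tile size, the function names and the use of a Python thread pool are hypothetical, the frame dimensions are assumed to be multiples of the tile size, and the thread pool stands in for the shader units without claiming any speed gain. Each task reads only from the input buffers, and the results go into a separate output buffer, mirroring the constraint that a shader cannot read and write the same memory area.

```python
from concurrent.futures import ThreadPoolExecutor

BLOCK = 4  # hypothetical tile size for this sketch


def reconstruct_block(prediction, residual, bx, by):
    """Reconstruct one tile; reads only the input buffers and returns new samples."""
    tile = []
    for y in range(by, by + BLOCK):
        # Saturating add of prediction and residual, done with min/max.
        tile.append([max(0, min(255, prediction[y][x] + residual[y][x]))
                     for x in range(bx, bx + BLOCK)])
    return bx, by, tile


def reconstruct_frame(prediction, residual, workers=4):
    """Split the frame into tiles, process them independently, write to a new buffer."""
    height, width = len(prediction), len(prediction[0])
    tiles = [(bx, by)
             for by in range(0, height, BLOCK)
             for bx in range(0, width, BLOCK)]
    output = [[0] * width for _ in range(height)]  # separate write target
    with ThreadPoolExecutor(max_workers=workers) as pool:
        jobs = [pool.submit(reconstruct_block, prediction, residual, bx, by)
                for bx, by in tiles]
        for job in jobs:
            bx, by, tile = job.result()
            for j, row in enumerate(tile):
                output[by + j][bx:bx + BLOCK] = row
    return output


# Synthetic 8x8 inputs: a predicted frame and signed residuals.
prediction = [[(x * 7 + y * 13) % 256 for x in range(8)] for y in range(8)]
residual = [[(x - y) * 40 for x in range(8)] for y in range(8)]
frame = reconstruct_frame(prediction, residual)
```

Because no tile depends on another, the result does not change with the number of workers or the order in which the tiles complete.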
The deliverables will correspond to the two major milestones of the project. The first one will be a functional Gallium3D state tracker able to decode VP8 videos. This is an important step towards building more functionality beyond the scope of this project. The second one will be a more elaborate version of the state tracker, using shaders to achieve faster video decoding.
Shader-accelerated MPEG-2 decoding, using XvMC in a Gallium3D state tracker (previous GSoC work)
OpenCL-accelerated VP8 decoder, using libvpx
OpenGL-accelerated H.264 decoder, using ffmpeg (previous GSoC work)
https://github.com/kasbah/gsoc#readme (an attempt to finish the project mentioned above)