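Concept:
A single data stream formed by multiplexing one or more PES (Packetized Elementary Streams) that share a common time base is called a Program Stream (PS).
An ES (Elementary Stream) is the data stream that comes straight out of an encoder; the term covers encoded video, encoded audio, and any other encoded data stream. After passing through the PES packetizer, an ES is converted into PES packets.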
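Structure:
A PS pack consists of three parts: the pack header, the system header, and PES packets. The pack header in turn has four parts: the pack start code, the base part of the System Clock Reference (SCR), the SCR extension, and the program mux rate.
The corresponding Wikipedia diagrams (pack header, system header):
The byte layout is as follows:
4-byte pack start code: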
Bytes | Bits | Value | Meaning
0-2 | 24 | 0000 0000 0000 0000 0000 0001 | start code
3 | 8 | 1011 1010 | PACK identifier
PACK identifier -- 0xBA
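To make the start code concrete, here is a minimal sketch that scans a byte buffer for the 4-byte sequence 0x000001BA. The function name `find_pack_offsets` and the sample buffer are illustrative assumptions, not part of any library. A plain byte search like this can also match the same four bytes inside payload data, so treat the offsets as candidates rather than confirmed pack boundaries.

```python
# Start code prefix 0x000001 followed by the PACK identifier 0xBA.
PACK_START_CODE = b"\x00\x00\x01\xba"


def find_pack_offsets(data: bytes) -> list[int]:
    """Return the byte offset of every pack start code found in data.

    Hypothetical helper: a naive search that does not validate the
    rest of the pack header.
    """
    offsets = []
    pos = data.find(PACK_START_CODE)
    while pos != -1:
        offsets.append(pos)
        pos = data.find(PACK_START_CODE, pos + len(PACK_START_CODE))
    return offsets


# Made-up buffer: two junk bytes, then two packs with 10 zero bytes each.
sample = b"\xff\xff" + PACK_START_CODE + bytes(10) + PACK_START_CODE + bytes(10)
print(find_pack_offsets(sample))
```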
Base part and extension of the System Clock Reference (SCR):
Bytes 4-9 (48 bits, most significant bit first):
Field | Bits
01 | 2
SCR 32..30 | 3
1 (marker bit) | 1
SCR 29..15 | 15
1 (marker bit) | 1
SCR 14..00 | 15
1 (marker bit) | 1
SCR_ext | 9
1 (marker bit) | 1
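The bit layout of bytes 4-9 can be unpacked with a few shifts and masks. The sketch below assumes the six bytes have already been sliced out of the pack header; `decode_scr` is a hypothetical name, and the error handling is only one possible choice.

```python
def decode_scr(field: bytes) -> tuple[int, int]:
    """Decode bytes 4-9 of a pack header into (SCR base, SCR_ext).

    Hypothetical helper. Bit positions, counted from the least significant
    bit of the 48-bit value:
      47..46 '01' | 45..43 SCR[32..30] | 42 marker | 41..27 SCR[29..15]
      | 26 marker | 25..11 SCR[14..0] | 10 marker | 9..1 SCR_ext | 0 marker
    """
    if len(field) != 6:
        raise ValueError("SCR field must be exactly 6 bytes")
    v = int.from_bytes(field, "big")
    if v >> 46 != 0b01:
        raise ValueError("leading '01' bits not found")
    markers = ((v >> 42) & 1, (v >> 26) & 1, (v >> 10) & 1, v & 1)
    if not all(markers):
        raise ValueError("marker bit is not 1")
    base = (
        (((v >> 43) & 0x7) << 30)
        | (((v >> 27) & 0x7FFF) << 15)
        | ((v >> 11) & 0x7FFF)
    )
    ext = (v >> 1) & 0x1FF
    return base, ext


# With every SCR bit zero, only the '01' prefix and the marker bits are set.
print(decode_scr(bytes.fromhex("440004000401")))
```

The marker bits carry no information of their own; checking them is just a cheap sanity test that the six bytes really are an SCR field.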
Program mux rate:
Bytes 10-13 (32 bits, most significant bit first):
Field | Bits
Program_Mux_Rate | 22
1 (marker bit) | 1
1 (marker bit) | 1
reserved | 5
pack_stuffing_length | 3
- SCR and SCR_ext together are the System Clock Reference, a counter driven at 27 MHz, used as a reference to synchronize streams. The 27 MHz count is divided by 300 (to match the 90 kHz clock used by PTS/DTS): the quotient is SCR (33 bits) and the remainder is SCR_ext (9 bits).
- Program_Mux_Rate -- This is a 22 bit integer specifying the rate at which the program stream target decoder receives the Program Stream during the pack in which it is included. The value of program_mux_rate is measured in units of 50 bytes/second. The value 0 is forbidden.
- pack_stuffing_length -- A 3 bit integer specifying the number of stuffing bytes which follow this field.
- stuffing byte -- This is a fixed 8-bit value equal to '1111 1111' that can be inserted by the encoder, for example to meet the requirements of the channel. It is discarded by the decoder.
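The field descriptions above can be sketched in code as follows. The function names and the sample bytes are assumptions chosen for illustration; the arithmetic (divide by 300, units of 50 bytes/second, 3-bit stuffing length) follows the list.

```python
SYSTEM_CLOCK_HZ = 27_000_000


def split_system_clock(ticks_27mhz: int) -> tuple[int, int]:
    """Split a 27 MHz tick count into (SCR base, SCR_ext)."""
    base, ext = divmod(ticks_27mhz, 300)
    return base & ((1 << 33) - 1), ext  # the base is a 33-bit counter


def join_system_clock(base: int, ext: int) -> int:
    """Rebuild the 27 MHz tick count from SCR base and SCR_ext."""
    return base * 300 + ext


def parse_rate_and_stuffing(field: bytes) -> tuple[int, int]:
    """Decode bytes 10-13 into (program_mux_rate, pack_stuffing_length)."""
    if len(field) != 4:
        raise ValueError("expected exactly 4 bytes")
    v = int.from_bytes(field, "big")
    program_mux_rate = v >> 10           # top 22 bits
    if (v >> 8) & 0b11 != 0b11:          # two marker bits
        raise ValueError("marker bits are not 1")
    if program_mux_rate == 0:
        raise ValueError("program_mux_rate of 0 is forbidden")
    return program_mux_rate, v & 0b111   # low 3 bits


# Sample bytes chosen for illustration only.
rate, stuffing = parse_rate_and_stuffing(bytes.fromhex("0189c3f8"))
print(rate * 50, "bytes/second")
print(14 + stuffing, "bytes of pack header including stuffing")
print(split_system_clock(SYSTEM_CLOCK_HZ))  # one second of 27 MHz ticks
```

Since bytes 0-13 make up the fixed part of the pack header, whatever follows it starts at offset 14 + pack_stuffing_length from the start code.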
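After the two headers come the PES packets (the payload):
For reference, this piece of code may help: http://read.pudn.com/downloads104/sourcecode/multimedia/mpeg/427188/PESdecode/pesdecode.cpp__.htm
As the code shows, PTS and DTS are stamped in the PES packet header, alongside the stream ID (which distinguishes ES of different kinds). These two parameters are the key to synchronizing audio and video presentation and to keeping the decoder's input buffer from overflowing or underflowing. PTS (presentation time stamp) is the time at which a presentation unit appears in the system target decoder (STD); DTS (decoding time stamp) is the time at which all bytes of an access unit are removed from the STD's ES decoding buffer. The packet header of every I, P, and B frame has a PTS and a DTS, but for a B frame the two are identical, so its DTS need not be written. I and P frames must be held in the video decoder's reorder buffer and displayed only after a (reordering) delay, so their PTS and DTS must be given separately.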
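As a rough illustration of where PTS and DTS sit, the sketch below reads them from an MPEG-2 style PES header. It is not taken from the linked code: all function names are hypothetical, the sample packet is built by hand, and the sketch assumes a stream ID that uses the optional PES header (some stream IDs, such as padding streams, do not).

```python
def decode_timestamp(field: bytes) -> int:
    """Decode a 5-byte PTS or DTS field into a 33-bit count of 90 kHz ticks.

    Layout: 4-bit prefix | ts[32..30] | 1 | ts[29..15] | 1 | ts[14..0] | 1
    """
    v = int.from_bytes(field, "big")
    return (
        (((v >> 33) & 0x7) << 30)
        | (((v >> 17) & 0x7FFF) << 15)
        | ((v >> 1) & 0x7FFF)
    )


def encode_timestamp(prefix: int, ts: int) -> bytes:
    """Inverse of decode_timestamp; used here only to build a sample packet."""
    v = (
        ((prefix & 0xF) << 36)
        | (((ts >> 30) & 0x7) << 33) | (1 << 32)
        | (((ts >> 15) & 0x7FFF) << 17) | (1 << 16)
        | ((ts & 0x7FFF) << 1) | 1
    )
    return v.to_bytes(5, "big")


def parse_pes_timestamps(packet: bytes):
    """Return (stream_id, pts, dts); pts and dts are None if no PTS is present."""
    if packet[0:3] != b"\x00\x00\x01":
        raise ValueError("missing start code prefix")
    stream_id = packet[3]
    if packet[6] & 0xC0 != 0x80:
        raise ValueError("not an MPEG-2 style PES header")
    pts_dts_flags = packet[7] >> 6
    pts = dts = None
    if pts_dts_flags & 0b10:
        pts = decode_timestamp(packet[9:14])
        # When no DTS is written (the B-frame case), DTS equals PTS.
        dts = decode_timestamp(packet[14:19]) if pts_dts_flags == 0b11 else pts
    return stream_id, pts, dts


# Hand-built video packet (stream ID 0xE0) with PTS = 2.04 s and DTS = 2.00 s.
payload = b"\x00" * 4
header_data = encode_timestamp(0b0011, 183600) + encode_timestamp(0b0001, 180000)
body = bytes([0x80, 0xC0, len(header_data)]) + header_data + payload
packet = b"\x00\x00\x01\xe0" + len(body).to_bytes(2, "big") + body
print(parse_pes_timestamps(packet))
```

Dividing either value by 90000 gives seconds, so the gap between PTS and DTS in this sample is the reordering delay of that frame.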
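Audio/video synchronization:
Besides PTS and DTS working together, another important parameter is the SCR (system clock reference). During encoding, PTS, DTS, and SCR are all derived from the STC (system time clock). During decoding, the STC is regenerated: a phase-locked loop (PLL) compares the phase of the local SCR with that of the incoming instantaneous SCR to determine whether decoding is in sync, and if it is not, the instantaneous SCR is used to adjust the frequency of the 27 MHz local clock. In the end, PTS, DTS, and SCR work together to solve the problem of synchronized audio and video playback.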
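The clock-recovery idea can be illustrated with a toy simulation. This is a deliberately simplified first-order loop, not a real PLL design: the class `LocalClock`, the gain of 0.5, the 0.1 s SCR interval, and the encoder clock running 30 ppm fast are all assumptions made up for the example.

```python
NOMINAL_HZ = 27_000_000.0


class LocalClock:
    """Toy decoder STC whose frequency is steered by incoming SCR values."""

    def __init__(self, gain: float = 0.5):
        self.freq_hz = NOMINAL_HZ  # current estimate of the encoder clock rate
        self.stc = None            # local STC in 27 MHz ticks
        self.gain = gain

    def on_scr(self, scr_ticks: int, elapsed_s: float) -> float:
        """Process one received SCR and return the phase error in ticks."""
        if self.stc is None:
            self.stc = float(scr_ticks)  # first SCR simply loads the STC
            return 0.0
        self.stc += self.freq_hz * elapsed_s  # free-run since the last SCR
        error = scr_ticks - self.stc          # positive: local clock is slow
        self.freq_hz += self.gain * error / elapsed_s
        self.stc = float(scr_ticks)           # re-align the phase
        return error


def simulate(encoder_hz: int = 27_000_810, interval_s: float = 0.1, steps: int = 50):
    """Feed evenly spaced SCRs from a slightly fast encoder clock."""
    clock = LocalClock()
    ticks_per_interval = round(encoder_hz * interval_s)
    errors = [clock.on_scr(k * ticks_per_interval, interval_s) for k in range(steps)]
    return clock, errors


clock, errors = simulate()
print(f"first measured error: {errors[1]:.1f} ticks")
print(f"last measured error:  {errors[-1]:.6f} ticks")
print(f"recovered frequency:  {clock.freq_hz:.1f} Hz")
```

With these made-up numbers the first comparison shows an error of about 81 ticks, and each later comparison roughly halves it as the local frequency estimate approaches the encoder's rate. A real decoder would also have to cope with jitter in SCR arrival times, which this sketch ignores.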