




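Chapter 12: MPEG Video Coding II

Slide 1: Contents
- MPEG-4 overview
- Visual object coding
- Synthetic object coding

MPEG-4 Overview

Slide 3: Features of MPEG-4 visual object coding
- Integration: natural audio-visual objects are combined with synthetic audio-visual objects.
- Interactivity: selective playback, hyperlinks, and so on.
- High compression efficiency: 1/5 to 1/10 of the MPEG-2 bit rate at nearly the same quality.

Coding of MPEG-4 Visual Objects

Slide 5: First-generation video coding
The smallest entity in a picture is a pixel with its associated texture (color) and motion.
Message to be coded for every pixel: texture (color) + motion.

Slide 6: Shortcomings of first-generation video coding
- It differs fundamentally from how human vision works.
- It is hard to control the individual objects in a scene.
- Its potential is limited.

Slide 7: Second-generation video coding
A scene is divided into a set of component objects, and each object is coded separately.

Slide 8: Second-generation video coding

Slide 9: Second-generation video coding
The smallest entity in a picture is an object with its associated shape, texture (color), and motion.
Message to be coded for every object: shape + texture (color) + motion.

Slide 10: The MPEG-4 audio-visual scene

Slide 11: Describing an MPEG-4 audio-visual scene
In MPEG-4, an audio-visual scene is described in an object-based way: the scene is composed hierarchically (as a tree) from media objects. The leaf nodes are primitive media objects, for example:
- still images (a fixed background);
- video objects (a talking person without the background);
- audio objects (the voice of that person);
- others, such as text and graphics.
Primitive media objects may be natural or synthetic, and 2D or 3D. The composition of the scene and the spatio-temporal relationships among its audio-visual objects are described in BIFS (Binary Format for Scenes).

Slide 12: The MPEG-4 audio-visual scene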
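The hierarchical, object-based scene description can be pictured as a small tree. The sketch below is only an illustration in Python and is not BIFS syntax: the `SceneNode` class, its fields, and the node names are hypothetical, chosen to mirror the primitive media objects listed above (a still background, a talking person's video and voice, and text).

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class SceneNode:
    """One node of a scene tree (hypothetical structure, not BIFS)."""
    name: str
    kind: str = "group"  # "group" for interior nodes, otherwise a media type
    children: List["SceneNode"] = field(default_factory=list)

    def leaves(self) -> Iterator["SceneNode"]:
        """Yield the primitive media objects (leaf nodes) depth-first."""
        if not self.children:
            yield self
        for child in self.children:
            yield from child.leaves()


# A scene composed of a compound object (person) and two primitive objects.
scene = SceneNode("scene", children=[
    SceneNode("person", children=[
        SceneNode("speaker video", "video"),
        SceneNode("speaker voice", "audio"),
    ]),
    SceneNode("2D background", "still image"),
    SceneNode("caption", "text"),
])

leaf_names = [node.name for node in scene.leaves()]
print(leaf_names)
```

Because each leaf is a separate object, a player could in principle drop, replace, or reposition one leaf (for example the background) without touching the others, which is the kind of interaction and reuse that object-based description is meant to allow.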
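[Figure: logical structure of a scene. Labels: hypothetical viewer position, video compositor, projection plane, scene coordinate system, user input, downloaded data/control multiplexed stream, uploaded data/control multiplexed stream. Scene tree nodes: scene, person, 2D background, furniture, presentation, globe, lectern, voice, teacher.]

Slide 13: Advantages of MPEG-4 scene description
- All kinds of objects can be integrated: natural media (from microphones, cameras, and so on) are combined seamlessly with synthetic (computer-generated) media, and real-time information with stored information. An AVO can be mono, stereo, or multichannel audio, or single-, dual-, or multi-camera 2D or 3D video.
- Stronger interactivity: the objects in a scene (person, table, globe, whiteboard, the person's voice) and the sound of the multimedia presentation are each coded independently as single objects, so the user can selectively interact with one or several of them.
- Good reusability: AVOs (audio-visual objects) can be recombined to construct new scenes.

Slide 14: BIFS example
[Figure: BIFS example.]

Slide 15: MPEG-4 video stream structure
- Visual object sequence (VS)
- Video object (VO)
- Video object layer (VOL)
- Group of VOPs (GOV)
- Video object plane (VOP)

Slide 16: VOP coding
A VOP is described by its shape, motion, and texture.
[Figure: VOP encoder. The input is a VOP of arbitrary shape. Blocks: shape coding, motion estimation, motion compensation (using the previously reconstructed VOP), texture coding of the difference signal, MUX, buffer. The shape, motion, and texture information are multiplexed into the output.]

Slide 17: VOP-based motion compensation
MC-based VOP coding in MPEG-4 again involves three steps:
- motion estimation;
- MC-based prediction;
- coding of the prediction error.
Only pixels within the current (target) VOP are considered for matching in MC. To facilitate MC, each VOP is divided into many macroblocks (MBs). MBs are by default 16×16 in luminance images and 8×8 in chrominance images.

Slide 18: Motion compensation

Slide 19: Padding
An example of repetitive padding in a boundary macroblock of a reference VOP: (a) original pixels within the VOP, (b) after horizontal repetitive padding, (c) followed by vertical repetitive padding.

Slide 20: Motion vector
Each macroblock in the target VOP searches the reference VOP for a best-matching macroblock, using the sum of absolute differences restricted to pixels inside the target VOP:

$$SAD(i, j) = \sum_{k=0}^{N-1} \sum_{l=0}^{N-1} \left| C(x+k,\, y+l) - R(x+i+k,\, y+j+l) \right| \cdot Map(x+k,\, y+l)$$

Here N is the size of the MB, C and R denote pixels of the target and reference VOPs, and Map(p, q) = 1 when C(p, q) is a pixel within the target VOP; otherwise Map(p, q) = 0.
Motion vectors are coded predictively, similar to H.263.

Slide 21: Texture coding
Texture coding in MPEG-4 can be based on the DCT or on the shape-adaptive DCT (SA-DCT).
I. Texture coding based on DCT
In an I-VOP, the gray values of the pixels in each MB of the VOP are coded directly using the DCT followed by VLC, similar to what is done in JPEG. In a P-VOP or B-VOP, MC-based coding is employed, so it is the prediction error that is sent to DCT and VLC.

Slide 22: Texture coding (cont.)
Coding for interior MBs: each MB is 16×16 in the luminance VOP and 8×8 in the chrominance VOP. Prediction errors from the six 8×8 blocks of each MB are obtained after the conventional motion estimation step.
Coding for boundary MBs: for portions of the boundary MBs in the target VOP that lie outside the VOP, zeros are padded to the block sent to DCT, since ideally prediction errors would be near zero inside the VOP. After MC, texture prediction errors within the target VOP are obtained.

Slide 23: Shape-adaptive DCT
The 8-point 1D DCT is

$$F(u) = \frac{C(u)}{2} \sum_{i=0}^{7} \cos\frac{(2i+1)u\pi}{16}\, f(i), \qquad u = 0, \ldots, 7$$

and the N-point 1D DCT used by SA-DCT is

$$F(u) = \sqrt{\frac{2}{N}}\; C(u) \sum_{i=0}^{N-1} \cos\frac{(2i+1)u\pi}{2N}\, f(i), \qquad u = 0, \ldots, N-1$$

where C(u) = √2/2 for u = 0 and C(u) = 1 otherwise.
- Advantage: no superfluous coefficients are produced.
- Disadvantage: an extra mask is needed to record the original shape.
- Shape-adaptive DCT (SA-DCT) is another texture coding method for boundary MBs.
- Due to its effectiveness, SA-DCT has been adopted for coding boundary MBs in MPEG-4 Version 2.

Slide 24: Shape-adaptive DCT (cont.)

Slide 25: Shape coding
MPEG-4 supports two types of shape information: binary and gray-scale.
Binary shape information can take the form of a binary map (also known as a binary alpha map) of the same size as the rectangular bounding box of the VOP. A value of 1 (opaque) or 0 (transparent) in the bitmap indicates whether the pixel is inside or outside the VOP. Alternatively, gray-scale shape information refers to the transparency of the shape, with gray values ranging from 0 (completely transparent) to 255 (opaque).

Slide 26: Sprite coding
- Before coding, the background image (the sprite) is extracted from a series of video frames and stitched together.
- The sprite is transmitted only once, with the first frame of the video sequence, and is stored in a background buffer. After that, only the 8 parameters describing the camera motion are transmitted.
- Using the 8 parameters, the background is warped by an affine transformation to reconstruct the background of each frame.
- The segmented foreground image is coded as a VO of arbitrary shape.

Synthetic Object Coding in MPEG-4

Slide 28: 2D mesh coding
[Figure: a uniform mesh and a Delaunay mesh.]

Slide 29: Coding of Delaunay triangulation
Except for the first location (x_0, y_0), all subsequent coordinates are coded differentially. That is, for n ≥ 1,

$$dx_n = x_n - x_{n-1}, \qquad dy_n = y_n - y_{n-1},$$

and afterward dx_n and dy_n are variable-length coded.

Slide 30: 2D mesh motion coding
A new mesh structure can be created only in the intra-frame, and its triangular topology will not alter in the subsequent inter-frames. This enforces a one-to-one mapping in 2D mesh motion estimation.
For any MOP (mesh object plane) triangle (P_i, P_j, P_k), if the motion vectors for P_i and P_j are known to be MV_i and MV_j, then a prediction Pred_k will be made for the motion vector of P_k, and this is rounded to …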
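The differential coding of Delaunay mesh node coordinates described under Slide 29 can be sketched as follows. This is a minimal illustration, assuming integer node coordinates and a non-empty node list. The function names `encode_nodes` and `decode_nodes` are hypothetical, and the variable-length coding stage that follows the differencing is deliberately left out.

```python
from typing import List, Tuple

Point = Tuple[int, int]


def encode_nodes(points: List[Point]) -> Tuple[Point, List[Point]]:
    """Return the first node as-is plus (dx_n, dy_n) for every n >= 1.

    dx_n = x_n - x_(n-1) and dy_n = y_n - y_(n-1), as in the text.
    Assumes `points` holds at least one node.
    """
    first = points[0]
    deltas = [(x - px, y - py)
              for (px, py), (x, y) in zip(points, points[1:])]
    return first, deltas


def decode_nodes(first: Point, deltas: List[Point]) -> List[Point]:
    """Rebuild the node list by accumulating the differences."""
    points = [first]
    for dx, dy in deltas:
        x, y = points[-1]
        points.append((x + dx, y + dy))
    return points


# Illustrative node positions, not taken from any real mesh.
nodes = [(10, 12), (14, 12), (13, 20), (25, 18)]
first, deltas = encode_nodes(nodes)
restored = decode_nodes(first, deltas)
print(first, deltas)
```

When neighbouring nodes lie close together, the differences tend to be small in magnitude, which is what makes a subsequent variable-length code worthwhile.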