For delivering multimedia data correctly at the user interface (.ppt)

For delivering multimedia data correctly at the user interface, synchronization is essential. It is not possible to provide an objective measurement for synchronization from the viewpoint of subjective human perception. As human perception varies from person to person, only heuristic criteria can determine whether a stream presentation is correct or not.

Presentation requirements

15.3.1 Lip synchronization requirements

Lip synchronization refers to the temporal relationship between an audio and a video stream for the particular case of humans speaking. The time difference between related audio and video LDUs (logical data units) is known as the skew. Streams that are perfectly in sync have no skew.

Figure 15.17: Left: head view; middle: shoulder view; right: body view.

Figure 15.18: Detection of synchronization errors for the three different views.

Figure 15.18 provides an overview of the results. The vertical axis denotes the relative number of test candidates who detected a synchronization error, regardless of whether they could determine if the audio was before or after the video. The initial assumption was that the three curves for the different views would differ considerably; as Figure 15.18 shows, this is not the case.

Pointer synchronization requirements

In a Computer-Supported Co-operative Work (CSCW) environment, cameras and microphones are usually attached to the users' workstations. In the next experiment, the experimenters looked at a business report that contained some data with accompanying graphics. All participants had a window with these graphics on their desktop, and a shared pointer was used in the discussion. Using this pointer, speakers pointed out individual elements of the graphics that were relevant to the discussion taking place. This required synchronization of the audio and the remote telepointer.

The experimenters conducted two experiments:

The first was to explain some technical parts of a sailing boat while a pointer located the area under discussion (Figure 15.21, right side). The shorter the explanation, the more crucial the synchronization; therefore, the experimenters selected a fast-speaking person who used fairly short words.

The second was the explanation of a traveling route on a map (Figure 15.21, left side). This involved continuous movement of the pointer.

From the human perception point of view, pointer synchronization is very different from lip synchronization: it is much more difficult to detect an "out of sync" error at skew values near the error-free case. While a lip synchronization error is a matter of discussion for skews between 40 ms and 160 ms, for a pointer ...

Elementary media synchronization

Lip synchronization and pointer synchronization were investigated due to inconsistent results from available sources. The following summarizes other synchronization results to give a complete picture of synchronization requirements.

Since the beginning of digital audio, the jitter to be tolerated by dedicated hardware has been studied. Dannenberg provided some references and explanations of these studies. In Ble78, the maximum allowable jitter for 16-bit quality audio in a sample period is 200 ps, which is the error equivalent to the magnitude of the LSB (Least-Significant Bit) of a full-level maximum-frequency 20 kHz signal. In Sto72, some perception experiments recommended an allowable jitter in an audio sample period of between 5 and 10 ns. Further perception experiments were carried out by Lic51 and Woo51; the maximum spacing of short clicks to obtain fusion into one continuous tone was given as 2 ms (as cited by RM80).

The combination of audio and animation is usually not as stringent as lip synchronization. A multimedia course on dancing, for example, could show the dancing steps as animated sequences with accompanying music. By making use of the interactive capabilities, individual sequences can be viewed over and over again. In this particular example, the synchronization between music and animation is important; experience shows that a skew of 80 ms largely satisfies user requirements. The most challenging issue, however, is the correlation between noisy events and their video presentation (for example, two cars crashing), for which the same constraint as for lip synchronization, 80 ms, is used.

Two audio channels can be either tightly or loosely coupled; the effect of combining them depends strongly on their content.
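
The 80 ms skew limit mentioned for music with animation, and for noisy events with their video presentation, can be expressed as a simple check. The following is only a minimal sketch under stated assumptions: timestamps are presentation times in milliseconds for related audio and video LDUs, the sign convention (positive when audio is late) is chosen here for illustration, and the names `SKEW_TOLERANCE_MS`, `skew_ms`, and `within_tolerance` are hypothetical and not taken from the slides.

```python
# Hypothetical tolerance table. Only the 80 ms figures come from the text;
# the dictionary keys are illustrative labels, not terms from the source.
SKEW_TOLERANCE_MS = {
    "audio_animation": 80.0,    # music accompanying animated dance steps
    "audio_video_event": 80.0,  # noisy event (e.g. a car crash) and its video
}


def skew_ms(audio_ts_ms: float, video_ts_ms: float) -> float:
    """Return the signed skew between related audio and video LDUs.

    Assumed convention: positive means the audio LDU is presented
    after its related video LDU, negative means it is presented before.
    """
    return audio_ts_ms - video_ts_ms


def within_tolerance(audio_ts_ms: float, video_ts_ms: float, kind: str) -> bool:
    """Check whether the absolute skew stays within the limit for `kind`."""
    return abs(skew_ms(audio_ts_ms, video_ts_ms)) <= SKEW_TOLERANCE_MS[kind]


if __name__ == "__main__":
    # Audio 40 ms behind video: inside the 80 ms limit.
    print(within_tolerance(1040.0, 1000.0, "audio_animation"))
    # Audio 100 ms ahead of video: outside the 80 ms limit.
    print(within_tolerance(1000.0, 1100.0, "audio_video_event"))
```

The check is symmetric in the direction of the skew because the text gives a single limit for these media pairs; a real system might use separate limits for audio ahead of and audio behind the related stream.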
