Instance Normalization: The Missing Ingredient for Fast Stylization

Dmitry Ulyanov
Computer Vision Group, Skoltech

arXiv:1607.08022v3 [cs.CV] 6 Nov 2017

however, neither achieved as good results as the slower optimization-based method of Gatys et al. In this paper we revisit the method for feed-forward stylization of Ulyanov et al. (2016) and show that a small change in the generator architecture leads to much improved results. The results are in fact of comparable quality to the slow optimization method of Gatys et al., but can be obtained in real time on standard GPU hardware. The key idea (section 2) is to replace batch normalization layers in the generator architecture with instance normalization layers, and to keep them at test time (as opposed to freezing and simplifying them out, as is done for batch normalization). Intuitively, the normalization process allows instance-specific contrast information to be removed from the content image, which simplifies generation. In practice, this results in vastly improved images (section 3).

Figure 1: Artistic style transfer example of the method of Gatys et al. (2016). (a) Content image. (b) Stylized image. (c) Low contrast content image. (d) Stylized low contrast image.

Figure 2: The contrast of a stylized image is mostly determined by the contrast of the style image and is almost independent of the content image contrast. The stylization is performed with the method of Gatys et al. (2016).

2 Method

The work of Ulyanov et al. (2016) showed that it is possible to learn a generator network g(x, z) that can apply to a given input image x the style of another image x_0, reproducing to some extent the results of the optimization method of Gatys et al. Here, the style image x_0 is fixed and the generator g is learned to apply the style to any input image x. The variable z is a random seed that can be used to obtain sample stylization results. The function g is a convolutional neural network learned from examples. Here an example is just a content image x_t, t = 1, ..., n, and learning solves the problem

\[
\min_g \frac{1}{n} \sum_{t=1}^{n} \mathcal{L}\bigl(x_0, x_t, g(x_t, z_t)\bigr),
\]

where z_t ~ N(0, 1) are i.i.d. samples from a Gaussian distribution. The loss \mathcal{L} uses a pre-trained CNN (not shown) to extract features from the style image x_0, the content image x_t, and the stylized image g(x_t, z_t), and compares their statistics as explained before.

Figure 3: Row 1: content image (left), style image (middle), and style transfer using the method of Gatys et al. (right). Row 2: typical stylization results when trained for a large number of iterations using the fast stylization method of Ulyanov et al. (2016): with zero padding (left), with a better padding technique (middle), with zero padding and instance normalization (right).

While the generator network g is fast, the authors of Ulyanov et al. (2016) observed that learning it from too many training examples yields poorer qualitative results. In particular, a network trained on just 16 example images produced better results than one trained on thousands of them. The most serious artifacts were found along the border of the image, due to the zero padding added before every convolution operation (see fig. 3). Even using more complex padding techniques it was not possible to solve this issue. Ultimately, the best results presented in Ulyanov et al. (2016) were obtained using a small number of training images and stopping the learning process early. We conjectured that the training objective was too hard to learn for a standard neural network architecture.
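For concreteness, the training set-up just described — a fixed style image x_0, content images x_t, random codes z_t ~ N(0, 1), and a loss computed from the features of a fixed pre-trained CNN — can be sketched in PyTorch as follows. This is a minimal illustration, not the authors' code: the VGG-16 backbone, the layer indices, the loss weight, and the TinyGenerator stand-in are all assumptions introduced here for the sketch.

```python
# Minimal sketch of the training objective
#     min_g (1/n) sum_t L(x0, x_t, g(x_t, z_t)),   z_t ~ N(0, 1),
# with a perceptual loss built on a fixed pre-trained VGG-16 (an assumption; the
# paper only says "a pre-trained CNN"). Layer choices, the style weight, and the
# TinyGenerator below are illustrative stand-ins, not the authors' architecture.
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision.models as tv

def gram(f):
    # (B, C, H, W) feature map -> (B, C, C) Gram matrix used by the style term.
    b, c, h, w = f.shape
    f = f.reshape(b, c, h * w)
    return f @ f.transpose(1, 2) / (c * h * w)

class PerceptualLoss(nn.Module):
    # L(x0, x, y): compares feature statistics of the stylized image y with the
    # style image x0 (Gram matrices) and the content image x (feature maps).
    def __init__(self, style_img, style_layers=(3, 8, 15, 22), content_layer=15,
                 style_weight=1e4):
        super().__init__()
        self.vgg = tv.vgg16(weights=tv.VGG16_Weights.IMAGENET1K_V1).features[:23].eval()
        for p in self.vgg.parameters():
            p.requires_grad_(False)
        self.style_layers, self.content_layer = set(style_layers), content_layer
        self.style_weight = style_weight
        with torch.no_grad():
            self.style_grams = {i: gram(f) for i, f in self._feats(style_img).items()
                                if i in self.style_layers}

    def _feats(self, x):
        out = {}
        for i, layer in enumerate(self.vgg):
            x = layer(x)
            if i in self.style_layers or i == self.content_layer:
                out[i] = x
        return out

    def forward(self, content, stylized):
        fc, fy = self._feats(content), self._feats(stylized)
        loss = F.mse_loss(fy[self.content_layer], fc[self.content_layer])
        for i in self.style_layers:
            target = self.style_grams[i].expand(stylized.size(0), -1, -1)
            loss = loss + self.style_weight * F.mse_loss(gram(fy[i]), target)
        return loss

class TinyGenerator(nn.Module):
    # Stand-in for the real generator g(x, z) of Ulyanov et al. / Johnson et al.;
    # the random code z is simply concatenated to the input as an extra channel.
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Conv2d(4, 32, 3, padding=1), nn.ReLU(),
                                 nn.Conv2d(32, 3, 3, padding=1))
    def forward(self, x, z):
        return self.net(torch.cat([x, z], dim=1))

# Toy data in place of a real dataset; VGG input normalization omitted for brevity.
style_img = torch.rand(1, 3, 256, 256)        # the fixed style image x0
content_batch = torch.rand(4, 3, 256, 256)    # a batch of content images x_t

g, loss_fn = TinyGenerator(), PerceptualLoss(style_img)
opt = torch.optim.Adam(g.parameters(), lr=1e-3)

for step in range(10):                                    # in practice: many iterations
    z = torch.randn(content_batch.size(0), 1, 256, 256)   # z_t ~ N(0, 1)
    loss = loss_fn(content_batch, g(content_batch, z))
    opt.zero_grad(); loss.backward(); opt.step()
```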
A simple observation is that the result of stylization should not, in general, depend on the contrast of the content image (see fig. 2). In fact, the style loss is designed to transfer elements from a style image to the content image such that the contrast of the stylized image is similar to the contrast of the style image. Thus, the generator network should discard contrast information in the content image. The question is whether contrast normalization can be implemented efficiently by combining standard CNN building blocks or whether, instead, it is best implemented directly in the architecture.

The generators used in Ulyanov et al. (2016) and Johnson et al. (2016) use convolution, pooling, upsampling, and batch normalization. In practice, it may be difficult to learn a highly nonlinear contrast normalization function as a combination of such layers. To see why, let x \in \mathbb{R}^{T \times C \times W \times H} be an input tensor containing a batch of T images. Let x_{tijk} denote its tijk-th element, where k and j span spatial dimensions, i is the feature channel (color channel if the input is an RGB image), and t is the index of the image in the batch. Then a simple version of contrast normalization is given by:

\[
y_{tijk} = \frac{x_{tijk}}{\sum_{l=1}^{W} \sum_{m=1}^{H} x_{tilm}} \qquad (1)
\]

It is unclear how such a function could be implemented as a sequence of ReLU and convolution operators.

On the other hand, the generator network of Ulyanov et al. (2016) does contain normalization layers, and precisely batch normalization ones. The key difference between eq. (1) and batch normalization is that the latter applies the normalization to a whole batch of images instead of to single ones:

\[
y_{tijk} = \frac{x_{tijk} - \mu_i}{\sqrt{\sigma_i^2 + \epsilon}}, \qquad
\mu_i = \frac{1}{HWT} \sum_{t=1}^{T} \sum_{l=1}^{W} \sum_{m=1}^{H} x_{tilm}, \qquad
\sigma_i^2 = \frac{1}{HWT} \sum_{t=1}^{T} \sum_{l=1}^{W} \sum_{m=1}^{H} (x_{tilm} - \mu_i)^2 \qquad (2)
\]

In order to combine the effects of instance-specific normalization and batch normalization, we propose to replace the latter by the instance normalization (also known as "contrast normalization") layer:

\[
y_{tijk} = \frac{x_{tijk} - \mu_{ti}}{\sqrt{\sigma_{ti}^2 + \epsilon}}, \qquad
\mu_{ti} = \frac{1}{HW} \sum_{l=1}^{W} \sum_{m=1}^{H} x_{tilm}, \qquad
\sigma_{ti}^2 = \frac{1}{HW} \sum_{l=1}^{W} \sum_{m=1}^{H} (x_{tilm} - \mu_{ti})^2 \qquad (3)
\]

We replace batch normalization with instance normalization everywhere in the generator network g. This prevents instance-specific mean and covariance shift, simplifying the learning process. Differently from batch normalization, furthermore, the instance normalization layer is applied at test time as well.

3 Experiments

In this section, we evaluate the effect of the modification proposed in section 2 and replace batch normalization with instance normalization. We tested both generator architectures described in Ulyanov et al. (2016) and Johnson et al. (2016) in order to see whether the modification applies to different architectures. While we did not have access to the original network by Johnson et al. (2016), we carefully reproduced their model from the description in the paper. Ultimately, we found that both generator networks have similar performance and shortcomings (fig. 5, first row).

Next, we replaced batch normalization with instance normalization and retrained the generators using the same hyperparameters. We found that both architectures significantly improved by the use of instance normalization (fig. 5, second row). The quality of both generators is similar, but we found the residuals architecture of Johnson et al. (2016) to be somewhat more efficient and easier to use, so we adopted it for the results shown in fig. 4.
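For concreteness, eq. (3) amounts to a few lines of tensor code. The sketch below is an illustration, not the authors' implementation; it assumes a PyTorch tensor of shape T x C x H x W and checks the hand-written formula against the built-in torch.nn.InstanceNorm2d layer.

```python
# Instance normalization (eq. 3): each (image t, channel i) pair is normalized by
# its own spatial mean and variance; there is no averaging over the batch
# dimension T, unlike batch normalization (eq. 2).
import torch
import torch.nn as nn

def instance_norm(x, eps=1e-5):
    # x: (T, C, H, W). Statistics are computed over the spatial dims (H, W) only.
    mu = x.mean(dim=(2, 3), keepdim=True)                  # mu_{ti}
    var = x.var(dim=(2, 3), unbiased=False, keepdim=True)  # sigma^2_{ti} (biased, as in eq. 3)
    return (x - mu) / torch.sqrt(var + eps)

x = torch.randn(4, 16, 32, 32)
builtin = nn.InstanceNorm2d(16, affine=False)   # PyTorch's layer computes the same statistics
assert torch.allclose(instance_norm(x), builtin(x), atol=1e-5)
```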
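The architectural change evaluated in the experiments is then a drop-in layer swap. The block below is a hypothetical conv + norm + ReLU unit (not the exact generator of either paper) showing nn.BatchNorm2d replaced by nn.InstanceNorm2d; since InstanceNorm2d does not track running statistics by default, each image keeps being normalized by its own statistics at test time, as the method requires.

```python
# Swapping batch normalization for instance normalization in a generator block.
# Illustrative block only; the real generators follow Ulyanov et al. / Johnson et al.
import torch.nn as nn

def conv_block(c_in, c_out, use_instance_norm=True):
    norm = (nn.InstanceNorm2d(c_out, affine=True) if use_instance_norm
            else nn.BatchNorm2d(c_out))
    return nn.Sequential(nn.Conv2d(c_in, c_out, kernel_size=3, padding=1),
                         norm,
                         nn.ReLU(inplace=True))

# With the default track_running_stats=False, InstanceNorm2d behaves identically
# under model.train() and model.eval(): the normalization stays active at test time.
block = conv_block(3, 32)
```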
4 Conclusion

In this short note, we demonstrate that by replacing batch normalization with instance normalization it is possible to dramatically improve the performance of certain deep neural networks for image generation. The result is suggestive, and we are currently experimenting with similar ideas for image discrimination tasks as well.

References

Gatys, L. A., Ecker, A. S., and Bethge, M. (2016). Image style transfer using convolutional neural networks. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

Johnson, J., Alahi, A., and Li, F. (2016). Perceptual losses for real-time style transfer and super-resolution. In European Conference on Computer Vision (ECCV).