




Multi-marginal Wasserstein GAN

Jiezhang Cao1, Langyuan Mo1, Yifan Zhang1, Kui Jia1, Chunhua Shen3, Mingkui Tan1,2
1South China University of Technology, 2Peng Cheng Laboratory, 3The University of Adelaide

Abstract

The multiple marginal matching problem aims at learning mappings that match a source domain to multiple target domains, and it has attracted great attention in many applications, such as multi-domain image translation. However, addressing this problem involves two critical challenges: (i) measuring the multi-marginal distance among different domains is intractable; (ii) it is very difficult to exploit cross-domain correlations to match the target domain distributions. In this paper, we propose a novel Multi-marginal Wasserstein GAN (MWGAN) to minimize the Wasserstein distance among domains. Specifically, with the help of multi-marginal optimal transport theory, we develop a new adversarial objective function with inner- and inter-domain constraints that exploits cross-domain correlations. Moreover, we theoretically analyze the generalization performance of MWGAN, and empirically evaluate it on both balanced and imbalanced translation tasks. Extensive experiments on toy and real-world datasets demonstrate the effectiveness of MWGAN.

1 Introduction

The multiple marginal matching ($M^3$) problem aims to map an input image (source domain) to multiple target domains (see Figure 1(a)), and it has been applied in computer vision,
e.g., multi-domain image translation [10, 23, 25]. In practice, unsupervised image translation [30] attracts particular interest because of its label-free property. However, without corresponding image pairs, it is extremely hard to learn stable mappings that match a source distribution to multiple target distributions.

Recently, some methods [10, 30] address the $M^3$ problem, but they face two main challenges. First, existing methods often neglect to jointly optimize the multi-marginal distance among domains, which cannot guarantee generalization performance and may lead to the distribution mismatching issue. CycleGAN [49] and UNIT [32] repeatedly optimize every pair of different domains separately (see Figure 1(b)); in this sense, they are computationally expensive and may generalize poorly. Moreover, UFDN [30] and StarGAN [10] essentially measure the distance between an input distribution and a mixture of all target distributions (see Figure 1(b)); as a result, they may suffer from the distribution mismatching issue. Therefore, it is necessary to explore a new method to measure and optimize the multi-marginal distance. Second, it is very challenging to exploit cross-domain correlations to match the target domains. Existing methods [49, 30] focus only on the correlations between the source and target domains, since they measure the distance between two distributions (see Figure 1(b)). However, these methods often ignore the correlations among target domains, and thus can hardly capture enough information to improve the performance. Moreover, when the source and target domains are significantly different, or the number of target domains is large, the translation task becomes too difficult for existing methods to exploit the cross-domain correlations.

*Authors contributed equally. †Corresponding author.

33rd Conference on Neural Information Processing Systems (NeurIPS 2019), Vancouver, Canada.
[Figure 1: two panels, (a) the Edge→CelebA image translation task; (b) comparisons of the distribution measures used by CycleGAN, StarGAN/UFDN, and MWGAN.]

Figure 1: An example of the $M^3$ problem and comparisons of existing methods. (a) For the Edge→CelebA task, we aim to learn mappings that match a source distribution (i.e., edge images) to the target distributions (i.e., black-hair and blond-hair images). (b) Left: we employ CycleGAN multiple times to measure the distance between every generated distribution and its corresponding target distribution. Middle: StarGAN and UFDN measure the distance between $P_s$ and a mixed distribution of $P_1$ and $P_2$. Right: MWGAN jointly measures the Wasserstein distance among $P_s$, $P_1$ and $P_2$. (Dotted circles: generated distributions; solid circles: real source or target distributions; double-headed arrows: distribution divergence; different colors represent different domains.)
In this paper, we seek to use the multi-marginal Wasserstein distance to solve the $M^3$ problem, but directly optimizing it is intractable. Therefore, we develop a new dual formulation to make it tractable and propose a novel multi-marginal Wasserstein GAN (MWGAN) that enforces inner- and inter-domain constraints to exploit the correlations among domains. The contributions of this paper are summarized as follows:

- We propose a novel GAN method (called MWGAN) to optimize a feasible multi-marginal distance among different domains. MWGAN overcomes the limitations of existing methods by alleviating the distribution mismatching issue and exploiting cross-domain correlations.
- We define and analyze the generalization of our proposed method for the multiple-domain translation task. This analysis is non-trivial for multiple domains and goes beyond existing generalization analyses [13, 36], which study only two domains.
- We empirically show that MWGAN is able to solve the imbalanced image translation task well even when the source and target domains are significantly different. Extensive experiments on toy and real-world datasets demonstrate the effectiveness of our proposed method.

2 Related Work

Generative adversarial networks (GANs). Deep neural
networks have been explored both theoretically and experimentally [7, 21, 46, 47, 51]. In particular, GANs [17] have been successfully applied to computer vision tasks, such as image generation [3, 6, 18, 20], image translation [2, 10, 19] and video prediction [35]. Specifically, a generator tries to produce realistic samples, while a discriminator tries to distinguish generated data from real data. Recently, some studies have tried to improve the quality [5, 9, 26] and diversity [42] of generated images, and to improve the mechanism of GANs [1, 11, 38, 39] to deal with the unstable training and mode collapse problems.

Multi-domain image translation. The $M^3$ problem arises in domain adaptation [43] and image translation [27, 50]. CycleGAN [49], DiscoGAN [28], DualGAN [45] and UNIT [32] were proposed to address the two-domain image translation task. However, as shown in Figure 1(b), these methods measure the distance between every pair of distributions multiple times, which is computationally expensive when applied to the multi-domain image translation task. Recently, StarGAN [10] and AttGAN [23] use a single model to perform multi-domain image translation, and UFDN [30] translates images by learning a domain-invariant representation across domains. Essentially, these three methods are two-domain image translation methods, because they measure the distance between an input distribution and a uniform mixture of the other target distributions (see Figure 1(b)). Therefore, they may suffer from the distribution mismatching issue and obtain
misleading feedback for updating the models when the source and target domains are significantly different. In addition, we discuss the differences between several GAN methods in Section I of the supplementary materials.

3 Problem Definition

Notation. We use calligraphic letters (e.g., $\mathcal{X}$) to denote a space, capital letters (e.g., $X$) to denote random variables, and bold lower-case letters (e.g., $\mathbf{x}$) to denote the corresponding values. Let $\mathcal{D} = (\mathcal{X}, P)$ be a domain, where $P$ (or $\mu$) is the marginal distribution over $\mathcal{X}$, and let $\mathcal{P}(\mathcal{X})$ be the set of all probability measures over $\mathcal{X}$. For convenience, let $\mathcal{X} = \mathbb{R}^d$, and let $\mathcal{I} = \{0, \ldots, N\}$ and $\mathcal{N} = \{1, \ldots, N\}$.
Multiple marginal matching ($M^3$) problem. In this paper, the $M^3$ problem aims to learn mappings that match a source domain to multiple target domains. For simplicity, we consider one source domain $\mathcal{D}_s = (\mathcal{X}, P_s)$ and $N$ target domains $\mathcal{D}_i = (\mathcal{X}, P_{t_i}), i \in \mathcal{N}$, where $P_s$ is the source distribution and $P_{t_i}$ is the $i$-th real target distribution. Let $g_i, i \in \mathcal{N}$, be the generative models parameterized by $\theta_i$, and let $P_i$ be the generated distribution in the $i$-th target domain. The goal is to learn multiple generative models such that each generated distribution $P_i$ in the $i$-th target domain is close to the corresponding real target distribution $P_{t_i}$ (see Figure 1(a)).

Optimal transport (OT) theory. Recently, OT theory [41] has attracted great attention in many applications [3, 44]. Directly solving the primal formulation of OT [40] can be intractable [16]. To address this, we consider the dual formulation of the multi-marginal OT problem as follows.
Problem I (Dual problem [40]). Given $N{+}1$ marginals $\mu_i \in \mathcal{P}(\mathcal{X})$, potential functions $f_i, i \in \mathcal{I}$, and a cost function $c(X^{(0)}, \ldots, X^{(N)}): \mathbb{R}^{d(N+1)} \to \mathbb{R}$, the dual Kantorovich problem is defined as:
\[
W(\mu_0, \ldots, \mu_N) = \sup_{f_i} \sum_i \int f_i\big(X^{(i)}\big)\, d\mu_i\big(X^{(i)}\big), \quad \text{s.t.} \ \sum_i f_i\big(X^{(i)}\big) \leq c\big(X^{(0)}, \ldots, X^{(N)}\big). \tag{1}
\]
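As a sanity check connecting (1) to the familiar two-domain setting (our own remark, using standard OT facts rather than an equation from the paper): for $N = 1$ and $c(X^{(0)}, X^{(1)}) = \|X^{(0)} - X^{(1)}\|$, the constraint $f_0 + f_1 \leq c$ is tightest at $f_1 = -f_0$ with $f_0$ being 1-Lipschitz, so (1) recovers the Kantorovich-Rubinstein dual that WGAN [3] optimizes:
\[
W(\mu_0, \mu_1) = \sup_{\|f\|_L \leq 1} \int f \, d\mu_0 - \int f \, d\mu_1 .
\]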
In practice, we optimize the discrete case of Problem I. Specifically, given samples $\{\mathbf{x}^{(0)}_j\}_{j \in J_0}$ and $\{\mathbf{x}^{(i)}_j\}_{j \in J_i}$ drawn from the source distribution $P_s$ and the generated target distributions $P_i, i \in \mathcal{N}$, respectively, where $J_i$ is an index set and $n_i = |J_i|$ is the number of samples, we have:

Problem II (Discrete dual problem). Let $F = \{f_0, \ldots, f_N\}$ be the set of Kantorovich potentials; then the discrete dual problem $h(F)$ is defined as:
\[
\max_F \ h(F) = \sum_i \frac{1}{n_i} \sum_{j \in J_i} f_i\big(\mathbf{x}^{(i)}_j\big), \quad \text{s.t.} \ \sum_i f_i\big(\mathbf{x}^{(i)}_{k_i}\big) \leq c\big(\mathbf{x}^{(0)}_{k_0}, \ldots, \mathbf{x}^{(N)}_{k_N}\big), \ \forall\, k_i \leq n_i. \tag{2}
\]
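To make Problem II concrete, the following small NumPy sketch (our own toy illustration; `l2_cost` and the zero potentials are hypothetical choices, not the paper's) evaluates the objective $h(F)$ and brute-force checks the inequality constraints over all index tuples $(k_0, \ldots, k_N)$:

```python
import itertools
import numpy as np

def l2_cost(points):
    # Hypothetical cost: sum of pairwise L2 distances, one point per domain.
    return sum(np.linalg.norm(p - q) for p, q in itertools.combinations(points, 2))

def dual_objective(potentials, samples):
    # h(F) = sum_i (1/n_i) sum_{j in J_i} f_i(x_j^(i))  -- objective of Eq. (2).
    return sum(np.mean([f(x) for x in xs]) for f, xs in zip(potentials, samples))

def constraints_hold(potentials, samples, cost=l2_cost):
    # Check sum_i f_i(x_{k_i}^(i)) <= c(x_{k_0}^(0), ..., x_{k_N}^(N)) for every
    # tuple; the number of tuples grows as prod_i n_i, hence the intractability.
    return all(sum(f(x) for f, x in zip(potentials, tup)) <= cost(tup) + 1e-9
               for tup in itertools.product(*samples))

rng = np.random.default_rng(0)
samples = [rng.normal(m, 0.1, size=(4, 2)) for m in (0.0, 1.0, -1.0)]  # N = 2
potentials = [lambda x: 0.0] * 3  # trivially feasible potentials
print(dual_objective(potentials, samples), constraints_hold(potentials, samples))
```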
Unfortunately, it is challenging to optimize Problem II due to the intractable inequality constraints and the multiple potential functions. To address this, we propose a new optimization method.

4 Multi-marginal Wasserstein GAN

4.1 A New Dual Formulation

For two domains, WGAN [3] solves Problem II by setting the potential functions to $f_0 = f$ and $f_1 = -f$. However, it is hard to extend WGAN to multiple domains. To address this, we propose a new dual formulation in order to optimize Problem II. Specifically, we use a shared potential function in Problem II, which is supported by both empirical and theoretical evidence. In the multi-domain image translation task, the domains are often correlated, and thus share similar properties and differ only in details (see Figure 1(a)). The cross-domain correlations can be exploited by the shared potential function (see Section J of the supplementary materials). More importantly, the optimal objectives of Problem II and the following problem are equal under some conditions (see Section B of the supplementary materials).

Problem III. Let $F = \{\lambda_0 f, \ldots, \lambda_N f\}$ be the Kantorovich potentials; then we define the dual problem as:
\[
\max_F \ h(F) = \sum_i \frac{\lambda_i}{n_i} \sum_{j \in J_i} f\big(\mathbf{x}^{(i)}_j\big), \quad \text{s.t.} \ \sum_i \lambda_i f\big(\mathbf{x}^{(i)}_{k_i}\big) \leq c\big(\mathbf{x}^{(0)}_{k_0}, \ldots, \mathbf{x}^{(N)}_{k_N}\big), \ \forall\, k_i \leq n_i. \tag{3}
\]

To further build the relationship between Problem II and Problem III, we have the following theorem,
so that Problem III can be optimized well by GAN-based methods (see Subsection 4.2).

Theorem 1. Suppose the domains are connected, the cost function $c$ is continuously differentiable, and each $\mu_i$ is absolutely continuous. If $(f_0, \ldots, f_N)$ and $(\lambda_0 f, \ldots, \lambda_N f)$ are solutions to Problem I, then there exist constants $\delta_i$ for each $i \in \mathcal{I}$ such that $\sum_i \delta_i = 0$ and $f_i = \lambda_i f + \delta_i$.

Remark 1. By Theorem 1, if we train a shared function $f$ to obtain a solution of Problem I, we obtain an equivalent Wasserstein distance, i.e., $\sum_i f_i = \sum_i \lambda_i f$, regardless of the values $\delta_i$. Therefore, we are able to optimize Problem III instead of the intractable Problem II in practice.
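The cancellation behind Remark 1 takes one line to verify; the expansion below is our own unpacking of Theorem 1's conclusion, not a displayed equation of the paper:
\[
\sum_i f_i\big(\mathbf{x}^{(i)}\big) = \sum_i \Big(\lambda_i f\big(\mathbf{x}^{(i)}\big) + \delta_i\Big) = \sum_i \lambda_i f\big(\mathbf{x}^{(i)}\big) + \underbrace{\textstyle\sum_i \delta_i}_{=\,0},
\]
so both the objective and the constraint of Problem II coincide with those of Problem III at such solutions.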
Algorithm 1 Multi-marginal WGAN
Input: training data $\{\mathbf{x}_j\}_{j=1}^{n_0}$ in the source domain and $\{\mathbf{x}^{(i)}_j\}_{j=1}^{n_i}$ in the $i$-th target domain; batch size $m_{bs}$; number of discriminator iterations per generator iteration $n_{\text{critic}}$; uniform distribution $U[0,1]$.
Output: the discriminator $f$, the generators $\{g_i\}_{i \in \mathcal{N}}$ and the classifier $\eta$.
1: while not converged do
2:   for $t = 0, \ldots, n_{\text{critic}}$ do
3:     Sample $\mathbf{x} \sim P_s$ and $\tilde{\mathbf{x}} \sim P_i$ for all $i$, and set $\hat{\mathbf{x}} \leftarrow \epsilon \mathbf{x} + (1 - \epsilon)\tilde{\mathbf{x}}$, where $\epsilon \sim U[0,1]$
4:     Update $f$ by ascending the gradient $\nabla_w \big[\mathbb{E}_{\mathbf{x} \sim P_s} f(\mathbf{x}) - \sum_i \lambda^+_i \mathbb{E}_{\tilde{\mathbf{x}} \sim P_i} f(\tilde{\mathbf{x}}) - R(f)\big]$
5:     Update the classifier $\eta$ by descending the gradient $\nabla_v C(\eta)$
6:   end for
7:   Update each generator $g_i$ by descending the gradient $\nabla_{\theta_i} \big[-\lambda^+_i \mathbb{E}_{\tilde{\mathbf{x}} \sim P_i} f(\tilde{\mathbf{x}}) - M(g_i)\big]$
8: end while
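To make Algorithm 1 concrete, here is a minimal PyTorch-style sketch of one training iteration, written by us under simplifying assumptions: $\lambda^+_i = 1/N$, the penalty $R(f)$ of Eq. (7) below with linear interpolates, a per-domain sigmoid head for the mutual-information term $M(g_i)$ of Eq. (6), and hypothetical networks `f` (critic), `gens` (list of generators) and `clf` (classifier) on image batches. It is an illustration, not the authors' released code; the classifier update (line 5) is omitted for brevity.

```python
import torch

def mwgan_step(f, gens, clf, opt_f, opt_gs, x_src, n_critic=5, L_f=1.0, tau=10.0):
    lam = 1.0 / len(gens)  # lambda_i^+ = 1/N when no prior knowledge is available
    for _ in range(n_critic):  # lines 2-6: critic inner loop
        loss_f = -f(x_src).mean()  # ascend E_{Ps} f(x) -> descend its negative
        for g in gens:
            x_gen = g(x_src).detach()
            # line 3: interpolate x_hat between real and generated samples
            eps = torch.rand(x_src.size(0), 1, 1, 1, device=x_src.device)
            x_hat = (eps * x_src + (1 - eps) * x_gen).requires_grad_(True)
            # inter-domain gradient penalty, cf. Eq. (7)
            grad = torch.autograd.grad(f(x_hat).sum(), x_hat, create_graph=True)[0]
            gp = (grad.flatten(1).norm(dim=1) - L_f).clamp(min=0).pow(2).mean()
            loss_f = loss_f + lam * f(x_gen).mean() + tau * gp
        opt_f.zero_grad(); loss_f.backward(); opt_f.step()
    for i, (g, opt_g) in enumerate(zip(gens, opt_gs)):  # line 7: generator updates
        x_gen = g(x_src)
        mi = torch.nn.functional.logsigmoid(clf(x_gen)[:, i]).mean()  # cf. Eq. (6)
        loss_g = -lam * f(x_gen).mean() - mi
        opt_g.zero_grad(); loss_g.backward(); opt_g.step()
```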
4.2 Proposed Objective Function

To minimize the Wasserstein distance among domains, we now present a novel multi-marginal Wasserstein GAN (MWGAN) based on the proposed dual formulation in (3). Specifically, let $\mathcal{F} = \{f: \mathbb{R}^d \to \mathbb{R}\}$ be the class of discriminators parameterized by $w$, and let $\mathcal{G} = \{g: \mathbb{R}^d \to \mathbb{R}^d\}$ be the class of generators, where $g_i \in \mathcal{G}$ is parameterized by $\theta_i$. Motivated by the adversarial mechanism of WGAN, let $\lambda_0 = 1$ and $\lambda_i := -\lambda^+_i$ with $\lambda^+_i \geq 0, i \in \mathcal{N}$; then Problem III can be rewritten as follows.

Problem IV (Multi-marginal Wasserstein GAN). Given a discriminator $f \in \mathcal{F}$ and generators $g_i \in \mathcal{G}, i \in \mathcal{N}$, we define the following multi-marginal Wasserstein distance:
\[
W\big(P_s, P_1, \ldots, P_N\big) = \max_f \ \mathbb{E}_{\mathbf{x} \sim P_s} f(\mathbf{x}) - \sum_i \lambda^+_i \mathbb{E}_{\tilde{\mathbf{x}} \sim P_i} f(\tilde{\mathbf{x}}), \quad \text{s.t.} \ P_i \in \mathcal{D}_i, \ f \in \Omega, \tag{4}
\]
where $P_s$ is the real source distribution, $P_i$ is the distribution generated by $g_i$ in the $i$-th domain, and $\Omega = \big\{f \,\big|\, f(\mathbf{x}) - \sum_{i \in \mathcal{N}} \lambda^+_i f\big(\tilde{\mathbf{x}}^{(i)}\big) \leq c\big(\mathbf{x}, \tilde{\mathbf{x}}^{(1)}, \ldots, \tilde{\mathbf{x}}^{(N)}\big), f \in \mathcal{F}\big\}$ with $\mathbf{x} \sim P_s$ and $\tilde{\mathbf{x}}^{(i)} \sim P_i, i \in \mathcal{N}$.
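To see how (4) arises from (3) (our own intermediate step, not a displayed equation of the paper), write the population form of the objective in (3) with $P_0 = P_s$ and substitute $\lambda_0 = 1$ and $\lambda_i = -\lambda^+_i$:
\[
\sum_{i \in \mathcal{I}} \lambda_i\, \mathbb{E}_{\mathbf{x}^{(i)} \sim P_i} f\big(\mathbf{x}^{(i)}\big) = \mathbb{E}_{\mathbf{x} \sim P_s} f(\mathbf{x}) - \sum_{i \in \mathcal{N}} \lambda^+_i\, \mathbb{E}_{\tilde{\mathbf{x}} \sim P_i} f(\tilde{\mathbf{x}}),
\]
and the inequality constraint of (3) becomes the membership condition $f \in \Omega$.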
In Problem IV, we refer to $P_i \in \mathcal{D}_i, i \in \mathcal{N}$, as inner-domain constraints and $f \in \Omega$ as inter-domain constraints (see Subsections 4.3 and 4.4). The influence of these constraints is investigated in Section N of the supplementary materials. Note that $\lambda^+_i$ reflects the importance of the $i$-th target domain. In practice, we set $\lambda^+_i = 1/N, i \in \mathcal{N}$, when no prior knowledge on the target domains is available. To minimize Problem IV, we optimize the generators with the following update rule.

Theorem 2. If each generator $g_i \in \mathcal{G}, i \in \mathcal{N}$, is locally Lipschitz (see Assumption 1 of [3] for details), then there exists a discriminator $f$ to Problem IV such that the gradient is $\nabla_{\theta_i} W(P_s, P_1, \ldots, P_N) = -\lambda^+_i \mathbb{E}_{\mathbf{x} \sim P_s} \nabla_{\theta_i} f(g_i(\mathbf{x}))$ for all $i \in \mathcal{N}$, when all terms are well-defined.

Theorem 2 provides a good update rule for optimizing MWGAN. Specifically, we first train an optimal discriminator $f$ and then update each generator along the direction of $\mathbb{E}_{\mathbf{x} \sim P_s} \nabla_{\theta_i} f(g_i(\mathbf{x}))$. The detailed algorithm is shown in Algorithm 1. Specifically, the generators cooperatively exploit multi-domain correlations (see Section J of the supplementary materials) and generate samples in their specific target domains to fool the discriminator, while the discriminator enforces the generated data in the target domains to maintain features similar to the source domain.
4.3 Inner-domain Constraints

In Problem IV, the distribution $P_i$ generated by the generator $g_i$ should belong to the $i$-th domain for every $i$. To this end, we introduce an auxiliary domain classification loss and a mutual information term.

Domain classification loss. Given an input $\mathbf{x} := \mathbf{x}^{(0)}$ and a generator $g_i$, we aim to translate the input $\mathbf{x}$ to an output $\tilde{\mathbf{x}}^{(i)}$ that is correctly classified into the target domain $\mathcal{D}_i$. To achieve this, we introduce an auxiliary classifier $\eta: \mathcal{X} \to \mathcal{Y}$, parameterized by $v$, to optimize the generators. Specifically, we label real data $\mathbf{x} \sim P_{t_i}$ as 1, where $P_{t_i}$ is the empirical distribution in the $i$-th target domain, and we label generated data $\tilde{\mathbf{x}}^{(i)} \sim P_i$ as 0. Then, the domain classification loss w.r.t. $\eta$ is defined as:
\[
C(\eta) = \lambda_c\, \mathbb{E}_{\mathbf{x}' \sim P_{t_i} \cup P_i}\, \ell\big(\eta(\mathbf{x}'), y\big), \tag{5}
\]
where $\lambda_c$ is a hyper-parameter, $y$ is the label corresponding to $\mathbf{x}'$, and $\ell(\cdot, \cdot)$ is a binary classification loss, such as the hinge loss [48], mean square loss [34], cross-entropy loss [17] or Wasserstein loss [12].
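As an illustration of (5) with the cross-entropy instantiation of $\ell$, a sketch might look as follows; `clf` is a hypothetical classifier whose $i$-th output column is the domain-$i$ logit, and `lambda_c` stands in for the hyper-parameter in (5):

```python
import torch
import torch.nn.functional as F

def classification_loss(clf, x_real, x_gen, i, lambda_c=1.0):
    # Eq. (5): binary cross-entropy on real target samples (label 1) vs.
    # generated samples (label 0) for domain i.
    logits = torch.cat([clf(x_real)[:, i], clf(x_gen.detach())[:, i]])
    labels = torch.cat([torch.ones(len(x_real), device=logits.device),
                        torch.zeros(len(x_gen), device=logits.device)])
    return lambda_c * F.binary_cross_entropy_with_logits(logits, labels)
```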
Mutual information maximization. After learning the classifier, we maximize a lower bound of the mutual information [8, 23] between the generated image and the corresponding domain, i.e.,
\[
M(g_i) = \mathbb{E}_{\mathbf{x} \sim P_s}\Big[\log \eta\big(y^{(i)} = 1 \,\big|\, g_i(\mathbf{x})\big)\Big]. \tag{6}
\]
By maximizing the mutual information in (6), we correlate the generated image $g_i(\mathbf{x})$ with the $i$-th domain, and we are thus able to translate the source image to the specified domain.
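Under the same hypothetical per-domain sigmoid head, the lower bound (6) is just an average log-probability on generated images; the sketch below shows the term each generator maximizes (our own rendering, consistent with the training-loop sketch above):

```python
import torch.nn.functional as F

def mutual_info_term(clf, g_i, x_src, i):
    # Eq. (6): M(g_i) = E_{x~Ps}[ log eta(y^(i)=1 | g_i(x)) ], reading the
    # probability from the same per-domain sigmoid head as in Eq. (5).
    return F.logsigmoid(clf(g_i(x_src))[:, i]).mean()
```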
4.4 Inter-domain Constraints

We now enforce the inter-domain constraints in Problem IV, i.e., the constraint that the discriminator $f \in \Omega$. One could simply require the discriminator to be 1-Lipschitz continuous, but this may ignore the dependency among domains (see Section H of the supplementary materials). Thus, we relax the constraints via the following lemma.

Lemma 1 (Constraint relaxation). If the cost function $c(\cdot)$ is measured by the $\ell_2$ norm, then there exists $L_f \geq 1$ such that the constraints in Problem IV satisfy $\sum_i \big|f(\mathbf{x}) - f\big(\tilde{\mathbf{x}}^{(i)}\big)\big| \,/\, \big\|\mathbf{x} - \tilde{\mathbf{x}}^{(i)}\big\| \leq L_f$.

Note that $L_f$ measures the dependency among domains (see Section G of the supplementary materials). In practice, $L_f$ can be calculated from the cost function, or treated as a tuning parameter for simplicity.
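For intuition (our own remark, not a statement from the paper): with a single target domain ($N = 1$), the relaxed constraint reads $\big|f(\mathbf{x}) - f(\tilde{\mathbf{x}}^{(1)})\big| / \big\|\mathbf{x} - \tilde{\mathbf{x}}^{(1)}\big\| \leq L_f$, which for $L_f = 1$ is the familiar 1-Lipschitz condition on the WGAN critic along real-generated pairs; for $N > 1$, the sum couples all domains through the single budget $L_f$, which is how the relaxation retains the dependency among domains.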
Inter-domain gradient penalty. In practice, directly enforcing the inequality constraint in Lemma 1 performs poorly when generated samples are far from the real data. We thus propose the following inter-domain gradient penalty. Specifically, given real data $\mathbf{x}$ in the source domain and generated samples $\tilde{\mathbf{x}}^{(i)}$, if $\tilde{\mathbf{x}}^{(i)}$ can be properly close to $\mathbf{x}$, as suggested in [37], we can calculate the gradient and introduce the following regularization term into the objective of MWGAN, i.e.,
\[
R(f) = \tau \sum_i \mathbb{E}_{\hat{\mathbf{x}}^{(i)} \sim Q_i} \Big[\Big(\big\|\nabla f\big(\hat{\mathbf{x}}^{(i)}\big)\big\| - L_f\Big)_+\Big]^2, \tag{7}
\]
where $(\cdot)_+ = \max\{0, \cdot\}$, $\tau$ is a hyper-parameter, $\hat{\mathbf{x}}^{(i)}$ is sampled between $\mathbf{x}$ and $\tilde{\mathbf{x}}^{(i)}$, and $Q_i, i \in \mathcal{N}$, is a distribution constructed by some sampling strategy. In practice, one can construct a distribution whose samples $\hat{\mathbf{x}}^{(i)}$ are interpolated between the real data $\mathbf{x}$ and the generated data $\tilde{\mathbf{x}}^{(i)}$ for every domain [18]. Note that the gradient penalty captures the dependency among domains, since the cost function in Problem IV measures the distance among all domains jointly.
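A self-contained sketch of the penalty (7), assuming image batches and realizing each $Q_i$ by the linear interpolation suggested above (this mirrors the `gp` term in the training-loop sketch of Subsection 4.1 and is our own illustration):

```python
import torch

def inter_domain_penalty(f, x_src, x_gens, L_f=1.0, tau=10.0):
    # Eq. (7): tau * sum_i E_{x_hat~Q_i}[ (||grad f(x_hat)|| - L_f)_+ ]^2, with
    # Q_i realized by interpolating source and generated samples of domain i.
    penalty = x_src.new_zeros(())
    for x_gen in x_gens:  # one generated batch per target domain
        eps = torch.rand(x_src.size(0), 1, 1, 1, device=x_src.device)
        x_hat = (eps * x_src + (1 - eps) * x_gen.detach()).requires_grad_(True)
        grad = torch.autograd.grad(f(x_hat).sum(), x_hat, create_graph=True)[0]
        penalty = penalty + (grad.flatten(1).norm(dim=1) - L_f).clamp(min=0).pow(2).mean()
    return tau * penalty
```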
5 Theoretical Analysis

In this section, we provide a generalization analysis for the proposed method. Motivated by [4], we give a new definition of generalization for multiple distributions as follows.

Definition 1 (Generalization). Let $P_s$ and $P_i$ be the continuous