Diffusion Probabilistic Models: Theory and Applications
Fan Bao, Tsinghua University

Diffusion Probabilistic Models (DPMs)
• Ho et al. Denoising diffusion probabilistic models (DDPM), NeurIPS 2020.
• Song et al. Score-based generative modeling through stochastic differential equations, ICLR 2021.
• Bao et al. Analytic-DPM: an Analytic Estimate of the Optimal Reverse Variance in Diffusion Probabilistic Models, ICLR 2022.
• Bao et al. Estimating the Optimal Covariance with Imperfect Mean in Diffusion Probabilistic Models, ICML 2022.

• Diffusion process gradually injects noise to data
• Described by a Markov chain: $q(x_0, \dots, x_N) = q(x_0)\, q(x_1 | x_0) \cdots q(x_N | x_{N-1})$
Transition of diffusion: $q(x_n | x_{n-1}) = \mathcal{N}(\sqrt{\alpha_n}\, x_{n-1}, \beta_n I)$, $\alpha_n = 1 - \beta_n$
Diffusion process: $x_0 \to x_1 \to x_2 \to \dots \to x_N \approx \mathcal{N}(0, I)$
Demo images from Song et al. Score-based generative modeling through stochastic differential equations, ICLR 2021.

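The Markov chain above can be simulated directly. The following is a minimal NumPy sketch, not code from the talk: the linear schedule for $\beta_n$, the number of steps and the toy data are assumptions chosen for illustration.

```python
import numpy as np

def forward_chain(x0, betas, rng):
    """Simulate x_0 -> x_1 -> ... -> x_N and return the list of all states."""
    xs = [x0]
    for beta in betas:
        alpha = 1.0 - beta
        noise = rng.standard_normal(x0.shape)
        # q(x_n | x_{n-1}) = N(sqrt(alpha_n) x_{n-1}, beta_n I)
        xs.append(np.sqrt(alpha) * xs[-1] + np.sqrt(beta) * noise)
    return xs

rng = np.random.default_rng(0)
betas = np.linspace(1e-4, 0.02, 1000)  # assumed linear schedule, N = 1000
x0 = np.full(2000, 3.0)                # toy data: 2000 copies of the value 3
xs = forward_chain(x0, betas, rng)
print(xs[-1].mean(), xs[-1].std())     # near 0 and 1 when x_N is close to N(0, I)
```

With this schedule the product of the $\alpha_n$ is tiny, so the final state retains almost nothing of $x_0$.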
• Diffusion process in the reverse direction ⇔ denoising process
• Reverse factorization: $q(x_0, \dots, x_N) = q(x_0 | x_1) \cdots q(x_{N-1} | x_N)\, q(x_N)$
Transition of denoising: $q(x_{n-1} | x_n) = ?$
$x_0 \leftarrow x_1 \leftarrow x_2 \leftarrow \dots \leftarrow x_N \approx \mathcal{N}(0, I)$
Diffusion process: $q(x_0, \dots, x_N) = q(x_0)\, q(x_1 | x_0) \cdots q(x_N | x_{N-1}) = q(x_0 | x_1) \cdots q(x_{N-1} | x_N)\, q(x_N)$

• Approximate diffusion process in the reverse direction
Model transition: $p(x_{n-1} | x_n) = \mathcal{N}(\mu_n(x_n), \Sigma_n(x_n))$, which approximates the
transition of denoising: $q(x_{n-1} | x_n) = ?$
$x_0 \leftarrow x_1 \leftarrow x_2 \leftarrow \dots \leftarrow x_N \approx \mathcal{N}(0, I)$
Diffusion process: $q(x_0, \dots, x_N) = q(x_0)\, q(x_1 | x_0) \cdots q(x_N | x_{N-1}) = q(x_0 | x_1) \cdots q(x_{N-1} | x_N)\, q(x_N)$
The model: $p(x_0, \dots, x_N) = p(x_0 | x_1) \cdots p(x_{N-1} | x_N)\, p(x_N)$

• We hope $q(x_0, \dots, x_N) \approx p(x_0, \dots, x_N)$, where $p(x_{n-1} | x_n) = \mathcal{N}(\mu_n(x_n), \Sigma_n(x_n))$
• Achieved by minimizing their KL divergence (i.e., maximizing the ELBO)
$$\min_{\mu_n, \Sigma_n} KL(q(x_{0:N}) \,\|\, p(x_{0:N})) \iff \max_{\mu_n, \Sigma_n} \mathbb{E}_q \log \frac{p(x_{0:N})}{q(x_{1:N} | x_0)}$$
What is the optimal solution?

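The equivalence between the two objectives follows in one line once the forward joint is factorized as $q(x_{0:N}) = q(x_0)\, q(x_{1:N} | x_0)$. A short derivation, where $H(q(x_0))$ denotes the entropy of the data distribution (a symbol introduced here, not on the slide):

```latex
\begin{aligned}
KL\big(q(x_{0:N}) \,\|\, p(x_{0:N})\big)
  &= \mathbb{E}_q \log \frac{q(x_0)\, q(x_{1:N} \mid x_0)}{p(x_{0:N})} \\
  &= \mathbb{E}_q \log q(x_0) - \mathbb{E}_q \log \frac{p(x_{0:N})}{q(x_{1:N} \mid x_0)} \\
  &= -H\big(q(x_0)\big) - \mathbb{E}_q \log \frac{p(x_{0:N})}{q(x_{1:N} \mid x_0)} .
\end{aligned}
```

Since $H(q(x_0))$ does not depend on $\mu_n$ or $\Sigma_n$, minimizing the KL divergence and maximizing the ELBO select the same parameters.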
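Bao et al. Analytic-DPM: an Analytic Estimate of the Optimal Reverse Variance in Diffusion Probabilistic Models, ICLR 2022.
Theorem (The optimal solution under scalar variance, i.e., $\Sigma_n(x_n) = \sigma_n^2 I$)
The optimal solution to $\min_{\mu_n(\cdot), \sigma_n^2} KL(q(x_{0:N}) \,\|\, p(x_{0:N}))$ is
$$\mu_n^*(x_n) = \frac{1}{\sqrt{\alpha_n}} \left( x_n + \beta_n \nabla \log q_n(x_n) \right),$$
$$\sigma_n^{*2} = \frac{\beta_n}{\alpha_n} \left( 1 - \beta_n\, \mathbb{E}_{q_n(x_n)} \frac{\| \nabla \log q_n(x_n) \|^2}{d} \right).$$
3 key steps in proof:
➢ Moment matching
➢ Law of total variance
➢ Score representation of moments of $q(x_0 | x_n)$
Parameterization of $\mu_n(\cdot)$: noise prediction form
$$\nabla \log q_n(x_n) = -\frac{1}{\sqrt{\bar{\beta}_n}}\, \mathbb{E}_{q(x_0 | x_n)}[\epsilon_n], \qquad \mu_n(x_n) = \frac{1}{\sqrt{\alpha_n}} \left( x_n - \frac{\beta_n}{\sqrt{\bar{\beta}_n}}\, \hat{\epsilon}_n(x_n) \right)$$
Here $\bar{\alpha}_n = \prod_{i \le n} \alpha_i$ and $\bar{\beta}_n = 1 - \bar{\alpha}_n$, so that $x_n = \sqrt{\bar{\alpha}_n}\, x_0 + \sqrt{\bar{\beta}_n}\, \epsilon_n$; $\hat{\epsilon}_n(x_n)$ is estimated by predicting noise.
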
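The scalar variance formula can be checked on a case where the reverse transition is known in closed form. The sketch below assumes 1-D Gaussian data $x_0 \sim \mathcal{N}(0, v_0)$ and hypothetical values for $\beta_n$ and $\bar{\alpha}_{n-1}$; the function name `analytic_variance` is illustrative. It estimates $\mathbb{E}\|\nabla \log q_n(x_n)\|^2$ by Monte Carlo and compares the result with the exact posterior variance of $x_{n-1}$ given $x_n$.

```python
import numpy as np

def analytic_variance(beta_n, alpha_n, mean_sq_score_norm, dim):
    """sigma_n^{*2} = beta_n / alpha_n * (1 - beta_n * E||score||^2 / d)."""
    return beta_n / alpha_n * (1.0 - beta_n * mean_sq_score_norm / dim)

# Toy setting (assumed values): x_0 ~ N(0, v0), so q_n = N(0, v_n), score = -x / v_n.
rng = np.random.default_rng(0)
v0, beta_n, alpha_bar_prev = 4.0, 0.1, 0.5
alpha_n = 1.0 - beta_n
alpha_bar_n = alpha_bar_prev * alpha_n
v_prev = alpha_bar_prev * v0 + (1.0 - alpha_bar_prev)  # Var[x_{n-1}]
v_n = alpha_bar_n * v0 + (1.0 - alpha_bar_n)           # Var[x_n]

x_n = rng.standard_normal(200_000) * np.sqrt(v_n)
score = -x_n / v_n
estimate = analytic_variance(beta_n, alpha_n, np.mean(score**2), dim=1)

# Exact variance of q(x_{n-1} | x_n) when both variables are jointly Gaussian.
exact = beta_n * v_prev / v_n
print(estimate, exact)
```

In practice the score is replaced by a trained model, so the Monte Carlo average is only as good as that model.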
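Bao et al. Estimating the Optimal Covariance with Imperfect Mean in Diffusion Probabilistic Models, ICML 2022.
Theorem (The optimal solution for diagonal covariance, i.e., $\Sigma_n(x_n) = \mathrm{diag}(\sigma_n(x_n)^2)$)
The optimal solution to $\min_{\mu_n(\cdot), \sigma_n(\cdot)^2} KL(q(x_{0:N}) \,\|\, p(x_{0:N}))$ is
$$\mu_n^*(x_n) = \frac{1}{\sqrt{\alpha_n}} \left( x_n + \beta_n \nabla \log q_n(x_n) \right),$$
$$\sigma_n^*(x_n)^2 = \frac{\bar{\beta}_{n-1}}{\bar{\beta}_n} \beta_n + \frac{\beta_n^2}{\bar{\beta}_n \alpha_n} \left( \mathbb{E}_{q(x_0 | x_n)}[\epsilon_n^2] - \mathbb{E}_{q(x_0 | x_n)}[\epsilon_n]^2 \right).$$
• $\mathbb{E}_{q(x_0 | x_n)}[\epsilon_n]$: predict noise
• $\mathbb{E}_{q(x_0 | x_n)}[\epsilon_n^2]$: predict squared noise
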
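Implementation framework of predicting squared noise
Data $x_0$ + Gaussian noise $\epsilon_n$ → noisy data $x_n$
• Prediction network $\hat{\epsilon}_n(x_n)$, trained by minimizing the mean squared error: $\min_{\hat{\epsilon}_n} \mathbb{E} \| \hat{\epsilon}_n(x_n) - \epsilon_n \|_2^2$
• Prediction network $h_n(x_n)$, trained by minimizing the mean squared error against the squared noise $\epsilon_n^2$: $\min_{h_n} \mathbb{E} \| h_n(x_n) - \epsilon_n^2 \|_2^2$
Optimal covariance estimate based on the predicted squared noise (the first term is a constant):
$$\hat{\sigma}_n(x_n)^2 = \frac{\bar{\beta}_{n-1}}{\bar{\beta}_n} \beta_n + \frac{\beta_n^2}{\bar{\beta}_n \alpha_n} \left( h_n(x_n) - \hat{\epsilon}_n(x_n)^2 \right)$$
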
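A minimal sketch of the plug-in estimate based on squared noise, assuming the two networks are already trained and are passed in as arrays `h` and `eps_hat` (hypothetical names). The clipping at zero is an extra safeguard assumed here, not something stated on the slides. The toy check uses 1-D Gaussian data, for which the conditional moments of $\epsilon_n$ are known exactly.

```python
import numpy as np

def sn_variance(beta_n, alpha_n, beta_bar_n, beta_bar_prev, h, eps_hat):
    """Diagonal variance from a squared-noise prediction h and a noise prediction eps_hat."""
    base = beta_bar_prev / beta_bar_n * beta_n          # constant term
    scale = beta_n**2 / (beta_bar_n * alpha_n)
    # h - eps_hat^2 estimates a conditional variance; clip so it stays non-negative.
    return base + scale * np.maximum(h - eps_hat**2, 0.0)

# Toy check (assumed values): x_0 ~ N(0, v0) in one dimension.
v0, beta_n, alpha_bar_prev = 4.0, 0.1, 0.5
alpha_n = 1.0 - beta_n
alpha_bar_n = alpha_bar_prev * alpha_n
beta_bar_prev, beta_bar_n = 1.0 - alpha_bar_prev, 1.0 - alpha_bar_n
v_prev = alpha_bar_prev * v0 + beta_bar_prev
v_n = alpha_bar_n * v0 + beta_bar_n

x_n = np.linspace(-3.0, 3.0, 7)
eps_mean = np.sqrt(beta_bar_n) * x_n / v_n               # E[eps_n | x_n]
eps_sq_mean = (1.0 - beta_bar_n / v_n) + eps_mean**2     # E[eps_n^2 | x_n]
sigma2 = sn_variance(beta_n, alpha_n, beta_bar_n, beta_bar_prev, eps_sq_mean, eps_mean)
exact = beta_n * v_prev / v_n                            # exact Var[x_{n-1} | x_n]
print(sigma2, exact)
```

With exact conditional moments the estimate matches the true posterior variance at every $x_n$; with learned networks the difference $h_n - \hat{\epsilon}_n^2$ can become negative, which is why some form of clipping is commonly needed.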
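Bao et al. Estimating the Optimal Covariance with Imperfect Mean in Diffusion Probabilistic Models, ICML 2022.
Theorem (The optimal solution for diagonal covariance, i.e., $\Sigma_n(x_n) = \mathrm{diag}(\sigma_n(x_n)^2)$)
The optimal solution to $\min_{\sigma_n(\cdot)^2} KL(q(x_{0:N}) \,\|\, p(x_{0:N}))$ with imperfect mean is
$$\sigma_n^*(x_n)^2 = \frac{\bar{\beta}_{n-1}}{\bar{\beta}_n} \beta_n + \frac{\beta_n^2}{\bar{\beta}_n \alpha_n}\, \mathbb{E}_{q(x_0 | x_n)} \left[ \left( \epsilon_n - \hat{\epsilon}_n(x_n) \right)^2 \right].$$
The expectation term is the noise prediction residual (NPR).
Generally, the mean $\mu_n(x_n) = \frac{1}{\sqrt{\alpha_n}} \left( x_n - \frac{\beta_n}{\sqrt{\bar{\beta}_n}}\, \hat{\epsilon}_n(x_n) \right)$ is not optimal due to approximation or optimization error of $\hat{\epsilon}_n(x_n)$.
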
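A small sketch of how the NPR enters the variance, under stated assumptions: `npr_target` and `npr_variance` are illustrative names, the numeric values are hypothetical, and a constant fitted by averaging stands in for a trained residual network (the mean is the minimizer of the mean squared error among constants).

```python
import numpy as np

def npr_target(eps, eps_hat):
    """Regression target for the residual network: (eps_hat - eps)^2."""
    return (eps_hat - eps) ** 2

def npr_variance(beta_n, alpha_n, beta_bar_n, beta_bar_prev, g):
    """Variance with imperfect mean: constant term + scaled noise prediction residual g."""
    base = beta_bar_prev / beta_bar_n * beta_n
    scale = beta_n**2 / (beta_bar_n * alpha_n)
    return base + scale * g

rng = np.random.default_rng(0)
eps = rng.standard_normal(100_000)
eps_hat_biased = 0.3                                # deliberately poor constant prediction
g_const = npr_target(eps, eps_hat_biased).mean()    # about 1 + 0.3^2

good = npr_variance(0.1, 0.9, 0.55, 0.5, g=1.0)     # residual of an unbiased constant
poor = npr_variance(0.1, 0.9, 0.55, 0.5, g=g_const) # residual of the biased one
print(g_const, good, poor)
```

The variance grows with the residual, so errors in the mean are compensated by a wider transition.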
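Implementation framework of predicting NPR
Data $x_0$ + Gaussian noise $\epsilon_n$ → noisy data $x_n$
• Prediction network $\hat{\epsilon}_n(x_n)$, trained by minimizing the mean squared error: $\min_{\hat{\epsilon}_n} \mathbb{E} \| \hat{\epsilon}_n(x_n) - \epsilon_n \|_2^2$
• Prediction network $g_n(x_n)$, trained by minimizing the mean squared error against the noise residual $(\hat{\epsilon}_n(x_n) - \epsilon_n)^2$: $\min_{g_n} \mathbb{E} \| g_n(x_n) - (\hat{\epsilon}_n(x_n) - \epsilon_n)^2 \|_2^2$
Optimal covariance estimate based on the predicted noise residual:
$$\hat{\sigma}_n(x_n)^2 = \frac{\bar{\beta}_{n-1}}{\bar{\beta}_n} \beta_n + \frac{\beta_n^2}{\bar{\beta}_n \alpha_n}\, g_n(x_n)$$

Song et al. Score-based generative modeling through stochastic differential equations, ICLR 2021.
• The continuous timesteps version (SDE)
• $q(x_0, \dots, x_N)$ becomes
$$d\mathbf{x} = f(t)\, \mathbf{x}\, dt + g(t)\, d\mathbf{w} \quad \leftrightarrow \quad d\mathbf{x} = \left[ f(t)\, \mathbf{x} - g(t)^2\, \nabla \log q_t(\mathbf{x}) \right] dt + g(t)\, d\bar{\mathbf{w}}$$
• $p(x_0, \dots, x_N)$ becomes
$$d\mathbf{x} = \left[ f(t)\, \mathbf{x} - g(t)^2\, \mathbf{s}_t(\mathbf{x}) \right] dt + g(t)\, d\bar{\mathbf{w}}$$
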
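The reverse SDE can be integrated numerically. The following is a minimal Euler-Maruyama sketch, not the sampler used in the cited paper: it assumes a variance-preserving choice $f(t) = -b/2$, $g(t) = \sqrt{b}$ with a constant rate $b$, and 1-D Gaussian data $\mathcal{N}(m, v_0)$ so that the exact score is available in place of a learned $\mathbf{s}_t$.

```python
import numpy as np

def reverse_sde_sample(score_fn, f, g, x_T, T, n_steps, rng):
    """Euler-Maruyama for dx = [f(t) x - g(t)^2 s_t(x)] dt + g(t) dw_bar, from t = T to 0."""
    x, dt = np.array(x_T, dtype=float), T / n_steps
    for i in range(n_steps):
        t = T - i * dt
        drift = f(t) * x - g(t) ** 2 * score_fn(x, t)
        # Time runs backwards, so the deterministic step is -drift * dt.
        x = x - drift * dt + g(t) * np.sqrt(dt) * rng.standard_normal(x.shape)
    return x

b, m, v0 = 8.0, 2.0, 0.25          # assumed rate and toy data distribution N(m, v0)
f = lambda t: -0.5 * b
g = lambda t: np.sqrt(b)

def score_fn(x, t):
    a = np.exp(-0.5 * b * t)       # x_t = a x_0 + sqrt(1 - a^2) eps under the forward SDE
    return -(x - a * m) / (a**2 * v0 + 1.0 - a**2)

rng = np.random.default_rng(0)
x_T = rng.standard_normal(5000)    # start from approximately N(0, 1)
x_0 = reverse_sde_sample(score_fn, f, g, x_T, T=1.0, n_steps=1000, rng=rng)
print(x_0.mean(), x_0.std())       # should be close to m = 2 and sqrt(v0) = 0.5
```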
Conditional DPMs: Paired Data
We have pairs of $(x_0, c)$, where $x_0$ is the data and $c$ is the condition.
The goal is to learn the unknown conditional data distribution $q(x_0 | c)$.

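Conditional Model
• Original model $s_n(x_n)$ → conditional model $s_n(x_n | c)$
• Training: $\min_{s_n} \mathbb{E}_c\, \mathbb{E}_n\, \bar{\beta}_n\, \mathbb{E}_{q_n(x_n | c)} \| s_n(x_n | c) - \nabla \log q_n(x_n | c) \|^2$
• Conditional DPM:
  Discrete time: $p(x_{n-1} | x_n, c) = \mathcal{N}(\mu_n(x_n | c), \Sigma_n(x_n))$, $\mu_n(x_n | c) = \frac{1}{\sqrt{\alpha_n}} \left( x_n + \beta_n s_n(x_n | c) \right)$
  Continuous time: $d\mathbf{x} = \left[ f(t)\, \mathbf{x} - g(t)^2\, \mathbf{s}_t(\mathbf{x} | c) \right] dt + g(t)\, d\bar{\mathbf{w}}$
• Challenge: design the model architecture $s_n(x_n | c)$
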
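Discriminative Guidance
• Exact reverse SDE: $d\mathbf{x} = \left[ f(t)\, \mathbf{x} - g(t)^2\, \nabla \log q_t(\mathbf{x} | c) \right] dt + g(t)\, d\bar{\mathbf{w}}$
• $\nabla \log q_t(\mathbf{x} | c) = \nabla \log q_t(\mathbf{x}) + \nabla \log q_t(c | \mathbf{x})$, where the first term is approximated by the original DPM and the second by a discriminative model
• The paired data is used in the training of the discriminative model
• Conditional score-based SDE: $d\mathbf{x} = \left[ f(t)\, \mathbf{x} - g(t)^2 \left( \mathbf{s}_t(\mathbf{x}) + \nabla \log p_t(c | \mathbf{x}) \right) \right] dt + g(t)\, d\bar{\mathbf{w}}$
• Benefits: Many discriminative models have well-studied architectures
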
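Scale Discriminative Guidance
• Exact reverse SDE: $d\mathbf{x} = \left[ f(t)\, \mathbf{x} - g(t)^2 \left( \nabla \log q_t(\mathbf{x}) + \nabla \log q_t(c | \mathbf{x}) \right) \right] dt + g(t)\, d\bar{\mathbf{w}}$
• Scale discriminative guidance: $d\mathbf{x} = \left[ f(t)\, \mathbf{x} - g(t)^2 \left( \nabla \log q_t(\mathbf{x}) + \lambda \nabla \log q_t(c | \mathbf{x}) \right) \right] dt + g(t)\, d\bar{\mathbf{w}}$
• Conditional score-based SDE:
  $d\mathbf{x} = \left[ f(t)\, \mathbf{x} - g(t)^2 \left( \mathbf{s}_t(\mathbf{x}) + \lambda \nabla \log p_t(c | \mathbf{x}) \right) \right] dt + g(t)\, d\bar{\mathbf{w}}$
  $d\mathbf{x} = \left[ f(t)\, \mathbf{x} - g(t)^2 \left( \mathbf{s}_t(\mathbf{x} | c) + \lambda \nabla \log p_t(c | \mathbf{x}) \right) \right] dt + g(t)\, d\bar{\mathbf{w}}$
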
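The decomposition of the conditional score, and the effect of the scale $\lambda$, can be seen on a toy example. The sketch below is an assumption-laden illustration at a single noise level: the data is a 1-D mixture of $\mathcal{N}(-2, 1)$ and $\mathcal{N}(2, 1)$ with equal weights, and the classifier $p(c = 1 | x) = \mathrm{sigmoid}(4x)$ is the exact posterior of the second component rather than a trained model.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def uncond_score(x):
    """Score of the mixture 0.5 N(-2, 1) + 0.5 N(2, 1)."""
    return -x + 2.0 * np.tanh(2.0 * x)

def grad_log_classifier(x):
    """Gradient of log p(c=1 | x) with p(c=1 | x) = sigmoid(4 x)."""
    return 4.0 * (1.0 - sigmoid(4.0 * x))

def guided_score(x, lam):
    """s(x) + lam * grad log p(c | x); lam = 1 recovers the exact conditional score."""
    return uncond_score(x) + lam * grad_log_classifier(x)

x = np.linspace(-3.0, 3.0, 13)
print(guided_score(x, 1.0) + (x - 2.0))   # all zeros: matches the score of N(2, 1)
print(guided_score(x, 3.0) - guided_score(x, 1.0))  # extra push towards class 1
```

Setting $\lambda > 1$ adds a positive term everywhere, which pushes samples further into the region the classifier assigns to $c$.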
Conditioned on label
Dhariwal et al. Diffusion Models Beat GANs on Image Synthesis

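Self Guidance
Ho et al. Unconditional Diffusion Guidance
• Scale discriminative guidance: $d\mathbf{x} = \left[ f(t)\, \mathbf{x} - g(t)^2 \left( \nabla \log q_t(\mathbf{x}) + \lambda \nabla \log q_t(c | \mathbf{x}) \right) \right] dt + g(t)\, d\bar{\mathbf{w}}$
  Requires an extra discriminative model
• $\nabla \log q_t(c | \mathbf{x}) = \nabla \log q_t(\mathbf{x} | c) - \nabla \log q_t(\mathbf{x})$
• Learn conditional & unconditional model together
  Introduce token $\emptyset$, and use $s_t(x_t | \emptyset)$ to represent unconditional cases
• Conditional score-based SDE: $d\mathbf{x} = \left[ f(t)\, \mathbf{x} - g(t)^2 \left( \mathbf{s}_t(\mathbf{x} | \emptyset) + \lambda \left( \mathbf{s}_t(\mathbf{x} | c) - \mathbf{s}_t(\mathbf{x} | \emptyset) \right) \right) \right] dt + g(t)\, d\bar{\mathbf{w}}$
• Training:
$$\min_{s_n(\cdot)}\ \underbrace{\mathbb{E}_c\, \mathbb{E}_n\, \bar{\beta}_n\, \mathbb{E}_{q_n(x_n | c)} \| s_n(x_n | c) - \nabla \log q_n(x_n | c) \|^2}_{\text{conditional loss}} + \lambda\, \underbrace{\mathbb{E}_n\, \bar{\beta}_n\, \mathbb{E}_{q_n(x_n)} \| s_n(x_n | \emptyset) - \nabla \log q_n(x_n) \|^2}_{\text{unconditional loss}}$$
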
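A minimal sketch of the two ingredients, with illustrative names. `self_guided_score` implements the combination used in the conditional score-based SDE above. `drop_condition` shows one common way to train both cases with a single network, randomly replacing the condition by the null token; this label-dropout scheme is an assumption borrowed from common practice and is not spelled out on the slide, which writes the objective as a weighted sum of two losses.

```python
import numpy as np

def self_guided_score(s_cond, s_uncond, lam):
    """s(x | null) + lam * (s(x | c) - s(x | null))."""
    return s_uncond + lam * (s_cond - s_uncond)

def drop_condition(labels, p_uncond, null_token, rng):
    """Replace each label by the null token with probability p_uncond."""
    mask = rng.random(labels.shape) < p_uncond
    return np.where(mask, null_token, labels)

rng = np.random.default_rng(0)
labels = np.array([3, 1, 4, 1, 5, 9, 2, 6])
NULL = -1                                   # hypothetical id reserved for the null token
print(drop_condition(labels, 0.25, NULL, rng))

s_cond, s_uncond = np.array([1.0, -2.0]), np.array([0.5, 0.0])
print(self_guided_score(s_cond, s_uncond, 2.0))   # extrapolates past the conditional score
```

With $\lambda = 1$ the combination reduces to the conditional score and with $\lambda = 0$ to the unconditional one.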
Saharia et al. Image Super-Resolution via Iterative Refinement
Application: Image Super-Resolution
• Paired data $(x_0, c)$, $x_0$ is the high resolution image, $c$ is the low resolution image
• Learn a conditional model $s_n(x_n | c)$
• Architecture: $s_n(x_n | c) = \mathrm{UNet}(\mathrm{cat}(x_n, c'), n)$, where $c'$ is the bicubic interpolation of $c$

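The conditioning mechanism amounts to building the network input by channel-wise concatenation. The sketch below shows only that step, with assumed channel-first shapes; to stay self-contained it swaps the bicubic interpolation for nearest-neighbour upsampling, and the UNet itself is omitted.

```python
import numpy as np

def upsample_nearest(c, factor):
    """Nearest-neighbour upsampling of a (C, H, W) array (stand-in for bicubic)."""
    return c.repeat(factor, axis=-2).repeat(factor, axis=-1)

def sr_input(x_n, c, factor):
    """Build cat(x_n, c') along the channel axis, where c' is the upsampled condition."""
    c_up = upsample_nearest(c, factor)
    return np.concatenate([x_n, c_up], axis=0)

rng = np.random.default_rng(0)
x_n = rng.standard_normal((3, 64, 64))   # noisy high resolution image
c = rng.standard_normal((3, 16, 16))     # low resolution condition
inp = sr_input(x_n, c, factor=4)
print(inp.shape)                         # 6 channels at high resolution
```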
Nichol et al. GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models
Application: Text to Image
• Dataset contains pairs of $(x_0, c)$, where $x_0$ is the image and $c$ is the text
• Techniques: conditional model with self-guidance
• Challenge: design $s_t(x | c)$

Nichol et al. GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models
Application: Text to Image
• Architecture of $s_t(x | c)$: UNet + Transformer
• UNet encodes image $x$
• Transformer encodes text $c$ and the embedding is injected to UNet
  The token embedding is injected after group normalization in ResBlock
  The token embedding is concatenated to the attention context in UNet
Other details
• Dataset: the same as DALL-E
• #parameters: 2.3 billion for 64x64

Amit et al. SegDiff: Image Segmentation with Diffusion Probabilistic Models
Application: Segmentation
• Paired data $(x_0, c)$, $x_0$ is the segmentation, $c$ is the image
• $s_t(x | c) = \mathrm{UNet}(F(x) + G(c), t)$

Conditional DPMs: Unpaired Data
We only have a set of $x_0$ (data).
The goal is to construct a conditional distribution $p(x_0 | c)$.

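Energy Guidance
• Unconditional DPM trained from a set of $x_0$ (data): $d\mathbf{x} = \left[ f(t)\, \mathbf{x} - g(t)^2\, \mathbf{s}_t(\mathbf{x}) \right] dt + g(t)\, d\bar{\mathbf{w}}$
• A strategy to construct $p(x_0 | c)$ is to insert an energy function: $d\mathbf{x} = \left[ f(t)\, \mathbf{x} - g(t)^2 \left( \mathbf{s}_t(\mathbf{x}) - \nabla E_t(\mathbf{x}, c) \right) \right] dt + g(t)\, d\bar{\mathbf{w}}$, $x_T \sim p(x_T | c)$
• The generated data tends to have a low energy $E_t(\mathbf{x}, c)$
• The energy depends on specific applications
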
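A minimal sketch of the energy-guided drift, assuming hypothetical callables `score_fn` and `grad_energy_fn`. The toy instance uses the score of $\mathcal{N}(0, I)$ and a quadratic energy $E(x, c) = \|x - c\|^2 / (2\tau)$, both chosen only because the guided score then has a known zero at $c / (1 + \tau)$.

```python
import numpy as np

def energy_guided_drift(x, t, c, f, g, score_fn, grad_energy_fn):
    """Drift f(t) x - g(t)^2 (s_t(x) - grad E_t(x, c)) of the energy-guided reverse SDE."""
    guided = score_fn(x, t) - grad_energy_fn(x, t, c)
    return f(t) * x - g(t) ** 2 * guided

tau = 0.5                                          # assumed energy temperature
score_fn = lambda x, t: -x                         # score of N(0, I), held fixed over t
grad_energy_fn = lambda x, t, c: (x - c) / tau     # gradient of ||x - c||^2 / (2 tau)
f = lambda t: -0.5
g = lambda t: 1.0

c = np.array([3.0, -1.5])
x_star = c / (1.0 + tau)                           # mode of the guided density
guided_at_mode = score_fn(x_star, 0.0) - grad_energy_fn(x_star, 0.0, c)
drift = energy_guided_drift(x_star, 0.0, c, f, g, score_fn, grad_energy_fn)
print(guided_at_mode, drift)
```

Subtracting $\nabla E_t$ tilts the density by $\exp(-E_t)$, which is why low-energy samples become more likely.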
Energy Guidance
• Pros: Provides a framework for incorporating domain knowledge to DPMs
• Cons:
  $p(x_0 | c)$ is very black box
  Energy design is based on intuition

Application: Text to Image
• High level idea: Define energy as a negative similarity between image and text
• CLIP provides a model to measure the similarity between images and texts:
  Similarity: $\mathrm{sim}(\mathbf{x}, c) = \mathbf{f}(\mathbf{x}) \cdot \mathbf{g}(c)$
  Energy: $E_t(\mathbf{x}, c) = -\mathrm{sim}(\mathbf{x}, c)$
Nichol et al. GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models

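The energy and its gradient can be sketched with a stand-in for the encoders. Here a random linear map `W` plays the role of the image encoder $\mathbf{f}$ and a random vector plays the role of the text embedding $\mathbf{g}(c)$; both are hypothetical, since real CLIP encoders are nonlinear networks and their gradient would come from automatic differentiation.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((4, 6))      # hypothetical linear image encoder f(x) = W x
text_emb = rng.standard_normal(4)    # stands in for the text embedding g(c)

def sim(x):
    """Similarity f(x) . g(c)."""
    return (W @ x) @ text_emb

def energy(x):
    """E(x, c) = -sim(x, c)."""
    return -sim(x)

def grad_energy(x):
    """Gradient of the energy with respect to x (constant for a linear encoder)."""
    return -W.T @ text_emb

x = rng.standard_normal(6)
x_new = x - 0.1 * grad_energy(x)     # a step along -grad E lowers the energy
print(energy(x), energy(x_new))
```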
Application: Text to Image
[Figure panels: Energy guidance; Self guidance]

Vikash et al. Generating High Fidelity Data from Low-density Regions using Diffusion Models
Application: Generate Low Density Images
Samples from the SDE are more similar to the high-density part of the dataset
[Figure panels: Samples from SDE of $\mathbf{s}_t(\mathbf{x} | c)$; Dataset]

Vikash et al. Generating High Fidelity Data from Low-density Regions using Diffusion Models
Application: Generate Low Density Images
• Original SDE: $d\mathbf{x} = \left[ f(t)\, \mathbf{x} - g(t)^2\, \mathbf{s}_t(\mathbf{x} | c) \right] dt + g(t)\, d\bar{\mathbf{w}}$
• New SDE: $d\mathbf{x} = \left[ f(t)\, \mathbf{x} - g(t)^2 \left( \mathbf{s}_t(\mathbf{x} | c) - \nabla E_t(\mathbf{x}, c) \right) \right] dt + g(t)\, d\bar{\mathbf{w}}$
• High level intuition: small energy ~ $x$ is away from the class $c$
  $E_t(x, c) = \mathrm{sim}(x, c) = f(x) \cdot \mu_c$
  $f$ is an image encoder and $\mu_c$ is the averaged embedding of class $c$
• Empirically, use a contrastive version of the loss

Vikash et al. Generating High Fidelity Data from Low-density Regions using Diffusion Models
Application: Generate Low Density Images
[Figure panels: Samples from SDE of $\mathbf{s}_t(\mathbf{x} | c)$; Samples from $\mathbf{s}_t(\mathbf{x} | c) - \nabla E_t(\mathbf{x}, c)$; Dataset]

Meng et al. Image Synthesis and Editing with Stochastic Differential Equations
Application: Image2Image Translation
• $c$ is the reference image
• $\mathbf{s}_t(\mathbf{x})$ is a DPM on the target domain
• $d\mathbf{x} = \left[ f(t)\, \mathbf{x} - g(t)^2\, \mathbf{s}_t(\mathbf{x}) \right] dt + g(t)\, d\bar{\mathbf{w}}$, $x_{t_0} \sim p(x_{t_0} | c)$
• No energy guidance
• $c$ only influences the start distribution
• Choose an early start time $t_0 < T$
• $p(x_{t_0} | c)$ is a Gaussian perturbation of $c$

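The procedure can be sketched end to end on a toy problem. The following assumes a variance-preserving SDE with constant rate $b$, a 1-D Gaussian target domain $\mathcal{N}(m, v_0)$ with a known score in place of a learned model, and a reference $c = 0$ lying outside that domain; the function name `sdedit` and all numeric values are illustrative, and the cited paper's exact perturbation may differ.

```python
import numpy as np

b, m, v0 = 8.0, 2.0, 0.25          # assumed SDE rate and toy target domain N(m, v0)

def a(t):
    return np.exp(-0.5 * b * t)    # x_t = a(t) x_0 + sqrt(1 - a(t)^2) eps

def score_fn(x, t):
    return -(x - a(t) * m) / (a(t) ** 2 * v0 + 1.0 - a(t) ** 2)

def sdedit(c, t0, n_steps, rng):
    # Start distribution p(x_{t0} | c): a Gaussian perturbation of the reference c.
    x = a(t0) * c + np.sqrt(1.0 - a(t0) ** 2) * rng.standard_normal(c.shape)
    dt = t0 / n_steps
    for i in range(n_steps):
        t = t0 - i * dt
        drift = -0.5 * b * x - b * score_fn(x, t)    # f(t) x - g(t)^2 s_t(x)
        x = x - drift * dt + np.sqrt(b * dt) * rng.standard_normal(x.shape)
    return x

rng = np.random.default_rng(0)
c = np.zeros(4000)                                   # reference far from the target domain
faithful = sdedit(c, t0=0.05, n_steps=200, rng=rng)  # early start: stays closer to c
realistic = sdedit(c, t0=1.0, n_steps=1000, rng=rng) # late start: follows the target domain
print(faithful.mean(), realistic.mean())
```

The start time $t_0$ trades faithfulness to $c$ against realism under the target-domain model.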
Meng et al. Image Synthesis and Editing with Stochastic Differential Equations
Application: Image2Image Translation
$p(x_{t_0} | c)$ is a Gaussian perturbation of $c$
[Figure: Stroke to painting]

DPMs for Downstream Tasks
Regard DPMs as pretrained models (feature extractors)

Dmitry et al. Label-Efficient Semantic Segmentation with Diffusion Models
DPMs for Downstream Segmentation
DPM features are