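Fundamentals of Linear Algebra for Machine Learning
Very loose, but intuitive (hopefully)

Linear Space - Vector Space
A (linear) vector space is a set of elements called vectors (a function can also be called a vector!). Unlike a plain set, it has two extra operations defined over it: 1) addition, 2) scalar multiplication.
These two operations must satisfy 7 rules, e.g., commutativity and distributivity. The rules are elementary but general enough that they should not be considered strong constraints.

Examples
1. n-dimensional real coordinate space (Euclidean space): null vector, x+y, ax, etc.
2. Space of real sequences: each element is an infinite sequence of real numbers. A real sequence is bounded if there exists a constant M such that the absolute value of every number in the sequence is smaller than M. Bounded sequences form a vector space as well.
For functions treated as vectors: f = g means that f(x) = g(x) at every position x, and f + g means a new function h with h(x) = f(x) + g(x).

Subspace: a subset of V whose elements are closed under linear combination.
Linear combination: a mechanism for generating new samples as linear combinations of existing ones. In particular, if the linear combinations of U = [u1, u2, ..., uk] fill the whole subspace S, we say that U "spans" the subspace S, and S is called the column space of the matrix U.
Matrix form of a linear combination: Xw
Convex set: a more "practical" alternative to a subspace: not too big, but with no hole in it.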
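The matrix form Xw and the idea of a column space can be checked numerically. The sketch below is only an illustration under assumed data: the matrix `U`, the weights `w`, and the helper name `in_column_space` are made up for this example, and membership is tested with a least-squares fit.

```python
import numpy as np

def in_column_space(U, s, tol=1e-10):
    """Return True if s is (numerically) a linear combination of the columns of U."""
    w, *_ = np.linalg.lstsq(U, s, rcond=None)   # best weights in the least-squares sense
    residual = np.linalg.norm(U @ w - s)
    return bool(residual <= tol * max(1.0, np.linalg.norm(s)))

# Two basis vectors of a plane in R^3 (hypothetical example data).
U = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
w = np.array([2.0, -1.0])

s = U @ w                                        # linear combination 2*u1 - 1*u2
inside = in_column_space(U, s)                   # s was built from the columns of U
outside = in_column_space(U, np.array([1.0, 1.0, 0.0]))  # third entry is not the sum of the first two
```

Every vector of the form `U @ w` lies in the column space by construction; a vector with a nonzero least-squares residual lies outside it.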
Linear independence and dimension
How can "a group of people with completely different characteristics" be defined mathematically?
No member can be expressed through the others: if you try to combine them linearly to reach zero, all the combination coefficients must be zero. This is linear independence.
Why? To remove redundancy: express the most information with the least data.

Basis and dimension
Each vector space has a fixed number of key members; they are different from each other and can generate all the others in the space.
You can change the basis, but you cannot change the dimension of a vector space.
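One common numerical test for linear independence, given here as a sketch rather than as the method used in the slides, compares the rank of the stacked vectors with their count. The helper name and the example vectors are assumptions for illustration.

```python
import numpy as np

def is_linearly_independent(vectors):
    """True if no vector in the list is a linear combination of the others."""
    A = np.column_stack(vectors)
    # The rank counts how many columns carry non-redundant information.
    return bool(np.linalg.matrix_rank(A) == A.shape[1])

e1 = np.array([1.0, 0.0, 0.0])
e2 = np.array([0.0, 1.0, 0.0])
redundant = e1 + e2                     # fully determined by e1 and e2

independent_pair = is_linearly_independent([e1, e2])
dependent_triple = is_linearly_independent([e1, e2, redundant])
dimension_of_span = int(np.linalg.matrix_rank(np.column_stack([e1, e2, redundant])))
```

The rank of the stacked matrix is also the dimension of the space the vectors span, which is why adding a redundant vector does not increase it.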
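Normed linear space
Simply a vector space equipped with an operator (the norm) that calculates the energy/size of a vector.
Why do we care about this? To judge whether a mathematical object is under control, compute its energy!

Transformation and continuity
A transformation y = T(x) is simply a mapping from X to Y.
A transformation from X to the space of real scalars is called a functional on X. Functionals are denoted f, g, meaning f(x) over X.
E.g., f(x) = ||x||, or f(x) = <w,x>. For the latter, taking the maximum over unit vectors, $\|f\| = \max_{\|x\|=1} |f(x)| = \max_{\|x\|=1} |\langle w,x\rangle| \le \|w\|\,\|x\| = \|w\|$, with equality at $x = w/\|w\|$, so $\|f\| = \|w\|$.
Smoothness: if x changes slightly, then y changes slightly as well. How to formulate this? One standard formulation: T is continuous at $x_0$ if for every $\epsilon > 0$ there exists $\delta > 0$ such that $\|x - x_0\| < \delta$ implies $\|T(x) - T(x_0)\| < \epsilon$.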
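The claim that the norm of the functional f(x) = <w,x> equals ||w|| can be sanity-checked by sampling. This is a small sketch with an arbitrarily chosen `w`; the random unit vectors only approximate the maximum, while the point x = w/||w|| attains it exactly.

```python
import numpy as np

rng = np.random.default_rng(0)
w = np.array([3.0, 4.0])                 # assumed example vector, ||w|| = 5

# Sample many unit vectors and evaluate |f(x)| = |<w, x>| on each.
X = rng.normal(size=(10000, 2))
X /= np.linalg.norm(X, axis=1, keepdims=True)
sampled_max = np.abs(X @ w).max()        # never exceeds ||w|| (Cauchy-Schwarz)

# The bound is attained at x = w / ||w||.
attained = abs(w @ (w / np.linalg.norm(w)))
```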
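Banach space
Suppose you follow a sequence (e.g., in optimization) and find that it converges to a point outside the space you are looking at. A Banach space is a normed space that guarantees this will never happen (this property is called completeness).
C[0,1] (under the maximum norm) is a Banach space.

Hilbert space
A Hilbert space is a Banach space with an inner product defined on it. The inner product can be used to define how the energy of an element is computed (the norm): x'x induces a norm, i.e., ||x||^2 = x'x.
Orthogonal set: a subset of H in which any two distinct elements have inner product 0.
Orthonormal set: orthogonal, and every element has energy 1.
The matrices they form are called an orthogonal matrix and an orthonormal matrix, respectively. Applying an orthonormal matrix to an object as a linear transformation does not change its length, only its direction.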
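The length-preserving property of an orthonormal matrix is easy to verify on a small case. The sketch below uses a 2-D rotation as an assumed example of such a matrix.

```python
import numpy as np

theta = np.pi / 6                         # assumed rotation angle
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

x = np.array([3.0, 4.0])
y = Q @ x                                 # transformed vector

is_orthonormal = bool(np.allclose(Q.T @ Q, np.eye(2)))   # columns orthogonal with unit energy
length_before = np.linalg.norm(x)
length_after = np.linalg.norm(y)
```

Since ||Qx||^2 = x'Q'Qx = x'x, the two lengths agree while the direction of `y` differs from that of `x`.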
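Gram-Schmidt procedure
How to find a basis for a group of points v1, v2, ..., vK in a Hilbert space?
Input: K linearly independent points
Output: K orthonormal points
Question: if V is the original matrix and U is the matrix after orthogonalization, does U change the column space spanned by V?

Project v onto u
Projection: find the point a in a set U that is closest to a point v outside the set.
- Special case: find the point on the line u that is closest to v.
Rephrase: fit the data v with the restricted model f(u) = au. Setting the derivative of $\|v - au\|^2$ with respect to a to zero gives $a = \langle u, v\rangle / \langle u, u\rangle$, so v = au + (v - au) = projection + residual.

Residual learning: use an iterative method in which each round fits only a small part of the detail of the target function, and the next round fits what is "left over" (the residual).
Iterate t = 1...T:
- Compute the residual between the current model F_t and the target.
- Learn a new model f_{t+1} to fit the residual.
- Merge the new model f_{t+1} into F_t.
- Good enough? If yes, stop now.

Input: K linearly independent points v_k
Output: K orthonormal points u_k
Algorithm:
1. Start from an arbitrary direction, u1 = v1 (normalized), F_1 = {u1}.
Repeat for k = 2...K:
2. Fit the next point v_k with the current model F_{k-1}, compute the residual r_k, and normalize it.
3. Update the model: F_k = F_{k-1} ∪ {r_k}.
Example: fit the current sample v4 with the current model F_3 = {u1, u2, u3}; the normalized residual becomes u4, which is added to F and is orthogonal to the existing {u1, u2, u3}.
In matrix form:
V (N x d): input observation matrix
U (N x d): orthonormal matrix
\Gamma (d x d): upper triangular matrix with unit diagonal (the diagonal is 1 only if the columns of U are left unnormalized; with normalized columns the diagonal holds the residual norms)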
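The procedure above can be sketched as follows. This is a minimal illustration, not the slides' own code: it uses the "modified" Gram-Schmidt ordering (projections are subtracted from the running residual), it normalizes every column, and the names `gram_schmidt`, `U`, and `G` are chosen for the example. Because the columns are normalized, the diagonal of `G` holds the residual norms rather than ones.

```python
import numpy as np

def gram_schmidt(V):
    """Orthonormalize the columns of V (assumed linearly independent).

    Returns U with orthonormal columns and upper-triangular G such that V = U @ G.
    """
    V = np.asarray(V, dtype=float)
    n, d = V.shape
    U = np.zeros((n, d))
    G = np.zeros((d, d))
    for k in range(d):
        r = V[:, k].copy()                 # start from the raw point v_k
        for j in range(k):
            G[j, k] = U[:, j] @ r          # fit with the existing direction u_j
            r -= G[j, k] * U[:, j]         # keep only the residual
        G[k, k] = np.linalg.norm(r)        # size of what the model could not explain
        U[:, k] = r / G[k, k]              # normalized residual becomes u_k
    return U, G

V = np.array([[1.0, 1.0],
              [1.0, 0.0],
              [0.0, 1.0]])
U, G = gram_schmidt(V)
same_column_space = np.linalg.matrix_rank(np.hstack([V, U])) == np.linalg.matrix_rank(V)
```

The last line speaks to the question raised above: stacking U next to V does not increase the rank, so in this example the orthogonalization leaves the column space unchanged.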
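Least squares regression
Idea: obtain an orthonormal basis of X (U below), then regress y (N x 1) on it.
The Gram-Schmidt procedure gives X = UT, where U (N x d) spans the column space of X.
Regress y on U instead of X: Ua = y, so a = U'y (in the least-squares sense, using U'U = I).
Regressing y directly on X gives Xb = UTb = y, i.e., Tb = U'y = a.
Because T is upper triangular, the last row of this system involves only the last coefficient. With the variables ordered so that x_j comes last (m = d) and T scaled so that T(m,m) = 1, the regression coefficient \beta(j) of x_j is essentially the regression of y on u_m, where u_m is the residual left after regressing x_j on all the other columns.
Problem: as soon as one column of X is highly correlated with x_j, u_m becomes very small, and the estimate of \beta(j) becomes unstable.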
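The relation Tb = U'y = a can be tried out with NumPy's QR factorization standing in for the Gram-Schmidt step. This is a sketch on synthetic data; `b_true`, the noise level, and the sizes are assumptions, and note that `np.linalg.qr` returns a T whose diagonal is not scaled to 1.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))                 # synthetic design matrix
b_true = np.array([1.0, -2.0, 0.5])          # assumed ground-truth coefficients
y = X @ b_true + 0.01 * rng.normal(size=50)

U, T = np.linalg.qr(X)       # reduced QR: U is 50x3 with orthonormal columns, T is 3x3 upper triangular
a = U.T @ y                  # regression of y on the orthonormal columns
b = np.linalg.solve(T, a)    # recover the coefficients on X from T b = a

# Reference solution from a direct least-squares fit on X.
b_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)
```

Both routes give the same coefficients; the QR route only ever solves a triangular system, which is the practical appeal of orthogonalizing first.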
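Solution 1: Forward stepwise regression
Perform feature selection and regression at the same time.
Let X be N x d. Suppose q columns (X1) have already been used in a QR decomposition, and regressing y on Q has produced q regression coefficients; the residual of the current model is then r = y - X1 \beta1.
There are d - q variables left. Question: which one is best to orthogonalize next?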
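The slides leave the selection question open. One common answer, sketched here as an assumption rather than as the slides' intended rule, is to pick the remaining column whose orthogonalized version reduces the residual sum of squares the most. The function name and the synthetic data are hypothetical.

```python
import numpy as np

def forward_stepwise(X, y, n_steps):
    """Greedy selection: at each step add the column that most reduces the residual."""
    N, d = X.shape
    selected = []
    Q = np.zeros((N, 0))                        # orthonormal basis of the chosen columns
    r = y.astype(float).copy()                  # current residual
    for _ in range(n_steps):
        best_j, best_gain, best_z = None, 0.0, None
        for j in range(d):
            if j in selected:
                continue
            z = X[:, j] - Q @ (Q.T @ X[:, j])   # orthogonalize the candidate against Q
            norm = np.linalg.norm(z)
            if norm < 1e-12:
                continue                        # candidate is already explained by Q
            z = z / norm
            gain = (z @ r) ** 2                 # drop in residual sum of squares
            if gain > best_gain:
                best_j, best_gain, best_z = j, gain, z
        if best_j is None:
            break
        selected.append(best_j)
        Q = np.column_stack([Q, best_z])
        r = r - best_z * (best_z @ r)           # update the residual
    return selected, r

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = 3.0 * X[:, 2] - 2.0 * X[:, 4] + 0.01 * rng.normal(size=100)
selected, r = forward_stepwise(X, y, 2)
```

On this synthetic data the two informative columns are picked in order of their contribution, and the remaining residual is at the noise level.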
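Solution 2: Principal Component Regression
Question: how to find a d x d orthogonal transformation matrix P such that transforming X with it gives a target matrix T whose columns are mutually orthogonal (uncorrelated)?
I.e., T = XP, but only X is known; T and P are unknown.
So do PCA on X, let P = U (the principal directions), transform X to T, and then perform least squares regression over T.
Computing the principal components: are there other ways to estimate the principal components?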
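A minimal sketch of principal component regression follows, assuming synthetic data with one nearly collinear column and assuming the top k = 2 components are kept; neither choice comes from the slides. The principal directions are taken from an eigendecomposition of the sample covariance.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
X[:, 2] = X[:, 0] + 0.01 * rng.normal(size=200)    # nearly collinear column
y = X @ np.array([1.0, 2.0, 1.0]) + 0.1 * rng.normal(size=200)

Xc = X - X.mean(axis=0)                            # center the data
yc = y - y.mean()

C = Xc.T @ Xc / (len(Xc) - 1)                      # sample covariance
eigvals, P = np.linalg.eigh(C)                     # eigh returns ascending eigenvalues
order = np.argsort(eigvals)[::-1]                  # reorder to descending
eigvals, P = eigvals[order], P[:, order]

k = 2                                              # number of components kept (assumption)
T = Xc @ P[:, :k]                                  # scores: mutually orthogonal columns
gamma = (T.T @ yc) / np.sum(T ** 2, axis=0)        # one simple regression per component
beta_pcr = P[:, :k] @ gamma                        # coefficients mapped back to X's columns
```

Because the columns of T are orthogonal, each coefficient in `gamma` is an independent one-variable regression, and dropping the tiny third component removes the unstable direction created by the collinear column.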
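What's a covariance matrix?
Variance: the spread of a distribution along one direction.
Covariance: the joint spread along two different directions. For samples in a d-dimensional space there are d(d-1)/2 distinct joint spreads (plus the d variances on the diagonal), which together form a symmetric matrix.
For data X, the covariance matrix is C = E[(X - EX)'(X - EX)]. Its eigenvectors with the largest and smallest eigenvalues point in the directions of most and least information, respectively.
Intuition (figure: scatter plot with red and blue regions):
- The covariance is the net size of the red part minus the blue part.
- It depends on the scale of the x and y axes.
- It is easily affected by outliers: sensitive to outliers and to points near the edges, insensitive to points in the middle.
- Correlation: when most points line up along an upward direction, red dominates and the correlation is large.
Note the two distinct cases: one uses the inverse covariance; the other uses the covariance directly (image processing).
Geometric meaning of the covariance matrix: take v = (1,3)', compute v' = C v, then C v'.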
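The geometric picture (multiplying a vector by C again and again) can be reproduced numerically. The sketch below uses assumed synthetic data stretched along the (1, 1) direction; repeatedly applying C to v = (1, 3)' turns it toward the eigenvector with the largest eigenvalue.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic 2-D data: large spread along (1, 1), small spread along (-1, 1).
z = rng.normal(size=(500, 2)) * np.array([3.0, 0.5])
R = np.array([[1.0, -1.0],
              [1.0,  1.0]]) / np.sqrt(2.0)
X = z @ R.T

Xc = X - X.mean(axis=0)
C = Xc.T @ Xc / (len(X) - 1)             # sample estimate of E[(X - EX)'(X - EX)]

eigvals, eigvecs = np.linalg.eigh(C)     # ascending eigenvalues
top = eigvecs[:, -1]                     # direction of largest variance

v = np.array([1.0, 3.0])
for _ in range(20):
    v = C @ v                            # v' = C v, then C v', ...
    v /= np.linalg.norm(v)               # rescale so only the direction matters
alignment = abs(v @ top)                 # approaches 1 when v is parallel to top
```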
Application: Harris corner detector
C. Harris, M. Stephens. "A Combined Corner and Edge Detector". 1988

The basic idea
- We should easily localize the point by looking through a small window.
- Shifting the window in any direction should give a large change in intensity.
- "Flat" region: no change when shifting the window in any direction.
- "Edge": no change when shifting the window along the edge direction.
- "Corner": significant change when shifting the window in any direction.
Question: how can this check be expressed as an optimization problem?
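Harris Detector: Mathematics
Window-averaged change of intensity induced by shifting the image data by [u,v]:
$$E(u,v) = \sum_{x,y} w(x,y)\,\bigl[I(x+u,\,y+v) - I(x,y)\bigr]^2$$
where w(x,y) is the window function, I(x+u, y+v) is the shifted intensity, and I(x,y) is the intensity. The window function w(x,y) is either 1 in the window and 0 outside, or a Gaussian.
Taylor series approximation to the shifted image:
$$E(u,v) \approx \sum_{x,y} w(x,y)\,\bigl[I(x,y) + u I_x + v I_y - I(x,y)\bigr]^2 = \sum_{x,y} w(x,y)\,\bigl[u I_x + v I_y\bigr]^2 = \sum_{x,y} w(x,y)\,\begin{pmatrix} u & v \end{pmatrix} \begin{pmatrix} I_x I_x & I_x I_y \\ I_x I_y & I_y I_y \end{pmatrix} \begin{pmatrix} u \\ v \end{pmatrix}$$
Expanding I(x,y) in a Taylor series, we have, for small shifts [u,v], a bilinear approximation:
$$E(u,v) \cong \begin{bmatrix} u & v \end{bmatrix} M \begin{bmatrix} u \\ v \end{bmatrix}$$
where M is a 2x2 matrix computed from image derivatives:
$$M = \sum_{x,y} w(x,y) \begin{bmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{bmatrix}$$
M is also called the "structure tensor".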
Intensity change in shifting window: eigenvalue analysis
$$E(u,v) \cong \begin{bmatrix} u & v \end{bmatrix} M \begin{bmatrix} u \\ v \end{bmatrix}$$
$\lambda_1, \lambda_2$: eigenvalues of M.
The ellipse E(u,v) = const is an iso-intensity contour of E(u,v). Its axis of length $(\lambda_{\max})^{-1/2}$ lies along the direction of the fastest change, and its axis of length $(\lambda_{\min})^{-1/2}$ lies along the direction of the slowest change.

Selecting good features (figures)
- $\lambda_1$ and $\lambda_2$ are large
- large $\lambda_1$, small $\lambda_2$
- small $\lambda_1$, small $\lambda_2$

Classification of image points using eigenvalues of M:
- "Corner": $\lambda_1$ and $\lambda_2$ are large, $\lambda_1 \sim \lambda_2$; E increases in all directions.
- "Edge": $\lambda_1 \gg \lambda_2$ or $\lambda_2 \gg \lambda_1$.
- "Flat" region: $\lambda_1$ and $\lambda_2$ are small; E is almost constant in all directions.

Measure of corner response:
$$R = \det M - k\,(\operatorname{trace} M)^2$$
$$\det M = \lambda_1 \lambda_2, \qquad \operatorname{trace} M = \lambda_1 + \lambda_2$$
(k is an empirical constant, k = 0.04-0.06)
This expression does not require computing the eigenvalues.
Corner response in the $(\lambda_1, \lambda_2)$ plane:
- R depends only on the eigenvalues of M.
- R is large for a corner (R > 0).
- R is negative with large magnitude for an edge (R < 0).
- |R| is small for a flat region.

Harris Detector: the algorithm
- Find points with a large corner response function R (R > threshold).
- Take the points of local maxima of R.

Harris Detector: workflow
1. Compute the corner response R.
2. Find points with a large corner response: R > threshold.
3. Take only the points of local maxima of R.

Harris Detector: summary
- The average intensity change in direction [u,v] can be expressed as a bilinear form: $E(u,v) \cong \begin{bmatrix} u & v \end{bmatrix} M \begin{bmatrix} u \\ v \end{bmatrix}$
- Describe a point in terms of the eigenvalues of M, via the measure of corner response: $R = \lambda_1 \lambda_2 - k\,(\lambda_1 + \lambda_2)^2$
- A good (corner) point should have a large intensity change in all directions, i.e., R should be large and positive.
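The corner response can be sketched with NumPy alone. This is an illustration under assumptions, not the original detector's implementation: derivatives come from `np.gradient`, the window function is the "1 in window, 0 outside" box (3x3 here), k = 0.05 is picked from the quoted range, and the test image is a synthetic bright square.

```python
import numpy as np

def box_sum(A, radius=1):
    """Sum A over a (2*radius+1)^2 window: w(x,y) = 1 in the window, 0 outside."""
    P = np.pad(A, radius, mode="constant")
    out = np.zeros_like(A, dtype=float)
    size = 2 * radius + 1
    for dy in range(size):
        for dx in range(size):
            out += P[dy:dy + A.shape[0], dx:dx + A.shape[1]]
    return out

def harris_response(image, k=0.05, radius=1):
    """R = det(M) - k * trace(M)^2 at every pixel, without computing eigenvalues."""
    Iy, Ix = np.gradient(image.astype(float))   # derivatives along rows (y) and columns (x)
    Sxx = box_sum(Ix * Ix, radius)              # entries of the structure tensor M
    Syy = box_sum(Iy * Iy, radius)
    Sxy = box_sum(Ix * Iy, radius)
    det = Sxx * Syy - Sxy ** 2
    trace = Sxx + Syy
    return det - k * trace ** 2

img = np.zeros((20, 20))
img[10:, 10:] = 1.0                             # bright square with a corner near (10, 10)
R = harris_response(img)
corner = tuple(int(i) for i in np.unravel_index(np.argmax(R), R.shape))
```

In this toy image the response is largest at the corner of the square, negative along its edges, and zero in the flat regions, matching the classification above. A full detector would additionally threshold R and keep only local maxima.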
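Nonlinear Iterative Partial Least Squares (NIPALS)
Idea: if we knew the projection t of the samples x_j in the feature space, then finding the principal component projection direction p would simply be a regression. But we don't.
However, on the one hand we know that p is an eigenvector of the sample covariance matrix, and on the other hand it satisfies the reconstruction constraint; from these we can derive a functional relation for estimating p from t.
This gives the NIPALS algorithm, repeated until convergence:
1. Regress (explain) X on the projection coordinates t to get the projection vector p, and normalize it.
2. Re-estimate the projection t of X onto p.
This yields the first principal component p and its corresponding t; the remaining components are obtained by continuing to fit what is left.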
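The two alternating steps can be sketched as below. The update formulas p = X't / (t't) and t = Xp are the usual NIPALS form and are an assumption here, since the slides' own equations did not survive; the initial guess for t, the stopping rule, and the deflation step (subtracting the explained part before fitting the next component) are likewise illustrative choices.

```python
import numpy as np

def nipals_pca(X, n_components, n_iter=500, tol=1e-12):
    """Estimate principal directions P and scores T one component at a time."""
    R = X - X.mean(axis=0)                    # residual matrix, starts as centered X
    P, T = [], []
    for _ in range(n_components):
        t = R[:, np.argmax(R.var(axis=0))].copy()   # initial guess for the scores
        for _ in range(n_iter):
            p = R.T @ t / (t @ t)             # regress the columns of R on t
            p /= np.linalg.norm(p)            # normalize the direction
            t_new = R @ p                     # re-estimate the projection onto p
            converged = np.linalg.norm(t_new - t) <= tol * np.linalg.norm(t_new)
            t = t_new
            if converged:
                break
        R = R - np.outer(t, p)                # deflate: keep what is still unexplained
        P.append(p)
        T.append(t)
    return np.column_stack(P), np.column_stack(T)

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3)) * np.array([3.0, 1.0, 0.3])
P, T = nipals_pca(X, 2)
```

Each inner loop is a power iteration in disguise, so the directions it finds should agree (up to sign) with the leading eigenvectors of the sample covariance matrix.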
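Partial Least Squares
How to relate two modalities? Find projection matrices W and C such that the two related modalities are maximally correlated in a common feature space. (Figure: N iterations on faces in two different poses.)
Idea: to strengthen the correlation in the projected space, regress each modality's projection vector on the other modality's projection in the feature space:
1. Regress (explain) X on Y's projection u to get X's projection vector p.
2. Project X with p (feature extraction) to get the feature-space coordinates t.
3. Use X's projection t to reconstruct Y's projection vector q.
4. Project Y with q to get the feature-space coordinates u.
When the iterations are finished, the features of X and Y are T and U, respectively. Then regress U on T: U = TD, and substitute into the generative model of Y: Y = UQ' = TDQ' = XPDQ'.
That is, the goal of PLS is to learn the latent factors that make the two modalities X and Y most correlated, revealing the statistical regularity behind two different observations with similar semantics.
PLS is in fact a way of estimating a feature space in a supervised manner, usable for regression, classification, dimensionality reduction, and feature learning:
- Compared with traditional least squares regression, PLS puts more weight on learning the features of Y and X jointly, whereas traditional LS only considers feature relations in the X space.
- Compared with principal component analysis (PCA): PCA is unsupervised, while PLS uses supervision to learn a more discriminative feature space.
But why is it called "partial"? - es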
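The four alternating steps can be sketched for a single latent factor as follows. This is a hedged illustration: the normalization of p and q, the initial guess for u, the stopping rule, and the synthetic two-view data are assumptions, and only the first factor pair is computed (no deflation, no final regression U = TD).

```python
import numpy as np

def pls_first_component(X, Y, n_iter=500, tol=1e-12):
    """One latent factor pair (t, u) via the alternating regressions described above."""
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    u = Y[:, 0].copy()                     # initial guess for Y's scores
    for _ in range(n_iter):
        p = X.T @ u / (u @ u)              # step 1: regress X on u
        p /= np.linalg.norm(p)
        t = X @ p                          # step 2: project X with p
        q = Y.T @ t / (t @ t)              # step 3: regress Y on t
        q /= np.linalg.norm(q)
        u_new = Y @ q                      # step 4: project Y with q
        converged = np.linalg.norm(u_new - u) <= tol * np.linalg.norm(u_new)
        u = u_new
        if converged:
            break
    return p, q, t, u

rng = np.random.default_rng(0)
z = rng.normal(size=200)                   # shared latent factor behind both views
X = np.outer(z, [1.0, 2.0, 0.0]) + 0.1 * rng.normal(size=(200, 3))
Y = np.outer(z, [1.0, -1.0]) + 0.1 * rng.normal(size=(200, 2))
p, q, t, u = pls_first_component(X, Y)
score_correlation = np.corrcoef(t, u)[0, 1]
```

With this normalization the loop is a power iteration on the cross-product X'Y, so p should match the leading left singular vector of X'Y, and the two score vectors t and u come out strongly correlated when both views share a latent factor.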