Sparse Coding: Autoencoder Interpretation

=== Good initialization of $s$ ===

Another important trick in obtaining faster and better convergence is good initialization of the feature matrix $s$ before using gradient descent (or other methods) to optimize the objective function for $s$ given $A$. In practice, initializing $s$ randomly at each iteration can result in poor convergence unless a good optimum is found for $s$ before moving on to optimize for $A$. A better way to initialize $s$ is the following:

1. Set $s \leftarrow W^Tx$ (where $x$ is the matrix representation of the patches in the mini-batch)
2. Normalize $s$: divide each feature in $s$ (i.e. each row of $s$) by the norm of the corresponding basis vector in $A$. That is, if $s_{r, c}$ denotes the $r$-th feature of the $c$-th example, and $A_r$ denotes the $r$-th basis vector in $A$, then set $s_{r, c} \leftarrow \frac{ s_{r, c} } { \lVert A_r \rVert }.$

Very roughly and informally speaking, this initialization helps because the first step is an attempt to find a good $s$ such that $Ws \approx x$, and the second step "normalizes" $s$ in an attempt to keep the sparsity penalty small. It turns out that initializing $s$ using only one but not both steps results in poor performance in practice. ([[TODO]]: a better explanation for why this initialization helps?)
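The two initialization steps above can be sketched in NumPy roughly as follows. This is a minimal illustration, not the tutorial's own code: the shapes and names (`A`, `x`, `num_features`, etc.) are assumptions, and it takes $W$ to be the basis matrix $A$ itself, as suggested by the $Ws \approx x$ remark above.

```python
import numpy as np

# Assumed shapes (hypothetical, for illustration only):
#   A: (input_dim, num_features)  -- basis matrix, column A_r is the r-th basis vector
#   x: (input_dim, num_examples)  -- mini-batch of patches, one column per patch
rng = np.random.default_rng(0)
input_dim, num_features, num_examples = 64, 25, 100
A = rng.standard_normal((input_dim, num_features))
x = rng.standard_normal((input_dim, num_examples))

# Step 1: s <- W^T x, taking W = A (project each patch onto the basis vectors)
s = A.T @ x                              # shape: (num_features, num_examples)

# Step 2: divide each feature (row r of s) by the norm of the r-th basis vector A_r
basis_norms = np.linalg.norm(A, axis=0)  # ||A_r|| for each of the num_features columns
s = s / basis_norms[:, None]

print(s.shape)  # (25, 100)
```

Row $r$ of `s` now holds the $r$-th feature across all examples, scaled down when the corresponding basis vector is large, which is what keeps the sparsity penalty from starting out large.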