Sparse Coding (稀疏编码)
From Ufldl
Line 19: | Line 19: | ||
\mathbf{x} = \sum_{i=1}^k a_i \mathbf{\phi}_{i} | \mathbf{x} = \sum_{i=1}^k a_i \mathbf{\phi}_{i} | ||
\end{align}</math> | \end{align}</math> | ||
- | While techniques such as Principal Component Analysis (PCA) allow us to learn a complete set of basis vectors efficiently, we wish to learn an '''over-complete''' set of basis vectors to represent input vectors | + | While techniques such as Principal Component Analysis (PCA) allow us to learn a complete set of basis vectors efficiently, we wish to learn an '''over-complete''' set of basis vectors to represent input vectors <math>\mathbf{x}\in\mathbb{R}^n</math> (i.e. such that <math>k > n</math>). The advantage of having an over-complete basis is that our basis vectors are better able to capture structures and patterns inherent in the input data. However, with an over-complete basis, the coefficients <math>a_i</math> are no longer uniquely determined by the input vector <math>\mathbf{x}</math>. Therefore, in sparse coding, we introduce the additional criterion of '''sparsity''' to resolve the degeneracy introduced by over-completeness. |
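+ | [Initial translation] Because Principal Component Analysis allows a complete set of basis vectors to be learned efficiently, we hope to learn an over-complete set of basis vectors to represent the input vector <math>\mathbf{x}\in\mathbb{R}^n</math> (i.e., such that <math>k > n</math>). The advantage of an over-complete basis is that its basis vectors can better capture the structures and patterns inherent in the input data. However, with an over-complete basis, the coefficients <math>a_i</math> are no longer uniquely determined by the input vector <math>\mathbf{x}</math>. Therefore, sparse coding introduces the additional criterion of sparsity to resolve the degeneracy caused by introducing an over-complete basis. | ||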
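+ | [First review] Although techniques such as Principal Component Analysis (PCA) let us conveniently find a "complete" set of basis vectors, what we want here is an "over-complete" set of basis vectors to represent the input vector <math>\mathbf{x}\in\mathbb{R}^n</math> (that is, <math>k > n</math>). The benefit of an over-complete basis is that it can capture the structures and patterns of the input data more effectively. However, with an over-complete basis, the coefficients <math>a_i</math> are no longer uniquely determined by the input vector <math>\mathbf{x}</math>. Therefore, in the sparse coding algorithm, we add a further criterion, "sparsity", to resolve the degeneracy caused by over-completeness. |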
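
The non-uniqueness of the coefficients can be seen in a small example. The following is a minimal Python sketch and is not part of the original page: the three basis vectors in `PHI`, the input `x`, the helper names `reconstruct`, `l1_norm`, and `family`, and the use of the L1 norm as a stand-in for "sparsity" are all assumptions chosen for illustration.

```python
# Over-complete basis in R^2: k = 3 basis vectors for n = 2 dimensions.
# These particular vectors are a hypothetical choice for illustration.
PHI = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]


def reconstruct(coeffs, basis=PHI):
    """Return x = sum_i a_i * phi_i as a tuple of length n."""
    n = len(basis[0])
    return tuple(
        sum(a * phi[d] for a, phi in zip(coeffs, basis)) for d in range(n)
    )


def l1_norm(coeffs):
    """Sum of absolute coefficient values (assumed sparsity measure)."""
    return sum(abs(a) for a in coeffs)


def family(t):
    """Coefficients (1 - t, 1 - t, t): each one reconstructs x = (1, 1)."""
    return (1.0 - t, 1.0 - t, t)


x = (1.0, 1.0)

# Three different coefficient vectors, all of which represent the same x.
candidates = [family(t) for t in (0.0, 0.5, 1.0)]

for a in candidates:
    print(a, "->", reconstruct(a), "L1 norm:", l1_norm(a))

# Among these candidates, the sparsity criterion selects a single one.
sparsest = min(candidates, key=l1_norm)
print("sparsest candidate:", sparsest)
```

Because `k > n`, every value of `t` gives a valid representation of `x`, so the reconstruction constraint alone cannot pick one. Ranking the candidates by a sparsity measure removes that ambiguity here, choosing the representation that uses only the third basis vector.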