Sparse Coding (稀疏编码)

'''Sparse Coding'''

Draft translation: @寅莹

First review: @大黄蜂的思索

Sparse coding is a class of unsupervised methods for learning sets of over-complete bases to represent data efficiently. The aim of sparse coding is to find a set of basis vectors <math>\mathbf{\phi}_i</math> such that we can represent an input vector <math>\mathbf{x}</math> as a linear combination of these basis vectors: 
:<math>\begin{align}
\mathbf{x} = \sum_{i=1}^k a_i \mathbf{\phi}_{i} 
\end{align}</math>
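[Draft translation] Sparse coding is a class of unsupervised methods that learn sets of over-complete bases in order to represent data efficiently. The aim of sparse coding is to find a set of basis vectors <math>\mathbf{\phi}_i</math> such that the input vector <math>\mathbf{x}</math> can be represented as a linear combination of these basis vectors: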
:<math>\begin{align}
\mathbf{x} = \sum_{i=1}^k a_i \mathbf{\phi}_{i} 
\end{align}</math>
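[First review] Sparse coding is an unsupervised learning method that seeks a set of "over-complete" basis vectors with which to represent sample data more efficiently. Its aim is to find a set of basis vectors <math>\mathbf{\phi}_i</math> such that we can represent an input vector <math>\mathbf{x}</math> as a linear combination of these basis vectors: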
:<math>\begin{align}
\mathbf{x} = \sum_{i=1}^k a_i \mathbf{\phi}_{i} 
\end{align}</math>
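The linear combination above can be illustrated with a minimal sketch. The function name <code>reconstruct</code>, the three basis vectors, and the coefficients are hypothetical values chosen for illustration; they do not come from the original text.

```python
def reconstruct(coeffs, basis):
    """Return x = sum_i a_i * phi_i for coefficients a_i and basis vectors phi_i."""
    n = len(basis[0])
    x = [0.0] * n
    for a_i, phi_i in zip(coeffs, basis):
        for j in range(n):
            x[j] += a_i * phi_i[j]
    return x


# Hypothetical basis (an assumption for illustration): k = 3 vectors in R^2.
phi = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# Hypothetical coefficients a_1, a_2, a_3.
a = [2.0, 0.0, 1.0]

x = reconstruct(a, phi)
print(x)  # [3.0, 1.0]
```

Here <math>k = 3</math> basis vectors are combined into a vector in <math>\mathbb{R}^2</math>, so the same helper works whether or not the basis has more elements than the input has dimensions.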
While techniques such as Principal Component Analysis (PCA) allow us to learn a complete set of basis vectors efficiently, we wish to learn an '''over-complete''' set of basis vectors to represent input vectors <math>\mathbf{x}\in\mathbb{R}^n</math> (i.e. such that <math>k > n</math>). The advantage of having an over-complete basis is that our basis vectors are better able to capture structures and patterns inherent in the input data. However, with an over-complete basis, the coefficients <math>a_i</math> are no longer uniquely determined by the input vector <math>\mathbf{x}</math>. Therefore, in sparse coding, we introduce the additional criterion of '''sparsity''' to resolve the degeneracy introduced by over-completeness. 
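[Draft translation] While techniques such as principal component analysis allow a complete set of basis vectors to be learned efficiently, we wish to learn an over-complete set of basis vectors to represent input vectors <math>\mathbf{x}\in\mathbb{R}^n</math> (i.e., <math>k > n</math>). The advantage of an over-complete basis is that its basis vectors can better capture the structures and patterns inherent in the input data. However, with an over-complete basis, the coefficients <math>a_i</math> are no longer uniquely determined by the input vector <math>\mathbf{x}</math>. Therefore, sparse coding introduces the additional criterion of sparsity to resolve the degeneracy caused by over-completeness.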
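[First review] Although techniques such as Principal Component Analysis (PCA) let us conveniently find a "complete" set of basis vectors, here we want to find an "over-complete" set of basis vectors to represent input vectors <math>\mathbf{x}\in\mathbb{R}^n</math> (that is, <math>k > n</math>). The benefit of an over-complete basis is that it can more effectively uncover the structures and patterns in the input data. However, with an over-complete basis, the coefficients <math>a_i</math> are no longer uniquely determined by the input vector <math>\mathbf{x}</math>. Therefore, in sparse coding we add a further criterion, "sparsity", to resolve the degeneracy caused by over-completeness.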
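The non-uniqueness described above can be seen in a small sketch. The basis, the input vector, and the candidate coefficient lists are hypothetical. Counting nonzero coefficients as the measure of sparsity is an assumption made here for illustration; the text above names the sparsity criterion but does not define it.

```python
def reconstruct(coeffs, basis):
    """Return x = sum_i a_i * phi_i."""
    n = len(basis[0])
    return [sum(a_i * phi_i[j] for a_i, phi_i in zip(coeffs, basis))
            for j in range(n)]


def num_nonzero(coeffs, tol=1e-12):
    """Count coefficients whose magnitude exceeds tol (assumed sparsity measure)."""
    return sum(1 for a_i in coeffs if abs(a_i) > tol)


# Hypothetical over-complete basis: k = 3 vectors in R^2, so k > n.
phi = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
x = [1.0, 1.0]

# Three different coefficient lists that all reproduce the same x exactly.
candidates = [
    [1.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [0.5, 0.5, 0.5],
]
for a in candidates:
    assert reconstruct(a, phi) == x

# Under the assumed criterion, prefer the candidate with the fewest nonzeros.
sparsest = min(candidates, key=num_nonzero)
print(sparsest)  # [0.0, 0.0, 1.0]
```

With only the first two basis vectors (a complete basis for <math>\mathbb{R}^2</math>), the coefficients of <math>\mathbf{x}</math> would be unique. Adding the third vector makes the system underdetermined, and a preference for few nonzero coefficients is one way to pick a single representation.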