
Tuesday, 20 January 2015

Application of linear algebra: searching a database

Basic idea:
\large \cos\theta=\frac{\vec{a} \cdot \vec{b}}{|\vec{a}||\vec{b}|}=\vec{x}^T\vec{y}
\vec{x},\vec{y}: unit vectors along \vec{a} and \vec{b}, respectively
\theta: angle between \vec{a} and \vec{b} (equivalently, between \vec{x} and \vec{y}).
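
As a quick check of the formula, here is a worked example with two illustrative vectors (chosen for this example, not taken from any real database):
\vec{a}=(1,1,0)^T, \quad \vec{b}=(1,0,0)^T
\vec{a} \cdot \vec{b}=1, \quad |\vec{a}|=\sqrt{2}, \quad |\vec{b}|=1
\large \cos\theta=\frac{1}{\sqrt{2}\cdot 1}\approx 0.707, \quad \theta=45^\circ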

database matrix: each column is a document vector scaled to unit length, \large \vec{x}=\frac{\vec{a}}{|\vec{a}|}
search vector: the query vector scaled to unit length, \large \vec{y}=\frac{\vec{b}}{|\vec{b}|}

If \cos\theta = 0, then \theta=90^\circ: the document contains none of the search words, and the corresponding column vector of the database matrix is orthogonal to the search vector.

If \cos\theta is close to 1, then \theta \approx 0, and the document corresponding to that column vector best matches our search criteria.
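
The two cases above can be tried out with a minimal Python sketch. It assumes each document is stored as a vector of word counts over a small vocabulary; the vocabulary, the documents, and the helper names unit and cosine_scores are all made up for illustration.

```python
import math

def unit(v):
    """Scale v to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v] if norm > 0 else list(v)

def cosine_scores(documents, query):
    """Return cos(theta) between the query and each document vector."""
    y = unit(query)
    return [sum(xi * yi for xi, yi in zip(unit(doc), y)) for doc in documents]

# Hypothetical vocabulary: ["algebra", "matrix", "vector", "calculus"]
documents = [
    [1, 1, 0, 0],  # doc 0 mentions "algebra" and "matrix"
    [0, 0, 0, 1],  # doc 1 mentions only "calculus"
    [1, 1, 1, 0],  # doc 2 mentions "algebra", "matrix" and "vector"
]
query = [1, 1, 0, 0]  # search for "algebra" and "matrix"

scores = cosine_scores(documents, query)
# The best match is the document whose cosine is largest.
best = max(range(len(documents)), key=lambda j: scores[j])
print(scores, best)
```

Here doc 1 shares no words with the query, so its score is 0 (orthogonal), while doc 0 has the same direction as the query and scores approximately 1.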

More to explore:
Latent Semantic Indexing (LSI)
Singular value decomposition
Covariance
Least squares problem
