Tuesday, 20 January 2015

Application of linear algebra: searching a database

Basic idea:
$\large \cos\theta=\frac{\vec{a} \cdot \vec{b}}{|\vec{a}||\vec{b}|}=\vec{x}^T\vec{y}$
$\vec{x},\vec{y}$: unit vectors in the directions of $\vec{a}$ and $\vec{b}$ respectively
$\theta$: angle between $\vec{x}$ and $\vec{y}$.

column of the database matrix (one per document): $\large \vec{x}=\frac{\vec{a}}{|\vec{a}|}$
search vector: $\large \vec{y}=\frac{\vec{b}}{|\vec{b}|}$

If $\cos\theta = 0$, that is $\theta=90^\circ$, the corresponding column vector of the database matrix is orthogonal to the search vector, so the document does not contain any of the search words.

If $\cos\theta$ is close to $1$, that is $\theta \approx 0$, the document corresponding to that column vector is the best match for our search criteria.
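
The two cases above can be tried out with a small sketch in plain Python. The keyword list, the three document vectors and the helper names `normalize` and `cosine_scores` are made up for illustration; the entries are assumed to be non-negative word counts, so every $\cos\theta$ lies between $0$ and $1$.

```python
import math

def normalize(v):
    """Scale v to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v] if norm else list(v)

def cosine_scores(columns, query):
    """Return cos(theta) = x^T y for each document column x and the query y."""
    y = normalize(query)
    return [sum(xi * yi for xi, yi in zip(normalize(col), y)) for col in columns]

# Hypothetical keyword order: ["algebra", "matrix", "vector", "poetry"]
documents = [
    [2, 1, 1, 0],  # document 0: shares two words with the query
    [0, 0, 0, 3],  # document 1: shares no words with the query
    [1, 1, 0, 0],  # document 2: same direction as the query
]
query = [1, 1, 0, 0]  # search for "algebra" and "matrix"

scores = cosine_scores(documents, query)
# Index of the document whose column makes the smallest angle with the query
best = max(range(len(scores)), key=scores.__getitem__)
```

Here document 1 is orthogonal to the search vector and scores $0$, while document 2 points in the same direction as the search vector and is picked as the best match.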

More to explore:
Latent Semantic Indexing (LSI)
Singular value decomposition
Covariance
Least squares problem
