
Ahsan Ijaz







Bag of words model

Sample sentence with matrix:
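
As a minimal sketch of the bag of words idea, the snippet below turns sentences into word-count vectors. The two sample sentences, the whitespace tokenizer, and the helper name `bag_of_words` are illustrative assumptions, not taken from the original slides.

```python
from collections import Counter


def bag_of_words(docs):
    """Return (vocabulary, count matrix) for a list of documents.

    Assumption: tokens are lowercase, whitespace-separated words.
    Word order is discarded; only the counts are kept.
    """
    tokenized = [doc.lower().split() for doc in docs]
    # Sorted vocabulary gives every word a fixed column index.
    vocab = sorted({word for tokens in tokenized for word in tokens})
    matrix = []
    for tokens in tokenized:
        counts = Counter(tokens)
        # Counter returns 0 for words that do not occur in this document.
        matrix.append([counts[word] for word in vocab])
    return vocab, matrix


# Hypothetical sample sentences, one row of the matrix per sentence.
vocab, matrix = bag_of_words(["the cat sat on the mat", "the dog sat"])
```

Each row of `matrix` is one document and each column is one vocabulary word, so documents of different lengths map to vectors of the same dimension.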

Emphasizes important words
Appears rarely in corpus (rare globally)
\[ \textit{Inverse doc frequency} = \log\frac{\#\textit{docs}}{1 + \#\textit{docs using the word}} \]
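
The inverse document frequency formula above could be computed roughly as follows. This is a sketch under stated assumptions: the helper name `inverse_document_frequency`, the whitespace tokenizer, and the three-document corpus are all hypothetical.

```python
import math


def inverse_document_frequency(docs):
    """Map each word to log(#docs / (1 + #docs using the word)).

    Assumption: tokens are lowercase, whitespace-separated words.
    """
    n_docs = len(docs)
    # A set per document, so a word counts once per document.
    doc_sets = [set(doc.lower().split()) for doc in docs]
    vocab = set().union(*doc_sets)
    idf = {}
    for word in vocab:
        docs_using_word = sum(word in words for words in doc_sets)
        idf[word] = math.log(n_docs / (1 + docs_using_word))
    return idf


# Hypothetical corpus: "the" is in every document, "cat" in only one.
idf = inverse_document_frequency(["the cat sat", "the dog sat", "the bird flew"])
```

With the `1 +` term in the denominator, a word that appears in every document gets a slightly negative weight, while rarer words get larger weights.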

Notion of closeness?
\[ \textit{distance}(x_i,x_q) = |x_i - x_q| \]

\[ \textit{distance}(x_i,x_q) = \sqrt{a_1(x_i[1]-x_q[1])^2+\ldots+a_d(x_i[d]-x_q[d])^2} \]
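
A short sketch of the scaled Euclidean distance above, where the weights `a_1, ..., a_d` control how much each feature contributes. The function name `scaled_euclidean` and the example vectors are assumptions made for illustration.

```python
import math


def scaled_euclidean(x_i, x_q, weights):
    """sqrt(sum_j a_j * (x_i[j] - x_q[j])**2) for equal-length sequences."""
    if not (len(x_i) == len(x_q) == len(weights)):
        raise ValueError("x_i, x_q and weights must have the same length")
    return math.sqrt(
        sum(a * (u - v) ** 2 for a, u, v in zip(weights, x_i, x_q))
    )


# Equal weights reduce to the ordinary Euclidean distance.
d_plain = scaled_euclidean([0, 0], [3, 4], [1, 1])
# A zero weight makes the distance ignore that feature entirely.
d_first_only = scaled_euclidean([0, 0], [3, 4], [1, 0])
```

Setting a weight to zero is one way to drop an uninformative feature without changing the vectors themselves.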


\[ \textit{similarity}(x_i,x_q) = \frac{x_i^{T}x_q}{\|x_i\|\,\|x_q\|} \]
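
The normalized inner product above is the cosine similarity of the two vectors. A minimal sketch follows, assuming plain Python lists as vectors; the function name `cosine_similarity` and the example inputs are hypothetical.

```python
import math


def cosine_similarity(x_i, x_q):
    """Inner product of x_i and x_q divided by the product of their norms."""
    if len(x_i) != len(x_q):
        raise ValueError("vectors must have the same length")
    dot = sum(u * v for u, v in zip(x_i, x_q))
    norm_i = math.sqrt(sum(u * u for u in x_i))
    norm_q = math.sqrt(sum(v * v for v in x_q))
    if norm_i == 0 or norm_q == 0:
        # The ratio is undefined when either vector is all zeros.
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_i * norm_q)


# Vectors pointing the same way score about 1 regardless of length.
same_direction = cosine_similarity([1, 2, 3], [2, 4, 6])
# Vectors with no shared nonzero feature score 0.
no_overlap = cosine_similarity([1, 0], [0, 1])
```

Because of the normalization, a long document and a short one with the same word proportions are treated as nearly identical, unlike with the unnormalized distances.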


