Support Vectors

Another way to avoid catastrophic forgetting is the use of Support Vectors (SVs). Support Vectors are the patterns that lie on the boundary between classes; they can be seen as the representatives of the classes, and together they define the decision boundaries of a classifier.

Let $X_t = (x_{1,t}, x_{2,t},...,x_{n,t})$ be the input pattern at time $t$ and $T_t$ the related target at time $t$. A model $M$ is trained with a set of pairs $(\overline{X}, \overline{T}) = E$. Since the model depends on the training patterns $E$ and on other parameters $\lambda$ (such as learning rate, momentum,...), we refer to the model as $M(E,\lambda)$. Support vectors are the patterns for which the model response is most uncertain. A good indicator of uncertainty is the entropy:

\begin{displaymath}
H = - \sum_{i=1}^{N} P_i \log(P_i)
\end{displaymath} (10)

where $P_i$ in (10) is the posterior probability of class $i$ estimated by the model $M$ and $N$ is the number of outputs of the model. If one output is $1$ and all the others are $0$, the entropy is minimal, $H=0$. Conversely, the entropy is maximal, $H=\log(N)$, when all the outputs are equal to $1/N$: in this case the model $M$ has the highest uncertainty. To make the entropy independent of the length of the output vector, a normalized entropy is used:
\begin{displaymath}
H'=\frac{-\sum_{i=1}^{N}P_i \log(P_i)}{\log(N)}
\end{displaymath} (11)
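
A concrete illustration may help. The following Python sketch (a minimal example, not taken from this work; the threshold $0.5$ and the function names are assumptions made purely for illustration) computes the normalized entropy of (11) over a batch of posterior vectors and selects the most uncertain patterns as support vectors:

\begin{verbatim}
import numpy as np

def normalized_entropy(p, eps=1e-12):
    # H' = -sum_i P_i log(P_i) / log(N), Equation (11)
    p = np.clip(p, eps, 1.0)          # avoid log(0)
    return -np.sum(p * np.log(p)) / np.log(len(p))

def select_support_vectors(posteriors, threshold=0.5):
    # Keep the patterns whose normalized entropy exceeds the
    # (illustrative) threshold, i.e. the uncertain ones.
    scores = np.array([normalized_entropy(p) for p in posteriors])
    return np.where(scores > threshold)[0]

posteriors = np.array([
    [0.98, 0.01, 0.01],   # confident response, H' close to 0
    [0.34, 0.33, 0.33],   # uncertain response, H' close to 1
])
print(select_support_vectors(posteriors))   # -> [1]
\end{verbatim}

Patterns selected in this way are those on which the model is least decided, which matches the informal characterization of support vectors as the patterns lying on the decision boundaries.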
