
Variations on the model

As an extension of the basic model, we made two variations in the inference. First, we allow a single patch in the image to be mapped from a group of sub-patches located at different positions in the epitome. For instance, a $ 24 \times 24$ patch can correspond to a single $ 24 \times 24$ patch in the epitome, to four $ 12 \times 12$ sub-patches, or to nine $ 8 \times 8$ ones. In this way, we introduce variation in the patch sizes, so that the epitome captures patterns from both large and small areas.
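The grouping options above can be illustrated with a minimal sketch that splits a square patch into equal sub-patches. The function name `split_into_subpatches` and the list-of-rows patch representation are assumptions made for illustration; they are not part of the original model description.

```python
def split_into_subpatches(patch, parts_per_side):
    """Split a square patch (a list of rows) into parts_per_side**2
    equal square sub-patches, returned in row-major block order."""
    size = len(patch)
    if size % parts_per_side != 0:
        raise ValueError("patch size must be divisible by parts_per_side")
    sub = size // parts_per_side
    subpatches = []
    for block_row in range(parts_per_side):
        for block_col in range(parts_per_side):
            r0, c0 = block_row * sub, block_col * sub
            # Slice out one sub x sub block of the patch.
            subpatches.append([row[c0:c0 + sub] for row in patch[r0:r0 + sub]])
    return subpatches


# A 24 x 24 patch whose entries encode their own position (illustrative data).
patch = [[r * 24 + c for c in range(24)] for r in range(24)]

# One 24 x 24 patch, four 12 x 12 sub-patches, or nine 8 x 8 sub-patches.
for parts_per_side in (1, 2, 3):
    subs = split_into_subpatches(patch, parts_per_side)
    print(len(subs), "sub-patches of size", len(subs[0]), "x", len(subs[0][0]))
```

In this sketch each sub-patch would then be mapped to its own position in the epitome, which is what lets one image patch draw on several epitome locations.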

However, how the patches are mapped (beyond where they are mapped) becomes another set of hidden variables $ G=\{G_k\}_{k=1}^P$, where $ G_k$ denotes the grouping method used to form $ Z_k$ from the sub-patches in $ E$. The dependency graph is shown in Figure 2; the mapping and the grouping are independent given the epitome of the image.


Figure 2: Dependency graph of the extended epitome model

The joint distribution hence becomes


\begin{align}
& p(Z, T, G, e) \tag{15} \\
&= p(e)\, p(T \mid e)\, p(G \mid e)\, p(Z \mid T, G, e) \tag{16} \\
&= p(e) \prod_{k=1}^{P} p(T_k) \prod_{i \in S_k} N(Z_{i,k};\, \mu_{T_k(i)},\, \phi_{T_k(i)}) \tag{17}
\end{align}

To train the epitome, we modify the target function as


\begin{align}
Q(e, e^g) &= E[\log p(Z, T, G \mid e)] \tag{18} \\
&= \sum_{T} \sum_{G} \log p(Z, T, G \mid e)\, f(T, G \mid Z, e^g) \tag{19}
\end{align}

Instead of looping over both $ T$ and $ G$, we can simplify the problem by only considering the most likely grouping method for a given mapping. In other words,


\begin{align}
Q(e, e^g) &= E[\log p(Z, T, G^* \mid e)] \tag{20} \\
&= \sum_{T} \log p(Z, T, G^* \mid e)\, f(T, G^* \mid Z, e^g) \tag{21}
\end{align}

where

\begin{align}
G^* = \operatorname*{arg\,max}_{G}\; f(T, G \mid Z, e^g) \tag{22}
\end{align}

In this way, we can apply the same EM formulas. In the ``expectation'' step, however, we first compute $ G_k^*$ for each $ T_k$, then replace the posterior probability in Eq. (11) with $ f(T_k, G_k^* \vert Z_k, e^g)$.
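The modified expectation step can be sketched for a single patch as follows. This is a minimal illustration, not the authors' implementation: the input table `log_joint[t][g]`, standing for $\log p(Z_k, T_k=t, G_k=g \vert e^g)$, is a hypothetical precomputed quantity, and renormalizing over the mappings after fixing the grouping is an assumption about how the substituted posterior is used.

```python
import math


def hard_grouping_posterior(log_joint):
    """For one patch, pick the best grouping for every mapping, then
    normalize over mappings.

    log_joint[t][g] is assumed to hold log p(Z_k, T_k=t, G_k=g | e^g).
    For a fixed t, the argmax over g of the posterior equals the argmax
    of the joint, because the normalizer does not depend on g.
    Returns (best_g, posterior), each indexed by t.
    """
    best_g = [max(range(len(row)), key=row.__getitem__) for row in log_joint]
    best_scores = [row[g] for row, g in zip(log_joint, best_g)]
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(best_scores)
    weights = [math.exp(s - top) for s in best_scores]
    total = sum(weights)
    posterior = [w / total for w in weights]
    return best_g, posterior


# Two candidate mappings, two candidate groupings (illustrative numbers).
log_joint = [[-1.0, -3.0],
             [-2.0, -0.5]]
best_g, posterior = hard_grouping_posterior(log_joint)
print(best_g, posterior)
```

The cost per patch is still one pass over every (mapping, grouping) pair to find the maxima, but the maximization step then sums over mappings only, as in Eq. (21).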

The second variation is that we allow linear transformations of a patch, such as flips and rotations. We expect this to make the epitome more condensed, since patches with symmetric properties in the original image can be generated from the same patch in the epitome.

Again, this makes the inference harder, because it introduces the transformations $ R=\{R_k\}_{k=1}^P$ as additional hidden variables. We tackle this with the same simplification as in Eqs. (20) and (22): we compute the best transformation for every possible mapping, given the observed patch and the epitome parameters, then use that configuration in the posterior calculations and update the model parameters accordingly.
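The search for the best transformation of one candidate mapping might look like the sketch below. It assumes the transformation set is the eight flips and 90-degree rotations of a square patch and that the transformation is applied to the epitome means and variances before scoring the observed patch; the text only says "flips and rotations", so both choices, along with all function names, are illustrative assumptions.

```python
import math


def rotate90(p):
    """Rotate a square patch (list of rows) 90 degrees clockwise."""
    return [list(row) for row in zip(*p[::-1])]


def flip_lr(p):
    """Mirror a patch left to right."""
    return [row[::-1] for row in p]


def transforms(p):
    """Return the 8 flip/rotation variants of p in a fixed order:
    for each of the 4 rotations, the rotated patch and its mirror image."""
    out = []
    cur = [list(row) for row in p]
    for _ in range(4):
        out.append(cur)
        out.append(flip_lr(cur))
        cur = rotate90(cur)
    return out


def gaussian_loglik(z, mean, var):
    """Sum of per-pixel Gaussian log-densities log N(z; mean, var)."""
    total = 0.0
    for zr, mr, vr in zip(z, mean, var):
        for zi, mi, vi in zip(zr, mr, vr):
            total += -0.5 * (math.log(2 * math.pi * vi) + (zi - mi) ** 2 / vi)
    return total


def best_transform(z, mean, var):
    """Return (index, log-likelihood) of the transformation of the epitome
    patch (mean, var) that best explains the observed patch z."""
    scores = [gaussian_loglik(z, m, v)
              for m, v in zip(transforms(mean), transforms(var))]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]


# The observed patch is a rotated copy of the epitome mean (illustrative data).
mean = [[1.0, 2.0], [3.0, 4.0]]
var = [[1.0, 1.0], [1.0, 1.0]]
z = rotate90(mean)
print(best_transform(z, mean, var))
```

Only the winning transformation's likelihood would then enter the posterior for that mapping, mirroring the treatment of $ G_k^*$ above.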



