$Date: 2004/02/09 08:11:23 $
Random thoughts for the notes:
It's interesting how we swing back and forth between more constrained and less constrained models for MT. Yamada and Knight proposed a syntax-based translation model (TM) with a reorder operation, etc., which captures the groupings of words in translation. Then we realized that a simple tree is too constrained, so Gildea proposed a "loosely tree-based" model, which uses grouping and cloning to get around the fact that the Yamada and Knight model cannot produce some word orderings.
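To make the "cannot produce some word orderings" point concrete, here is a toy sketch. It is not the Yamada and Knight model itself (that model also has insertion and translation operations, and its reordering is probabilistic); it only assumes that reordering means permuting the children of each tree node. The function name `reachable_orders` and the three-word tree are made up for illustration.

```python
from itertools import permutations, product

def reachable_orders(tree):
    """Return every leaf sequence obtainable by permuting the children
    of each node. A leaf is a string; an internal node is a tuple of
    subtrees. (Hypothetical helper, assumes all words are distinct.)"""
    if isinstance(tree, str):
        return {(tree,)}
    # Orders reachable inside each child, computed independently.
    child_sets = [reachable_orders(child) for child in tree]
    orders = set()
    # Try every ordering of the children, then every combination of
    # internal orderings for the children in that arrangement.
    for perm in permutations(range(len(child_sets))):
        for combo in product(*(child_sets[i] for i in perm)):
            orders.add(tuple(word for part in combo for word in part))
    return orders

# Toy parse: subject, then a verb phrase grouping verb and object.
tree = ("he", ("saw", "her"))
orders = reachable_orders(tree)

for order in sorted(orders):
    print(" ".join(order))

# "saw" and "her" sit under one node, so "he" can never land between them.
print(("saw", "he", "her") in orders)
```

Of the six possible orderings of three words, only four are reachable here; the two that put "he" between "saw" and "her" are ruled out by the tree. That is the sort of gap a cloning operation is meant to fill.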
I wonder if there exists some "optimal" model with just the right amount of constraints and parameters to be statistically learned. I'm starting to think that different models will perform better for different pairs of languages. If so, that implies the design of statistical models (be it IBM, tree-based, phrase-based, or loosely tree-based) should still be motivated largely by linguistic knowledge.
Also, who is Alexander Calder? He invented the mobile; see here.
I won't give away the punchline, but it boils down to a tokenization problem.
Seems to be my week for thinking about tokenization problems.