Genetic Algorithms for Automatically Learning Factored Language Model Structure

GA-FLM Software

GA-FLM is a genetic algorithm-based tool for optimizing the structure of Factored Language Models (FLMs). It is used as an extension to the FLM programs in the SRI Language Modeling toolkit.

The program takes as input training and development text files, along with parameter files that specify the genetic algorithm settings and the type of factored language model the user wants. It then runs a standard genetic algorithm search over a population of factored language models, optimizing for development-set perplexity. GA-FLM is most useful when many factors are available for language modeling but a good factored language model is hard to find by hand.
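
The kind of search described above can be illustrated with a minimal Python sketch. This is not GA-FLM's own code: the bit-string encoding, tournament selection, one-point crossover, elitism, and all parameter names here are assumptions chosen for illustration, and toy_perplexity is a hypothetical stand-in for the expensive step of training a factored language model and measuring its perplexity on the development set.

```python
import random


def genetic_search(fitness, n_bits, pop_size=20, generations=30,
                   crossover_rate=0.9, mutation_rate=0.02, seed=0):
    """Minimize `fitness` over bit strings with a simple generational GA."""
    rng = random.Random(seed)
    # Start from a random population of candidate structures.
    population = [[rng.randint(0, 1) for _ in range(n_bits)]
                  for _ in range(pop_size)]
    best = min(population, key=fitness)

    def tournament():
        # Pick two distinct individuals at random and keep the fitter one.
        a, b = rng.sample(population, 2)
        return a if fitness(a) <= fitness(b) else b

    for _ in range(generations):
        # Elitism: the best individual so far always survives.
        next_population = [best[:]]
        while len(next_population) < pop_size:
            p1, p2 = tournament(), tournament()
            if n_bits > 1 and rng.random() < crossover_rate:
                cut = rng.randrange(1, n_bits)  # one-point crossover
                child = p1[:cut] + p2[cut:]
            else:
                child = p1[:]
            # Flip each bit independently with a small probability.
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child]
            next_population.append(child)
        population = next_population
        best = min(population, key=fitness)
    return best


def toy_perplexity(genes):
    # Hypothetical fitness: pretends one particular structure is ideal.
    # A real run would train an FLM and score it on development data.
    target = [1, 0, 1, 1, 0, 0, 1, 0]
    return 100.0 + 10.0 * sum(g != t for g, t in zip(genes, target))


best = genetic_search(toy_perplexity, n_bits=8)
print(best, toy_perplexity(best))
```

Because the best individual is carried over unchanged in every generation, the best score in this sketch never gets worse as the search proceeds. In a real setting each fitness call is costly, so caching the scores of structures that have already been evaluated would be a natural addition.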

Terms of Use

GA-FLM can be downloaded free of charge under the GNU General Public License. The current download is a beta version (v.0.1). For bugs, suggestions, or questions, please contact Kevin Duh (duh at ee dot [...]). The code is still under development, so I am more than happy to receive suggestions and questions.

Published research using GA-FLM may cite the following paper:


Last updated: May 05, 2005