Vatsal Sharan, Sham M. Kakade, Percy S. Liang, Gregory Valiant
We study the basic problem of learning overcomplete HMMs: those that have many hidden states but a small output alphabet. Despite their significant practical importance, such HMMs are poorly understood, with no known positive or negative results for efficient learning. In this paper, we present several new results, both positive and negative, which help define the boundaries between the tractable and intractable learning settings. We show positive results for a large subclass of HMMs whose transition matrices are sparse, well-conditioned, and have small probability mass on short cycles. On the negative side, we show that learning is impossible given only a polynomial number of samples for HMMs with a small output alphabet whose transition matrices are random regular graphs with large degree. Finally, we discuss these results in the context of learning HMMs that can capture long-term dependencies.