Part of Advances in Neural Information Processing Systems 21 (NIPS 2008)
Query expansion is a long-studied approach for improving retrieval effectiveness by enhancing the userâs original query with additional related terms. Current algorithms for automatic query expansion have been shown to consistently improve retrieval accuracy on average, but are highly unstable and have bad worst-case performance for individual queries. We introduce a novel risk framework that formulates query model estimation as a constrained metric labeling problem on a graph of term relations. Themodel combines assignment costs based on a baseline feedback algorithm, edge weights based on term similarity, and simple constraints to enforce aspect balance, aspect coverage, and term centrality. Results across multiple standard test collections show consistent and dramatic reductions in the number and magnitude of expansion failures, while retaining the strong positive gains of the baseline algorithm.