Daniel J. Hsu, Sham M. Kakade, John Langford, Tong Zhang
We consider multi-label prediction problems with large output spaces under the assumption of output sparsity – that the target (label) vectors have small support. We develop a general theory for a variant of the popular error correcting output code scheme, using ideas from compressed sensing for exploiting this sparsity. The method can be regarded as a simple reduction from multi-label regression problems to binary regression problems. We show that the number of subprob- lems need only be logarithmic in the total number of possible labels, making this approach radically more efﬁcient than others. We also state and prove robustness guarantees for this method in the form of regret transform bounds (in general), and also provide a more detailed analysis for the linear prediction setting.