Ling Huang, XuanLong Nguyen, Minos Garofalakis, Michael Jordan, Anthony Joseph, Nina Taft
We consider the problem of network anomaly detection in large distributed systems. In this setting, Principal Component Analysis (PCA) has been proposed as a method for discover- ing anomalies by continuously tracking the projection of the data onto a residual subspace. This method was shown to work well empirically in highly aggregated networks, that is, those with a limited number of large nodes and at coarse time scales. This approach, how- ever, has scalability limitations. To overcome these limitations, we develop a PCA-based anomaly detector in which adaptive local data (cid:2)lters send to a coordinator just enough data to enable accurate global detection. Our method is based on a stochastic matrix perturba- tion analysis that characterizes the tradeoff between the accuracy of anomaly detection and the amount of data communicated over the network.