Les Atlas, David Cohn, Richard Ladner
"Selective sampling" is a form of directed search that can greatly increase the ability of a connectionist network to generalize accu(cid:173) rately. Based on information from previous batches of samples, a network may be trained on data selectively sampled from regions in the domain that are unknown. This is realizable in cases when the distribution is known, or when the cost of drawing points from the target distribution is negligible compared to the cost of label(cid:173) ing them with the proper classification. The approach is justified by its applicability to the problem of training a network for power system security analysis. The benefits of selective sampling are studied analytically, and the results are confirmed experimentally.
1 Introduction: Random Sampling vs. Directed Search
A great deal of attention has been devoted to the problem of generalization based on random samples drawn from a distribution, frequently referred to as "learning from examples." Many natural learning systems, however, do not simply rely on this passive learning technique, but instead make use of at least some form of directed search to actively examine the problem domain. In many problems, directed search is provably more powerful than passively learning from randomly given examples.
Training Connectionist Networks with Queries and Selective Sampling
Typically, directed search consists of membership queries, where the learner asks for the classification of specific points in the domain. Directed search via membership queries may proceed simply by examining the information already given and determining a region of uncertainty, the area in the domain where the learner believes misclassification is still possible. The learner then asks for examples exclusively from that region.
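The loop described above can be sketched for a deliberately simple case. The following Python fragment is an illustration under stated assumptions, not the method developed in this paper: the domain is the interval [0, 1], the concept is a threshold (a point is positive when it lies at or above an unknown value t), and the names `selective_sample_threshold` and `oracle` are hypothetical. For this concept class the region of uncertainty is the open interval between the largest point known to be negative and the smallest point known to be positive.

```python
import random

def selective_sample_threshold(oracle, n_queries, rng):
    """Narrow down a 1-D threshold by querying only the region of uncertainty.

    `oracle(x)` is a hypothetical labeling function returning True when x is
    in the target concept (x >= t for some unknown t in (0, 1]).
    """
    lo, hi = 0.0, 1.0  # every threshold in (lo, hi] is still consistent
    for _ in range(n_queries):
        # Draw a point only where misclassification is still possible.
        x = rng.uniform(lo, hi)
        if oracle(x):
            hi = x  # x is positive, so the threshold is at most x
        else:
            lo = x  # x is negative, so the threshold is above x
    return lo, hi

# Usage: the target threshold 0.37 is an arbitrary illustrative value.
rng = random.Random(0)
lo, hi = selective_sample_threshold(lambda x: x >= 0.37, n_queries=20, rng=rng)
print(f"threshold lies in ({lo:.6f}, {hi:.6f}]")
```

Every query in this sketch lands inside the current region of uncertainty, so each label either raises the lower bound or lowers the upper bound. A passive learner drawing uniformly from [0, 1] would spend most of its labels on points whose classification is already determined.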
This paper discusses one form of directed search: selective sampling. In Section 2, we describe theoretical foundations of directed search and give a formal definition of selective sampling. In Section 3 we describe a neural network implementation of this technique, and we discuss the resulting improvements in generalization on a number of tasks in Section 4.
2 Learning and Selective Sampling
For some arbitrary domain, learning theory defines a concept as being some subset of points in the domain. For example, if our domain is ℝ², we might define a concept as being all points inside a region bounded by some particular rectangle. A concept class is simply the set of concepts in some description language.
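As a concrete illustration of these definitions, the following Python sketch treats a concept as a membership test on points of the plane and a concept class as a family of such tests. The names `Rectangle` and `rectangle_class`, the restriction to axis-aligned rectangles, and the finite grid of corner coordinates are assumptions made only for illustration; they are not part of the definitions above.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Rectangle:
    """One concept: the set of points inside an axis-aligned rectangle."""
    x_min: float
    x_max: float
    y_min: float
    y_max: float

    def contains(self, point):
        # Membership test: is the point in this subset of the plane?
        x, y = point
        return self.x_min <= x <= self.x_max and self.y_min <= y <= self.y_max

def rectangle_class(grid):
    """A small, finite concept class: every rectangle with corners on `grid`.

    The description language here is the 4-tuple of corner coordinates.
    """
    return [Rectangle(a, b, c, d)
            for a in grid for b in grid if a < b
            for c in grid for d in grid if c < d]

# Usage: one concept, and the class it belongs to.
unit_square = Rectangle(0.0, 1.0, 0.0, 1.0)
print(unit_square.contains((0.5, 0.5)))
print(len(rectangle_class([0.0, 1.0, 2.0])))
```

Representing each concept by its parameters keeps the distinction between a single concept (one rectangle) and the concept class (all rectangles expressible in the chosen description language) explicit.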
A concept class of particular interest for this paper is that defined by neural network architectures with a single output node. Architecture refers to the number and types of units in a network and their connectivity. The configuration of a network specifies the weights on the connections and the thresholds of the units.¹
A single-output architecture plus configuration can be seen as a specification of a concept classifier, in that it classifies the set of all points producing a network output above some threshold value. Similarly, an architecture may be seen as a specification of a concept class: the class consists of all concepts classified by configurations of the network that the learning rule can produce (Figure 1).
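The distinction between architecture and configuration can be made concrete with a minimal sketch. The Python fragment below fixes one architecture (two inputs, two hidden units, one output) and shows two configurations of it that classify different concepts. The sigmoid unit type, the dictionary layout of `config`, the function names, and the particular weight values are all assumptions chosen for illustration; the text above does not prescribe them.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def network_output(config, point):
    """Forward pass for a fixed architecture: inputs -> hidden layer -> one output.

    `config` holds the weights and thresholds; the layer structure is fixed
    by this function, which plays the role of the architecture.
    """
    hidden = [sigmoid(sum(w * x for w, x in zip(weights, point)) - theta)
              for weights, theta in config["hidden"]]
    out_weights, out_theta = config["output"]
    return sigmoid(sum(w * h for w, h in zip(out_weights, hidden)) - out_theta)

def classify(config, point, threshold=0.5):
    """The concept is the set of points whose output exceeds `threshold`."""
    return network_output(config, point) > threshold

# Two configurations of the same architecture (illustrative weights).
hidden_units = [((10.0, 0.0), 5.0), ((0.0, 10.0), 5.0)]
config_and = {"hidden": hidden_units, "output": ((10.0, 10.0), 15.0)}
config_or = {"hidden": hidden_units, "output": ((10.0, 10.0), 5.0)}

for p in [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]:
    print(p, classify(config_and, p), classify(config_or, p))
```

Each configuration picks out one concept, and the set of concepts reachable by varying `config` corresponds to the concept class of the architecture. Note that the sketch enumerates configurations by hand; it does not model the restriction to configurations that a particular learning rule can actually produce.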