Invariant recognition of feature combinations in the visual system.
Elliffe MCM, Rolls ET, Stringer SM.
The operation of a hierarchical competitive network model (VisNet) of invariance learning in the visual system is investigated to determine how this class of architecture can solve problems that require the spatial binding of features. First, we show that VisNet neurons can be trained to provide transform-invariant discriminative responses to stimuli that are composed of the same basic alphabet of features, where no single stimulus contains a unique feature not shared by any other stimulus. The investigation shows that the network can discriminate stimuli consisting of sets of features that are subsets or supersets of each other. Second, a key feature-binding issue we address is how invariant representations of low-order combinations of features in the early layers of the visual system can uniquely specify the correct spatial arrangement of features in the overall stimulus and so ensure correct stimulus identification in the output layer. We show that output layer neurons can learn new stimuli if the lower layers are trained solely through exposure to the simpler feature combinations from which the new stimuli are composed. Moreover, we show that once the network has been trained on the low-order feature combinations that are common to many objects, and then on a whole stimulus in some locations, it can generalise correctly to the same stimulus when it is shown in a new location. We conclude that this type of hierarchical model can solve feature-binding problems to produce correct invariant identification of whole stimuli.