Scale-invariant object categorization under eccentricity-dependent retinal resolution
Description
Feedforward hierarchical models of primate object recognition typically ignore the nonuniform resolution of the retina, in which the smallest perceptible spatial wavelength λ is roughly proportional to eccentricity from the fixation point. Models incorporating retinal resolution have so far been used only to explore limitations of primate vision in the periphery. Here we investigate the consequences of retinal resolution for the task that is the ventral visual pathway's 'core competency': recognition of a fixated object.
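The eccentricity-dependent acuity limit described above can be sketched as a simple linear relation. The constants below (a foveal wavelength floor and a per-degree slope) are illustrative placeholders, not measured values from the paper:

```python
import numpy as np

# Illustrative constants (assumed, not from the paper):
LAMBDA_0 = 0.02   # smallest perceptible wavelength at fixation, arbitrary units
SLOPE = 0.01      # growth of the acuity limit per degree of eccentricity

def lambda_min(eccentricity_deg):
    """Smallest perceptible spatial wavelength at a given eccentricity,
    modeled as roughly proportional to distance from fixation."""
    e = np.asarray(eccentricity_deg, dtype=float)
    return LAMBDA_0 + SLOPE * e

def is_perceptible(lam, eccentricity_deg):
    """A feature of wavelength lam is visible only where lam exceeds
    the local acuity limit -- i.e., fine detail survives only near fixation."""
    return lam >= lambda_min(eccentricity_deg)
```

Under this toy parameterization, a wavelength of 0.03 is perceptible at fixation but falls below the acuity limit a few degrees into the periphery.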
This change in task forces clarification of several issues, including the scope of feedforward models -- i.e., what problem is solved by a single pass through the model? -- and certain aspects of the models themselves, especially the role of scale invariance. We present theoretical and experimental findings for scale-invariant object categorization in the context of an HMAX model modified to account for realistic retinal resolution.
Our experiments explored two aspects of this model related to the handling of scale: (1) the size and shape of the input window used to select a subset of the visual information available in a scene for processing in a single feedforward pass, defined as a region in (x,y,λ), and (2) the handling of the λ dimension within the hierarchy. We also investigated the model's effectiveness in the presence of clutter.
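The notion of an input window as a region in (x,y,λ) can be made concrete with a small sketch: at each spatial wavelength, only locations whose eccentricity lies within the acuity limit fall inside the window, so coarse scales cover a wide field while fine scales are confined near fixation. All constants, the grid size, and the wavelength set below are illustrative assumptions:

```python
import numpy as np

# Illustrative acuity constants (assumed): foveal floor and per-degree slope.
LAMBDA_0, SLOPE = 0.02, 0.01

def input_window(grid_size=64, wavelengths=(0.03, 0.06, 0.12)):
    """Boolean mask of shape (n_wavelengths, grid, grid) marking the
    region of (x, y, lambda) visible in a single feedforward pass.
    Eccentricity is measured from the grid center, treating one pixel
    as one degree for illustration."""
    ys, xs = np.mgrid[0:grid_size, 0:grid_size]
    center = (grid_size - 1) / 2.0
    ecc = np.hypot(xs - center, ys - center)
    # A location is inside the window at wavelength lam if lam is coarse
    # enough to be perceptible at that eccentricity.
    return np.stack([lam >= LAMBDA_0 + SLOPE * ecc for lam in wavelengths])

masks = input_window()
areas = masks.sum(axis=(1, 2))  # spatial extent per wavelength
```

The resulting window is cone-shaped in (x,y,λ): the spatial area grows monotonically with wavelength, which is one concrete way to picture the "size and shape" question the experiments address.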
Our main experimental results are (1) spatial wavelengths too small for the retina to perceive across the entire object do not play a significant role in the no-clutter case, but confer robustness in the presence of clutter, and (2) preservation by the hierarchy of information about the relative scale (distance along λ) of feature activations is more important than current models reflect.