Linking retinal sampling in neural encoding models to temporal profiles of visual processing in humans

by Niklas Müller, Hongye Chen, Sofie Wahlberg, H. Steven Scholte, Iris I. A. Groen

Retinotopic tuning of neural populations is a key organizing principle of human visual cortex. However, state-of-the-art models that predict neural recordings based on task-optimized Convolutional Neural Networks (CNNs) do not take this retinotopic organization into account. Furthermore, while retinotopic tuning in visual cortex has been studied extensively using functional magnetic resonance imaging, the temporal dynamics of processing information from distinct parts of the visual field are less well understood. Here, we reveal distinct temporal profiles for foveal and peripheral visual information processing by implementing multiple spatial sampling strategies on feature maps of CNNs into encoding models that predict human electroencephalography (EEG) responses. Using large, high-quality natural scene images, we show that processing of peripheral information precedes that of foveally sampled information. This temporal difference is best modeled when applying a differential spatial transform to CNN feature maps that is derived from empirical measurements of human retinal ganglion cells. We directly confirm this temporal difference experimentally by mutually exclusive stimulation of foveal and peripheral visual field regions. Last, we introduce a novel, data-driven method of recovering visual field information from neural data, highlighting and quantifying spatial, retinotopic information contained in temporally specific EEG recordings. Together, these results provide novel neural evidence for a temporal coarse-to-fine visual processing hierarchy in the processing of natural images that is directly linked to distinct spatial information sampling. Aligning the spatial sampling of humans and CNN encoding models not only improves predictions of neural responses but also demonstrates that EEG recordings contain a significant amount of temporally encoded retinotopic information.
We make our large-scale EEG dataset including high-resolution natural scene images publicly available to enable future research into naturalistic visual processing.
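
To make the idea of spatial sampling on CNN feature maps more concrete, the following is a minimal sketch and not the authors' implementation. It assumes a hard circular split of each feature map into a foveal (central) and a peripheral (outer) region, averages activations within the chosen region, and fits a closed-form ridge regression from the sampled features to a response matrix. The function names, the `radius` parameter, and the ridge penalty `alpha` are all hypothetical choices for illustration; the abstract's best-performing transform is derived from retinal ganglion cell measurements, which this sketch does not reproduce.

```python
import numpy as np


def eccentricity_map(height, width):
    """Distance of each location from the map center, scaled so the farthest corner is 1."""
    ys, xs = np.mgrid[0:height, 0:width]
    cy, cx = (height - 1) / 2.0, (width - 1) / 2.0
    dist = np.hypot(ys - cy, xs - cx)
    return dist / dist.max()


def region_mask(height, width, region, radius=0.3):
    """Boolean (H, W) mask; `radius` is an assumed cut-off in normalized eccentricity."""
    ecc = eccentricity_map(height, width)
    if region == "foveal":
        return ecc <= radius
    if region == "peripheral":
        return ecc > radius
    raise ValueError(f"unknown region: {region!r}")


def sample_features(feature_maps, region, radius=0.3):
    """Average activations inside one region.

    feature_maps: array of shape (n_images, channels, H, W)
    returns:      array of shape (n_images, channels)
    """
    _, _, height, width = feature_maps.shape
    mask = region_mask(height, width, region, radius)
    # Boolean indexing over the last two axes keeps only the masked locations.
    return feature_maps[:, :, mask].mean(axis=2)


def fit_ridge(X, Y, alpha=1.0):
    """Closed-form ridge regression weights: (X^T X + alpha * I)^-1 X^T Y."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ Y)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-ins: 20 images, 5 channels, 8x8 maps; 3 response channels.
    feature_maps = rng.normal(size=(20, 5, 8, 8))
    responses = rng.normal(size=(20, 3))

    for region in ("foveal", "peripheral"):
        X = sample_features(feature_maps, region)
        W = fit_ridge(X, responses)
        print(region, X.shape, W.shape)
```

In an encoding analysis along these lines, one such model would be fit separately at each EEG time point, so that the prediction accuracy of the foveal and peripheral variants could be compared over time. The data above are random placeholders and carry no empirical meaning.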