Natural Stimuli

Natural images and contrast encoding in bipolar cells in the retina of the land- and aquatic-phase tiger salamander.
D. A. Burkhardt and P. K. Fahey and M. A. Sikora
Vis Neurosci  23  35-47  (2006)
Intracellular recordings were obtained from 57 cone-driven bipolar cells in the light-adapted retina of the land-phase (adult) tiger salamander (Ambystoma tigrinum). Responses to flashes of negative and positive contrast for centered spots of optimum spatial dimensions were analyzed as a function of contrast magnitude. On average, the contrast/response curves of depolarizing and hyperpolarizing bipolar cells in the land-phase animals were remarkably similar to those of aquatic-phase animals. Thus, the primary retinal mechanisms mediating contrast coding in the outer retina are conserved as the salamander evolves from the aquatic to the land phase. To evaluate contrast encoding in the context of natural environments, the distribution of contrasts in natural images was measured for 65 scenes. The results, in general agreement with other reports, show that the vast majority of contrasts in nature are very small. The efficient coding hypothesis of Laughlin was examined by comparing the average contrast/response curves of bipolar cells with the cumulative probability distribution of contrasts in natural images. Efficient coding was found at 20 cd/m^2, but at lower levels of light adaptation the contrast/response curves were much too shallow. Further experiments show that two fundamental physiological factors, light adaptation and the nonlinear transfer across the cone-bipolar synapse, are essential for the emergence of efficient contrast coding. For both land- and aquatic-based animals, the extent and symmetry of the dynamic range of the contrast/response curves of both classes of bipolar cells varied greatly from cell to cell. This apparent substrate for distributed encoding is established at the bipolar cell level, since it is not found in cones. As a result, the dynamic range of the bipolar cell population brackets the distribution of contrasts found in natural images.
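The comparison with Laughlin's hypothesis described above can be illustrated with a small sketch. Under that hypothesis, a cell with a limited response range uses it most efficiently when its normalized contrast/response curve follows the cumulative probability distribution of the contrasts it encounters. The Python below is only an illustration under stated assumptions: the contrast samples are synthetic (a two-sided exponential concentrated near zero, standing in for the 65 measured scenes), and the function names are hypothetical.

```python
import bisect
import random


def empirical_cdf(samples):
    """Return a function giving the fraction of samples <= x."""
    ordered = sorted(samples)
    n = len(ordered)

    def cdf(x):
        return bisect.bisect_right(ordered, x) / n

    return cdf


def synthetic_contrasts(n, scale=0.1, seed=0):
    """Assumed stand-in for natural contrasts: mostly small, both signs, clipped to [-1, 1]."""
    rng = random.Random(seed)
    out = []
    for _ in range(n):
        magnitude = rng.expovariate(1.0 / scale)  # mean magnitude = scale
        sign = 1.0 if rng.random() < 0.5 else -1.0
        out.append(max(-1.0, min(1.0, sign * magnitude)))
    return out


contrasts = synthetic_contrasts(10000)
# Under the hypothesis, the ideal normalized response curve is the contrast CDF.
optimal_response = empirical_cdf(contrasts)

for c in (-0.5, -0.1, 0.0, 0.1, 0.5):
    print(f"contrast {c:+.1f} -> normalized response {optimal_response(c):.2f}")
```

Because most synthetic contrasts are small, the resulting curve is steep near zero contrast and flat at the extremes; a measured curve that is much shallower than this near zero would correspond to the inefficient coding reported at lower adaptation levels.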
Naturalistic stimuli increase the rate and efficiency of information transmission by primary auditory afferents.
F. Rieke and D. A. Bodnar and W. Bialek
Proc Biol Sci  262  259-65  (1995)
Natural sounds, especially communication sounds, have highly structured amplitude and phase spectra. We have quantified how structure in the amplitude spectrum of natural sounds affects coding in primary auditory afferents. Auditory afferents encode stimuli with naturalistic amplitude spectra dramatically better than broad-band stimuli (approximating white noise); the rate at which the spike train carries information about the stimulus is 2-6 times higher for naturalistic sounds. Furthermore, the information rates can reach 90% of the fundamental limit to information transmission set by the statistics of the spike response. These results indicate that the coding strategy of the auditory nerve is matched to the structure of natural sounds; this 'tuning' allows afferent spike trains to provide higher processing centres with a more complete description of the sensory world.
Spectral-temporal receptive fields of nonlinear auditory neurons obtained using natural sounds.
F. E. Theunissen and K. Sen and A. J. Doupe
J Neurosci  20  2315-31  (2000)
The stimulus-response function of many visual and auditory neurons has been described by a spatial-temporal receptive field (STRF), a linear model that for mathematical reasons has until recently been estimated with the reverse correlation method, using simple stimulus ensembles such as white noise. Such stimuli, however, often do not effectively activate high-level sensory neurons, which may be optimized to analyze natural sounds and images. We show that it is possible to overcome the simple-stimulus limitation and then use this approach to calculate the STRFs of avian auditory forebrain neurons from an ensemble of birdsongs. We find that in many cases the STRFs derived using natural sounds are strikingly different from the STRFs that we obtained using an ensemble of random tone pips. When we compare these two models by assessing their predictions of neural response to the actual data, we find that the STRFs obtained from natural sounds are superior. Our results show that the STRF model is an incomplete description of response properties of nonlinear auditory neurons, but that linear receptive fields are still useful models for understanding higher level sensory processing, as long as the STRFs are estimated from the responses to relevant complex stimuli.
Responses of neurons in cat primary auditory cortex to bird chirps: effects of temporal and spectral context.
O. Bar-Yosef and Y. Rotman and I. Nelken
J Neurosci  22  8619-32  (2002)
The responses of neurons to natural sounds and simplified natural sounds were recorded in the primary auditory cortex (AI) of halothane-anesthetized cats. Bird chirps were used as the base natural stimuli. They were first presented within the original acoustic context (at least 250 msec of sounds before and after each chirp). The first simplification step consisted of extracting a short segment containing just the chirp from the longer segment. For the second step, the chirp was cleaned of its accompanying background noise. Finally, each chirp was replaced by an artificial version that had approximately the same frequency trajectory but with constant amplitude. Neurons had a wide range of different response patterns to these stimuli, and many neurons had late response components in addition to, or instead of, their onset responses. In general, every simplification step had a substantial influence on the responses. Neither the extracted chirp nor the clean chirp evoked a similar response to the chirp presented within its acoustic context. The extracted chirp evoked different responses than its clean version. The artificial chirps evoked stronger responses with a shorter latency than the corresponding clean chirp because of envelope differences. These results illustrate the sensitivity of neurons in AI to small perturbations of their acoustic input. In particular, they pose a challenge to models based on linear summation of energy within a spectrotemporal receptive field.
Linearity of cortical receptive fields measured with natural sounds.
C. K. Machens and M. S. Wehr and A. M. Zador
J Neurosci  24  1089-100  (2004)
How do cortical neurons represent the acoustic environment? This question is often addressed by probing with simple stimuli such as clicks or tone pips. Such stimuli have the advantage of yielding easily interpreted answers, but have the disadvantage that they may fail to uncover complex or higher-order neuronal response properties. Here, we adopt an alternative approach, probing neuronal responses with complex acoustic stimuli, including animal vocalizations. We used in vivo whole-cell methods in the rat auditory cortex to record subthreshold membrane potential fluctuations elicited by these stimuli. Most neurons responded robustly and reliably to the complex stimuli in our ensemble. Using regularization techniques, we estimated the linear component, the spectrotemporal receptive field (STRF), of the transformation from the sound (as represented by its time-varying spectrogram) to the membrane potential of the neuron. We find that the STRF has a rich dynamical structure, including excitatory regions positioned in general accord with the prediction of the classical tuning curve. However, whereas the STRF successfully predicts the responses to some of the natural stimuli, it surprisingly fails completely to predict the responses to others; on average, only 11% of the response power could be predicted by the STRF. Therefore, most of the response of the neuron cannot be predicted by the linear component, although the response is deterministically related to the stimulus. Analysis of the systematic errors of the STRF model shows that this failure cannot be attributed to simple nonlinearities such as adaptation to mean intensity, rectification, or saturation. Rather, the highly nonlinear response properties of auditory cortical neurons must be attributable to nonlinear interactions between sound frequencies and time-varying properties of the neural encoder.
Analyzing neural responses to natural signals: maximally informative dimensions.
T. Sharpee and N. C. Rust and W. Bialek
Neural Comput  16  223-50  (2004)
We propose a method that allows for a rigorous statistical analysis of neural responses to natural stimuli that are nongaussian and exhibit strong correlations. We have in mind a model in which neurons are selective for a small number of stimulus dimensions out of a high-dimensional stimulus space, but within this subspace the responses can be arbitrarily nonlinear. Existing analysis methods are based on correlation functions between stimuli and responses, but these methods are guaranteed to work only in the case of gaussian stimulus ensembles. As an alternative to correlation functions, we maximize the mutual information between the neural responses and projections of the stimulus onto low-dimensional subspaces. The procedure can be done iteratively by increasing the dimensionality of this subspace. Those dimensions that allow the recovery of all of the information between spikes and the full unprojected stimuli describe the relevant subspace. If the dimensionality of the relevant subspace indeed is small, it becomes feasible to map the neuron's input-output function even under fully natural stimulus conditions. These ideas are illustrated in simulations on model visual and auditory neurons responding to natural scenes and sounds, respectively.
Modulation power and phase spectrum of natural sounds enhance neural encoding performed by single auditory neurons.
A. Hsu and S. M. N. Woolley and T. E. Fremouw and F. E. Theunissen
J Neurosci  24  9201-11  (2004)
We examined the neural encoding of synthetic and natural sounds by single neurons in the auditory system of male zebra finches by estimating the mutual information in the time-varying mean firing rate of the neuronal response. Using a novel parametric method for estimating mutual information with limited data, we tested the hypothesis that song and song-like synthetic sounds would be preferentially encoded relative to other complex, but non-song-like synthetic sounds. To test this hypothesis, we designed two synthetic stimuli: synthetic songs that matched the power of spectral-temporal modulations but lacked the modulation phase structure of zebra finch song and noise with uniform band-limited spectral-temporal modulations. By defining neural selectivity as relative mutual information, we found that the auditory system of songbirds showed selectivity for song-like sounds. This selectivity increased in a hierarchical manner along ascending processing stages in the auditory system. Midbrain neurons responded with highest information rates and efficiency to synthetic songs and thus were selective for the spectral-temporal modulations of song. Primary forebrain neurons showed increased information to zebra finch song and synthetic song equally over noise stimuli. Secondary forebrain neurons responded with the highest information to zebra finch song relative to other stimuli and thus were selective for its specific modulation phase relationships. We also assessed the relative contribution of three response properties to this selectivity: (1) spiking reliability, (2) rate distribution entropy, and (3) bandwidth. We found that rate distribution and bandwidth but not reliability were responsible for the higher average information rates found for song-like sounds.
Testing the efficiency of sensory coding with optimal stimulus ensembles.
C. K. Machens and T. Gollisch and O. Kolesnikova and A. V. M. Herz
Neuron  47  447-56  (2005)
According to Barlow's seminal "efficient coding hypothesis," the coding strategy of sensory neurons should be matched to the statistics of stimuli that occur in an animal's natural habitat. Using an automatic search technique, we here test this hypothesis and identify stimulus ensembles that sensory neurons are optimized for. Focusing on grasshopper auditory receptor neurons, we find that their optimal stimulus ensembles differ from the natural environment, but largely overlap with a behaviorally important sub-ensemble of the natural sounds. This indicates that the receptors are optimized for peak rather than average performance. More generally, our results suggest that the coding strategies of sensory neurons are heavily influenced by differences in behavioral relevance among natural stimuli.
Adaptive stimulus optimization for auditory cortical neurons.
K. N. O'Connor and C. I. Petkov and M. L. Sutter
J Neurophysiol  94  4051-67  (2005)
Despite the extensive physiological work performed on auditory cortex, our understanding of the basic functional properties of auditory cortical neurons is incomplete. For example, it remains unclear what stimulus features are most important for these cells. Determining these features is challenging given the considerable size of the relevant stimulus parameter space as well as the unpredictable nature of many neurons' responses to complex stimuli due to nonlinear integration across frequency. Here we used an adaptive stimulus optimization technique to obtain the preferred spectral input for neurons in macaque primary auditory cortex (AI). This method uses a neuron's response to progressively modify the frequency composition of a stimulus to determine the preferred spectrum. This technique has the advantage of being able to incorporate nonlinear stimulus interactions into a "best estimate" of a neuron's preferred spectrum. The resulting spectra displayed a consistent, relatively simple circumscribed form that was similar across scale and frequency in which excitation and inhibition appeared about equally prominent. In most cases, this structure could be described using two simple models, the Gabor and difference of Gaussians functions. The findings indicate that AI neurons are well suited for extracting important scale-invariant features in sound spectra and suggest that they are designed to efficiently represent natural sounds.
Tuning to natural stimulus dynamics in primary auditory cortex.
J. A. Garcia-Lazaro and B. Ahmed and J. W. H. Schnupp
Curr Biol  16  264-71  (2006)
The amplitude and pitch fluctuations of natural soundscapes often exhibit "1/f spectra", which means that large, abrupt changes in pitch or loudness occur proportionally less frequently in nature than gentle, gradual fluctuations. Furthermore, human listeners reportedly prefer 1/f distributed random melodies to melodies with faster (1/f^0) or slower (1/f^2) dynamics. One might therefore suspect that neurons in the central auditory system may be tuned to 1/f dynamics, particularly given that recent reports provide evidence for tuning to 1/f dynamics in primary visual cortex. To test whether neurons in primary auditory cortex (A1) are tuned to 1/f dynamics, we recorded responses to random tone complexes in which the fundamental frequency and the envelope were determined by statistically independent "1/f^gamma random walks," with gamma set to values between 0.5 and 4. Many A1 neurons showed clear evidence of tuning and responded with higher firing rates to stimuli with gamma between 1 and 1.5. Response patterns elicited by 1/f^gamma stimuli were more reproducible for values of gamma close to 1. These findings indicate that auditory cortex is indeed tuned to the 1/f dynamics commonly found in the statistical distributions of natural soundscapes.
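To make the role of the exponent gamma concrete, the following Python sketch generates a time series whose power spectrum falls off as 1/f^gamma and measures how abruptly it changes from sample to sample. It is an illustration only: the spectral-synthesis method (summing sinusoids with random phases), the series length, and the `roughness` measure are assumptions chosen for simplicity, not the stimulus-generation procedure of the study.

```python
import math
import random


def one_over_f_series(n, gamma, seed=0):
    """Sum of sinusoids whose power is proportional to 1/f**gamma, with random phases."""
    rng = random.Random(seed)
    components = []
    for k in range(1, n // 2):
        amplitude = k ** (-gamma / 2.0)  # power = amplitude**2, so power ~ 1/k**gamma
        phase = rng.uniform(0.0, 2.0 * math.pi)
        components.append((k, amplitude, phase))
    return [
        sum(a * math.cos(2.0 * math.pi * k * t / n + p) for k, a, p in components)
        for t in range(n)
    ]


def roughness(series):
    """RMS of sample-to-sample steps divided by RMS of the mean-removed series."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]
    step_power = sum((centered[(t + 1) % n] - centered[t]) ** 2 for t in range(n)) / n
    power = sum(x * x for x in centered) / n
    return math.sqrt(step_power / power)


for gamma in (0.5, 1.0, 2.0, 4.0):
    series = one_over_f_series(256, gamma)
    print(f"gamma = {gamma}: roughness = {roughness(series):.3f}")
```

Smaller gamma puts more power at high frequencies and gives fast, abrupt fluctuations; larger gamma gives slow, gradual ones, which matches the "faster" and "slower" dynamics contrasted in the abstract.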
The acoustic features of rhesus vocalizations and their representation in the ventrolateral prefrontal cortex.
Y. Cohen and F. Theunissen and B. Russ and P. Gill
J Neurophysiol      (2006)
Communication is one of the fundamental components of both human and non-human animal behavior. Auditory communication signals (i.e., vocalizations) are especially important in the socioecology of several species of non-human primates such as rhesus monkeys. In rhesus, the ventrolateral prefrontal cortex (vPFC) is thought to be part of a circuit involved in representing vocalizations and other auditory objects. To further our understanding of the role of the vPFC in processing vocalizations, we characterized the spectrotemporal features of rhesus vocalizations, compared these features with other classes of natural stimuli, and then related the rhesus-vocalization acoustic features to neural activity. We found that the range of these spectrotemporal features was similar to those found in other ensembles of natural stimuli, including human speech, and identified the subspace of these features that would be particularly informative to discriminate between different vocalizations. In a first neural study, we found, however, that the tuning properties of vPFC neurons did not emphasize these particularly informative spectrotemporal features. In a second neural study, we found that a first-order linear model (the spectrotemporal receptive field) is not a good predictor of vPFC activity. The results of these two neural studies are consistent with the hypothesis that the vPFC is not involved in coding the first-order acoustic properties of a stimulus but is involved in processing the higher-order information needed to form representations of auditory objects.
Contrast constancy in natural scenes in shadow or direct light: A proposed role for contrast-normalisation (non-specific suppression) in visual cortex.
J. S. Lauritzen and D. J. Tolhurst
Network  16  151-73  (2005)
The range of contrasts in natural scenes is generally thought to far exceed the limited dynamic ranges of individual contrast-encoding neurons in the primary visual cortex. The visual system may employ gain-control mechanisms (Ohzawa et al. 1985) to compensate for the mismatch between the range of natural contrast energies and the limited dynamic range of visual neurons; one proposed mechanism is contrast normalisation or non-specific suppression (Heeger 1992a). This paper aims to evaluate the role of contrast normalisation in human contrast perception, using a computer model of primary visual cortex. The model uses orthogonal pairs of Gabor patches to simulate simple-cell receptive-fields to calculate local, band-limited contrast in a series of 50 digitised photographs of natural scenes. The average range of contrast energies in each image was 2.29 log units, while the "lifetime range" each model simple cell would see across all images was 2.98 log units. These ranges are greater than the dynamic range of real mammalian simple cells. Contrast normalisation (dividing contrast responses by the summed responses of all nearby neurons) reduces contrast ranges, perhaps sufficiently to match them to neurons' limited dynamic ranges. Comparison of images taken under diffuse and direct lighting conditions showed that contrast normalisation can sometimes match these conditions effectively. This may lead to perceptual contrast constancy in the face of spurious changes in contrast caused by natural environmental conditions.
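The normalisation step described above (dividing each contrast response by the summed responses of nearby neurons) can be sketched in a few lines of Python. This is a toy illustration, not the paper's Gabor-based model: the responses are a made-up one-dimensional list, the "direct light" scene is assumed to be the "shadow" scene scaled by a factor of 10, and the small constant `sigma` (added to avoid division by zero) and the pooling `radius` are hypothetical parameters.

```python
import math


def normalise(responses, radius, sigma=1e-3):
    """Divide each response by sigma plus the summed responses within `radius` positions."""
    n = len(responses)
    out = []
    for i, r in enumerate(responses):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        pool = sum(responses[lo:hi])  # pool includes the neuron itself
        out.append(r / (sigma + pool))
    return out


def log_range(values):
    """Range of positive values in log10 units."""
    return math.log10(max(values) / min(values))


# Made-up contrast energies for the same scene in shadow and in direct light.
shadow = [0.02, 0.05, 0.03, 0.08, 0.04, 0.06]
direct = [10.0 * r for r in shadow]

norm_shadow = normalise(shadow, radius=2)
norm_direct = normalise(direct, radius=2)

raw_range = log_range(shadow + direct)
norm_range = log_range(norm_shadow + norm_direct)
print(f"raw range: {raw_range:.2f} log units, normalised range: {norm_range:.2f} log units")
```

In this toy case the normalised responses to the two lighting conditions nearly coincide, because a common scale factor cancels between numerator and pool; real scenes differ in more than a scale factor, which is consistent with the abstract's statement that the match succeeds only sometimes.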
Contextual modulation of orientation tuning contributes to efficient processing of natural stimuli.
G. Felsen and J. Touryan and Y. Dan
Network  16  139-49  (2005)
It has been proposed that sensory neurons are adapted to the statistical structure of the natural environment in order to encode natural stimuli efficiently. While spatiotemporal correlations in luminance signals may be decorrelated by neurons in early visual processing stages, higher-order correlations, such as those in the orientation domain, are likely to persist in the input representation until the cortical level. In this study, we first examine orientation correlations in natural stimuli across brief time intervals and across nearby regions of space, and find strong correlations in both domains. We then examine contextual modulation of orientation tuning. We find that both temporal and spatial contexts exert a common influence on orientation tuning, shifting tuning away from the orientation of either the adapting (temporal) or surrounding (spatial) grating. Finally, we incorporate this context-mediated repulsive shift in orientation tuning into a model of cortical responses. We find that a direct result of the shift is a reduction of the redundancy in the population responses evoked by the orientation configurations that are most common in natural stimuli. Thus, cortical neurons may be adapted to the statistics of orientation in natural stimuli in order to increase the efficiency of natural stimulus representation.
Fixational instability and natural image statistics: implications for early visual representations.
M. Rucci and A. Casile
Network  16  121-38  (2005)
Under natural viewing conditions, small movements of the eye, head and body prevent the maintenance of a steady direction of gaze. It is known that stimuli tend to fade when they are stabilized on the retina for several seconds. However, it is unclear whether the physiological motion of the retinal image serves a visual purpose during the brief periods of natural visual fixation. This study examines the impact of fixational instability on the statistics of the visual input to the retina and on the structure of neural activity in the early visual system. We show that fixational instability introduces a component in the retinal input signals that, in the presence of natural images, lacks spatial correlations. This component strongly influences neural activity in a model of the LGN. It decorrelates cell responses even if the contrast sensitivity functions of simulated cells are not perfectly tuned to counter-balance the power-law spectrum of natural images. A decorrelation of neural activity at the early stages of the visual system has been proposed to be beneficial for discarding statistical redundancies in the input signals. The results of this study suggest that fixational instability might contribute to the establishment of efficient representations of natural stimuli.
Sparse coding and decorrelation in primary visual cortex during natural vision.
W. E. Vinje and J. L. Gallant
Science  287  1273-6  (2000)
Theoretical studies suggest that primary visual cortex (area V1) uses a sparse code to efficiently represent natural scenes. This issue was investigated by recording from V1 neurons in awake behaving macaques during both free viewing of natural scenes and conditions simulating natural vision. Stimulation of the nonclassical receptive field increases the selectivity and sparseness of individual V1 neurons, increases the sparseness of the population response distribution, and strongly decorrelates the responses of neuron pairs. These effects are due to both excitatory and suppressive modulation of the classical receptive field by the nonclassical receptive field and do not depend critically on the spatiotemporal structure of the stimuli. During natural vision, the classical and nonclassical receptive fields function together to form a sparse representation of the visual world. This sparse code may be computationally efficient for both early vision and higher visual processing.
Estimating spatio-temporal receptive fields of auditory and visual neurons from their responses to natural stimuli.
F. E. Theunissen and S. V. David and N. C. Singh and A. Hsu and W. E. Vinje and J. L. Gallant
Network  12  289-316  (2001)
We present a generalized reverse correlation technique that can be used to estimate the spatio-temporal receptive fields (STRFs) of sensory neurons from their responses to arbitrary stimuli such as auditory vocalizations or natural visual scenes. The general solution for STRF estimation requires normalization of the stimulus-response cross-correlation by the stimulus autocorrelation matrix. When the second-order stimulus statistics are stationary, normalization involves only the diagonal elements of the Fourier-transformed auto-correlation matrix (the power spectrum). In the non-stationary case normalization requires the entire auto-correlation matrix. We present modelling studies that demonstrate the feasibility and accuracy of this method as well as neurophysiological data comparing STRFs estimated using natural versus synthetic stimulus ensembles. For both auditory and visual neurons, STRFs obtained with these different stimuli are similar, but exhibit systematic differences that may be functionally significant. This method should be useful for determining what aspects of natural signals are represented by sensory neurons and may reveal novel response properties of these neurons.
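The central step of the technique, normalizing the stimulus-response cross-correlation by the stimulus autocorrelation matrix, can be illustrated with a deliberately tiny Python sketch. Everything specific here is an assumption made for illustration: the "neuron" is a noise-free two-tap temporal filter, the stimulus is a correlated AR(1) sequence standing in for a natural signal, and the 2x2 autocorrelation matrix is inverted in closed form rather than with the regularized procedures a real analysis would need.

```python
import random


def correlated_stimulus(n, rho, seed=0):
    """AR(1) sequence: each sample is rho times the previous one plus Gaussian noise."""
    rng = random.Random(seed)
    s = [rng.gauss(0.0, 1.0)]
    for _ in range(n - 1):
        s.append(rho * s[-1] + rng.gauss(0.0, 1.0))
    return s


def respond(stim, h):
    """Noise-free linear response r[t] = h[0]*s[t] + h[1]*s[t-1]."""
    return [h[0] * stim[t] + (h[1] * stim[t - 1] if t > 0 else 0.0)
            for t in range(len(stim))]


def estimate_filter(stim, resp):
    """Return (normalized estimate, raw cross-correlation estimate) of a two-tap filter."""
    c00 = c01 = c11 = b0 = b1 = 0.0
    for t in range(1, len(stim)):
        x0, x1 = stim[t], stim[t - 1]
        c00 += x0 * x0   # stimulus autocorrelation matrix entries
        c01 += x0 * x1
        c11 += x1 * x1
        b0 += x0 * resp[t]   # stimulus-response cross-correlation
        b1 += x1 * resp[t]
    det = c00 * c11 - c01 * c01
    # Multiply the cross-correlation by the inverse autocorrelation matrix.
    normalized = ((c11 * b0 - c01 * b1) / det, (c00 * b1 - c01 * b0) / det)
    # Raw reverse correlation: scale each lag by stimulus power only.
    raw = (b0 / c00, b1 / c11)
    return normalized, raw


true_h = (1.0, -0.5)
stim = correlated_stimulus(2000, rho=0.9)
resp = respond(stim, true_h)
normalized, raw = estimate_filter(stim, resp)
print("true filter:      ", true_h)
print("normalized (STRF):", tuple(round(v, 3) for v in normalized))
print("raw correlation:  ", tuple(round(v, 3) for v in raw))
```

With a correlated stimulus the raw cross-correlation smears the filter across lags, while the normalized estimate recovers it; with white noise the autocorrelation matrix is proportional to the identity and the two estimates agree, which is why classical reverse correlation relied on such stimuli.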
Natural stimulation of the nonclassical receptive field increases information transmission efficiency in V1.
W. E. Vinje and J. L. Gallant
J Neurosci  22  2904-15  (2002)
We have investigated how the nonclassical receptive field (nCRF) affects information transmission by V1 neurons during simulated natural vision in awake, behaving macaques. Stimuli were centered over the classical receptive field (CRF) and stimulus size was varied from one to four times the diameter of the CRF. Stimulus movies reproduced the spatial and temporal stimulus dynamics of natural vision while maintaining constant CRF stimulation across all sizes. In individual neurons, stimulation of the nCRF significantly increases the information rate, the information per spike, and the efficiency of information transmission. Furthermore, the population averages of these quantities also increase significantly with nCRF stimulation. These data demonstrate that the nCRF increases the sparseness of the stimulus representation in V1, suggesting that the nCRF tunes V1 neurons to match the highly informative components of the natural world.
Nonlinear V1 responses to natural scenes revealed by neural network analysis.
R. Prenger and M. C. Wu and S. V. David and J. L. Gallant
Neural Netw  17  663-79  (2004)
A key goal in the study of visual processing is to obtain a comprehensive description of the relationship between visual stimuli and neuronal responses. One way to guide the search for models is to use a general nonparametric regression algorithm, such as a neural network. We have developed a multilayer feed-forward network algorithm that can be used to characterize nonlinear stimulus-response mapping functions of neurons in primary visual cortex (area V1) using natural image stimuli. The network is capable of extracting several known V1 response properties such as: orientation and spatial frequency tuning, the spatial phase invariance of complex cells, and direction selectivity. We present details of a method for training networks and visualizing their properties. We also compare how well conventional explicit models and those developed using neural networks can predict novel responses to natural scenes.
Natural stimulus statistics alter the receptive field structure of V1 neurons.
S. V. David and W. E. Vinje and J. L. Gallant
J Neurosci  24  6991-7006  (2004)
Studies of the primary visual cortex (V1) have produced models that account for neuronal responses to synthetic stimuli such as sinusoidal gratings. Little is known about how these models generalize to activity during natural vision. We recorded neural responses in area V1 of awake macaques to a stimulus with natural spatiotemporal statistics and to a dynamic grating sequence stimulus. We fit nonlinear receptive field models using each of these data sets and compared how well they predicted time-varying responses to a novel natural visual stimulus. On average, the model fit using the natural stimulus predicted natural visual responses more than twice as accurately as the model fit to the synthetic stimulus. The natural vision model produced better predictions in >75% of the neurons studied. This large difference in predictive power suggests that natural spatiotemporal stimulus statistics activate nonlinear response properties in a different manner than the grating stimulus. To characterize this modulation, we compared the temporal and spatial response properties of the model fits. During natural stimulation, temporal responses often showed a stronger late inhibitory component, indicating an effect of nonlinear temporal summation during natural vision. In addition, spatial tuning underwent complex shifts, primarily in the inhibitory, rather than excitatory, elements of the response profile. These differences in late and spatially tuned inhibition accounted fully for the difference in predictive power between the two models. Both the spatial and temporal statistics of the natural stimulus contributed to the modulatory effects.
Do we know what the early visual system does?
M. Carandini and J. B. Demb and V. Mante and D. J. Tolhurst and Y. Dan and B. A. Olshausen and J. L. Gallant and N. C. Rust
J Neurosci  25  10577-97  (2005)
We can claim that we know what the visual system does once we can predict neural responses to arbitrary stimuli, including those seen in nature. In the early visual system, models based on one or more linear receptive fields hold promise to achieve this goal as long as the models include nonlinear mechanisms that control responsiveness, based on stimulus context and history, and take into account the nonlinearity of spike generation. These linear and nonlinear mechanisms might be the only essential determinants of the response, or alternatively, there may be additional fundamental determinants yet to be identified. Research is progressing with the goals of defining a single "standard model" for each stage of the visual pathway and testing the predictive power of these models on the responses to movies of natural scenes. These predictive models represent, at a given stage of the visual pathway, a compact description of visual computation. They would be an invaluable guide for understanding the underlying biophysical and anatomical mechanisms and relating neural responses to visual perception.
Predicting neuronal responses during natural vision.
S. V. David and J. L. Gallant
Network  16  239-60  (2005)
A model that fully describes the response properties of visual neurons must be able to predict their activity during natural vision. While many models have been proposed for the visual system, few have ever been tested against this criterion. To address this issue, we have developed a general framework for fitting and validating nonlinear models of visual neurons using natural visual stimuli. Our approach derives from linear spatiotemporal receptive field (STRF) analysis, which has frequently been used to study the visual system. However, prior to the linear filtering stage typical of STRFs, a linearizing transformation is applied to the stimulus to account for nonlinear response properties. We used this approach to compare two models for neurons in primary visual cortex: a nonlinear Fourier power model, which accounts for spatial phase invariant tuning, and a traditional linear model. We characterized prediction accuracy in terms of the total explainable variance, given intrinsic experimental noise. On average, Fourier power STRFs predicted 40% of explainable variance while linear STRFs were able to predict only 21% of explainable variance. The performance of the Fourier power model provides a benchmark for evaluating more sophisticated models in the future.
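The idea of a linearizing transformation that yields spatial phase invariance can be shown with a short Python sketch. It uses a one-dimensional "image" and a plain discrete Fourier transform; the grating parameters and function names are illustrative assumptions, and the sketch covers only the stimulus transformation, not the subsequent linear filtering or model fitting.

```python
import cmath
import math


def fourier_power(signal):
    """Power at each discrete frequency of a 1-D signal (direct DFT, O(n^2))."""
    n = len(signal)
    power = []
    for k in range(n):
        coeff = sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        power.append(abs(coeff) ** 2)
    return power


def grating(n, cycles, phase):
    """Sinusoidal grating with an integer number of cycles across n samples."""
    return [math.cos(2.0 * math.pi * cycles * t / n + phase) for t in range(n)]


g0 = grating(32, 4, 0.0)
g90 = grating(32, 4, math.pi / 2)

p0 = fourier_power(g0)
p90 = fourier_power(g90)

pixel_diff = max(abs(a - b) for a, b in zip(g0, g90))
power_diff = max(abs(a - b) for a, b in zip(p0, p90))
print(f"largest pixel difference: {pixel_diff:.3f}")
print(f"largest power difference: {power_diff:.2e}")
```

The two gratings differ strongly pixel by pixel but have the same Fourier power, so a linear filter applied to the power representation responds identically to both, whereas a filter applied to the pixels does not.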
Efficient coding of natural scenes in the lateral geniculate nucleus: experimental test of a computational theory.
Y. Dan and J. Atick and R. Reid
J Neurosci  16  3351-62  (1996)
A recent computational theory suggests that visual processing in the retina and the lateral geniculate nucleus (LGN) serves to recode information into an efficient form (Atick and Redlich, 1990). Information theoretic analysis showed that the representation of visual information at the level of the photoreceptors is inefficient, primarily attributable to a high degree of spatial and temporal correlation in natural scenes. It was predicted, therefore, that the retina and the LGN should recode this signal into a decorrelated form or, equivalently, into a signal with a 'white' spatial and temporal power spectrum. In the present study, we tested directly the prediction that visual processing at the level of the LGN temporally whitens the natural visual input. We recorded the responses of individual neurons in the LGN of the cat to natural, time-varying images (movies) and, as a control, to white-noise stimuli. Although there is substantial temporal correlation in natural inputs (Dong and Atick, 1995b), we found that the power spectra of LGN responses were essentially white. Between 3 and 15 Hz, the power of the responses had an average variation of only +/-10.3%. Thus, the signals that the LGN relays to visual cortex are temporally decorrelated. Furthermore, the responses of X-cells to natural inputs can be well predicted from their responses to white-noise inputs. We therefore conclude that whitening of natural inputs can be explained largely by the linear filtering properties (Enroth-Cugell and Robson, 1966). Our results suggest that the early visual pathway is well adapted for efficient coding of information in the natural visual environment, in agreement with the prediction of the computational theory.
Processing of natural time series of intensities by the visual system of the blowfly.
J. van Hateren
Vision Res  37  3407-16  (1997)
A major problem that a visual system faces is how to fit the large intensity variation of natural image streams into the limited dynamic range of its neurons. One of the means to accomplish this is through the use of gain control. In order to investigate this, natural time series of intensities were measured, as well as the responses of blowfly photoreceptors and Large Monopolar Cells (LMCs) to these time series. Time series representative of what each photoreceptor of a real visual system would normally receive were measured with an optical system measuring the light intensity of a spot comparable with the field of view of single human foveal cones. This system was worn on a headband by a freely walking person. Resulting time series have rms-contrasts ranging from an average of 0.45 for 1-sec segments to 1.39 for 100-sec segments (both when limited to frequencies up to 100 Hz). Power spectra behave approximately as 1/f (f: temporal frequency). Measured time series were subsequently presented to fly photoreceptors and LMCs by playing them back on an LED. The results show that fast gain controls indeed keep the response within the dynamic range of the cells and that a large part of this range is actually used for packing the information in natural time series.
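The rms-contrast figures quoted in this abstract can be made concrete with a short sketch. It assumes the common definition of rms contrast (standard deviation of intensity divided by mean intensity within a segment); the abstract does not spell out the definition, and the function names and the toy intensity series below are hypothetical.

```python
import math

def rms_contrast(intensities):
    """Standard deviation of the intensities divided by their mean (assumed definition)."""
    mean = sum(intensities) / len(intensities)
    variance = sum((x - mean) ** 2 for x in intensities) / len(intensities)
    return math.sqrt(variance) / mean

def mean_segment_contrast(series, segment_length):
    """Average rms contrast over consecutive non-overlapping segments."""
    segments = [series[i:i + segment_length]
                for i in range(0, len(series) - segment_length + 1, segment_length)]
    return sum(rms_contrast(s) for s in segments) / len(segments)

# Toy series: small fluctuations within each half, a large change in mean between halves.
series = [1.0, 3.0, 10.0, 30.0]
short_segments = mean_segment_contrast(series, 2)  # contrast within short segments
whole_series = rms_contrast(series)                # contrast over the full series
```

In this toy series the full-length contrast exceeds the short-segment average because slow changes in mean intensity count as contrast only when the segment is long enough to span them, which is consistent with the direction of the 1-sec versus 100-sec difference reported above.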
The 'independent components' of natural scenes are edge filters.
A. Bell and T. Sejnowski
Vision Res  37  3327-38  (1997)
It has previously been suggested that neurons with line and edge selectivities found in primary visual cortex of cats and monkeys form a sparse, distributed representation of natural scenes, and it has been reasoned that such responses should emerge from an unsupervised learning algorithm that attempts to find a factorial code of independent visual features. We show here that a new unsupervised learning algorithm based on information maximization, a nonlinear 'infomax' network, when applied to an ensemble of natural scenes produces sets of visual filters that are localized and oriented. Some of these filters are Gabor-like and resemble those produced by the sparseness-maximization network. In addition, the outputs of these filters are as independent as possible, since this infomax network performs Independent Components Analysis or ICA, for sparse (super-gaussian) component distributions. We compare the resulting ICA filters and their associated basis functions, with other decorrelating filters produced by Principal Components Analysis (PCA) and zero-phase whitening filters (ZCA). The ICA filters have more sparsely distributed (kurtotic) outputs on natural scenes. They also resemble the receptive fields of simple cells in visual cortex, which suggests that these neurons form a natural, information-theoretic coordinate system for natural images.
Contrast adaptation and the spatial structure of natural images.
M. Webster and E. Miyahara
J Opt Soc Am A  14  2355-66  (1997)
Natural images have a characteristic spatial structure, with amplitude spectra that decrease with frequency roughly as 1/f. We have examined how contrast (pattern-selective) adaptation to this structure influences the spatial sensitivity of the visual system. Contrast thresholds and suprathreshold contrast and frequency matches were measured after adaptation to random samples from an ensemble of images of outdoor scenes or of synthetic images formed by filtering the amplitude spectra of noise over a range of spectral slopes. Adaptation selectively reduced sensitivity at low-to-medium frequencies, biasing contrast sensitivity toward higher frequencies. The pattern of aftereffects was similar for different natural image ensembles but varied with large changes in the slope of the noise spectra. Our results suggest that adaptation to the spatial structure in natural scenes may exert strong and selective influences on perception that are important in characterizing the normal operating states of the visual system.
Neural activity in areas V1, V2 and V4 during free viewing of natural scenes compared to controlled viewing [published erratum appears in Neuroreport 1998 Jun 1;9(8):inside back cover and corrected and republished in Neuroreport 1998 Jun 22;9(9):2153-8]
J. Gallant and C. Connor and D. Van Essen
Neuroreport  9  85-90  (1998)
Under natural viewing conditions primates make frequent exploratory eye movements across complex scenes. We recorded neural activity of 62 cells in visual areas V1, V2 and V4 in an awake behaving monkey that freely viewed natural images. About half of the cells studied showed a modulation in firing rate following some of the eye movements made during free viewing, though the proportions showing a discernible modulation varied across areas. These cells were also examined under controlled viewing conditions in which gratings or natural image patches were flashed in and around the classical receptive field while the animal performed a fixation task. Activity rates were generally highest with flashed gratings and lowest during free viewing. Flashed natural image patches evoked responses between these two extremes, and the responses were higher when the patches were confined to the classical receptive field than when they extended into the non-classical surround. Thus the reduction of activity during free viewing relative to that obtained with flashed gratings is partly attributable to natural images being less effective stimuli and partly to suppressive spatio-temporal neural mechanisms that are important during natural vision.
Robust temporal coding of contrast by V1 neurons for transient but not for steady-state stimuli.
F. Mechler and J. Victor and K. Purpura and R. Shapley
J Neurosci  18  6583-98  (1998)
We show that spike timing adds to the information content of spike trains for transiently presented stimuli but not for comparable steady-state stimuli, even if the latter elicit transient responses. Contrast responses of 22 single neurons in macaque V1 to periodic presentation of steady-state stimuli (drifting sinusoidal gratings) and transient stimuli (drifting edges) of optimal spatiotemporal parameters were recorded extracellularly. The responses were analyzed for contrast-dependent clustering in spaces determined by metrics sensitive to the temporal structure of spike trains. Two types of metrics, cost-based spike time metrics and metrics based on Fourier harmonics of the response, were used. With both families of metrics, temporal coding of contrast is lacking in responses to drifting sinusoidal gratings of most (simple and complex) V1 neurons. However, two-thirds of all neurons, mostly complex cells, displayed significant temporal coding of contrast for edge stimuli. The Fourier metrics indicated that different response harmonics are partially independent, and their combined use increases information about transient stimuli. Our results demonstrate the importance of stimulus transience for temporal coding. This finding is significant for natural vision because moving edges, which are present in moving object boundaries, and saccades induce transients. We think that an abrupt change in the adapted state of the local visual circuitry triggers the temporal structuring of spike trains in V1 neurons.
Natural scene statistics at the centre of gaze.
P. Reinagel and A. Zador
Network  10  341-50  (1999)
Early stages of visual processing may exploit the characteristic structure of natural visual stimuli. This structure may differ from the intrinsic structure of natural scenes, because sampling of the environment is an active process. For example, humans move their eyes several times a second when looking at a scene. The portions of a scene that fall on the fovea are sampled at high spatial resolution, and receive a disproportionate fraction of cortical processing. We recorded the eye positions of human subjects while they viewed images of natural scenes. We report that active selection affected the statistics of the stimuli encountered by the fovea, and also by the parafovea up to eccentricities of 4 degrees. We found two related effects. First, subjects looked at image regions that had high spatial contrast. Second, in these regions, the intensities of nearby image points (pixels) were less correlated with each other than in images selected at random. These effects could serve to increase the information available to the visual system for further processing. We show that both of these effects can be simply obtained by constructing an artificial ensemble comprised of the highest-contrast regions of images.
Visual adaptation as optimal information transmission.
M. Wainwright
Vision Res  39  3960-74  (1999)
We propose that visual adaptation in orientation, spatial frequency, and motion can be understood from the perspective of optimal information transmission. The essence of the proposal is that neural response properties at the system level should be adjusted to the changing statistics of the input so as to maximize information transmission. We show that this principle accounts for several well-documented psychophysical phenomena, including the tilt aftereffect, change in contrast sensitivity and post-adaptation changes in orientation discrimination. Adaptation can also be considered on a longer time scale, in the context of tailoring response properties to natural scene statistics. From the anisotropic distribution of power in natural scenes, the proposal also predicts differences in the contrast sensitivity function across spatial frequency and orientation, including the oblique effect.
The human visual system is optimised for processing the spatial information in natural visual images.
C. Parraga and T. Troscianko and D. Tolhurst
Curr Biol  10  35-8  (2000)
A fundamental tenet of visual science is that the detailed properties of visual systems are not capricious accidents, but are closely matched by evolution and neonatal experience to the environments and lifestyles in which those visual systems must work. This has been shown most convincingly for fish and insects. For mammalian vision, however, this tenet is based more upon theoretical arguments than upon direct observations. Here, we describe experiments that require human observers to discriminate between pictures of slightly different faces or objects. These are produced by a morphing technique that allows small, quantifiable changes to be made in the stimulus images. The independent variable is designed to give increasing deviation from natural visual scenes, and is a measure of the Fourier composition of the image (its second-order statistics). Performance in these tests was best when the pictures had natural second-order spatial statistics, and degraded when the images were made less natural. Furthermore, performance can be explained with a simple model of contrast coding, based upon the properties of simple cells in the mammalian visual cortex. The findings thus provide direct empirical support for the notion that human spatial vision is optimised to the second-order statistics of the optical environment.
Nonmonotonic noise tuning of BOLD fMRI signal to natural images in the visual cortex of the anesthetized monkey.
G. Rainer and M. Augath and T. Trinath and N. Logothetis
Curr Biol  11  846-54  (2001)
Background: The perceptual ability of humans and monkeys to identify objects in the presence of noise varies systematically and monotonically as a function of how much noise is introduced to the visual display. That is, it becomes more and more difficult to identify an object with increasing noise. Here we examine whether the blood oxygen level-dependent functional magnetic resonance imaging (BOLD fMRI) signal in anesthetized monkeys also shows such monotonic tuning. We employed parametric stimulus sets containing natural images and noise patterns matched for spatial frequency and intensity as well as intermediate images generated by interpolation between natural images and noise patterns. Anesthetized monkeys provide us with the unique opportunity to examine visual processing largely in the absence of top-down cognitive modulations and can thus provide an important baseline against which work with awake monkeys and humans can be compared. Results: We measured BOLD activity in occipital visual cortical areas as natural images and noise patterns, as well as intermediate interpolated patterns at three interpolation levels (25%, 50%, and 75%) were presented to anesthetized monkeys in a block paradigm. We observed reliable visual activity in occipital visual areas including V1, V2, V3, V3A, and V4 as well as the fundus and anterior bank of the superior temporal sulcus (STS). Natural images consistently elicited higher BOLD levels than noise patterns. For intermediate images, however, we did not observe monotonic tuning. Instead, we observed a characteristic V-shaped noise-tuning function in primary and extrastriate visual areas. BOLD signals initially decreased as noise was added to the stimulus but then increased again as the pure noise pattern was approached. We present a simple model based on the number of activated neurons and the strength of activation per neuron that can account for these results. Conclusions: We show that, for our parametric stimulus set, BOLD activity varied nonmonotonically as a function of how much noise was added to the visual stimuli, unlike the perceptual ability of humans and monkeys to identify such stimuli. This raises important caveats for interpreting fMRI data and demonstrates the importance of assessing not only which neural populations are activated by contrasting conditions during an fMRI study, but also the strength of this activation. This becomes particularly important when using the BOLD signal to make inferences about the relationship between neural activity and behavior.
Chromatic structure of natural scenes.
T. Wachtler and T. W. Lee and T. J. Sejnowski
J Opt Soc Am A Opt Image Sci Vis  18  65-77  (2001)
We applied independent component analysis (ICA) to hyperspectral images in order to learn an efficient representation of color in natural scenes. In the spectra of single pixels, the algorithm found basis functions that had broadband spectra and basis functions that were similar to natural reflectance spectra. When applied to small image patches, the algorithm found some basis functions that were achromatic and others with overall chromatic variation along lines in color space, indicating color opponency. The directions of opponency were not strictly orthogonal. Comparison with principal-component analysis on the basis of statistical measures such as average mutual information, kurtosis, and entropy, shows that the ICA transformation results in much sparser coefficients and gives higher coding efficiency. Our findings suggest that nonorthogonal opponent encoding of photoreceptor signals leads to higher coding efficiency and that ICA may be used to reveal the underlying statistical properties of color information in natural scenes.
Noticing familiar objects in real world scenes: the role of temporal cortical neurons in natural vision.
D. Sheinberg and N. Logothetis
J Neurosci  21  1340-50  (2001)
During natural vision, the brain efficiently processes views of the external world as the eyes actively scan the environment. To better understand the neural mechanisms underlying this process, we recorded the activity of individual temporal cortical neurons while monkeys looked for and identified familiar targets embedded in natural scenes. We found a group of visual neurons that exhibited stimulus-selective neuronal bursts just before the monkey's response. Most of these cells showed similar selectivity whether effective targets were viewed in isolation or encountered in the course of exploring complex scenes. In addition, by embedding target stimuli in natural scenes, we could examine the activity of these stimulus-selective cells during visual search and at the time targets were fixated and identified. We found that, during exploration, neuronal activation sometimes began shortly before effective targets were fixated, but only if the target was the goal of the next fixation. Furthermore, we found that the magnitude of this early activation varied inversely with reaction time, indicating that perceptual information was integrated across fixations to facilitate recognition. The behavior of these visually selective cells suggests that they contribute to the process of noticing familiar objects in the real world.
Natural signal statistics and sensory gain control.
O. Schwartz and E. P. Simoncelli
Nat Neurosci  4  819-25  (2001)
We describe a form of nonlinear decomposition that is well-suited for efficient encoding of natural signals. Signals are initially decomposed using a bank of linear filters. Each filter response is then rectified and divided by a weighted sum of rectified responses of neighboring filters. We show that this decomposition, with parameters optimized for the statistics of a generic ensemble of natural images or sounds, provides a good characterization of the nonlinear response properties of typical neurons in primary visual cortex or auditory nerve, respectively. These results suggest that nonlinear response properties of sensory neurons are not an accident of biological implementation, but have an important functional role.
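The rectify-and-divide step described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: squaring as the rectifier, the additive constant `sigma_sq`, and all function and parameter names are assumptions, and the weights are supplied by hand rather than optimized for the statistics of natural images or sounds.

```python
def divisive_normalization(responses, weights, sigma_sq=1.0):
    """Normalize each linear filter response by a weighted pool of its neighbours.

    responses: linear filter outputs, one per filter.
    weights:   weights[i][j] is the contribution of filter j to the pool of filter i.
    sigma_sq:  additive constant keeping the denominator positive (an assumption).
    """
    # Assumed rectifier: squaring, which makes every response non-negative.
    rectified = [r * r for r in responses]
    normalized = []
    for i, r_i in enumerate(rectified):
        # Weighted sum of the rectified responses of the other filters.
        pool = sum(weights[i][j] * rectified[j]
                   for j in range(len(rectified)) if j != i)
        normalized.append(r_i / (pool + sigma_sq))
    return normalized

# Two hypothetical filters that each contribute half of the other's pool.
weights = [[0.0, 0.5],
           [0.5, 0.0]]
weak_neighbour = divisive_normalization([1.0, 0.0], weights)
strong_neighbour = divisive_normalization([1.0, 2.0], weights)
```

The first filter has the same linear response in both calls, but its normalized output is smaller when its neighbour responds strongly. That dependence on neighbouring activity is the gain-control behaviour the abstract refers to.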
Reconstruction of natural scenes from ensemble responses in the lateral geniculate nucleus.
G. Stanley and F. Li and Y. Dan
J Neurosci  19  8036-42  (1999)
A major challenge in studying sensory processing is to understand the meaning of the neural messages encoded in the spiking activity of neurons. From the recorded responses in a sensory circuit, what information can we extract about the outside world? Here we used a linear decoding technique to reconstruct spatiotemporal visual inputs from ensemble responses in the lateral geniculate nucleus (LGN) of the cat. From the activity of 177 cells, we have reconstructed natural scenes with recognizable moving objects. The quality of reconstruction depends on the number of cells. For each point in space, the quality of reconstruction begins to saturate at six to eight pairs of on and off cells, approaching the estimated coverage factor in the LGN of the cat. Thus, complex visual inputs can be reconstructed with a simple decoding algorithm, and these analyses provide a basis for understanding ensemble coding in the early visual pathway.
Independent component filters of natural images compared with simple cells in primary visual cortex.
J. van Hateren and A. van der Schaaf
Proc R Soc Lond B Biol Sci  265  359-66  (1998)
Properties of the receptive fields of simple cells in macaque cortex were compared with properties of independent component filters generated by independent component analysis (ICA) on a large set of natural images. Histograms of spatial frequency bandwidth, orientation tuning bandwidth, aspect ratio and length of the receptive fields match well. This indicates that simple cells are well tuned to the expected statistics of natural stimuli. There is no match, however, in calculated and measured distributions for the peak of the spatial frequency response: the filters produced by ICA do not vary their spatial scale as much as simple cells do, but are fixed to scales close to the finest ones allowed by the sampling lattice. Possible ways to resolve this discrepancy are discussed.