Physical principles in neural coding and computation: Lessons from the fly visual system
Last updated 2 May 2007
Successful
organisms solve a myriad of problems in the face of profound physical
constraints, and it is a challenge to quantify this functionality in a language
that parallels our characterization of other physical systems. Strikingly, when
we do this the performance of biological systems often approaches the limits
set by basic physical principles. Here we describe our exploration of
functionality and physical limits in the fly visual system.
An
important technical aspect of the work described here has been the enormous
stability of the fly as a system for recording the activity of neurons. In
addition, rather than focusing on the response of the visual system to a
discrete set of stimuli presented in isolation, we have developed ways of
analyzing the responses to continuous, dynamic stimuli more like those
encountered in nature; in fact considerable effort has gone into getting ever
closer to truly natural stimuli. The result is that where classical experiments
on neurons involved roughly one kilobit of data, about the size of a single
gene, much of the work described here rests on the analysis of data sets of
several megabits, closer in size to a whole genome. This combination of
changing the scale of the data set and changing our theoretical outlook in the
design and analysis has allowed us to uncover several new phenomena.
We
have found that computation in the fly's visual system works with such
precision that it is limited by photon shot noise and diffraction, that
information transmission across synapses between neurons operates near the
limits imposed by the "quantization" of signals into discrete packets of
chemical transmitter, that deep inside the brain signals are represented by
sequences of action potentials or "spikes" with nearly optimal efficiency, and
that these levels of performance are the result of dynamic adaptation processes
that allow the system to adjust its strategies to the statistics of the current
visual environment. These observations provide a glimpse of optimization
principles that might organize and determine the brain's choice of codes and
computations. It remains to be seen whether these principles are applicable
more generally, but we believe that our experience in the fly visual system has
sharpened many of the questions in the field. The following paragraphs are
meant as a guide through some of the papers taken in chronological (rather than
logical) order; early parts of the work were reviewed, with pedagogical
background, in Spikes: Exploring the Neural Code (MIT Press, 1997).
Numbers refer to a full
list of publications for WB.
21. Real-time performance of a movement sensitive neuron in the blowfly visual system: Coding and information transfer in short spike sequences. R de Ruyter van Steveninck & W Bialek, Proc. R. Soc. London Ser. B 234, 379-414 (1988).
All of the sensory signals reaching
the brain are encoded in sequences of identical, discrete pulses called action
potentials or spikes. Spike trains in other regions of the brain can represent
motor commands or, more subtly, plans and intentions, and there is no reason to
doubt that even our private thoughts are represented in this way. The way in
which spikes represent the external world has been studied since the 1920s. In
this paper we argued that one should take a new point of view on this problem.
Instead of asking how known signals in the outside world are encoded in the
average properties of spike trains, we asked how the brain, which has only the
spike trains to work with, could make inferences about unknown sensory stimuli.
Specifically, we showed how to characterize the distribution of sensory inputs
that are consistent with the neural response, thus quantifying the
(un)certainty and information content of inferences from the spike train. These ideas were used in the design and
analysis of experiments on the fly visual system, specifically a neuron H1 that
is responsible for extracting information about (horizontal) rigid body motion
across the whole visual field. There are several specific points that have
become important in recent work: The demonstration that short sequences of
spikes are informative only about projections of the stimulus onto spaces of
low dimensionality, that similar spike trains stand for similar stimulus
waveforms, and that patterns of spikes can convey more information than
expected by summing the contributions of individual spikes. In addition to
these specific results, the point of view expressed in this paper set the
agenda for much of our subsequent work on the neural code.
25. Coding and computation with neural spike
trains. W Bialek & A Zee, J. Stat. Phys. 59, 103-115 (1990).
Inspired
in part by the results in the fly, we set out to study the problem of coding
and decoding in simple models of spiking neurons. Probably the most important
result was that there is a large regime in which signals can be decoded by
linear (perturbative) methods even though the encoding is strongly nonlinear.
The small parameter that makes this work is the mean number of spikes per
correlation time of the signal, suggesting that spike trains can be decoded
linearly if they are sparse in the time domain. In Spikes we discuss the evidence that many different neural
systems make use of such a sparse representation, but of course one really
wants a direct experimental answer: Can we decode the spike trains of real
neurons using these theoretical ideas? As an aside, it is worth noting that
identifying a regime where linear decoding can work is really much more general
than the details of the model that we chose to investigate; this is important,
since none of the models we write down are likely to be accurate in detail.
Rereading
the original paper, it is perhaps not so clear that sparseness is the key idea.
A somewhat more explicit discussion is given in later summer school lectures
[29, 84].
34. Reading a neural code. W Bialek, F
Rieke, RR de Ruyter van Steveninck, & D Warland, Science 252, 1854-1857 (1991).
Returning
to the experiments, we showed that it really is possible to decode the spike
trains from H1 and thereby reconstruct the time dependent velocity of motion
across the visual field. The coexistence of linear decoding with nonlinear
encoding was implicit in this work, but made explicit in Chapter 2 of Spikes. This work was intended as a
proof of principle, but we also found that the reconstructions were
surprisingly precise: Errors corresponded to ~ 0.06 degrees over ~ 30 ms, which
is about 20 times smaller than the lattice spacing of detectors in the compound
eye or 10 times smaller than the nominal "diffraction limit" due to blur by the
eye's lenses. Resolution beyond the sampling and diffraction scales is also
known in human vision, and the collection of perceptual phenomena in which this
occurs is called hyperacuity. This led us to wonder about the physical limits
to motion estimation: blur due to diffraction through the lenses of the compound
eye and noise due to the random arrival of photons at the receptors. In fact
the observed performance is very close to this limit, so that even four layers
of neurons away from the receptors it is still the physics of the inputs that
sets the precision of computation. The ideas of decoding and stimulus
reconstruction have since been applied to systems ranging from motor control in
crabs to visual motion perception in monkeys.
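The decoding idea behind these reconstructions can be sketched numerically. The toy example below is our own construction, not the model of the paper: a smooth stimulus drives a strongly nonlinear Poisson encoder, a decoding kernel is estimated from the spike-triggered average on a training half of the data, and the stimulus is reconstructed on held-out data by superposing that kernel at each spike time.

```python
import numpy as np

rng = np.random.default_rng(0)
dt = 0.001                                   # 1 ms bins
n = 20000                                    # 20 s of data
# Smooth Gaussian stimulus with ~30 ms correlation time, unit variance.
s = np.convolve(rng.normal(size=n), np.ones(30) / np.sqrt(30), mode="same")

# Strongly nonlinear encoder: sigmoidal rate, Poisson spiking.
rate = 40.0 / (1.0 + np.exp(-2.0 * s))       # spikes/s
spikes = (rng.random(n) < rate * dt).astype(float)

half = n // 2                                # train on first half, test on second
L = 100                                      # +/- 100 ms decoding window

# Decoding kernel: spike-triggered average of the stimulus on training data.
train = np.flatnonzero(spikes[:half])
train = train[(train >= L) & (train < half - L)]
kernel = np.array([s[train + lag].mean() for lag in range(-L, L + 1)])

# Linear reconstruction on the test half: superpose the kernel at spike times.
s_hat = np.zeros(half)
for i in np.flatnonzero(spikes[half:]):
    if L <= i < half - L:
        s_hat[i - L:i + L + 1] += kernel

corr = np.corrcoef(s[half:], s_hat)[0, 1]    # quality of the reconstruction
```

Even with the deliberately nonlinear encoder, the linear reconstruction tracks the stimulus, which is the coexistence of linear decoding with nonlinear encoding described above; all parameters here (rates, time scales, window) are invented for illustration.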
For
a review of hyperacuity see Section 4.2 of Spikes. Perceptual hyperacuity usually
is demonstrated in tasks that involve discrimination among discrete alternative
signals; the reconstruction experiments allowed the demonstration of comparable
precision in a more natural task of continuous estimation. Experiments that are
more analogous to the discrimination experiments have also been done on the H1
neuron [57], and a preliminary account of these experiments (in 1984) may have
been the first report of hyperacuity in the responses of a single neuron. For
details of the limits to motion estimation, see [29, 42].
53. Statistical mechanics and visual signal
processing. M Potters & W Bialek, J Phys I France 4, 1755-1775 (1994).
Inspired by the observation of near optimal performance
in the fly's motion estimation system, we set out to understand the algorithmic
requirements for optimal estimation. Conventional approaches involve searching
a set of possible strategies for the best within the set, but we showed how one
could map the problem of estimation in the presence of noise onto a statistical
mechanics problem in which the data act as external fields and the estimator is
the expectation value of some order parameter. Estimation theory is then reduced to the computation of
(perhaps strongly nonlinear) response functions, and standard approximations in
statistical mechanics map to different regimes of the signal processing
problem. Applying the general
framework to the problem of motion estimation in the fly, we showed that the
optimal estimation strategy has very different behaviors in different sensory
environments. In particular, the
optimal estimator interpolates between popular models for motion estimation,
which arise as limiting cases of the full theory. An inevitable prediction of the theory is that the optimal
processor must change its strategy, adapting to changes in the statistics of
the input signals. Preliminary
experiments gave clear evidence of this "statistical adaptation" [54, 59, 61],
and recent experiments provide a direct confirmation of the combination of
nonlinear operations predicted by the theory [102].
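The two limiting strategies can be illustrated with a toy two-detector simulation (our construction; the actual theory is much richer). At high signal-to-noise ratio a gradient-based estimator, v ~ -(ds/dt)/(ds/dx), recovers the velocity directly, while at low signal-to-noise ratio it collapses toward zero and a correlator-type output, though uncalibrated in velocity units, still carries the direction of motion. All parameters (pattern, spacing, noise levels) are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
v_true, dt, dx, n = 2.0, 0.001, 0.02, 20000
t = np.arange(n) * dt

def pattern(x):
    """Fixed spatial pattern; the stimulus is this pattern translating at v_true."""
    return np.sin(2 * np.pi * x) + 0.5 * np.sin(6 * np.pi * x + 1.0)

def detectors(noise):
    """Two photodetector signals, dx apart, with independent additive noise."""
    s1 = pattern(-v_true * t) + noise * rng.normal(size=n)
    s2 = pattern(dx - v_true * t) + noise * rng.normal(size=n)
    return s1, s2

def gradient_estimate(s1, s2):
    """High-SNR limit: v ~ -(ds/dt)/(ds/dx), computed as a regression ratio."""
    dsdt = np.gradient(s1, dt)
    dsdx = (s2 - s1) / dx
    return -np.mean(dsdt * dsdx) / np.mean(dsdx ** 2)

def correlator(s1, s2, lag=10):
    """Low-SNR limit: opponent (Reichardt-style) correlator; the sign tracks
    direction but the output is not calibrated in velocity units."""
    return np.mean(s1[:-lag] * s2[lag:] - s2[:-lag] * s1[lag:])

v_hi = gradient_estimate(*detectors(0.01))   # near-noiseless: recovers v_true
v_lo = gradient_estimate(*detectors(0.5))    # noisy: biased toward zero
c_hi = correlator(*detectors(0.01))          # positive for motion in +x
```

The collapse of the gradient estimate at low SNR is why a single fixed algorithm cannot be optimal across environments, which is the adaptation prediction discussed above.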
The rate of information transfer at
graded-potential synapses. RR de Ruyter van Steveninck & SB Laughlin, Nature 379, 642-645 (1996).
Information
is passed from one neuron to another largely through chemical synapses. In the
same way that electrical signaling in many cells is quantized into action
potentials or spikes, chemical signaling is quantized into vesicles or packets
of neurotransmitter molecules; this is true even at synapses like the first
synapse in the retina, where both the pre- and post-synaptic cells generate
graded voltage responses rather than spikes. Here we characterized the signal
and noise properties of the photodetector cells and their synaptic target, the
large monopolar cell (LMC), in the fly retina, and then used these measurements
to infer the information capacity of the synapse. In characterizing
photodetector noise we touch on one of the fundamental facts about the visual
system, namely that it is capable of counting single photons. Although evidence
for this has been accumulating since the 1940s, it has been much less clear
whether biological photodetectors can continue to operate near the photon shot
noise limit at counting rates that are more typical of animal behavior. Here we
showed how one can combine traditional measures of signal transfer and noise to
characterize the equivalent contrast noise of photodetector cells; we exploit
the extreme stability of the fly experiments to calibrate this noise against
the limits set by photon counting. The result is that fly photodetector
performance comes very close to the shot noise limit over a wide range of
frequencies and counting rates, up to the point where pupil mechanisms begin to
attenuate the light entering the eye. The excess noise beyond shot noise is
well approximated as a limit to time resolution. Applying the same analysis to
the LMCs we saw an effective six-fold increase in photon capture rate, as
expected since six photodetectors converge on a single LMC, but this also means
that over a considerable range of frequencies and counting rates the noise in
the synapse is negligible and integration of the six signals is essentially
perfect. Finally these noise measurements were analyzed to show that the
synapse can transmit more than 1500 bits/sec, by far the record for any single
neuron. As discussed in Spikes, this information transmission rate is (given some
uncertainties) close to the limit set by the quantization of the signal into
vesicles combined with the system's time resolution. Within some range of time
scales, then, the photodetector is a near-ideal photon counter and the LMC is
a near-ideal vesicle counter.
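The logic of calibrating noise against photon counting can be sketched with a toy ideal Poisson detector (the rates, window, and contrast below are invented for illustration, not the paper's numbers): the signal-to-noise ratio for detecting a small contrast increment grows as the square root of the photon count, and a shot-noise-limited detector sits exactly on that line.

```python
import numpy as np

rng = np.random.default_rng(5)

def contrast_snr(photon_rate, T=0.01, contrast=0.05, trials=200000):
    """Empirical SNR for detecting a contrast step with an ideal photon counter.

    photon_rate is in photons/s; T is the integration window in seconds."""
    n0 = rng.poisson(photon_rate * T, trials)                   # background
    n1 = rng.poisson(photon_rate * (1 + contrast) * T, trials)  # +5% contrast
    return (n1.mean() - n0.mean()) / np.sqrt(0.5 * (n0.var() + n1.var()))

snr_lo = contrast_snr(1e4)   # ~100 photons per window
snr_hi = contrast_snr(1e5)   # ~1000 photons per window
# Shot-noise prediction: SNR = contrast * sqrt(photon_rate * T),
# so a tenfold increase in flux should improve the SNR by sqrt(10).
```

Measuring how far a real photoreceptor's equivalent contrast noise sits above this Poisson floor is the calibration the paper performs.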
For a
review and pedagogical discussion see [81].
66. Entropy and information in neural spike
trains. SP Strong, R Koberle, RR de Ruyter van Steveninck & W Bialek, Phys
Rev Lett 80, 197-200 (1998).
There have been fifty years of debate over whether the detailed timing of spikes is important. With precise timing the system would have access to a much greater representational capacity: the entropy of neural responses is larger at higher time resolution, but is this capacity used efficiently? Here we showed how to measure, without reference to any models or assumptions, the information content of neural spike trains as they encode naturalistic, dynamic inputs. The result was that the motion sensitive neurons in the fly visual system use roughly half of the spike train entropy to carry visual information, and this efficiency is approximately constant from a time resolution of ~1 sec down to ~1 msec. The observation of high coding efficiency at msec time resolution agrees with earlier results from other systems that used linear decoding methods to estimate information rates, but it is important that with the present approach we don't even need to know what aspects of the stimulus are represented, let alone the proper algorithm for decoding these features. In contrast with methods based on decoding, the tools developed in this work have come to be called "direct" methods for the analysis of information transmission, and are being applied to a variety of systems. Direct estimates of the information content of spike trains make precise our impressions about the reproducibility and variability of neural responses, and an important observation in the fly, as in other systems, is that the statistical structure of the spike train in response to dynamic signals seems to be very different from that observed in response to static or slowly varying signals [63, 73]. This suggests that one might be able to observe even more informative and efficient responses under more natural conditions (see below!).
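The direct method itself is easy to sketch on simulated data (a toy, not the fly experiment): discretize the response into binary words, estimate the total entropy from words pooled across time, estimate the noise entropy from words across repeated presentations of a frozen stimulus at each fixed time, and take the difference.

```python
import numpy as np
from collections import Counter

rng = np.random.default_rng(2)

def word_entropy(words):
    """Entropy (bits) of the empirical distribution of binary words."""
    counts = np.array(list(Counter(map(tuple, words)).values()), dtype=float)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

# Toy experiment: many repeats of one "frozen" time-varying stimulus,
# modeled simply as a per-bin spike probability that varies over time.
n_trials, n_bins, L = 400, 1000, 8                      # words are L bins long
p_spike = np.where(rng.random(n_bins) < 0.2, 0.5, 0.1)  # frozen rate profile
spikes = rng.random((n_trials, n_bins)) < p_spike       # Bernoulli responses

starts = range(0, n_bins - L, L)
# Total entropy: words pooled over all times and trials.
S_total = word_entropy([spikes[i, j:j + L]
                        for i in range(n_trials) for j in starts])
# Noise entropy: variability across trials at each fixed time, averaged.
S_noise = np.mean([word_entropy(spikes[:, j:j + L]) for j in starts])
info_per_word = S_total - S_noise                       # bits per L-bin word
```

Both entropy estimates are biased downward by finite sampling, which is exactly the difficulty discussed below; the remedy in this first paper was a very large data set.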
The
central technical difficulty in using these methods is the problem of bias due
to limited sample sizes. In this first paper we made an effort to collect a
very large data set, rather than trying to be especially sophisticated in how
we extract our estimates on entropy and information from the available samples.
There are interesting theoretical questions concerning how much one can say
about information theoretic quantities when the relevant probability
distributions are undersampled. For our efforts in this direction see [83,
101].
The metabolic cost of neural information.
SB Laughlin, RR de Ruyter van Steveninck & JC Anderson, Nature Neurosci 1, 36-41 (1998).
A striking fact about the brain is that very small groups
of cells change their metabolic rate in relation to their activity: thinking
harder really does cost energy, and the exquisite control of this energy
balance ultimately forms the basis for signals that are detectable in
functional imaging of the brain. The combination of our measurements on
signals, noise and information transmission in the fly retina with fairly
detailed mechanistic data on these cells allowed us to address the energetics
of information transmission in a new way.
We found that visual information is quite expensive, with a cost of ~
10,000 ATP molecules (each worth
~0.5 eV) per bit. In such noiseÐlimited signaling systems, transmission of
multiple parallel signals at relatively poor signalÐtoÐnoise ratio is vastly
more energetically efficient than transmitting a single high quality signal,
perhaps providing a physical justification for the frequent occurrence of both
multiple pathways and apparent unreliability of the individual pathways in
biological systems. Several groups are actively investigating related issues,
from the idea that neural codes are optimized for metabolic efficiency to the
possibility of exploiting these conclusions in low power silicon devices.
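The paper's numbers invite a back-of-envelope check, and the argument for parallel noisy channels can be made with the standard Gaussian channel capacity formula. This is our own sketch: the power and noise values are arbitrary illustration units, and only the ATP and eV figures come from the text above.

```python
import numpy as np

# Cost of information from the measurements quoted above:
# ~1e4 ATP per bit, each ATP worth ~0.5 eV (1 eV = 1.602e-19 J).
joules_per_bit = 1e4 * 0.5 * 1.602e-19     # ~8e-16 J per bit

def total_bits(k, P=100.0, N=1.0):
    """Bits through k parallel Gaussian channels sharing signal power P.

    Each channel gets power P/k against noise power N, and the Gaussian
    channel capacity is 0.5*log2(1 + SNR) bits per use."""
    return k * 0.5 * np.log2(1.0 + P / (k * N))
```

For a fixed power budget, `total_bits` grows with `k`: many low signal-to-noise channels carry more bits per unit energy than one high quality channel, which is the energetic argument for parallel, individually unreliable pathways.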
71. Synergy in a neural code. N Brenner,
SP Strong, R Koberle, W Bialek & RR de Ruyter van Steveninck, Neural
Comp 12, 1531-1552 (2000).
Timing of spikes could be significant (as demonstrated in [66]) because each spike points precisely to an event in the outside world, or because the system really uses temporal patterns of spikes to convey something special. Here we gave this question an information theoretic formulation: do patterns of spikes carry more information than expected by summing the contributions of individual spikes? Answering this requires measuring the information carried by particular candidate symbols in the code, and we showed how this can be done with real data, independent of any model assumptions, making connections between the information theoretic quantities and the more familiar correlation functions of the spike train. Although we focused on patterns across time in a single cell, everything generalizes to patterns across a population of cells. For the fly's motion sensitive neuron, we do observe synergistic coding, and this synergy is a significant component of the high coding efficiency seen in this system. The fact that we can measure objectively the information carried, for example, by a single spike also means that we have a benchmark against which to test models of the code.
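The definition of synergy can be made concrete with a textbook-style toy code (not the fly data): if the stimulus is the exclusive-or of spiking in two time bins, each bin alone carries zero information while the pattern carries one full bit, so the synergy is maximal.

```python
import numpy as np

def mutual_info(pxy):
    """I(X;Y) in bits from a joint probability table (rows X, columns Y)."""
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    nz = pxy > 0
    return float((pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])).sum())

# Toy code: stimulus s in {0,1}; response is spiking in two time bins
# (r1, r2). The *pattern* is informative while each bin alone is not:
# s = r1 XOR r2, with all four patterns equally likely.
# Rows: s; columns: patterns (00, 01, 10, 11).
p = np.array([[0.25, 0.00, 0.00, 0.25],    # s = 0 -> r1 == r2
              [0.00, 0.25, 0.25, 0.00]])   # s = 1 -> r1 != r2

I_pattern = mutual_info(p)                 # info in the two-bin pattern
# Marginal tables for each bin: r1 = column // 2, r2 = column % 2.
p_r1 = np.stack([p[:, :2].sum(1), p[:, 2:].sum(1)], axis=1)
p_r2 = np.stack([p[:, ::2].sum(1), p[:, 1::2].sum(1)], axis=1)
synergy = I_pattern - mutual_info(p_r1) - mutual_info(p_r2)
```

Here `I_pattern` is 1 bit while both single-bin informations vanish, so `synergy` equals 1 bit; in real data the same bookkeeping is done with empirical probabilities estimated from the spike trains.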
For
a discussion of synergy and redundancy in populations see [91]. There is a big
conceptual question about how one relates synergy or redundancy in populations
to the synergy or redundancy that one can observe among pairs, triplets, ... of
cells; see [90].
72. Adaptive rescaling optimizes information
transmission. N Brenner, W Bialek & R de Ruyter van Steveninck, Neuron 26, 695-702 (2000).
The direct demonstration of high coding efficiency in neural spike trains [45, 66, 77] strongly supports the old idea that the construction of an efficient representation could be the goal of neural computation. Efficient representations must be matched to the statistical structure of the input signals, and it is therefore encouraging that we observe higher coding efficiency for more naturalistic signal ensembles [58, 63, 74]; it was usually assumed, however, that such matching could occur only over the long time scales of development or evolution. In [53, 55] we proposed that adaptation to statistics would occur in real time, to exploit the intermittent structure of natural signals (e.g. [52]), and in [54, 61, 62] we presented evidence that this occurs both in the fly and in the vertebrate retina. Here we analyzed an example in detail, and found that adaptation to changes in the variance of the input has a striking form, rescaling the input/output relation of the neuron so that signals are coded in relative units. Further, the precise choice of the scaling factor serves to optimize information transmission; as far as we know this is the first direct demonstration that an optimization principle is at work in the brain.
There
are two notions of real-time adaptation to statistics at work in our thinking
about the fly, and in the fly itself. First is the idea of adaptation of the
computation that the fly does in estimating motion, as developed in [53].
Second is the idea that coding the output of these computations must be matched
to the distribution of the signal we are trying to encode. In general,
adaptation in real time works only if signals have a special statistical
structure in which low order statistical properties (variance, correlation
time, ... ) are constant across reasonable windows of space and time, and then
these low order statistics drift. Under these conditions one can generate the
sorts of long-tailed distributions seen for many different natural signals, and
it is also true that optimal coding and computational strategies will involve
adapting locally and tracking these drifting statistics. At least for us these
ideas have their origin in observations on the statistical structure of natural
images [52], where we first saw the "variance normalization" which has its
direct echo in the present work on adaptation and scaling.
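A minimal numerical sketch of the rescaling idea (our toy model, not the fly's measured input/output relation): a fixed saturating nonlinearity applied to inputs divided by the ambient standard deviation keeps the output entropy, a proxy for information capacity, high in both low- and high-variance environments, while a fixed gain wastes output range in one regime or the other.

```python
import numpy as np

rng = np.random.default_rng(3)

def output_entropy(y, bins=32):
    """Entropy (bits) of the binned output of the neuron's nonlinearity."""
    h, _ = np.histogram(y, bins=bins, range=(-1.0, 1.0))
    p = h / h.sum()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

results = {}
for sigma in (0.5, 4.0):                          # two sensory environments
    s = sigma * rng.normal(size=100000)
    adapted = output_entropy(np.tanh(s / sigma))  # gain rescaled by input std
    fixed = output_entropy(np.tanh(s))            # non-adapting gain
    results[sigma] = (adapted, fixed)
```

Because the adapted neuron always sees inputs in relative units, its output entropy is the same in both environments; the non-adapting neuron either under-drives or saturates its output range.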
75. Universality and individuality in a
neural code. E Schneidman, N Brenner, N Tishby, RR de Ruyter van Steveninck
& W Bialek, in Advances in Neural Information Processing 13, TK Leen, TG Dietterich & V
Tresp, eds., pp. 159-165 (MIT Press, Cambridge, 2001).
One of the major challenges in thinking quantitatively about biological systems is the variability among individuals. In the context of the neural code, we can ask whether different animals share similar neural representations of the same sensory inputs. The problem of comparing neural representations is similar in several ways to the problem of comparing DNA sequences, and we argue that rather than using conventional metric or string matching methods one should take a model independent, information theoretic approach; we believe that this is a substantial conceptual advance that should have implications back to the bioinformatics problem. We then find that the fly's visual system has a quantifiable mixture of universality and individuality: what is universal is the efficiency of the code, and what is individual is the precise way in which patterns of spikes are used to achieve this efficiency. Closely related to the problem of individuality is the problem of classifying neurons within one organism: can we, for example, make precise the impression that our retina has a small number of classes of cells which serve to divide the incoming visual information into parallel and stereotyped channels, or might each neuron in fact have a unique view of the world? Building on the methods introduced here, it is possible to give this classification problem a similar, purely information theoretic formulation [87].
77. Neural coding of naturalistic motion stimuli.
GD Lewen, W
Bialek, & RR de Ruyter van Steveninck, Network 12, 317-329 (2001).
Brains were selected by evolution for their performance in processing sensory signals of considerable complexity, far from the simple stimuli of the traditional experimentalist's toolbox. One of the themes in our exploration of the fly's visual system thus has been to provide methods for analyzing the responses to more complex and ultimately natural inputs. Any experiment with "natural" inputs, however, must make a compromise between emulating the richness of the real world and maintaining experimental control. Rather than trying to construct an ever more naturalistic world in the laboratory, here we took a different approach and recorded the responses of the motion sensitive neuron H1 with the fly outdoors, moving along angular trajectories taken from actual acrobatic flights. Even in response to constant velocity stimuli, the difference between outdoor and laboratory conditions is large enough to extend the dynamic range of these neurons by more than an order of magnitude in angular velocity. During motion along realistic flight trajectories, spike timing can be reproducible on the scale of 100 µsec; further, the ability of the photodetector cells to act as near-ideal photon counters at high counting rates is reflected in the fact that the information about motion continues to increase as the photon flux climbs toward its midday maximum. While much remains to be done, these initial results strongly support the conclusion that under natural conditions the nervous system can operate with a richness and precision far beyond that expected from experiments in more limited environments.
While
attractive, the idea that natural stimuli are coded more efficiently (or are
special in some other way) has been controversial. For a review that addresses
the controversy and presents several new results, see [73].
78. Efficiency and ambiguity in an adaptive
neural code. AL Fairhall, GD Lewen,
W Bialek & RR de Ruyter van Steveninck, Nature 412, 787-792 (2001).
Adaptation allows the nervous system to be better "matched" to the current sensory environment, but there are problems: adaptive codes are ambiguous, and matching takes time so one can fall behind. Here we take the observations on adaptation and optimization in [72] as a starting point and show that the dynamics of adaptation itself is optimal, so that the speed with which the system adjusts to a change in input distribution is close to the limit set by the need to gather statistics. Further, while the coding of short segments of the sensory stimulus in small numbers of spike trains is highly adaptive, the longer term statistics of the spike train contain enough information about the adaptation state of the cell to resolve potential ambiguities. Finally, there is no single time scale that characterizes the response to changes in input statistics; rather the system seems to have access to time scales ranging from less than 100 msec out to order one minute or more, which may make it possible to deal with the multiple time scales of variation in the real world.
103. Features and dimensions: Motion
estimation in fly vision. W Bialek & RR de Ruyter van Steveninck,
q-bio/0505003 (2005).
Here we build on the ideas of [21] and [72] to characterize the computation of motion in the fly visual system as a mapping from the high dimensional space of signals in the retinal photodetector array to the probability of generating an action potential in a motion sensitive neuron. We identify a low dimensional subspace of signals within which the neuron is most sensitive, and then sample this subspace to visualize the nonlinear structure of the mapping. The results illustrate the computational strategies predicted for a system that makes optimal motion estimates given the physical noise sources in the detector array. More generally, the hypothesis that neurons are sensitive to low dimensional subspaces of their inputs formalizes the intuitive notion of feature selectivity and suggests a strategy for characterizing the neural processing of complex, naturalistic sensory inputs. The same methods of analysis have been used to take a new look at the computations done in simple, biologically plausible model neurons [88], as well as other experimental systems from vertebrate retina to visual cortex. New, purely information theoretic methods should allow us to search for low dimensional relevant subspaces even when stimuli have all the complex correlation structure of fully natural signals [93].
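The dimensionality-reduction step can be sketched with spike-triggered covariance on a toy model cell (our construction; the paper's information-based methods are more general and apply beyond Gaussian stimuli): with white Gaussian inputs, eigenvectors of the spike-triggered covariance whose eigenvalues differ from the prior variance span the relevant low-dimensional subspace.

```python
import numpy as np

rng = np.random.default_rng(4)

# Toy cell: spiking probability depends on the stimulus only through its
# projections onto two hidden filters (an "energy"-style model).
D, n = 20, 200000
f1 = np.sin(np.linspace(0.0, np.pi, D)); f1 /= np.linalg.norm(f1)
f2 = np.cos(np.linspace(0.0, np.pi, D)); f2 /= np.linalg.norm(f2)
X = rng.normal(size=(n, D))                        # white Gaussian stimuli
drive = (X @ f1) ** 2 + (X @ f2) ** 2
spk = rng.random(n) < np.clip(drive / 10.0, 0.0, 1.0)

# Spike-triggered covariance, relative to the prior covariance (identity).
C = np.cov(X[spk].T) - np.eye(D)
w, V = np.linalg.eigh(C)
subspace = V[:, np.argsort(-np.abs(w))[:2]]        # two significant modes

# The true filters should lie (almost) entirely in the recovered subspace.
overlap = np.linalg.norm(subspace.T @ f1)          # 1.0 means perfect recovery
```

Once the subspace is in hand, the nonlinear input/output mapping can be visualized by sampling the spike probability as a function of the two projections, which is the second step described above.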
115. Neural coding of a natural stimulus ensemble: Uncovering information at sub-millisecond resolution. I Nemenman, GD Lewen, W Bialek & RR de Ruyter van Steveninck, q-bio.NC/0612050 (2006).
Our knowledge of the sensory world is encoded by neurons in sequences of discrete, identical pulses termed action potentials or spikes. There is persistent controversy about the extent to which the precise timing of these spikes is relevant to the function of the brain. We revisit this issue, using the motion-sensitive neurons of the fly visual system as a test case. New experimental methods (from [77]) allow us to deliver more nearly natural visual stimuli, comparable to those which flies encounter in free, acrobatic flight, and new mathematical methods (from [83,99]) allow us to draw more reliable conclusions about the information content of neural responses even when the set of possible responses is very large. We find that significant amounts of visual information are represented by details of the spike train at millisecond and sub-millisecond precision, even though the sensory input has a correlation time of ~ 60 ms; different patterns of spike timing represent distinct motion trajectories, and the absolute timing of spikes points to particular features of these trajectories with high precision. Under these naturalistic conditions, the system continues to transmit more information at higher photon flux, even though individual photoreceptors are counting more than one million photons per second, and removes redundancy in the stimulus to generate a more efficient neural code.