Brian Castle
Model Neurons


When investigating the brain, models are very helpful. They allow us to explore neural behaviors without sacrificing animals or invading their nervous systems. We would like to understand how behaviors can be precisely timed after just one example. For instance if you're a child learning to play basketball, and someone shows you how to make a basket, you can probably do it all by yourself after one or two attempts.

One-shot learning is an interesting study, but let's set that aside for a moment and look at some model neurons that let us simplify our investigations. Which model neuron we use depends on what behaviors we'd like to look at. If we're just interested in the algebra and statistics, we can use simple model neurons that are amenable to matrix multiplication on NVIDIA GPUs. On the other hand, there are many applications for which static algebra is inadequate, and particular kinds of dynamics are required. In order to support both, the neuron must have certain key capabilities, and these are what modeling helps us understand.


Early Model Neurons

Modeling of neurons for computational purposes began with McCulloch & Pitts in 1943. They considered a binary neuron whose state could be either firing or not firing. The neuron integrates its inputs (as a weighted sum); if the result is over the threshold the neuron fires, and if it's under the threshold it doesn't. The original threshold function was the Heaviside step function, which jumps at the threshold and is flat everywhere else, so it provides no useful derivative. The ideas embodied in the McCulloch-Pitts neuron later became the basis for the early learning machine called the Perceptron (Rosenblatt 1958). In the Perceptron, the synaptic weights are adjusted based on an error that is determined from the data. The Perceptron essentially "fits" the data to a set of linear parameters. Rosenblatt's Perceptron still used a hard threshold; its later multilayer descendants replaced the step with a sigmoidal threshold function, which is differentiable, so errors can be passed backward through the network and assigned to their sources in proportion to their contributions.
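As a minimal sketch of the binary threshold neuron and the error-driven weight adjustment described above (not Rosenblatt's original formulation), the Python below trains a single unit on logical AND. The function names, the learning rate, and the choice of AND as the dataset are illustrative assumptions, and the threshold is folded into a bias term.

```python
def heaviside(s):
    """Step threshold: fire (1) when the summed input reaches zero, else stay silent (0)."""
    return 1 if s >= 0 else 0


def binary_neuron(inputs, weights, bias):
    """McCulloch-Pitts style unit: weighted sum followed by a hard threshold.

    The bias plays the role of a negative threshold (an assumption of this sketch).
    """
    s = sum(w * x for w, x in zip(weights, inputs)) + bias
    return heaviside(s)


def train_perceptron(samples, n_inputs, rate=1.0, epochs=20):
    """Nudge each weight in proportion to the output error and its own input."""
    weights = [0.0] * n_inputs
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in samples:
            error = target - binary_neuron(inputs, weights, bias)
            weights = [w + rate * error * x for w, x in zip(weights, inputs)]
            bias += rate * error
    return weights, bias


# Logical AND can be split by one straight line, so the rule settles on a solution.
AND_SAMPLES = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_perceptron(AND_SAMPLES, n_inputs=2)
predictions = [binary_neuron(x, weights, bias) for x, _ in AND_SAMPLES]
```

After training, `predictions` matches the AND targets. The same loop would never settle on XOR, because no single line separates that dataset.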




Simple neurons like these have severe limitations, and based on what we've already seen they're inadequate to mimic biological behavior. Nevertheless they still have significant computational abilities when they're wired into populations. The Perceptron is able to solve "linearly separable" problems, meaning it finds the slope and intercept of a straight line that partitions the dataset. More neurons simply means more lines; the end result is a combination of linear partitions. Fitting linear models is probably the single most common activity in all of statistics, so this capability is useful. However, it doesn't work on all datasets (the classic counterexample is XOR, which no single straight line can separate), and the Perceptron is a bit of a one-trick pony in terms of its neural behavior. Unfortunately Frank Rosenblatt passed away before he could counter Marvin Minsky's points about Perceptron limitations, but before he died he saw his invention being used in real-time national security applications. Even binary neurons can be very powerful when wired into populations and given the right kinds of plasticity.

After the successful demonstration of the Perceptron, neuroscientists pointed out that the firing of neurons often uses a "rate code", where the rate of firing is determined by stimulus intensity. The "spike rate" was thought to be a neuron's only output, therefore any information carried out of the neuron must be encoded in the spike train. And so the binary neuron became a linear neuron, where the output can take on any positive value (or zero). In this case the calculation of the threshold still applies: if the summed input is below threshold the output is zero, and if it is above threshold the threshold level is simply subtracted to obtain the final value (this is the principle used in the ReLU function in machine learning). Such a model neuron is shown in the figure, and you can see the only thing that differs is the threshold function.
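The threshold-and-subtract rule for a rate-coded neuron is short enough to write out. This is only a sketch; the function names and the example weights are assumptions chosen for illustration.

```python
def threshold_linear(s, threshold=0.0):
    """Return how far the summed input exceeds the threshold, or zero if it doesn't."""
    return max(0.0, s - threshold)


def rate_neuron(inputs, weights, threshold=0.0):
    """Weighted sum of the inputs passed through the threshold-linear function."""
    s = sum(w * x for w, x in zip(weights, inputs))
    return threshold_linear(s, threshold)


above = rate_neuron([1.0, 2.0], [0.5, 0.25], threshold=0.5)   # summed input 1.0
below = rate_neuron([0.5, 0.0], [0.5, 0.25], threshold=0.5)   # summed input 0.25
```

With the threshold set to zero, `threshold_linear` is exactly the ReLU function.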




In the above figure the inputs I are multiplied by the synaptic weights W and integrated over the dendritic surface to arrive at a sum S, which is added to a "bias" term b representing noise or a baseline input level. The difference between this neuron and the McCulloch-Pitts neuron is that the threshold function f is sigmoidal (S-shaped) instead of being a step. When this threshold function can be differentiated, an error occurring at the output O can be passed back through the threshold and the contributions of each input to the error can be determined. Then, the synaptic weights can be updated to reflect the new error information. The process of passing the error in the opposite direction through the threshold function is called "back propagation".
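To make the error-passing step concrete, here is a hedged sketch of a single sigmoidal neuron with one gradient update. It assumes a squared-error measure and the logistic function for f; the names I, W, b, S and O follow the figure description, while the learning rate and the example values are assumptions.

```python
import math


def sigmoid(x):
    """Logistic threshold function f; its derivative is f(x) * (1 - f(x))."""
    return 1.0 / (1.0 + math.exp(-x))


def forward(I, W, b):
    """O = f(S + b), where S is the weighted sum of the inputs."""
    S = sum(w * i for w, i in zip(W, I))
    return sigmoid(S + b)


def backprop_step(I, W, b, target, rate=0.5):
    """Pass the output error back through f and update each weight by its share."""
    O = forward(I, W, b)
    error = target - O
    delta = error * O * (1.0 - O)          # error scaled by the slope of f
    new_W = [w + rate * delta * i for w, i in zip(W, I)]
    new_b = b + rate * delta
    return new_W, new_b, error


W, b = [0.0, 0.0], 0.0
inputs, target = [1.0, 0.5], 1.0
_, _, first_error = backprop_step(inputs, W, b, target)
for _ in range(100):
    W, b, last_error = backprop_step(inputs, W, b, target)
```

Each weight moves in proportion to its own input, which is what "assigning the error to its sources" amounts to for one neuron. In a multilayer network the same `delta` would be handed to the layer below.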

In some machine architectures, when the derivatives for successive layers can be calculated, the errors can be passed all the way back through the layers in one computational sweep. But brains don't work this way! To pass back an error, it has to be explicitly represented. And models like this are far from Hodgkin-Huxley, in that they contain no dynamics (for example, they don't account for the refractory period). In the 1970s and 1980s, in an effort to make model neurons more realistic, the essential behaviors of biological neurons were crystallized in several different ways.


Simplifying the Hodgkin-Huxley Model

Real neurons have more than two parts, more than just an integrative portion and a spike generating portion. The properties of a neural membrane vary according to its location. For example, calcium channels are present in the dendritic tufts of cortical pyramidal cells, and in the basal dendrites, but not in the shafts of the apical dendrites. In general, ion channels are carefully localized along the neural membrane, and this is true everywhere, not just in the synapses.

On the other hand, the full Hodgkin-Huxley model for multiple channel types is computationally intensive. It requires the solution of multiple sets of simultaneous differential equations, and it's very difficult to do this in real time, especially when there are millions of neurons in the network. There exist simulators for individual neurons (like NEURON and Nengo), and simulators for networks (like TensorFlow and PyTorch), but there's very little in between, and that is because the behaviors that we require from the neurons and synapses are computationally difficult. With the current state of the art in neural modeling, if someone discovers an exciting new behavior related to calcium channels in a cortical stellate cell, it's very difficult to subsequently create a network full of such cells and test them under changing input conditions.

While the above model neurons distill the basics of dendritic summation and a non-linear threshold for action potential generation, they're too simple to handle the complexities of multiple ionic conductances. On the other hand, the Hodgkin-Huxley model is four-dimensional and a little too difficult computationally. Perhaps we can find something in the middle, something a little more realistic that's still amenable to computation.


Morris-Lecar Model

One of the variations of Hodgkin-Huxley that students should be familiar with (because it's widely cited and widely used) is the Morris-Lecar model. This is a somewhat simplified version of H-H that still shows bifurcations and spiking activity.



(figure from Rowat & Greenwood 2014)

The drawback of the Morris-Lecar model is that it uses only two voltage-gated conductances (calcium and potassium, plus a passive leak), and it has some significant computational artifacts, both at high frequencies and at certain particular frequencies.
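For readers who want to see the two-variable structure, the following is a rough sketch of the Morris-Lecar equations integrated with a simple forward Euler step. The parameter values are an assumption: they are a commonly quoted textbook set for the regime where spiking appears through a Hopf bifurcation, and they are not taken from the figure above. The Euler integrator is also a convenience choice, not a recommendation.

```python
import math

# Assumed parameter set (mV, ms, uA/cm^2, mS/cm^2, uF/cm^2); not from the source text.
C = 20.0
G_L, G_CA, G_K = 2.0, 4.4, 8.0
V_L, V_CA, V_K = -60.0, 120.0, -84.0
V1, V2, V3, V4 = -1.2, 18.0, 2.0, 30.0
PHI = 0.04


def m_inf(v):
    """Instantaneous calcium activation."""
    return 0.5 * (1.0 + math.tanh((v - V1) / V2))


def w_inf(v):
    """Steady-state potassium activation."""
    return 0.5 * (1.0 + math.tanh((v - V3) / V4))


def simulate_morris_lecar(current, t_max=500.0, dt=0.05, v0=-60.0, w0=0.0):
    """Return the membrane voltage trace for a constant injected current."""
    v, w = v0, w0
    trace = []
    for _ in range(int(t_max / dt)):
        dv = (current
              - G_L * (v - V_L)
              - G_CA * m_inf(v) * (v - V_CA)
              - G_K * w * (v - V_K)) / C
        dw = PHI * (w_inf(v) - w) * math.cosh((v - V3) / (2.0 * V4))
        v += dt * dv
        w += dt * dw
        trace.append(v)
    return trace


resting = simulate_morris_lecar(0.0)     # no input: settles near the leak potential
spiking = simulate_morris_lecar(100.0)   # strong input: repetitive spiking
```

Only two state variables are integrated (voltage and the potassium gate), which is what makes this model so much cheaper than the four-dimensional Hodgkin-Huxley system.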


FitzHugh-Nagumo Model

Another model neuron variation that is frequently found in the literature is the FitzHugh-Nagumo model. This model is an abstraction, rather than a direct model of ion channels. It does not display any bursting behavior, nor can it model subthreshold dynamics.



(figure from Nagumo et al 1962)
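A small sketch shows how abstract the FitzHugh-Nagumo model is: one fast voltage-like variable v and one slow recovery variable w, with no explicit ion channels. The parameter values (a = 0.7, b = 0.8, eps = 0.08) are the ones most often quoted for this model and are an assumption here, as is the forward Euler integration.

```python
def simulate_fhn(current, a=0.7, b=0.8, eps=0.08,
                 t_max=200.0, dt=0.01, v0=-1.0, w0=-0.5):
    """Integrate dv/dt = v - v^3/3 - w + I and dw/dt = eps * (v + a - b*w)."""
    v, w = v0, w0
    vs = []
    for _ in range(int(t_max / dt)):
        dv = v - v ** 3 / 3.0 - w + current
        dw = eps * (v + a - b * w)
        v += dt * dv
        w += dt * dw
        vs.append(v)
    return vs


quiet = simulate_fhn(0.0)         # relaxes to a stable resting point
oscillating = simulate_fhn(0.5)   # constant drive produces a limit cycle
```

The variables are dimensionless, so the trace should be read as a qualitative picture of excitability rather than as millivolts and milliseconds.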

Any of the above simplified model neurons can still be useful if we're just trying to replicate simple network or neuron behavior. However the simplified dynamics can create computational artifacts. We would like a simple model neuron that is computationally friendly and can be modified at least to a limited extent, so we can test the effect of various conductances and various geometries.


Integrate-and-Fire (Izhikevich) Model

Spike times can be precisely modeled by integrate-and-fire neurons, which are frequently used when modeling TTFS and STDP (please refer to the glossary for the definitions of these terms). Unfortunately this model can create spike trains with arbitrarily high firing frequencies, so one must be careful. There are modifications to this model that allow for setting maximum firing rates (Strack et al 2023).



(figure from Izhikevich 2003)

A MATLAB tutorial for Izhikevich neurons is given here.
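For readers without MATLAB, here is a hedged Python sketch of the same kind of neuron, using the two-equation form and the "regular spiking" parameters published in Izhikevich 2003 (a = 0.02, b = 0.2, c = -65, d = 8). The time step, the input currents, and the function name are assumptions.

```python
def simulate_izhikevich(current, a=0.02, b=0.2, c=-65.0, d=8.0,
                        t_max=1000.0, dt=0.25):
    """Return spike times (ms) for a constant input current.

    v is the membrane potential (mV) and u is a slow recovery variable.
    """
    v = c
    u = b * v
    spike_times = []
    for step in range(int(t_max / dt)):
        if v >= 30.0:                 # spike cutoff: record, then reset
            spike_times.append(step * dt)
            v = c
            u += d
        v += dt * (0.04 * v * v + 5.0 * v + 140.0 - u + current)
        u += dt * a * (b * v - u)
    return spike_times


silent = simulate_izhikevich(0.0)
moderate = simulate_izhikevich(10.0)
strong = simulate_izhikevich(20.0)
```

Raising the input current raises the spike count with nothing in the equations to cap it, which is the runaway firing rate the text warns about.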


Poisson Models

There are some models that are "in between" rate codes and spike times, for example they may use a Poisson approximation to estimate spike times from a rate code. Such models are frequently used in oscillator paradigms to conveniently visualize the system attractors. Poisson models are related to gamma distributions, which in turn are related to Bayesian statistics, and gamma functions (which are the integrals of gamma distributions) are related to the fractional calculus which describes past, present, and future events.

A Poisson process can be used to model the intervals between spikes. Unlike a Gaussian distribution, which is parametrized by its mean and variance, a Poisson process has only a single parameter, which is its rate. (The variance of the spike count in any time window is always equal to its mean.) A homogeneous Poisson process assumes stationarity, that is, the rate does not vary with time. This is of limited value in modeling biological neurons, since real neurons are not stationary. Nevertheless, a Poisson model can create some realistic-looking spike trains.



(figure from Amin 2006)

The ratio of variance to mean of the spike count is called the Fano factor. It equals 1 for a Poisson process, so it measures how far a spike train departs from Poisson variability. If we're looking at a time series and digesting data as it arrives, the Fano factor can vary with time.

Generally, $F(t) = \sigma_t^2 / \mu_t$

If the Fano factor varies with time, the counting process is no longer a renewal process and a Markov renewal process is needed instead, where transitions are described probabilistically.
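As a quick check on the Poisson picture, the sketch below builds a homogeneous Poisson spike train from exponentially distributed intervals and estimates the Fano factor from repeated trials. The 20 Hz rate, the 1 second window, the number of trials, and the function names are all assumptions chosen for illustration.

```python
import random


def poisson_spike_train(rate_hz, duration_s, rng):
    """Homogeneous Poisson train: intervals are exponential with mean 1/rate."""
    t, spikes = 0.0, []
    while True:
        t += rng.expovariate(rate_hz)
        if t >= duration_s:
            return spikes
        spikes.append(t)


def fano_factor(counts):
    """Sample variance of the spike counts divided by their mean."""
    mean = sum(counts) / len(counts)
    variance = sum((c - mean) ** 2 for c in counts) / (len(counts) - 1)
    return variance / mean


rng = random.Random(0)                      # fixed seed so the run is repeatable
counts = [len(poisson_spike_train(20.0, 1.0, rng)) for _ in range(2000)]
mean_count = sum(counts) / len(counts)
fano = fano_factor(counts)
```

For a stationary Poisson process the estimate should land close to 1. A Fano factor well below 1 suggests more regular firing, and one well above 1 suggests bursting or a drifting rate.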

To model changes in the firing rate over time, we can use an inhomogeneous Poisson process, which is a stochastic process where the rate varies over time as λ(t). Such a process still assumes independent increments, that is, the numbers of events in non-overlapping time intervals are independent. And again, this may not be a good assumption for biological neurons; however, it has worked in some cases, and its relationship to adaptation is noteworthy (Farqui et al 2013).
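One standard way to draw spikes from a time-varying rate λ(t) is "thinning": generate candidate spikes at the maximum rate, then keep each one with probability λ(t) divided by that maximum. The sketch below uses that technique with a sinusoidal rate. The rate function, its parameters, and the names are assumptions, and the text above does not say which method any cited work used.

```python
import math
import random


def inhomogeneous_poisson(rate_fn, rate_max, duration_s, rng):
    """Thinning: propose spikes at rate_max, accept each with prob rate_fn(t)/rate_max."""
    t, spikes = 0.0, []
    while True:
        t += rng.expovariate(rate_max)
        if t >= duration_s:
            return spikes
        if rng.random() < rate_fn(t) / rate_max:
            spikes.append(t)


def sinusoidal_rate(t):
    """Assumed example rate: 20 Hz baseline modulated between 5 and 35 Hz once per second."""
    return 20.0 + 15.0 * math.sin(2.0 * math.pi * t)


rng = random.Random(0)
spikes = inhomogeneous_poisson(sinusoidal_rate, 35.0, 200.0, rng)
rising_half = sum(1 for t in spikes if (t % 1.0) < 0.5)   # rate above baseline
falling_half = len(spikes) - rising_half                  # rate below baseline
```

The method is only valid when `rate_max` is at least as large as λ(t) everywhere. Spikes in non-overlapping windows remain independent, which is exactly the assumption questioned above.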

Poisson-"like" processes are useful in other ways too. Gamma and beta distributions can be used to synthesize Poisson-"like" distributions and the advantage of using these models is the distributions form conjugate priors for Bayesian inference. So, if a synapse or neuron can be related to such a distribution, and the behavior is adjustable based on the data, then the network can be used for things like estimation of likelihood, optimization, and creativity in the generative sense.
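The conjugate-prior idea can be shown in a few lines. If the belief about a firing rate is a gamma distribution and the observed spike counts are Poisson, the updated belief is again a gamma distribution with simple parameter arithmetic. The prior values and the spike counts below are made-up illustrative numbers, not data.

```python
def gamma_poisson_update(shape, rate, counts, window_s=1.0):
    """Posterior Gamma(shape, rate) over a firing rate after Poisson spike counts.

    shape grows by the total number of spikes; rate grows by the total observation time.
    """
    return shape + sum(counts), rate + window_s * len(counts)


prior_shape, prior_rate = 2.0, 0.1      # assumed weak prior with mean 20 Hz
observed = [12, 9, 11, 10, 8]           # hypothetical counts in five 1 s windows
post_shape, post_rate = gamma_poisson_update(prior_shape, prior_rate, observed)
prior_mean = prior_shape / prior_rate
posterior_mean = post_shape / post_rate
```

The posterior mean moves from the prior's 20 Hz toward the roughly 10 Hz seen in the counts, and every new window repeats the same cheap update. This is the sense in which a synapse or neuron tied to such a distribution can adjust its behavior based on the data.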


Stochastic Models

There is a class of models that doesn't depend on the underlying physics and only looks at behavior (these models could be classified as "statistical", or "phenomenological"). The statistical approach uses the inter-spike interval (ISI) as the foundation of its method, and thus is often associated with Poisson dynamics. The subsequent analysis is very much along the lines of data science (using principal component analysis and similar techniques). The advantage of this method is that synaptic modification can be performed on the basis of temporal correlation alone, regardless of the underlying physical mechanisms. This approach has its roots in machine learning; it goes all the way back to Kohonen, Widrow, Hebb, and beyond.

At a more precise level, neurons can be modeled as stochastic generators. This is essentially a better version of the Poisson approach, insofar as the setpoints control the resulting outcomes. The generators can be kept simple, and in a stochastic system the Wiener process is about as simple as it gets. Unfortunately the assumption of stationarity is violated in many ways in biological systems, and the real situation is closer to nonlinear non-equilibrium thermodynamics.

Stochastic modeling is where the math gets complex. To understand neurons in their full stochastic glory, an engineering education is necessary. This is because the resulting relationships invoke a class of physical models that includes the Langevin formalism, Fokker-Planck equations, the Ornstein-Uhlenbeck process, and stability considerations such as Lyapunov exponents and Routh-Hurwitz criteria. However, stochastic modeling is vitally important for research in neuroscience, because it helps us understand behavior in the phase plane, including oscillations and the transitions between states (notably bifurcations, especially Hopf bifurcations). Stochastic modeling is as close to ground truth as we presently come, in terms of biological realism.
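The simplest entry point to this toolbox is the Ornstein-Uhlenbeck process, which is a Langevin equation with a linear restoring force: a Wiener process that is pulled back toward a setpoint. The sketch below integrates it with the Euler-Maruyama method. Reading the variable as a noisy membrane potential, and the specific values for the setpoint, spread, and time constant, are assumptions for illustration.

```python
import math
import random


def simulate_ou(mu, sigma, tau, dt, n_steps, rng):
    """Euler-Maruyama for dx = -(x - mu)/tau dt + sigma * sqrt(2/tau) dW.

    With this scaling the stationary standard deviation is sigma.
    """
    x = mu
    noise_scale = sigma * math.sqrt(2.0 * dt / tau)
    path = []
    for _ in range(n_steps):
        x += -(x - mu) / tau * dt + noise_scale * rng.gauss(0.0, 1.0)
        path.append(x)
    return path


rng = random.Random(1)
# Assumed values: setpoint -65 mV, spread 2 mV, time constant 10 ms, step 0.1 ms.
path = simulate_ou(mu=-65.0, sigma=2.0, tau=10.0, dt=0.1, n_steps=200000, rng=rng)
mean = sum(path) / len(path)
variance = sum((x - mean) ** 2 for x in path) / len(path)
```

The setpoint `mu` and spread `sigma` directly control the long-run statistics, which is the sense in which setpoints control outcomes in a stochastic generator. The corresponding Fokker-Planck equation describes how the whole distribution of x relaxes toward that stationary Gaussian.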

Stochastic modeling is also directly applicable to the growth of a neural network, for instance the ability of axons to find their targets, and the pruning that occurs on the basis of incoming stimuli. Stochastic modeling is also important at the molecular level, to understand the release of synaptic vesicles, the binding of neurotransmitters to receptor sites, and the mechanical coupling associated with active transport and the management of receptor concentrations.


Compartmentation

In addition to the timing of action potentials, there are also questions relating to the integration of signals (data) along the dendrites. Among them are the issue of membrane multi-stability, the issue of dendritic spiking, and the issue of sub-threshold membrane oscillations. The latter issue can affect both integration and spike generation, and possibly serve as a control point for one or both.

One of the key features of neurons is that their processes branch, and sometimes the branching geometry can be very different from one end of the neuron to the other. Dendrites in particular are geometrically important because of their surface integrative properties, and even within the same neuron they can have different lengths and diameters, and different concentrations of ion channels that are compartmentalized and kept in place by the cytoskeleton.

The classic compartmental model for dendrites is still the Rall model. In this model, dendrites are treated as passive cable conductors. (The model was created before the discovery of spiking dendrites). Dendrites are modeled as branching trees of equivalent cylinders (an extension allows for tapering).



(figure from Wilfrid Rall - CC BY-SA 2.5)


(figure from Wilfrid Rall - CC BY-SA 2.5)
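Two results from passive cable theory capture most of what the equivalent-cylinder picture is used for. The sketch below computes the length constant of a cylindrical dendrite and applies the 3/2 power rule that lets a branch point be collapsed into a single cylinder. The membrane and axial resistivity values are assumed round numbers, and the function names are hypothetical.

```python
import math


def length_constant_cm(diameter_cm, r_m=20000.0, r_i=100.0):
    """Passive length constant: sqrt((R_m / R_i) * (d / 4)).

    r_m is membrane resistivity in ohm*cm^2, r_i is axial resistivity in ohm*cm
    (both assumed values); the result is in cm.
    """
    return math.sqrt((r_m / r_i) * (diameter_cm / 4.0))


def equivalent_parent_diameter(daughter_diameters):
    """3/2 power rule: d_parent^(3/2) equals the sum of d_daughter^(3/2)."""
    return sum(d ** 1.5 for d in daughter_diameters) ** (2.0 / 3.0)


def steady_state_attenuation(distance_cm, diameter_cm):
    """Fraction of a steady voltage remaining at a distance along a long passive cable."""
    return math.exp(-distance_cm / length_constant_cm(diameter_cm))


lam = length_constant_cm(2e-4)                       # a 2 micron dendrite
parent = equivalent_parent_diameter([1.0, 1.0])      # two equal daughter branches
remaining = steady_state_attenuation(lam, 2e-4)      # one length constant away
```

With these assumed resistivities a 2 micron dendrite has a length constant of about 1 mm, and a steady signal falls to roughly 37% after travelling that far. When the daughter branches satisfy the 3/2 rule, the whole tree can be treated as one unbranched cylinder.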

Within a dendrite, compartmentation is further enabled by the architecture related to spiny synapses. The thin stalks of dendritic spines help to restrict biochemical diffusion, essentially creating an independent compartment in each synapse. The communication out of such compartments is often complex, involving multiple stable states in the nearby dendritic membrane. In addition to ordinary graded potentials and plateau potentials, spiny synapses can generate dendritic mini-spikes that travel into the cell body, where they may convert the neuron from one state to another.

Spiny synapses are vitally important to understand computationally, not only because they're ubiquitous, but because the integration of dendritic mini-spikes can occur quite differently from the passive forms of integration usually promoted in the classroom. In the simplest case, mini-spikes are generated by two or more successive excitatory inputs (within a very narrow window, say a few milliseconds). The propagation of mini-spikes can fail at branch points; however, when it succeeds, the propagation of mini-spikes into the soma can force the neuron into a high-throughput "up" state, where subsequent inputs can affect bursting. At the moment the regulation of this process is poorly understood, and it is an area of very active research. It certainly involves at least two different kinds of glutamate receptors, calcium, proteins like CaMKII, glial cells, transcription factors, and the spine apparatus and its endoplasmic reticulum. The activation of "hot spots" in the data and their translation into "up states" in neurons is one way in which criticality can be controlled at the modular level in the network.


Sculpting Population Behavior

At the end of the day, the behavior of a neuron is determined mainly by its ion channels. These channels can have different kinetics, they can be voltage sensitive or not, and they can have different conditions for activation and deactivation. One can model such behaviors using programs like NEURON or Nengo, but they're very difficult to implement in machine learning situations with TensorFlow or PyTorch, because the latter specialize in matrix multiplication and can't really handle dynamics. So the machine learning community encodes the dynamics in various ways to approximate the desired results. We are just now at the point where the neuroscience and machine learning communities are beginning to inform each other.

I started modeling neural networks in 1978, long before they were a thing. There was almost no one in the field at that time, only a handful of forward-looking researchers who were often perceived as eccentric bordering on crazy. I was at Princeton when John Hopfield wrote his Nobel Prize-winning papers, and at the time no one took notice; they were considered an interesting oddity in the physics community. (But I noticed - I was fresh off extracting Volterra kernels from the lateral line organs of sharks and skates, and I looked at Hopfield's first 1982 paper and started laughing. "Binary neurons!" I said, and put the journal down. The next day I was back in the library looking at it again, because something about it caught my attention.) And then it was a few years before Hinton and Sejnowski and David Tank and others expanded Hopfield's models and invented the Boltzmann machine (and demonstrated NETtalk - the video is still on YouTube).

The building blocks for neural populations are just like the building blocks for electric circuits. There are components, and there are wires. In biology as in electronics there are hundreds of different kinds of components, but unlike in electronics, biology has different kinds of wires too. Some of the biological wires behave in highly nonlinear ways, and they enable some special-purpose behaviors which would otherwise be nearly impossible to accomplish by traditional means.

The computational power of neurons is in populations. Neurons by themselves do very little, but when they're hooked together they become very powerful. And, when engaging in the modeling of neural populations, it helps to have an interactive environment with a friendly GUI and a powerful computational engine. Such environments are available, usually for free for individual and student use. For example, Nengo is a wonderful interactive modeling environment, and it interfaces with all manner of computational engines, even neuromorphic hardware. Using such an "IDE" makes turnaround times much faster. We'll see some Nengo output when we begin modeling in a few short pages, and there are many other useful simulation environments, including NEURON, Brian, NEST, and others.




(c) 2026 Brian Castle
All Rights Reserved
webmaster@briancastle.com