Brian Castle
Visual Memory


When talking about memory, we must be specific. The word "memory" has a specific meaning, and it does not equate with plasticity. In many cases the two are intimately linked, but they're not the same. Technically, memory is a dependence on prior times, so instead of f(t) we have f(t, t-1, t-2, ..., t-n). Plasticity, by contrast, is the algorithm that changes the parameters inside f, over time and depending on the input.
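The distinction can be made concrete with a small sketch. Everything here is illustrative: the class names, the weighted-sum form of f, and the delta-rule update are assumptions chosen for brevity, not a claim about how neurons implement either property.

```python
from collections import deque


def memoryless(x_t, gain=1.0):
    """f(t): the output depends only on the current input."""
    return gain * x_t


class WithMemory:
    """f(t, t-1, ..., t-n): the output depends on the most recent inputs.

    The parameters (weights) never change, so this has memory
    but no plasticity.
    """

    def __init__(self, weights):
        self.weights = list(weights)  # weights[0] applies to the newest input
        self.history = deque(maxlen=len(self.weights))

    def step(self, x_t):
        self.history.appendleft(x_t)  # newest sample first
        return sum(w * x for w, x in zip(self.weights, self.history))


class Plastic:
    """A memoryless mapping whose parameter is changed by a learning rule.

    The output at time t uses only x_t, but the gain itself drifts
    with experience: plasticity without an explicit input history.
    """

    def __init__(self, gain=0.0, rate=0.5):
        self.gain = gain
        self.rate = rate

    def step(self, x_t, target):
        y = self.gain * x_t
        # Delta rule (an illustrative choice): nudge the gain to reduce the error.
        self.gain += self.rate * (target - y) * x_t
        return y
```

Feeding the same input twice to WithMemory can give two different outputs because the history differs, while Plastic gives different outputs because its parameter has changed. The two mechanisms can be combined, which is why they are so often conflated.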

In humans, memory is a layered process with different stages involving different time courses. Working memory is almost instantaneous, and disappears almost as quickly. Short term memory is organized by the same brain areas that handle scene reconstruction, and the time course there might be a half hour to an hour. Consolidation into long term memory takes hours, even days. The process of memory consolidation is not well understood. One of the requirements is that only essential information gets stored. Unlike machines, however, humans can need only one presentation to retain an image indefinitely. Information does fade and decay, but the time courses of the decay are set up so they're "longer than" what is generally needed in real world biological settings. (The same is true for the integrator in the oculomotor system, which only has to deal with delays on the order of a few seconds.)

Visual content is guided towards the hippocampus by pathways from both visual streams into the entorhinal cortex and surrounding areas (parahippocampal cortex, perirhinal cortex, and so on). There, scene information is "phase encoded" relative to a theta rhythm organized in the medial septal nucleus. It is still unclear whether the septal rhythm is a dependency or merely a driver, since many cortical areas can generate their own rhythms locally. However, the net effect is to synchronize activity in the population of hippocampal neurons. Much as the visual cortex does, scene reconstruction seems to sample the input space.

The visual streams come together at the level of the hippocampus. The circuitry around the hippocampus performs two essential functions: scene reconstruction (which helps us navigate), and short term memory (which helps provide the context related to the scene). The memory process in the hippocampus is often referred to as "episodic" because it pertains to a series of events that are related in time.

There is also a different kind of visual memory, called "working memory", that preferentially involves the visual cortex and the frontal cortex. There is some evidence that visual signals reach the frontal cortex as fast as they reach the visual cortex, or maybe even faster. The pathway for this fast signaling is currently unknown.

The hippocampus is much too complicated a topic to do justice to on these pages. There is a plethora of information about it online. The figure shows the major anatomical landmarks associated with the hippocampus. This is a medial view of the brain, showing the inside of the temporal lobe. Of particular interest is the entorhinal cortex, which receives signals from both visual streams.






The hippocampus and surrounding areas have a complex architecture. These figures show some of the highlights.












There is overwhelming evidence that the hippocampus and surrounding areas engage in scene reconstruction. To do this, information from the dorsal and ventral visual streams must be combined, because we need to know both where an object is and what it is. Human brain anatomy supports this convergence. The ventral visual stream enters the perirhinal cortex and terminates preferentially in the lateral entorhinal cortex, whereas the dorsal stream enters the parahippocampal cortex and terminates preferentially in the medial entorhinal cortex. Both pathways feed the hippocampus in multiple ways, both directly and indirectly via the subiculum.

In turn, the hippocampus communicates with the dorsolateral prefrontal cortex, from which it derives contextual information related to the scene, and to which it sends encoded episodic information.




In real life the situation is considerably more complicated than this simple picture. fMRI reveals specific connectivity between regions for which the anatomical information in humans is sparse or nonexistent. Significant progress is being made in this area using the human connectome (Rolls et al 2022).

The entorhinal cortex and the hippocampus contain the ingredients necessary for scene reconstruction. In this activity, the objects in the scene are localized relative to the organism, and mapped in an egocentric coordinate system in such a way that the organism can navigate. In such a map, the details of a particular object are only relevant insofar as they assist real-time behavior - they may not enter short term memory at all. When humans recall a scene, we frequently forget the irrelevant information. Sometimes we can recall it when prompted; other times it's as if it never got into storage.

There is also abundant evidence that a translation between egocentric and allocentric spatial reference frames is performed in and around the hippocampal circuitry (Szczepanski and Saalmann 2013). The allocentric reference frame is needed to navigate, and the egocentric reference frame is needed for the limb and body movements underlying navigation. The parietal spatial areas and the frontal eye fields are organized in viewer-centric coordinates, while the supplementary eye fields and a portion of the superior parietal lobule may be organized in allocentric coordinates. In the entorhinal cortex and hippocampus, the encoding is clearly allocentric because of the grid cells and place fields.
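As a geometric illustration of what such a translation involves, here is a minimal two-dimensional sketch. It assumes the observer's position and heading in the world frame are known, and it reduces the problem to a single rotation plus a translation. The function names and the (forward, left) convention are hypothetical choices for this example; nothing here is meant to describe how the hippocampal circuitry actually computes the mapping.

```python
import math


def ego_to_allo(ego_xy, observer_xy, heading):
    """Viewer-centered (forward, left) -> world-centered (x, y).

    heading is the observer's facing direction in radians, measured
    counterclockwise from the world x-axis.
    """
    forward, left = ego_xy
    ox, oy = observer_xy
    c, s = math.cos(heading), math.sin(heading)
    # Rotate the viewer-centered offset into world axes, then translate.
    return (ox + c * forward - s * left, oy + s * forward + c * left)


def allo_to_ego(allo_xy, observer_xy, heading):
    """World-centered (x, y) -> viewer-centered (forward, left)."""
    dx = allo_xy[0] - observer_xy[0]
    dy = allo_xy[1] - observer_xy[1]
    c, s = math.cos(heading), math.sin(heading)
    # Inverse of the rotation above (its transpose), applied after un-translating.
    return (c * dx + s * dy, -s * dx + c * dy)
```

For an observer standing at (1, 2) and facing along the world y-axis (heading = pi/2), an object three units straight ahead lands at approximately (1, 5) in the world frame, and allo_to_ego maps it back to approximately (3, 0). The point of the sketch is that the same object has different coordinates in the two frames, and that moving between them requires knowing where the observer is and which way it is facing.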

In addition to the two pathways directly related to visual objects in real time, there is a third stream that Rolls calls the "reward stream". It enters from the orbitofrontal cortex and anterior cingulate cortex, and holds information related to the "value" of an object, or of an object relative to other objects ("scene configuration"). There are direct monosynaptic connections from the hippocampus to the nucleus accumbens. This area is the primary target of dopamine fibers from the ventral tegmental area, and is reciprocally connected with it.


Memory and the Limbic System

A memory is not just in "one" place in the brain. Bits and pieces of it are all over. The details of a memory seem to be stored in the areas that are most relevant to it. For instance, visual information is stored in the visual cortex, and fear and threats end up in the amygdala. The Circuit of Papez was originally proposed as an emotional system, and later it became linked to short term memory, but now it seems that its primary role is neither of these. Its primary role seems to be scene reconstruction. This circuit includes the mammillary bodies, which respond to head direction, and the anterior cingulate cortex, which handles attention among other things. The hippocampus projects massively to the prefrontal cortex, as does the mediodorsal nucleus of the thalamus, which is richly connected to almost the entire prefrontal area including the orbital portion.

It is impossible to divorce scene reconstruction from short term memory. This is (at least in part) because scene reconstruction requires context. Information is phase encoded before it reaches short term memory. What does this mean exactly? Why is it necessary?

Phase encoding is a way of locally compressing information. It encodes temporal relationships between events, in a form that can be staged and conveniently replayed along the neural timeline. It enables a form of local temporal invariance that helps separate the objects themselves from the sequence of their relationships. Phase encoding is an efficient form of encoding information near the point at infinity on the timeline.
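A toy sketch may help make the idea concrete. It assumes a fixed 8 Hz reference rhythm (an illustrative value) and represents each event by its phase within the current cycle, discarding which cycle it occurred in. The function names are hypothetical, and this is not a model of hippocampal theta coding; it only shows how phase can preserve order while allowing the sequence to be replayed on a different time scale.

```python
import math

THETA_HZ = 8.0  # assumed reference frequency, for illustration only


def phase_encode(events, freq_hz=THETA_HZ):
    """Map {label: time in seconds} to {label: phase in [0, 2*pi)}."""
    period = 1.0 / freq_hz
    return {
        label: 2.0 * math.pi * ((t % period) / period)
        for label, t in events.items()
    }


def order_by_phase(phases):
    """Recover the sequence of events within a cycle from phase alone."""
    return sorted(phases, key=phases.get)


def replay(phases, period_s):
    """Turn phases back into times within one cycle of a chosen length."""
    return {
        label: phase / (2.0 * math.pi) * period_s
        for label, phase in phases.items()
    }


# Three events inside one 125 ms cycle, and the same events three cycles later.
events = {"door": 0.010, "table": 0.040, "window": 0.090}
later = {label: t + 3 * (1.0 / THETA_HZ) for label, t in events.items()}

phases_now = phase_encode(events)
phases_later = phase_encode(later)
```

In this sketch phases_now and phases_later agree to within floating-point error, which is one way to read "local temporal invariance": the code does not care which cycle the sequence arrived in. Calling replay(phases_now, 0.0125) reproduces the same ordering compressed tenfold, which is the sense in which a phase code can be staged and replayed.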




(c) 2026 Brian Castle
All Rights Reserved
webmaster@briancastle.com