Chapter 20 in
Introduction [by Anderson, co-editor of the book, to paper by
Pellionisz and Llinás 1985, reproduced in
facsimile of a cardinal paper of Tensor Network Theory. Hyperlinks show paper
in facsimile, non-searchable, while the html below is searchable]
(1985) A. Pellionisz and R. Llinás
Tensor network theory of the metaorganization of functional geometries
in the central nervous system
Neuroscience 16: 245-273
(Comments by Anderson) This is a difficult paper about geometry.
Rodolfo Llinás is a neurophysiologist, well known for over two decades of
productive work on the cerebellum. András Pellionisz has done well-known work
on theoretical neurobiology for about the same period. This paper is the
product of one of a relatively small number of close collaborations between a
theoretician and an experimentalist, where each contributed extensively to the
final result.
Neural network models nearly always represent information as
collections of values of neural activity shown by many model neurons: formally,
state vectors. A state vector is a point in a high-dimensional space. Spaces
have geometries, and this paper suggests that we should not take this geometry
for granted. Network models make use of geometry to some extent already because
they depend heavily on concepts of "nearness" in the sense of
distance or angle between points in a state space. As only one example, models
to explain the development of topographic maps in the brain (see von der
Malsburg, paper 8; Amari, paper 9; Kohonen, paper 37) develop so that units
physically near to one another in an array of units become more correlated in
their response properties. Neural networks in essence put a state vector into
the system as input and get as output another state vector and make use of the
structure of the real world through correlations represented in the input state
vectors.
It is a neural network truism that networks develop so as to pick up
the statistical structure of their environment. However, there is another
aspect of the environment that most network models have not made much use of up
to this time. Animals move about in the world, touch it, and are touched by it.
The ultimate output of any biological neural computation must be a motor act.
Motor acts have sensory consequences; there is a sensory-motor, motor-sensory
loop closed by the environment. (One of the most intellectually exciting things
about the rapidly developing field of robotics is that it is not possible to
avoid these biological problems when constructing artificial systems.)
Organisms do not exist in a world of random, high-dimensional vectors.
Their world is a three-dimensional geometrical structure that is accurately
described by Euclidean geometry. Animals like us, with internal skeletons,
interact with that three-dimensional geometry by contracting muscles to pull
bones around. A crucial point made in this paper is that the intrinsic geometry
of motor action is not necessarily simple and three-dimensional, just because
its actions operate in a three-dimensional world. There are hundreds of
muscles, most of which have different effects on the position of the parts of
the body. The complexity of the motor system suggests that interacting with
nature is not simple. A particular body motion almost always involves the
coordinated contraction of many muscles. Worse, muscles do not act
independently. Two different muscles may produce components of force in the
same direction
or can oppose each other. The motor output is usually underdetermined,
so that many different patterns of motor neuron output can give rise to the
same overall force.
Consider the simple situation of a limb held in a position in space,
which is diagrammed in the paper's figure 1. Forces designed to move a limb in
a particular direction in space are the result of a high-dimensional output
vector driving the muscles. Each muscle provides a force on the limb, and the
overall force is the resultant of all the muscle forces together, which add by
simple physics.
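The point about underdetermined motor output can be made concrete with a small numerical sketch. The three muscle directions, the activation values, and the function name `resultant_force` below are illustrative assumptions, not taken from the paper; the sketch only shows that two different activation patterns can sum to the same overall force when there are more muscles than spatial dimensions.

```python
import numpy as np

# Hypothetical setup: three muscles acting on a limb in a 2-D plane.
# Each column is the unit pulling direction of one muscle (assumed values).
angles = np.deg2rad([0.0, 60.0, 150.0])
muscle_axes = np.vstack([np.cos(angles), np.sin(angles)])  # shape (2, 3)

def resultant_force(activations):
    """Add the individual muscle forces into one overall force vector."""
    return muscle_axes @ np.asarray(activations, dtype=float)

# One pattern of muscle activation (assumed values).
pattern_a = np.array([1.0, 0.5, 0.2])

# With 3 muscles and 2 spatial dimensions there is a direction in
# activation space that produces no net force: the null space of the
# 2x3 matrix, given by the last right-singular vector.
_, _, vt = np.linalg.svd(muscle_axes)
null_direction = vt[-1]

# A second, different pattern that yields the same resultant.
pattern_b = pattern_a + 0.3 * null_direction

print("pattern A:", pattern_a, "-> force", resultant_force(pattern_a))
print("pattern B:", pattern_b, "-> force", resultant_force(pattern_b))
```

With hundreds of muscles instead of three, the set of activation patterns that produce a given force is correspondingly larger.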
At the same time each muscle contains an elaborate set of
proprioceptive sensors that tell the brain which muscles are contracting and
how strongly. This is the primary information the sensorimotor system is using
to close the loop.
Now the mathematizing can start. We are performing a set of coordinate
transformations, involving sensory input, motor proprioception, and motor
output. When faced with a problem, the wise theoretician starts by looking at
the techniques others have already developed for similar problems. Most of us
are already familiar with the geometry of simple coordinate transformations
from linear algebra or physics. But there is a highly developed area of
mathematics called tensor theory that handles the truly general problems
involved in the conversion of one geometrical representation into another.
Tensor theory, however, has a legendary reputation for difficulty and
complexity, partially deserved. Part of the blame for this situation lies in
extremely abstract mathematical treatments and the arcane and idiosyncratic
notation commonly used. Books on tensor theory written by mathematicians are
generally useless for anyone else, but there are reasonable (usually older)
books that try to develop some degree of geometrical intuition for engineers
and physicists (see Bickley and Gibson 1962, Hay 1953, Kay 1988).
Suppose we have two coordinate systems. Every description of a point in
one system by a set of coordinates corresponds to another set of numbers in the
other system; there are functions relating the descriptions of points in the
two coordinate systems. There are three key concepts that are required to
understand the ideas that Pellionisz and Llinás are trying to convey:
covariance, contravariance, and the metric tensor. These concepts are firmly
established in mathematics, but it is usually easy to avoid using them because
in our familiar orthogonal coordinate system, the need for them disappears.
Suppose we have a set of coordinate axes and we want to describe a
point. (As in calculus, the argument can hold in general for curved coordinate
axes if we move "very small" distances.) To describe a point in
space, we need a set of vector components to do the description. There are two
distinct types of vector components we could use:
The first way is diagrammed in Figure 1 and is described as covariant
vector components. Operationally we might describe the point by the movements
of a ship looking for an island. It sails along one axis until it detects the
island off to one side. It then moves perpendicular to its original course to
get to the island. Note that this description really depends only on the
direction of one coordinate axis. The covariant representation picks out the
component of the point along one or another coordinate axis.
The second way is diagrammed in Figure 2 and is described
mathematically as contravariant vector components. Operationally the
contravariant operation works more like a car on a highway, which cannot freely
change direction, but can move only along the directions of the coordinate
axes. Therefore the point is so many units along one coordinate axis and
so many units along another coordinate axis. Contravariance is very
familiar to most scientists because it describes the way forces add in the
familiar parallelogram of forces.
Figure 1 Covariant representation (covariant component)
Notice that in the case of our familiar orthogonal (Cartesian) coordinate
system, the covariant and contravariant representations are the same, so this
distinction is not needed.
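A short numerical sketch may help here. It computes both kinds of components for one point, first with orthogonal unit axes and then with oblique unit axes 60 degrees apart. The point, the axes, and the two function names are assumptions chosen for illustration; the covariant components are taken as orthogonal projections onto unit-length axes, and the contravariant components as the parallelogram coefficients.

```python
import numpy as np

def contravariant_components(basis, point):
    """Coefficients c with point = sum_i c[i] * basis[:, i] (parallelogram rule)."""
    return np.linalg.solve(basis, point)

def covariant_components(basis, point):
    """Orthogonal projection of the point onto each unit-length axis."""
    return basis.T @ point

point = np.array([2.0, 1.0])  # assumed example point

# Orthogonal unit axes (columns of the identity matrix).
orthogonal = np.eye(2)

# Oblique unit axes, 60 degrees apart (columns are the axes).
theta = np.deg2rad(60.0)
oblique = np.array([[1.0, np.cos(theta)],
                    [0.0, np.sin(theta)]])

for name, basis in [("orthogonal", orthogonal), ("oblique", oblique)]:
    print(name)
    print("  contravariant:", contravariant_components(basis, point))
    print("  covariant:    ", covariant_components(basis, point))
```

For the orthogonal axes the two lines of output agree; for the oblique axes they do not, which is the situation the paper is concerned with.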
With this distinction clear, Pellionisz and Llinás start to think about
what it might mean. The contravariant description fits very well with the
intuitive notion of the addition of forces produced by muscles, if each
coordinate axis is identified with muscle motor activity. Because these
individual forces add up like physical forces to produce a resultant, they are
contravariant in nature.
The proprioceptive sensors act much more like the covariant vector
representation. The covariant description would pick up the component of a
force along the given coordinate axes. If we assume that the coordinate axes
are identified with proprioceptors in a given muscle, then these sensors will
respond only to force components along the muscle axis, that is, a muscle does
not in general know what is going on in other muscles except through their
components along the first muscle.
So the problem of sensorimotor transformation in these terms becomes
one of relating a covariant sensory representation to a contravariant motor
representation. But the whole system is connected together through the world:
both representations look at different aspects of the same thing, namely muscle
forces and actions.
Pellionisz and Llinás suggest that the cerebellum might be the brain's
structure that closes the internal loop in the nervous system: That is, it
transforms the sensory representation into the motor representation,
mathematically, by transforming a covariant representation into a contravariant
representation. Is there a standard mathematical way to describe this
transformation that would give us insight into what the cerebellum might be
doing?
Figure 2 Contravariant representation (contravariant component)
In tensor theory the covariant and contravariant representations are
related by what is called the metric tensor. The name metric tensor is
appropriate geometrically because it is concerned with computation of distance
and angle. Clearly the distance between two points must not change, no matter
what the coordinate axes used to describe them look like, because distance is
something real. Similarly the angle between two vectors must not change with
different axes. The metric tensor can be constructed by simple rules from the
covariant and contravariant representations.
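As a minimal sketch of these statements, assume a pair of oblique unit axes in a plane (the axes and the point are illustrative choices, not values from the paper). The metric tensor is then the table of dot products between the axes; it converts contravariant components to covariant ones, its inverse converts back, and the squared length of the vector comes out the same however it is computed.

```python
import numpy as np

# Assumed oblique unit axes, 60 degrees apart (columns are the axes).
theta = np.deg2rad(60.0)
basis = np.array([[1.0, np.cos(theta)],
                  [0.0, np.sin(theta)]])

# Metric tensor: dot products between every pair of axes.
metric = basis.T @ basis

point = np.array([2.0, 1.0])                 # assumed example point
contra = np.linalg.solve(basis, point)       # contravariant components
co = basis.T @ point                         # covariant components

# The metric takes contravariant components to covariant ones ...
co_from_metric = metric @ contra
# ... and its inverse goes the other way (covariant to contravariant).
contra_from_metric = np.linalg.solve(metric, co)

# The squared length does not depend on the description used.
length_sq_cartesian = point @ point
length_sq_mixed = co @ contra
length_sq_metric = contra @ metric @ contra

print(metric)
print(length_sq_cartesian, length_sq_mixed, length_sq_metric)
```

The inverse step is only available here because the two axes are independent and the metric is invertible; a redundant set of axes would give a singular metric.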
Once the metric tensor is assumed to be realized in a neural structure,
we have access to the full power of the neural network connection matrix
mechanism. For example, every matrix has eigenvectors and in this case there is
a physical feedback loop between input and output. Therefore we can predict
oscillations or resonances in the resulting dynamical systems with frequencies
and amplitudes related to the large positive eigenvalues of the connection
matrix. These eigenvectors would be particularly important in learning and one
might make corrections in the functioning of the system by manipulating the magnitudes
of the eigenvalues. Because we are working on the metric tensor, all these
changes will amount to changing the geometry of the internal representation in
response to the interaction of internal (neural) geometry with external
(physical) geometry. These two geometries are different, but they can interact
to organize each other. The term metaorganization is used to describe this
process.
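The role of the eigenvectors in a feedback loop can be illustrated with a toy iteration. This is not the procedure of the paper; it is ordinary power iteration, under the assumption that the connection matrix is the metric of two unit axes 60 degrees apart and that the output is fed straight back as the next input with its length renormalized. The starting state is arbitrary.

```python
import numpy as np

# Assumed connection matrix: metric of two unit axes 60 degrees apart.
connection = np.array([[1.0, 0.5],
                       [0.5, 1.0]])

# Symmetric matrix, so eigh applies; eigenvalues come back in ascending order.
eigenvalues, eigenvectors = np.linalg.eigh(connection)
dominant = eigenvectors[:, -1]  # eigenvector of the largest eigenvalue

# Feed the output back as input, renormalizing each time (power iteration).
state = np.array([1.0, 0.2])    # arbitrary starting state (assumed)
for _ in range(50):
    state = connection @ state
    state /= np.linalg.norm(state)

# Near 1 when the state has lined up with the dominant eigenvector.
alignment = abs(state @ dominant)

print("eigenvalues:", eigenvalues)
print("alignment with dominant eigenvector:", alignment)
```

In this toy loop the component along the eigenvector with the largest eigenvalue grows fastest, so repeated passes through the loop select it; changing the eigenvalues changes which pattern wins.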
The bulk of the paper is devoted to suggestions about what the neural
structures might be doing, based on the tensor interpretation of sensorimotor
cerebellar pathway function and the known neurophysiology and neuroanatomy.
For novice readers there are two notational pitfalls to watch for in
this paper. Tensor mathematics makes frequent use of the Einstein convention (yes,
that Einstein), also called the summation convention. Because summations are so
frequent, the convention holds that when the same index appears twice in an
expression, there is an implied summation over that index, so, for example,
$\sum_i a_i x_i$ is written as $a_i x_i$. This convention is used in equations 1
and 2 of the paper. There is also lavish use of superscripts and subscripts to
represent vector components. Superscripts do not mean powers, but particular
vectors or vector components. A
general rule is that covariant vectors use subscripts and contravariant vectors
use superscripts, but this is unfortunately not invariable notation.
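Readers who compute may find it useful that the summation convention has a direct counterpart in NumPy's `einsum`, where a repeated index letter is summed over. The arrays below are arbitrary illustrative values, and the matrix `g` is an assumed example of a metric, not one from the paper.

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])   # components a_i (assumed values)
x = np.array([4.0, 5.0, 6.0])   # components x^i (assumed values)

# Written out in full: the sum over i of a_i * x^i.
explicit_sum = sum(a[i] * x[i] for i in range(3))

# Summation convention: the repeated index i is summed implicitly.
implied_sum = np.einsum("i,i->", a, x)

print(explicit_sum, implied_sum)  # 32.0 32.0

# The same rule with a two-index object: y_j = g_ji x^i.
g = np.array([[1.0, 0.5, 0.0],
              [0.5, 1.0, 0.5],
              [0.0, 0.5, 1.0]])   # assumed symmetric example matrix
y = np.einsum("ji,i->j", g, x)

print(y)
```

The second call is an ordinary matrix-vector product, so `y` equals `g @ x`; the index string simply makes the implied summation visible.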
Many neural modelers have a somewhat static and deliberately simple input-output
view of the nervous system, where input data are processed like raw materials
in a factory, with the output appearing at the shipping dock. In nature,
however, when the input-output loop is closed through the environment, some
unusual and powerful techniques become applicable, and the nature of the
computation changes.
To say that an approach is radical means that it represents a
fundamental change in orientation. In this collection there are two radical
approaches to the nervous system that are at variance with ideas that are taken
for granted by both neural scientists and network modelers. The ideas presented
in this paper form one set, and the paper by Skarda and Freeman (paper 21) is
the other. These two papers should be read carefully. They are important.
References
W.G. Bickley and R.E. Gibson (1962), Via Vector to Tensor. New York:
Wiley.
G.E. Hay (1953), Vector and Tensor Analysis. New York: Dover.
D.C. Kay (1988), Tensor Calculus. New York: Schaum's Outlines,
McGraw-Hill.