I’m just leaving the 2015 CogSci conference, and am pondering what I saw there. I’m particularly dismayed at the lack of sophistication surrounding the notion of an external representation, especially the role and nature of ‘grounding’ in a representation—so I’ll focus on that in this post.
First, though, some quick impressions: It was exciting to see so much mathematical and numerical cognition going on. There were many fascinating posters, and also several great talks, a symposium, and a keynote focused on mathematics. People are saying exciting, new things, and it was fascinating to hear. I particularly enjoyed, as did many people, Kevin Mickey’s demonstration of the power of the unit circle as a central representation for trigonometry.
The unit circle is very useful even among expert reasoners working in trigonometry. For instance, the example below was spontaneously produced by a graduate student in mathematics solving a problem involving quite elementary trig identities, while participating in one of my studies this summer (more on that project another time).
It can be seen more purely here
For those who haven’t seen it, the unit circle is used to motivate the notion of sin, cos, and tangent, in the following way: the circle has a radius 1. For each point that lies on the circle, the sin is defined as the projection of the point to the y axis, and cos as the projection to the x axis. Zero degrees is defined as the point lying along the positive x axis, and positive angle goes counterclockwise around the circle. The image shows that this definition allows the meaningful embedding of right triangles (but not other shapes) into the image, indicating the relationship between right triangles and trig functions.
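Here is a minimal sketch of that definition, assuming angles are given in degrees. The function name `unit_circle_point` is made up for illustration, and `cmath.rect` is used only as a convenient way to place a point at a given angle on a circle of radius 1:

```python
import cmath
import math

def unit_circle_point(degrees):
    """Return (x, y) for the point reached by rotating (1, 0) counterclockwise by `degrees`."""
    # cmath.rect(r, phi) builds the complex number with modulus r and phase phi (in radians).
    z = cmath.rect(1.0, math.radians(degrees))
    return z.real, z.imag

x, y = unit_circle_point(30)
cos_30, sin_30 = x, y  # cos is the x-axis projection, sin is the y-axis projection
print(round(cos_30, 4), round(sin_30, 4))

# The embedded right triangle has legs x and y and hypotenuse 1 (the radius).
print(abs(x * x + y * y - 1.0) < 1e-12)
```

The second print is the Pythagorean relation for the embedded right triangle, which is why the same picture carries both the triangle and the trig functions.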
Kevin Mickey (along with his mentor, Jay McClelland) demonstrated that people who rely on the unit circle to infer formal¹ rules, such as sin(-x) = -sin(x), do so very successfully, and, crucially, more successfully than those who rely on their memory for such rules. Furthermore, teaching people the unit circle interpretation helps them also become more successful. It’s a cool result, and certainly serves as a demonstration of the power of the unit circle. In time, the unit circle may join the number line, the Cartesian coordinate system, and the function in the pantheon of immensely potent mathematical visualizations.
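One way to picture how a rule like sin(-x) = -sin(x) can be read off the circle rather than remembered: going clockwise by x lands on the mirror image, across the x axis, of going counterclockwise by x. The sketch below is only an illustration of that reading, not a description of the study's materials, and the helper names are hypothetical:

```python
import math

def point_at(angle):
    """Point on the unit circle at `angle` radians, counterclockwise from the positive x axis."""
    return math.cos(angle), math.sin(angle)

def reflect_across_x_axis(point):
    """Flip a point top-to-bottom: x is unchanged, y changes sign."""
    x, y = point
    return x, -y

def mirror_matches_negation(angle, tol=1e-12):
    """Check that the point at -angle is the mirror image of the point at angle."""
    mirrored = reflect_across_x_axis(point_at(angle))
    negated = point_at(-angle)
    return all(abs(a - b) < tol for a, b in zip(mirrored, negated))

# If the mirror image always matches, then cos(-x) = cos(x) and sin(-x) = -sin(x)
# hold at every sampled angle.
print(all(mirror_matches_negation(k * 0.1) for k in range(-100, 101)))
```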
The meaning myth
But what is a visualization, anyway? What makes one powerful? Why are the unit circle and the number line so good, and other strategies, such as remembering and applying the axioms sin(-x) = -sin(x) and cos(-x)=cos(x) so bad? Why are these representations so successful in grounding other ways of thinking?
There’s a common story here, which I heard repeatedly at CogSci this year. It’s a bad story, and I’m afraid the ways that it is bad slow advancement in mathematics education, especially in teasing apart concepts like embodiment, grounding, and concreteness. In what follows I’m putting together a number of different conversations I had over the conference, so I cannot attribute this argument to any one individual. For concreteness, I’ll attribute it to Frankenstein. Frankenstein’s story goes like this: Frankenstein thinks the number line and the unit circle are meaningful, while other representations, such as axioms or ‘formal notations’, are meaningless. In general, Frankenstein thinks, we want to ground meaningless symbols in meaningful ones. Trying to work with meaningless systems is more-or-less hopeless because meaningful situations are the source of and license for moves made in meaningless systems.
Of course, Frankenstein must explain how some situations come to be meaningful, and others come to be meaningless. Sadly, Frankenstein has a ready answer in a simple neo-Fregean notion of a syntax/semantics distinction, and this is where Frankenstein goes wrong. On this account, which has come down to us through modern philosophers of language and mind such as Steve Pinker, John Searle, and Jerry Fodor, some chunks of the physical world have semantic content and others do not; that is, some refer, and some are referred to. For our purposes, we can call the referring objects ‘symbols’, even though sometimes we want to make finer distinctions. Real situations and objects have intrinsic or inherent meaning, in virtue of the relationships in which they enter. Dogs are doggy. The word ‘dog’, on the other hand, is arbitrary; it just acquires its meaning parasitically through its association with real dogs. A great and powerful thing about this story is that it can be used to explain mathematical symbols, verbal symbols, and the symbols of the mind, all using the same coherent notions.
Frankenstein says that the unit circle and the number line are good groundings because they are ‘real’, or anyway ‘closer’ to real, while symbolic forms just get their meaning through reference to real situations. Symbol manipulations just create a Chinese Room of dancing symbols, but can never reach out to real meaning.
The breakdown of the world into the symbolic and the symbolized has had a long run of popularity, but is currently contentious. Many clever folks have now realized that symbols such as ‘dog’ have strong internal structure, important properties, and engage in particular kinds of relations with other words, which gives them their own kind of intrinsic meaning—a logic laying the ground for complex statistical analyses of the distribution of words in rich languages. Furthermore, these days we question whether the mind really has something like pure referential symbols, often arguing instead for locking or coupling relationships between distinct intrinsically autonomous but richly interconnected physical systems. Weirdly, many of the people who happily espoused the neo-Fregean view of, say, the unit circle are the very people who are very skeptical of this distinction in other walks of (scientific) life. This is strange, because while personally I think that there might be symbols in the mind, and this might even be the only right way to think about language, the one place there pretty clearly aren’t referential symbols is in notational mathematics.
Despite its long heritage and good family, the notion that algebraic notations are meaningful in virtue of their reference to ‘real’ situations is misleading and unnecessary. Furthermore, it obscures the nature and value of forms like the unit circle, and hides the real value and importance of grounding. It also blinds us to the value of the formal algebraic notation.
On the contrary:
The real unit circle diagram:
- Is profoundly useful as a grounding for a trigonometric algebra, but
- Is not ‘close’ to perceptual experience, instead requiring substantial training of spatial systems to use correctly, in part because of how its use is constrained by mathematical truth.
- Bears both intra-systemic content and content derived from its connections to other systems, including formalizations, through modeling relationships.
Real algebraic formalizations:
- Do not refer. Instead, systems of them can bear modeling relationships to other systems.
- Are restrained visuo-spatial forms, with non-arbitrary structure and intrinsic meaningfulness.
- Are not ‘far’ from perceptual experience, instead being themselves spatially extended structures affording spatial transformations, but requiring substantial training of spatial systems to use correctly, in part because of how their use is constrained by mathematical truth.
- Bear both intra-systemic content and content derived from their connections to other systems, including formalizations.
As you might notice, those descriptions are very similar. That’s because there are NO qualitative differences between the unit circle and an algebra. There are quantitative differences, which we overlook at our peril, and which Frankenstein’s belief in the referentiality of symbols blinds him to. Both belong to a family of cultural artifacts, which we might call ‘mathematical systems’.
Mathematical systems are like this: they are imaginary structures, which are imagined using principally spatial reasoning systems and perceptual routines. That is, one imagines spatially extended mental objects, and reasons about them by making mental operations like affine transformations, mental visualization, marking off, zooming, shifting attention, and so on. Importantly, however, mathematical systems are not objects given by experience, and they do not match precisely, ever, the usual processes of the spatial reasoning systems involved in them. Rather, those systems must be trained² to allow only certain extreme processes, and to disallow others. To give some obvious examples: in the unit circle, one may not ‘zoom in’ to the lines of the axes to use their spatial extension, nor the circle itself. One may not move the triangle to new orientations such as placing the right angle at the origin (though in the geometry of congruent forms such a transformation is actively encouraged). One does not consider the length of the chords cutting from the intersections of the coordinate axes and the circle (length root 2, of course). One certainly does not rotate the circle into three or four dimensions. If one does these things, one is no longer playing the trigonometry game; one has left the system. There are many other things one may do, but only carefully, and which are not part of the usual practice: considering the length between the x-axis projection of a line and the circle, for instance.
Formal notations are also mathematical systems: one imagines a ‘world’ made up of squiggles, and one allows only certain spatial transformations of those symbols. One then considers the inhabitants of that world. Certain spatial routines are allowed, others are not. Importantly, these often explicitly (and always implicitly) involve the specific written shapes. For instance, consider Cantor’s diagonalization proof³: we explicitly consider the digits of the real number, and physically imagine a book of written symbols. At the least, one must agree that Cantor’s diagonalization (a proof so profound and fundamental it arguably plays the unit circle role for a broad swath of infinite cardinality theory and computational undecidability proofs) explicitly trades in the visual form of supposedly ‘arbitrary’ and ‘intrinsically meaningless’ symbols. Note also that nothing in Cantor’s proof reaches out to the supposed ‘meaning’ of the reals. We just don’t care whether these are conceptualized as points on a line, or magnitudes, or what. In fact, worse for the neo-Fregean than this, we do care that they are not any of these things: the decimals are a particular manner of constructing strings; they are the symbols on the page. (A minor tweak is required to actually mesh this proof with common models such as points on the line.)
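To see how much of the diagonal construction is an operation on written digits, here is a minimal sketch that works on strings only. The three rows are the first three entries of the example book in the footnote; the function name `diagonal_mismatch` is hypothetical, and a finite prefix can of course only illustrate the infinite construction, not prove anything:

```python
# First three rows of the example book from the footnote, written as digit strings
# (the binary digits after "0.").
book = [
    "10101111010010",
    "00000110101001",
    "10011111111100",
]

def diagonal_mismatch(rows):
    """Build the digits of X by flipping the i-th digit of the i-th row."""
    flip = {"0": "1", "1": "0"}
    return "".join(flip[row[i]] for i, row in enumerate(rows))

x_digits = diagonal_mismatch(book)
print("X = 0." + x_digits + "...")

# X differs from row i at position i, so it cannot equal any listed row.
for i, row in enumerate(book):
    assert x_digits[i] != row[i]
```

Nothing in the function looks at what the strings denote: it only reads a character at a position and writes a different one.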
Frankenstein wants to call foul at this point. Frankenstein worries that Cantor’s proof is not axiomatic, but rather reasoning about the forms of the formalisms. Maybe, but this kind of reasoning is at least ubiquitous and foundational in work with formalisms, and is frankly probably more important in modern mathematics than axiomatic work. And the neo-Fregean ‘arbitrary symbol with derived meaning through referentiality’ really has no good account for this sort of thing. This is a problem, because lots of reasoning in abstract algebra, category theory, proof theory, and other pretty common branches of mathematics has just this sort of quality: one reasons, formally or informally, about the properties of jumbles of symbols under certain allowed transformations. This means that the neo-Fregean is unable to cope with most of what goes on in modern algebra.
Furthermore, when pressed, Frankenstein has no good account for how actual humans engage in symbolic transformations of the algebraic sort other than the one listed above: constrained perceptual-motor transformations over real or imagined symbolic forms. So the ‘mathematical systems’ account nicely explains both axiomatic and non-axiomatic approaches to formalisms, while the Frankenstein account explains neither.
Don’t look to the meaning, look to the use!
Frankenstein is confused because in real life, especially in the very elementary mathematics most of our subjects trade in most of the time, symbol systems are usually embedded in particular mappings, or modeling relationships. We think of multiplication as area, or as repeated addition; we think of number strings as reflecting points on a line, or algebraic expressions as capturing relationships among collaborating painters or moving trains. And mappings really are very important in modern mathematics. Indeed, one way of construing category theory is that it is the study of such mappings. It’s just that these are not taken by the advanced mathematician for meanings: multiplication does not ‘mean’ repeated addition, any more than the real numbers ‘mean’ the points on a line. Real numbers, like any other mathematical concept, have no meaning. Instead, they may have definitions within a mathematical system, and models across two systems.
In a mapping situation, one takes two mathematical systems, and finds ways to embed one into the other such that conclusions drawn in one system have a natural interpretation in the other. A simple example might suffice. I’ll take one in which, arguably, the visualizations over the unit circle are unfamiliar and idiosyncratic, but the symbolic transformations are familiar and reassuring:
Take the unit circle, with an arbitrary triangle inscribed.
Now, take that triangle, and flip it over. Then put the point that used to lie on the circle at the origin instead, and place the point that used to lie at the origin on the unit circle.
Now make two squares, both with one point at the origin, and both with one point at the point that has the height of one triangle, and the width of the other, like so:
Now, what’s the combined area of the two blue squares? You may be able to work this out through some geometric considerations. Here’s an easy algebraic way: The two squares have areas cos(x) * sin(90-x) (because the angle of the origin point of the old triangle is 90 minus the new angle), and sin(x) * cos(90-x). Then, because of the relation between sin and cos (sin(90-x) = cos(x) and cos(90-x) = sin(x)), the combined area is cos(x) * cos(x) + sin(x) * sin(x) = cos^2(x) + sin^2(x) = 1.
Isn’t that pretty? Adding squared cos and squared sin is (if you’ve done much trig lately) a very familiar and comforting proof. The things I was doing with squares and flipping triangles were weird. Be that as it may, I made a convincing correspondence between transformations in system 1 (the circle) and in system 2 (the symbols), such that one can import conclusions from one into the other.
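As a quick numeric sanity check of the symbolic side of this correspondence, the sketch below evaluates the two areas named above, cos(x) * sin(90-x) and sin(x) * cos(90-x), at a few arbitrarily chosen angles. The function name `combined_area` is made up for illustration:

```python
import math

def combined_area(x_degrees):
    """Sum of the two areas named in the text: cos(x)*sin(90-x) + sin(x)*cos(90-x)."""
    x = math.radians(x_degrees)
    complement = math.radians(90 - x_degrees)
    return math.cos(x) * math.sin(complement) + math.sin(x) * math.cos(complement)

for angle in (10, 30, 45, 72.5):
    print(angle, round(combined_area(angle), 12))
```

Each printed value should equal 1 up to floating-point rounding, whatever angle the triangle was drawn with.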
Frankenstein is no fool: this looks a lot like a referential relationship. But it’s not: it’s a truth-preserving mapping between two autonomous (spatial/perceptual/dynamic) mathematical systems. Mappings like these go in lots of directions, and often never involve symbols. When symbols are involved, sometimes it’s through reasoning about axioms and doing lots of substitution, as in the above; sometimes it’s through reasoning about their constituents, as in Cantor’s proof. But this is not a syntax/semantics relationship; it’s a syntax/syntax alignment. Those tend to be useful, for a number of reasons I won’t go into here in detail, but two big ones are that error-prone transformations in system (a) tend to be robust in system (b), and vice versa, and that system (a) and system (b) are likely to carve a space differently, so that the alignment provides insight into their structure. This isn’t to say that there’s nothing but syntax going on, but rather that each system is autonomously ‘meaningful’. Really, I mean to include something much like Miriam Bassok’s great ideas about semantic alignments, and certainly something like my own perceptual alignment account (ungated). However, to be rigorous, the bindings between formal systems are usually required to be articulated syntactically.
Poincare, in his famous fight with David Hilbert, actually got the lack of semantics issue more-or-less right: Poincare argued that one cannot do axiomatic geometry, because geometric imaginings are about geometry, while axiomatic imaginings are about formal notations. He resisted the processes of alignment and mutual inference which became, like it or not, the core characteristic of 20th century mathematics. But he was right that the relationship of a formalism to a geometric or trigonometric structure is not one of reference.
This doesn’t mean that mathematical systems aren’t meaningful. They can be meaningful or meaningless in all kinds of ways, both through their internal structure, and in the kinds of modeling relationships they bear. It’s just that little of this meaning is particularly referential in character, and it’s shared by geometric and formal systems.
There’s one last promissory note I have to pay, then I’m done: Why is the unit circle so good, and formalisms so bad? In other words, we all agree that playing with symbols is often error-prone, confusing, and has a feeling of meaninglessness not shared by the unit circle and the number line. What gives, if not a nice syntax-semantics distinction?
What gives, and what takes
Once we agree that formalisms and mathematical diagrams are alike in type, we can still see that they are very different in emphasis. Here’s a characterization of what makes a mathematical structure good for grounding:
- It requires minimal regimentation—the training process required to play the appropriate math game pretty robustly is relatively lightweight. This is what Frankenstein ought to say instead of saying that the unit circle is a real-world object, or given by experience, or whatever.
- It is stable in memory. That is, not many things have to be remembered to get it right, and those things are unlikely to get mixed up.⁴ Relatively speaking, formalisms tend to involve lots of relatively confusable items, and therefore to be pretty bad groundings.
- It is generative: Many truths are easily extracted from it, and those truths are important for the system that is to be grounded. In the unit circle, the things that can be easily inferred or ‘read off’ from the circle are just those identities most important for trigonometry. Ditto for number lines and everyday arithmetic and magnitude understanding.
- This one is probably actually incidental, but worth mentioning: It is richly interconnected with other knowledge. Rich interconnections can reduce errors, increase stability, and help with reasoning generatively. However, if you play with the unit circle for a while, and try wandering ‘off the beaten path’, you’ll quickly realize how many things you don’t know about it. It’s not that you have rich knowledge about that shape—it’s that you know which thoughts to think, and which to avoid.
This preliminary characterization is better than the ‘grounded in experience’ one, because it provides a clear operationalization, respects that no mathematical structure is grounded in real experience, and allows that the occasional formalism (y=mx+b, a^2+b^2=c^2, AB=BA, (a->b)^(b->c) -> (a->c), ~~a=a, and so on) can itself carry that feeling of familiarity and concreteness we normally associate with ‘grounding’. Finally, it explains why common shapes tend to be better than formalisms for grounding: mathematical structures that resemble regular shapes are likely to be stable in memory and to require minimal regimentation.
Formalisms are not referential, and mathematical structures like the unit circle are not ‘experiential’. Mathematical structures all involve regimentation of perceptual systems to align them with rigorous culturally constrained operations, and in this way all are alike. In each, ‘meaning’ is contained in the autonomous web of permitted transformations. There is no ‘semantics’ or ‘meaning’ in mathematical reasoning with formalisms, but rather a conceptually symmetric relationship of inference-preserving syntax-syntax alignments among intrinsically meaningful systems. Nevertheless, formalisms tend to more often be used to derive truths about other systems, especially at low levels of mathematical sophistication. Potent structures for mathematical grounding require minimal regimentation, are stable in memory, and are generative. These tend to be geometric, social, or graphical, rather than formal, because formal systems require extensive regimentation and high working memory load.
Steven Phillips wrote a few comments, and kindly agreed to let me post them here. He says:
I think we share similar misgivings about the Fregean perspective of treating the meaning of a statement as derived from the meaning of its constituents (elements) and their inter-relations. I think the basic problem with this approach is that it depends too much on having an appropriate meaning for the constituents.
Category theory may help in this regard where the "semantics" are based on the relations (equations) between arrows between objects, not specific objects, nor even specific arrows. In this way, meaning is not required to be "grounded" by reference to specific objects or elements.
As a "concrete" example, the usual notion of a group is a set that has an identity element, an inverse for each element, and a binary operation, satisfying some relations among the operations and elements. From a Fregean perspective, the meaning of a group depends on the meaning of the elements of the set. The first step in abstracting away from specific elements is to recast each part as a function: the identity (unit) element u is the (nullary) function that picks out the element u, each inverse is obtained from a unary function, and the binary operation is a binary function. The next step is to capture the axioms of a group, e.g., u x a = a = a x u for every a in set S, as equations among the functions, effectively creating a "point free" version of the axioms, i.e., one which does not make explicit reference to the elements. At this point we have a group in the category Set (of sets and functions). Here, we note that the only properties of Set that are needed for this construction are finite (Cartesian) products (for reasons I didn’t explain). Thus, we can further abstract away from objects that are sets and arrows that are functions to any (abstract) category with finite products. In fact, we just need three abstract objects, which we can label 0, 1, and 2, and three abstract arrows, which we can label u (unit), i (inverse) and m ("multiplication") that satisfy the relevant equations (commutativity diagrams). This category is our algebraic "theory" of groups, and the functors from this category to Set are the (set-valued) models of the theory, i.e., the (set-valued) functorial semantics of the theory. Other functors, such as those into Top (the category of topological spaces and continuous functions) provide for topological groups (topological semantics), and so on.
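A rough way to picture "one theory, many set-valued models" is to interpret the same three arrows u, i, and m as functions on different sets and check the group equations in each. The sketch below does only that: it is not functorial semantics proper, the checker `is_group_model` is a hypothetical helper, and it tests the equations by running over elements, so it is not itself point free.

```python
from itertools import product

def is_group_model(elements, u, i, m):
    """Check the group equations for a candidate model (elements, u, i, m).

    u: nullary function picking the unit, i: unary inverse, m: binary operation.
    """
    elements = list(elements)
    closed = all(m(a, b) in elements for a, b in product(elements, repeat=2))
    unit_law = all(m(u(), a) == a == m(a, u()) for a in elements)
    inverse_law = all(m(i(a), a) == u() == m(a, i(a)) for a in elements)
    assoc_law = all(
        m(m(a, b), c) == m(a, m(b, c)) for a, b, c in product(elements, repeat=3)
    )
    return closed and unit_law and inverse_law and assoc_law

# Model 1: integers mod 5 under addition.
z5 = range(5)
print(is_group_model(z5, lambda: 0, lambda a: (-a) % 5, lambda a, b: (a + b) % 5))

# Model 2: the same three arrows interpreted on a different set, {1, -1} under multiplication.
print(is_group_model([1, -1], lambda: 1, lambda a: a, lambda a, b: a * b))

# Not a model: integers mod 5 under multiplication (0 has no inverse).
print(is_group_model(z5, lambda: 1, lambda a: a, lambda a, b: (a * b) % 5))
```

The first two interpretations satisfy the equations and the third does not, even though nothing about what the elements "are" enters the check beyond how u, i, and m act on them.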
Regarding the grounding of mathematics generally, I’m a bit sceptical that it’s (solely) grounded in geometry/quantity, since that seems to leave out topology, which is a big chunk of mathematics. Topology is what you do when you don’t have a notion of distance, or quantity. Perhaps intuitions about geometry, etc., or their failures, motivated the invention of topology, but that is different from saying that topology is grounded in geometry.
In general, I would be concerned about overly relying on geometrical intuitions as a grounding of mathematical concepts. For example, the concept of a functor between categories can be thought of, geometrically, as a map from a circle enclosing a bunch of arrows connecting points (category) to another circle enclosing another bunch of arrows connecting points (category). The image of a functor can then be thought of as a smaller circle enclosing a subset of points and arrows within the larger circle. So, is the image of a functor a category? Geometric intuition suggests yes, since the smaller circle is just another arrow-enclosing circle. In fact, the image of a functor is not necessarily a category, which can be confirmed by trying to generate a counterexample that satisfies the axioms of a functor, but not the axioms of a category. (An exercise for the reader!) Here, although geometric intuition plays a (necessary) role, geometry is not sufficient, since you need to know what the axioms are, and they are given symbolically.
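For readers who would rather check a candidate answer to that exercise than hunt for one, here is one possible counterexample encoded as small finite categories. The encoding (dictionaries of named arrows, identities named `id_<object>`) and all names are assumptions made for this sketch. The source category has two unconnected arrows f and g; the functor glues the target of f to the source of g, so their images h and k become composable, but the composite kh is not in the image.

```python
def make_category(objects, arrows, composites):
    """arrows: name -> (source, target); composites: (g, f) -> name of 'g after f'."""
    arrows = dict(arrows)
    for obj in objects:
        arrows["id_" + obj] = (obj, obj)  # add an identity arrow for every object
    return {"objects": set(objects), "arrows": arrows, "composites": dict(composites)}

def compose(cat, g, f):
    """Return the name of 'g after f', or None if the pair is not composable."""
    if cat["arrows"][f][1] != cat["arrows"][g][0]:
        return None
    if f.startswith("id_"):
        return g
    if g.startswith("id_"):
        return f
    return cat["composites"].get((g, f))

def is_functor(src, tgt, on_obj, on_arr):
    for name, (s, t) in src["arrows"].items():
        if tgt["arrows"][on_arr[name]] != (on_obj[s], on_obj[t]):
            return False  # sources and targets must be preserved
    for obj in src["objects"]:
        if on_arr["id_" + obj] != "id_" + on_obj[obj]:
            return False  # identities must be preserved
    for f in src["arrows"]:
        for g in src["arrows"]:
            gf = compose(src, g, f)
            if gf is not None and on_arr[gf] != compose(tgt, on_arr[g], on_arr[f]):
                return False  # composition must be preserved
    return True

def image_is_closed(tgt, on_arr):
    """Is the set of image arrows closed under composition in the target?"""
    image = set(on_arr.values())
    return all(
        compose(tgt, g, f) in image
        for f in image for g in image
        if compose(tgt, g, f) is not None
    )

source = make_category(["a0", "a1", "b0", "b1"],
                       {"f": ("a0", "a1"), "g": ("b0", "b1")}, {})
target = make_category(["X", "Y", "Z"],
                       {"h": ("X", "Y"), "k": ("Y", "Z"), "kh": ("X", "Z")},
                       {("k", "h"): "kh"})

on_objects = {"a0": "X", "a1": "Y", "b0": "Y", "b1": "Z"}
on_arrows = {"f": "h", "g": "k"}
on_arrows.update({"id_" + o: "id_" + on_objects[o] for o in source["objects"]})

print(is_functor(source, target, on_objects, on_arrows))  # functor axioms hold
print(image_is_closed(target, on_arrows))                 # but the image is not closed
```

The drawn picture of "a smaller circle inside a larger one" hides exactly the step that fails here: closure under composition has to be checked against the symbolic axioms.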
So, why is the unit circle such a good model of trigonometry? I guess it’s because it has lots more structure to play around with than the symbolic model. Why is it not a good model? I guess it’s because some of that extra structure is not in the right correspondence relationship. Note that by correspondence relationship, category theory is not restricted to isomorphism, which appears to be the concept that predominates in cognitive science. Rather, for category theorists, adjunctions are far more important.
In short, I suppose mathematics is grounded both geometrically and linguistically, and I guess that category theory should be well-placed to see the connection.
I replied that my intuition is that we do rely on spatial/geometric reasoning as a mechanism implementing lots of mathematical thinking, but that geometric intuitions are not determinative of truth. In Steve’s example, we are misled precisely because geometric intuitions are our usual first-line mode of thinking. However, in category theory (as in lots of mathematics), we accept as true only things that conform to careful and precise coordinations among different intuition systems. In this case, there are two ways to get to the error. First, if you use the usual sorts of (spatial!) pattern-matching processes across axiomatic systems, you come to a conclusion that differs from the geometric intuition. This isn’t totally determinative either. You now have to align your arrows and your proof, to make sure everything ‘went right’. I’ll also note that geometrically, if you draw a circle inside the image category, you actually can draw a circle whose contents aren’t a category (by, say, cutting the circle through some of the arrows). So geometric ‘intuitions’ are trainable and tunable, and I think a lot of this goes on too.
The point is that we have to have multiple descriptions: one that captures processes or mechanisms of reasoning, and another that lets us in-principle agree on how truth is supposed to be assigned. I guess I bet these aren’t always that related.
- Because the word ‘symbol’ is over-subscribed in this context, I’m going to refer to good old fashioned symbols like sin(x), AxB=BxA, and y=mx+b as “formalizations”. This puts the cart before the horse, because, as the saying goes, you can’t say FORMalization without saying “FORM”. I’ll argue that these formalisms, like other mathematical structures, acquire intrinsic meaning through a combination of cultural training and their, you know, physical form. But for now, if you’d like to pretend that ‘form’ means something that doesn’t include the, you know, form of the symbol, you’ll be in good company. Most classical symbolists think so too. ↩
- Tyler Marghetis likes the word ‘regimented’, and I think it’s an excellent one. I’ll steal it without further attribution. ↩
- Here’s a quick reminder, which leaves out some important details. The goal of diagonalization is to prove that the set of real numbers—say those between 0 and 1—cannot be listed (this is the same as putting them in 1-1 correspondence with the naturals). The proof is a proof by contradiction. One imagines a book which lists, in some order, infinitely many reals. Can that book contain all the reals? It cannot. Let’s say the book started like this:
0 | 0 . 1 0 1 0 1 1 1 1 0 1 0 0 1 0
1 | 0 . 0 0 0 0 0 1 1 0 1 0 1 0 0 1
2 | 0 . 1 0 0 1 1 1 1 1 1 1 1 1 0 0
...
Where I’ve made things easy by doing it in binary. Now, we make (make with our minds’ hands, as a temporally extended act of visuo-spatial imagery, not to put too fine a point on it) a new real number which is not in our list. Call it X. What we do is this. Start with 0, followed by a decimal point. Then make the first digit different from the first digit of the first listed number:
X = 0 . 0 ?
Now, whatever ? is, X cannot be the first item in our book. Make the second digit mismatch the second digit of the second item in the book:
X = 0 . 0 1 ?
Now X cannot be items 0 or 1. Do the same for the next item:
X = 0 . 0 1 1 ?
And X can’t be items 0, 1, or 2. Keep doing this spatially extended visual operation on a supposedly formal representation (subtle, aren’t I?) and you’ll see that X cannot appear anywhere in the book: it mismatches each item. ↩
- For instance, exact spatial positions are hard to keep in memory, while spatial relations are relatively easy. Familiar frequent things are easier to recall than unusual and complex ones. The unit circle just requires a circle, a cross, and a line: three of the most common and simplest shapes of mathematics. Then one must only remember where the line goes, where 0 degrees is, which is sine and which cosine, that tangent is rise over run, and that the degrees go counterclockwise. This may seem like a large number of items, but compare it to the number of things you have to remember to keep track of the dozen or so trigonometric relations embedded easily in the figure! ↩