Categorical Modeling

This is a seminar on applied category theory, which, of course, makes sense if you’re a mathematician. However, that label has a bit of a “hammer looking for nails” flavor to it. For that reason, I am going to suggest an alternate term: categorical modeling (note the site address!), for that suggests a focus on problems out there in the world to which category theory can be applied fruitfully, and a dual focus on subtleties of the problem domain as well as any category theoretic ideas that might come in handy.

The first seminar was on Coecke et al.’s “Mathematical Foundations for a Compositional Distributional Model of Meaning.” It’s fair to say that the audience was not thrilled by the paper on several counts. For example, the choice of pregroups seems to be arbitrary. However, I am not going to dwell on the category theoretic choices made in the paper; instead, I will focus on the categorical modeling of linguistic meaning.

How does the paper do on that count?

The claimed achievement of the paper – independent of the choice of categorical machinery – is the unification of compositional and distributional theories of meaning in one categorical framework. As I see it, the main technical achievement of the paper is a theory of similarity that composes similarity judgments at the lexical level to create similarity judgments at the sentence level. The category theoretic innovation is in incorporating continuous structure that’s otherwise missing in logic-like formal accounts of compositionality, which allows us to create similarity metrics as inner products at the sentence level.

Hence the fact that “love” is closer to “like” than to “hate” means:

“Mary loves John” is more similar to “Mary likes John” than to “Mary hates John”, which accords well with our intuitive judgments. However, what would this recipe do with the following three sentences?

  1. Mary loves John
  2. Mary loves dogs
  3. Mary hates John
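
To make the tension concrete, here is a minimal sketch in the spirit of the paper’s construction: a transitive verb is a tensor with two noun indices and one sentence index, a sentence vector is obtained by contracting the subject and object against the verb, and similarity is the cosine of the resulting sentence vectors. Everything specific here is an assumption of mine rather than something taken from the paper: the two-dimensional noun space, the four-dimensional sentence space, the hand-picked vectors, and the helper names `verb_tensor`, `sentence_vector`, `cosine` and `compare`.

```python
import numpy as np

# Toy noun space with axes (human, animal). Hypothetical vectors, not corpus data.
# Mary and John are both purely "human" in this crude feature space.
mary = np.array([1.0, 0.0])
john = np.array([1.0, 0.0])
dogs = np.array([0.0, 1.0])


def verb_tensor(positive, negative, topic_weight):
    """Build a toy transitive-verb tensor of shape (noun, sentence, noun).

    Assumed sentence axes: (positive affect, negative affect,
    about-human, about-animal). topic_weight scales the last two axes.
    """
    # Linear map from object features to the sentence space.
    w = np.array([
        [positive, positive],
        [negative, negative],
        [topic_weight, 0.0],
        [0.0, topic_weight],
    ])
    subject_weights = np.ones(2)  # every subject feature contributes equally
    return np.einsum("i,sk->isk", subject_weights, w)


def sentence_vector(subject, verb, obj):
    # Contract subject and object against the verb's two noun indices.
    return np.einsum("i,isk,k->s", subject, verb, obj)


def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))


def compare(topic_weight):
    """Return (sim(loves John, loves dogs), sim(loves John, hates John))."""
    love = verb_tensor(1.0, 0.0, topic_weight)
    hate = verb_tensor(0.0, 1.0, topic_weight)
    s1 = sentence_vector(mary, love, john)
    s2 = sentence_vector(mary, love, dogs)
    s3 = sentence_vector(mary, hate, john)
    return cosine(s1, s2), cosine(s1, s3)


for t in (0.5, 1.0, 2.0):
    print(t, compare(t))
```

In this toy setup the two similarities work out to 1/(1 + t²) and t²/(1 + t²), where t is `topic_weight`. So “Mary loves dogs” is the nearer neighbour of “Mary loves John” when t < 1, “Mary hates John” is nearer when t > 1, and the two tie at t = 1. The ranking is fixed entirely by how the sentence space weighs affect against topic, which is a modeling choice that the compositional machinery does not make for us.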

How does the scheme balance out the competing distances between love and hate on the one hand and dogs and John on the other? Is there a theory of similarity that delivers an unambiguous resolution of similarity judgments here? This is where, in my opinion, a categorical modeling approach can be more useful than an applied category theory approach. Instead of focusing on bringing compositional and distributional theories of meaning together (hammer meets nail), a better question might be:

Can categories help model similarity in a manner that scales well from lexical items to sentences and perhaps even more generally to similarity judgments across cognitive and perceptual facilities?

There’s a large literature on similarity in cognitive science, with probabilistic models appearing to be the current winners. Does category theory have something new or better to add? The idea that algebraic structure can be combined naturally with metric structure suggests we might be able to do a better job than probabilistic models that live mostly on the metric end.