Compact Closed Categories and Frobenius Algebras for Computing Natural Language Meaning

Mehrnoosh Sadrzadeh

Compact closed categories have found applications in modeling quantum information protocols in the work of Abramsky-Coecke. They also provide semantics for Lambek's pregroup algebras, which are applied to formalizing the grammatical structure of natural language, and they are implicit in the distributional, context-based model of word meaning, which relies on vector spaces. In particular, in previous work, we (Coecke-Clark-Sadrzadeh) used the product category of pregroups with vector spaces to provide a distributional model of meaning for sentences. I will recast this theory in terms of strongly monoidal functors and advance it via Frobenius algebras over vector spaces. The former are used to formalize topological quantum field theories by Atiyah and Baez-Dolan, and the latter are used to model classical data in quantum protocols by Coecke-Pavlovic-Vicary. The Frobenius algebras enable us to work in a single space in which the meanings of words, phrases, and sentences of any structure live. They also allow us to model non-context-based words such as relative pronouns. Hence we can compare the meanings of different language constructs and enhance the applicability of the theory.
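
As a rough illustration of how a Frobenius algebra can keep sentence meanings in the same space as word meanings, here is a minimal sketch in plain Python. It assumes a fixed orthonormal basis for the noun space, in which the Frobenius merge map sends e_i (x) e_j to e_i when i = j and to zero otherwise, so that it acts on a pair of vectors as pointwise multiplication. It also assumes a "copy-subject" style composition for a transitive sentence, which is one construction from the related literature and not necessarily the one presented in the talk. The function names and the toy vectors are hypothetical and are not corpus data.

```python
import math

def mat_vec(matrix, vector):
    """Apply a matrix (list of rows) to a vector: the verb acting on its object."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def frobenius_merge(u, v):
    """Merge map of the Frobenius algebra induced by a fixed basis.

    On basis vectors it sends e_i (x) e_j to e_i if i == j and to 0 otherwise,
    which on a pair of vectors is pointwise multiplication.
    """
    return [a * b for a, b in zip(u, v)]

def copy_subject(subject, verb, obj):
    """Assumed composition: merge the subject with (verb applied to object).

    The result has the same dimension as the noun vectors, so sentences
    and nouns can be compared directly.
    """
    return frobenius_merge(subject, mat_vec(verb, obj))

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Toy three-dimensional noun space (hypothetical values).
dogs = [1.0, 0.0, 2.0]
cats = [0.0, 1.0, 1.0]
chase = [
    [1.0, 0.0, 1.0],
    [0.0, 2.0, 0.0],
    [1.0, 1.0, 0.0],
]

sentence = copy_subject(dogs, chase, cats)   # meaning of "dogs chase cats"
similarity = cosine(sentence, dogs)          # sentence compared with a noun
```

Because the merged result is again a vector in the noun space, the same cosine measure used for single words applies to the sentence without any change of space.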

I will also report on our experimental results on two language tasks (joint work with Kartsaklis-Pulman): word sense disambiguation (joint work with Grefenstette) and term/definition classification. These results show how our theoretical predictions are verified on real large-scale data from the British National Corpus.