What Galois connections have to do with learning
Two functions and between partially ordered sets form a Galois connection if, for all and ,
That single biconditional has unreasonably many consequences. The compositions and are closure operators on their respective sides. The fixed points of those closures form isomorphic complete lattices. And — most usefully for the present purpose — every Galois connection is determined by a binary relation, and every binary relation determines a Galois connection.
This last fact is the engine inside Formal Concept Analysis: the incidence relation of a context induces a Galois connection between subsets of objects and subsets of attributes, and the concepts are exactly the pairs of fixed points (Ganter & Wille, 1999). The classical theory of these objects is more than half a century old; its order-theoretic foundations are laid out in (Birkhoff, 1967) and (Davey & Priestley, 2002).
What is younger, and what this essay will pursue, is the observation that many learning algorithms are also computing closure operators — without naming them as such. Version space learning closes a hypothesis set under specialisation. Rule learners close a feature set under implication. Some attention mechanisms can be viewed as soft Galois maps between query and key spaces. Once one starts looking for the closure-operator skeleton, it appears in more places than the literature on FCA usually claims. The forthcoming long form will work several of these correspondences out in careful detail.