unary.tex

\chapter{Unary type theories}
\label{chap:unary}

We begin our study of type theories and their categorical counterparts with a class of very simple cases that we will call \emph{unary type theories}.
(This terminology is not standard in the literature.)
On the type-theoretic side the word ``unary'' indicates that there is only one type on each side of a sequent $A\types B$.
On the categorical side it means, roughly, that we deal with categories rather than any kind of multicategory.
In later chapters we will generalize away from this in various ways.

In some ways the unary case is fairly trivial, but for that very reason it serves as a good place to become familiar with basic notions of type theory and how they correspond to category theory.\footnote{I am indebted to Dan Licata~\cite{ls:1var-adjoint-logic} for the insight that unary type theories can be easier but still interesting.}
Some of these notions and remarks may seem very pedantic in the unary case, but will become more important later on.
I encourage the reader new to type theory to skim over any such parts of this chapter, and then return to it after some acquaintance with later chapters.


\section{Posets}
\label{sec:poset}

We start with the simplest sort of categories: those in which each hom-set has at most one element.
These are well-known to be equivalent to \emph{preordered sets}, where the existence of an arrow $A\to B$ is regarded as the assertion that $A\le B$.
I will abusively call them \emph{posets}, although traditionally posets (partially ordered sets) also satisfy the antisymmetry axiom (if $A\le B$ and $B\le A$ then $A=B$).
From a category-theoretic perspective, antisymmetry means asking a category to be skeletal, which is both unnatural and pointless.
Conveniently, posets also correspond to the simplest version of logic, namely \emph{propositional} logic, as we will see in \cref{sec:logic}.

From a category-theoretic perspective, the question we are concerned with is the following.
Suppose we have some objects in a poset, and some ordering relations between them.
For instance, we might have
\begin{mathpar}
  A\le B \and A\le C \and D\le A \and B \le E \and D\le C
\end{mathpar}
Now we ask, given two of these objects --- say, $D$ and $E$ --- is it necessarily the case that $D\le E$?
In other words, is it the case in \emph{any} poset containing objects $A,B,C,D,E$ satisfying the given relations that $D\le E$?
In this example, the answer is yes, because we have $D\le A$ and  $A\le B$ and $B\le E$, so by transitivity $D\le E$.
More generally, we would like a method to answer all possible questions of this sort.

There is an elegant categorical way to do this based on the notion of \emph{free structure} (analogously to the situation for free groups we considered in \cref{sec:syntax}).
Namely, consider the category \bPoset of posets, and also the category \bRelGr of \emph{relational graphs}, by which I mean sets equipped with an arbitrary binary relation.
There is a forgetful functor $U:\bPoset \to \bRelGr$, which has a left adjoint $\F\bPoset$.

Now, the abstract information about ``five objects $A,B,C,D,E$ satisfying five given relations'' can be regarded as an object $\cG$ of \bRelGr, and to give five such objects satisfying those relations in a poset \cP is to give a map $\cG \to U\cP$ in \bRelGr.
By the adjunction, therefore, this is equivalent to giving a map $\F\bPoset\cG \to \cP$ in \bPoset.
Therefore, a given inequality such as $A\le E$ will hold in \emph{all} posets if and only if it holds in the \emph{particular, universal} poset $\F\bPoset\cG$ freely generated by the assumed data.

Thus, to answer all such questions at once, it suffices to give a concrete presentation of the free poset $\F\bPoset\cG$ generated by a relational graph \cG.
In this simple case, it is easy to give an explicit description of $\F\bPoset$: it is the reflexive-transitive closure.
But since soon we will be trying to generalize vastly, we want instead a general method to describe free objects.
From our current perspective, this is the role of type theory.

As noted in \cref{sec:intro}, when we move into type theory we use the symbol $\types$ instead of $\to$ or $\le$.
Type theory is concerned with \emph{(hypothetical) judgments}, which (roughly speaking) are syntactic gizmos of the form ``$\Gamma\types\Delta$'', where $\Gamma$ and $\Delta$ are syntactic gadgets whose specific nature is determined by the specific type theory under consideration (and, thus, by the particular kind of categories we care about).
We call $\Gamma$ the \emph{antecedent} or \emph{context}, and $\Delta$ the \emph{consequent} or \emph{co-context}.
In our simple case of posets, the judgments are simply
\[ A \types B \]
where $A$ and $B$ are objects of our (putative) poset; such a judgment represents the relation $A\le B$.
In general, the categorical view is that a hypothetical judgment represents a sort of \emph{morphism} (or, as we will see later, a sort of \emph{object}) in some sort of categorical structure.

In addition to a class of judgments, a type theory consists of a collection of \emph{rules} by which we can operate on such judgments.
Each rule can be thought of as a partial $n$-ary operation on the set of possible judgments for some $n$ (usually a finite natural number), taking in $n$ judgments (its \emph{premises}) that satisfy some compatibility conditions and producing an output judgment (its \emph{conclusion}).
We generally write a rule in the form
\begin{mathpar}
  \inferrule{\cJ_1 \\ \cJ_2 \\ \cdots \\ \cJ_n}{\cJ}
\end{mathpar}
with the premises above the line and the conclusion below.
A rule with $n=0$ is sometimes called an \emph{axiom}.
The categorical view is that we have a given ``starting'' set of judgments representing some objects and putative morphisms in the ``underlying data'' of a categorical structure, and the closure of this set under application of the rules yields the objects and morphisms in the \emph{free} structure it generates.

We will attempt to make all of this precise in \cref{chap:dedsys}, which the reader is free to consult now.
However, it is probably more illuminating at the moment to bring it back down to earth in our very simple example.
Since the properties distinguishing a poset are reflexivity and transitivity, we have two rules:
\begin{mathpar}
  \inferrule{ }{A\types A} \and
  \inferrule{A\types B \\ B\types C}{A\types C}
\end{mathpar}
in which $A,B,C$ represent arbitrary objects.
In other words, the first says that for any object $A$ we have a $0$-ary rule whose conclusion is $A\types A$, while the second says that for any objects $A,B,C$ we have a $2$-ary rule whose premises are $A\types B$ and $B\types C$ (that is, any two judgments of which the consequent of the first is the antecedent of the second) and whose conclusion is $A\types C$.
We will refer to the pair of these two rules as the \textbf{free type theory of posets}.

Hopefully it makes sense that we can construct the reflexive-transitive closure of a relational graph by expressing its relations in this funny syntax and then closing up under these two rules, since they are exactly reflexivity and transitivity.
Categorically, of course, that means identities and composition.
In type theory the composition/transitivity rule is often called \textbf{cut}, and plays a unique role, as we will see later.

In the example we started from,
\begin{mathpar}
  A\le B \and A\le C \and D\le A \and B \le E \and D\le C
\end{mathpar}
we have the two instances of the transitivity rule
\begin{mathpar}
  \inferrule{D\types A \\ A\types B}{D\types B}\and
  \inferrule{D\types B \\ B\types E}{D\types E}
\end{mathpar}
allowing us to conclude $D\types E$.
When applying multiple rules in sequence to reach a conclusion, it is customary to write them in a ``tree'' structure like so:
\begin{mathpar}
  \inferrule*{\inferrule*{D\types A \\ A\types B}{D\types B} \\ B\types E}{D\types E}
\end{mathpar}
Such a tree is called a \emph{derivation}.
The way to typeset rules and derivations in \LaTeX\ is with the \texttt{mathpartir} package; the above diagram was produced with
\begin{verbatim}
  \inferrule*{
    \inferrule*{D\types A \\ A\types B}{D\types B} \\
    B\types E
  }{
    D\types E
  }
\end{verbatim}
Note that \texttt{mathpartir} has only recently made it into standard distributions of \LaTeX, so if you have an older system you may need to download it manually.

Formally speaking, what we have observed is the following \emph{initiality theorem}.

\begin{thm}\label{thm:poset-initial-1}
  For any relational graph \cG, the free poset $\F{\bPoset}\cG$ that it generates is has the same objects and its morphisms are the judgments that are derivable from \cG in free type theory of posets.
\end{thm}
\begin{proof}
  In the preceding discussion we assumed it as known that the free poset on a relational graph is its reflexive-transitive closure, which makes this theorem more or less obvious.
  However, it is worth also presenting an explicit proof that does not assume this, since same pattern of proof will reappear many times for more complicated type theories where we don't know the answer in advance.

  Thus, let us define $\F{\bPoset}\cG$ as stated in the theorem.
  The reflexivity and transitivity rules imply that $\F{\bPoset}\cG$ is in fact a poset.
  Now suppose $\cA$ is any other poset and $P:\cG\to\cA$ is a map of relational graphs.
  The objects of $\F{\bPoset}\cG$ are the same as those of \cG, so $P$ extends uniquely to a map on underlying sets $\F{\bPoset}\cG\to\cA$.
  Thus it suffices to show that this map is order-preserving, i.e.\ that if $A\types B$ is derivable from \cG in the free type theory of posets, then $P(A)\le P(B)$.

  For this purpose we \emph{induct on the derivation of $A\types B$}.
  There are multiple ways to phrase such an induction.
  One is to define the \emph{height} of a derivation to be the number of rules appearing in it, and then induct on the height of the derivation of $A\types B$.
  \begin{enumerate}
  \item If there are no rules at all, then $A\types B$ must come from a relation $A\le B$ in \cG; hence $P(A)\le P(B)$ since $P$ is a map of relational graphs.
  \item If there are $n>0$ rules, then consider the last rule.
    \begin{enumerate}
    \item If it is the identity rule $A\types A$, then $P(A)\le P(A)$ in \cA since \cA is a poset and hence reflexive.
    \item Finally, if it is the transitivity rule, then each of its premises $A\types B$ and $B\types C$ must have a derivation with strictly smaller height, so by the (strong) inductive hypothesis we have $P(A)\le P(B)$ and $P(B)\le P(C)$.
      Since \cA is a poset and hence transitive, we have $P(A)\le P(C)$.\qedhere
    \end{enumerate}
  \end{enumerate}
\end{proof}

A different way to phrase such an induction, which is more flexible and more type-theoretic in character, uses what is called \emph{structural induction}.
This means that rather than introduce the auxiliary notion of ``height'' of a derivation, we apply a general principle that \emph{to prove that a property $P$ holds of all derivations, it suffices to show for each rule that if $P$ holds of the premises then it holds of the conclusion}.
We can also define operations on derivations by \emph{structural recursion}, meaning that it suffices to define what happens to the conclusion of each rule assuming that we have already defined what happens to the premises.
Structural induction and recursion can be justified formally by set-theoretic arguments --- see \cref{chap:dedsys} for some general statements.
However, intuitively they are implicit in what is meant by saying that ``derivations are what we obtain by applying rules one by one,'' just as ordinary mathematical induction is implicit in saying that ``the natural numbers are what we obtain by starting with zero and constructing successors one by one'', and constructive type-theoretic foundations for mathematics often take them as axiomatic.
From now on we will use structural induction and recursion on derivations in all type theories without further comment.

However, it is proved, \cref{thm:poset-initial-1} enables us to reach conclusions about arbitrary posets by deriving judgments in type theory.
In our present trivial case this is not very useful, but as we will see it becomes more useful for more complicated structures.

Another way to express the initiality theorem is to incorporate \cG into the rules.
Given a relational graph \cG, we define the \textbf{type theory of posets under \cG} to be the free type theory of posets together with a 0-ary rule
\begin{mathpar}
  \inferrule{ }{A\types B}
\end{mathpar}
for any relation $A\le B$ in \cG.
Now a derivation can be written without any ``leaves'' at the top, such as
\begin{mathpar}
  \inferrule*{\inferrule*{\inferrule*{ }{D\types A} \\ \inferrule*{ }{A\types B}}{D\types B} \\ \inferrule*{ }{B\types E}}{D\types E}
\end{mathpar}
Clearly this produces the same judgments; thus the initiality theorem can also be expressed as follows.

\begin{thm}\label{thm:poset-initial-2}
  For any relational graph \cG, the free poset $\F{\bPoset}\cG$ that it generates has the same objects and its morphisms are the derivable judgments in the type theory of posets under \cG.\qed
\end{thm}

We can extract from this our first general statement about categorical logic: it is \emph{a syntax for generating free categorical structures using derivations from rules}.
The reader may be forgiven at this time for wondering what the point is; but bear with us and things will get less trivial.


\section{Categories}
\label{sec:categories}

Let's now generalize from posets to categories.
The relevant adjunction is now between categories \bCat and \emph{directed graphs} \bGr; the latter are sets $\cG$ of ``vertices'' equipped with a set $\cG(A,B)$ of ``edges'' for each $A,B\in \cG$.
Thus, we hope to generate the free category $\F{\bCat}\cG$ on a directed graph \cG type-theoretically.

Our judgments $A\types B$ will still represent morphisms from $A$ to $B$, but now of course there can be more than one such morphism.
Thus, to specify a particular morphism, we need more information than the simple \emph{derivability} of a judgment $A\types B$.
Na\"ively, the first thing we might try is to identify this extra information with the \emph{derivation} of such a judgment, i.e.\ with the tree of rules that were applied to reach it.
This makes the most sense if we take the approach of \cref{thm:poset-initial-2} rather than \cref{thm:poset-initial-1}, so that distinct edges $f,g\in \cG(A,B)$ can be regarded as distinct \emph{rules}
\begin{mathpar}
  \inferrule*[right=$f$]{ }{A\types B} \and
  \inferrule*[right=$g$]{ }{A\types B} \and
\end{mathpar}
Thus, for instance, if we have also $h\in \cG(B,C)$, the distinct composites $h\circ g$ and $h\circ f$ will be represented by the distinct derivations
\begin{mathpar}
  \inferrule*[right=$\circ$]{
    \inferrule*[right=$h$]{ }{B\types C} \\
    \inferrule*[right=$g$]{ }{A\types B}
  }{
    A\types C
  }\and
  \inferrule*[right=$\circ$]{
    \inferrule*[right=$h$]{ }{B\types C} \\
    \inferrule*[right=$f$]{ }{A\types B}
  }{
    A\types C
  }\and
\end{mathpar}
Note that when we have distinct rules with the same premises and conclusion, we have to label them so that we can tell which is being applied.
For consistency, we also begin labeling the composition and identity rules, with $\circ$ and $\idfunc$ respectively.

Of course, this na\"ive approach founders on the fact that composition in a category is supposed to be associative and unital, since the two composites $h\circ (g\circ f)$ and $(h\circ g)\circ f$, which ought to be equal, nevertheless correspond to distinct derivations:
\begin{equation}\label{eq:assoc}
  \begin{array}{c}
  \inferrule*[right=$\circ$]{
    \inferrule*[right=$h$]{ }{C\types D} \\
    \inferrule*[right=$\circ$]{
      \inferrule*[right=$g$]{ }{B\types C} \\
      \inferrule*[right=$f$]{ }{A\types B}
    }{
      A\types C
    }}{
    A\types D
  }\\\\
  \inferrule*[right=$\circ$]{
    \inferrule*[right=$\circ$]{
      \inferrule*[right=$h$]{ }{C\types D} \\
      \inferrule*[right=$g$]{ }{B\types C}
    }{
      B\types D
    }\\
    \inferrule*[right=$f$]{ }{A\types B}
  }{
    A\types D
  }
  \end{array}
\end{equation}
Thus, with this type theory we don't get the free category on \cG, but rather some free category-like structure that lacks associativity and unitality.
There are two ways to deal with this problem; we consider them in turn.

\subsection{Primitive cuts}
\label{sec:category-cutful}

The first solution is to simply quotient by an equivalence relation.
Our equivalence relation will have to identify the two derivations in~\eqref{eq:assoc}, and also the similar pairs for identities:
\begin{mathpar}
  \inferrule{\inferrule*[right=$\idfunc$]{ }{A\types A}\\ {A\types B}}{A\types B}\;\circ
  \qquad\equiv\qquad A\types B
  \\
  \inferrule{A\types B \\ \inferrule*[right=$\idfunc$]{ }{B\types B}}{A\types B}\;\circ
  \qquad\equiv\qquad A\types B
\end{mathpar}
Our equivalence relation must also be a ``congruence for the tree-construction of derivations'', meaning that these identifications can be made anywhere in the middle of a long derivation, such as:
\begin{mathpar}
  \inferrule{\inferrule*{}{\sD_1\\\\\vdots} \\
    \inferrule*[right=$\circ$]{\inferrule*[right=$\idfunc$]{ }{A\types A}\\ \inferrule*{\sD_2\\\\\vdots}{A\types B}}{A\types B}
  }{\vdots\\\\\sD_3}
  \qquad\equiv\qquad
  \inferrule{\inferrule*{}{\sD_1\\\\\vdots} \\
    \inferrule*{\sD_2\\\\\vdots}{A\types B}
  }{\vdots\\\\\sD_3}
\end{mathpar}
We will also have to close it up under reflexivity, symmetry, and transitivity to make an equivalence relation.

Of course, it quickly becomes tedious to draw such derivations, so it is convenient to adopt a more succinct syntax for them.
We begin by labeling each judgment with a one-dimensional syntactic representation of its derivation tree, such as:
\begin{mathpar}
  \inferrule*[right=$\circ$]{
    \inferrule*[right=$g$]{ }{g:(B\types C)} \\
    \inferrule*[right=$\circ$]{
      \inferrule*[right=$\idfunc$]{ }{\idfunc_B:(B\types B)} \\
      \inferrule*[right=$f$]{ }{f:(A\types B)}
    }{
      (\idfunc_B\comp{B} f):(A\types B)
    }}{
    (g\comp{B} (\idfunc_B\comp{B} f)) : (A\types C)
  }
\end{mathpar}
These labels are called \emph{terms}.
Of course, in this case they are none other than the usual notation for composition and identities.
Formally, this means the rules are now:
\begin{mathpar}
  \inferrule{f\in\cG(A,B)}{f:(A\types B)}\and
  \inferrule{A\in\cG}{\idfunc_A : (A\types A)}\and
  \inferrule{\phi:(A\types B) \\ \psi:(B\types C)}{\psi\comp{B} \phi:(A\types C)}
\end{mathpar}
% TODO: Mention somewhere the assumptions of "external" facts as premises
Here $\phi,\psi$ denote arbitrary terms, and if they contain $\circ$'s themselves then we put parentheses around them, as in the example above.
Now the generators of our equivalence relation look even more familiar:
\begin{align*}
  \chi \comp{C} (\psi \comp{B} \phi) &\equiv (\chi\comp{C}\psi)\comp{B}\phi\\
  \phi \comp{A} \idfunc_A &\equiv \phi\\
  \idfunc_B \comp{B} \phi &\equiv \phi
\end{align*}
Again $\phi,\psi,\chi$ denote arbitrary terms, corresponding to the fact that arbitrary derivations can appear at the top of our identified trees; and similarly these identifications can also happen anywhere inside another term, so that for instance
\[ k\comp{C} (h\comp{B} (g\comp{A} f)) \equiv k\comp{C} ((h\comp{B} g)\comp{A} f).  \]

Of course, we only impose these relations when they make sense.
We can describe the conditions under which this happens using rules for a secondary judgment $\phi\equiv \psi : (A\types B)$.
The rules for our generating equalities are
\begin{mathpar}
  \inferrule{\phi:(A\types B) \\ \psi:(B\types C) \\ \chi:(C\types D)}{(\chi \comp{C} (\psi \comp{B} \phi) \equiv (\chi\comp{C}\psi)\comp{B}\phi) : (A\types D)}\\
  \inferrule{\phi:(A\types B)}{(\phi \circ \idfunc_A \equiv \phi):(A\types B)}\and
  \inferrule{\phi:(A\types B)}{(\idfunc_B \circ \phi \equiv \phi):(A\types B)}
\end{mathpar}
and we must also have rules ensuring that we have an equivalence relation and a congruence:
\begin{mathpar}
  \inferrule{\phi:(A\types B)}{(\phi\equiv\phi):(A\types B)}\and
  \inferrule{(\phi\equiv\psi):(A\types B)}{(\psi\equiv\phi):(A\types B)}\and
  \inferrule{(\phi\equiv\psi):(A\types B)\\(\psi\equiv\chi):(A\types B)}{(\phi\equiv\chi):(A\types B)}\and
  \inferrule{(\phi_1\equiv\psi_1):(A\types B)\\(\phi_2\equiv\psi_2):(B\types C)}{(\phi_2\comp{B} \phi_1 \equiv \psi_2\comp{B}\psi_1):(A\types C)}
\end{mathpar}
The last of these is sufficient, in our simple case, to ensure we have a congruence; in general we would have to have one such equality rule for each basic rule of the theory (except for those with no premises, like $\idfunc$).

Many of our type theories will involve such an equality judgment, for which we always use the notation $\equivsym$, and the need for the equivalence relation and congruence rules is always the same.
Thus, we generally decline to mention them, stating only the ``interesting'' generating equalities for the theory.
A general framework for such equality judgments is described in \cref{sec:axioms}.

In our case, when the rules for $\circ$ and $\idfunc$ are augmented by these rules for $\equiv$, and we also add axioms for the edges of a given directed graph \cG, we call the result the \textbf{cut-ful type theory for categories under \cG}.
It may seem obvious that this produces the free category on \cG, but again we write it out carefully to help ourselves get used to the patterns.
In particular, we want to emphasize the role played by the following lemma:

\begin{lem}\label{thm:category-tad}
  If $\phi :(A\types B)$ is derivable in the cut-ful type theory for categories under \cG, then it has a unique derivation.
\end{lem}
\begin{proof}
  The point is that the terms produced by all the rules have disjoint forms.
  If $\phi$ is of the form ``$f$'' for some $f\in\cG(A,B)$, then it can only be derived by the first rule applied to $f$.
  If it is of the form ``$\idfunc_A$'', then it can only be derived by the identity rule applied to $A$.
  Finally, if it is of the form ``$\psi\comp{C}\phi$'' it can only be derived by the composition rule applied to $\phi:(A\types C)$ and $\psi:(C\types B)$, and by induction the latter judgments also have unique derivations.
\end{proof}

In other words, the terms (before we impose the relation $\equiv$ on them) really are simply one-dimensional representations of derivations, as we intended.
Not everything that ``looks like a term'' represents a derivation, but if it does, it represents a unique one.
(We have not precisely defined exactly what ``looks like a term'', but it should make intuitive sense; a formal definition is given in \cref{chap:dedsys}.)
It is easy to see that conversely every derivation is represented by a unique term, since the above rules for annotating derivations by terms are deterministic.

The above simple inductive proof of \cref{thm:category-tad} depends in particular on the presence of the subscript on the symbol $\circ$.
Similar annotations will reappear in many subsequent theories.
In the present case we could omit these annotations and still reconstruct a unique derivation, because we know the domain and codomain of all the generating morphisms in \cG.
However, this would require a more ``global'' analysis of the term; whereas a clean inductive proof such as the above has the advantage that it can be regarded as a \textit{recursively defined algorithm}.

We call this algorithm \emph{type-checking}: it starts with a putative sequent with term $\phi :(A\types B)$ and, by following the algorithm of \cref{thm:category-tad} until it terminates or encounters a contradiction, either produces a derivation of that sequent or decides that it has no such derivation.
This algorithm can be programmed into a computer, and arguably represents reasonably faithfully what human mathematicians do when reading syntax.
With that said, when writing for a human reader (and even an electronic reader whose programmer has been clever enough) it is often possible to leave off annotations of this sort without fear of ambiguity, and we will frequently do so.

Not all type theories have the property that terms uniquely determine their derivations by a direct inductive algorithm; but those that don't tend to be much more complicated to analyze and prove the initiality theorem for.
We will call this property \textbf{terms are derivations} or \textbf{type-checking is possible}, and we will always attempt to construct our type theories so that it holds.

\begin{rmk}
  Technically, there is either more or less happening here than may appear (depending on your point of view).
  A term as we write it on the page is really just a string of symbols, whereas in the proof of \cref{thm:category-tad} we have assumed that a term such as ``$f\comp{B} (g\comp{A} h)$'' can uniquely be read as $\comp{B}$ applied to ``$f$'' and ``$g\comp{A} h$''.
  This simple string of symbols could technically be regarded as $\comp{A}$ applied to ``$f\comp{B} (g$'' and ``$h)$'', but of course that would make no sense because those are not meaningful terms in their own right (in particular, they contain unbalanced parentheses).

  Thus, something \emph{more} must be happening, and that something else is called \emph{parsing} a term.
  Human mathematicians do it instinctively without thinking; electronic mathematicians have to be programmed to do it.
  In either case, the result of parsing a string of symbols is an ``internal'' representation (a mental idea for humans, a data structure for computers) that generally has the form of a tree, indicating the ``outermost'' operation as the root with its operands as branches, and so on, for instance:
  \[ f\comp{B} (g\comp{A} h) \qquad\leadsto\qquad \vcenter{\xymatrix@-1pc{ & \comp{B} \ar@{-}[dl] \ar@{-}[dr]\\
      f && \comp{A} \ar@{-}[dl]\ar@{-}[dr]\\
      & g && h }} \]

  Of course, this ``internal'' tree representation of a term is nothing but the corresponding derivation flipped upside-down.
  So in that sense \cref{thm:category-tad} is actually saying \emph{less} than one might think: the derivation tree is actually being constructed by the silent step of parsing, while the type-checking algorithm consists only of \emph{labeling} the nodes of this tree by rules in a consistent manner.
  We will not say much more about parsing, however;
  % (though we discuss it a bit further in \cref{chap:dedsys});
  we trust the human reader to do it on their own, and we trust programmers to have good algorithms for it.
\end{rmk}

Now we can prove the initiality theorem.

\begin{thm}\label{thm:category-initial-1}
  The free category on a directed graph $\cG$ has the same objects as \cG, and its morphisms $A\to B$ are the derivations of $A\types B$ (or equivalently, the terms $\phi$ such that $\phi :(A\types B)$ is derivable) in the cut-ful type theory for categories under \cG, modulo the equivalence relation $\phi\equiv \psi:(A\types B)$.
\end{thm}
\begin{proof}
  Let $\F\bCat\cG$ be defined as described in the theorem; the identity and composition rules give it the structure necessary to be a category, and the transitivity and unitality relations make it a category.

  Now suppose \cA is any category and $\pfree:\cG\to\cA$ is a map of directed graphs.
  Then $\pfree$ extends uniquely to the objects of $\F\bCat\cG$, since they are the same as those of \cG.
  But unlike the case of posets, we have to define our desired extension $\free$ on the morphisms of $\F\bCat\cG$ as well.

  If $\phi :(A\types B)$ is derivable, then by \cref{thm:category-tad} it has a unique derivation; thus we can define $\free(\phi)$ by recursion on the derivation of $\phi$.
  Of course, if the derivation of $\phi$ ends with $f\in\cG(A,B)$, then we define $\free(\phi)=\pfree(f)$; if it ends with $\idfunc_A$ we define $\free(\phi)=\idfunc_{P(A)}$; and if it ends with $\psi\comp{C}\chi$ we define $\free(\phi) = \free(\psi)\circ \pfree(\chi)$.

  We also have to show that this definition respects the equivalence relation $\equiv$.
  This is clear since $\cA$ is a category; formally it would be another induction on the derivations of $\equiv$ judgments.

  Finally, we have to show that this $\free:\F\bCat\cG\to\cA$ is a functor.
  This follows by definition of the category structure of $\F\bCat\cG$ and the action of $\free$ on its arrows.
\end{proof}

Of course, once again very little seems to be happening; we are just using a complicated funny syntax to build a free algebraic structure.
(In fact, what we are doing now is analogous to the ``tautological construction'' of free groups from \cref{sec:syntax}.)
Therefore, it is the second way to deal with the problem of associativity that is more interesting.

\subsection{Cut admissibility}
\label{sec:category-cutadm}

In this case what we do is \emph{remove the composition rule $\circ$ entirely}; instead we ``build (post)composition into the axioms''.
That is, the only rule independent of \cG is identities:
\[ \inferrule{ }{A\types A}\,\idfunc \]
while for every edge $f\in \cG(A,B)$ we take the following rule:
\[ \inferrule{X\types A}{X\types B} \,f \]
for any $X$.
Informally, one might say that we represent $f$ by its ``image under the Yoneda embedding''.

Note that we have made a choice to build in \emph{postcomposition}; we could also have chosen to build in precomposition.
In the current context, either choice would work just as well; but later on we will see that there were reasons to choose postcomposition here.
We will call this the \textbf{cut-free type theory for categories under \cG}.

In this theory, if we have $f\in\cG(A,B)$, $g\in\cG(B,C)$, and $h\in \cG(C,D)$ there is \emph{only one way} to derive $A\types D$:
\begin{mathpar}
  \inferrule*[Right=$h$]{
    \inferrule*[Right=$g$]{
      \inferrule*[Right=$f$]{
        \inferrule*[Right=$\idfunc$]{ }{A\types A}
      }{
        A\types B
      }
    }{
      A\types C
    }
  }{
    A\types D
  }
\end{mathpar}
Thus, we no longer have to worry about distinguishing between $h\circ (g\circ f)$ and $(h\circ g)\circ f$.
Of course, we have a new problem: if we are trying to build a category, then we \emph{do} need to be able to compose arrows!
So we need the following theorem:

\begin{thm}\label{thm:category-cutadm}
  If we have derivations of $A\types B$ and $B\types C$ in the cut-free type theory for categories under \cG, then we can construct a derivation of $A\types C$.
\end{thm}
\begin{proof}
  We induct on the derivation of $B\types C$.
  If it ends with $\idfunc$, then it must be that $B=C$; so our given derivation of $A\types B$ is also a derivation of $A\types C$.
  Otherwise, we must have some $f\in\cG(D,C)$ and our derivation of $B\types C$ ends like this:
  \begin{mathpar}
    \inferrule*[right=$f$]{\inferrule*{\sD\\\\\vdots}{B\types D}}{B\types C}
  \end{mathpar}
  In particular, it contains a derivation \sD of $B\types D$.
  Thus, by the inductive hypothesis we have a derivation, say $\sD'$, of $A\types D$.
  Now we can simply follow this with the rule for $f$:
  \begin{equation*}
    \inferrule*[right=$f$]{\inferrule*{\sD'\\\\\vdots}{A\types D}}{A\types C}\qedhere
  \end{equation*}
\end{proof}

In type-theoretic lingo, \cref{thm:category-cutadm} says that \textbf{the cut rule is admissible} in the cut-free type theory for categories under \cG.
In other words, although the cut/composition rule
\begin{mathpar}
  \inferrule*[right=$\circ$]{A\types B \\ B\types C}{A\types C}
\end{mathpar}
is not \emph{part of the type theory} as defined, it is nevertheless true that whenever we have derivations of the premises of this rule, we can construct a derivation of its conclusion.

\begin{rmk}\label{rmk:admissible-derivable-1}
  This is what it means in general for a rule to be \textbf{admissible}: it is not part of the theory as defined (that is, it is not one of the \textbf{primitive rules}), but nevertheless if it were added to the theory it would not change the set of derivable sequents.\footnote{This terminology comes from the posetal case, where ``derivability'' is the important concept.
    If we care about distinguishing between different derivations of the same sequent (to represent multiple parallel morphsims in a category), then an admissibility theorem is better regarded as an \emph{operation} on derivations.
    We will return to this later on.}
  In between primitive and admissible rules there are \textbf{derivable rules}: those that can be expanded out directly into a fragment of a derivation in terms of the primitive rules.
  For instance, if we have $f\in\cG(A,B)$ and $g\in \cG(B,C)$, then the left-hand rule below is derivable:
  \begin{mathpar}
    \inferrule*{X\types A}{X\types C}\and
    \inferrule*[Right=$g$]{\inferrule*[Right=$f$]{X\types A}{X\types B}}{X\types C}
  \end{mathpar}
  because we can expand it out into the right-hand derivation in terms of the primitive rules.
  Any derivable rule is admissible: if we have a derivation of $X\types A$ we can follow it with the $f$ and $g$ rules to obtain a derivation of $X\types C$.
  Note the difference with the proof of cut-admissibility: here we do not need to modify the given derivation, we only apply further primitive rules to its conclusion.
  (The reader should beware, however, that the words ``derivable'' and ``admissible'' are frequently misused.)
  We will return to this distinction in \cref{rmk:admissible-derivable-2}.
\end{rmk}

Closely related to cut-admissibility is \textbf{cut-elimination}, which in our theory takes the following form.

\begin{thm}\label{thm:category-cutelim}
  Consider the cut-free type theory for categories under \cG with the cut rule \emph{added} as primitive.
  If $A\types B$ has a derivation in this new theory, then it also has a derivation in the cut-free theory.
\end{thm}
\begin{proof}
  We induct on the derivation of $A\types B$.
  If it ends with $\idfunc$, it is already cut-free.
  If it ends like this for some $f\in\cG(C,B)$:
  \begin{mathpar}
    \inferrule*[right=$f$]{\inferrule*{\sD\\\\\vdots}{A\types C}}{A\types B}
  \end{mathpar}
  then by induction, $A\types C$ has a cut-free derivation, to which we can apply the $f$ rule to obtain a cut-free derivation of $A\types B$.
  Finally, if it ends with the cut rule:
  \begin{mathpar}
    \inferrule*[right=cut]{\inferrule*{\sD_1\\\\\vdots}{A\types C} \\ \inferrule*{\sD_2\\\\\vdots}{C\types B}}{A\types B}
  \end{mathpar}
  then by induction $A\types C$ and $C\types B$ have cut-free derivations, and thus by \cref{thm:category-cutadm} so does $A\types B$.
\end{proof}

Note that cut-elimination is a fairly straightforward consequence of cut-admissibility: the latter allows us to eliminate each cut one by one.
This will nearly always be true for our type theories, so we will usually just prove cut admissibility and rarely remark on the cut-elimination theorem that follows from it.
On the other hand, cut admissibility is a special case of cut-elimination, and sometimes people prove cut-elimination directly without explicitly using cut-elimination as a lemma.
Under this approach, the inductive step in cut-admissibility is viewed instead as a step of ``pushing cuts upwards'' through a derivation: given a derivation as on the left below in the theory with cut, we transform it into the derivation on the right in which the cut is higher up.
\begin{equation*}
  \inferrule*[right=cut]{\inferrule*{\sD_1\\\\\vdots}{A\types B} \\
    \inferrule*[Right=$f$]{\inferrule*{\sD_2\\\\\vdots}{B\types C}}{B\types D}}{A\types D}
  \quad\leadsto\quad
  \inferrule*[right=$f$]{\inferrule*[Right=cut]{\inferrule*{\sD_1\\\\\vdots}{A\types B} \\
    \inferrule*{\sD_2\\\\\vdots}{B\types C}}{A\types C}}{A\types D}
\end{equation*}
Because our derivation trees are finite (or, more generally, well-founded) this process must eventually terminate with all the cuts eliminated.

A more category-theoretic way to say what is going on is that the morphisms in the free category on a directed graph \cG have an explicit description as \emph{finite strings of composable edges} in \cG.
(This is analogous to the description of free groups using reduced words in \cref{sec:syntax}.)
We have just given an inductive definition of ``finite string of composable edges'': there is a finite string (of length 0) from $A$ to $A$; and if we have such a string from $X$ to $A$ and an edge $f\in\cG(A,B)$, we can construct a string from $X$ to $B$.

We could prove the initiality theorem by appealing to this known fact about free categories, but as before, we prefer to give a more explicit proof to illustrate the patterns of type theory.
For this purpose, it is convenient to first introduce terms, as we did in the previous section for the cut-ful theory.
We can do this with terms directly constructed so that their parse tree will mirror the derivation tree, for instance writing the rules as
\begin{mathpar}
  \inferrule{ }{\idfunc_A:(A\types A)}\,\idfunc\and
  \inferrule{\phi:(X\types A)}{\postc f\phi:(X\types B)} \,f
\end{mathpar}
Then a term derivation and corresponding parse tree would look like
\begin{equation*}
\inferrule*[right=$h$]{
    \inferrule*[Right=$g$]{
      \inferrule*[Right=$f$]{
        \inferrule*[Right=$\idfunc$]{ }{\idfunc_A:(A\types A)}
      }{
        \postc f{\idfunc_A}:(A\types B)
      }
    }{
      \postc g{\postc f{\idfunc_A}}:(A\types C)
    }
  }{
    \postc h{\postc g{\postc f{\idfunc_A}}}:(A\types D)
  }
  \qquad\leadsto\qquad
  \raisebox{2.75cm}{\xymatrix@-1pc{
      \postcsym h \ar@{-}[d] \\
      \postcsym g \ar@{-}[d] \\
      \postcsym f \ar@{-}[d] \\
      \idfunc_A
    }}
\end{equation*}
However, now there is another option available to us, which begins to show more of the characteristic behavior of type-theoretic terms.
Rather than describing the entire judgment $A\types B$ with a term, the way we did for the cut-ful theory, we assign a \emph{formal variable} such as $x$ to the domain $A$, and then an expression containing $x$ to the codomain $B$.
For the theory of plain categories that we are working with here, the only possible expressions are repeated applications of function symbols to the variable, such as $h(g(f(x)))$.
We write this as
\[ x:A \types h(g(f(x))) : B\]
The identity and generator rules can now be written as
\begin{mathpar}
  \inferrule{ }{x:A\types x:A}\,\idfunc \and
  \inferrule{x:X\types M:A \\ f\in\cG(A,B)}{x:X\types f(M):B} \,f
\end{mathpar}
Here $M$ denotes an arbitrary term, which will of course involve the variable $x$.
Thus, for instance, the composite of $h$, $g$, and $f$ would be written like so:
\begin{mathpar}
  \inferrule*[Right=$h$]{
    \inferrule*[Right=$g$]{
      \inferrule*[Right=$f$]{
        \inferrule*[Right=$\idfunc$]{ }{x:A\types x:A}
      }{
        x:A\types f(x):B
      }
    }{
      x:A\types g(f(x)): C
    }
  }{
    x:A\types h(g(f(x))):D
  }
\end{mathpar}
Of course, the term $h(g(f(x)))$ has essentially the same parse tree as the term $\postc h{\postc g{\postc f{\idfunc_A}}}$ shown above, so it can clearly represent the same derivation.
The main difference is that instead of $\idfunc_A$ we have the variable $x$ representing the identity rule.

This is our first encounter with how type theory permits a ``set-like'' syntax when reasoning about arbitrary categorical structures.
It is also one reason why we chose to build in postcomposition rather than precomposition.
If we used precomposition instead, then the analogous syntax would be backwards: we would have to represent $f:A\to B$ as $f(u):A \types u:B$ rather than $x:A \types f(x):B$.
At a formal level, there would be little difference, but it feels much more familiar to apply functions to variables than to co-apply functions to co-variables.
(We can still dualize at the level of the categorical models; we already mentioned in \cref{sec:intro} that we could apply the type theory of categories with finite products to the opposite of the category of commutative rings.)

Now we observe that terms are still derivations in this theory.

\begin{lem}
  If $x:X \types M:B$ is derivable in the cut-free type theory for categories under \cG, then it has a unique derivation.
\end{lem}
\begin{proof}
  If $M$ is the variable $x$, then the only possible derivation is $\idfunc$.
  And if $M = f(N)$, where $f\in\cG(A,B)$, then it can only be obtained from the generator rule for $f$ applied to $x:X \types N:A$.
\end{proof}

Note that the terms in this theory are simpler than those in the cut-ful theory in that we don't need the type subscripts on the composition operation $\comp{A}$.
This is because each rule composes with only one generator $f$, and each such generator ``knows'' its domain, so the premise of the rule is determined by the conclusion.

Another difference between the two theories that instead of attaching a term to the entire derivation such as $(f\circ g): (A\types C)$, we now attach a variable to the antecedent and a more complex term to the consequent.
Really it is the pair of both of these that plays the role played by the terms in \cref{sec:category-cutful}; that is, we may regard $x:A \types M:B$ as a notational variation of something like\footnote{The period used for the pairing here is a ``variable binder''; we will return to it later on.} $x.M:(A\types B)$, and regard $x.M$ as the real ``term''.
However, everyone always refers to the non-variable part $M$ as the \emph{term}, and the separation into variable (or, later, variables) and term is responsible for much of the characteristic behavior of terms in type theory.

In particular, unlike in the cut-ful theory, it is no longer true that each derivation determines a \emph{unique} term (or more precisely, variable-term pair), because we have to choose a name for the variable.
As written on the page, the judgments $x:A \types f(x):B$ and $y:A \types f(y):B$ are distinct; but they represent the same derivation (if we remove the term annotations) and the same morphism:
\begin{mathpar}
  \inferrule*[right=$f$]{\inferrule*[Right=$\idfunc$]{ }{x:A\types x:A}}{x:A\types f(x):B}
  \and
  \inferrule*[right=$f$]{\inferrule*[Right=$\idfunc$]{ }{y:A\types y:A}}{y:A\types f(y):B}
\end{mathpar}

This should not really be overly worrisome.
Recall that we regard terms as merely \emph{notation} for derivations, which we introduced in order to talk about derivations (and, in particular, to describe an equivalence relation $\equiv$ on them) in a more concise and readable way.
Thus, we are really just saying that we have more than one notation for the same thing, which is of course commonplace in mathematics.
For instance, saying ``let $f(x)=x^2$'' and ``let $f(t)=t^2$'' are two notationally different ways to define exactly the same function $\dR\to\dR$.

To be sure, there is a different viewpoint on type theory that takes \emph{terms} as primary objects rather than derivations, regarding the derivability of a judgment such as $x:X\types M:B$ as a \emph{property} of the term $M$, rather than regarding (as we do) the term $M$ as a notation for a particular derivation of $X\types B$.
One reason for this is that terms are (by design) much more concise than derivations, and so if we want to represent type theory in a computer then it is attractive to use terms as the basic objects rather than derivations.
% And if we want to actually program a computer to manipulate terms, or if we want to construct a free category using terms rather than derivations, then we do need to somehow remove the redundancy involved in the choice of variable name.

We will not follow this route.
However, even though we maintain the viewpoint that derivations are primary, there are reasons to think a bit more carefully about the issue of variable names.
Most of these reasons will not arise until \cref{chap:simple}, so we will not say very much about the issue here; but we will at least introduce in our present simple context the two basic ways of dealing with the ambiguity in variable names.

The first method is to decide, once and for all, on a single variable name (say, $x$) to use for \emph{all} our derivations.
Then we cannot write $y:A \types f(y):B$ at all, and so every derivation does determine a unique term.
We call this the \textbf{de Bruijn method}.
(In theories with multiple variables this method becomes more complicated; we will return to this in \cref{chap:simple}.)

The second method is to allow arbitrary choices of variable names (from some standard alphabet), but be aware of the operation of variable renaming.
We say that two terms are \textbf{$\alpha$-equivalent} if they differ by renaming the variable; thus we can say that a derivation determines a unique $\alpha$-equivalence class of terms.
(In theories with ``variable binding'', the definition of $\alpha$-equivalence is likewise more complicated; we will return to this in \cref{sec:catcoprod} and discuss it formally in \cref{chap:dedsys}.)

Of these two methods, the de Bruijn method is theoretically cleaner, and better for implementation in a computer, but tends to detract from readability for human mathematicians.
We will return to discuss these two methods when we have more complicated theories where there is more interesting to say about them.
For now, we continue to use arbitrary variables, remembering that the particular choice of variable name is irrelevant, that derivations are primary, and that terms are just a convenient notation for derivations.

Now that we have such a convenient notation, we can observe that \cref{thm:category-cutadm} is not just a statement about derivability.
Indeed, the proof that we gave is ``constructive'', in the strong sense that it actually determines an \emph{algorithm} for transforming a pair of derivations of $A\types B$ and $B\types C$ into a derivation of $A\types C$.
The inductive nature of the proof means that this algorithm is recursive.
And because terms uniquely represent derivations (modulo $\alpha$-equivalence), it can equivalently be considered an operation on derivable term judgments.

For instance, suppose we start with $x:A \types f(x):B$ and $y:B\types h(g(y)):C$; then the construction proceeds in the following steps.
\begin{itemize}
\item The second derivation ends with an application of $h$, so we apply the inductive hypothesis to $x:A \types f(x):B$ and $y:B\types g(y):D$.
\item Now the second derivation begins with an application of $g$, so we recurse again on $x:A \types f(x):B$ and and $y:B\types y:B$.
\item This time the second derivation is just the identity rule, so the result is the first given derivation $x:A \types f(x):B$.
\item Backing out of the induction one step, we apply $g$ to this result to get $x:A\types g(f(x)):D$.
\item Finally, backing out one more time, we apply $h$ to the previous result to get $x:A\types h(g(f(x))):C$.
\end{itemize}
Intuitively, the result $h(g(f(x)))$ has been obtained by \emph{substituting} the term $f(x)$ for the variable $y$ in the term $h(g(y))$.
Thus, we refer to the operation defined by \cref{thm:category-cutadm} as \textbf{substitution}, and sometimes state \cref{thm:category-cutadm} and its analogues as \textbf{substitution is admissible}.
In general, given $x:A\types M:B$ and $y:B\types N:C$ we denote the substitution of $M$ for $y$ in $N$ by $N[M/y]$ (although unfortunately one also finds other notations in the literature; including, quite confusingly, $[M/y]N$ and $N[y/M]$).

The operation $N[M/y]$ this is ``meta-notation'': the square brackets are not part of the syntax of terms, instead they denote an operation \emph{on} terms.
The proof of \cref{thm:category-cutadm} \emph{defines} the notion of substitution recursively in the following way:
\begin{align}
  y[M/y] &= M\label{eq:category-sub-1}\\
  f(N)[M/y] &= f(N[M/y])\label{eq:category-sub-2}
\end{align}
When terms are regarded as objects of study in their own right, rather than just as notations for derivations, it is common to define substitution as an operation on terms first, and then to state \cref{thm:category-cutadm} as ``if $x:A\types M:B$ and $y:B\types N:C$ are derivable, then so is $x:A\types N[M/y]:C$''.
We instead consider \cref{thm:category-cutadm} as fundamentally an operation on derivations, which we call ``substitution'' especially when representing it using term notation.

Note, though, that because a derivation is represented by a term together with a variable for the antecedent (that is, $x:X\types M:B$ is a notational variant of $x.M:(X\types B)$), technically this operation on derivations has to specify the variables too.
The notation $N[M/y]$ represents only the term part; so the definitions~\eqref{eq:category-sub-1} and~\eqref{eq:category-sub-2} are only complete when combined with the statement that the variable of $N[M/y]$ is the same as that of $M$.

\begin{rmk}
  Substitution is already a place where the use of distinct named variables (and hence $\alpha$-equivalence) makes the exposition substantially clearer for a human reader.
  We even teach our calculus students (or, at least, the author does) that when composing functions $f$ and $g$, it is clearer to use different variables for the two functions, writing $y=f(x)$ but $z=g(y)$ and then plugging $f(x)$ in place of $y$ in the second equation to get $z = g(f(x))$.
  It is possible to get away with using the same variable for the inputs of all functions, as we do in de Bruijn style, but it is much easier to get confused that way.
\end{rmk}

Before proving the initiality theorem, let us first observe that substitution does, in fact, define a category:

\begin{lem}\label{thm:category-subassoc}
  Substitution is associative: given $x:A\types M:B$ and $y:B\types N:C$ and $z:C\types P:D$, we have $P[N/z][M/y] = P[N[M/y]/z]$.
  (This is a literal equality of derivations, or equivalently of terms modulo $\alpha$-equivalence.)
\end{lem}
\begin{proof}
  By induction on the derivation of $P$.
  If it ends with the identity, so that $P=z$, then
  \[P[N/z][M/y] = z[N/z][M/y] = N[M/y] = z[N[M/y]/z] = P[N[M/y]/z] \]
  If it ends with an application of a morphism $f$, so that $P = f(Q)$, then
  \begin{multline*}
    f(Q)[N/z][M/y] = f(Q[N/z])[M/y] = f(Q[N/z][M/y])\\
    = f(Q[N[M/y]/z]) = f(Q)[N[M/y]/z]
  \end{multline*}
  using the inductive hypothesis for $Q$ in the third step.
\end{proof}

\begin{thm}\label{thm:category-initial-2}
  The free category on a directed graph $\cG$ has the same objects as \cG, and its morphisms are the derivations $A\types B$ in the cut-free type theory for categories under \cG (or, equivalently, the derivable term judgments $x:A \types M:B$, modulo $\alpha$-equivalence).
\end{thm}
\begin{proof}
  Let $\F\bCat\cG$ be defined as in the statement, with composition given by substitution constructed as in \cref{thm:category-cutadm}.
  By \cref{thm:category-subassoc}, composition is associative.
  For unitality, we have $y[M/y] = y$ by definition, while $N[x/x] = N$ is another easy induction on the structure of $N$.
  Thus, $\F\bCat\cG$ is a category.

  Now suppose \cA is any category and $\pfree:\cG\to\cA$ is a map of directed graphs.
  We define $\free:\F\bCat\cG \to\cA$ by recursion on the rules of the type theory: the identity $x:A\types x:A$ goes to $\idfunc_{P(A)}$, while $x:A\types f(M):B$ goes to $\pfree(f) \circ \free(M)$, with $\free(M)$ defined recursively.
  Since $x:A\types f(M):B$ is the composite of $x:A\types M:C$ and $y:C\types f(y):B$ in $\F\bCat\cG$, this is the only possible definition that could make $\free$ a functor.
  It remains to check that it actually \emph{is} a functor, i.e.\ that it preserves \emph{all} composites; that is, we must show that $\free(N[M/y]) = \free(N) \circ \free(M)$.
  This follows by yet another induction on the derivation of $N$.
\end{proof}

Note that we did not have to impose any equivalence relation on the derivations in this theory.
This suggests a second, more interesting, general statement about categorical logic: it is
{a syntax for generating free categorical structures using derivations from rules}
\emph{that yield elements in canonical form}, eliminating the need for quotients.
This statement is actually too narrow; as we will see later on, type theory is not \emph{just} about canonical forms.
However, canonical forms do play a very important role.

From the perspective of category theory, the reason for the importance of canonical forms is that we can easily decide whether two canonical forms are equal.
In the cut-free type theory for categories, two terms present the same morphism in a free category just when they are literally equal (modulo $\alpha$-equivalence); whereas to check whether two terms are equal in the cut-ful theory we have to remove the identities and reassociate them all to the left or the right.

In fact, a good algorithm for checking equality of terms in the cut-ful theory is to \emph{interpret them into the cut-free theory}!
That is, we note that every rule of the cut-ful theory is admissible in the cut-free theory, and hence eliminable; so any term (i.e.\ derivation) in the cut-ful theory yields a derivation in the cut-free theory.
For instance, to translate the cut-ful term $h\comp{C} ((\id_C\comp{C} g) \comp{B} f)$ into the cut-free theory, we first write it as a derivation
\begin{mathpar}
  \inferrule*[Right=$\circ$]{
    h:(C\types D)\\
    \inferrule*[Right=$\circ$]{
      \inferrule*[Right=$\circ$]{
        \id_C:(C\types C)\\
        g:(B\types C)}
      {(\id_C\comp{C} g):(B\types C)}\\
      f:(A\types B)
    }{((\id_C\comp{C} g) \comp{B} f):(A\types C)}
  }{(h\comp{C} ((\id_C\comp{C} g) \comp{B} f)):(A\types D)}
\end{mathpar}
and then annotate the same derivation by cut-free terms, using substitution for composition:
\begin{mathpar}
  \inferrule*[Right=$\circ$]{
    z:C\types h(z):D\\
    \inferrule*[Right=$\circ$]{
      \inferrule*[Right=$\circ$]{
        z:C\types z:C\\
        y:B\types g(y):C}
      {y:B\types g(y):C}\\
      x:A\types f(x):B
    }{x:A\types g(f(x)):C}
  }{x:A\types h(g(f(x))):D}
\end{mathpar}
Since, as we have proven, both the cut-ful and the cut-free theory present the same free structure, it follows that \emph{two terms in the cut-ful theory are equal modulo $\equiv$ exactly when their images in the cut-free theory are identical}.
Informally, we are just comparing two terms by  ``removing all the identities and the parentheses''; but in a more complicated theory much more can be going on.

In this sense, type theory can be considered to be about solving \emph{coherence problems} in category theory.
In general, the coherence problem for a categorical structure is to decide when two morphisms ``constructed from its basic data'' are equal (or isomorphic, etc.)
For instance, the classical coherence theorem of MacLane for monoidal categories says, informally, that two parallel morphisms constructed from the basic constraint isomorphisms of a monoidal category are \emph{always} equal; whereas the analogous theorem for braided monoidal categories says that they are equal if and only if they have the same underlying braid.
A type-theoretic calculus of canonical forms gives a way to answer this question, by translating a cut-ful theory into a cut-free one, and cut-elimination methods have frequently been used in the proof of coherence theorems. % [TODO: some citations].
% ~\cite{blute:ll-coh},~\cite{bcst:natded-coh-wkdistrib}
% https://golem.ph.utexas.edu/category/2008/02/logicians_needed_now.html#c015266
% https://golem.ph.utexas.edu/category/2008/02/logicians_needed_now.html#c015167
% and other references therein
We will not explore this aspect here, however.

\label{sec:identifying-initial-objects}
A related remark is that categorical logic is about \emph{showing that two different categories have the same\footnote{Of course, technically, an object of one category is not generally also an object of another one.  So what we mean is that there is an easy way to transform the initial object of one category into the initial object of another.} initial object}.
The primitive rules of a type theory can be regarded as the ``operations'' of a certain algebraic theory, and the judgments that can be derived from these rules form the initial algebra for this theory, i.e.\ the initial object in a certain category.
(See \cref{chap:dedsys} for a precise statement along these lines.)
The initiality theorems we care about, however, show that these initial objects are \emph{also} initial in some other, quite different, category that is of more intrinsic categorical interest.

\begin{rmk}\label{rmk:admissible-derivable-2}\label{rmk:free}
  This point of view sheds further light on the distinction between derivable and admissible rules mentioned in \cref{rmk:admissible-derivable-1}.
  A derivable rule automatically holds in any model of the ``algebraic theory'' version of a type theory, whereas an admissible rules holds \emph{only in the initial algebra} (or, more generally, free algebras) for this algebraic theory.
  In particular, an arbitrary model of the algebraic rules of the cut-free type theory for categories is not even a category, e.g.\ it may not satisfy the cut rule.

  It can be tempting for a category theorist, upon learning that type theory is a presentation of a certain free structure, to conclude that the emphasis on \emph{free} structures is myopic or of only historical interest, and attempt to generalize to not-necessarily-free algebras over the same theory.
  This temptation should be resisted.
  At best, it leads to neglect of some of the most important and interesting features of type theory, such as cut-elimination, which holds only in free structures.
  At worst, it leads to nonsense, for central type-theoretic notions such as ``bound variable'' (see \cref{sec:catcoprod}) \emph{only make sense} in free structures.
  We will see in \cref{sec:unary-theories} that we can still use type theory to ``present'' categorical objects that are not themselves free (at least, not in the usual sense); but the syntax of types and terms/derivations must still itself be freely generated.
\end{rmk}


\subsection*{Exercises}

\begin{ex}\label{ex:categories-over}
  Let \sM be a fixed category; then we have an induced adjunction between $\bCat/\sM$ and $\bGr/\sM$.
  Describe a cut-free type theory for presenting the free category-over-\sM on a directed-graph-over-\sM, and prove the initiality theorem (the analogue of \cref{thm:category-initial-2}).
  Note that you will have to prove that cut is admissible first.
  \textit{(Hint: index the judgments by arrows in \sM, so that for instance $A\types_\alpha B$ represents an arrow lying over a given arrow $\alpha$ in $\sM$.)}
\end{ex}

\begin{ex}\label{ex:cat-2free}
  Category theorists are accustomed to consider \bCat as a 2-category, but our free category $\F\bCat\cG$ only has a 1-categorical universal property, expressed by the 1-categorical adjunction between $\bCat$ and $\bGr$.
  It is not immediately obvious how it could be otherwise, since unlike \bCat, \bGr is only a 1-category; but there is something along these lines that we can say.
  \begin{enumerate}
  \item Suppose \cG is a directed graph and \cC a category; define a category $\bGr(\cG,\cC)$ whose objects are graph morphisms $\cG\to\cC$ and whose morphisms are an appropriate kind of ``natural transformation''.
  \item Prove that $\bGr(\cG,-)$ is a 2-functor $\bCat\to\bCat$.
  \item Using the cut-free presentation of $\F\bCat\cG$, prove that it is a representing object for this 2-functor.
  \end{enumerate}
\end{ex}

\begin{ex}\label{ex:nonfree-noadm}
  Regarding the cut-free type theory for categories as describing a multi-sorted algebraic theory, define a particular algebra for this theory that does not satisfy the cut rule.
  Then define another algebra that does admit a ``cut rule'', but in which the resulting ``composition'' is not associative.
\end{ex}


\section{Meet-semilattices}
\label{sec:mslat}

Moving gradually up the ladder of nontriviality, we now consider categories with finite products, or more precisely binary products and a terminal object.
In fact, let us revert back to the posetal world first and consider posets with binary meets and a top element, i.e.\ meet-semilattices.
We will make all this structure algebraic, so that our meet-semilattices are posets (which, recall, is not necessarily skeletal) \emph{equipped with} a chosen top element and an operation assigning to each pair of objects a meet.
We then have an adjunction relating the category \bmSLat of such meet-semilattices (and morphisms preserving all the structure strictly) with the category \bRelGr of relational graphs, and we want to describe the free meet-semilattice on a relational graph \cG.

One new feature this introduces is that the objects of $\F\bmSLat \cG$ will no longer be the same as those of \cG: we need to add a top element and freely apply the meet operation.
In order to describe this type-theoretically, we introduce a new judgment ``$\types A\type$'', meaning that $A$ will be one of the objects of the poset we are generating.
The rules for this judgment are
\begin{mathpar}
  \inferrule{ }{\types \top\type} \and
  \inferrule{\types A\type \\ \types B\type}{\types A\meet B\type}
\end{mathpar}
When talking about type theory under \cG, we additionally include ``axiom'' rules saying that each object of \cG is a type:
\begin{mathpar}
  \inferrule{A\in\cG}{\types A\type}
\end{mathpar}
Note that the premise $A\in \cG$ here is not a judgment; rather it is an ``external'' fact that serves as a precondition for application of this rule.
Thus it would be more correct to write this rule as
\begin{mathpar}
  \inferrule{ }{\types A\type}\;\text{(if $A\in\cG$)}
\end{mathpar}
but we will generally write such conditions as premises, since otherwise the notation can get rather unwieldy.

As an example of the application of these rules, if $A,B\in\cG$ we have a derivation
\begin{mathpar}
  \inferrule*{
    \inferrule*{A\in\cG}{\types A\type}\\
    \inferrule*{\inferrule*{ }{\types \top\type} \\ \inferrule*{B\in\cG}{\types B\type}}{\types \top\meet B\type}
  }{
    \types (A\meet (\top\meet B))\type
  }
\end{mathpar}
so that $A\meet (\top\meet B)$ will be one of the objects of $\F\bmSLat\cG$.

Now we need to describe the morphisms, i.e.\ the relation $\le$ in $\F\bmSLat\cG$.
The obvious thing to do is to assert the universal property of the meet and the top element:
\begin{mathpar}
  \inferrule{ }{A\types \top}\and
  \inferrule{ }{A\meet B \types A}\and
  \inferrule{ }{A\meet B \types B}\and
  \inferrule{A\types B \\ A\types C}{A\types B\meet C}
\end{mathpar}
This works, but it forces us to go back to asserting transitivity/cut.
For instance, if $A,B,C\in \cG$ we have the following derivation:
\begin{mathpar}
  \inferrule*{
    \inferrule*{ }{(A\meet B)\meet C \types A\meet B}\\
    \inferrule*{ }{A\meet B \types A}
  }{
    (A\meet B)\meet C \types A
  }
\end{mathpar}
but there is no way to deduce this without using the cut rule.
Thus, this ``cut-ful type theory for meet-semilattices under \cG'' works, but to have a better class of ``canonical forms'' for its relations we would also like a cut-free version.

What we need to do is to treat the ``projections'' $A\meet B \to A$ and $A\meet B\to B$ similarly to how we treated the edges of \cG in \cref{sec:categories}.
However, at this point we have to make a choice of whether to build in postcomposition or precomposition:
\[
\inferrule{A\types C}{A\meet B \types C} \qquad\text{or}\qquad
\inferrule{C\types A\meet B}{C\types A} \quad ?
\]
Both choices work (that is, they make cut admissible), and lead to different kinds of type theories with different properties.
The first leads to a kind of type theory called \textbf{sequent calculus}, and the second to a kind of type theory called \textbf{natural deduction}.
We consider each in turn.

\subsection{Sequent calculus for meet-semilattices}
\label{sec:seqcalc-mslat}

To be precise, for a relational graph \cG, the \textbf{unary sequent calculus for meet-semilattices under \cG} has the following rules (in addition to the rules for the judgment $\types A\type$ mentioned above).
We label each rule on the right to make them easier to refer to later on.
\begin{mathpar}
  \inferrule{A\in \cG}{A\types A}\;\idfunc\and
  \inferrule{f\in \cG(A,B) \\ X\types A}{X\types B}\;fR\and
  \inferrule{\types A\type}{A\types \top}\;\top R\and
  \inferrule{A\types C \\ \types B\type}{A\meet B \types C}\;\meetL1\and
  \inferrule{B\types C \\ \types A\type}{A\meet B \types C}\;\meetL2\and
  \inferrule{A\types B \\ A\types C}{A\types B\meet C}\;\meetR
\end{mathpar}

There are several things to note about this.
The first is that we have included in the premises some judgments of the form $\types A\type$.
This ensures that whenever we can derive a sequent $A\types B$, that $A$ and $B$ are well-formed as types.
However, we don't need to assume explicitly as premises that \emph{all} types appearing in any sequent are well-formed, only those that are introduced without belonging to any previous sequents; this is sufficient for the following inductive proof.

\begin{thm}\label{thm:seqcalc-mslat-wftype}
  In the unary sequent calculus for meet-semilattices under \cG, if $A\types B$ is derivable, then so are $\types A\type$ and $\types B\type$.
\end{thm}
\begin{proof}
  By induction on the derivation of $A\types B$.
  \begin{itemize}
  \item If it is the $\idfunc$ rule, then $A\in\cG$ and so $\types A\type$.
  \item If it ends with the rule $fR$ for some $f\in\cG(A,B)$, then $B\in \cG$ and so $\types B\type$, while $X\types A$ and so $\types X\type$ by the inductive hypothesis.
  \item If it ends with the rule $\top R$, then $\types A\type$ by assumption.
  \item If it ends with the rule $\meetL1$, then $\types B\type$ by assumption, while $\types A\type$ and $\types C\type$ by the inductive hypothesis; thus also $\types A\meet B\type$.
  \item The cases for $\meetL2$ and $\meetR$ are similar.\qedhere
  \end{itemize}
\end{proof}

We will generally formulate our type theories with just enough premises to make theorems such as \cref{thm:seqcalc-mslat-wftype} true.
Essentially this means that if some type appears in the conclusion but not in any of the premises, we have to add its ``type-ness'' judgment as an additional premise.
We will not usually state and prove theorems analogous to \cref{thm:seqcalc-mslat-wftype} explicitly, but the reader can verify that they will always be true.

The second thing to note about our current type theory is that we only assert the identity rule $A\types A$ when $A$ is a \emph{generating object} (also called a \emph{base type}), i.e.\ an object of \cG.
This is sufficient because in the sequent calculus, we can derive the identity rule for any type:

\begin{thm}\label{thm:seqcalc-mslat-idadm}
  In the unary sequent calculus for meet-semilattices under \cG, if $A$ is a type (that is, if $\types A\type$ is derivable), then $A\types A$ is derivable.
\end{thm}
\begin{proof}
  We induct on the derivation of $\types A\type$.
  There are three cases:
  \begin{enumerate}
  \item $A$ is in \cG.  In this case $A\types A$ is an axiom.
  \item $A=\top$.  In this case $\top\types\top$ is a special case of the rule that anything $\types\top$.
  \item $A=B\meet C$ and we have derivations $\sD_B$ and $\sD_C$ of $\types B \type$ and $\types C\type$ respectively.
    Therefore we have, inductively, derivations $\sD_1$ and $\sD_2$ of $B\types B$ and $C\types C$, and we can put them together like this:
    \begin{equation*}
      \inferrule*{
        \inferrule*{
          \inferrule*{\sD_1\\\\\vdots}{B\types B} \\
          \inferrule*{\sD_C\\\\\vdots}{\types C\type}
        }{
          B\meet C \types B
        }\\
        \inferrule*{
          \inferrule*{\sD_2\\\\\vdots}{C\types C}\\
          \inferrule*{\sD_B\\\\\vdots}{\types B\type}
        }{
          B\meet C\types C
        }
      }{
        B\meet C\types B\meet C
      }\qedhere
    \end{equation*}
  \end{enumerate}
\end{proof}

In other words, the general identity rule
\[ \inferrule{\types A\type}{A\types A} \]
is also \emph{admissible}.
This is a general characteristic of sequent calculi.

Next we prove that the cut rule is admissible for this sequent calculus too.

\begin{thm}\label{thm:seqcalc-mslat-cutadm}
  In the unary sequent calculus for meet-semilattices under \cG, if $A\types B$ and $B\types C$ are derivable, then so is $A\types C$.
\end{thm}
\begin{proof}
  By induction on the derivation of $B\types C$.
  \begin{enumerate}
  \item If it is $\idfunc$, then $B=C$.
    Now $A\types C$ is just $A\types B$ and we are done.
  \item If it is $f\in\cG(C',C)$, then we have a derivation of $B\types C'$.
    So by the inductive hypothesis we can derive $A\types C'$, whence also $A\types C$ by the rule for $f$.
  \item If it ends with $\top R$, then $C=\top$.
    Since $A\types B$ is derivable, by \cref{thm:seqcalc-mslat-wftype} $\types A\type$ is also derivable; thus by $\top R$ we have $A\types \top$.
  \item If it ends with $\meetR$, then $C=C_1\meet C_2$ and we have derivations of $B\types C_1$ and $B\types C_2$.
    By the inductive hypothesis we can derive both $A\types C_1$ and $A\types C_2$, to which we can apply $\meetR$ to get $A\types C_1\meet C_2$.
  \item If it ends with $\meetL1$, then $B=B_1\meet B_2$ and we can derive $B_1\types C$.
    We now do a secondary induction on the derivation of $A\types B$.
    \begin{enumerate}
    \item It cannot end with $\idfunc$ or $f$ or $\top R$, since $B=B_1\meet B_2$ is not in $\cG$ and not equal to $\top$.
    \item If it ends with $\meetL1$, then $A=A_1\meet A_2$ and we can derive $A_1\types B$.
      By the inductive hypothesis, we can derive $A_1 \types C$, and hence by $\meetL1$ also $A \types C$.
      The case of $\meetL2$ is similar.
    \item If it ends with $\meetR$, then we can derive $A\types B_1$ and $A\types B_2$.
      Recall that we are also assuming a derivation of $B_1\types C$.
      Thus, by the inductive hypothesis on $A\types B_1$ and $B_1\types C$, we can derive $A\types C$.
      \label{item:mslat-principal-cut}
    \end{enumerate}
  \item The case when it ends with $\meetL2$ is similar.\qedhere
  \end{enumerate}
\end{proof}

This simple proof already displays many of the characteristic features of a cut-admissibility argument.
The final case~\ref{item:mslat-principal-cut} is called the \textbf{principal case} for the operation $\meet$, when the type $B$ we are composing over (also called the \textbf{cut formula}) is obtained from $\meet$ and both sequents are also obtained from the $\meet$ rules.
In a direct argument for cut-elimination such as that sketched after \cref{thm:category-cutelim}, this case becomes the following transformation on derivations:
\begin{equation*}
  \let\mymeet\meet
  \def\meet{\mathord{\mymeet}}
  \inferrule*[right=cut]{
    \inferrule*[right=$\meetR$]{\labderivof{\sD_1}{A\types B_1} \\ \labderivof{\sD_2}{A\types B_2}}{A\types B_1\meet B_2}\\
    \inferrule*[Right=$\meetL1$]{\labderivof{\sD_3}{B_1\types C}}{B_1\meet B_2\types C}}{A\types C}
  \quad\leadsto\quad
  \inferrule*[right=cut]{\labderivof{\sD_1}{A\types B_1} \\ \labderivof{\sD_3}{B_1\types C}}{A\types C}
\end{equation*}

\begin{rmk}
  It may seem somewhat odd that we can prove the admissibility of all cuts (compositions), but we have to assert identities as a primitive rule for base/generating types.
  This is essentially because we chose to ``build a cut'' into the rule $fR$ that represents the generating arrows.
  If we had not, then we would have to assert ``cuts over base types'' (that is, where the cut formula is an object of \cG) as primitive rules, the way we did in the cut-ful theory of \cref{sec:category-cutful}.
  Put differently, building a cut into $fR$ is essentially the ``morphism version'' of asserting identities primitively for base types.
\end{rmk}

Finally, we have the initiality theorem:

\begin{thm}\label{thm:seqcalc-mslat-initial}
  For any relational graph $\cG$, the free meet-semilattice $\F\bmSLat \cG$ it generates is described by the unary sequent calculus for meet-semilattices under \cG: its objects are the $A$ such that $\types A\type$ is derivable, with $A\le B$ just when $A\types B$ is derivable.
\end{thm}
\begin{proof}
  \cref{thm:seqcalc-mslat-idadm,thm:seqcalc-mslat-cutadm} show that this defines a poset $\F\bmSLat\cG$.
  The rule $\top R$ implies that $\top$ is a top element, while the rules $\meetL1$, $\meetL2$, and $\meetR$ imply that $A\meet B$ is a binary meet.
  Therefore, we have a meet-semilattice.
  Moreover, the rules $\idfunc$ and $f$ yield a map of posets $\cG\to \F\bmSLat\cG$.

  Now suppose $\cM$ is any other meet-semilattice with a map $\pfree:\cG\to\cM$.
  Recall that a meet-semilattices is equipped with a chosen top element and meet function.
  We extend $\pfree$ to a map from the objects of $\F\bmSLat\cG$ by recursion on the construction of the latter, sending $\top$ to the chosen top element of \cM, and $A\meet B$ to the chosen meet in \cM of the (recursively defined) images of $A$ and $B$.
  This is clearly the only possible meet-semilattice map extending $\pfree$, and it clearly preserves the chosen meets and top element, so it suffices to check that it is a poset map.
  This follows by a straightforward induction over the rules for deriving the judgment $A\types B$.
\end{proof}

To finish, we observe that this sequent calculus has another important property.
Inspecting the rules, we see that the operations $\meet$ and $\top$ only ever appear in the \emph{conclusions} of rules.
Each operation $\meet$ and $\top$ has zero or more rules allowing us to introduce it on the right of the conclusion, and likewise zero or more rules allowing us to introduce it on the left.
(Specifically, $\meet$ has two left rules and one right rule, while $\top$ has zero left rules and one right rule.)
This is convenient if we are given a sequent $A\types B$ and want to figure out whether it is derivable: we can choose rules to apply ``in reverse'' by breaking down $A$ and $B$ according to their construction out of $\meet$ and $\top$.

It also tells us nontrivial things about derivations.
For instance, all the primitive rules have the property that every type appearing in their premises also appears as a sub-expression of some type in their conclusion.
Thus, any (cut-free) \emph{derivation} of a sequent $A\types B$ must involves only types appearing as sub-expressions of $A$ and $B$.
This is called the \textbf{subformula property}.

The phrase \emph{sequent calculus}, like \emph{type theory}, is difficult to define precisely, but sequent calculi generally exhibit the properties we have observed in this subsection: admissibility of the identity rule (based on an axiom applying only to base types), admissibility of cut, type operations appearing only in the conclusions of rules, and the subformula property.

\subsection{Natural deduction for meet-semilattices}
\label{sec:natded-mslat}

Now suppose we make the other choice about how to treat projections.
We call this the \textbf{unary natural deduction for meet-semilattices under \cG}; its rules (in addition to those for $\types A\type$) are
\begin{mathpar}
  \inferrule{\types X\type}{X\types X}\;\idfunc\and
  \inferrule{f\in \cG(A,B) \\ X\types A}{X\types B}\;fI\and
  \inferrule{\types X\type}{X\types \top}\;\top I\and
  \inferrule{X\types B\meet C}{X \types B}\;\meetE1\and
  \inferrule{X\types B\meet C}{X \types C}\;\meetE2\and
  \inferrule{X\types B \\ X\types C}{X\types B\meet C}\;\meetI
\end{mathpar}

We observe first that this theory has the same well-formedness property as the sequent calculus:

\begin{thm}\label{thm:natded-mslat-wftype}
  In the unary natural deduction for meet-semilattices under \cG, if $A\types B$ is derivable, then so are $\types A\type$ and $\types B\type$.\qed
\end{thm}

Unlike the sequent calculus, however, the general identity rule is not admissible: there is no way to derive $A\meet B \types A\meet B$ from $A\types A$ and $B\types B$ without it.
Thus, we assert the $\idfunc$ for all types, not just those coming from \cG.

Cut, however, is still admissible:

\begin{thm}\label{thm:natded-mslat-cutadm}
  In the unary natural deduction for meet-semilattices under \cG, if $A\types B$ and $B\types C$ are derivable, then so is $A\types C$.
\end{thm}
\begin{proof}
  We induct on the derivation of $B\types C$.
  \begin{enumerate}
  \item The cases when it ends with $\idfunc$, $f$, $\top I$, and $\meetI$ are just like those in \cref{thm:seqcalc-mslat-cutadm} for $\idfunc$, $f$, $\top R$, and $\meetR$.
  \item If it ends with $\meetE1$, then we have $B\types C\meet D$ for some $D$.
    Thus, $A\types C\meet D$ by the inductive hypothesis, so $A\types C$ by $\meetE1$.
    The case of $\meetE2$ is similar.\qedhere
  \end{enumerate}
\end{proof}

The proof is noticeably simpler than that of \cref{thm:seqcalc-mslat-cutadm}; we don't need the secondary inner induction.
This is essentially due to the fact that all the rules of this theory involve an \emph{arbitrary} type $X$ on the left (rather than one built using operations such as $\meet$).
Thus, instead of the rules of sequent calculus that introduce operations like $\meet$ and $\top$ on the left and right, we have rules like $\top I$ and $\meetI$ that introduce them on the right, and also rules that \emph{eliminate} them on the right like $\meetE1$ and $\meetE2$.
These properties are characteristic of \emph{natural deduction} theories.
(Later on, in \cref{sec:natded-logic}, we will be able to give a more convincing explanation of the origin of the phrase ``natural deduction''.)

\begin{rmk}
  Because the proof of cut admissibility for natural deduction theories is so much simpler than that for sequent calculus, some people say that the former is ``trivial''.
  Triviality is subjective; but what seems inarguable is that cut-admissibility for natural deduction is \emph{saying something different} than cut-admissibility for sequent calculus.
  The content of cut-admissibility for sequent calculus corresponds more closely to $\beta$-conversion in natural deduction (see \cref{sec:beta-eta}).
  Similarly, the admissibility of the identity rule for sequent calculus corresponds to the $\eta$-conversion rule in natural deduction (see \cref{sec:beta-eta} for that too).
\end{rmk}

Of course, we should also prove the initiality theorem:

\begin{thm}\label{thm:natded-mslat-initial}
  For any relational graph $\cG$, the free meet-semilattice $\F\bmSLat \cG$ it generates is described by the unary natural deduction for meet-semilattices under \cG: its objects are the $A$ such that $\types A\type$ is derivable, with $A\le B$ just when $A\types B$ is derivable.
\end{thm}
\begin{proof}
  Almost exactly like \cref{thm:seqcalc-mslat-initial}.
\end{proof}

\subsection*{Exercises}

\begin{ex}\label{ex:mslat-idem}
  Using the unary sequent calculus for meet-semilattices, prove that $A\meet A \cong A$ for any object $A$ of any meet-semilattice.
  (Recall that meet-semilattices are categories with at most one morphism in each hom-set, so for two objects to be isomorphic it suffices to have a morphism in each direction.)
  Then prove the same thing using the natural deduction.
\end{ex}

\begin{ex}\label{ex:mslat-monoid}
  Using either the sequent calculus or the natural deduction for meet-semilattices (your choice), prove that in any meet-semilattice we have
  \begin{mathpar}
    A\meet \top \cong A\and
    \top \meet A \cong A\and
    A\meet B \cong B\meet A\and
    A\meet (B\meet C) \cong (A\meet B)\meet C\and
  \end{mathpar}
\end{ex}

\begin{ex}\label{ex:mslat-invertible}
  Prove that the rules $\top R$ and $\meetR$ in the unary sequent calculus for meet-semilattices are \emph{invertible}, in the sense that whenever we have a derivation of their conclusions, we also have a derivation of all their premises.
\end{ex}

\begin{ex}\label{ex:jslat}
  Describe a sequent calculus for \emph{join-semilattices} (posets with a bottom element and binary joins), and prove the admissibility and initiality theorems for it.
  The rules for $\bot$ and $\join$ should be exactly dual to the rules for $\top$ and $\meet$.
\end{ex}

\begin{ex}\label{ex:lattices}
  By putting together the rules for meet- and join-semilattices, describe a sequent calculus for \emph{lattices} (posets with a top and bottom element and binary meets and joins), and prove the admissibility and initiality theorems for it.
\end{ex}

\begin{ex}\label{ex:lattices-invertible}
  Prove that in your sequent calculus for lattices from \cref{ex:lattices}, the rules $\top R$, $\meet R$, $\bot L$, and $\join L$ are all invertible in the sense of \cref{ex:mslat-invertible}.
\end{ex}

\begin{ex}\label{ex:seqcalc-poset-fib}
  A map of posets $P:\sA\to\sM$ is called a \emph{(cloven) fibration} if whenever $b\in\sA$ and $x\le P(b)$, there is a chosen $a\in \sA$ such that $P(a)=x$ and $a\le b$ and moreover for any $c\in\sA$, $c\le b$ and $P(c)\le x$ together imply $c\le a$.
  The object $a$ can be written as $x^*(b)$.
  \begin{enumerate}
  \item Given a fixed poset \sM, describe a sequent calculus for fibrations over \sM by adding rules governing the operations $x^*$ to the cut-free theory of \cref{ex:categories-over}.
  \item Prove the initiality theorem for this sequent calculus.
  \item Use this sequent calculus to prove that in any fibration $P:\sA\to\sM$, if we have $b\in \sA$ and $x\le y\le P(b)$, then $x^*(y^*(b))\cong x^*(b)$.
  \end{enumerate}
\end{ex}

\begin{ex}\label{ex:natded-poset-fib}
  Now describe instead a natural deduction for fibrations over \sM, prove the initiality theorem, and re-prove that $x^*(y^*(b))\cong x^*(b)$ using this theory.
\end{ex}

\begin{ex}\label{ex:mslat-fib}
  Suppose we augment your sequent calculus for fibrations over \sM from \cref{ex:seqcalc-poset-fib} with the following additional rules for ``fiberwise meets''.
  Here $\types A\type_x$ is a judgment indicating that $A$ will be an object of our fibration in the fiber over $x\in\sM$.
  \begin{mathpar}
    \inferrule{ }{\types \top_x \type_x} \and
    \inferrule{\types A\type_x \\ \types B\type_x}{\types A\meet_x B\type_x} \and
    \inferrule{\types A\type_x}{A\types_{x\le y} \top_y}\and
    \inferrule{A\types_{x\le y} C \\ \types B\type_x}{A\meet_x B \types_{x\le y} C}\and
    \inferrule{B\types_{x\le y} C \\ \types A\type_x}{A\meet_x B \types_{x\le y} C}\and
    \inferrule{A\types_{x\le y} B \\ A\types_{x\le y} C}{A\types_{x\le y} B\meet_y C}\and
  \end{mathpar}
  Consider the sequents
  \begin{align*}
    x^*(A\meet_y B) &\types_{x\le x} x^*A \meet_x x^*B\\
    x^*A \meet_x x^*B &\types_{x\le x} x^*(A\meet_y B)
  \end{align*}
  for $x\le y$, $\types A\type_y$, and $\types B\type_y$.
  \begin{enumerate}
  \item Construct derivations of these sequents in the above sequent calculus.
  \item Write down an analoguous natural deduction and derive the above sequents therein.
  \item What categorical structure do you think these type theories construct the initial one of?
    If you feel energetic, prove the initiality theorem.
  \end{enumerate}
\end{ex}

\begin{ex}\label{ex:poset-bifib}
  A map of posets $P:\sA\to\sM$ is called an \emph{opfibration} if $P\op:\sA\op\to\sM\op$ is a fibration.
  The analogous operation takes $a\in \sA$ and $P(a)\le y$ to a $b\in \sA$ with $P(b)=y$ and $a\le b$ and a universal property; we write this $b$ as $y_!(a)$.
  We say $P$ is a \emph{bifibration} if it is both a fibration and an opfibration.
  Describe a sequent calculus for bifibrations over a fixed \sM, and prove the initiality theorem.
\end{ex}

\begin{ex}\label{ex:poset-bifib-adj}
  Use your sequent calculus from \cref{ex:poset-bifib} to prove that in a bifibration of posets, if $x\le y$ in $\sM$, we have an adjunction $y_! \dashv x^*$.
\end{ex}

\begin{ex}\label{ex:poset-bifib-iso}
  Use your sequent calculus from \cref{ex:poset-bifib} to prove that in a bifibration of posets, if $x\cong y$ in $\sM$, we have an isomorphism $x_! \cong x^*$ (that is, for any $a$ in the fiber over $y$, we have $x_!(a) \cong x^*(a)$).
\end{ex}


\section{Categories with products}
\label{sec:catprod}\label{sec:beta-eta}

Now we move back from posets to categories.
For brevity, by a \textbf{category with products} we will mean a category with specified binary products and a specified terminal object.
Let \bPrCat be the category of such categories with products, and functors preserving them strictly.
Then we have an adjunction relating \bPrCat to \bGr, and we want to describe the left adjoint with a type theory.

As in \cref{sec:natded-mslat}, we could take either the sequent calculus route or the natural deduction route.
Unfortunately, even if we build in enough composition to make cut admissible, in \emph{both} cases we need to impose a further equivalence relation on the derivations, as there are single morphisms that can be derived in multiple ways.
However, the ways in which this happens in the two cases are different.

On one hand, if we have an arrow $f:A\to C$ in a directed graph \cG, then there is a morphism $A\times B \to A \to A\times C$ in the free category-with-products on \cG.
In a sequent calculus, there are two distinct derivations of this morphism:
\begin{mathpar}
  \inferrule*{
    \inferrule*{A\types A}{A\times B \types A}\\
    \inferrule*{\inferrule*[Right=$f$]{A\types A}{A\types C}}{A\times B \types C}
  }{
    A\times B \types A\times C
  }\and
  \inferrule*{
    \inferrule*{A\types A \\ \inferrule*[Right=$f$]{A\types A}{A\types C}}{A\types A\times C}
  }{
    A\times B \types A\times C
  }\and
\end{mathpar}
whereas in a natural deduction there will be only one:
\begin{mathpar}
  \inferrule*{
    \inferrule*{A\times B\types A\times B}{A\times B \types A}\\
    \inferrule*[Right=$f$]{\inferrule*{A\times B\types A\times B}{A\times B\types A}}{A\times B \types C}
  }{
    A\times B \types A\times C
  }
\end{mathpar}
This sort of thing is true quite generally.
A sequent calculus includes both left and right rules, so to derive a given sequent we must choose whether a left or a right rule is to be applied last.
By contrast, both kinds of rules in a natural deduction (introduction and elimination) act on the right, so there is less choice about what rule to apply last.

On the other hand, if we have an arrow $A\to B$ in \cG, then in a natural deduction there are (at least) two derivations of the identity $A\to A$:
\begin{equation}
  \inferrule*{\inferrule*{A\types A \\ \inferrule*{A\types A}{A\types B}}{A\types A\times B}}{A\types A}\qquad\text{and}\qquad
  \inferrule*{ }{A\types A}
  \label{eq:beta-deriv}
\end{equation}
while in a sequent calculus there is only one:
\begin{mathpar}
  \inferrule*{ }{A\types A}
\end{mathpar}
This is also true quite generally.
A natural deduction includes both introduction and elimination rules, so we will always be able to introduce a type and then eliminate it, essentially ``doing nothing''.
By contrast, in a sequent calculus we have ``only introduction rules'' (of both left and right sorts), so this cannot happen.

\begin{rmk}
  In \cref{sec:intro} we mentioned that by presenting free categorical structures without explicit reference to composition, type theory enables us to \emph{define} composition in such a way as to ensure that various desirable properties hold as exact equalities.
  The point here is just that sequent calculus and natural deduction make \emph{different} choices of which properties to ensure by their definitions of composition.
  Roughly speaking, sequent calculus chooses to make the \emph{defining equations} of universal properties hold exactly (e.g.\ the composites of a paired morphism $\pair f g : X \to A\times B$ with $\pi_1$ and $\pi_2$ are exactly $f$ and $g$ respectively); this is what the ``principal case'' of its cut-admissibility proof does.
  On the other hand, natural deduction chooses to make the \emph{naturality}\footnote{This is not actually the origin of the term ``natural deduction'' --- see \cref{sec:natded-logic} for that --- but it serves as a useful mnemonic.} of universal properties hold exactly; the equality between the two distinct sequent calculus derivations above expresses the naturality of pairing $\pair f g : X \to A\times B$ with respect to precomposition by $\pi_1$.
\end{rmk}

In fact, in the simple case of a unary type theory for categories with products, there are tricks enabling us to eliminate {both} kinds of redundancy, and thereby do without any ``$\equiv$'' for both the sequent calculus and the natural deduction.
(For sequent calculus the trick is called ``focusing'', and for natural deduction the trick is called ``canonical/atomic terms''.)
However, in even slightly more complicated theories the analogous tricks do not eliminate $\equiv$ completely (though they do reduce its complexity).
Moreover, our focus here is on type theory as a notation for category-theoretic arguments, not as a tool for isolating canonical forms and proving coherence theorems.
Thus, we will not spend any time on these tricks, but instead bite the bullet and deal with $\equiv$.

As was the case in \cref{sec:category-cutful}, it is easier to describe the rules for $\equiv$ if we first introduce terms for derivations.
Thus, we generalize the abstract-variable term syntax of \cref{sec:category-cutadm} with terms for the $\times$ and $\unit$ rules, as shown in \cref{fig:catprod-terms}.

\begin{figure}
  \centering
  \begin{mathpar}
    \inferrule{\types X\type}{x:X\types x:X}\;\idfunc\and
    \inferrule{f\in \cG(A,B) \\ x:X\types M:A}{x:X\types f(M):B}\;fI\and
    \inferrule{\types X\type}{x:X\types \ttt:\unit}\;\unit I\and
    \inferrule{x:X\types M:B\times C}{x:X \types \pr1BC(M):B}\;\timesE1\and
    \inferrule{x:X\types M:B\times C}{x:X \types \pr2BC(M):C}\;\timesE2\and
    \inferrule{x:X\types M:B \\ x:X\types N:C}{x:X\types \pair MN:B\times C}\;\timesI
  \end{mathpar}
  \caption{Terms for categories with products}
  \label{fig:catprod-terms}
\end{figure}

Since terms are just a notation for derivations, the general principle for naming derivations by terms is that each rule (being an operation on derivations) should correspond to a ``term operation'' that indicates unambiguously what rule is being applied, what the premises are (by including terms for them), and how they are combined.
This is no different from any other mathematical notation for any other operation.
For instance, in \cref{sec:category-cutful} the composition rule was represented by a (subscripted) binary operation $\circ_B$ combining a pair of terms for the premises.

However, the use of abstract variables complicates matters a bit, because we are only free to describe a notation for the term part (attached to the consequent), whereas the term part only describes a derivation when combined with a variable (attached to the antecedent).
For instance, the two premises $x:X\types M:A$ and $x:X\types N:B$ of the rule $\timesI$ are really $x.M:(X\types A)$ and $x.N\types (X\types B)$, and so the term for the induced derivation $X\types A\times B$ ought to ``pair up'' $x.M$ and $x.N$ somehow; but how would it then be associated to a variable in its own antecedent?

In our present case, we can solve this by using the fact that both premises have the same antecedent \emph{and the same variable}, so we can ``pull that variable out'' of the pair and write $x.\pair M N : (X\types A\times B)$, or $x:X\types \pair M N :A\times B$.
(In other cases, such as \cref{sec:catcoprod}, a more complicated solution is needed.)
But why \emph{do} they have the same variable?
Of course, if we use the de Bruijn method, then they must always have the same variable because all judgments have the same variable.
But otherwise, they might in principle have different variables, and so we might have to rename the variable in one of them (that is, apply an $\alpha$-equivalence) before we can use this notation for $\timesI$.
This is an important general principle.

\begin{princ}\label{princ:term-der-alpha}
  A rule applies equally to any derivations of its premises, but the abstract-variable term notation for that rule may require certain compatibilities between the variables occurring in the premises, which can always be ensured by $\alpha$-equivalence.
\end{princ}

The terms for the other rules $\timesE1,\timesE2,\unit I$ are relatively straightforward.
The superscript type annotations on the projections $\pr1BC$ and $\pr2BC$ are necessary to make type-checking possible, since otherwise it would not be clear from a term such as $x:A \types \pi_1(M):B$ what the type of $M$ should be in the premise.
However, in practice we often omit them.
It is then straightforward to prove the analogue of \cref{thm:category-tad}.

Note how well the natural-deduction choice of ``all rules acting on the right'' matches the use of abstract variables: in all cases we can think of ``applying functions to arguments'' in a familiar way.
It is possible to describe sequent calculus derivations using terms as well, but they are less pretty.
For this reason, \emph{we will henceforth use sequent calculus only for posetal theories}.
(I emphasize that this is a choice of focus and exposition for the present notes only; sequent calculus has successfully been used to answer many coherence questions about non-posetal categorical structures.)
% with a brief exception in \cref{sec:focusing}.
The need to impose the identity rule for all types (not just those coming from \cG) also makes perfect sense from the abstract variable standpoint: a variable in any type is also a term of that type.

Now that we have our terms, the desired equivalence between the two derivations~\eqref{eq:beta-deriv} of the identity $A\to A$ can be written as
\begin{equation}
  \pr1AB(\pair M N) \equiv M\label{eq:beta-prodcat-1}
\end{equation}
and of course we should also have
\begin{equation}
  \pr2AB(\pair M N) \equiv N\label{eq:beta-prodcat-2}
\end{equation}
Note that in these equalities we allow $M$ and $N$ to be arbitrary terms.
Categorically speaking, therefore, we are asserting that the maps $X\to A\times B$ induced by the universal property of the product (the $\timesI$ rule) do in fact have the desired composites with the projections.

The other half of the universal property is the uniqueness of maps into a product.
This corresponds to a dual family of simplifications: we want to identify the following derivations of $A\times B\to A\times B$.
\begin{mathpar}
  \inferrule*{
    \inferrule*{\inferrule*{ }{A\times B\types A\times B}}{A\times B \types A} \\
    \inferrule*{\inferrule*{ }{A\times B\types A\times B}}{A\times B \types B}
  }{
    A\times B \types A\times B
  }\and
  \inferrule*{ }{A\times B\types A\times B}
\end{mathpar}
In term syntax, this means that
\begin{equation}
  \pair{\pr1AB(M)}{\pr2AB(M)} \equiv M\label{eq:eta-prodcat}
\end{equation}
In type-theoretic lingo, the equalities~\eqref{eq:beta-prodcat-1} and~\eqref{eq:beta-prodcat-2} are called \textbf{$\beta$-conversion}\footnote{Presumably $\beta$-conversion is so named because it is the ``second most basic'' equivalence relation on terms, with $\alpha$-equivalence (renaming of variables) being the first.
However, there is a significant difference between the two: $\alpha$-equivalent terms represent the \emph{same} derivation, while $\beta$-conversion relates \emph{distinct derivations} (though we generally notate them with terms).} while the equality~\eqref{eq:eta-prodcat} is called an \textbf{$\eta$-conversion}.

For $\unit$ there is no $\beta$-conversion rule, while the $\eta$-conversion rule is
\begin{equation}\label{eq:eta-unit}
  \ttt \equiv M
\end{equation}
for any term $M:\unit$.
These conversions generate an equivalence relation on terms, which we also require to be a congruence for everything else.

We can describe this more formally with an additional judgment ``$x:X\types M\equiv N :A$'', with rules shown in  \cref{fig:catprod-equality}.
Note that in addition to the $\beta$- and $\eta$-conversions, we assert reflexivity, symmetry, and transitivity, and also that all the previous rules preserve equality.
As remarked in \cref{sec:category-cutful}, all our equality judgments $\equiv$ will be equivalence relations with such a congruence property for all the primitive rules.
In general we will not bother to state these ``standard'' rules for $\equiv$, but since this is our first encounter with such a relation involving ``abstract variable'' term syntax we have included them explicitly.

\begin{figure}
  \centering
  \begin{mathpar}
    \inferrule{x:X\types M:A \\ x:X\types N:B}{x:X\types \pr1AB(\pair M N) \equiv M :A}\and
    \inferrule{x:X\types M:A \\ x:X\types N:B}{x:X\types \pr2AB(\pair M N) \equiv N :B}\and
    \inferrule{x:X\types M:A\times B}{x:X\types \pair{\pr1AB(M)}{\pr2AB(M)} \equiv M:A\times B}\and
    \inferrule{x:X\types M:\unit}{x:X\types \ttt\equiv M:\unit}\and
    \inferrule{x:X\types M:A}{x:X \types M\equiv M:A}\and
    \inferrule{x:X\types M\equiv N:A}{x:X\types N\equiv M:A}\and
    \inferrule{x:X\types M\equiv N:A \\ x:X\types N\equiv P:A}{x:X\types M\equiv P:A}\and
    \inferrule{f\in \cG(A,B) \\ x:X\types M\equiv N:A}{x:X\types f(M)\equiv f(N):B}\\
    \inferrule{x:X\types M\equiv N:B\times C}{x:X \types \pr1BC(M)\equiv \pr1BC(N):B}\and
    \inferrule{x:X\types M\equiv N:B\times C}{x:X \types \pr2BC(M)\equiv \pr2BC(N):B}\and
    \inferrule{x:X\types M\equiv M':B \\ x:X\types N\equiv N':C}{x:X\types \pair MN\equiv \pair{M'}{N'}:B\times C}
  \end{mathpar}
  \caption{Equality rules for categories with products}
  \label{fig:catprod-equality}
\end{figure}

This completes the definition of the \textbf{unary type theory for categories with products under \cG}.

\begin{rmk}\label{rmk:beta-reduction}
  Note that unlike the equalities $h\circ (g\circ f) \equiv (h\circ g)\circ f$ from \cref{sec:category-cutful}, the $\beta$- and $\eta$-conversions are intutively ``directional'', with one side being ``simpler'' than the other.
  This suggests that we should be able to ``reduce'' an arbitrary term to a ``simplest possible form'' by successively applying $\beta$- and $\eta$-conversions.
  Such is indeed the case (although for technical reasons the $\eta$-conversion is usually applied in the less intuitive right-to-left direction and called an ``expansion'' rather than a ``reduction'').
  This \emph{process of reduction} (and expansion) belongs to the ``computational'' side of type theory, which (though of course important in its own right) is somewhat tangential to our category-theoretic emphasis, so we will not discuss it in detail.
  % (However, in \cref{cha:canonical} we will directly characterize the \emph{result of reduction}, and thereby implicitly the process thereof.)
\end{rmk}

\begin{rmk}\label{rmk:equality-terms}
  Note that this type theory has three kinds of judgments (or ``judgment forms''):
  \begin{mathpar}
    \types A\type \and x:A\types M:B \and x:A\types M\equiv N:B.
  \end{mathpar}
  Categorically, these will represent the objects of a category, the morphisms of a category, and the equalities between those morphisms.
  We have discussed how in the morphism judgment $x:A\types M:B$, the term $x.M$ is an annotation that isomorphically represents a particular derivation of the un-annotated judgment $A\types B$.
  In the object judgment $\types A\type$, we can similarly regard $A$ as a term annotation isomorphically representing a particular derivation of an un-annotatated judgment ``$\types\ftype$''.
  (We could thus write it as $\types A:\ftype$ or $A:(\types \ftype)$, but we generally don't, to avoid confusion with elements of the ``universe types'' that will be introduced much later.)

  By contrast, the equality judgment $x:A\types M\equiv N:B$ is the \emph{un-annotated} version; we have not introduced any terms representing its derivations.
  This is because two morphisms in a category can't be ``equal in more than one way'', so there is never any reason to care which derivation of an equality judgment we used.
  (Of course, this perspective has to be modified when one moves on to \emph{higher} category theory.)
  Note that the terms $x.M$ and $x.N$ annotating particular derivations of the morphism judgment appear in the un-annotated equality judgment, just as the terms $A$ and $B$ annotating particular derivations of the object judgment appear in the un-annotated morphism judgment $A\types B$.
\end{rmk}

\begin{thm}\label{thm:catprod-subadm}
  Substitution is admissible in the unary type theory for categories with products under \cG.
  That is, if we have derivations of $x:A\types M:B$ and $y:B \types N:C$, then we have a derivation of $x:A \types N[M/y]:C$.
\end{thm}
\begin{proof}
  The proof is essentially the same as that of \cref{thm:natded-mslat-cutadm}, but we write it out again explicitly with terms present.
  As always, we induct on the derivation of $y:B \types N:C$.
  \begin{enumerate}
  \item If it ends with $\idfunc$, then we can use the given derivation $M$.
  \item If it ends swith $fI$ for $f\in \cG(C',C)$, then we have $y:B \types N':C'$, so by induction we have $x:A\types N'[M/y]:C'$ and hence $x:A\types f(N'[M/y]):C'$.
  \item If it ends with $\unit I$, then by $\unit I$ we have $x:A \types \ttt:\unit$ as well.
  \item If it ends with $\timesE1$, then we have $y:B\types N':C\times C'$, so by induction we have $x:A \types N'[M/y]:C\times C'$, hence $x:A \types \pi_1(N'[M/y]):C$ by $\timesE1$.
    The case for $\timesE2$ is similar.
  \item Finally, if it ends with $\timesI$, we have $y:B\types N_1:C_1$ and $y:B\types N_2:C_2$, so by induction we have $x:A \types N_1[M/y]:C_1$ and $x:A \types N_2[M/y]:C_2$, hence $x:A \types \pair{N_1[M/y]}{N_2[M/y]}:C_1\times C_2$.\qedhere
  \end{enumerate}
\end{proof}

As with \cref{thm:category-cutadm}, this proof can be regarded as \emph{defining} recursively what it means to ``substitute'' $M$ for $y$ in $N$.
The defining clauses are
\begin{align*}
  N[M/y] &= M\\
  (f(N))[M/y] &= f(N[M/y])\\
  \ttt[M/y] &= \ttt\\
  (\pi_1(N))[M/y] &= \pi_1(N[M/y])\\
  (\pi_2(N))[M/y] &= \pi_2(N[M/y])\\
  \pair{N_1}{N_2}[M/y] &= \pair{N_1[M/y]}{N_2[M/y]}
\end{align*}

We leave it to the reader to similarly prove the following (\cref{ex:catprod-subcong-subassoc}):

\begin{lem}\label{thm:catprod-subcong}
  The relation $\equiv$ is a congruence for substitution in the unary type theory for categories with products under \cG.
  In other words, if we have derivations of $x:X \types M\equiv M':B$ and $y:B \types N\equiv N':C$, then we can derive $x:A \types N[M/y] \equiv N'[M'/y]:C$.\qed
\end{lem}

\begin{lem}\label{thm:catprod-subassoc}
  Substitution is associative in the unary type theory for categories with products under \cG: we have $P[N/z][M/y] = P[N[M/y]/z]$.\qed
\end{lem}

The proof of the initiality theorem is also similar, but we write out some of the details for later reference.

\begin{thm}\label{thm:catprod-initial}
  For any directed graph \cG, the free category-with-products $\F\bPrCat \cG$ it generates is described by the unary type theory for categories with products under \cG: its objects are the $A$ such that $\types A\type$ is derivable, and its morphisms $A\to B$ are the terms $M$ such that $x:A \types M:B$ is derivable, modulo the equivalence relation $\equiv$.
\end{thm}
\begin{proof}
  \cref{thm:catprod-subcong,thm:catprod-subassoc,thm:catprod-subadm} show that we obtain a category with products in this way.
  Now given any other category with products \cM and a map $P:\cG\to\cM$, we proceed inductively:
  \begin{enumerate}
  \item We define a map from types to the objects of \cM inductively on derivations of $\types A\type$, starting with $P$ and then using the chosen products in \cM.
  \item Then we define a map from terms to the morphisms of \cM inductively on derivations of $x:A \types M:B$, composing with the image of $P$ for the generator rules, and using the universal properties of the finite products in \cM for the introduction and elimination rules.
  \item Then we define a map from derivations of $x:A \types M\equiv N:B$ to equalities of morphisms in \cM by induction on the former, using the laws of the universal properties of products in \cM and the fact that equality is an equivalence relation and a congruence for everything.
  \item Then we prove that this operation is functorial, i.e.\ it takes a substitution $N[M/x]$ to a composite in \cM, by induction on the derivation of $N$ (which is also, of course, how substitution is defined).
  \item Then we prove that this functor preserves (the specified) products, essentially by definition.
  \item Finally, we show that it is the unique such functor, since its definition was forced at every stage by the fact that it be a functor preserving products.\qedhere
  \end{enumerate}
\end{proof}


\subsection*{Exercises}

\begin{ex}\label{ex:catprod-functor-uniq}
  Suppose we have
  \begin{mathpar}
    f\in\cG(A,B)\and
    g\in\cG(A,C)\and
    h\in\cG(B,D)\and
    k\in\cG(C,E)
  \end{mathpar}
  Consider the following two derivations of $A\types D\times E$.
  Note that both use the admissible cut/substitution rule.
  \begin{mathpar}
    \let\mytimes\times
    \def\times{\mathord{\mytimes}}
    \inferrule*[Right=$\timesI$]{
      \inferrule*[right=cut]{\inferrule*[Right=$f$]{\inferrule*{ }{A\types A}}{A\types B}\\
        \inferrule*[Right=$h$]{\inferrule*{ }{B\types B}}{B\types D}}{A\types D}\\
      \inferrule*[Right=cut]{\inferrule*[Right=$g$]{\inferrule*{ }{A\types A}}{A\types C}\\
        \inferrule*[Right=$k$]{\inferrule*{ }{C\types C}}{C\types E}}{A\types E}
    }{A\types D\times E}\and
    \inferrule*[Right=cut]{
      \inferrule*[right=$\timesI$]{
        \inferrule*[Right=$f$]{\inferrule*{ }{A\types A}}{A\types B}\\
        \inferrule*[Right=$g$]{\inferrule*{ }{A\types A}}{A\types C}
      }{A\types B\times C}\\
      \inferrule*[right=$\timesI$]{
        \inferrule*[Right=$h$]{\inferrule*[Right=$\timesE1$]{
            \inferrule*{ }{B\times C\types B\times C}
          }{B\times C\types B
          }}{B\times C\types D}\\
        \inferrule*[Right=$k$]{\inferrule*[Right=$\timesE2$]{
            \inferrule*{ }{B\times C\types B\times C}
          }{B\times C\types C
          }}{B\times C\types E}
      }{B\times C\types D\times E}
    }{A\types D\times E}\and
  \end{mathpar}
  Write down the terms corresponding to these two derivations and show directly that they are related by $\equiv$.
\end{ex}

\begin{ex}\label{ex:catprod-monoidal}
  Use the type theory for categories with products to prove that in any category with products we have
  \begin{mathpar}
    A\times B \cong B\times A\and
    A\times (B\times C) \cong (A\times B)\times C\and
    A\times \unit \cong A\and
    \unit \times A \cong A.
  \end{mathpar}
  Note that since we are in categories now rather than posets, to show that two types $A$ and $B$ are isomorphic we must derive $x:A\types M:B$ and $y:B\types N:A$ and also show that their substitutions in both orders are equal (modulo $\equiv$) to identities.
\end{ex}

\begin{ex}\label{ex:catprod-subcong-subassoc}
  Prove \cref{thm:catprod-subcong,thm:catprod-subassoc} (substitution is associative and respects $\equiv$ in the unary type theory for categories with products).
\end{ex}

\begin{ex}\label{ex:catfib}
  A functor $P:\sA\to \sM$ is called a \textbf{fibration} if for any $b\in \sA$ and $f:x\to P(b)$, there exists a morphism $\phi:a\to b$ in \sA such that $P(\phi)=f$ and $\phi$ is \emph{cartesian}, meaning that for any $\psi:c\to b$ and $g:P(c)\to x$ such that $P(\psi)=fg$, there exists a unique $\chi:c\to a$ such that $P(\chi)=g$ and $\phi\chi=\psi$.
  The object $c$ is denoted $f^*(b)$.
  \begin{enumerate}
  \item Generalize your natural deduction for fibrations of posets from \cref{ex:natded-poset-fib} to a type theory for fibrations of categories over a fixed base category \sM, with $\beta$- and $\eta$-conversion $\equiv$ rules.
  \item Prove the initiality theorem for this type theory.
  \item Use this type theory to prove that in any fibration $P:\sA\to\sM$:
    \begin{enumerate}
    \item For any $f:x\to y$ in \sM, $f^*$ is a functor from the fiber over $y$ to the fiber over $x$.
    \item For any $B\in \sA$ and $x\xto{f} y \xto{g} P(B)$ in \sM, we have $f^*(g^*(B)) \cong (gf)^*(M)$.
    \end{enumerate}
  \end{enumerate}
\end{ex}

\begin{ex}\label{ex:catprod-fib}
  Generalize \cref{ex:mslat-fib} from posets to categories, combining your type theory from \cref{ex:catfib} with the one for categories with products from \cref{sec:beta-eta}.
\end{ex}

\begin{ex}\label{ex:catprod-2free}
  The category $\bPrCat$ is a 2-category whose 2-cells are arbitrary natural transformations (that is, there is no nonvacuous notion of a ``product-preserving natural transformation'').
  Let \cG be a directed graph; as in \cref{ex:cat-2free}, define a 2-functor $\bGr(\cG,-):\bPrCat\to\bCat$, and show that $\F\bPrCat\cG$ is a representing object for it.
  \textit{(Use induction over the derivations of the judgments in its type-theoretic description.)}
\end{ex}

\begin{ex}\label{ex:catprod-flexible}
  \cref{ex:cat-2free,ex:catprod-2free} address one worry that a category theorist might have about the strictness of our constructions.
  Another such worry is that the morphisms in \bPrCat preserve specified products \emph{strictly}, while it is usually more natural in category theory to preserve products only up to isomorphism.
  This is not a problem if our main purpose is to have a syntax to describe objects and morphisms in particular categories with products; indeed, it is exactly what we would want.
  However, for abstract reasons it may be nice to also be able to say something about less strict functors.

  With this in mind, prove that for any \cG, the category with products $\F\bPrCat\cG$ is \emph{semi-flexible} in the sense of~\cite{bkp:2dmonads}: that is, if \cM has chosen products, then every functor $\F\bPrCat\cG \to \cM$ that preserves products in the usual up-to-isomorphism sense is naturally isomorphic to a functor that preserves the chosen products strictly.
  \textit{(Again, use induction over derivations in the type-theoretic description.)}
  Deduce that $\F\bPrCat\cG$ satisfies a universal property relative to the 2-category of categories with products and functors that preserve them up to isomorphism.
\end{ex}

\begin{ex}\label{ex:catprod-induction}
  Here is another way to prove the result of \cref{ex:catprod-flexible}.
  \begin{enumerate}
  \item Use the initiality of $\F\bPrCat\cG$ to show that if \cM has finite products and $Q:\cM\to \F\bPrCat\cG$ preserves finite products strictly, then any map of directed graphs $\cG \to \cM$ that lifts the inclusion $\cG \to \F\bPrCat\cG$ extends to a section $\F\bPrCat\cG \to \cM$ of $Q$ in \bPrCat.\label{item:catprod-induction}
  \item The results of~\cite{bkp:2dmonads} imply that the 2-category of categories with products and functors that preserve products in the usual up-to-isomorphism sense has 2-categorical limits called products, inserters, and equifiers, and the projections of these limits preserve products strictly.
    Use this, and~\ref{item:catprod-induction}, to prove that $\F\bPrCat\cG$ satisfies a universal property relative to this 2-category.
  \end{enumerate}
\end{ex}


\section{Categories with coproducts}
\label{sec:catcoprod}

% [TODO: I'm unsure whether this should be here at all.
% It seems a natural thing to expect, but on the other hand it involves variable binding in terms, which might be easier to introduce first with simply-typed $\lambda$-calculus in \cref{chap:simple}.
% Note that we also need variable binding for monoidal categories, so if we want to do those before cartesian monoidal categories (as also seems natural, since multicategories are simpler than cartesian multicategories) we need to discuss variable binding sooner than $\lambda$-calculus.]

In \cref{ex:jslat}, you obtained a {sequent calculus} for join-semilattices by dualizing the sequent calculus for meet-semilattices.
However, {natural deductions} don't dualize as straightforwardly, due to the insistence that all rules act only on the right.
(Of course, we could dualize them to ``co-natural deductions'' in which all rules act only on the \emph{left}, but that would destroy the familiar behavior of terms on variables, as well as make it tricky to combine left and right universal properties, such as for lattices.)
To describe joins in a natural deduction, we need to ``build an extra cut'' into their universal property:
\begin{mathpar}
  \inferrule{X\types A}{X\types A\join B}\;\joinI1\and
  \inferrule{X\types B}{X\types A\join B}\;\joinI2\and
  \inferrule{X\types A\join B \\ A\types C \\ B\types C}{X\types C}\;\joinE
\end{mathpar}
Note that $\joinE$ is precisely the result of cutting $\joinL$ with an arbitrary sequent:
\begin{mathpar}
  \inferrule*[Right=cut]{X\types A\join B \\ \inferrule*[Right=$\joinL$]{A\types C \\ B\types C}{A\join B\types C}}{X\types C}
\end{mathpar}
We treat the bottom element similarly:
\begin{mathpar}
  \inferrule{X\types \bot \\\types C\type}{X\types C}
\end{mathpar}

Rather than take the time to study join-semilattices, we skip directly to a \textbf{unary type theory for categories with coproducts} in which we care about distinct derivations.
As usual, for this purpose we annotate the judgments with terms, as shown in \cref{fig:catcoprod}.

\begin{figure}
  \centering
  \begin{mathpar}
    \inferrule{\types X\type}{x:X\types x:X}\;\idfunc\and
    \inferrule{f\in \cG(A,B) \\ x:X\types M:A}{x:X\types f(M):B}\;fI\and
    \inferrule{x:X \types M:\zero \\ \types C\type}{x:X \types \abort(M):C}\;\zeroE\\
    \inferrule{x:X\types M:A}{x:X\types \inl(M):A + B}\;\plusI1\and
    \inferrule{x:X\types N:B}{X\types \inr(N):A + B}\;\plusI2\and
    \inferrule{x:X\types M:A + B\\u:A\types P:C \\ v:B\types Q:C}{x:X\types \acase AB(M,u.P,v.Q):C}\;\plusE
  \end{mathpar}
  \caption{Unary type theory for categories with coproducts}
  \label{fig:catcoprod}
\end{figure}

Most of the term operations are easy to guess.
The two injections $A\to A+B$ and $B\to A+B$ are named $\inl$ and $\inr$ (``in-left'' and ``in-right''), while the unique morphism $\zero \to C$ is called $\abort$.
However, the term notation for $\plusE$ merits some discussion.

Recall from \cref{sec:catprod} that a term notation should indicate derivations of the premises of the rule by including terms for them, and that in general this can be tricky when using abstract variables because a derivation is only determined by a term together with its variable.
The premises of $\plusE$ are, when paired explicitly with their variables, $x.M:(X\types A+B)$ and $u.P:(A\types C)$ and $v.Q:(B\types C)$, so the term notation should operate on all three of these.
But unlike the case of $\timesI$, now all three premises have different antecendents (and to emphasize this, we have used three different variables as well), so we cannot pull the same variable out of all of them.

However, since the antecedent of the conclusion is $X$, which is also the antecedent of the first premise, we can pull that variable (namely $x$) out to be the variable of the conclusion, leaving the other variables $u$ and $v$ paired with their terms.
This leads us to $x.\acase AB(M,u.P,v.Q)$.
Note that the periods in $u.P$ and $v.Q$ bind more tightly than the commas, so this should be parsed as $x.\acase AB(M,(u.P),(v.Q))$.
The annotation by $A+B$ is to make type-checking possible (but often we will simplify it by writing just $\case$).

The idea behind the name ``$\match$'' is that to ``evaluate'' $\case(M,u.P,v.Q)$ the term $M:A+B$ should be compared to the ``patterns'' $\inl(u)$ and $\inr(v)$, and according to which one it ``matches'' (looks like), we branch into either $P$ or $Q$.
The notation $\abort(M)$ is the nullary case of this: the term $M:\zero$ is matched against all possible ways that a term of type $\zero$ could be constructed --- of which there are none, and so there are no cases to consider.

Categorically, $\plusE$ expresses the universal property of the coproduct as follows.
Recall that morphisms from $A$ to $B$ are (being derivations) represented by variable-term pairs.
Thus the morphisms $A\to C$ and $B\to C$ are represented by the pairs $u.P$ and $v.Q$, while their copairing is a morphism $A+B\to C$ that should be represented by a term involving a variable $y:A+B$.
This is exactly the data included in $y:A+B\types \case(y,u.P,v.Q):C$; the general version with $x:X\types M:A+B$ just comes from building in a cut.

% If we use the de Bruijn method for variables, with unique variable $x$, then we would have $x:A\types P:C$ and $x:B\types Q:C$, and so we \emph{could}, without technical ambiguity, just write $[P,Q]$ leaving the variables as-is.
% However, this would be rather confusing and contrary to mathematical intuition.
% For instance, if $P=f(x)$ and $Q=g(k(x))$ while $M=h(x)$, then $\plusE$ would give us $x:X\types [f(x),g(k(x))](h(x)):C$; but the expressions appearing in the copair $[-,-]$ do not actually ``depend on $x$'' in the same sense that the argument $h(x)$ does.
% If we apply $\alpha$-equivalence instead, we could use different variables for these, such as $x:X\types [f(u),g(k(v))](h(x)):C$, but this is still confusing because it appears to depend on the values of $u$ and $v$.
% (Moreover, when we generalize away from unary type theories in \cref{chap:simple} such a notation would become actually ambiguous.)

% \begin{rmk}
%   In unary type theory it is not strictly necessary to add the prefixes ``$u.$'' and ``$v.$'' --- as long as variables are a syntactic group disjoint from all other symbols, the variable $u$ is determined as the only variable occurring in $P$, and similarly for $v$ in $Q$ (and if no variables occur in $P$, it doesn't matter what variable $u$ we mean).
%   However, starting in \cref{chap:simple} it will be important to keep track of which variable is being bound, so we establish good habits now.
% \end{rmk}

Note that the variables $u$ and $v$, though they ``appear in the term'', are not the variable associated to the antecedent;
we say they are \textbf{bound} variables.
By contrast, the antecedent variable $x$ is \textbf{free}.
Textually each bound variable is associated to a subterm called its \textit{scope}, delimiting the places where it can be referred to;
in $\plusE$ the scope of $u$ is $P$ and the scope of $v$ is $Q$.
In terms of rules, this just means that the variable (or more precisely, its type) may appear as the antecedent of some, but not all, of the premises.

Bound variables are familiar in ordinary mathematics as well.
For instance, the integration variable $x$ in a definite integral $\int_0^2 x^2\,\mathrm{d}x$ is bound, because the value of the expression ``doesn't depend on $x$''; its scope is the expression being integrated (but not the bounds of integration).
Bound variables also occur in function definitions: given an expression such as $x^2$ depending on an unspecified variable $x$, we may write something like $(x\mapsto x^2)$ for ``the function that squares its argument'',\footnote{A more common notation for $(x\mapsto x^2)$ in type theory is $\lambda x.x^2$; see \cref{sec:stlc}.} which is a fully defined object containing $x$ as a bound variable.

It is a common convention in type theory that a prefixed variable followed by a period is bound.
Our writing $x:X\types M:B$ as $x.M:(X\types B)$ follows this convention; the variable $x$ is free in $M$, but bound in $x.M$, since when the term $M$ is paired with its variable $x$ it represents a derivation of $X\types B$ that does not depend on $x$.

If we were using de Bruijn variables, then all the variables $x,u,v$ here would actually be the same, giving for instance $\acase AB(M,x.f(x),x.g(x))$.
Although technically fine, this can be confusing since the same unique variable $x$ also occurs in $M$ itself, but unrelatedly (and with a different type) than its occurrences in the case branches.
In general, a bound variable ``shadows'' any free variable (or ``outer'' bound variable) of the same name, so that in $\acase AB(M,x.f(x),x.g(x))$ the $x$ in $f(x)$ refers to the prefix variable (called $x$) in $x.f(x)$, not to the outer free variable occurring in $M$ (also called $x$).
This makes a precise and general definition of $\alpha$-equivalence in the presence of bound variables rather technically involved; one such definition is given in \cref{sec:alpha}.

However, it is arguably bad mathematical style to use the same name for distinct variables, even if there is technically no ambiguity.
For instance, we encourage calculus students to write $F(x) = \int_0^x f(t)\,\mathrm{d}t$ rather than $F(x) = \int_0^x f(x)\,\mathrm{d}x$, even though technically they mean the same thing.
If we adhere to this informal convention, then we rarely need to worry about technical definitions of $\alpha$-equivalence (unless we are trying to implement mathematics in a computer).

With all of this out of the way, we can now consider the appropriate $\beta$- and $\eta$-conversion rules.
However, we pause first to prove admissibility of cut, i.e.\ to construct substitution, as in this case we will need substitution to state the $\beta$ and $\eta$ rules.

\begin{lem}\label{thm:catcoprod-subadm}
  Substitution is admissible in the unary type theory for categories with coproducts under \cG.
  That is, if we have derivations of $x:A\types M:B$ and $y:B\types N:C$, we can construct a derivation of $x:A\types N[M/y]:C$.
\end{lem}
\begin{proof}
  By induction over derivations.
  As usual for natural deduction theories, there is not much happening: for each rule that might appear last in the derivation of $y:B\types N:C$, we apply the inductive hypothesis to its premises and then re-apply the final rule.
  The defining equations of the substitution operation are:
  \begin{align*}
    y[M/y] &= M\\
    f(N)[M/y] &= f(N[M/y])\\
    \abort(N)[M/y] &= \abort(N[M/y])\\
    \inl(N)[M/y] &= \inl(N[M/y])\\
    \inr(N)[M/y] &= \inr(N[M/y])\\
    \case(N,u.P,v.Q)[M/y] &= \case(N[M/y],u.P,v.Q)\qedhere
  \end{align*}
\end{proof}

Now we proceed to $\beta$- and $\eta$-conversion.
Dualizing the $\beta$-conversion rules for products, the $\beta$-conversion rules for coproducts should say that the map $A+B\to C$ induced by $f:A\to C$ and $g:B\to C$ yields $f$ and $g$ when composed with the coproduct injections.
Recalling that composition is given by substitution, this leads us to write down
\begin{align*}
  \case(\inl(y),u.P,v.Q) &\equiv P[y/u]\\
  \case(\inr(z),u.P,v.Q) &\equiv Q[z/v]
\end{align*}
%The substitutions on the right are just $\alpha$-equivalences, replacing the variables $u,v$ by $y,z$; in a de Bruijn style nothing at all would be happening.
Similarly, the $\eta$-conversion rule\footnote{In fact, for types such as coproducts with a left universal property, there is no consensus on exactly what equality ``$\eta$'' refers to.\label{fn:weak-eta}
  From a categorical point of view this equality is the most natural, since like the $\eta$-conversion rule for products it expresses the uniqueness aspect of the universal property.
  But sometimes $\eta$ is used to refer only to the special case $\case(y,u.\inl(u),v.\inr(v))\equiv y$, which is also analogous to the $\eta$ rule for products in that it says that elements \emph{of the type in question} have a canonical form (a pair in a product or a case-split in a coproduct).
  The stronger property is equivalent to the weaker property combined with ``commuting conversions'' such as $\inl(\case(M,u.P,v.Q)) \equiv \case(M,u.\inl(P),v.\inl(Q))$.%
  % The relationship between these two, which involves ``commuting conversions'', is discussed further in \cref{cha:canonical}.
} should say that morphisms out of a coproduct are determined uniquely by their composites with the projections:
\begin{equation}
  \case(y,u.P[\inl(u)/y],v.P[\inr(v)/y]) \equiv P\label{eq:catcoprod-eta}
\end{equation}
and similarly that morphisms out of the initial object are unique:
\[ \abort(y) \equiv P. \]
In fact, to ensure that $\equiv$ is a congruence, we should ``build in a cut'' to all of these rules, so that the antecedent of the conclusion is an arbitrary type.
Thus the actual generating $\equiv$ rules are those shown in \cref{fig:catcoprod-equiv}.
\begin{figure}
  \centering
  \begin{mathpar}
    \inferrule{u:A\types P:C \\ v:B\types Q:C \\ x:X\types M:A}{x:X \types \case(\inl(M),u.P,v.Q) \equiv P[M/u] : C}
    \and
    \inferrule{u:A\types P:C \\ v:B\types Q:C \\ x:X\types N:B}{x:X \types \case(\inr(N),u.P,v.Q) \equiv Q[N/v] : C}
    \and
    \inferrule{x:X \types M:A+B \\ y:A+B \types P:C}{x:X \types \case(M,u.P[\inl(u)/y],v.P[\inr(v)/y]) \equiv P[M/y] : C}
    \and
    \inferrule{x:X\types M:\zero \\ y:\zero \types P:C}{x:X \types \abort(M) \equiv P[M/y] :C}
  \end{mathpar}
  \caption{$\beta$- and $\eta$-conversions for categories with coproducts}
  \label{fig:catcoprod-equiv}
\end{figure}
As usual, we also require $\equiv$ to be an equivalence relation and a congruence for substitution.
This completes the definition of our \textbf{unary type theory for categories with coproducts under \cG}.


As in the dual case, by a \textbf{category with coproducts} we mean a category with specified binary coproducts and a specified initial object, and in the category of such the functors preserve the specified structure strictly.

\begin{thm}\label{thm:catcoprod-initial}
  For any directed graph \cG, the free category with coproducts generated by \cG can be described by the unary type theory for categories with coproducts under \cG: its objects are the derivations of $\types A\type$, and its morphisms $A\to B$ are the derivations of $A\types B$ (or equivalently the derivable term judgments $x:A \types M:B$ modulo $\alpha$-equivalence), modulo the equivalence relation $\equiv$.
\end{thm}
\begin{proof}
  This is basically just like the proof of \cref{thm:catprod-initial}, with one slight twist.
  \cref{thm:catcoprod-subadm} defines substitution, and the same sort of induction proves it associative and unital; thus we have a category $\F\bCoprCat\cG$.
  The rules are defined just so as to give this category the structure of coproducts.
  Then, given any category with coproducts \cM and map of graphs $\pfree:\cG\to\cM$, we extend $\pfree$ to a unique coproduct-preserving functor $\free:\F\bCoprCat\cG\to\cM$ by successive inductions over all the derivations of the type theory.

  The twist is that we have to be careful about the order in which we do this.
  Because the rules for $\equiv$ involve the substitution \emph{operation} on terms, to interpret these rules using the universal properties in \cM we need to know already that substitution maps to composition in \cM.
  Thus we need to (1) extend $\pfree$ to types, (2) extend $\pfree$ to terms, (3) \emph{then} prove that this extension $\free$ takes substitution to composition, (4) \emph{then} inductively show that derivations of $x:A\types M\equiv N:B$ map to equal morphisms, and (5) then complete the proof that we have a functor preserving coproducts.
\end{proof}

% At this point the reader is free (after doing some exercises) to move on directly to \cref{chap:simple}.
% However, if you want to understand $\equiv$ a little more deeply, you may also continue with the somewhat digressive \cref{cha:canonical}.


\subsection*{Exercises}

\begin{ex}\label{ex:catcoprod-functor-uniq}
  This is the dual of \cref{ex:catprod-functor-uniq}, though of course its proof is not dual.
  Suppose we have
  \begin{mathpar}
    f\in\cG(A,C)\and
    g\in\cG(B,D)\and
    h\in\cG(C,E)\and
    k\in\cG(D,E)
  \end{mathpar}
  Here is one (cut-free) derivation of $A+B\types E$.
  \begin{mathpar}
    \inferrule*[Right=$\plusE$]{
      \inferrule*{ }{A+B\types A+B}\\
      \inferrule*[Right=$h$]{\inferrule*[Right=$f$]{\inferrule*{ }{A\types A}}{A\types C}}{A\types E}\\
      \inferrule*[Right=$k$]{\inferrule*[Right=$g$]{\inferrule*{ }{B\types B}}{B\types D}}{B\types E}
    }{A+B\types E}\and
  \end{mathpar}
  Write down another derivation of $A+B\types E$ that ends with the following cut:
  \begin{mathpar}
    \inferrule*[Right=cut]{
      \inferrule*{\vdots}{A+B\types C+D}\\
      \inferrule*{\vdots}{C+D\types E}
    }{A+B\types E}\and
  \end{mathpar}
  Then write down the terms corresponding to the two derivations and show directly that they are related by $\equiv$.
\end{ex}

\begin{ex}\label{ex:cat-opfib}
  A functor $P:\sA\to\sB$ is called an \textbf{opfibration} if $P\op:\sA\op\to\sM\op$ is a fibration (as in \cref{ex:catfib}).
  The dual of $f^*(b)$ is written $f_!(a)$.
  \begin{enumerate}
  \item Write down a type theory for opfibrations and prove the initiality theorem.
    (Remember that we always use natural deduction style when dealing with categories rather than posets, so you can't just dualize \cref{ex:catfib} or categorify \cref{ex:poset-bifib}.
    You will probably want a term syntax such as ``$\match_!$''.)
  \item Use this type theory to prove that $f_!$ is always a functor.
  \end{enumerate}
\end{ex}

% Is this worth it?  I think there's not much to say beyond \label{ex:catprod-2free}.
% \begin{ex}\label{ex:2free-catcoprod}
%   As in \cref{ex:cat-2free,ex:catprod-2free} show that $\F\bCoprCat\cG$ is a representing object for a suitable 2-functor $\bGr(\cG,-):\bCoprCat\to\bCat$.
% \end{ex}

\section{Universal properties and modularity}
\label{sec:modularity}

This seems an appropriate place to introduce some more type-theoretic lingo.
Roughly speaking, types corresponding to objects with ``mapping out'' universal properties, such as $A+B$, are called \textbf{positive}, while types corresponding to objects with ``mapping in'' universal properties, such as $A\times B$, are called \textbf{negative}.
The precise meanings of these terms relate to ``focusing'' and are more directly applicable to sequent calculus than to natural deduction, but they are often used informally in this broad sense, and we will sometimes do the same.
% We have already seen some of the different type-theoretic behavior exhibited by both ``polarities'':
% \begin{itemize}
% \item Negative types have more direct introduction and elimination rules, while positive types are eliminated by a $\match$ that binds variables (possibly zero of them).
% \item In sequent calculus, positive types have invertible left rules.
% \end{itemize}

Another important thing to note is a similarity in structure between the product types of \cref{sec:catprod} and the coproduct types of this section.
Both of them augment the basic cut-free type theory for categories from \cref{sec:category-cutadm} with four kinds of rules:
\begin{enumerate}
\item \textbf{Formation} rules for the judgment $\types A\type$, which tell us how to ``form'' the new types described by this operation (product types or coproduct types):
  \begin{mathpar}
    \inferrule{\types A\type \\ \types B\type}{\types A\times B\type}\and
    \inferrule{\types A\type \\ \types B\type}{\types A+ B\type}\and
  \end{mathpar}
\item \textbf{Introduction} rules for the term judgment, which tell us how to ``introduce'' terms belonging to such a new type:
  \begin{mathpar}
    \inferrule{x:X\types M:B \\ x:X\types N:C}{x:X\types \pair MN:B\times C}\;\timesI\\
    \inferrule{x:X\types M:A}{x:X\types \inl(M):A + B}\;\plusI1\and
    \inferrule{x:X\types N:B}{X\types \inr(N):A + B}\;\plusI2\and
  \end{mathpar}
\item \textbf{Elimination} rules for the term judgment, which tell us how to ``eliminate'' a term belonging to a new type to obtain terms belonging to other types:
  \begin{mathpar}
    \inferrule{x:X\types M:B\times C}{x:X \types \pr1BC(M):B}\;\timesE1\and
    \inferrule{x:X\types M:B\times C}{x:X \types \pr2BC(M):C}\;\timesE2\and
    \inferrule{x:X\types M:A + B\\u:A\types P:C \\ v:B\types Q:C}{x:X\types \acase AB(M,u.P,v.Q):C}\;\plusE
  \end{mathpar}
\item \textbf{Conversion} (or \emph{computation}) rules for the equality judgment, which tell us how to ``convert'' or ``reduce'' terms built by combining the introduction and elimination rules:
  \begin{mathpar}
    \pr1AB(\pair M N) \equiv M\and
    \pr2AB(\pair M N) \equiv N\and
    \pair{\pr1AB(M)}{\pr2AB(M)} \equiv M\and
    \case(\inl(y),u.P,v.Q) \equiv P[y/u]\and
    \case(\inr(z),u.P,v.Q) \equiv Q[z/v]\and
    \case(y,u.P[\inl(u)/y],v.P[\inr(v)/y]) \equiv P
  \end{mathpar}
\end{enumerate}

This division into four groups of rules for each type operation corresponds to the following four aspects of a category-theoretic universal property:
\begin{enumerate}
\item The existence of one or more objects given certain input data (formation).
\item Some data relating those objects to others that we already had, like the projections of a product or the injections of a coproduct (elimination for negative types, introduction for positive ones).
\item A ``factorization'' operation: given data of some sort, there exist specified morphism(s) into or out of the new objects (introduction for negative types, elimination for positive ones).
\item These specified morphism(s) satisfy some compatibility conditions with the given data that uniquely determine them (conversion).
\end{enumerate}
This provides a template for representing other category-theoretic universal properties with type theory.
(What if we want to represent something that doesn't have a universal property, like a monoidal structure?
As we will see in \cref{chap:simple}, the answer is to move into a different world where it \emph{does} have a universal property.)

Finally, I want to emphasize the \emph{modularity} of these rules (in the computer scientist's sense).
When we added the rules for products to the cut-free type theory for categories, the proofs of the basic theorems like cut admissibility, associativity of substitution, and the initiality theorem did not change in their overall structure.
They were all still proved by induction on derivations; we simply had to add new clauses to each induction step corresponding to the new rules.

The advantage of this sort of modularity means that we are free to ``mix and match'' rules of our type theory, corresponding to the categorical universal properties we want to have.
There will be more scope for variety with this later, but at the moment there is one new combination: we can obtain a \textbf{type theory for categories with products and coproducts} by simply combining the rules from \cref{sec:catprod,sec:catcoprod}.
We can then prove all the same theorems, including the fact that this type theory presents the free category with products and coproducts on a directed graph, by simply combining the relevant clauses from all the proofs in \cref{sec:catprod,sec:catcoprod}.


\subsection*{Exercises}

\begin{ex}\label{ex:cat-prod-coprod-uniq}
  Suppose we have
  \begin{mathpar}
    f\in\cG(A,C)\and
    g\in\cG(A,D)\and
    h\in\cG(B,C)\and
    k\in\cG(B,D)
  \end{mathpar}
  Write down two different derivations of $A+B\types C\times D$ in the type theory for categories with products and coproducts under \cG, one that ends with $\timesI$ and one that ends with $\plusE$.
  Then write down the corresponding terms and show directly that they are identified by $\equiv$.
\end{ex}

\begin{ex}\label{ex:cat-bifib}
  A functor $P:\sA\to\sB$ is called a \textbf{bifibration} if it is both a fibration and an opfibration.
  \begin{enumerate}
  \item Combine the theories of \cref{ex:catfib,ex:cat-opfib} to obtain a type theory for bifibrations.
  \item If you aren't tired of proving initiality theorems yet, do it for this type theory.
  \item Use this type theory to prove that in any bifibration, $f_!$ is left adjoint to $f^*$.
  \end{enumerate}
\end{ex}


\section{Presentations and theories}
\label{sec:unary-theories}

Now that we have a type theory for categories with products, we might hope that we could express in it the proof from \cref{sec:intro} of uniqueness of inverses for monoid objects.
However, the theory as presented in \cref{sec:catprod} is inadequate even to \emph{talk} about monoid objects!

Let us recall briefly how we use initiality theorems to deduce categorical facts.
Suppose we want to prove a theorem of the form ``for any objects $A,B,C,\dots$ and morphisms $f,g,h,\dots$ in a category with products, we have \dots''.
Then we let \cG be a directed graph with one vertex for each object appearing in the theorem and one edge for each morphism, with appropriate source and target, and we reason in the type theory for categories with products under \cG.
Then whenever we have objects and morphisms of the sort described in the theorem in any other category \cM with products, we get a map $\cG\to\cM$, which therefore extends to the free category $\F\bPrCat\cG$ presented by the type theory, and this extension carries our type-theoretic constructions and proofs to categorical ones.

However, although theorems about monoid objects are intuitively of the form ``for any object $A$ and morphisms $m,e$, \dots'', they do not fit into this picture, because the data cannot be described as a directed graph.
The problem is that the source and target of $m$ and $e$ are not \emph{single} objects mentioned in the theorem, but products of them (specifically, $m:A\times A\to A$ and $e:\unit\to A$).
Furthermore, a monoid object contains not just objects and morphisms, but \emph{axioms} about composites of those morphisms, which the type theory of \cref{sec:catprod} is also unable to deal with.

We now present an extension of this theory which does suffice to discuss monoid objects.
It is not the final word on the matter --- as we will see, the type theories of \cref{chap:simple} can do somewhat better --- but it is a first step.

The two problems mentioned above are in fact quite similar.
To talk about a morphism $m:A\times A\to A$, say, using type theory, we need to replace directed graphs with some more general structure including ``arrows'' whose source and target can be products of ``generating objects'' rather than just single ones.
Similarly, to talk about an axiom such as $m(x,m(y,z)) = m(m(x,y),z)$, we need to also include ``generating equalities'' that relate ``pairs of parallel morphisms'', and these morphisms could also be products and composites of generating ones.
(This is an instance of the higher-categorical philosophy that equalities are just a kind of higher morphism.)

This sort of refinement can be performed for essentially any type theory; it is a sort of syntactic version of the categorical notion of ``computads'' and related structures~\cite{batanin:cptd-fin,garner:hom-hcat,nlab:computad}.
The idea as just sketched is fairly straightforward, but the execution is a bit complicated, because we have to define the sort of ``generating structure'' (a generalization of a directed graph) step-by-step in parallel with the type theory that it generates.
Thus, we begin with a couple of simpler cases.


\subsection{Group presentations}
\label{sec:group-presentations}

Categorically, what we are about to do is talking about \emph{presented} categories rather than \emph{free} ones.
However, the way we're doing it is perhaps not the most common way to talk about ``presentations'' in category theory.
Consider for instance the most well-known sort of presentation, namely a presentation of a group; this consists of a set $X$ of generating elements together with a set $R$ of ``relations'', each of which is a pair\footnote{In the case of groups, an equality $g = h$ is equivalent to an equality $g h^{-1} = e$, so it is common to take the relations to be single elements of the free group on $X$ rather than pairs thereof.
  However, when we generalize to other algebraic structures this is no longer true, so we stick with the more general version even for groups.} of elements of the free group on $X$, i.e.\ $R\subseteq \F{}X\times \F{}X$.
For instance, the dihedral group $D_n$ is presented by $X = \setof{x,y}$ and $R = \setof{(x^n,e), (y^2,e), (xyxy,e)}$.

The question is, exactly what group does a group presentation present, and what is its universal property?
The more common way to answer this question category-theoretically is to consider the two projections $R \toto \F{}X$, note that by the universal property of free groups they yield two group homomorphisms $\F{}R \toto \F{}X$, and define the presented group $\langle X\mid R\rangle$ to be the coequalizer of these two morphisms in the category of groups.
While this works, it requires a bit of unraveling to extract the universal property of $\langle X\mid R\rangle$.

When we come to dependent type theories, we will be forced to use a procedure like this, but until then we can package the universal property of a presented object more explicitly using an adjunction.
In the case of groups, what we would do is define the \emph{category of group presentations}, whose objects are presentations $(X,R)$ as above, and whose morphisms $(X,R) \to (Y,S)$ are functions $g:X\to Y$ such that $R \subseteq (\F{}g \times \F{}g)^{-1}(S)$, i.e.\ each relation in $R$ is mapped by $g$ to a relation in $S$.
Then we define, for each group $G$, its \emph{underlying} or \emph{canonical} presentation $(G,K_G)$, in which the set of generators is the underlying set of $G$ itself, and $K_G$ is the ``kernel pair'' of the counit $\F{}G\to G$ of the free-group adjunction (i.e.\ its pullback against itself).
More explicitly, $K_G$ is the set of \emph{all} pairs $(w_1,w_2)$ of elements of $\F{}G$ that when ``multiplied out'' become equal in $G$.

Now the group presented by a group presentation is simply its image under the left adjoint to this forgetful functor from groups to presentations.
This gives an explicit description of its universal property: a map of presentations $(X,R) \to (G,K_G)$ is, by definition, a function $X\to G$ such that the image of each relation in $R$ is ``true'' in $G$, i.e.\ the composites $R \toto \F{}X \to \F{}G \to G$ are equal.

Before we move on to categories, we note a few things about this construction.
Firstly, this adjunction subsumes the adjunction between groups and sets, because any set $X$ gives rise to a group presentation $(X,\emptyset)$ with no relations, and the corresponding presented group is just the free group on $X$.
However, we needed to \emph{already have} the simpler adjunction between groups and sets in order to construct this more refined adjunction between groups and presentations; we used free groups in the definition of a group presentation, and we used the universal property of free groups in defining the underlying presentation of a group.

Secondly, the underlying presentation of a group actually \emph{presents that group}, i.e.\ the counit of the adjunction between presentations and groups is an isomorphism.
Equivalently, by general categorical facts, the ``underlying presentation'' functor from groups to presentations is fully faithful.
Thus, if we wanted to, we could regard groups as ``being'' certain group presentations whose relations are ``saturated'' in a suitable sense, with the adjunction exhibiting such ``groups'' as a reflective subcategory of group presentations.

On the other hand, this also means that the left adjoint from presentations to groups is essentially surjective.
Therefore, if we define a new category whose objects are group presentations and whose morphisms are the group homomorphisms between the groups they present, we obtain a category equivalent to the category of groups.
A slightly less tautological-sounding definition of the morphisms in this category uses the adjunction: a group homomorphism from $\langle X\mid R\rangle$ to $\langle Y\mid S\rangle$ is the same as a morphism of presentations from $(X,R)$ to the underlying presentation of $\langle Y\mid S\rangle$.
Such a thing can be described a bit more explicitly: it is a function $X\to \F{}Y$ which maps each relation in $R$ to a relation that holds in $\langle Y\mid S\rangle$.

More categorically, the observation is that any adjunction $F:\cC \toot \cD : G$ whose counit is an isomorphism is ``opmonadic'', i.e.\ exhibits \cD as equivalent to the Kleisli category of the induced monad $G F$ on $\cC$.
(The fact that \cD is equivalent to a reflective subcategory of \cC implies that the adjunction is also monadic, with the monad $G F$ being idempotent.
Moreover, every algebra for an idempotent monad is free, so the Kleisli and Eilenberg--Moore categories are equivalent.)

Most of these observations have analogues in the type-theoretic situation, to which we now turn.


\subsection{Category presentations}
\label{sec:category-presentations}

We begin with presentations of categories, which are formally quite similar to presentations of groups, except for the presence of ``many objects'' and the absence of inverses.
However, rather than taking the ``free category on a directed graph'' as a black box, the way we did for the free group on a set in \cref{sec:group-presentations}, we will use its explicit presentation using the cut-free type theory of \cref{sec:category-cutadm}.
This enables an analogous type-theoretic presentation of the left adjoint constructing a presented category from a presentation.

Recall from \cref{thm:category-initial-2} that the free category $\F\bCat\cG$ generated by a directed graph \cG has the same objects as \cG and its morphisms are the derivable term judgments $x:A\types M:B$ (i.e.\ derivations of $A\types B$) in the cut-free type theory for categories under \cG.

\begin{defn}
  A \textbf{category presentation} \cP consists of a directed graph $\cP_1$ together with a set $\cP_2$ of \emph{generating equalities}, each of which is a pair of two derivations of $A\types B$ for some $A,B\in\cG$.
\end{defn}

Equivalently, each generating equality is two terms $M,N$ such that $x:A\types M:B$ and $x:A\types N:B$ are both derivable, or (more category-theoretically) two parallel morphisms in the free category on \cG.

\begin{defn}
  For a category \cC, its \textbf{underlying presentation} is the underlying directed graph of \cC with, as generating equalities, the set of \emph{all} parallel pairs of morphisms in $\F\bCat\cC$ that become equal after applying the counit functor $\F\bCat\cC \to \cC$.
\end{defn}

For a category presentation \cP, we write $\cP_2(x:A,B)(M,N)$ to mean that $(M,N)\in\cP_2$ where both $x:A\types M:B$ and $x:A\types N:B$.
Now we define the \textbf{type theory for categories under \cP} to consist of the cut-free type theory for categories under the directed graph $\cP_1$ together with the rules for an equality judgment shown in \cref{fig:catpres-equality}.
Recall that the cut-free type theory for categories didn't need an equality judgment on its own, so the only rules other than the ``generator'' one (the last one) are reflexivity, symmetry, transitivity, and congruence.
(As mentioned in \cref{sec:catprod}, we often omit these rules, but we include them here because we are doing something a bit new with an equality judgment.)

The generator rule basically says that each generating equality yields an actual equality.
Note that we have built in a substitution on the left.
This ensures that $\equiv$ remains a congruence for substitution on both sides as in \cref{thm:catprod-subcong} (the other side is ensured by the primitive congruence rules for $\equiv$).


\begin{figure}
  \centering
  \begin{mathpar}
    \inferrule{x:X\types M:A}{x:X \types M\equiv M:A}\and
    \inferrule{x:X\types M\equiv N:A}{x:X\types N\equiv M:A}\and
    \inferrule{x:X\types M\equiv N:A \\ x:X\types N\equiv P:A}{x:X\types M\equiv P:A}\and
    \inferrule{f\in \cP_1(A,B) \\ x:X\types M\equiv N:A}{x:X\types f(M)\equiv f(N):B}\and
    \inferrule{x:X \types M:A \\ \cP_2(y:A,B)(N,P)}{x:X \types N[M/y] \equiv P[M/y] : A}
  \end{mathpar}
  \caption{Equality rules for category presentations}
  \label{fig:catpres-equality}
\end{figure}

\begin{thm}\label{thm:catpres-initial}
  For any category presentation \cP, the free category generated by \cP (i.e.\ the image of \cP under the left adjoint to the underlying-presentation functor) is described by the type theory for categories under \cP: its objects are those of \cP, and its morphisms are the derivations of $A\types B$ modulo the equivalence relation $\equiv$.
\end{thm}
\begin{proof}
  As usual, the rules for equality ensure that it is an equivalence relation and a congruence, so that the quotient is a category.
  Now suppose given a category \cC and a morphism from \cP to the underlying presentation of \cC.
  Then \cref{thm:category-initial-2} gives a unique functor from the free category on $\cP_1$ to \cC, so it suffices to check that this functor descends to the quotient by $\equiv$.
  This is a straightforward induction over derivations of $\equiv$.
  All the basic rules of equality are always true in \cC, while the generator one is true since $\cP\to\cC$ is a morphism of presentations.
\end{proof}

As in \cref{sec:group-presentations}, the counit of this adjunction is easily shown to be an isomorphism.
Thus we can regard categories as a reflective subcategory of category presentations, or we can show that the category of categories is equivalent to the Kleisli category of the induced monad on category presentations.


\subsection{$\times$-presentations}
\label{sec:catprod-presentations}

Now we move on to the case of categories with products.
Here there are three stages rather than two, since we have to add the operations on objects as well.
Thus, instead of two adjunctions (with base categories of directed graphs and category presentations, respectively) we will have three.
For the moment, we will call the objects of the base categories \textbf{$k$-skeletal (unary) $\times$-presentations} for $k=0,1,2$; in \cref{sec:theories} we will introduce some more standard terminology.

\begin{defn}
  A \textbf{0-skeletal $\times$-presentation} is a set, $\cP_0$.
\end{defn}

The type theory of a 0-skeletal $\times$-presentation consists of the rules for type judgments from \cref{sec:catprod}:
\begin{mathpar}
  \inferrule{A\in\cP_0}{\types A\type}\and
  \inferrule{ }{\types \unit\type} \and
  \inferrule{\types A\type \\ \types B\type}{\types A\times B\type}
\end{mathpar}
We write $\F{0}\cP_0$ for the set of derivable types in this theory (later on we will see that it is indeed some kind of free thing generated by $\cP_0$).

\begin{defn}
  A \textbf{1-skeletal $\times$-presentation} consists of a 0-skeletal $\times$-presentation $\cP_0$ (its \emph{0-skeleton}) together with a set of \emph{arrows} $\cP_1$, each of which is assigned a source and a target that are types in the type theory of $\cP_0$.
\end{defn}

Thus, for instance, a 1-skeletal $\times$-presentation could contain objects $A,B$ and arrows $f:A\times B\to B\times B$ and $g:\unit \to A$.

The type theory of a 1-skeletal $\times$-presentation consists of the rules for term judgments $x:A\types M:B$ from \cref{sec:catprod}.
The generator rule 
\[ \inferrule{x:X\types M:A \\ f\in\cP_1(A,B)}{x:X\types f(M):B} \,f \]
looks the same as before, but now it no longer implies as a side condition that $A$ and $B$ must be base types.
We prove exactly as before that terms have unique derivations and that substitution is admissible and associative.
We write $\F{1}\cP_1(A,B)$ for the set of derivations/terms $x:A\types M:B$ in this theory.

\begin{defn}
  A \textbf{2-skeletal $\times$-presentation} consists of a 1-skeletal $\times$-presentation $\cP_{\le 1}$ (its \emph{1-skeleton}) together with a set of \emph{generating equalities} $\cP_2$, each of which is a pair of derivations of the same judgment $A\types B$ in the type theory of $\cP_{\le 1}$.
\end{defn}

Since this is the last step, we sometimes omit the adjective ``2-skeletal'' and just speak of \textbf{(unary) $\times$-presentations}.
We write $\cP_2(x:A,B)(M,N)$ to mean that $(M,N)$ is a generating equality, where $x:A\types M:B$ and $x:A\types N:B$.

The type theory of a $\times$-presentation consists of the rules for the equality judgment $\equiv$ from \cref{sec:catprod}, together with a generator rule for equalities:
\begin{equation}
  \inferrule{x:X\types M:A \\ \cP_2(y:A,B)(P,Q)}{x:X \types P[M/y] \equiv Q[M/y] : B}\label{eq:equality-intro}
\end{equation}
As before, we have built in a substitution on the left.

We once again prove all the expected properties, like type-checking and cut admissibility, just as we did for the ordinary type theory for categories with products in \cref{sec:catprod}.
%(There is something to be aware of here that we will mention at the end of this section, but it does work fine.)
However, stating and proving the initiality theorem is somewhat more complicated.

Of course, generalizing from the situation of \cref{sec:category-presentations} we expect to have \emph{three} adjunctions, whose base categories consist of $k$-skeletal $\times$-presentations for $k=0,1,2$.
However, unlike in \cref{sec:category-presentations} the other category cannot be \bPrCat for all three adjunctions, since the type theories of 0-skeletal and 1-skeletal $\times$-presentations are not yet categories with products.
A 0-skeletal $\times$-presentation does not even give rise directly to a category; we could make it one by adding identity morphisms, but it still wouldn't have products.
And a 1-skeletal $\times$-presentation does give a category, but without the $\beta$- and $\eta$-conversion laws (which only appear at the final stage with the equality judgment) the structure it has is weaker than having products.

To start with, in order to have adjunctions, we need categories of $\times$-presentations.

\begin{defn}\ 
  \begin{enumerate}
  \item A morphism of 0-skeletal $\times$-presentations is just a function $K_0:\cP_0 \to \cQ_0$.
    This induces a function $\Kbar_0:\F{0}\cP_0 \to \F{0}\cQ_0$ by a simple induction.
  \item A morphism of 1-skeletal $\times$-presentations is a morphism $K_0:\cP_0 \to \cQ_0$ on 0-skeleta, together with a function $K_1:\cP_1 \to \cQ_1$ respecting sources and targets (relative to $\Kbar_0$).
    This induces maps $\Kbar_1 : \F{1}\cP_1(A,B) \to \F{1}\cQ_1(\Kbar_0(A),\Kbar_0(B))$ by another simple induction.
  \item Finally, a morphism of 2-skeletal $\times$-presentations is a morphism $K_{\le 1}:\cP_{\le 1} \to \cQ_{\le 1}$ on 1-skeleta, together with a function $K_2:\cP_2 \to \cQ_2$ respecting sources and targets (relative to $\Kbar_1$).
  \end{enumerate}
  We write $\timespres k$ for the category of $k$-skeletal $\times$-presentations.
\end{defn}

Now, since $\times$-presentations are defined inductively on their skeleta, it is most natural to prove the initiality theorem in stages as well.
For this purpose we need skeletal versions of ``categories with products'' as well.
The idea is that ``1-skeletal categories with products'', for instance, should have exactly the categorical structure induced by the type and term judgments \emph{without} any equality judgments.
In particular, they have pairings and projections, but these don't satisfy the $\beta$- and $\eta$-conversion laws.
The pairings do, however, have to be \emph{natural} in the domain, since that comes from substitution which is defined and associative already on terms before we have any $\equiv$.

\begin{defn}\ 
  \begin{enumerate}
  \item A \textbf{0-skeleton for a category with products} is a set $\cM_0$ with a specified element $\unit\in\cM_0$ and a binary operation $(-\times-) :\cM_0\times \cM_0 \to \cM_0$.
  \item A \textbf{1-skeleton for a category with products} is
    \begin{enumerate}
    \item A 0-skeleton for a category with products, $\cM_0$;
    \item A category $\cM$ with $\cM_0$ as its set of objects;
    \item For every $A,B\in\cM_0$, morphisms $A\times B \to A$ and $A\times B\to B$;
    \item For every $A,B\in\cM_0$, a natural transformation $\cM(-,A)\times \cM(-,B) \to \cM(-,A\times B)$; and
    \item A natural transformation $1\to \cM(-,\unit)$, where $1$ denotes the terminal presheaf.
    \end{enumerate}
  \item A \textbf{2-skeleton for a category with products} is just a category with products.
  \end{enumerate}
  We write $\prcat k$ for the category of $k$-skeleta for a category with products (in which we trust the reader to define the morphisms).
\end{defn}

\begin{rmk}\label{rmk:1skeleton-yoneda}
  The morphisms $A\times B \to A$ and $A\times B\to B$ in a 1-skeleton for a category with products could also be expressed, by the Yoneda embedding, as natural transformations $\cM(-,A\times B) \to \cM(-,A)$ and $\cM(-,A\times B) \to \cM(-,B)$.
  This would make them look more like the other two operations in such a 1-skeleton, and moreover would correspond more directly to the type-theoretic rules for projections:
  \begin{mathpar}
    \inferrule{x:X\types M:A\times B}{x:X \types \pi_1(M):A}\and
    \inferrule{x:X\types M:A\times B}{x:X \types \pi_2(M):B}\and
  \end{mathpar}
\end{rmk}

Now, we will construct the following ladder of adjunctions, in which the both the left and right adjoints commute with the obvious downward-pointing forgetful functors:
\[ \xymatrix{
  \timespres 2 \ar@<2mm>[r]^{\F{2}} \ar@{}[r]|-{\bot} \ar[d] & \prcat 2 \ar@<2mm>[l]^{\fU_2} \ar[d] \\
  \timespres 1 \ar@<2mm>[r]^{\F{1}} \ar@{}[r]|-{\bot} \ar[d] & \prcat 1 \ar@<2mm>[l]^{\fU_1} \ar[d] \\
  \timespres 0 \ar@<2mm>[r]^{\F{0}} \ar@{}[r]|-{\bot}        & \prcat 0 \ar@<2mm>[l]^{\fU_0}
}\]
In fact, we will define all the horizontal functors one by one from bottom to top: first $\fU_0$, then $\F1$, then $\fU_1$, etc, and show one step at a time that we get adjunctions.

\begin{defn}\label{thm:catprod-thy-underlying-0}
  The forgetful functor $\fU_0 : \prcat0 \to \timespres0$ takes a 0-skeleton for a category with products to its underlying set.
\end{defn}

\begin{thm}\label{thm:catprod-thy-initial-0}
  The left adjoint $\F0$ of $\fU_0$ takes a 0-skeletal $\times$-presentation $\cP_0$ to the set of types $\types A\type$ derivable in its type theory.\qed
\end{thm}

\begin{defn}\label{thm:catprod-thy-underlying-1}
  The forgetful functor $\fU_1 : \prcat1 \to \timespres1$ acts as $\fU_0$ on underlying 0-skeleta, and for types $A,B$ in the type theory of $\fU_0\cM$ we define $\fU_1(A,B)$ to be the set of morphisms in \cM between the images of $A$ and $B$ under the counit map $\F0 \fU_0 \cM_0 \to \cM_0$.
\end{defn}

\begin{thm}\label{thm:catprod-thy-initial-1}
  The left adjoint $\F1$ of $\fU_1$ acts as $\F0$ on 0-skeleta, and the morphisms in $\F1 \cP$ are the derivations $x:A\types M:B$ in the type theory of \cP.
\end{thm}
\begin{proof}
  We sketch the proof, with the goal of explaining the naturality requirement in the definition of 1-skeleta for categories with products.
  First we have to show that $\F1\cP$ is indeed such a 1-skeleton.
  Note that the proofs in \cref{sec:catprod} that substitution/cut is associative and unital as an operation on terms/derivations \emph{before} we impose any equality judgment; thus the present $\F1\cP$ is a category.
  Moreover, the term operations $\pi_1$, $\pi_2$, $\pair{-}{-}$, and $\ttt$ give $\F1\cP$ the structure of a 1-skeleton for a category with products.
  The \emph{naturality} of these operations is simply the fact that they commute with substitution, e.g.\ $\pair{P}{Q}[M/x] = \pair{P[M/x]}{Q[M/x]}$.
  Since this is true essentially \emph{by definition of substitution}, it is a literal equality of terms/derivations.

  Now let \cC be any 1-skeleton for a category with products, and let $\pfree:\cP\to\fU_1\cC$ be a morphism of 1-skeletal $\times$-presentations.
  We extend it to a map $\free_0:\F0\cP_0 \to \cC_0$ of 0-skeleta for categories with products using \cref{thm:catprod-thy-initial-0}.
  Then we extend this to $\free_1 : \F1\cP \to \cC$ by induction on the term-forming operations, using the assumed structure of \cC.
  For instance, to define $\free_1$ on a term $x:X \types \pair M N : A\times B$, we inductively define it on the terms $x:X\types M:A$ and $x:X \types N:B$, then apply the pairing operation
  \[ \cC(X,A)\times \cC(X,B) \to \cC(X,A\times B) \]
  of \cC.
  The other cases are similar.

  We use the naturality of these operations in \cC when proving prove that this operation is a functor.
  For instance, the composite of $y:Y\types Q:X$ and $x:X \types \pair M N : A\times B$ is $\pair M N [Q/y]$, which is by definition $\pair{M[Q/y]}{N[Q/y]}$
  By induction, we may assume that $\free(M[Q/y])$ and $\free(N[Q/y])$ are the composites $\free(Y)\xto{\free(Q)} \free(X) \xto{\free(M)} \free(A)$ and $\free(Y)\xto{\free(Q)} \free(X) \xto{\free(N)} \free(B)$.
  And by naturality, the pairing of these two composite morphisms in \cC is equal to the result of first pairing $\free(M)$ and $\free(N)$ and then composing with $\free{Q}$, which is to say
  \[ \free(\pair{M[Q/y]}{N[Q/y]}) = \free(\pair M N) \circ \free(Q). \]
  The other cases of functoriality are analogous.
  Finally, $\free$ is obviously a morphism of 1-skeleta for categories with products, and it is unique for the usual reasons.
\end{proof}

\begin{defn}\label{thm:catprod-thy-underlying-2}
  The forgetful functor $\fU_2 : \prcat2 \to \timespres2$ acts as $\fU_1$ on underlying 1-skeleta, and two parallel terms $x:A\types M:B$ and $x:A\types N:B$ in the type theory of $\fU_1\cM$ are related by a generating equality if and only if their images under the counit map $\F1 \fU_1 \cM \to \cM$ are equal.
\end{defn}

\begin{thm}\label{thm:catprod-thy-initial-2}\label{thm:catprod-thy-initial}
  The left adjoint $\F2$ of $\fU_2$ is defined by applying $\F1$ to underlying 1-skeleta, then quotienting by the equivalence relation of derivable judgments $x:A\types M\equiv N:B$.
\end{thm}
\begin{proof}
  The main point is that a category with products (in our strict sense with chosen binary and nullary products) is exactly a 1-skeleton for a category with products that satisfies categorical versions of the $\beta$- and $\eta$-conversion rules.
  Thus, $\F2\cP$ is indeed a category with products, and we can extend $\free$ to it by induction on derivations of equality.
\end{proof}

This is a bit abstract, so let's consider our motivating example.
The $\times$-presentation for monoid objects should have one object $A\in\cP_0$ and two arrows, $m\in \cP_1(A\times A,A)$ and $e\in\cP_1(\unit,A)$.
It should have three equalities, for associativity and the two unit laws; but what terms do they relate?
Consider associativity: we need two terms $x:(A\times A)\times A \types M:A$ expressing the two ways to multiply the three components of $x$.
(Note that we also had to chose, arbitrarily, a particular way to associate the triple cartesian product.)
First, of course we have to extract those components using $\pi_1$ and $\pi_2$.
Then we have to multiply them in pairs, noting that since the source of $m$ is $A\times A$ we have to pair things up before we can apply $m$ to them.
This leads us to the terms
\begin{align*}
  x:(A\times A)\times A &\types m(\pair{m(\pair{\pi_1(\pi_1(x))}{\pi_2(\pi_1(x))})}{\pi_2(x)}) : A\\
  x:(A\times A)\times A &\types m(\pair{\pi_1(\pi_1(x))}{m(\pair{\pi_2(\pi_1(x))}{\pi_2(x)})}) : A
\end{align*}
so one of our generating equalities will relate these two terms.
The unit laws are simpler; one relates
\[ x:A \types m(\pair{x}{e(\ttt)}):A \qquad\text{and}\qquad x:A \types x:A \]
and the other relates
\[ x:A \types m(\pair{e(\ttt)}{x}):A \qquad\text{and}\qquad x:A \types x:A \]
This completes the definition of the \textbf{$\times$-presentation of monoids}.
For brevity we may write its generating equalities as
\begin{align*}
  x:(A\times A)\times A &\types
                          \begin{multlined}[t]
                            m(\pair{m(\pair{\pi_1(\pi_1(x))}{\pi_2(\pi_1(x))})}{\pi_2(x)}) \\[\jot]
                            \equiv
                            m(\pair{\pi_1(\pi_1(x))}{m(\pair{\pi_2(\pi_1(x))}{\pi_2(x)})}) : A
                          \end{multlined}\\
  x:A &\types m(\pair{x}{e(\ttt)}) \equiv x:A\\
  x:A &\types m(\pair{e(\ttt)}{x}) \equiv x:A
\end{align*}
as long as we don't forget that the actual rule~\eqref{eq:equality-intro} builds in a substitution.

Let us write this $\times$-presentation as \cT.
\cref{thm:catprod-thy-underlying-2} tells us that a morphism of $\times$-presentations $\pfree:\cT \to \fU_2\cM$, for any category with products \cM, is exactly a monoid object in \cM.

More explicitly, first we choose to interpret the base type $A$ as some object $\pfree(A)\in \cM$.
This means choosing a morphism on 0-skeleta $\cT_0 \to \fU_0\cM_0$.

Second, we extend this interpretation inductively to other types using the chosen products in \cM, so that for instance $A\times A$ and $\unit$ are sent to the chosen product $\pfree(A) \times \pfree(A)$ and terminal object $1$ in \cM.
This means considering the adjunct morphism $\F0\cT_0 \to \cM_0$, which factors as the composite $\F0\cT_0 \to \F0 \fU_0 \cM_0 \to \cM_0$.

Third, we choose to interpret the generating arrows $m$ and $e$ as morphisms $\pfree(A) \times \pfree(A) \to \pfree(A)$ and $1\to \pfree(A)$ in \cM.
Technically, we are interpreting them as arrows in $\fU_1 \cM_1$ compatibly with the map $\F0\cT_0 \to \F0 \fU_0 \cM_0$ on their domains and codomains, but arrows in $\fU_1 \cM_1$ are induced by the counit $\F0 \fU_0 \cM_0 \to \cM_0$, so this really means arrows in \cM as shown.
This gives a morphism $\cT_1 \to \fU_1\cM_1$.

Fourth, we extend this interpretation to other terms, such as those appearing in the above, sending them to other morphisms in \cM.
This means considering the adjunct morphism $\F1 \cT_1 \to \cM_1$, which again factors through the counit $\F1 \fU_1 \cM_1 \to \cM_1$.

Finally, we assert that the generating equalities in $\cT$ are sent to equalities in \cM, or technically in $\fU_2 \cM$.
Unwinding the definitions shows that the two associativity terms really are sent to the two morphisms $(\pfree(A)\times \pfree(A))\times \pfree(A) \to \pfree(A)$ that the associativity of a monoid object should equate, and similarly for the unit terms.
Thus, we have precisely specified a monoid object in \cM.

It now follows from \cref{thm:catprod-thy-initial} that $\F{2}\cT$ is the free category with products generated by a monoid.
Thus, a monoid object in \cM also induces a functor $\free:\F{2}\cT\to\cM$.
Since derivations of equalities yield equal morphisms in $\F2\cT$, it follows that any equation we can derive in this theory will be true of any monoid object in any category with products.

If we want to reproduce the uniqueness-of-inverses proof from \cref{sec:intro}, we need to further augment our $\times$-presentation with two inverse operations $i,j \in \cT_1(A,A)$ and equalities such as
\[ x:X \types m(\pair{x}{i(x)}) \equiv e(\ttt) : A \]
The proof of uniqueness now looks like:
\begin{align*}
  i(x)
  &\equiv m(\pair{i(x)}{e(\ttt)})\\
  &\equiv m(\pair{i(x)}{m(\pair{x}{j(x)})})\\
  &\equiv m(\pair{\pi_1(\pi_1(\pair{\pair{i(x)}{x}}{j(x)}))}{m(\pair{\pi_2(\pi_1(\pair{\pair{i(x)}{x}}{j(x)}))}{\pi_2(\pair{\pair{i(x)}{x}}{j(x)})})})\\
  &\equiv m(\pair{m(\pair{\pi_1(\pi_1(\pair{\pair{i(x)}{x}}{j(x)}))}{\pi_2(\pi_1(\pair{\pair{i(x)}{x}}{j(x)}))})}{\pi_2(\pair{\pair{i(x)}{x}}{j(x)})})\\
  &\equiv m(\pair{m(\pair{i(x)}{x})}{j(x)})\\
  &\equiv m(\pair{e(\ttt)}{j(x)})\\
  &\equiv j(x).
\end{align*}
If it weren't for those two horrific-looking terms in the middle, the rest of the calculation looks pretty much like the argument as we gave it in \cref{sec:intro}.
The horrific-looking terms are there because we can only apply the associativity of $m$ to a single term belonging to $(A\times A)\times A$, so we need to tuple up the terms $i(x)$, $x$, and $j(x)$ and replace them by the projections from that tuple (using $\beta$-conversion).

In \cref{chap:simple} we will remedy this problem by introducing a type theory that allows us to state associativity (and $m$ itself) without tupling --- and, incidentally, remove much of the complication of this section by eliminating the need for products in the domains and codomains of generating arrows entirely.

Now, in \cref{sec:group-presentations,sec:category-presentations} we observed that the counit of the adjunction for presentations was an isomorphism.
In the case of $\times$-presentations, this adjunction is now only an \emph{equivalence}.

\begin{thm}\label{thm:catprod-thy-esssurj}
  For any category with products \cM, the functor $\F2\fU_2\cM \to \cM$ is an equivalence of categories.
\end{thm}
\begin{proof}
  The definitions of $\fU_0$, $\fU_1$, and $\fU_2$ are cooked up precisely so as to make this functor surjective on objects, full, and faithful respectively.
  That is, $\fU_0\cM$ contains all the objects of \cM, so that the counit $\ep_0:\F{0}\fU_0\cM\to \fU_0\cM$ is surjective --- though not injective, since $\F{0}\fU_0\cM$ contains new types such as ``$A\times B$'' for $A,B\in \cM$, which is distinct from the specified product of the objects $A$ and $B$ in \cM.
  Then for any types $A,B$, $\fU_1\cM(A,B)$ contains all the morphisms of \cM between their images in \cM, so the counit $\ep_1:\F{1}\fU_1\cM\to \fU_1\cM$ is full --- though not faithful, since $\F{1}\fU_1\cM$ also contains new terms such as $x:A\types g(f(x)):C$ for $f:A\to B$ and $g:B\to C$ in \cM, which is distinct from $x:A\types (g\circ f)(x):C$ for the composite $g\circ f$ of morphisms in \cM.
  Finally, $\fU_2\cM$ equates all terms whose images in \cM are equal, so that when we quotient by $\equiv$ the counit $\ep_2:\F{2}\fU_2\cM\to \fU_2\cM$ becomes faithful as well.
\end{proof}

In other words, the functor $\F2:\timespres2 \to \bPrCat$ is \emph{bicategorically} essentially surjective.
Unfortunately, while \bPrCat can naturally be enhanced to a 2-category, $\timespres2$ cannot except in the trivial way with only identity 2-cells; and while we can consider $\F2$ to be a functor of bicategories with trivial domain, there is no way to extend $\fU_2$ to a functor of bicategories.
Thus, we cannot regard categories with products as a reflective subcategory (or even reflective sub-bicategory) of $\times$-presentations.
However, we can still construct a bicategory that is (bicategorically) equivalent to $\bPrCat$ and whose objects are $\times$-presentations, by taking the hom-category from $\cP$ to $\cQ$ to be the hom-category $\bPrCat(\F2(\cP),\F2(\cQ))$.
In fact, this bicategory is a strict 2-category, and its underlying 1-category is the Kleisli category of the induced monad on $\timespres2$; we are just observing that the latter category can be enhanced to a 2-category in order to make the adjunction ``bicategorically opmonadic''.

\begin{rmk}
It is worth noting that $\times$-presentations as defined here have a minor problem from a type-theoretic standpoint:
if we try to formulate a sequent calculus version of their type theories with the same generator rules as before, we find that cut is no longer admissible.
For instance, if $f\in\cP_1(A,B\times C)$, there is no way to simplify a cut like the following:
\begin{mathpar}
  \inferrule*[right=cut]{\inferrule*[right=$f$]{X\types A}{X\types B\times C} \\
    \inferrule*[Right=$\timesL$]{B\types Y}{B\times C\types Y}}
  {X\types Y}
\end{mathpar}
That is, if we have generating morphisms whose codomains are products, then their composites with projections cannot be ``simplified''.
In a natural deduction theory, this problem manifests as a lack of ``canonical forms'' relative to $\beta$- and $\eta$-conversion.

A common way to deal with this is to simply formulate the cut-elimination theorem in a form like ``all cuts except those over types appearing as the domains or codomains of generators can be eliminated''.
Or roughly equivalently, we can modify the generator rule to build in a cut on \emph{both} sides:
\begin{mathpar}
  \inferrule{X\types A \\ B\types Y}{X\types Y}\; f\in\cT_1(A,B)
\end{mathpar}
With this primitive rule, the above ``inadmissible'' cut could be replaced by
\begin{mathpar}
  \inferrule*[right=$f$]{X\types A\\
    \inferrule*[Right=$\timesL$]{B\types Y}{B\times C\types Y}}
  {X\types Y}
\end{mathpar}

In any case, this is another advantage of the multiple-antecedent type theories to be presented in \cref{chap:simple}: they will enable us to describe important theories (such as monoids) while restricting the generating morphisms to have only base types in their domain and codomain.
(Of course we still do require generating equalities relating complex terms rather than just generators.)
\end{rmk}

The construction of $\times$-presentations, given the ordinary unary type theory for categories with products under directed graphs, while somewhat involved, is quite general and can be applied to pretty much any type theory with an initiality theorem.
We will not attempt to make this generality precise, but the approach is similar to the general theory of computads in higher category theory; see~\cite{batanin:cptd-fin,garner:hom-hcat,nlab:computad}.

For instance, by starting from the type theory for categories with coproducts instead, we obtain a notion of \textbf{unary $+$-presentation} and a corresponding tower of adjunctions.
Most of the definitions and proofs are direct translations of those for $\times$-presentations, replacing $\times$ and its rules by $+$ and its rules everywhere.
The only thing here that requires a little thought is the definition of a \textbf{1-skeleton for a category with coproducts}.
But looking at the rules for coproduct terms from \cref{fig:catcoprod},
% \begin{mathpar}
%   \inferrule{x:X \types M:\zero \\ \types C\type}{x:X \types \abort(M):C}\;\zeroE\\
%   \inferrule{x:X\types M:A}{x:X\types \inl(M):A + B}\;\plusI1\and
%   \inferrule{x:X\types N:B}{X\types \inr(N):A + B}\;\plusI2\and
%   \inferrule{x:X\types M:A + B\\u:A\types P:C \\ v:B\types Q:C}{x:X\types \acase AB(M,u.P,v.Q):C}\;\plusE
% \end{mathpar}
and the corresponding defining clauses of substitution from \cref{thm:catcoprod-subadm},
% \begin{align*}
%   \abort(N)[M/y] &= \abort(N[M/y])\\
%   \inl(N)[M/y] &= \inl(N[M/y])\\
%   \inr(N)[M/y] &= \inr(N[M/y])\\
%   \case(N,u.P,v.Q)[M/y] &= \case(N[M/y],u.P,v.Q)
% \end{align*}
leads us to define it to consist of the following.
\begin{enumerate}
\item A 0-skeleton for a category with coproducts, i.e.\ a set $\cM_0$ with an operation $\mathord+:\cM_0\times \cM_0 \to \cM_0$ and an element $\zero\in\cM_0$.
\item A category with $\cM_0$ as its set of objects.
\item For every $A,B\in\cM_0$, morphisms $A\to A+B$ and $B\to A+B$.
\item For any two morphisms $A\to C$ and $B\to C$, a morphism $A+B\to C$.
\item For any $A\in\cM_0$, a morphism $\zero\to A$.
\end{enumerate}
As noted in \cref{rmk:1skeleton-yoneda}, the type-theoretic rules are modeled more directly by natural transformations between representable functors.
In general, the necessary naturality involves the premise(s) that share antecedents with the conclusion.
But in the unary rules for coproducts, unlike those for products, there is always \emph{exactly} one such premise, and so the natural transformations involved will always be in the image of the Yoneda embedding.

The rest of the theory of $+$-presentations is straightforward: we get a tower of adjunctions, with the top left adjoint sending a $+$-presentation to a category with coproducts generated by its type theory.
We can then go on to combine these two kinds of presentation, starting from the theory of \cref{sec:modularity} for categories with both products and coproducts and obtain a notion of \emph{unary $(\times,+)$-presentation}.

One thing worth noting is that the tower of adjunctions can further motivate the twist in the proof of \cref{thm:catcoprod-initial}.
Recall that there we had to prove that the extension $\free$ to terms takes substitution to composition before we could extend it to ``act on equalities''.
From the present perspective, this is explained by the fact that we have to construct the entire 1-skeletal adjunction $\F1 \adj \fU_1$, including the functoriality of induced maps $\free:\F1\cP \to \cM$, before we can start constructing the 2-skeletal adjunction.

The category presentations studied in \cref{sec:category-presentations} fit into this general picture too, but a number of things are degenerate and thus not easily visible.
Since there are no type operations, a ``0-skeleton for a category'' is (like a 0-skeletal category presentation) just a set, so the level-0 adjunction is the identity.
And since there are no equality rules (other than those arising from the axioms of a presentation), 1-skeletons and 2-skeletons for categories are both just another name for categories.
Thus, the level-1 adjunction is the one from \cref{thm:category-initial-2} between directed graphs and categories, while the level-2 adjunction is the one from \cref{thm:catpres-initial} between category presentations and categories.

In the future, whenever we introduce a new class of type theories, we will first describe the basic version that constructs free categorical structures from ``graph-like'' data, and then later discuss (in more or less detail, depending on the case) the corresponding kind of ``presentation''.
As above, usually almost everything in the latter case is a straightforward generalization; the main conceptual point is a definition of the 1-skeletal version of the categorical structure with the right amount of naturality.
(In some cases, such as \cref{sec:multicat-pshf,sec:symm-tensor-pres}, this will require developing a bit more categorical theory.)


\subsection{Theories}
\label{sec:theories}

So far in this section we have stuck to categorical terminology, to avoid confusion.
However, the sort of objects we have introduced are more traditionally given names arising from mathematical logic.
\begin{enumerate}
\item Structures such as $\times$-presentations are usually called \textbf{theories} of an appropriate sort; our $\times$-presentations might be called something like \emph{unary finite-product theories}.
  Thus, for instance, the above $\times$-presentation for monoids would be called \emph{the unary finite-product theory of monoids}.
\item Their 1-skeleta, consisting of generating objects and morphisms but no generating equalities, are usually called \textbf{signatures}.
\item The generating objects are usually called \textbf{sorts} or \textbf{types}, the generating morphisms are usually called \textbf{function symbols} or \textbf{operations}, and the generating equalities are usually called \textbf{axioms}.
\item The underlying $\times$-presentation of a category with products would usually be called its \textbf{internal logic} or \textbf{internal type theory}.
\item A morphism from a $\times$-presentation \cP (that is, a unary finite-product theory) into the internal logic of a category with products \cM is usually called a \textbf{model} or an \textbf{interpretation} of \cP in \cM.
\item The category with products generated by a $\times$-presentation is sometimes called its \textbf{syntactic category}.
\item When we make a (bi)category equivalent to \bPrCat, say, by taking the objects to be the $\times$-presentations, the morphisms are usually known as \textbf{translations}.
  Thus, a translation is a model of its domain in the syntactic category of its codomain.
\end{enumerate}

Thus, the equivalence obtained by ``Kleisli construction'' from the category of presentations would usually be stated as something like
\begin{quote}
  Taking the internal logic of a category with products yields an equivalence between the bicategory of categories with products and the bicategory of unary finite-product theories with translations between them.
\end{quote}
Such an equivalence is sometimes regarded as the ideal situation for categorical semantics of type theories.
The author's opinion is that it is the \emph{adjunction} between presentations and categorical structures that is more fundamental (the equivalence being obtained by a trick of redefining the morphisms in one category); but the equivalence is certainly also important.

Other descriptions of categorical logic also tend to emphasize the ``internal logic'' more than we have here.
Note that the internal logic of \cM comes with a canonical interpretation into $\cM$ itself, given by the counit $\F2\fU_2\cM \to \cM$; thus anything derivable in the internal logic of \cM is actually true in \cM.
Moreover, all the objects, morphisms, and equalities in \cM are ``available'' in its internal logic as generators.
Thus, if we only care about semantics in one category \cM, we can work in this ``universal type theory'' generated by \cM, where we have ``everything from \cM'' to work with.
However, since in practice any actual argument only involves finitely many generators, it is generally also sufficient to work with small explicit theories, thereby making the conclusions more general.

Finally, I should point out that the word \emph{theory} is quite overloaded, in a confusing way that mixes many levels.
As stated above, one generally refers to $\times$-presentations as theories.
However, we also speak of the \emph{type theory} generated by such a presentation (e.g.\ what I have called the ``type theory for categories with products \emph{under} a given $\times$-presentation''), which constructs the category presented by that presentation.
In some sense the original presentation/theory is present ``inside'' this type theory, but the latter really consists of all the rules, not just the generators.

Note also that there is a \emph{different} ``type theory'' in this sense associated to each presentation/theory.
On the other hand, one often speaks loosely of ``the type theory for categories with products'', meaning to encompass all of these type theories associated to all presentation/theories.
(The word ``doctrine'' has also been used informally by Jon Beck at a similar level of generality; thus one would say that our above theory of monoids is a ``theory in the doctrine of finite products''.)
And of course, at a yet higher level one speaks of ``type theory'' as a mathematical \emph{subject}, like ``group theory'' or ``category theory'', encompassing \emph{all} such ``type theories'' (individual collections of judgments and rules).

Finally, there is a thread in categorical logic dating back to Lawvere~\cite{lawvere:functsem} that uses the word ``theory'' to refer to the \emph{free category generated by} a presentation.
Thus, in this usage the ``theory of monoids'' would be the category with products constructed from our type theory for monoids.
This category does retain all the information about the \emph{models} of a theory in categories with products, but it has lost all the information about the generating operations and axioms.
For instance, the category with products corresponding to the theory of monoids treats binary multiplication $(x,y) \mapsto x\cdot y$ on an equal footing with ternary multiplication $(x,y,z) \mapsto x\cdot (y\cdot z)$ (or equivalently $(x\cdot y)\cdot z$) and $n$-ary multiplication for all $n$, as well as other weirder operations like $(x,y,z) \mapsto y\cdot (x\cdot y)$.
In particular, quite different-looking presentation/theories (such as Boolean algebras and Boolean rings) can present equivalent categories, and hence have the same models everywhere; this is sometimes known as \emph{Morita equivalence}.

The passage from presentation/theories to categories is thus undoubtedly of great importance.
However, my own feeling is that using the word ``theory'' for the category rather than its presentation loses too much information that is traditionally included in things referred to as ``theories''.
In mathematical practice, theories (such as ``the theory of monoids'') are usually specified by a small number of generators and relations; thus if nothing else it is important to understand \emph{the process by which} these generate a category.
Type theory, with its technology of cut-admissibility, gives us a concrete way to construct and understand such categories, rather than (for example) simply deducing their existence by an adjoint functor theorem.

An extended dialogue about the meaning of the word ``theory'' can be found at~\cite{shulman:what-is-a-theory}.


\subsection*{Exercises}

\begin{ex}\label{ex:catcoprod-comon}
  Write down a $+$-presentation for \emph{comonoids} in categories with coproducts: objects $A$ equipped with morphisms $\Delta:A\to A+A$ and $e:A\to \zero$ satisfying coassociativity and counitality axioms.
\end{ex}

\begin{ex}\label{ex:catprodcoprod-field}
  Write down a $\times$-presentation for ring objects.
  Then extend it to a $(\times,+)$-presentation for field objects.
\end{ex}

\begin{ex}\label{ex:catprod-eckmann-hilton}
  Write down a unary finite-product theory (that is, a $\times$-presentation) for objects having two monoid structures with the same unit and satisfying an internalized version of the ``interchange law'' $m_1(m_2(x,y),m_2(z,w)) = m_2(m_1(x,z),m_1(y,w))$.
  Prove in the resulting type theory that $m_1=m_2$ and both are commutative.
  Compare this proof to \cref{ex:eckmann-hilton}.
  \textit{(In \cref{ex:catprod-ehnr-again} you will re-do this proof using a better internal logic for comparison.)}
\end{ex}

\begin{ex}\label{ex:catprod-nearring}
  Recall the notion of ``distributive near-ring'' from \cref{ex:near-ring}.
  Write down a unary finite-product theory for internal distributive near-ring objects in a category with products.
  Then use the resulting type theory to prove that every distributive near-ring object is in fact a ring object; compare this proof to \cref{ex:near-ring}.
  \textit{(In \cref{ex:catprod-ehnr-again} you will re-do this proof using a better internal logic for comparison.)}
\end{ex}

\begin{ex}\label{ex:catfib-thy}
  Construct a tower of skeletal presentations and adjunctions, analogous to those constructed in this section, corresponding to the type theory for fibrations from \cref{ex:catfib}.
\end{ex}

\ChapterExercises


% Local Variables:
% TeX-master: "catlog"
% End: