Reverse mode Automatic Differentiation (AD), a fundamental technique harnessing the chain rule to compute derivatives of functions implemented by programs, plays an integral role in addressing the challenges of efficiently computing derivatives of functions with highdimensional domains while maintaining numerical stability. Its widespread application in machine learning and scientific computing underscores its paramount significance.
Within the programming languages (PL) community, the pursuit of a simple, easily implementable, and purely functional reverse AD that can be performed at compile time through sourcecode transformation has been a central objective. The realization of this goal hinges on the development of an AD sourcecode transformation method underpinned by welldefined denotational semantics.
Reverse mode Combinatory Homomorphic Automatic Differentiation (CHAD) implements reverse AD following principled denotational semantics; namely, we define it as a uniquely defined structurepreserving functor. Following this principle, CHAD provides us with a reverse mode AD whose categorical semantics is the dual of the corresponding forward mode, with a straightforward correctness proof by a logical relations (LR) argument.
CHAD has been shown to apply to expressive total languages, involving sum, function, and (co)inductive types, by making use of a language with linear dependent types (see [1]). On the implementation side, CHAD can be made efficient (complexitypreserving), and it is parallelismpreserving (see [2]).
A remaining semantic challenge is to apply CHAD to languages with partial features. More precisely, we are particularly interested in understanding CHAD for languages with iteration (whileloops), as it is a common feature exploited in Machine Learning and Scientific Computing. For example, such loops are commonly used to implement Taylor approximations to special functions and iterative solvers. Further, understanding the interaction of CHAD with iteration is crucial for realizing the vision of general differential programming languages where differentiation can be applied to all programs.
In this talk, we will discuss how CHAD's framework has an appealing application to languages featuring iteration. We focus on the (concrete) denotational semantics, showing that our categorical semantics revolves around universal functors that preserve the least fixed points. Time permitting, we can discuss further aspects of this setting involving exponentials (function types) and the correctness result, involving a novel logical relations argument.
This is joint work with Gordon Plotkin and Matthijs Vákár.
[1] Fernando Lucatelli Nunes and Matthijs Vákár. CHAD for Expressive Total Languages. Mathematical Structures in Computer Science. 2023.
[2] Tom Smeding and Matthijs Vákár. Efficient CHAD arXiv:2307.05738. 2023.
