NOTE about Second-Order Generalised Algebraic Theories: Signatures and First-Order Semantics

這篇文章主要在探討程式語法的表示方式，並展示如何結合代數方法跟 HOAS。作者說他們跟隨 Uemura 把語言定義成 second-order generalised algebraic theories (SOGATs)，通過一系列案例揭示 non-substructural languages 可以自然的定義成 SOGATs

SOGAT 的形式定義 (using the syntax of a particular SOGAT)
定義兩種 SOGAT signatures 到 GAT signatures (signatures for quotient inductive-inductive types) 的轉換，分別基於同時替換（parallel substitution）與單一替換（single substitution）

按代數抽象程度區分 [local-0]

從具體到抽象可以把語法表示看成

語法表示	能力
abstract syntax trees (AST)	把程式原文 parse 後直接儲存用的一系列資料結構
well-scoped syntax tree	`λx.x` 跟 `λy.y` 在這類表示法裡面無法區分，也就是說綁定使用的具體名稱不再重要
intrinsic well-typed terms	這種表示把語法跟 typing relation 整合，所以 non well-typed 的程式甚至無法表達
well-typed, quotiented by the conversion relation (GAT)	加上更多 well-formness relation 的推廣代數理論，下面介紹
SOGAT	GAT 推到 second-order，下面介紹

在 well-typed terms 上再加上 conversion relation 就變成廣義代數理論（generalised algebraic theory，簡稱 GAT），GAT 用來處理 dependently typed languages 時特別方便，因為 typing 依賴了 conversion relation。在 GAT 上只能定義保留了 conversion relation 的函數，因此連印出函數都無法定義，但 normalisation（正規化）、typechecking（型別檢查）跟 parametricity 這些函數保留 conversion 因此可定義。

對任何 GAT 來說，syntax 可以定義成 initial model，因此沒有區分語法跟語意的必要，某種意義上來說，一個程式語言就是一個 GAT。

HOAS 方法：按對 bindings/variables 的處理方式區分 [local-1]

HOAS 觀點關心 bindings 跟 variables 的處理方式，比如 De Bruijn indices 使名稱選擇與語意無關，但這也表示替換必須是語法的一部分，舉例來說 form of a category with families。

Logical frameworks 跟 higher-order abstract syntax (HOAS) 提供了另一種實現 bindings 跟 variables 的方式：使用 metatheory 的函數空間。舉例來說，pure lambda calculus 的 lambda operation 的 type 是二階函數空間 (Tm → Tm) → Tm。這在理論上的解釋是 type-theoretic internal language of presheaves over the category of contexts and syntactic substitutions，在這個 topos 的 internal language 裡，lambda 的類型就長那樣。

Internal language 的觀點可以用來定義程式語言是什麼：有綁定的語言不是一個 GAT，而是一個二階的廣義代數理論（second-order generalised algebraic theory，簡稱 SOGAT），這種理論可以有二階操作（但不是任意高階）。

Untyped 或是 Simply typed 都有被定義成二階理論過，但 Uemura 是第一個用 SOGATs 定義有綁定的語言的人。這個理論真的很厲害，一個語言的 SOGAT 定義比 well-typed quotiented 定義還要抽象：SOGAT 連 contexts 跟 substitutions 都不用提到，這些會自動生成。但這不是一個 well-behaved 的代數理論，比如 second-order models 之間沒有有意義的 homomorphism。

為了描述 SOGAT 的一階模型、homomorphisms 或是 syntax 的 notion，作者把它轉成一個 GAT。這個過程中引入新的 sorts 給 contexts 跟 substitutions，然後把每個操作都用其 context index，second-order 的函數空間也由 context indexing 轉換成 first-order。因此得到一個有 some “correctness by construction” properties 的 GAT，比如每個操作都自動保留替換。這對複雜的理論來說，如果不是從 SOGAT 出發而是直接用其 GAT 表示，那這種屬性並不 trivial。

Cubical type theory 跟有 internal parametricity 的 type theory 都可以定義成 SOGATs，這些方法已經用來證明型別論的屬性。

Substructural (像是 linear or modal) type theories 無法用這篇論文說的方法用 SOGATs 定義，但有時候 presheaves over a substructural theory 提供的 substructural internal language 可以用來描述理論，像是 multi-modal type theory。

簡單的代數理論可以用 signatures 和方程式來表示，也可以 presentation independently 為 Lawvere theories。

GATs 的 syntactic signatures 可以用 preterms 跟 well-formedness relations 定義，也可以 presentation-independent 為 contextual categories 或 categories with families (CwFs) 或 clans。

GATs 的 theory of signatures (ToS) 方法落在 syntactic 跟 presentation-independent 方法中間：signatures 用某個 GAT 語法定義，這是一種設計來專用於定義簽名的類型理論。簽名跟我們在 Agda 裡寫下的 inductive datatype 定義一模一樣：A list (telescope) of the curried types of sorts and constructors。ToS 裡一個 signature 是一個理論的具體表示，但在 well-typed quotiented syntax 這層抽象上給出。這讓我們得到優雅的語意構造，又還是能用 signatures 工作。

SOGATs 同樣可以定義成 syntactically 或 presentation-independently（用 representable map categories 或是 CwFs with locally representable types）

這篇論文貢獻了 ToS 風格的 SOGATs 定義。SOGAT signatures 的理論本身也是 SOGAT，所以這個理論可以描述自己。

避免了循環論證，因為我們首先將 SOGAT 簽名定義為 GAT，從而引導 SOGAT 簽名理論，而 GAT 簽名理論（即 GAT 的語法）本身可以使用 Church 編碼進行引導。

Contributions [local-2]

The main takeaway of this paper is that structural languages are SOGATs.

We justify this claim through several examples. Our technical contributions are the following:

The theory of SOGAT signatures (ToS+), a domain-specific type theory in which every closed type is a SOGAT signature. As it is a structural type theory, it can be defined as a SOGAT itself. Signatures can be formalised in ToS+ without encoding overhead.
A translation from SOGAT signatures to GAT signatures based on a parallel substitution calculus. Thus, for every SOGAT, we obtain all of the semantics of GATs: a category of models with an initial object, (co)free models, notions of displayed models and sections, the fact that induction is equivalent to initiality, and so on. The GAT descriptions that we obtain are readable, do not contain occurrences of Yoneda as in usual presheaf function spaces. Correctness of the translation is showed by proving that internally to presheaves over a model of the GAT, a second-order model of the SOGAT is available.
We define an alternative translation producing a single substitution calculus.

作者開始展示如何把各種 logic 或程式語言定義成代數理論。這邊我寫下每個代數理論的 agda 版本

Schönfinkel's combinator calculus (Algebraic Theories) [ag-0003]

{-# OPTIONS --safe --without-K #-}
module ag-0003 where

open import MLTT.Spartan

Combinator calculus 可以看成一個代數理論，這時候它有

一個 sort of terms
一個 binary
兩個 nullary operations
兩個等式

record combinator-calculus : 𝓤₁ ̇ where
  field
    Tm : 𝓤₀ ̇

    _·_ : Tm → Tm → Tm

    K : Tm
    S : Tm

    Kβ : {u f : Tm} → K · u · f ＝ u
    Sβ : {f g u : Tm} → S · f · g · u ＝ f · u · (g · u)

  infixl 30 _·_

從這個符號可以明顯看出代數/模型的概念
Combinator calculus 的 quotiented syntax 是初始模型，它總是存在。
Notions of homomorphism, displayed/dependent model, induction, products and coproducts of models, free models, and so on, are derivable from the signature, as described in any book on universal algebra
Algebraic Theory 的 initial algebra 叫做 quotient inductive type

其他 single-sorted algebraic theories 著名案例：

邏輯：經典（或直覺主義）命題邏輯，定義為布林代數（或海廷代數）理論
代數：monoids、群、環、lattices 等等

Generalised algebraic theories (GATs) [local-3]

Generalised algebraic theories (GATs) 的 sort 可以 indexed by 其他 sort。案例有 typed combinator calculus、propositional logic with Hilbert-style proof theory、theories of graphs、preorders、categories 等等

Typed combinator calculus (Generalised algebraic theories) [ag-0004]

{-# OPTIONS --safe --without-K #-}
module ag-0004 where

open import MLTT.Spartan

record typed-combinator-calculus : 𝓤₁ ̇ where
  field

給 types 的 sort

    Ty : 𝓤₀ ̇

每個 type 對應（index）一個 term 用的 sort

    Tm : Ty → 𝓤₀ ̇

    ι : Ty
    _⇒_ : Ty → Ty → Ty

    _·_ : {A B : Ty} → Tm (A ⇒ B) → Tm A → Tm B

    K : {A B : Ty} → Tm (A ⇒ B ⇒ A)
    S : {A B C : Ty} → Tm ((A ⇒ B ⇒ C) ⇒ (A ⇒ B) ⇒ A ⇒ C)

    Kβ : {A B : Ty} {u : Tm A} {f : Tm B} → K · u · f ＝ u
    Sβ : {A B C : Ty}
      {f : Tm (A ⇒ B ⇒ C)}
      {g : Tm (A ⇒ B)}
      {u : Tm A}
      → S · f · g · u ＝ f · u · (g · u)

  infixl 40 _·_
  infixr 30 _⇒_

上述 Algebraic Theory 的通用代數特徵可以推廣到 GAT。具體來說，每個 GAT 都具有由 quotient inductive-inductive type 給出的語法，我們有 free 模型和 cofree 模型。

Second-order algebraic theories (SOATs) [local-4]

如果一個語言有變數或是綁定，就被定義成一個二階理論。

Lambda calculus (Second-order algebraic theories) [ag-0005]

{-# OPTIONS --safe --without-K #-}
module ag-0005 where

open import MLTT.Spartan

record lambda-calculus : 𝓤₁ ̇ where
  field
    Tm : 𝓤₀ ̇

lam 的（metatheory）型別不是一階的（not strictly positive）

    lam : (Tm → Tm) → Tm
    _·_ : Tm → Tm → Tm

    β : {f : Tm → Tm} {u : Tm} → lam f · u ＝ f u

  infixl 30 _·_

By the syntax of lambda calculus, we mean the syntax for the GAT of Definition 4. However, we still prefer to define lambda calculus as a SOGAT: it is a shorter definition, does not include boilerplate, and ensures that once translated to its first-order version, all operations respect substitution by construction.

此外，我們可以像邏輯框架一樣，使用二階表示來進行程式設計。這意味著，使用二階表示，我們可以定義 derivable operation 並證明 derivable 等式，而不是像證明 admissible 那樣需要歸納法。

舉例來說 Y combinator 是 derivable operation。可以證明它是 fixpoint combinator：

  Y : Tm
  Y = lam (λ f → (lam (λ x → f · (x · x))) · (lam (λ x → f · (x · x))))

  Y-is-fixed-point : {f : Tm} → Y · f ＝ f · (Y · f)
  Y-is-fixed-point {f} =
    Y · f                                                                 ＝⟨by-definition⟩
    lam (λ f → (lam (λ x → f · (x · x))) · (lam (λ x → f · (x · x)))) · f ＝⟨ β ⟩
    (λ f → (lam (λ x → f · (x · x))) · (lam (λ x → f · (x · x)))) f       ＝⟨ refl ⟩
    (lam (λ x → f · (x · x))) · (lam (λ x → f · (x · x)))                 ＝⟨ β ⟩
    f · ((lam (λ x → f · (x · x))) · (lam (λ x → f · (x · x))))           ＝⟨ refl ⟩
    f · ((λ f → (lam (λ x → f · (x · x))) · (lam (λ x → f · (x · x)))) f) ＝⟨ ap (f ·_) (β ⁻¹) ⟩
    f · (Y · f) ∎

這種推理對任何 second-order model 都有效，而且任何 first-order model 都可以在 internal language of presheaves over first-order model 裡被升級成 second-order model。

但注意沒有可用的 SOAT models $M$ 和 $N$ 之間的同態概念。為了討論同態或語法，我們將 SOAT 轉換為一階 GAT：加上上下文、替換、索引 Tm 以及所有基於上下文的操作，然後 lam 就變成了一個以擴展上下文中的項作為輸入的一階函數。轉換出來的 GAT 就是

Lambda calculus (GAT) [ag-0006]

{-# OPTIONS --safe --without-K #-}
module ag-0006 where

open import MLTT.Spartan hiding (_∘_; id)

作者這邊開始解釋從二階理論得出一階理論（GAT）的標準過程

record first-order-lambda-calculus : 𝓤₁ ̇  where
  field
    Con : 𝓤₀ ̇
    Sub : Con → Con → 𝓤₀ ̇

有 terminal object 的 category

    _∘_ : {Δ Γ Θ : Con} → Sub Δ Γ → Sub Θ Δ → Sub Θ Γ
    assoc : {A B C D : Con} {γ : Sub C D} {δ : Sub B C} {θ : Sub A B}
      → (γ ∘ δ) ∘ θ ＝ γ ∘ (δ ∘ θ)
    id : {Γ : Con} → Sub Γ Γ
    id-left : {A B : Con} {γ : Sub A B} → id ∘ γ ＝ γ
    id-right : {A B : Con} {γ : Sub A B} → γ ∘ id ＝ γ
    -- empty context: zero
    ◇ : Con
    ε : {Γ : Con} → Sub Γ ◇
    -- terminal
    ◇η : {Γ : Con} → (σ : Sub Γ ◇) → σ ＝ ε

sort Tm 現在 indexed by Con 且有一個 instantiation operation，這個 operation 是 functorial ([◦], [id]).

    Tm : Con → 𝓤₀ ̇
    _[_] : {Γ Δ : Con} → Tm Γ → Sub Δ Γ → Tm Δ
    [id] : {Γ : Con} {t : Tm Γ} → t [ id ] ＝ t
    [∘] : {Θ Γ Δ : Con} {t : Tm Γ} {γ : Sub Δ Γ} {δ : Sub Θ Δ} → t [ γ ∘ δ ] ＝ t [ γ ] [ δ ]

context extension 讓 contexts 是 natural number algebra

    _▹ : Con → Con

substitutions 是一串 terms，由組成元件表達

    _,,_ : {Δ Γ : Con} → Sub Δ Γ → Tm Δ → Sub Δ (Γ ▹)

有了 contexts 跟 substitutions，變數就可以定義成 De Bruijn indices:

0 = q
1 = q[p]
2 = q[p] [p]，以此類推

    p : {Γ : Con} → Sub (Γ ▹) Γ
    q : {Γ : Con} → Tm (Γ ▹)

應該滿足的等式規則

    ▹β₁ : {Δ Γ : Con} {γ : Sub Δ Γ} {t : Tm Δ}
      → p ∘ (γ ,, t) ＝ γ
    ▹β₂ : {Δ Γ : Con} {γ : Sub Δ Γ} {t : Tm Δ}
      → q [ γ ,, t ] ＝ t
    ▹η : {Δ Γ : Con} {σ : Sub Δ (Γ ▹)} → σ ＝ (p ∘ σ ,, q [ σ ])

    lam : {Γ : Con} → Tm (Γ ▹) → Tm Γ
    lam[] : {Δ Γ : Con} {γ : Sub Δ Γ} {t : Tm (Γ ▹)} → (lam t)[ γ ] ＝ lam (t [ γ ∘ p ,, q ])

    _·_ : {Γ : Con} → Tm Γ → Tm Γ → Tm Γ
    ·[] : {Δ Γ : Con} {γ : Sub Δ Γ} {t u : Tm Γ} → (t · u)[ γ ] ＝ t [ γ ] · (u [ γ ])

    β : {Δ Γ : Con} {γ : Sub Δ Γ} {t : Tm (Γ ▹)} {u : Tm Γ}
      → lam t · u ＝ t [ id ,, u ]

  infixl 40 _∘_
  infixl 30 _,,_
  infixl 40 _·_
  infixl 50 _[_]

Second-order generalised algebraic theories (SOGATs) [local-5]

SOGATs combine the two previous classes: sorts can be indexed over previous sorts and second-order operations are allowed.

Simply typed lambda calculus [ag-0007]

{-# OPTIONS --safe --without-K #-}
module ag-0007 where

open import MLTT.Spartan
open import UF.Equiv

record simply-typed-lambda-calculus : 𝓤₁ ̇  where
  field
    Ty : 𝓤₀ ̇
    _⇒_ : Ty → Ty → Ty

generalised 的部分是因為 index Tm by Ty

    Tm : Ty → 𝓤₀ ̇

綁定的部分

    lam : {A B : Ty} → (Tm A → Tm B) → Tm (A ⇒ B)
    _·_ : {A B : Ty} → Tm (A ⇒ B) → (Tm A → Tm B)

    stlc-cong : {A B : Ty} → Tm (A ⇒ B) ≃ (Tm A → Tm B)

一樣參照 first-order lambda calculus 的過程，加上 contexts、加上 substitutions，相應的 first-order 等式跟改寫，就會得到 simply-typed-lambda-calculus 的 GAT。

Minimal intuitionistic first-order logic (SOGAT) [ag-0008]

{-# OPTIONS --safe --without-K #-}
module ag-0008 where

open import MLTT.Spartan
open import UF.Subsingletons

record minimal-intuitionistic-first-order-logic : 𝓤₁ ̇  where
  field
    For : 𝓤₀ ̇
    Tm : 𝓤₀ ̇
    _⊃_ : For → For → For
    All : (Tm → For) → For
    Eq : Tm → Tm → For

    Pf : For → 𝓤₀ ̇
    Pf-is-prop : (A : For) → is-prop (Pf A)

    intro⊃ : {A B : For} → (Pf A → Pf B) → Pf (A ⊃ B)
    elim⊃ : {A B : For} → Pf (A ⊃ B) → (Pf A → Pf B)

    intro∀ : {A : Tm → For} → ((𝑡 : Tm) → Pf (A 𝑡)) → Pf (All A)
    elim∀ : {A : Tm → For} → Pf (All A) → ((𝑡 : Tm) → Pf (A 𝑡))

    introEq : {t : Tm} → Pf (Eq t t)
    elimEq : {t t' : Tm} → (A : Tm → For) → Pf (Eq t t') → Pf (A t) → Pf (A t')

一樣參照 first-order lambda calculus 的過程，加上 contexts、加上 substitutions，相應的 first-order 等式跟改寫。

Polymorphic lambda calculus (SOGAT) [ag-0009]

{-# OPTIONS --safe --without-K #-}
module ag-0009 where

open import MLTT.Spartan

record polymorphic-lambda-calculus : 𝓤₁ ̇  where
  field
    Ty : 𝓤₀ ̇
    Tm : Ty → 𝓤₀ ̇
    _⇒_ : Ty → Ty → Ty
    lam : {𝐴 𝐵 : Ty} → (Tm 𝐴 → Tm 𝐵) → Tm (𝐴 ⇒ 𝐵)
    _·_ : {𝐴 𝐵 : Ty} → Tm (𝐴 ⇒ 𝐵) → (Tm 𝐴 → Tm 𝐵)
    All : (Ty → Ty) → Ty
    Lam : {𝐴 : Ty → Ty} → ((𝑋 : Ty) → Tm (𝐴 𝑋)) → Tm (All 𝐴)
    _•_ : {𝐴 : Ty → Ty} → Tm (All 𝐴) → ((𝑋 : Ty) → Tm (𝐴 𝑋))

一樣參照 first-order lambda calculus 的過程，加上 contexts、加上 substitutions，相應的 first-order 等式跟改寫。

System Fω (SOGAT) [ag-000A]

{-# OPTIONS --safe --without-K #-}
module ag-000A where

open import MLTT.Spartan

record system-F-ω : 𝓤₁ ̇  where
  field
    □ : 𝓤₀ ̇
    Ty : □ → 𝓤₀ ̇

    _⇛_ : □ → □ → □
    LAM : {K L : □} → (Ty K → Ty L) → Ty (K ⇛ L)
    _●_ : {K L : □} → Ty (K ⇛ L) → (Ty K → Ty L)

    ∗ : □
    Tm : Ty ∗ → 𝓤₀ ̇

    All : {K : □} → (Ty K → Ty ∗) → Ty ∗
    Lam : {K : □}{A : Ty K → Ty ∗} → ((X : Ty K) → Tm (A X)) → Tm (All A)
    _•_ : {K : □}{A : Ty K → Ty ∗} → Tm (All A) → ((X : Ty K) → Tm (A X))

    _⇒_ : Ty ∗ → Ty ∗ → Ty ∗
    lam : {A B : Ty ∗} → (Tm A → Tm B) → Tm (A ⇒ B)
    _·_ : {A B : Ty ∗} → Tm (A ⇒ B) → (Tm A → Tm B)

System Fω 轉換後有

3 個 operations 綁定 Ty-variables
1 個 operation 綁定 term-variable

Minimal Martin-Löf type theory (SOGAT) [ag-000B]

{-# OPTIONS --safe --without-K #-}
module ag-000B where

open import MLTT.Spartan
open import MLTT.NaturalNumbers

variable
  𝑖 : ℕ

record minimal-martin-lof-type-theory : 𝓤₁ ̇  where
  field
    Ty : ℕ → 𝓤₀ ̇
    U : (𝑖 : ℕ) → Ty (succ 𝑖)
    Tm : Ty 𝑖 → 𝓤₀ ̇

    c : Ty 𝑖 → Tm (U 𝑖)
    El : Tm (U 𝑖) → Ty 𝑖

    PI : (A : Ty 𝑖) → (Tm A → Ty 𝑖) → Ty 𝑖
    Lift : Ty 𝑖 → Ty (succ 𝑖)

    lam : {𝐴 : Ty 𝑖}{𝐵 : Tm 𝐴 → Ty 𝑖} → ((𝑎 : Tm 𝐴) → Tm (𝐵 𝑎)) → Tm (PI 𝐴 𝐵)
    _·_ : {𝐴 : Ty 𝑖}{𝐵 : Tm 𝐴 → Ty 𝑖} → Tm (PI 𝐴 𝐵) → ((𝑎 : Tm 𝐴) → Tm (𝐵 𝑎))

    mk : {𝐴 : Ty 𝑖} → Tm 𝐴 → Tm (Lift 𝐴)
    un : {𝐴 : Ty 𝑖} → Tm (Lift 𝐴) → Tm 𝐴

這個理論的對應 GAT 得到一個具有族的範疇（CwF），更準確地說，是一個具有 N 個 families 的範疇，這些族配備了 familywise Π-types、宇宙以及族之間的一步向上提升。這些類型是 Ty : Con → N → Set 和 Tm : (Γ : Con) → Ty Γ 𝑖 → Set，其中 𝑖 論證隱含在後者中。

後面幾節

第三節 Theories of signatures as SOGATs，開始把 signatures 理論也寫成 SOGAT 來討論
第四節 Naive semantics of SOGAT signatures，討論 for any SOGAT signature 的 a notion of first-order model.
第五節 Direct semantics of SOGAT signatures，用一個更小心版本的 presheaf model 定義 first-order models of SOGATs
第六節 GAT signature semantics of SOGAT signatures，把 SOGAT signatures 轉成 GAT signatures 的方法