# Functional Hardware Description with Dependent Types Craig Ramsay November 2023 University of Strathclyde # The motivation (the problem & novelty) | | Paradigm | Typing<br>Discipline | Abstraction<br>Level | Hosting<br>Style | |-----------------|--------------------------|-------------------------------------------------|----------------------|--------------------------| | | | | | Traditional HDLs | | VHDL | Mixed /<br>Synchronous | Strong Typing | RTL | Stand-Alone | | Verilog | Mixed /<br>Synchronous | Weak Typing | RTL | Stand-Alone | | SystemVerilog | Mixed /<br>Synchronous | Strong Typing | RTL | Stand-Alone | | | | | High-Level Syn | thesis Languages | | Vivado HLS | Imperative | Strong Typing | Behavioural | Stand-Alone | | | | | | Functional HDLs | | Lava | Functional | Stronger Typing + | Gate | Embedded | | | | Hindley–Milner | | (Haskell) | | СλаЅН | Functional | Hindley–Milner Stronger Typing + Hindley–Milner | RTL | (Haskell)<br>Stand-Alone | | С\aSH<br>П-ware | Functional<br>Functional | Stronger Typing + | RTL<br>Gate | | ## Chapter 4 contribution A CλaSH case study of an application well-suited for EDSLs: - Motivates first-class staging - Motivates dependent types for ergonomics and verification - Open source low-cost, high-speed, parallel filters for direct RF sampling # The what (technical discussion) ## We explore an HDL that can: Represent circuits as plain functions (needs a stand-alone *compiler*) Ascribe meaning to synthesisable data types (needs a language with dependent types) ``` -- Unsigned binary addition in simulation addU : (w.x.v.c : Nat) Unsigned w x \rightarrow Unsigned w v \rightarrow Bit c \rightarrow Unsigned (S w) (plus c (plus x v)) pat c. cin \Rightarrow addU 0 0 0 c UNil UNil cin = UCons 0 c UNil cin pat w, c, xsn, xn, xbs, xb, ysn, yn, ybs, yb, cin ⇒ addU (S w) _ _ c (UCons w xsn xn xbs xb) (UCons w vsn vn vbs vb) cin = case (addBit _ _ _ cin xb yb) of pat a, b, prf, cin', lsb ⇒ (MkBitPair a b _ prf cin' lsb) ⇒ let rec = addU _ _ _ xbs ybs cin' ans = UCons _ _ rec lsb in eq1nd2 _ _ _ ans ``` Similar to Brady's proposal in "Constructing correct circuits", 2007. But what about synthesis? ``` -- Unsigned binary addition in simulation addU : (w.x.v.c : Nat) Unsigned w x \rightarrow Unsigned w v \rightarrow Bit c \rightarrow Unsigned (S w) (plus c (plus x v)) pat c, cin \Rightarrow addU 0 0 0 c UNil UNil cin = UCons 0 c UNil cin pat w, c, xsn, xn, xbs, xb, ysn, yn, ybs, yb, cin ⇒ addU (S w) _ _ c (UCons w xsn xn xbs xb) (UCons w vsn vn vbs vb) cin = case (addBit _ _ _ cin xb yb) of pat a, b, prf, cin', lsb ⇒ (MkBitPair a b _ prf cin' lsb) ⇒ let rec = addU _ _ _ xbs vbs cin' ans = UCons _ _ rec lsb in eq1nd2 _ _ _ ans ``` Goal is to safely reason about: circuit runtime vs elaboration-time vs typechecking only? ``` -- Unsigned binary addition in simulation addU : (w.x.v.c : Nat) Unsigned w x \rightarrow Unsigned w v \rightarrow Bit c \rightarrow Unsigned (S w) (plus c (plus x v)) pat c. cin ⇒ addU 0 0 0 c UNil UNil cin = UCons 0 c UNil cin \underline{pat} w, c, xsn, xn, xbs, xb, ysn, yn, ybs, yb, cin \Rightarrow addU (S w) _ _ c (UCons w xsn xn xbs xb) (UCons w vsn vn vbs vb) cin = case (addBit _ _ _ cin xb yb) of pat a, b, prf, cin', lsb ⇒ (MkBitPair a b _ prf cin' lsb) ⇒ let rec = addU _ _ _ xbs ybs cin' ans = UCons _ _ rec lsb in eq1nd2 _ _ _ ans ``` ### Chapter 5 contribution Further investigation of use cases for dependently typed HDLs, representing circuits as functions: - Minimal type complexity enjoy a single language for entire circuit lifetime - Moderate type complexity enjoy tracking and informing non-functional circuit properties at compile-time - Full functional verification Scales well for combinatorial DSP implementations, and shows promise for synchronous circuits. Take a small dependently typed software language, TinyIdris, then layer our experimental features on top. ## Our features: #### Erasure Distinguish typechecking time vs rest Also applies to *data* # Staging Distinguish elaboration vs circuit run-times # Synthesis Derive bit representations for user types Perform elaboration ## Chapter 6 contribution - toatie: an open source implementation for combinatorial circuits - Phases of circuit lifetime are the challenge - Two features used as software optimisation become necessary for an HDL - Synthesis can lean on two existing parts of dependently typed compilers ## Frasure Discard terms only needed during typechecking We use irrelevance (with ICC\*) to direct erasure absolutely Typechecker prevents path from *Irrelevant* back to *Relevant* $$\frac{\Gamma \vdash (\Pi\{x:S\} \to \mathcal{T}) : \mathsf{Type} \qquad \Gamma; \lambda\{x:S\} \vdash e:\mathcal{T} \qquad x \not\in \mathsf{FV}(\mathcal{E}[\![e]\!])}{\Gamma \vdash (\lambda\{x:S\}, e) : \Pi\{x:S\} \to \mathcal{T}}$$ ## Erasure After typechecking, ICC\*'s extraction performs erasure $$\begin{array}{lll} \mathcal{E} \llbracket x \rrbracket & = x & \text{(variables)} \\ \mathcal{E} \llbracket \Pi (x : S) \to T \rrbracket & = \Pi (x : \mathcal{E} \llbracket S \rrbracket) \to \mathcal{E} \llbracket T \rrbracket & \text{(Explicit } \Pi) \\ \mathcal{E} \llbracket \Pi \{x : S\} \to T \rrbracket & = \forall (x : \mathcal{E} \llbracket S \rrbracket) \to \mathcal{E} \llbracket T \rrbracket & \text{(Implicit } \Pi) \\ \mathcal{E} \llbracket \lambda (x : S). \ e \rrbracket & = \lambda (x : \mathcal{E} \llbracket S \rrbracket). \ \mathcal{E} \llbracket e \rrbracket & \text{(Explicit } \lambda) \\ \mathcal{E} \llbracket \lambda \{x : S\}. \ e \rrbracket & = \mathcal{E} \llbracket e \rrbracket & \text{(Implicit } \lambda) \\ \mathcal{E} \llbracket e \ u \rrbracket & = \mathcal{E} \llbracket e \rrbracket & \text{(Explicit application)} \\ \mathcal{E} \llbracket e \ u \rrbracket & = \mathcal{E} \llbracket e \rrbracket & \text{(Implicit application)} \\ \mathcal{E} \llbracket e \ u \rrbracket & = \mathcal{E} \llbracket e \rrbracket & \text{(Implicit application)} \\ \end{array}$$ # Staging Staging distinguishes elaboration and circuit run-time Ensures elaboration can complete *without* inspecting any circuit run-time values Uses the [...], $\sim$ , !, and $\langle ... \rangle$ syntax # Staging Typechecker extensions ensure consistient use Prevents values known only at circuit run-time from being used during elaboration time $$\frac{(\lambda x :_{n} S) \in \Gamma \qquad n \leq m}{\Gamma_{m} \vdash x : S}$$ (var <sub>$\lambda$</sub> ) # Automatic bit representations Since we represent circuits as plain functions, we need a way to synthesise user data types into bit representations Let's reuse the already required unification engine for help # Automatic bit representations ## Normalisation to a netlist We reuse the normalisation (by evaluation) system also required in typechecking to normalise down to a tiny language which is circuit-friendly: The *future* (what's left to do?) ### Further work - Support for synchronous logic - ...and its place in our correct-by-construction verification - · A fully-typed synthesis scheme - · A formalisation of synthesisability requirements - · A rebase on Idris 2 - Netlist optimisations for vendor tools # The impact (outputs and more contributions) ## Outputs Exploring Zynq MPSoC: With PYNQ and Machine Learning Applications L. Crockett, D. Northcote, C. Ramsay, F. Robinson, and R. Stewart Strathclyde Academic Media, Book — 2019 Control and Visualisation of a Software Defined Radio System on the Xilinx RFSoC Platform Using the PYNQ Framework J. Goldsmith, C. Ramsay, D. Northcote, K. W. Barlee, L. Crockett, and R. Stewart IEEE Open Access, Journal paper — 2020 · On Applications of Dependent Types to Parameterised Digital Signal Processing Circuits C. Ramsay, L. Crockett, and R. Stewart 2021 IEEE MWSCAS, Conference paper - 2021 · Low-cost, High-speed Parallel FIR Filters for RFSoC Front-Ends Enabled by CλaSH C. Ramsay, L. Crockett, and R. Stewart IEEE Asilomar, Conference paper — 2021 Data for toatie— A Hardware Description Language With Dependent Types C. Ramsay, L. Crockett, and R. Stewart Self-published, Digital artefact — 2022 ## HAFLANG an EPSRC project for the Hardware Acceleration of Functional Languagues haflang.github.io Appendix ``` -- Unsigned binary addition in simulation addU : (w.x.v.c : Nat) Unsigned w x \rightarrow Unsigned w v \rightarrow Bit c \rightarrow Unsigned (S w) (plus c (plus x v)) pat c, cin \Rightarrow addU 0 0 0 c UNil UNil cin = UCons 0 c UNil cin pat w, c, xsn, xn, xbs, xb, ysn, yn, ybs, yb, cin ⇒ addU (S w) _ _ c (UCons w xsn xn xbs xb) (UCons w vsn vn vbs vb) cin = case (addBit _ _ _ cin xb yb) of pat a, b, prf, cin', lsb ⇒ (MkBitPair a b _ prf cin' lsb) ⇒ let rec = addU _ _ _ xbs ybs cin' ans = UCons _ _ rec lsb in eq1nd2 _ _ _ ans ``` Let's subdue the "noise" ``` -- Unsigned binary addition in simulation addU : (w.x.v.c : Nat) Unsigned w x \rightarrow Unsigned w v \rightarrow Bit c \rightarrow Unsigned (S w) (plus c (plus x v)) pat c. cin \Rightarrow addU 0 0 0 c UNil UNil cin = UCons 0 c UNil cin pat w, c, xsn, xn, xbs, xb, ysn, yn, ybs, yb, cin ⇒ addU (S w) _ _ c (UCons w xsn xn xbs xb) (UCons w vsn vn vbs vb) cin = case (addBit _ _ _ cin xb yb) of pat a, b, prf, cin', lsb ⇒ (MkBitPair a b _ prf cin' lsb) ⇒ let rec = addU _ _ _ xbs ybs cin' ans = UCons _ _ rec lsb in eq1nd2 _ _ _ ans ``` What do the types guarantee? What do they not guarantee? ``` -- Unsigned binary addition in simulation addU : (w.x.v.c : Nat) Unsigned w x \rightarrow Unsigned w v \rightarrow Bit c \rightarrow Unsigned (S w) (plus c (plus x v)) pat c. cin \Rightarrow addU 0 0 0 c UNil UNil cin = UCons 0 c UNil cin pat w, c, xsn, xn, xbs, xb, ysn, yn, ybs, yb, cin ⇒ addU (S w) _ _ c (UCons w xsn xn xbs xb) (UCons w vsn vn vbs vb) cin = case (addBit _ _ _ cin xb yb) of pat a, b, prf, cin', lsb ⇒ (MkBitPair a b _ prf cin' lsb) ⇒ let rec = addU _ _ _ xbs ybs cin' ans = UCons _ _ rec lsb in eq1nd2 _ _ _ ans ``` Data constructors for synthesisable types have non-synthesisable arguments! ``` -- Unsigned binary addition in simulation addU : (w.x.v.c : Nat) Unsigned w x \rightarrow Unsigned w v \rightarrow Bit c \rightarrow Unsigned (S w) (plus c (plus x v)) pat c. cin \Rightarrow addU 0 0 0 c UNil UNil cin = UCons 0 c UNil cin pat w, c, xsn, xn, xbs, xb, ysn, yn, ybs, yb, cin ⇒ addU (S w) _ _ c (UCons w xsn xn xbs xb) (UCons w vsn vn vbs vb) cin = case (addBit _ _ _ cin xb yb) of pat a, b, prf, cin', lsb ⇒ (MkBitPair a b _ prf cin' lsb) ⇒ let rec = addU _ _ _ xbs ybs cin' ans = UCons _ _ rec lsb in eq1nd2 _ _ _ ans ``` How do we know which function arguments must be applied before the function becomes synthesisable? ``` -- Unsigned binary addition in simulation addU : (w.x.v.c : Nat) Unsigned w x \rightarrow Unsigned w v \rightarrow Bit c \rightarrow Unsigned (S w) (plus c (plus x v)) pat c, cin \Rightarrow addU 0 0 0 c UNil UNIl cin = UCons 0 c UNil cin pat w, c, xsn, xn, xbs, xb, ysn, yn, ybs, yb, cin ⇒ addU (S w) _ _ c (UCons w xsn xn xbs xb) (UCons w vsn vn vbs vb) cin = case (addBit _ _ _ cin xb yb) of pat a, b, prf, cin', lsb ⇒ (MkBitPair a b _ prf cin' lsb) ⇒ let rec = addU _ _ _ xbs ybs cin' ans = UCons _ _ rec lsb in eq1nd2 _ _ _ ans ``` We pattern match on circuit runtime values... Is this is irrefutably OK? Can elaboration complete? ``` -- Unsigned binary addition in simulation addU : (w.x.v.c : Nat) Unsigned w x \rightarrow Unsigned w v \rightarrow Bit c \rightarrow Unsigned (S w) (plus c (plus x v)) pat c, cin \Rightarrow addU 0 0 0 c UNil UNil cin = UCons 0 c UNil cin pat w, c, xsn, xn, xbs, xb, ysn, yn, ybs, yb, cin ⇒ addU (S w) _ _ c (UCons w xsn xn xbs xb) (UCons w vsn vn vbs vb) cin = case (addBit _ _ _ cin xb yb) of pat a, b, prf, cin', lsb ⇒ (MkBitPair a b _ prf cin' lsb) ⇒ let rec = addU _ _ _ xbs vbs cin' ans = UCons _ _ rec lsb in eq1nd2 _ _ _ ans ``` Goal is to safely reason about: circuit runtime vs elaboration-time vs typechecking only? ## Dangers in unsigned binary addition — reprise ``` -- Unsigned binary addition (in toatie) addU : (w : Nat) \rightarrow {x,y,c : Nat} \rightarrow \langle Unsigned w x \rangle \rightarrow \langle Unsigned w y \rangle \rightarrow \langle Bit c \rangle \rightarrow ( Unsigned (S w) (plus c (plus x v)) ) pat c, cin \Rightarrow addU 0 {0} {0} {c} [UNil] [UNil] cin = ¶ UCons { } {0} {c} UNil ~cin ■ pat w, c, xsn, xn, xbs, xb, ysn, yn, ybs, yb, cin ⇒ UCons {w} {ysn} {yn} ybs yb ] case (addBit { } { } ~ cin xb vb) of pat a, b, prf, cin', lsb \Rightarrow (MkBitPair {a} {b} {_} {prf} cin' lsb) \Longrightarrow let rec = ~(addU _ {_} {_} {__} [xbs] [ybs] [cin']) ans = UCons \{ \} \{ \} \{ \} rec Lsb in eqInd2 {_} {_} {_}} {prfAddU c xn yn a b xsn ysn prf} \{\lambda h \Rightarrow Unsigned (S (S w)) h\} ans ``` Data constructors for synthesisable types have non-synthesisable arguments! How do we know which function arguments must be applied before the function becomes synthesisable? We pattern match on circuit runtime values... ## Dangers in unsigned binary addition — reprise ``` -- Unsigned binary addition (in toatie) addU : (w : Nat) \rightarrow \{x,y,c : Nat\} \rightarrow \langle Unsigned w x \rangle \rightarrow \langle Unsigned w y \rangle \rightarrow \langle Bit c \rangle \rightarrow ( Unsigned (S w) (plus c (plus x v)) ) pat c, cin \Rightarrow addU 0 {0} {0} {c} [UNil] [UNil] cin = ¶ UCons { } {0} {c} UNil ~cin ■ \underline{pat} w, c, xsn, xn, xbs, xb, ysn, yn, ybs, yb, cin \Rightarrow UCons {w} {ysn} {yn} ybs yb ] case (addBit { } { } ~cin xb vb) of pat a, b, prf, cin', lsb \Rightarrow (MkBitPair {a} {b} {_} {prf} cin' lsb) \Longrightarrow let rec = ~(addU _ {_} {_} {__} [xbs] [ybs] [cin']) ans = UCons \{ \} \{ \} \{ \} rec lsb in ealnd2 { } { } { } \{\lambda h \Rightarrow Unsigned (S (S w)) h\} ans ``` Data constructors for synthesisable types have non-synthesisable arguments! How do we know which function arguments must be applied before the function becomes synthesisable? We pattern match on circuit runtime values... ## Normalisation to a netlist We reuse the normalisation (by evaluation) system also required in typechecking Our irrelevance and staging annotations mean we can hopefully normalise down to a tiny language which is circuit-friendly: | $ au$ , $\sigma ::= D_S \overline{\tau}$ | Types<br>Fully applied simple type | |----------------------------------------------------------------------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------| | $a := x \\ C_S \overline{a} $ | Argument expressions<br>Local variable<br>Fully applied simple data constructor | | $\begin{array}{l} e ::= \\ \times \\ \mid C_S \ \overline{a} \\ \mid case \times of \ \overline{alt} \ [default \ a] \\ \mid \pi_i^C \times \end{array}$ | Subexpressions<br>Local variable<br>Fully applied simple data constructor<br>Case with optional default<br>Projection | | $\mathit{alt} ::= \mathit{C}_{S} \; \overline{\mathit{x}} \rightarrow \mathit{a}$ | Alternatives | | $g::=\lambda\overline{x: au}$ . let $\overline{y:\sigma\mapsto\overline{e}}$ in $z$ | Top-level circuit |