Lambda Buffers introduction


LambdaBuffers is a schema language (similar to ProtoBuffers, ADL, ASN.1, JSON Schema, etc.) and associated code generation toolkit. The goal of this project is to provide developers with tools to define algebraic data types in a language-agnostic format, so that shared data types can be declared in one place while maintaining compatibility across a plethora of supported languages.

Users may refer to the comparison matrix for an in-depth comparison of LambdaBuffers' features against the feature-set of other popular schema-languages.

At a glance, you may wish to choose LambdaBuffers instead of one of its competitors if your project requires:

  1. Parameterized Data Types (aka. type functions): Unlike ProtoBuffers or JSON Schema, LambdaBuffers allows users to define algebraic data types which take type variable arguments. If your project's domain is most accurately represented by parameterized data types, LambdaBuffers may be a good choice for your needs.

  2. Opaque Types: Almost every competing schema language provides users a fixed set of builtin or primitive types, which are handled in a special manner by the code generation and cannot be extended. LambdaBuffers, by contrast, allows users to add their own builtin types and extend the existing code generation framework to handle those builtins in a manner intended by the users. There are no special primitive types in LambdaBuffers; a user-defined primitive type is defined in exactly the same way (i.e. as an opaque type) as a LambdaBuffers "builtin".

  3. Typeclass Support: While nearly every schema language supports generating type definitions in supported target languages, to our knowledge no schema language supports generating commonly used functions that operate on those types. Unlike other schema languages, LambdaBuffers supports code generation for typeclass instances (or the equivalent in languages that lack support for typeclasses) to reduce the amount of boilerplate required to productively make use of the generated types. While LambdaBuffers is still a work-in-progress, we expect that, upon completion, an extensive test suite will provide a high degree of assurance that the instances/methods generated by the LambdaBuffers code generator behave identically across all supported target languages.

Using LambdaBuffers

LambdaBuffers heavily relies on Nix to orchestrate code generation and dependency management. If you don't have Nix installed with flakes enabled, please refer to the Getting Started guide.

To set up a project with LambdaBuffers and a multi-language stack, refer to our demo project at lambda-for-cardano.

To see how LambdaBuffers code is transpiled into other languages, see:

LambdaBuffers to Haskell

Let's take a look at how LambdaBuffers modules map into Haskell modules and how LambdaBuffers type definitions map into Haskell type definitions.

We'll use the lbf-prelude-to-haskell CLI tool which is just a convenient wrapper over the raw lbf CLI. We can get this tool by either loading the LambdaBuffers Nix environment that comes packaged with all the CLI tools:

$ nix develop github:mlabs-haskell/lambda-buffers#lb
$ lbf<tab>
lbf                        lbf-plutus-to-purescript   lbf-prelude-to-purescript
lbf-plutus-to-haskell      lbf-prelude-to-haskell

Or we can refer directly to the lbf-prelude-to-haskell CLI via nix run github:mlabs-haskell/lambda-buffers#lbf-prelude-to-haskell.

In this chapter, we're going to use the latter option.

Let's now use lbf-prelude-to-haskell to process the Document.lbf schema.

module Document

-- Importing types
import Prelude (Text, List, Set, Bytes)

-- Author
sum Author = Ivan | Jovan | Savo

-- Reviewer
sum Reviewer = Bob | Alice

-- Document
record Document a = {
  author : Author,
  reviewers : Set Reviewer,
  content : Chapter a
 }

-- Chapter
record Chapter a = {
  content : a,
  subChapters : List (Chapter a)
 }

-- Some actual content
sum RichContent = Image Bytes | Gif Bytes | Text Text

-- Rich document
prod RichDocument = (Document RichContent)
$ nix run github:mlabs-haskell/lambda-buffers#lbf-prelude-to-haskell -- Document.lbf
$ find autogen/
autogen/
autogen/LambdaBuffers
autogen/LambdaBuffers/Document.hs
autogen/build.json

As we can see, an autogen directory has been created containing the generated Haskell modules. Note the autogen/build.json file: it contains all the Hackage dependencies the generated module needs in order to be compiled by GHC.

The outputted Haskell module in autogen/LambdaBuffers/Document.hs:

module LambdaBuffers.Document (Author(..)
                              , Chapter(..)
                              , Document(..)
                              , Reviewer(..)
                              , RichContent(..)
                              , RichDocument(..)) where

import qualified LambdaBuffers.Prelude
import qualified Prelude


data Author = Author'Ivan  | Author'Jovan  | Author'Savo  deriving Prelude.Show

data Chapter a = Chapter { chapter'content :: a
                         , chapter'subChapters :: LambdaBuffers.Prelude.List (Chapter a)} deriving Prelude.Show

data Document a = Document { document'author :: Author
                           , document'reviewers :: LambdaBuffers.Prelude.Set Reviewer
                           , document'content :: Chapter a} deriving Prelude.Show

data Reviewer = Reviewer'Bob  | Reviewer'Alice  deriving Prelude.Show

data RichContent = RichContent'Image LambdaBuffers.Prelude.Bytes
                    | RichContent'Gif LambdaBuffers.Prelude.Bytes
                    | RichContent'Text LambdaBuffers.Prelude.Text deriving Prelude.Show

newtype RichDocument = RichDocument (Document RichContent) deriving Prelude.Show

We can compile the code with the following commands. Note the dev shell dev-prelude-haskell as it includes the LambdaBuffers.Prelude dependency.

$ nix develop github:mlabs-haskell/lambda-buffers#dev-prelude-haskell
$ ghc autogen/LambdaBuffers/Document.hs
[1 of 1] Compiling LambdaBuffers.Document ( autogen/LambdaBuffers/Document.hs, autogen/LambdaBuffers/Document.o )

Sum types

The types Author, Reviewer, and RichContent have been declared as sum types in the LambdaBuffers schema using the sum keyword.

As we can see, there's nothing too surprising here: all the sum types become data declarations in Haskell.

The only thing to notice is that the type name, followed by a single quote ('), is prepended to each constructor name to make sure constructor names are unique.

Product types

The type RichDocument has been declared as a product type in the LambdaBuffers schema using the prod keyword.

Product types become a Haskell newtype if they have a single type in their body; otherwise they become data.

Note that the constructor has the same name as the type.
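
RichDocument only exercises the single-type case. As a sketch (the Person type below is hypothetical and not part of the Document.lbf schema), a multi-field product would, following the rules above, plausibly come out as a plain data declaration:

-- Hypothetical LambdaBuffers source (not in Document.lbf):
--   prod Person = Text Bytes
-- A sketch of the Haskell the rules above suggest:
import qualified LambdaBuffers.Prelude
import qualified Prelude

data Person = Person LambdaBuffers.Prelude.Text LambdaBuffers.Prelude.Bytes deriving Prelude.Show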

Record types

The types Document and Chapter have been declared as record types in the LambdaBuffers schema using the record keyword.

Like product types, they become a Haskell newtype if they have a single type in their body; otherwise they become data.

Also like with product types, the constructor has the same name as the type.

Field names, much like sum constructor names, are prefixed with the lowercased type name and a single quote (') to keep them unique.

LambdaBuffers to PlutusTx

Let's take a look at how LambdaBuffers modules map into PlutusTx modules, and how LambdaBuffers type definitions map into PlutusTx type definitions.

We'll use the lbf-plutus-to-plutustx CLI tool which is just a convenient wrapper over the raw lbf CLI. One can get this tool by either loading the LambdaBuffers Nix environment that comes packaged with all the CLI tools:

$ nix develop github:mlabs-haskell/lambda-buffers#lb
$ lbf<tab>
lbf                          lbf-plutus-to-haskell        lbf-plutus-to-purescript     lbf-plutus-to-typescript     lbf-prelude-to-purescript    lbf-prelude-to-typescript
lbf-list-modules-typescript  lbf-plutus-to-plutustx       lbf-plutus-to-rust           lbf-prelude-to-haskell       lbf-prelude-to-rust

Alternatively, one can simply refer directly to the lbf-plutus-to-plutustx CLI by executing nix run github:mlabs-haskell/lambda-buffers#lbf-plutus-to-plutustx; or for a more interactive experience, one can enter the development shell nix develop github:mlabs-haskell/lambda-buffers#dev-plutus-haskell which includes the lbf-plutus-to-plutustx executable along with a version of GHC with the required package set to compile the autogenerated PlutusTx projects.

Code generation

Because PlutusTx is a GHC plug-in, the mapping from LambdaBuffers modules to PlutusTx modules closely mirrors the LambdaBuffers to Haskell mapping described in the Haskell chapter. The key points are as follows.

  • LambdaBuffers' data types (sums, products, and records) are mapped similarly to the Haskell backend.

  • The LambdaBuffers' Prelude.Eq instances are mapped to PlutusTx.Eq.Eq instances.

  • The LambdaBuffers' Plutus.V1.PlutusData instances are mapped to a PlutusTx.ToData instance and a PlutusTx.FromData instance.

  • The opaque types in LambdaBuffers' Prelude are often mapped to the corresponding PlutusTx Prelude type e.g. LambdaBuffers' Prelude's opaque Prelude.Map type is mapped to PlutusTx.AssocMap. The full mapping can be found here.

  • The opaque types in LambdaBuffers' Plutus.V1 and Plutus.V2 modules are mapped to the obvious corresponding Plutus Ledger API types. The full mapping can be found here.

  • A LambdaBuffers module, say <lb-module>, is translated to the PlutusTx module LambdaBuffers.<lb-module>.PlutusTx.

Example code generation

In the following example, we'll enter a development shell with the lbf-plutus-to-plutustx executable, create an example LambdaBuffers schema file, use the PlutusTx backend to compile it, inspect the generated code and files, and compile the autogenerated code with GHCi.

$ nix develop github:mlabs-haskell/lambda-buffers#dev-plutus-haskell
$ cat > Example.lbf
module Example

import Prelude
import Plutus.V1 (PlutusData, AssetClass)

record Example a = {
  foo : AssetClass,
  bar : a
  }

derive Eq (Example a)
derive PlutusData (Example a)

$ lbf-plutus-to-plutustx Example.lbf
Success from: /nix/store/dcwir06wiq7pd3g8dfljhfxscm8yss94-lambda-buffers-compiler-exe-lbc-0.1.0.0/bin/lbc compile --input-file .work/compiler-input.pb --output-file .work/compiler-output.pb [INFO]
Compilation OK [INFO]
Success from: /nix/store/sqqsg2jx871saxfxj2mly5g4pd6qsb64-lbg-plutustx/bin/lbg-plutustx --input .work/codegen-input.pb --output .work/codegen-output.pb --gen-dir autogen --gen-class Prelude.Eq --gen-class Plutus.V1.PlutusData '--config=/nix/store/x5f8a22dwx008wlv24xf8a6mdm65flil-codegen-configs/plutustx-prelude.json' '--config=/nix/store/x5f8a22dwx008wlv24xf8a6mdm65flil-codegen-configs/plutustx-plutus.json' Example [INFO]
Codegen OK [INFO]

$ find autogen
autogen/
autogen/LambdaBuffers
autogen/LambdaBuffers/Example
autogen/LambdaBuffers/Example/PlutusTx.hs
autogen/build.json

$ cat autogen/LambdaBuffers/Example/PlutusTx.hs

{-# LANGUAGE NoImplicitPrelude,NoPolyKinds #-}
{-# OPTIONS_GHC -fno-ignore-interface-pragmas -fno-omit-interface-pragmas -fno-specialise -fno-strictness -fobject-code #-}
module LambdaBuffers.Example.PlutusTx (Example(..)) where

import qualified LambdaBuffers.Plutus.V1.PlutusTx
import qualified LambdaBuffers.Prelude.PlutusTx
import qualified LambdaBuffers.Runtime.PlutusTx.LamVal
import qualified PlutusTx
import qualified PlutusTx.Bool
import qualified PlutusTx.Builtins
import qualified PlutusTx.Eq
import qualified PlutusTx.Maybe
import qualified PlutusTx.Prelude
import qualified Prelude


data Example a = Example { example'foo :: LambdaBuffers.Plutus.V1.PlutusTx.AssetClass
                         , example'bar :: a} deriving Prelude.Show


instance (PlutusTx.Eq.Eq a) => PlutusTx.Eq.Eq (Example a) where
  {-# INLINABLE (==) #-}
  (==) = (\x0 -> (\x1 -> (PlutusTx.Bool.&&) ((PlutusTx.Eq.==) (example'foo x0) (example'foo x1)) ((PlutusTx.Eq.==) (example'bar x0) (example'bar x1)) ) )

instance (PlutusTx.ToData a) => PlutusTx.ToData (Example a) where
  {-# INLINABLE toBuiltinData #-}
  toBuiltinData = (\x0 -> PlutusTx.Builtins.mkList ([PlutusTx.toBuiltinData (example'foo x0)
                                                     , PlutusTx.toBuiltinData (example'bar x0)]) )

instance (PlutusTx.FromData a) => PlutusTx.FromData (Example a) where
  {-# INLINABLE fromBuiltinData #-}
  fromBuiltinData = (\x0 -> LambdaBuffers.Runtime.PlutusTx.LamVal.casePlutusData ((\x1 -> (\x2 -> PlutusTx.Maybe.Nothing ) )) ((\x3 -> case x3 of
                                                                                                                                         [x4
                                                                                                                                          , x5] -> (PlutusTx.Prelude.>>=) (PlutusTx.fromBuiltinData (x4)) ((\x6 -> (PlutusTx.Prelude.>>=) (PlutusTx.fromBuiltinData (x5)) ((\x7 -> PlutusTx.Maybe.Just (Example { example'foo = x6
                                                                                                                                                                                                                                                                                                                , example'bar = x7 }) )) ))
                                                                                                                                         x8 -> PlutusTx.Maybe.Nothing )) ((\x9 -> PlutusTx.Maybe.Nothing )) ((\x10 -> PlutusTx.Maybe.Nothing )) (x0) )

$ ghci autogen/LambdaBuffers/Example/PlutusTx.hs
...

Example project

A complete example project using the generated PlutusTx modules can be found here.

LambdaBuffers to Purescript

Let's take a look at how LambdaBuffers modules map into Purescript modules and how LambdaBuffers type definitions map into Purescript type definitions.

We'll use the lbf-prelude-to-purescript CLI tool which is just a convenient wrapper over the raw lbf CLI. We can get this tool by either loading the LambdaBuffers Nix environment that comes packaged with all the CLI tools:

$ nix develop github:mlabs-haskell/lambda-buffers#lb
$ lbf<tab>
lbf                        lbf-plutus-to-purescript   lbf-prelude-to-purescript
lbf-plutus-to-haskell      lbf-prelude-to-haskell

Or we can refer directly to the lbf-prelude-to-purescript CLI via nix run github:mlabs-haskell/lambda-buffers#lbf-prelude-to-purescript.

In this chapter, we're going to use the latter option.

Let's now use lbf-prelude-to-purescript to process the Document.lbf schema.

module Document

-- Importing types
import Prelude (Text, List, Set, Bytes)

-- Author
sum Author = Ivan | Jovan | Savo

-- Reviewer
sum Reviewer = Bob | Alice

-- Document
record Document a = {
  author : Author,
  reviewers : Set Reviewer,
  content : Chapter a
 }

-- Chapter
record Chapter a = {
  content : a,
  subChapters : List (Chapter a)
 }

-- Some actual content
sum RichContent = Image Bytes | Gif Bytes | Text Text

-- Rich document
prod RichDocument = (Document RichContent)
$ nix run github:mlabs-haskell/lambda-buffers#lbf-prelude-to-purescript -- Document.lbf
$ find autogen/
autogen/
autogen/build.json
autogen/LambdaBuffers
autogen/LambdaBuffers/Document.purs

As we can see, an autogen directory has been created containing the generated Purescript modules. Note the autogen/build.json file: it contains all the dependencies the generated module needs in order to be compiled by the purs compiler.

The outputted Purescript module in autogen/LambdaBuffers/Document.purs:

module LambdaBuffers.Document (Author(..)
                              , Chapter(..)
                              , Document(..)
                              , Reviewer(..)
                              , RichContent(..)
                              , RichDocument(..)) where

import LambdaBuffers.Prelude as LambdaBuffers.Prelude
import Data.Generic.Rep as Data.Generic.Rep
import Data.Newtype as Data.Newtype
import Data.Show as Data.Show
import Data.Show.Generic as Data.Show.Generic


data Author = Author'Ivan  | Author'Jovan  | Author'Savo 
derive instance Data.Generic.Rep.Generic Author _
instance Data.Show.Show Author where
  show = Data.Show.Generic.genericShow

newtype Chapter a = Chapter { content :: a
                            , subChapters :: LambdaBuffers.Prelude.List (Chapter a)}
derive instance Data.Newtype.Newtype (Chapter a) _
derive instance Data.Generic.Rep.Generic (Chapter a) _
instance (Data.Show.Show a) => Data.Show.Show (Chapter a) where
  show = Data.Show.Generic.genericShow

newtype Document a = Document { author :: Author
                              , reviewers :: LambdaBuffers.Prelude.Set Reviewer
                              , content :: Chapter a}
derive instance Data.Newtype.Newtype (Document a) _
derive instance Data.Generic.Rep.Generic (Document a) _
instance (Data.Show.Show a) => Data.Show.Show (Document a) where
  show = Data.Show.Generic.genericShow

data Reviewer = Reviewer'Bob  | Reviewer'Alice 
derive instance Data.Generic.Rep.Generic Reviewer _
instance Data.Show.Show Reviewer where
  show = Data.Show.Generic.genericShow

data RichContent = RichContent'Image LambdaBuffers.Prelude.Bytes
                    | RichContent'Gif LambdaBuffers.Prelude.Bytes
                    | RichContent'Text LambdaBuffers.Prelude.Text
derive instance Data.Generic.Rep.Generic RichContent _
instance Data.Show.Show RichContent where
  show = Data.Show.Generic.genericShow

newtype RichDocument = RichDocument (Document RichContent)
derive instance Data.Newtype.Newtype RichDocument _
derive instance Data.Generic.Rep.Generic RichDocument _
instance Data.Show.Show RichDocument where
  show = Data.Show.Generic.genericShow

Sum types

The types Author, Reviewer, and RichContent have been declared as sum types in the LambdaBuffers schema using the sum keyword.

As we can see, there's nothing too surprising here: all the sum types become data declarations in Purescript.

The only thing to notice is that the type name, followed by a single quote ('), is prepended to each constructor name to make sure constructor names are unique.

Product types

The type RichDocument has been declared as a product type in the LambdaBuffers schema using the prod keyword.

Product types become a Purescript newtype if they have a single type in their body; otherwise they become data.

Note that the constructor has the same name as the type.

Record types

The types Document and Chapter have been declared as record types in the LambdaBuffers schema using the record keyword.

They always become a Purescript newtype wrapping a Purescript record type whose fields are named exactly as they are in the LambdaBuffers source module.

Also like with product types, the constructor has the same name as the type.

LambdaBuffers for Plutarch

Plutarch is a typed eDSL in Haskell for writing efficient Plutus Core validators.

LambdaBuffers creates Plutarch type definitions and associated Plutarch type class implementations for PlutusType, PIsData and PShow classes.

Additionally, when instructed by a LambdaBuffers derive statement, type class implementations for PEq and PTryFrom are also printed.

A small example:

❯ nix develop github:mlabs-haskell/lambda-buffers#dev-plutarch
❯ cat > Example.lbf
module Example

import Prelude
import Plutus.V1 (PlutusData, AssetClass)

record Example a = {
  foo : AssetClass,
  bar : a
  }

derive Eq (Example a)
derive Json (Example a)
derive PlutusData (Example a)

❯ lbf-plutus-to-plutarch Example.lbf
[lbf][INFO] Compilation OK
[lbf][INFO] Codegen OK

❯ find autogen/
autogen/
autogen/build.json
autogen/LambdaBuffers
autogen/LambdaBuffers/Example
autogen/LambdaBuffers/Example/Plutarch.hs

For a full example see Example.

LambdaBuffers modules

Schemas (.lbf files) with API types intended for the Plutarch backend will typically use the following LambdaBuffers schema modules:

  1. Prelude,
  2. Plutus.V1,
  3. Plutus.V2.

Take a look at Example.lbf schema as an example.

Haskell libraries

A typical Plutarch project working with LambdaBuffers needs the following LambdaBuffers runtime libraries:

  1. lbr-plutarch a Haskell runtime library necessary for working with lbf-xyz libraries.
  2. lbf-prelude-plutarch that contains the LambdaBuffers Prelude schema library generated by LambdaBuffers.
  3. lbf-plutus-plutarch that contains the LambdaBuffers Plutus schema library generated by LambdaBuffers.

Of course, the Plutarch libraries themselves are also necessary: plutarch and optionally plutarch-extra.

For a full example see Example.

Inspecting the generated output

You can inspect the generated libraries using Nix:

❯ nix build .#lbf-prelude-plutarch
❯ find result/autogen/
result/autogen/
result/autogen/LambdaBuffers
result/autogen/LambdaBuffers/Prelude
result/autogen/LambdaBuffers/Prelude/Plutarch.hs

❯ nix build .#lbf-plutus-plutarch
❯ find result/autogen/
result/autogen/
result/autogen/LambdaBuffers
result/autogen/LambdaBuffers/Plutus
result/autogen/LambdaBuffers/Plutus/V2
result/autogen/LambdaBuffers/Plutus/V2/Plutarch.hs
result/autogen/LambdaBuffers/Plutus/V1
result/autogen/LambdaBuffers/Plutus/V1/Plutarch.hs

Haskell modules

The set of imports a Plutarch program using LambdaBuffers would typically need is the following:

import LambdaBuffers.Plutus.V1.Plutarch ()
import LambdaBuffers.Plutus.V2.Plutarch ()
import LambdaBuffers.Prelude.Plutarch ()
import LambdaBuffers.Runtime.Plutarch ()
import Plutarch ()
import Plutarch.Prelude ()
import Plutarch.Api.V1 ()
import Plutarch.Api.V2 ()
  1. LambdaBuffers.Plutus.V1.Plutarch is a module generated from Plutus.V1 LambdaBuffers schema and provided by the lbf-plutus-plutarch runtime library.
  2. LambdaBuffers.Plutus.V2.Plutarch is a module generated from Plutus.V2 LambdaBuffers schema and provided by the lbf-plutus-plutarch runtime library.
  3. LambdaBuffers.Prelude.Plutarch is a module generated from Prelude LambdaBuffers schema and provided by the lbf-prelude-plutarch runtime library.
  4. LambdaBuffers.Runtime.Plutarch is a module provided by the lbr-plutarch runtime library.

The generated Plutarch module for a LambdaBuffers schema Foo/Bar.lbf (i.e. module Foo.Bar) is stored at Foo/Bar/Plutarch.hs.

Restrictions

Unfortunately, the Plutarch backend doesn't support recursive type definitions (see #131).

The following will not work:

module ModuleWithRecursiveType

import Prelude (Eq)
import Plutus.V1 (PlutusData)

sum List a = Cons a (List a) | Nil
derive Eq (List a)
derive PlutusData (List a)

Additionally, LambdaBuffers record types are mapped to Plutarch product types:

module ModuleWithARecordType

import Prelude (Eq, Integer, Bool)
import Plutus.V1 (PlutusData)

record Foo = {
  bar: Integer,
  baz: Bool
  }
derive Eq Foo
derive PlutusData Foo

Essentially, record definitions are 'degraded' into product types such that the order of the product fields follows the order of the record fields as defined in the source.

For example, the Foo record defined above would be no different in Plutarch than if it were defined as the product Foo below:

prod Foo = Integer Bool

The Plutarch backend doesn't support the use of Char, Text, Bytes (there's a Plutus.V1.Bytes), Set and Map (there's a Plutus.V1.Map) from the LambdaBuffers Prelude module.

Plutarch

Type definition mapping

The Plutarch backend supports all types from the LambdaBuffers Plutus schema library, so as to enable full-featured Plutus script development.

Additionally, it also supports some types from the LambdaBuffers Prelude schema library, namely Bool, Integer, Maybe, Either and List.

module Foo

import Plutus.V1 (Bytes)

sum Sum a = Some Bytes a | Nothing

record Record a = {
  foo : Bytes,
  bar: a
}

prod Product a = Bytes a

translates into the Plutarch equivalent:

module LambdaBuffers.Foo.Plutarch (Sum(..), Record(..), Product(..)) where

import qualified LambdaBuffers.Plutus.V1.Plutarch
import qualified LambdaBuffers.Prelude.Plutarch
import qualified LambdaBuffers.Runtime.Plutarch
import qualified Plutarch
import qualified Plutarch.Bool
import qualified Plutarch.Builtin
import qualified Plutarch.Internal.PlutusType
import qualified Plutarch.Prelude
import qualified Plutarch.Show
import qualified Plutarch.TryFrom
import qualified Plutarch.Unsafe

data Sum (a :: PType) (s :: Plutarch.S) = Sum'Some (Plutarch.Term s (Plutarch.Builtin.PAsData LambdaBuffers.Plutus.V1.Plutarch.Bytes)) (Plutarch.Term s (Plutarch.Builtin.PAsData a))
                                  | Sum'Nothing
  deriving stock GHC.Generics.Generic
  deriving anyclass Plutarch.Show.PShow

data Record (a :: PType) (s :: Plutarch.S) = Record (Plutarch.Term s (Plutarch.Builtin.PAsData LambdaBuffers.Plutus.V1.Plutarch.Bytes)) (Plutarch.Term s (Plutarch.Builtin.PAsData a))
  deriving stock GHC.Generics.Generic
  deriving anyclass Plutarch.Show.PShow

data Product (a :: PType) (s :: Plutarch.S) = Product (Plutarch.Term s (Plutarch.Builtin.PAsData LambdaBuffers.Plutus.V1.Plutarch.Bytes)) (Plutarch.Term s (Plutarch.Builtin.PAsData a))
  deriving stock GHC.Generics.Generic
  deriving anyclass Plutarch.Show.PShow

Type class implementations

Plutarch has a couple of fundamental type classes essential to its operation, namely PlutusType, PIsData, PTryFrom and PEq.

PlutusType

Printing an implementation for this class for a particular type is governed by derive Plutus.V1.PlutusData <type> statements in .lbf schemas.

PlutusType serves to (de)construct Plutarch eDSL terms from Haskell 'native' terms.

class PlutusType (a :: PType) where
  type PInner a :: PType
  pcon' :: forall s. a s -> Term s (PInner a)
  pmatch' :: forall s b. Term s (PInner a) -> (a s -> Term s b) -> Term s b

Additionally, Plutarch enables terms to have different 'value' representations, like Scott-encoded terms or PlutusData-encoded terms. This is what the PInner type family is used to specify. LambdaBuffers only cares about PlutusData-encoded terms since we're using it to specify Plutus datum structures.

The task is to generate a pcon' implementation such that we can construct Plutarch Terms that have some PInner representation of type PData, from Haskell 'native' values. The pcon' implementation must match the LB Plutus PlutusData encoding class standard, and so we'll use the same 'to Plutus data' specification to generate pcon' implementations.

Constructing is only one part of the story; there's also deconstruction, which is captured by the pmatch' method. This method serves to 'pattern match' on a value that was already constructed using pcon' and dispatch said value to a provided continuation function. Note that there's a subtle but important distinction to be made between the ptryFrom and pmatch' methods. pmatch' assumes that the value it receives is indeed correct, as it was constructed using the pcon' method. This means that pmatch' should never error, and if it does, the implementation is wrong. ptryFrom is different, as it takes some PData and tries to parse it into a PType, but it can fail.

However, in LambdaBuffers both of these methods follow the exact same logical pattern and can be generated using the 'from Plutus data' specification.

PTryFrom

Printing an implementation for this class for a particular type is governed by derive Plutus.V1.PlutusData <type> statements in .lbf schemas.

PTryFrom serves to specify how PData is 'parsed' into a Plutarch type. It's generally used to convert between Plutarch types, but that's a fairly general use case; we use this class in a much narrower form, namely to specify how PData is 'parsed' into a Plutarch type.

class PSubtype a b => PTryFrom (a :: PType) (b :: PType) where
  type PTryFromExcess a b :: PType
  type PTryFromExcess a b = PTryFromExcess a (PInner b)
  ptryFrom' :: forall s r. Term s a -> ((Term s b, Reduce (PTryFromExcess a b s)) -> Term s r) -> Term s r
  default ptryFrom' :: forall s r. (PTryFrom a (PInner b), PTryFromExcess a b ~ PTryFromExcess a (PInner b)) => Term s a -> ((Term s b, Reduce (PTryFromExcess a b s)) -> Term s r) -> Term s r
  ptryFrom' opq f = ptryFrom @(PInner b) @a opq \(inn, exc) -> f (punsafeCoerce inn, exc)

There are some additional features exhibited by this type class, most noteworthy the PTryFromExcess type family, which enables us to specify the part of the structure that wasn't parsed and is left unexamined. It's a form of optimization that becomes very important if you have a very complex data type such as ScriptContext from the plutus-ledger-api.

A good intuition pump for this 'excess' business is that of a zipper: we focus on a certain part of a data structure, only ever providing links to other parts that are left unexamined.

LambdaBuffers doesn't use this feature and sets the PTryFromExcess to a unit type, signaling that nothing is left unexamined.

PIsData

Printing an implementation for this class for a particular type is governed by derive Plutus.V1.PlutusData <type> statements in .lbf schemas.

PIsData serves to track 'is it Plutus data encoded?' with types.

newtype PAsData (a :: PType) (s :: S) = PAsData (Term s a)

class PIsData a where
  pfromDataImpl :: Term s (PAsData a) -> Term s a
  default pfromDataImpl :: PIsData (PInner a) => Term s (PAsData a) -> Term s a
  pfromDataImpl x = punsafeDowncast $ pfromDataImpl (punsafeCoerce x :: Term _ (PAsData (PInner a)))

  pdataImpl :: Term s a -> Term s PData
  default pdataImpl :: PIsData (PInner a) => Term s a -> Term s PData
  pdataImpl x = pdataImpl $ pto x

For example, here is what such an implementation looks like for a trivial type FooTrivial; it simply delegates to the underlying data representation:

instance PIsData FooTrivial where
  pdataImpl = punsafeCoerce
  pfromDataImpl = punsafeCoerce

instance PEq FooTrivial where
  (#==) = \l r -> pdata l #== pdata r

Due to generated types having a PAsData attached to them, be ready to use pdata and pfromData to switch between forms.
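
As a minimal sketch (assuming only Plutarch.Prelude; this is illustrative code, not generated output), switching between the two forms looks like this:

import Plutarch.Prelude

-- Unwrap a data-encoded term into its 'direct' Plutarch form.
unwrapInt :: Term s (PAsData PInteger) -> Term s PInteger
unwrapInt = pfromData

-- Wrap a term back into its data-encoded form.
wrapInt :: Term s PInteger -> Term s (PAsData PInteger)
wrapInt = pdata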

PEq

Printing an implementation for this class for a particular type is governed by derive Prelude.Eq <type> statements in .lbf schemas.

PEq serves to provide equality checks for Plutarch types.

class PEq t where
  (#==) :: Term s t -> Term s t -> Term s PBool
  default (#==) ::
    (PGeneric t, PlutusType t, All2 PEq (PCode t)) =>
    Term s t ->
    Term s t ->
    Term s PBool
  a #== b = gpeq # a # b

infix 4 #==

We don't generate an implementation from the LambdaBuffers 'equality spec'; rather, for performance, we delegate the equality check to the underlying PData representation that all generated types have.

PShow

All generated types have a PShow instance derived using the internal Plutarch deriving mechanism.

PShow serves to stringify Plutarch types, which is very useful during debugging.

Example

Let's work through the Plutarch example available in the repo.

First, please check the Getting started guide on how to prepare to work with the repo and set up Nix.

Let's see what we have here:

lambda-buffers/docs/plutarch ❯ find
.
./build.nix
./cabal.project
./hie.yaml
./plutarch-example.cabal
./app
./app/Example.hs
./api
./api/Example.lbf
./.envrc

The salient bits we should focus on are:

  1. The LambdaBuffers .lbf schema in ./api/Example.lbf that describes the API types used by our little program,
  2. The Haskell Plutarch program in ./app/Example.hs that works with the API types.

To inspect the generated library:

lambda-buffers/docs/plutarch ❯ nix build .#lbf-plutarch-example-api
lambda-buffers/docs/plutarch ❯ find autogen/
autogen/
autogen/build.json
autogen/LambdaBuffers
autogen/LambdaBuffers/Example
autogen/LambdaBuffers/Example/Plutarch.hs

The name of the generated library lbf-plutarch-example-api is set in the ./plutarch/build.nix Nix build file.

However, users aren't expected to need to do this. If you have any issues, please reach out.

Inspecting the Cabal file shows the standard runtime libraries we need:

lambda-buffers/docs/plutarch ❯ cabal info .
* plutarch-example-0.1.0.0 (program)
    Synopsis:      LambdaBuffers Plutarch example
    Versions available: [ Not available from server ]
    Versions installed: [ Unknown ]
    Homepage:      [ Not specified ]
    Bug reports:   [ Not specified ]
    License:       NONE
    Author:        Drazen Popovic
    Maintainer:    bladyjoker@gmail.com
    Source repo:   [ Not specified ]
    Executables:   plutarch-example
    Flags:         dev
    Dependencies:  base >=4.16, lbf-plutarch-example-api, lbf-plutus-plutarch,
                   lbf-prelude-plutarch, lbr-plutarch, plutarch, plutarch-extra,
                   text >=1.2
    Cached:        Yes

Run the program:

lambda-buffers/docs/plutarch ❯ cabal run
"Friends, peace and love!!!"

Take a look at the Example.hs to see how generated types are used, namely how they are constructed with pcon and deconstructed with pmatch (or pmatchC).
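
As a minimal sketch of that pattern (shown with Plutarch's built-in PMaybe rather than a type from Example.hs, and assuming only Plutarch.Prelude), construction and deconstruction look like this:

{-# LANGUAGE LambdaCase #-}

import Plutarch.Prelude

-- Construct a term with pcon.
justFive :: Term s (PMaybe PInteger)
justFive = pcon (PJust 5)

-- Deconstruct a term with pmatch, dispatching each constructor to a continuation.
orZero :: Term s (PMaybe PInteger) -> Term s PInteger
orZero m = pmatch m $ \case
  PJust n -> n
  PNothing -> 0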

LambdaBuffers to Rust

Let's take a look at how LambdaBuffers modules map into Rust modules and how LambdaBuffers type definitions map into Rust type definitions.

We'll use the lbf-prelude-to-rust CLI tool which is just a convenient wrapper over the raw lbf CLI. We can get this tool by either loading the LambdaBuffers Nix environment that comes packaged with all the CLI tools:

$ nix develop github:mlabs-haskell/lambda-buffers#lb
$ lbf<tab>
lbf                        lbf-plutus-to-purescript   lbf-prelude-to-haskell     lbf-prelude-to-rust
lbf-plutus-to-haskell      lbf-plutus-to-rust         lbf-prelude-to-purescript  

Or we can refer directly to the lbf-prelude-to-rust CLI via nix run github:mlabs-haskell/lambda-buffers#lbf-prelude-to-rust.

In this chapter, we're going to use the latter option.

Let's now use lbf-prelude-to-rust to process the Document.lbf schema.

module Document

-- Importing types
import Prelude (Text, List, Set, Bytes)

-- Author
sum Author = Ivan | Jovan | Savo

-- Reviewer
sum Reviewer = Bob | Alice

-- Document
record Document a = {
  author : Author,
  reviewers : Set Reviewer,
  content : Chapter a
 }

-- Chapter
record Chapter a = {
  content : a,
  subChapters : List (Chapter a)
 }

-- Some actual content
sum RichContent = Image Bytes | Gif Bytes | Text Text

-- Rich document
prod RichDocument = (Document RichContent)
$ nix run github:mlabs-haskell/lambda-buffers#lbf-prelude-to-rust -- Document.lbf
$ find autogen/
autogen/
autogen/document.rs
autogen/build.json

As we can see, an autogen directory has been created containing the generated Rust modules. Note the autogen/build.json file: it contains all the Cargo dependencies the generated module needs in order to be compiled.

The outputted Rust module in autogen/document.rs:

#![allow(unused)]
#![no_implicit_prelude]
#![allow(warnings)]
extern crate lbf_prelude;
extern crate std;


#[derive(std::fmt::Debug, std::clone::Clone)]
pub enum Author{Ivan, Jovan, Savo}

#[derive(std::fmt::Debug, std::clone::Clone)]
pub struct Chapter<A>{pub content: A,
                     pub sub_chapters: std::boxed::Box<lbf_prelude::prelude::List<Chapter<A>>>}

#[derive(std::fmt::Debug, std::clone::Clone)]
pub struct Document<A>{pub author: Author,
                      pub reviewers: lbf_prelude::prelude::Set<Reviewer>,
                      pub content: Chapter<A>}

#[derive(std::fmt::Debug, std::clone::Clone)]
pub enum Reviewer{Bob, Alice}

#[derive(std::fmt::Debug, std::clone::Clone)]
pub enum RichContent{Image(lbf_prelude::prelude::Bytes),
                    Gif(lbf_prelude::prelude::Bytes),
                    Text(lbf_prelude::prelude::Text)}

#[derive(std::fmt::Debug, std::clone::Clone)]
pub struct RichDocument(pub Document<RichContent>);


Sum types

The types Author, Reviewer, and RichContent have been declared as sum types in the LambdaBuffers schema using the sum keyword.

As we can see, there's nothing too surprising here: all the sum types become enum declarations in Rust.

Product types

The type RichDocument has been declared as a product type in the LambdaBuffers schema using the prod keyword.

They become a Rust tuple struct (also known as a named tuple).

Record types

The types Document and Chapter have been declared as record types in the LambdaBuffers schema using the record keyword.

Like product types, they become a Rust struct, here with named fields.

All types and their fields are public, allowing them to be manipulated without accessors.

LambdaBuffers to Typescript

This chapter will walk through the translation of a LambdaBuffers module into a Typescript module.

To demonstrate this, we will use the lbf-prelude-to-typescript CLI tool which is just a convenient wrapper over the raw lbf CLI. To this end, we may enter a development shell which provides this tool along with many other Lambda Buffers CLI tools with the following command.

$ nix develop github:mlabs-haskell/lambda-buffers#lb
$ lbf<tab>
lbf                          lbf-plutus-to-haskell        lbf-plutus-to-rust           lbf-prelude-to-haskell       lbf-prelude-to-rust
lbf-list-modules-typescript  lbf-plutus-to-purescript     lbf-plutus-to-typescript     lbf-prelude-to-purescript    lbf-prelude-to-typescript

Or, we may directly refer to the lbf-prelude-to-typescript CLI with the following command.

nix run github:mlabs-haskell/lambda-buffers#lbf-prelude-to-typescript

In this chapter, we will use the former option.

Consider the Document.lbf schema which we may recall is as follows.

module Document

-- Importing types
import Prelude (Text, List, Set, Bytes)

-- Author
sum Author = Ivan | Jovan | Savo

-- Reviewer
sum Reviewer = Bob | Alice

-- Document
record Document a = {
  author : Author,
  reviewers : Set Reviewer,
  content : Chapter a
 }

-- Chapter
record Chapter a = {
  content : a,
  subChapters : List (Chapter a)
 }

-- Some actual content
sum RichContent = Image Bytes | Gif Bytes | Text Text

-- Rich document
prod RichDocument = (Document RichContent)

We generate the corresponding Typescript code with the following commands.

$ nix develop github:mlabs-haskell/lambda-buffers#lb
$ lbf-list-modules-typescript lbf-document=. > lb-pkgs.json 
$ lbf-prelude-to-typescript --gen-opt="--packages lb-pkgs.json" Document.lbf
$ find autogen/
autogen/
autogen/LambdaBuffers
autogen/LambdaBuffers/Document.mts
autogen/build.json

The created autogen directory contains the generated Typescript modules.

Note that lbf-list-modules-typescript is needed to create a JSON object which maps package names (for NPM) to Lambda Buffers' modules. Thus, in this example, one should have a package.json file which associates the key "name" with the string value "lbf-document".

The autogen/build.json file can be ignored.

The file autogen/LambdaBuffers/Document.mts contains the outputted Typescript module:

// @ts-nocheck
import * as LambdaBuffers$Document from './Document.mjs'
import * as LambdaBuffers$Prelude from './Prelude.mjs'

export type Author = | { name : 'Ivan' }
                     | { name : 'Jovan' }
                     | { name : 'Savo' }
export const Author : unique symbol = Symbol('Author')
export type Chapter<$a> = { content : $a
                          , subChapters : LambdaBuffers$Prelude.List<Chapter<$a>>
                          }
export const Chapter : unique symbol = Symbol('Chapter')
export type Document<$a> = { author : Author
                           , reviewers : LambdaBuffers$Prelude.Set<Reviewer>
                           , content : Chapter<$a>
                           }
export const Document : unique symbol = Symbol('Document')
export type Reviewer = | { name : 'Bob' } | { name : 'Alice' }
export const Reviewer : unique symbol = Symbol('Reviewer')
export type RichContent = | { name : 'Image'
                            , fields : LambdaBuffers$Prelude.Bytes
                            }
                          | { name : 'Gif'
                            , fields : LambdaBuffers$Prelude.Bytes
                            }
                          | { name : 'Text'
                            , fields : LambdaBuffers$Prelude.Text
                            }
export const RichContent : unique symbol = Symbol('RichContent')
export type RichDocument = Document<RichContent>
export const RichDocument : unique symbol = Symbol('RichDocument')

Product types

The type RichDocument has been declared as a product type in the LambdaBuffers schema using the prod keyword.

In general, product types are mapped to tuple types in Typescript. The exception is when there is only one element in the product, in which case the type is translated to a type alias.

More precisely, given a LambdaBuffers' product type as follows

prod MyProduct = SomeType1 ... SomeTypeN

where the ... denotes iterated SomeTypei for some i, then

  • If N = 0 so prod MyProduct =, then we map this to the Typescript type

    export type MyProduct = []
    
  • If N = 1 so prod MyProduct = SomeType1, then we map this to the Typescript type

    export type MyProduct = SomeType1
    

    i.e., MyProduct simply aliases SomeType1

  • If N >= 2 so prod MyProduct = SomeType1 ... SomeTypeN, then we map this to the Typescript type

    export type MyProduct = [SomeType1, ..., SomeTypeN]
    

    i.e., MyProduct is a tuple with a fixed number of elements with known types.

Sum types

The types Author, Reviewer, and RichContent have been declared as sum types in the LambdaBuffers schema using the sum keyword.

In general, sum types are mapped to a union type in Typescript and with the additional following rules. Given a LambdaBuffers' sum type as follows

sum MySum
    = Branch1 Branch1Type1 ... Branch1TypeM1
    | ...
    | BranchN BranchNType1 ... BranchNTypeMN

where the ... denotes either an iterated Branchi for some i, or an iterated BranchiTypej for some i and j, then each branch, say Branchi, is translated as follows.

  • If Branchi has no fields i.e., | Branchi, then the corresponding Typescript type's union member is

    | { name: 'Branchi' }
    
  • If Branchi has one or more fields i.e., | Branchi BranchiType1 ... BranchiTypeMi, then the corresponding Typescript type's union member is

    | { name: 'Branchi' 
      , fields: <Product translation of BranchiType1 ... BranchiTypeMi>
      }
    

    where <Product translation of BranchiType1 ... BranchiTypeMi> denotes the right hand side of the product translation of prod FieldsProduct = BranchiType1 ... BranchiTypeMi.

    So, for example, given | Branchi BranchiType1, the corresponding Typescript type is as follows

    | { name: 'Branchi'
      , fields: BranchiType1
      }
    

    And given | Branchi BranchiType1 BranchiType2, the corresponding Typescript type is as follows.

    | { name: 'Branchi'
      , fields: [BranchiType1, BranchiType2]
      }
    

Record types

The types Document and Chapter have been declared as record types in the LambdaBuffers schema using the record keyword.

Record types are mapped to object types in Typescript. Given a LambdaBuffers' record type as follows

record MyRecord = { field1: SomeType1, ..., fieldN: SomeTypeN }

where ... denotes iterated fieldi: SomeTypei for some i, the corresponding Typescript type is

type MyRecord = { field1: SomeType1, ..., fieldN: SomeTypeN }

Type classes quickstart

Typescript has no builtin implementation of type classes. As such, LambdaBuffers rolled its own type classes. A complete usage example can be found in the Typescript Prelude sample project, but assuming the packaging is set up correctly, the interface for using a type class is as follows

import * as LbrPrelude from "lbr-prelude";

// In Haskell, this is `10 == 11`
LbrPrelude.Eq[LbrPrelude.Integer].eq(10n, 11n) // false

// In Haskell, this is `Just 3 == Nothing`
LbrPrelude.Eq[LbrPrelude.Maybe](LbrPrelude.Eq[LbrPrelude.Integer])
    .eq( { name: 'Just', fields: 3 }
       , { name: 'Nothing' }) // false

In particular, we access a global variable LbrPrelude.Eq which contains the type class instances, and pick out a particular instance with the type's name, like LbrPrelude.Integer. Note that the LbrPrelude.Maybe instance requires knowledge of the Eq instance of LbrPrelude.Integer, so we must pass that in as a function argument.

Type classes in detail

A type class in Typescript is an object type which defines a set of methods. For example, the Eq type class in Haskell defines the set of methods == (equality) and /= (inequality) as follows.

class Eq a where
    (==) :: a -> a -> Bool
    (/=) :: a -> a -> Bool

The corresponding Eq class in Typescript is:

export interface Eq<A> {
  readonly eq: (l: Readonly<A>, r: Readonly<A>) => boolean;
  readonly neq: (l: Readonly<A>, r: Readonly<A>) => boolean;
}

Each type class in Typescript must have an associated global variable which maps unique representations of its instance types to the corresponding object of the type class implementation. For example, the Eq type class has a global variable, defined in the lbr-prelude library as follows

export const Eq: EqInstances = { } as EqInstances

where EqInstances is an interface type that is initially empty but will be extended with instances of types later.

export interface EqInstances { }

Finally, the following invariant is maintained in the code generator:

  • Every type T has an associated unique symbol also called T.

So, the type Integer has

export type Integer = bigint
export const Integer: unique symbol = Symbol('Integer')

and implementing its Eq instance amounts to the following code.

export interface EqInstances {
    [Integer]: Eq<Integer>
}
Eq[Integer] = { eq: (l,r) => l === r
              , neq: (l,r) => l !== r
              }

For types defined in the LambdaBuffers schema, this chunk of code will be automatically generated provided there is an appropriate derive construct.

Type instances with constraints

Recall in Haskell that the Eq instance for a tuple may be defined as follows

instance (Eq a, Eq b) => Eq (MyPair a b) where
    MyPair a1 a2 == MyPair b1 b2 = a1 == b1 && a2 == b2
    MyPair a1 a2 /= MyPair b1 b2 = a1 /= b1 || a2 /= b2

The corresponding Typescript type definition and instance would be defined as follows

export type MyPair<a, b> = [a, b]
export const MyPair: unique symbol = Symbol('MyPair')

export interface EqInstances {
    [MyPair]: <A,B>(a : Eq<A>, b : Eq<B>) => Eq<MyPair<A,B>>
}
Eq[MyPair] = (dictA, dictB) => { return { eq: (a,b) => dictA.eq(a[0], b[0]) && dictB.eq(a[1], b[1])
                                        , neq: (a,b) => dictA.neq(a[0], b[0]) || dictB.neq(a[1], b[1])
                                        } 
                               }

Note that the constraints (Eq a, Eq b) => become arguments dictA and dictB that are used to construct the Eq instance for MyPair.

This loosely follows the original translation given in the paper How to make ad-hoc polymorphism less ad hoc with some minor modifications.
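
For comparison, here is a minimal Haskell sketch of the same dictionary-passing idea (the EqDict name is hypothetical; this models explicitly what instance resolution does, and is not code from LambdaBuffers):

-- An explicit 'Eq dictionary', mirroring the Typescript Eq<A> object type.
data EqDict a = EqDict
  { eq  :: a -> a -> Bool
  , neq :: a -> a -> Bool
  }

-- The Integer instance, mirroring Eq[Integer] above.
eqInteger :: EqDict Integer
eqInteger = EqDict (==) (/=)

-- The pair instance: constraints become dictionary arguments, just like dictA and dictB.
eqPair :: EqDict a -> EqDict b -> EqDict (a, b)
eqPair da db = EqDict
  { eq  = \(a1, b1) (a2, b2) -> eq da a1 a2 && eq db b1 b2
  , neq = \(a1, b1) (a2, b2) -> neq da a1 a2 || neq db b1 b2
  }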

Limitations

  • Only Haskell 2010 type classes are supported by the Typescript code generator, so the following schemas will probably generate incorrect code.

    derive Eq (MyPair a a)
    derive Eq (MyMaybe (MyMaybe Integer))
    derive Eq (MyMaybe Integer)
    

Getting started

Installing Nix

This repository relies heavily on the Nix Package Manager for both development and package distribution.

To install, run the following command:

sh <(curl -L https://nixos.org/nix/install) --daemon

and follow the instructions.

$ nix --version
nix (Nix) 2.8.0

NOTE: The repository should work with Nix versions greater than or equal to 2.8.0.

Make sure to enable Nix Flakes and IFD by editing either ~/.config/nix/nix.conf or /etc/nix/nix.conf on your machine and adding the following configuration entries:

experimental-features = nix-command flakes
allow-import-from-derivation = true

Optionally, to improve build speed, it is possible to use the binary caches maintained by IOHK and Plutonomicon by adding the following configuration entries:

substituters = https://cache.nixos.org https://iohk.cachix.org https://cache.iog.io https://public-plutonomicon.cachix.org
trusted-public-keys = cache.nixos.org-1:6NCHdD59X431o0gWypbMrAURkbJ16ZPMQFGspcDShjY= hydra.iohk.io:f/Ea+s+dFdN+3Y/G+FDgSq+a5NEWhJGzdjvKNGv0/EQ= iohk.cachix.org-1:DpRUyj7h7V830dp/i6Nti+NEO2/nhblbov/8MW7Rqoo= public-plutonomicon.cachix.org-1:3AKJMhCLn32gri1drGuaZmFrmnue+KkKrhhubQk/CWc=

Building and development

To facilitate seamlessly moving between directories and their associated Nix development shells, we use direnv and nix-direnv.

To install both using nixpkgs:

nix profile install nixpkgs#direnv
nix profile install nixpkgs#nix-direnv

Your shell and editors should pick up on the .envrc files in different directories and prepare the environment accordingly. Use direnv allow to enable the direnv environment and direnv reload to reload it when necessary.

Additionally, throughout the repository one can use:

$ pre-commit run --all
cabal-fmt............................................(no files to check)Skipped
fourmolu.................................................................Passed
hlint....................................................................Passed
markdownlint.............................................................Passed
nix-linter...............................................................Passed
nixpkgs-fmt..............................................................Passed
shellcheck...........................................(no files to check)Skipped
typos....................................................................Passed

This runs all the code quality tooling specified in the pre-commit-check config file.

LambdaBuffers Design

The goal of the LambdaBuffers project is to enable software developers to specify their application types in a common format that can be conveniently shared and their values effectively communicated across language barriers.

Problem Statement

Software projects that span multiple language environments often interact in a sub-optimal manner. Significant effort is spent in making application messages declared in one language environment available to other language environments.

This burden is particularly onerous in manually written and managed serialization/encoding code which is used to communicate application values in the context of networking, configuration and databases.

Ensuring compatibility, consistency and correctness of application messages is a difficult, tedious and error prone process that often results in unwarranted costs to the business and unsustainable technical debt.

Requirements

  1. Expressive types,
  2. Expressive semantics annotation,
  3. Extensible to new types,
  4. Extensible to new semantics,
  5. Universally consistent semantics,
  6. Modular API architecture.

Expressive Types

Application types that users can define should be expressive enough to facilitate type driven domain modeling and to express application programming interfaces.

Taking inspiration from existing type systems, LambdaBuffers supports algebraic data types that facilitate elegant type composition. Such types are well studied and widely used in functional programming languages such as Haskell.

LambdaBuffers supports first class sum, product and record types. Types can be parameterized, effectively enabling generic types. LambdaBuffers also supports recursive type definitions, which allow users to succinctly define elegant and expressive data structures.

Expressive Semantics Annotation

Enabling users to manage the semantics associated with their types is essential for adapting LambdaBuffers to a variety of different domains and use cases.

While most existing schema systems only facilitate type declarations in a variety of languages, LambdaBuffers takes a further step and provides users with the capability to manage the semantics of the types defined in schemata by indicating which basic operations and functions a type ought to support.

For example, suppose a user would like some types to support certain encoding (e.g. JSON or CBOR). In order to support a particular encoding, functions that serialize and deserialize the types are needed in the target language(s). Another example: Users may wish to declare that certain types are numeric - i.e. that values of these types can be added, subtracted or multiplied in a given target language. Most types could be declared to support an equality relation, which requires a function that can check values for equality in the target language(s).

In order to provide users the capability to manage the semantics of the types they define, LambdaBuffers supports type classes, also known as type constraints. Type classes are a well-established mechanism for supporting ad hoc polymorphism, backed by a large amount of academic research and widely used in functional programming languages such as Haskell, PureScript, and (albeit under a different name) Rust.

One essential difference between LambdaBuffers type classes and type classes as implemented in Haskell/PureScript/Rust is that LambdaBuffers does not allow users to declare the implementation of type class instances. Instead, users declare instance clauses for their types which signify the semantics (i.e. functions, methods) they wish to be generated in the target language. All implementations are generated uniformly, as elaborated in the specification document for a given type class.

For each new type class declared, code generation tooling must be updated to handle the new type class.

Extensible to New Types

Enabling users to introduce new built-in types allows LambdaBuffers to be adapted in many different domains and use cases.

These types have special treatment under the hood and are generally mapped onto existing types and their value representations in the targeted language environments. For example, a primitive Int type in the LambdaBuffers schema language may be mapped to Int in Haskell and i32 in Rust. Primitive parameterized types are also possible: A primitive Maybe type might be mapped to Maybe in Haskell and to Option<_> in Rust.

LambdaBuffers supports opaque types to provide users with the power to define their own primitive or builtin types.

Example opaque types include various integer types, sequence types, text types, sets, maps and other semantically richer data types. Generally, such types are already well-defined and widely used in various language environments and come equipped with rich libraries that work with them. Redefining them ab ovo would be counterproductive as users would have to re-implement and reinvent the rich support for such types.

Extensible to New Semantics

Enabling users to introduce new type semantics allows LambdaBuffers to be adapted to many different domains and use cases.

In LambdaBuffers, introducing new type semantics works by first declaring a type class, which is simply the name of the class bundled with any super-classes (should they exist). Next, a specification document that elaborates how instances are to be generated for members of the class must be created, taking care to ensure that instances are always generated uniformly. Finally, a code generation module for the class must be written that implements compliant code generation for different target languages.

Concretely, serialization has special treatment in most technologies in this space; in LambdaBuffers, however, it is just another type class.

For each new type class, deliberate effort must be invested to support that 'semantics' in different target language environments in a compliant manner. (Because type class instances are generated uniformly relative to the structure of the original type, in accordance with a set of deriving rules provided in the code generation implementation, the amount of boilerplate required to implement a new class is substantially reduced if not entirely eliminated.)

The LambdaBuffers team will officially support a certain set of type classes and provide code generation modules for a set of officially supported language environments. However, modular design will hopefully facilitate community contributions in a sustainable fashion.

Universally Consistent Semantics

The types specified by users must be handled in a compatible manner across language environments. This is a critical requirement that underpins this project.

If two values of a LambdaBuffers type are considered 'equal' in one language environment, they should be considered equal in all others. In a similar fashion, if a LambdaBuffers type has declared JSON serialization semantics, values encoded as JSON in one language environment must have the same encoding in all other language environments.

LambdaBuffers does not provide a way to formally verify consistency across languages, however, a comprehensive test suite will be developed to ensure that new code generation modules are indeed implemented correctly.

Modular API Architecture

LambdaBuffers establishes three separate components of the architecture, namely Frontend, Compiler and Codegen.

Frontend is a user-facing component that provides a way to input, specify or otherwise construct application types. Frontend also orchestrates the overall work involving the Compiler and Codegen, invoking each of these components as required by the different workflows the Frontend supports. LambdaBuffers officially supports a Frontend component implementation that features a text-based language for specifying LambdaBuffers types. However, because of the modular API architecture, a Frontend can be implemented in any language and in any fashion, as long as it is able to interface with the Compiler and Codegen components.

The Compiler is a key component that is made available via the Compiler Input, specified using Google Protocol Buffers. It performs all the necessary checks to ensure that the naming, type definitions and their declared semantics are indeed valid.

The Codegen component consumes the Compiler Output that contains all the information necessary to perform valid and robust code generation. Frontend communicates with the Codegen to understand its Capabilities, enabling each Codegen module to gradually build up support for desired opaque types and declared semantics.

Protocol Documentation

lang.proto

ClassConstraint

Class constraints

A special constraint type denoting the constraints that occur on the rhs of class definitions. Only used to specify super class constraints in a ClassDef.

Not to be confused with Constraint which denote type class rules.

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| class_ref | TyClassRef | | Type class reference. |
| args | TyVar | repeated | Type variables quantified over ClassDef arguments. |

ClassDef

Type class definition

LambdaBuffers use type classes to talk about the various 'meanings' or 'semantics' we want to associate with the types in LambdaBuffers schemata.

For instance, most types can have Eq semantics, meaning they can be compared for equality. Others can have Ord semantics, meaning they can be ordered.

Using type classes and instance declarations, much like in Haskell, users can specify the 'meaning' of each type they declare. For example, serialization in LambdaBuffers is just another type class; it's treated the same as any other. Concretely, if we wish to provide JSON serialization for LambdaBuffers types, we declare such a type class and provide the desired semantic rules:

module Foo

class JSON a

sum Foo a b = Bar a | Baz b

derive JSON (Foo a b)

Note that for each type class introduced, the Codegen machinery must be updated to support said type class. In other words, it doesn't come for free: for each new type class, Codegen support must be implemented for any InstanceClause declared by the user. Once all the InstanceClauses have an implementation provided, all the derived implementations come for free.

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| class_name | ClassName | | Type class name. |
| class_args | TyArg | repeated | Type class arguments. Class with no arguments is a trivial class. Compiler MAY report an error. TODO(bladyjoker): MultipleClassArgError. |
| supers | ClassConstraint | repeated | Superclass constraints. |
| documentation | string | | Documentation elaborating on the type class. |
| source_info | SourceInfo | | Source information. |

ClassName

Type class name

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| name | string | | Name ::= [A-Z]+[A-Za-z0-9_]* |
| source_info | SourceInfo | | Source information. |

ConstrName

Sum type constructor name

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| name | string | | Name ::= [A-Z]+[A-Za-z0-9_]* |
| source_info | SourceInfo | | Source information. |

Constraint

Constraint term

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| class_ref | TyClassRef | | Name of the type class. |
| args | Ty | repeated | Constraint arguments. Constraint with no arguments is a trivial constraint. Compiler MAY report an error. |
| source_info | SourceInfo | | Source information. |

Derive

Derive statement

Derive statements enable users to specify 'semantic' rules for their types much like InstanceClauses do. However, the Codegen will be able to derive an implementation for any such constraint.

module Prelude

class Eq a

sum Maybe a = Just a | Nothing

derive Eq (Maybe a)

The rule installed for the derive statement is:

eq(maybe(A)) :- eq(just(A) | Nothing).

The rule relates the desired Ty term to its (lambda calculus) 'evaluated' form.

> Currently, there's only support for deriving type class rules and implementations for Ty terms of Kind.KIND_REF_TYPE. That means, type classes like Ord and Eq...

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| constraint | Constraint | | Constraint to derive. |

FieldName

Record type field name

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| name | string | | Name ::= [a-z]+[A-Za-z0-9_]* |
| source_info | SourceInfo | | Source information. |

InstanceClause

Instance clause

Instance clauses enable users to specify ad-hoc 'semantic' rules for their types. Each such instance must be supported explicitly in the Codegen by providing runtime implementations.

This rule form is used when declaring 'opaque' implementations on Opaque types.

module Prelude

class Eq a

opaque Maybe a

instance Eq a => Eq (Maybe a)

The rule installed for the clause is:

eq(maybe(A)) :- eq(A).

The instance clause is verbatim added to the rule set.

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| head | Constraint | | Head of the clause that holds only when the body holds. Type variables introduced in the head of the rule become available in the scope of the body of the rule. |
| constraints | Constraint | repeated | Instance (rule) body, conjunction of constraints. |
| source_info | SourceInfo | | Source information. |

Kind

Kinds

A type of a type is called a 'kind'. In Lambda Buffers, all type terms, namely TyArg, TyVar, TyRef, TyApp and TyAbs, have a kind: Type, Type -> Type, Type -> Type -> Type, and so on.

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| kind_ref | Kind.KindRef | | |
| kind_arrow | Kind.KindArrow | | |

Kind.KindArrow

A kind arrow.

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| left | Kind | | |
| right | Kind | | |

Module

Module

A module encapsulates type, class and instance definitions.

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| module_name | ModuleName | | Module name. |
| type_defs | TyDef | repeated | Type definitions. Duplicate type definitions MUST be reported with ProtoParseError.MultipleTyDefError. |
| class_defs | ClassDef | repeated | Type class definitions. Duplicate class definitions MUST be reported with ProtoParseError.MultipleClassDefError. |
| instances | InstanceClause | repeated | Type class instance clauses. |
| derives | Derive | repeated | Type class derive statements. |
| imports | ModuleName | repeated | Imported modules the Compiler consults when searching for type class rules. TODO(bladyjoker): Rename to ruleImports. Duplicate imports MUST be reported with ProtoParseError.MultipleImportError. |
| source_info | SourceInfo | | Source information. |

ModuleName

Module name

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| parts | ModuleNamePart | repeated | Parts of the module name denoting a hierarchical namespace. |
| source_info | SourceInfo | | Source information. |

ModuleNamePart

Module name part

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| name | string | | Name ::= [A-Z]+[A-Za-z0-9_]* |
| source_info | SourceInfo | | Source information. |

Opaque

Opaque type.

A type that has an Opaque body represents a 'built-in' or a 'primitive' type that's handled by the semantics 'under the hood'. It's called 'opaque' to denote the fact that the Compiler has no knowledge of its structure, and relies on the necessary knowledge being implemented elsewhere. The Codegen modules for any target language have to handle such types specifically and map them to existing types and their corresponding value-level representations.

Codegen modules would have to implement support for such defined types, for example:

  • In Python, Set a would map to set() from the standard library,
  • In Haskell, Set a would map to the Data.Set.Set type from the containers package.

Every Opaque type has to be considered deliberately for each language environment targeted by Codegen modules.

TODO(bladyjoker): Consider attaching explicit Kind terms to Opaques.

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| source_info | SourceInfo | | Source information. |

Product

A product type term.

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| fields | Ty | repeated | Fields in a product are types. |
| source_info | SourceInfo | | Source information. |

Record

A record type term.

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| fields | Record.Field | repeated | Record fields. |
| source_info | SourceInfo | | Source information. |

Record.Field

Field in a record type.

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| field_name | FieldName | | Record field name. |
| field_ty | Ty | | Field type. |

SourceInfo

Frontend Source information

Frontends are advised to include Source information to denote how their Source content maps to the Compiler Input. It's essential when reporting Compiler errors back to the Frontend.

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| file | string | | A filename denoting the Source file. |
| pos_from | SourcePosition | | Starting position in Source. |
| pos_to | SourcePosition | | End position in Source. |

SourcePosition

Position in Source

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| column | int32 | | Column index in the Source. |
| row | int32 | | Row index in the Source. |

Sum

A sum type term.

A type defined as a Sum type is just like a Haskell algebraic data type and represents a sum of products.

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| constructors | Sum.Constructor | repeated | Sum type constructors. Empty constructors means void and means that the type can't be constructed. Compiler MAY report an error. Duplicate constructors MUST be reported with ProtoParseError.MultipleConstructorError. |
| source_info | SourceInfo | | Source information. |

Sum.Constructor

Constructor of a Sum type is a Product type term.

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| constr_name | ConstrName | | Constructor name. |
| product | Product | | Product type term. |

Ty

Type term

A type term that occurs in bodies of type definitions (message TyDef):

sum Maybe a = Just a | Nothing

sum Either a b = Left a | Right b

sum SomeType a = Foo a (Maybe a) | Bar (Either (Maybe a) (SomeType a))

or in instance declarations:

instance Eq (Maybe a)

instance Eq (SomeType Int)

instance (Eq (Maybe a), Eq (SomeType a)) Eq (Either (Maybe a) (SomeType a))

Check out examples.

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| ty_var | TyVar | | A type variable. |
| ty_app | TyApp | | A type application. |
| ty_ref | TyRef | | A type reference. |

TyAbs

Type abstraction

A type term that introduces type abstractions (i.e. type functions). This type term can only be introduced in the context of a type definition (TyDef).

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| ty_args | TyArg | repeated | List of type variables. No type arguments means delay or const ty_body, meaning TyAbs [] ty_body = ty_body. Duplicate type arguments MUST be reported with ProtoParseError.MultipleTyArgError. |
| ty_body | TyBody | | Type body. |
| source_info | SourceInfo | | Source information. |

TyApp

Type application

A type term that applies a type abstraction to a list of arguments.

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| ty_func | Ty | | Type function. TODO(bladyjoker): Rename to ty_abs? |
| ty_args | Ty | repeated | Arguments to apply. No arguments to apply means force, meaning TyApp ty_func [] = ty_func. |
| source_info | SourceInfo | | Source information. |

TyArg

Type arguments

Arguments in type abstractions.

Type arguments, and therefore type variables, have kinds; the Compiler only accepts Type-kinded type arguments and therefore type variables.

However, to allow for future evolution if ever necessary, we attach the Kind term to type arguments, even though the Compiler will reject any TyArg that's not of kind Type.

Note that this effectively means that LambdaBuffers doesn't support higher-kinded types (i.e. HKTs).

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| arg_name | VarName | | Argument name corresponds to variable names. |
| arg_kind | Kind | | Argument kind. |
| source_info | SourceInfo | | Source information. |

TyBody

Type body.

A Lambda Buffers type body is a type term that can only be specified within a TyAbs context. It's a built-in type term that can only occur enclosed within a TyAbs term, which introduces TyVars into the scope of the term.

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| opaque | Opaque | | |
| sum | Sum | | |
| product | Product | | |
| record | Record | | |

TyClassRef

Type class references

It is necessary to know whether a type class is defined locally or in a foreign module when referring to it in a constraint; this allows users (and requires the Frontend) to explicitly communicate that information.

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| local_class_ref | TyClassRef.Local | | |
| foreign_class_ref | TyClassRef.Foreign | | |

TyClassRef.Foreign

Foreign class reference.

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| class_name | ClassName | | Foreign module class name. |
| module_name | ModuleName | | Foreign module name. |
| source_info | SourceInfo | | Source information. |

TyClassRef.Local

Local class reference.

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| class_name | ClassName | | Local module class name. |
| source_info | SourceInfo | | Source information. |

TyDef

Type definition

A type definition consists of a type name and its associated type term.

One way to look at it is that a type definition introduces a named 'type abstraction' in the module scope. Concretely, Either can be considered a type lambda of kind Type -> Type -> Type.

In fact, type definitions are the only way to introduce such types.

Once introduced in the module scope, type definitions are referred to using a TyRef term.

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| ty_name | TyName | | Type name. |
| ty_abs | TyAbs | | Type term. |
| source_info | SourceInfo | | Source information. |

TyName

Type name

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| name | string | | Name ::= [A-Z]+[A-Za-z0-9_]* |
| source_info | SourceInfo | | Source information. |

TyRef

Type reference

A type term that denotes a reference to a type that's declared locally or in a foreign module.

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| local_ty_ref | TyRef.Local | | |
| foreign_ty_ref | TyRef.Foreign | | |

TyRef.Foreign

Foreign type reference.

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| ty_name | TyName | | Foreign module type name. |
| module_name | ModuleName | | Foreign module name. |
| source_info | SourceInfo | | Source information. |

TyRef.Local

Local type reference.

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| ty_name | TyName | | Local module type name. |
| source_info | SourceInfo | | Source information. |

TyVar

Type variable

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| var_name | VarName | | Variable name. |

Tys

A list of type terms useful for debugging

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| ties | Ty | repeated | |

VarName

Type variable name

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| name | string | | Name ::= [a-z]+ |
| source_info | SourceInfo | | Source information. |

Kind.KindRef

A built-in kind.

| Name | Number | Description |
| --- | --- | --- |
| KIND_REF_UNSPECIFIED | 0 | Unspecified kind SHOULD be inferred by the Compiler. |
| KIND_REF_TYPE | 1 | A Type kind (also known as * in Haskell) built-in. |

compiler.proto

Error

Compiler Error

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| proto_parse_errors | ProtoParseError | repeated | Errors occurred during proto parsing. |
| naming_errors | NamingError | repeated | Errors occurred during naming checking. |
| kind_check_errors | KindCheckError | repeated | Errors occurred during kind checking. |
| ty_class_check_errors | TyClassCheckError | repeated | Errors occurred during type class checking. |
| internal_errors | InternalError | repeated | Errors internal to the compiler implementation. |

Input

Compiler Input

Compiler Input is a fully self-contained list of modules: the entire compilation closure needed by the Compiler to perform its task.

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| modules | lambdabuffers.Module | repeated | Modules to compile. Duplicate modules MUST be reported with ProtoParseError.MultipleModuleError. |

InternalError

Errors internal to the implementation.

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| msg | string | | Error message. |
| source_info | lambdabuffers.SourceInfo | | Source information if meaningful. |

KindCheckError

Kind checking errors.

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| unbound_ty_ref_error | KindCheckError.UnboundTyRefError | | |
| unbound_ty_var_error | KindCheckError.UnboundTyVarError | | |
| unification_error | KindCheckError.UnificationError | | |
| cyclic_kind_error | KindCheckError.CyclicKindError | | |
| inconsistent_type_error | KindCheckError.InconsistentTypeError | | |

KindCheckError.CyclicKindError

A cyclic kind was encountered. Infinite kinds like this are not acceptable, and we do not support them; an infinite kind could not be constructed in ty_def.

As the implementation currently stands, such an error is (most likely) not representable and therefore not reachable. Such an error would usually occur for a term like λa. a a, in which case inference would try to unify two kinds of the form m and m -> n; because m appears in both terms, the cyclic unification error would be thrown.

In the case of LambdaBuffers, such an error is not (currently) reachable, as the kind of the variable is given by the context (i.e. λa : m . a a, where m is a kind); the unification would instead fail with a unification error. Nevertheless, future features might require it.

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| ty_def | lambdabuffers.TyDef | | |
| module_name | lambdabuffers.ModuleName | | |

KindCheckError.InconsistentTypeError

The actual_kind differs from the expected_kind.

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| module_name | lambdabuffers.ModuleName | | |
| ty_def | lambdabuffers.TyDef | | |
| actual_kind | lambdabuffers.Kind | | |
| expected_kind | lambdabuffers.Kind | | |

KindCheckError.UnboundTyRefError

Unbound type reference ty_ref detected in term ty_def.

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| module_name | lambdabuffers.ModuleName | | |
| ty_def | lambdabuffers.TyDef | | |
| ty_ref | lambdabuffers.TyRef | | |

KindCheckError.UnboundTyVarError

Unbound variable ty_var detected in term ty_def.

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| module_name | lambdabuffers.ModuleName | | |
| ty_def | lambdabuffers.TyDef | | |
| ty_var | lambdabuffers.TyVar | | |

KindCheckError.UnificationError

In ty_def an error has occurred when trying to unify kind ty_kind_lhs with ty_kind_rhs.

FIXME(cstml): Add source of constraint to the error such that user can see where the constraint was generated - therefore where the error precisely is.

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| module_name | lambdabuffers.ModuleName | | |
| ty_def | lambdabuffers.TyDef | | |
| ty_kind_lhs | lambdabuffers.Kind | | |
| ty_kind_rhs | lambdabuffers.Kind | | |

NamingError

Naming error message

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| module_name_err | lambdabuffers.ModuleNamePart | | |
| ty_name_err | lambdabuffers.TyName | | |
| var_name_err | lambdabuffers.VarName | | |
| constr_name_err | lambdabuffers.ConstrName | | |
| field_name_err | lambdabuffers.FieldName | | |
| class_name_err | lambdabuffers.ClassName | | |

Output

Output of the Compiler.

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| error | Error | | |
| result | Result | | |

ProtoParseError

All errors that occur because of Google Protocol Buffers' inability to enforce certain invariants. Some of these invariants:

  • using Proto map restricts users to string keys which impacts API documentation, which is why repeated fields are used throughout,
  • using Proto 'oneof' means users have to check if such a field is set or report an error otherwise.

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| multiple_module_error | ProtoParseError.MultipleModuleError | | |
| multiple_tydef_error | ProtoParseError.MultipleTyDefError | | |
| multiple_classdef_error | ProtoParseError.MultipleClassDefError | | |
| multiple_tyarg_error | ProtoParseError.MultipleTyArgError | | |
| multiple_constructor_error | ProtoParseError.MultipleConstructorError | | |
| multiple_field_error | ProtoParseError.MultipleFieldError | | |
| multiple_import_error | ProtoParseError.MultipleImportError | | |
| one_of_not_set_error | ProtoParseError.OneOfNotSetError | | |
| unknown_enum_error | ProtoParseError.UnknownEnumError | | |

ProtoParseError.MultipleClassDefError

Multiple ClassDefs with the same ClassName were found in ModuleName.

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| module_name | lambdabuffers.ModuleName | | Module in which the error was found. |
| class_defs | lambdabuffers.ClassDef | repeated | Conflicting class definitions. |

ProtoParseError.MultipleConstructorError

Multiple Sum Constructors with the same ConstrName were found in ModuleName.TyDef.

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| module_name | lambdabuffers.ModuleName | | Module in which the error was found. |
| ty_def | lambdabuffers.TyDef | | Type definition in which the error was found. |
| constructors | lambdabuffers.Sum.Constructor | repeated | Conflicting constructors. |

ProtoParseError.MultipleFieldError

Multiple Record Fields with the same FieldName were found in ModuleName.TyDef.

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| module_name | lambdabuffers.ModuleName | | Module in which the error was found. |
| ty_def | lambdabuffers.TyDef | | Type definition in which the error was found. |
| fields | lambdabuffers.Record.Field | repeated | Conflicting record fields. |

ProtoParseError.MultipleImportError

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| module_name | lambdabuffers.ModuleName | | Module in which the error was found. |
| imports | lambdabuffers.ModuleName | repeated | Conflicting module imports. |

ProtoParseError.MultipleModuleError

Multiple Modules with the same ModuleName were found.

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| modules | lambdabuffers.Module | repeated | Conflicting modules. |

ProtoParseError.MultipleTyArgError

Multiple TyArgs with the same ArgName were found in ModuleName.TyDef.

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| module_name | lambdabuffers.ModuleName | | Module in which the error was found. |
| ty_def | lambdabuffers.TyDef | | Type definition in which the error was found. |
| ty_args | lambdabuffers.TyArg | repeated | Conflicting type abstraction arguments. |

ProtoParseError.MultipleTyDefError

Multiple TyDefs with the same TyName were found in ModuleName.

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| module_name | lambdabuffers.ModuleName | | Module in which the error was found. |
| ty_defs | lambdabuffers.TyDef | repeated | Conflicting type definitions. |

ProtoParseError.OneOfNotSetError

Proto oneof field is not set.

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| message_name | string | | Proto message name in which the oneof field is not set. |
| field_name | string | | The oneof field that is not set. |

ProtoParseError.UnknownEnumError

Proto enum field is unknown.

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| enum_name | string | | Proto enum name. |
| got_tag | string | | The unknown tag for the enum. |

Result

Compiler Result ~ a successful Compilation Output.

TyClassCheckError

Type class checking errors.

TyClassCheckError.DeriveOpaqueError

In module_name it wasn't possible to solve constraint because a sub_constraint would have to be derived on an Opaque type. Opaque types can only have an InstanceClause declared for them.

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| module_name | lambdabuffers.ModuleName | | |
| constraint | lambdabuffers.Constraint | | |
| sub_constraint | lambdabuffers.Constraint | | |

TyClassCheckError.ImportNotFoundError

Import missing wasn't found in module_name

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| module_name | lambdabuffers.ModuleName | | |
| missing | lambdabuffers.ModuleName | | |

TyClassCheckError.MissingRuleError

In module_name while trying to solve constraint it wasn't possible to find a rule (Derive or InstanceClause) for sub_constraint.

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| module_name | lambdabuffers.ModuleName | | |
| constraint | lambdabuffers.Constraint | | |
| sub_constraint | lambdabuffers.Constraint | | |

TyClassCheckError.OverlappingRulesError

In module_name, while trying to solve constraint, overlapping rules (Derive or InstanceClause) were found that could be used to solve sub_constraint.

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| module_name | lambdabuffers.ModuleName | | |
| constraint | lambdabuffers.Constraint | | |
| sub_constraint | lambdabuffers.Constraint | | |
| overlaps | TyClassCheckError.OverlappingRulesError.QHead | repeated | |

TyClassCheckError.OverlappingRulesError.QHead

NOTE(bladyjoker): This should rather be oneof Derive and InstanceClause.

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| module_name | lambdabuffers.ModuleName | | |
| head | lambdabuffers.Constraint | | |

TyClassCheckError.SuperclassCycleError

Superclass cycle cycled_class_refs was detected when checking a class definition for class_name in module module_name.

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| module_name | lambdabuffers.ModuleName | | |
| class_name | lambdabuffers.ClassName | | |
| cycled_class_refs | lambdabuffers.TyClassRef | repeated | |

TyClassCheckError.UnboundClassRefError

Unbound class_ref detected in module_name.

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| module_name | lambdabuffers.ModuleName | | |
| class_ref | lambdabuffers.TyClassRef | | |

codegen.proto

Error

Codegen Error

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| internal_errors | InternalError | repeated | |
| unsupported_opaque_errors | UnsupportedOpaqueError | repeated | |
| unsupported_class_errors | UnsupportedClassError | repeated | |

Input

Codegen Input

Codegen Input is a fully self-contained list of modules that have been checked by the Compiler.

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| modules | lambdabuffers.Module | repeated | Modules to codegen. |

InternalError

Errors internal to the implementation.

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| msg | string | | Error message. |
| source_info | lambdabuffers.SourceInfo | | Source information if meaningful. |

Output

Codegen output.

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| error | Error | | |
| result | Result | | |

Result

Codegen Result ~ a successful Codegen Output.

UnsupportedClassError

Unsupported Class error for a class class_name defined in module module_name.

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| module_name | lambdabuffers.ModuleName | | |
| class_name | lambdabuffers.ClassName | | |

UnsupportedOpaqueError

Unsupported Opaque error for a type ty_name defined in module module_name.

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| module_name | lambdabuffers.ModuleName | | |
| ty_name | lambdabuffers.TyName | | |

Scalar Value Types

| .proto Type | Notes | C++ | Java | Python | Go | C# | PHP | Ruby |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| double | | double | double | float | float64 | double | float | Float |
| float | | float | float | float | float32 | float | float | Float |
| int32 | Uses variable-length encoding. Inefficient for encoding negative numbers – if your field is likely to have negative values, use sint32 instead. | int32 | int | int | int32 | int | integer | Bignum or Fixnum (as required) |
| int64 | Uses variable-length encoding. Inefficient for encoding negative numbers – if your field is likely to have negative values, use sint64 instead. | int64 | long | int/long | int64 | long | integer/string | Bignum |
| uint32 | Uses variable-length encoding. | uint32 | int | int/long | uint32 | uint | integer | Bignum or Fixnum (as required) |
| uint64 | Uses variable-length encoding. | uint64 | long | int/long | uint64 | ulong | integer/string | Bignum or Fixnum (as required) |
| sint32 | Uses variable-length encoding. Signed int value. These more efficiently encode negative numbers than regular int32s. | int32 | int | int | int32 | int | integer | Bignum or Fixnum (as required) |
| sint64 | Uses variable-length encoding. Signed int value. These more efficiently encode negative numbers than regular int64s. | int64 | long | int/long | int64 | long | integer/string | Bignum |
| fixed32 | Always four bytes. More efficient than uint32 if values are often greater than 2^28. | uint32 | int | int | uint32 | uint | integer | Bignum or Fixnum (as required) |
| fixed64 | Always eight bytes. More efficient than uint64 if values are often greater than 2^56. | uint64 | long | int/long | uint64 | ulong | integer/string | Bignum |
| sfixed32 | Always four bytes. | int32 | int | int | int32 | int | integer | Bignum or Fixnum (as required) |
| sfixed64 | Always eight bytes. | int64 | long | int/long | int64 | long | integer/string | Bignum |
| bool | | bool | boolean | boolean | bool | bool | boolean | TrueClass/FalseClass |
| string | A string must always contain UTF-8 encoded or 7-bit ASCII text. | string | String | str/unicode | string | string | string | String (UTF-8) |
| bytes | May contain any arbitrary sequence of bytes. | string | ByteString | str | []byte | ByteString | string | String (ASCII-8BIT) |

LambdaBuffers Frontend (.lbf) syntax

The input to the LambdaBuffers Frontend is a text file which contains a module that defines a specification of the types and type class instances you want to generate. This chapter gives the exact syntax of a LambdaBuffers Frontend file and informally describes the meaning of the syntactic constructs.

The name of a LambdaBuffers Frontend file must end with .lbf, and hence may also be referred to as a .lbf file or a .lbf schema.

Notation

In the following description of a LambdaBuffers Frontend file's syntax, we use BNF-style notation similar to that of Section 10.1 of the Haskell Report. The following notational conventions are used for presenting syntax.

| Syntax | Description |
| --- | --- |
| [pattern] | optional |
| {pattern} | zero or more repetitions |
| (pattern) | grouping |
| pat1⎮pat2 | choice |
| pat1\pat2 | difference -- elements generated by pat1 except those generated by pat2 |
| 'terminal' | terminal syntax surrounded by single quotes |

Note that the terminal syntax permits C-style escape sequences e.g. '\n' denotes line feed (newline), and '\r' denotes carriage return.

Productions will be of the form:

nonterm -> alt1 | ... | altn

Input file representation

The input file is Unicode text where the encoding is subject to the system locale. We will often use the unqualified term character to refer to a Unicode code point in the input file.

Characters

The following terms are used to denote specific Unicode character categories:

  • upper denotes a Unicode code point categorized as an uppercase letter or titlecase letter (i.e., with General Category value Lt or Lu).

  • lower denotes a Unicode code point categorized as a lower-case letter (i.e., with General Category value Ll).

  • alphanum denotes either upper or lower; or a Unicode code point categorized as a modifier letter, other letter, decimal digit number, letter number, or other number (i.e., with General Category value Lt, Lu, Ll, Lm, Lo, Nd, Nl or No).

  • space denotes a Unicode code point categorized as a separator space (i.e., with General Category value Zs), or any of the control characters '\t', '\n', '\r', '\f', or '\v'.

Interested readers may find details of Unicode character categories in Section 4.5 of The Unicode Standard 15.1.0, and the Unicode Character Database.

Lexical syntax

Tokens form the vocabulary of LambdaBuffers Frontend files. The classes of tokens are defined as follows.

keyword         -> 'module' | 'sum' | 'prod' | 'record'
                 | 'opaque' | 'class' | 'instance' | 'import' 
                 | 'qualified' | 'as'
modulename      -> uppercamelcase
longmodulename  -> modulealias modulename
typename        -> uppercamelcase
fieldname       -> lowercamelcase\keyword
longtypename    -> modulealias typename
varname         -> lowers\keyword
punctuation     -> '<=' | ',' | '(' | ')' | '{' | '}' 
                 | ':' | ':-' | '=' | '|'
classname       -> uppercamelcase
longclassname   -> modulealias uppercamelcase

where

uppercamelcase -> upper { alphanum }
lowercamelcase -> lower { alphanum }
modulealias    -> { uppercamelcase '.' }
lowers         -> lower { lower }

Input files are broken into tokens using the maximal munch rule, i.e., at each point, the next token is the longest sequence of characters that forms a valid token. Spaces and line comments are ignored except insofar as they separate tokens that would otherwise combine into a single token.

Line comments

A line comment starts with the terminal '--' followed by zero or more printable Unicode characters stopping at the first end of line ('\n' or '\r\n').

Syntax of LambdaBuffers Frontend files

A LambdaBuffers Frontend file defines a module that is a collection of data types, classes, instance clauses, and derive clauses.

The overall layout of a LambdaBuffers Frontend file is:

module -> 'module' longmodulename { import } { statement }

The file must specify the module's longmodulename, whose modulename must match the LambdaBuffers Frontend file's name (not including the .lbf extension). After that, the file may contain a sequence of imports followed by a sequence of statements.
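For example, a file named Demo.lbf might contain the following (a sketch; it assumes a Prelude module that exports the Eq class and the Maybe type):

module Demo

import Prelude (Eq, Maybe)

sum Wrapped a = Wrap (Maybe a)

derive Eq (Wrapped a)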

Import

Imports bring entities (types and classes) of other modules into scope.

import     -> 'import' [ 'qualified' ] longmodulename [ 'as' longmodulename ] [ importspec ]
importspec -> '(' [ { typename ',' } typename [','] ] ')'

If importspec is omitted, then all entities specified in the module are imported; otherwise only the specified entities are imported.
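A few example import forms (the module names are purely illustrative):

import Prelude
import Prelude (Maybe, Either)
import qualified Data.Collections
import qualified Data.Collections as Collections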

Statement

Statements define types, classes, instance clauses, and derive clauses.

statement -> typedef
           | classdef
           | instanceclause
           | deriveclause

Type definitions

Types may be either sum types, product types, record types, or opaque types.

typedef -> prodtypedef | sumtypedef |  recordtypedef | opaquetypedef

Product type definition

A product type definition defines a new product type.

prodtypedef -> 'prod' typename { varname } '=' prod
prod        -> { typeexp }
typeexp     -> varname
             | longtypename
             | '(' prod ')'

Product type definitions instruct the code generator to generate a product type for the target language.
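For example (a sketch):

prod Pair a b = a b

Here Pair takes two type arguments and its body is the product of a and b.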

Sum type definition

A sum type definition defines a new sum type.

sumtypedef     -> 'sum' typename { varname } '=' sum
sum            -> sumconstructor { '|' sumconstructor }
sumconstructor -> typename prod

Sum type definitions instruct the code generator to generate a sum type for the target language.
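For example (a sketch):

sum These a b = This a | That b | Both a b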

Record type definition

A record type definition defines a new record type.

recordtypedef -> 'record' typename { varname } '=' record
record        -> '{' [ field { ',' field  } ] '}'
field         -> fieldname ':' prod

Record type definitions instruct the code generator to generate a record type for the target language.
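For example (a sketch, assuming Text and Integer are opaque types in scope, e.g. imported from a Prelude-like module):

record Person = { name : Text, age : Integer }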

Opaque type definition

An opaque type definition defines a new opaque type.

opaquetypedef -> 'opaque' typename { varname }

Opaque type definitions must map to existing types in the target language, and it's up to the Codegen module to determine exactly how that's done.
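For example (a sketch):

opaque Set a
opaque Map k v

A Codegen module might map Set a to Data.Set.Set in Haskell, as discussed in the design notes above.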

Class definition

A class definition introduces a new class.

classdef       -> 'class' [ constraintexps '<=' ] classname { varname }
constraintexp  -> longclassname { varname }
                | '(' constraintexps ')'
constraintexps -> [ constraintexp { ',' constraintexp } ]

Class definitions communicate with the code generator the implementations that already exist (via instance clauses) or that one would like to generate (via derive clauses).
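For example (a sketch; Keyed is a hypothetical class used only to illustrate a parenthesized superclass list):

class Eq a
class Eq a <= Ord a
class (Eq k, Eq v) <= Keyed k v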

Instance clause

An instance clause specifies that a type is an instance of a class.

instanceclause -> 'instance'  constraint [ ':-' constraintexps ]
constraint     -> longclassname { typeexp }

Instance clauses do not instruct the code generator to generate code; instead, they inform the compiler (during semantic checking) that the target language environment provides type class implementations for the given type (provided that the given constraintexps also have implementations).
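For example (a sketch, assuming an Eq class is in scope and Map is an opaque type whose target-language representations already implement equality):

opaque Map k v

instance Eq (Map k v) :- Eq k, Eq v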

Derive clause

Derive clauses instruct the code generator to generate code for a type so that it is an instance of a class.

deriveclause -> 'derive' constraint

Note that the code generation of a class instance for a type is implemented via builtin derivation rules (which developers may extend).
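For example (a sketch, assuming an Eq class is in scope):

sum Direction = North | South | East | West

derive Eq Direction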

Syntax reference

The summarized productions of a LambdaBuffers Frontend file are as follows.

module -> 'module' longmodulename { import } { statement }

import     -> 'import' [ 'qualified' ] longmodulename [ 'as' longmodulename ] [ importspec ]
importspec -> '(' [ { typename ',' } typename [','] ] ')'

statement -> typedef
           | classdef
           | instanceclause
           | deriveclause

typedef -> prodtypedef | sumtypedef |  recordtypedef | opaquetypedef

prodtypedef -> 'prod' typename { varname } '=' prod
prod        -> { typeexp }
typeexp     -> varname
             | longtypename
             | '(' prod ')'

sumtypedef     -> 'sum' typename { varname } '=' sum
sum            -> sumconstructor { '|' sumconstructor }
sumconstructor -> typename prod

recordtypedef -> 'record' typename { varname } '=' record
record        -> '{' [ field { ',' field  } ] '}'
field         -> fieldname ':' prod

opaquetypedef -> 'opaque' typename { varname }

classdef       -> 'class' [ constraintexps '<=' ] classname { varname }
constraintexp  -> longclassname { varname }
                | '(' constraintexps ')'
constraintexps -> [ constraintexp { ',' constraintexp } ]

instanceclause -> 'instance'  constraint [ ':-' constraintexps ]
constraint     -> longclassname { typeexp }

deriveclause -> 'derive' constraint

LambdaBuffers Compiler

The Compiler component sits between the Frontend component and the code generation component named Codegen. The purpose of the Compiler is to perform various checks on the Compiler Input provided by the Frontend, ensuring that the supplied type, class and instance clause definitions are valid, or otherwise to communicate any error conditions determined during processing.

The end goal of the Compiler is to ensure that the Codegen component is capable of processing the Compiler Output by providing correct and complete information.

Compiler Interface

The Compiler operates on the Compiler Input message specified in the compiler.proto Google Protocol Buffers schema, enabling any Frontend to interface with the Compiler in a language-agnostic manner.

Similarly, the Compiler Output message is specified in the compiler.proto Google Protocol Buffers schema; it is then consumed by Codegen modules, which can be written in any programming environment capable of communicating via Google Protocol Buffers.

Refer to the Compiler Proto documentation for more information.

Checking Type Definitions

The first step the Compiler performs is kind checking and inference on type definitions provided by the Frontend and otherwise raises kind checking errors.

When successful, the Compiler outputs a Compiler Output that annotates each type term with kind information.

In standard type checking terminology LambdaBuffers 'terms' are abstract data type declarations and their 'types' are kinds.

Currently, the Compiler accepts:

  1. type terms of kind Type (such as Int or Bool),

  2. type function terms of kind Type → Type (such as Maybe or Either - though note that type functions are not "first class" in the sense that they cannot be passed as arguments to other type functions).

There are future plans to expand this to Higher Kinded Types (such as MaybeT, StateT etc - i.e. types parameterized on type function terms) - subject to research into Codegen of such types in the target languages.

The Compiler supports recursive types.

All LambdaBuffers type variable terms must be monomorphically kinded, with polymorphic kinds defaulting to monomorphic ones. For example, Phantom a = Phantom would resolve to the monomorphic kind Type → Type rather than the polymorphic kind ∀a. a → Type.
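Written as a schema definition, the Phantom example above reads (a sketch):

sum Phantom a = Phantom

The argument a is unused on the right-hand side, so it is defaulted to kind Type and Phantom is assigned the kind Type → Type.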

Kind Checker

The kind checker is designed around a simply typed lambda calculus type system with inference via unification (John Alan Robinson's unification algorithm is currently used). We've also introduced a let binding rule that allows later additions to the typing context. The typing rules are the following:

 x : σ ∈ Γ
----------- [Variable]
 Γ ⊢ x : σ


      Γ,x:σ ⊢ e:τ
------------------------ [Abstraction]
 Γ ⊢ (λ x:σ. e):(σ → τ)


Γ ⊢ x:(σ → τ)    Γ ⊢ y:σ
------------------------- [Application]
       Γ ⊢ x y : τ


Γ ⊢ e₁:σ     Γ, x:σ ⊢ e₂:τ
----------------------------- [Let]
   Γ ⊢ let x = e₁ in e₂ : τ

The type checking strategy is as follows:

  1. A context is built from the type definitions. Type variables which are not annotated with any further kind information are defaulted to be of kind Type.

  2. The RHS of the type terms are converted into their Sums of Products canonical representation. The variables from the left hand side of the term are introduced to the right hand side as abstractions.

  3. The derivation for each term is built.

  4. Then the unification tries to find a solution.

Terms for which unification cannot find a consistent derivation are deemed incorrect, and a kind checking error is thrown. Note that currently all inferred kinds are restricted to being monomorphic; therefore no free or universally quantified variables can appear in the final kind signature.
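As a sketch of a definition the kind checker would reject: because every type variable defaults to kind Type, applying one type variable to another cannot be given a consistent derivation.

sum Bad a b = Bad (a b)

Unification would require a to have both kind Type and kind Type → Type, so a kind checking (unification) error is reported.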

Checking Type Cardinality

In addition to kind checking, the Compiler could perform a special check for types to determine their cardinality. This is especially useful for catching and reporting non-inhabited types that users might define.

For example, data F a = F (F a) declares a non-inhabited recursive type that can't be constructed. The LambdaBuffers Compiler SHOULD reject such types, as they can't possibly be constructed, and generated typeclass instances for them would in turn be ill-defined.

This problem is equivalent to a problem of calculating graph components.

Normalizing Type Definitions

Finally, the Compiler should be able to normalize expressions. For example, it may be possible to define a data type in the schema language in a form similar to data G a = G ((Either) ((Maybe) a) Int), where the bracketing indicates the order of application within the term. The example term would normalize to data G a = G (Either (Maybe a) Int), resulting in cleaner (and more performant) code generation.

Checking Typeclass Definitions and Instance Clauses

The Compiler should, if possible, ensure that all instance declarations for schemata are derivable using hard-coded derivation axioms.

Other schema languages support generating type definitions in many languages from a single definition in the schema language. One key feature that sets LambdaBuffers apart from these alternatives is support for typeclasses, which enable the generation of ad-hoc polymorphic functions that operate on types generated from LambdaBuffers schemata.

The LambdaBuffers schema language doesn't allow users to specify typeclass instance implementations themselves. Users, instead, will write instance clauses as part of the schema definition, and the LambdaBuffers code generator will derive these declared instances when generating code.

Two important consequences of this design decision are:

  1. All instances must be derived structurally. As an example, consider the arbitrary product type data P a b = P a b. The semantics of the generated instance (i.e. the behavior of the generated code) must be determinable from the structure of the type - that it is a product - and the instances for its arguments a and b, and by those features alone. (Since a and b are type variables here, writing a direct instance for any interesting class is likely impossible, so LambdaBuffers supports constrained instances such as instance (C a, C b) => C (P a b))

  2. All instances must be uniform across supported languages. Because the LambdaBuffers Codegen component (and not the user) is responsible for generating instances, we must ensure that the Codegen component is suitably equipped to generate instances in each language that exhibit behavior which is, to the greatest extent possible, equivalent to the behavior of the generated instances in any other language. We must have an extensive test suite to verify uniform behavior of generated instances.

In languages with a typeclass system (Haskell, PureScript) or equivalent (Rust's Traits), we will utilize the existing system and should (to the extent that doing so is feasible) utilize existing typeclasses and instances from commonly used or standard libraries. In languages lacking a type system that is sufficiently rich to express typeclass relations, we will generate instances using idiomatic language features. (The details will depend on the particular language.)

LambdaBuffers Codegen

NOTE: The implementation of the code generation framework is still in early stages and will likely undergo substantial changes as development continues. This document serves to outline general principles that any implementation of the LambdaBuffers code generation framework ought to adhere to.

Requirements

  1. Modular & reusable components
  2. Ergonomic interface
  3. Extensible to new opaque types
  4. Extensible to new type classes

Modular & Reusable Components

Because the code generation modules for each target language will almost certainly constitute the bulk of the final LambdaBuffers codebase, it is essential that components of the code generation framework be as modular and reusable as is practicable.

Although each target language has its own distinct syntax and semantics, many syntactic forms exist in multiple target languages. For example, Haskell and PureScript both use a comma-separated sequence enclosed in square brackets (e.g. [1,2,3]) to represent a List _/[_] value, while Rust uses a similar form (in conjunction with a macro, e.g. vec![1,2,3]) to represent a Vector<_> value. To reduce redundant and superfluous code, common syntactic patterns such as this should be abstracted out into a set of functions that can be reused across languages.

Ergonomic Interface

While the LambdaBuffers team will support a finite set of specific target languages, adding support for an additional language should be as painless as possible (ceteris paribus) to encourage and facilitate open source contributions by third parties. A basic set of tools which can be used to write code generation modules for any target language should be developed, and all code generation modules written by the LambdaBuffers team should employ those tools (in order to provide a robust set of examples for future contributors, among other benefits).

Extensible to New Opaque Types

Users and contributors should be able to easily extend the default set of supported opaque types to support additional opaque types. In the context of code generation, this means: Users should have the ability to specify the target type for a given opaque type in the target language (including the package or module that contains the target type if the target type is not part of the language's standard library).

Because type class instances must be derived structurally, and because an opaque type is by definition a type with no visible internal structure, users should be provided with an ergonomic interface for noting the presence of a type class instance for an opaque type's target type in a particular language (if the instance exists in the standard library), or for referring to the module where such an instance is located (if the instance is defined in a library or by the user in one of their own modules).

Extensible to New Type Classes

Users and contributors should be able to easily extend the default set of supported type classes to support additional type classes and facilitate the derivation of instances for newly defined classes.

In practice, the type class code generation machinery must consist of two distinct parts: First, a set of deriving rules which correspond to instances that already exist (in some module or in a language's standard library/prelude). Second, code generation functions that operate on user-defined types which satisfy those rules.

Each of these parts should be ergonomic, well-documented, and the implementation of the default set of supported type classes ought to be thoroughly commented in order that users have a diverse set of real examples to work from.

LambdaBuffers Command Line Interface

LambdaBuffers consists of three runtime command line interface components: the Frontend CLI, the Compiler CLI and the Codegen CLI.

The Frontend CLI orchestrates work between the user and the Compiler and Codegen components.

It's desirable to have both the Compiler CLI and the Codegen CLI subject to a strict API with a specified set of flags to enable CLI implementation from various sources. This would be especially helpful with Codegen modules that bring about support for new targets, opaque types and typeclasses.

Comparison Matrix

Legend:

  • 🟢 Available (grading added in some cases spanning: Bad, Average, Good, Excellent)
  • 🟡 In development
  • 🔵 Potential future feature
  • 🔴 Not currently available
  • ❔ Not clear
| Feature | Proto Buffers | ADL | JSON Schema | Lambda Buffers | CDDL | ASN.1 |
| --- | --- | --- | --- | --- | --- | --- |
| Sum types | 🟢 | 🟢 | 🔴 | 🟢 | 🟢 | 🟢 |
| Record types | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 |
| Product types | 🔴 | 🔴 | 🔴 | 🟢 | 🔴 | |
| Recursive types | 🟢 | 🟢 | 🔴 | 🟢 | 🟢 | |
| Parameterized types (generic types) | 🔴 | 🟢 | 🔴 | 🟢 | 🟢 | 🔴 |
| Type annotations/constraints | 🟢 | 🟢 | 🟢 | 🔵 | 🟢 | 🟢 |
| Add new builtin types | 🔴 | 🟢 | 🔴 | 🟢 | 🔴 | 🔴 |
| Add new type semantics (e.g. different encodings) | 🟢 | 🟢 | 🔴 | 🟢 | 🔴 | 🟢 |
| Manage type semantics (at language level) | 🔴 | 🔴 | 🔴 | 🟢 | 🔴 | 🔴 |
| Codegen support | 🟢 (Excellent) | 🟢 (Average) | 🟢 (Excellent) | 🟡 | 🟢 (Bad) | 🟢 (Average) |
| DevOps tooling - build system integration | 🟢 | 🔴 | 🟡 | 🔴 | 🔴 | |
| Documentation tooling | 🟢 | 🔴 | 🟢 | 🔵 | 🔴 | |
| Formatting, linting, and development environment tools | 🟢 | 🔴 | 🟢 | 🟢 | 🔴 | 🔴 |
| Language checker API | 🟢 | 🔴 | 🟢 | 🟢 | 🔴 | 🔴 |
| Codegen API | 🟢 | 🟢 | 🔴 | 🟢 | 🔴 | 🔴 |
| Language specification | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 |
| Backwards compatibility strategy | 🟢 | 🔴 | 🔴 | 🔴 | 🔴 | 🔴 |

Descriptions

Sum Types

Types of the form Time = Present | Past | Future, which allow a type to be constructed by one of many variants. Think Rust's enums.

Product Types

Types of the form Person = MkPerson Age Name, where MkPerson is of Kind Type->Type->Type. Product types combine multiple elements into one data type without tagging the elements.

Record Types

Types of the form Person = MkPerson { age :: Age, name :: Name }. Record types are similar to structs in most programming languages.

Recursive Types

Recursive types are defined by the presence of the LHS type in its RHS definition. A classic example is:

List a = Nil | Cons a (List a)
^^^^^^                 ^^^^^^

Parameterized Types (Generics)

Type functions allow for the introduction of type variables in the LHS definition of the term, creating a parametrised type definition. The classic example is Maybe a, which is the equivalent of Option<A> in Rust:

Maybe a = Nothing | Just a

Using the above type definition we can now define another type that uses Maybe and instantiates it with Integer:

Time_Saved_via_LambdaBuffers = Maybe Integer

Type Annotations / Constraints

There exists a system of constraining or further annotating types - enriching the type's specification.

Add New Built-in Types

Refer to the design document.

Add New Type Semantics

Refer to the design document.

Manage Type Semantics (at Language Level)

Refer to the design document.

Codegen Support

Codegen support relates to the language being able to generate types for other programming languages.

DevOps Tooling - Build System Integration

The framework/language provides a seamless way of integrating with normal build tools and systems (eg. Bazel, Nix, etc.).

Documentation Tooling

The language can generate human readable documentation in an easy to share and view format. For example HTML, or Markdown.

Formatting, Linting, and Development Environment Tools

Tools that allow formatting, linting, and automating standardisation of the language specific files.

Language Checker API

The language checker component exposes an API to interface with itself in a language agnostic manner.

Codegen API

The language codegen component exposes an API to interface with itself in a language agnostic manner.

Language Specification

There exists a well defined language specification document.

Backwards Compatibility Strategy

The language makes certain backwards compatibility guarantees between versions of the same type definition.

References

  • https://json-schema.org/implementations.html
  • https://www.rfc-editor.org/rfc/rfc8610
  • https://github.com/timbod7/adl
  • https://www.itu.int/en/ITU-T/asn1/Pages/introduction.aspx
  • https://protobuf.dev/
  • https://github.com/dcSpark/cddl-codegen

Lambda Buffers: Aiken Research Document

The Lambda Buffers team has researched the Aiken programming language in depth, with the intention of finding a technical path to integrate it with Lambda Buffers alongside the already integrated languages Plutus, Haskell, Rust and JavaScript. The conclusion of this research phase is that, while it would indeed be possible for Lambda Buffers to provide limited support for Aiken, it would result in a poor user experience that would be in conflict with the major language features of Aiken or with the key functionalities provided by Lambda Buffers. This document presents in detail the challenges found and their impact on the feasibility or desirability of undertaking the Aiken integration.

Aiken limitations

This section describes limitations with Aiken.

All testing / observations with Aiken were done with the following version.

$ aiken --version
aiken v1.0.28-alpha+c9a1519

Aiken has no type class support

A key feature of Lambda Buffers is to provide both types and type class instances. Aiken has no support for type classes, so one must encode the type class system oneself. In other words, one must provide:

  • A calling convention for functions with type classes.

  • Functions to create instance dictionaries for instances of the type class.

The former is straightforward to implement. One can explicitly pass an instance dictionary to each function that requires a type class, as per the usual compilation of type class code to type-class-free code.

The latter requirement poses trouble for Aiken. Aiken does not allow one to create instance dictionaries (product types of the instance's methods), since composite types in Aiken cannot contain a function type.

For example, the following Aiken type

type EqDict<a> {
    eq: fn(a,a) -> Bool, 
    neq: fn(a,a) -> Bool, 
}

would produce an error as follows.

$ aiken build
    Compiling me/package 0.0.0 (/aikentest/package)
    Compiling aiken-lang/stdlib 1.8.0 (/aikentest/package/build/packages/aiken-lang-stdlib)
        Error aiken::check::illegal::function_in_type

  × While trying to make sense of your code...
  ╰─▶ I found a type definition that has a function type in it. This is not allowed.

   ╭─[/aikentest/package/validators/myvalidator.ak:3:1]
 3 │ type EqDict<a> {
 4 │     eq: fn(a,a) -> Bool,
   ·     ───────────────────
 5 │     neq: fn(a,a) -> Bool,
   ╰────
  help: Data-types can't hold functions. If you want to define method-like functions, group the type definition and the methods under a common namespace in a standalone
        module.

      Summary 1 error, 0 warnings

This makes it impossible to pass instance dictionaries via Aiken's builtin types for type classes.

Alternatively, one could try to sidestep Aiken's builtin types by creating a type which is a Church-encoded tuple (i.e., implementing a tuple type via function types), but doing so requires higher-ranked types, which again Aiken does not support. Moreover, it appears that Aiken does not provide any "back doors" to the type system (e.g. TypeScript's any type) to convince the type system that using a Church-encoded tuple and its projections is well typed.

It's clear now that having an explicit type for an instance dictionary is not feasible in Aiken. So, since an instance dictionary is a product type of functions, one can achieve type classes via dictionary passing by replacing each instance dictionary with multiple arguments (one per method of the type class), and replacing the function that creates an instance dictionary with multiple functions that create each method of the type class. This is indeed possible in Aiken; to demonstrate this technique, consider the following Haskell code (which loosely models code generated from Lambda Buffers).

class Eq a where
    eq :: a -> a -> Bool

class PlutusData a where
    toData :: a -> Data
    fromData :: Data -> a

data MyOption a = MyJust a | MyNothing

instance Eq a => Eq (MyOption a) where
    eq (MyJust s) (MyJust t) = s == t
    eq MyNothing  MyNothing  = True
    eq _          _          = False

instance PlutusData a => PlutusData (MyOption a) where
    toData (MyJust s) = Constr 0 [toData s]
    toData MyNothing  = Constr 1 []

    fromData (Constr 0 [s]) = MyJust (fromData s)
    fromData (Constr 1 [])  = MyNothing
    fromData _              = error "bad parse"

A translation to type class free code in Aiken is as follows.

use aiken/builtin as builtin
use mypackage/lb_prelude/types_/int as lb_prelude_int  // this would have to be implemented in an lb-prelude runtime

pub type MyOption<t> {
    MyJust(t)
    MyNothing
}

pub fn eq(eqt : fn(t,t) -> Bool) -> fn(MyOption<t>,MyOption<t>) -> Bool {
    fn(a,b) { 
        when a is {
            MyJust(aJust) -> when b is {
                MyJust(bJust) -> eqt(aJust,bJust)
                MyNothing -> False
            }
            MyNothing -> when b is {
                MyJust(_) -> False
                MyNothing -> True
            }
        }
    }
}

pub fn fromData(fromDataT : fn(Data) -> t) -> fn(Data) -> MyOption<t> {
    fn(theData) {
        let answer = 
            builtin.choose_data (theData, 
                  (fn(theData) { 
                        let constr = builtin.un_constr_data(theData)
                        let tag = constr.1st
                        let fields = constr.2nd
                          when tag is {
                                0 -> when fields is {
                                    [ justValue ] -> MyJust(fromDataT(justValue))
                                    _ -> error @"Bad parse"
                                }
                                1 -> when fields is {
                                    [] -> MyNothing
                                    _ -> error @"Bad parse"
                                }
                                _ -> error @"Bad parse"
                            }
                    })(theData), (fn(_theData) { error @"Bad parse" })(theData), (fn(_theData) { error @"Bad parse"})(theData), (fn(_theData){error @"Bad parse"})(theData), (fn(_theData) { error @"Bad parse"})(theData))
        answer
    }
}

pub fn toData(toDataT : fn(t) -> Data) -> fn(MyOption<t>) -> Data {
    fn(theOption) {
        when theOption is {
            MyJust(justValue) -> builtin.constr_data(0, [toDataT(justValue)])
            MyNothing -> builtin.constr_data(1, [])
        }
    }
}

// Example usages:

test from_nothing_test() {
    fromData(lb_prelude_int.fromData)(builtin.constr_data(1, [])) == MyNothing
}

test from_just_test() {
    fromData(lb_prelude_int.fromData)(builtin.constr_data(0, [builtin.i_data(69)])) == MyJust(69)
}

test from_to_nothing_test() {
    toData(lb_prelude_int.toData)(fromData(lb_prelude_int.fromData)(builtin.constr_data(1, []))) == builtin.constr_data(1, [])
}

test from_to_just_test() {
    toData(lb_prelude_int.toData)(fromData(lb_prelude_int.fromData)(builtin.constr_data(0, [builtin.i_data(69)]))) == builtin.constr_data(0, [builtin.i_data(69)])
}

This translation of type classes has some limitations such as:

  • All type class instances must be defined in the same module that the type is defined in i.e., orphan instances are forbidden.

  • Only Haskell2010 type classes would be supported.

While the above demonstrates how one can translate type class instances in Lambda Buffers to type-class-free code in Aiken, this unfortunately leads to a bad user experience for the "builtin" PlutusData type class in Aiken. Aiken by default "generates" its own PlutusData instances for all composite types, and it provides some nice syntactic features on top of them that make writing smart contracts particularly readable.

A common pattern to write a validator in Aiken is as follows.

pub type MyRecord<t> {a : t, b : Int }

validator {
    pub fn hello_world(_redeemer: MyRecord<Int>, _scriptContext: Data) {
                              //  ^~~~ this will automatically use Aiken's builtin PlutusData instances
        ...
    }
}

Unfortunately, with the type class system described in this section, an Aiken developer will no longer be able to write this (since Lambda Buffers would generate its own PlutusData instances that are not used by Aiken) and instead must write the validator more verbosely as follows.

pub type MyRecord<t> {a : t, b : Int }

validator {
    pub fn hello_world(redeemer: Data, _scriptContext: Data) {
       let actualRedeemer = myRecordFromData(intFromData)(redeemer)
        // ^~~~ Aiken users need to write more code in order to use Lambda
        // Buffers so that it will use Lambda Buffers' encoding for the
        // validator. Note that this sample assumes that `myRecordFromData :: (Data -> t) -> Data -> MyRecord<t>` 
        // exists as would be generated by Lambda Buffers.
        ...
    }
}

Clearly, this increase in the code needed to express the same simple idea contradicts Aiken's promise of making smart contracts easy to write.

Aiken's encoding of its data is different from Lambda Buffers' encoding

All onchain scripts must be compiled to UPLC, which must somehow represent the higher-level language constructs (such as data types) of the original language. Often, data types in a higher-level language are translated to UPLC's builtin Data type, which supports lists, constructors, integers, bytestrings, and maps. Note that any data which will exist as a datum or redeemer must admit a representation with this Data type.
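
For reference, the Data type in question is the one used by the Plutus libraries (roughly as defined in plutus-core and re-exported by plutus-ledger-api); a Haskell rendering of it is:

import Data.ByteString (ByteString)

-- The builtin Data type that datums and redeemers must be representable in.
data Data
  = Constr Integer [Data]   -- a constructor tag together with its fields
  | Map [(Data, Data)]      -- an association list
  | List [Data]             -- a list of Data values
  | I Integer               -- an integer
  | B ByteString            -- a bytestring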

Lambda Buffers chooses a particularly efficient encoding of its data types to Data for its target languages that compile to UPLC. For example, a record like

record MyRecord = { a : Integer, b : Integer }

would be translated to

[a, b]

i.e., records are lists of all record components. (There are some special cases in the Lambda Buffers encoding; for example, singleton records are encoded as just the single element.)

If Lambda Buffers compiled MyRecord to a record in Aiken as follows.

type MyRecord {
    a : Int,
    b : Int
}

Then, one can observe that Aiken will internally represent this as the following Data type

Constr 0 [a, b]

where we note that Aiken includes a redundant Constr 0 tag, meaning Aiken's encoding is less efficient than Lambda Buffers' encoding.
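
For concreteness, here is a small Haskell sketch (using the Data constructors shown earlier) contrasting the two encodings of a MyRecord value with a = 69 and b = -69; the names are illustrative only.

-- Sketch only: the same record value under the two encodings.
lambdaBuffersEncoded :: Data
lambdaBuffersEncoded = List [I 69, I (-69)]    -- records are plain lists of fields

aikenEncoded :: Data
aikenEncoded = Constr 0 [I 69, I (-69)]        -- Aiken wraps the fields in a redundant Constr 0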

In general, Aiken's documentation for the encoding from Aiken's types to UPLC Data is unclear, but one can inspect the generated UPLC to verify that Aiken would encode the data as mentioned above.

For example, given the following Aiken module

pub type MyRecord { a: Int, b: Int }

validator {
    pub fn hello_world(_redeemer: Data, _scriptContext: Data) {
        let theRecord = MyRecord(69, -69)

        theRecord.a == 420 && theRecord.b == -420
    }
}

One can compile and inspect the UPLC as follows

$ aiken build --uplc
...
$ cat artifacts/myvalidator.hello_world.uplc
(program
  1.0.0
  [
    (lam
      i_0
      [
        (lam
          i_1
          [
            (lam
              i_2
              [
                (lam
                  i_3
                  (lam
                    i_4
                    (lam
                      i_5
                      (force
                        [
                          [
                            [
                              i_3
                              (force
                                [
                                  [
                                    [
                                      i_3
                                      [
                                        [
                                          (builtin equalsInteger)
                                          [
                                            (builtin unIData)
                                            [
                                              i_1
                                              [
                                                i_2
                                                [
                                                  (builtin unConstrData)
                                                  (con
                                                    data (Constr 0 [I 69, I -69])
                                                  )
                                                ]
                                              ]
                                            ]
                                          ]
                                        ]
                                        (con integer 420)
                                      ]
                                    ]
                                    (delay
                                      [
                                        [
                                          (builtin equalsInteger)
                                          [
                                            (builtin unIData)
                                            [
                                              i_1
                                              [
                                                i_0
                                                [
                                                  i_2
                                                  [
                                                    (builtin unConstrData)
                                                    (con
                                                      data (Constr 0 [I 69, I -69])
                                                    )
                                                  ]
                                                ]
                                              ]
                                            ]
                                          ]
                                        ]
                                        (con integer -420)
                                      ]
                                    )
                                  ]
                                  (delay (con bool False))
                                ]
                              )
                            ]
                            (delay (con unit ()))
                          ]
                          (delay [ (error ) (force (error )) ])
                        ]
                      )
                    )
                  )
                )
                (force (builtin ifThenElse))
              ]
            )
            (force (force (builtin sndPair)))
          ]
        )
        (force (builtin headList))
      ]
    )
    (force (builtin tailList))
  ]
)

In particular, the following lines show that the record is encoded inefficiently as Constr 0 [<integer>, <integer>].

      [
        (builtin unConstrData)
        (con
          data (Constr 0 [I 69, I -69])
        )
      ]

This is awkward for Lambda Buffers: whenever Aiken works with the MyRecord type, it is represented with the Constr tag, so Lambda Buffers' efficient encoding must be translated to Aiken's less efficient one. Ideally, one would change how Aiken encodes its data types internally so that Aiken could use Lambda Buffers' encoding everywhere. As it stands, we lose the benefits of Lambda Buffers' efficient encoding when using Aiken's own mechanisms to define types, because Lambda Buffers is forced to take an extra step to translate to Aiken's encoding. As such, Aiken's opinionated way of encoding its data is at odds with Lambda Buffers.

To resolve the mismatch in the encoding of data between the two, one could alternatively sidestep all of Aiken's methods for defining types and instead use Aiken's opaque types to alias Data, providing one's own constructors and record accessors as follows.

use aiken/builtin as builtin

pub opaque type MyRecord { data: Data }

// Constructor for `MyRecord`
pub fn myRecord(a: Int, b: Int) -> MyRecord {
    MyRecord{ data : builtin.list_data([builtin.i_data(a), builtin.i_data(b)]) }
}

// Projection for the field `a` of `MyRecord`
pub fn myRecord_a(value : MyRecord) -> Int {
    builtin.un_i_data(builtin.head_list(builtin.un_list_data(value.data)))
}

// Projection for the field `b` of `MyRecord`
pub fn myRecord_b(value : MyRecord) -> Int {
    builtin.un_i_data(builtin.head_list(builtin.tail_list(builtin.un_list_data(value.data))))
}

validator {
    pub fn hello_world(_redeemer: Data, _scriptContext: Data) {
        let theRecord = myRecord(69, -69)

        myRecord_a(theRecord) == 420 && myRecord_b(theRecord) == -420
    }
}

Interested readers may inspect the compiled UPLC to verify that the data encoding of MyRecord agrees with Lambda Buffers' encoding.

(program
  1.0.0
  [
    (lam
      i_0
      [
        (lam
          i_1
          [
            (lam
              i_2
              [
                (lam
                  i_3
                  (lam
                    i_4
                    (lam
                      i_5
                      (force
                        [
                          [
                            [
                              i_3
                              [
                                (lam
                                  i_6
                                  (force
                                    [
                                      [
                                        [
                                          i_3
                                          [
                                            [
                                              (builtin equalsInteger)
                                              [
                                                (builtin unIData)
                                                [
                                                  i_1
                                                  [ (builtin unListData) i_6 ]
                                                ]
                                              ]
                                            ]
                                            (con integer 420)
                                          ]
                                        ]
                                        (delay
                                          [
                                            [
                                              (builtin equalsInteger)
                                              [
                                                (builtin unIData)
                                                [
                                                  i_1
                                                  [
                                                    i_0
                                                    [ (builtin unListData) i_6 ]
                                                  ]
                                                ]
                                              ]
                                            ]
                                            (con integer -420)
                                          ]
                                        )
                                      ]
                                      (delay (con bool False))
                                    ]
                                  )
                                )
                                [
                                  (builtin listData)
                                  [
                                    [ i_2 (con data (I 69)) ]
                                    [
                                      [ i_2 (con data (I -69)) ]
                                      (con (list data) [])
                                    ]
                                  ]
                                ]
                              ]
                            ]
                            (delay (con unit ()))
                          ]
                          (delay [ (error ) (force (error )) ])
                        ]
                      )
                    )
                  )
                )
                (force (builtin ifThenElse))
              ]
            )
            (force (builtin mkCons))
          ]
        )
        (force (builtin headList))
      ]
    )
    (force (builtin tailList))
  ]
)

Note that this would most likely offer a poor user experience, as it would essentially replace a large part of Aiken's language constructs with our own generated functions for construction, deconstruction, serialization / deserialization to Data, etc.

In either case, mediating the Data serialization / deserialization mismatch between Aiken and Lambda Buffers puts extra mental overhead on the Aiken developer. For example, as in the previous section, an Aiken developer would expect to write a validator as follows.

pub type MyRecord {a : Int, b : Int }

validator {
    pub fn hello_world(_redeemer: MyRecord, _scriptContext: Data) {
                              //  ^~~~ this will automatically use Aiken's builtin Data serialization and deserialization
        ...
    }
}

But, any of the solutions to mediate the Data encoding mismatch of Aiken and Lambda Buffers would force an Aiken developer to instead write a more verbose validator as follows.

pub type MyRecord {a : Int, b : Int }

validator {
    pub fn hello_world(redeemer: Data, _scriptContext: Data) {
        let actualRedeemer = myRecordFromData(redeemer)
                          // ^~~~ Assume that Lambda Buffers would generate `myRecordFromData :: Data -> MyRecord`
        ...
    }
}

Again, it's clear this contradicts Aiken's goal of making smart contracts easy to write, since Lambda Buffers integration would increase the mental overhead of working with all generated data types.

Aiken's packages only support fetching dependencies remotely

Lambda Buffers is more than just a code generator. In addition to generating code for sharing types and semantics, its Nix tools augment a set of packages for a project with a package of the generated Lambda Buffers code.

Aiken does support packages, but it appears to only officially support fetching packages from GitHub, GitLab, or Bitbucket; i.e., it's unclear how to create a local package set augmented with Lambda Buffers' packages.

For example, the following aiken.toml file

name = "me/package"
version = "0.0.0"
plutus = "v2"
license = "Apache-2.0"
description = "Aiken contracts for project 'package'"

[repository]
user = "me"
project = "package"
platform = "github"

[[dependencies]]
name = "aiken-lang/stdlib"
version = "1.8.0"

source = "github"
[[dependencies]]
name = "me/otherpackage"
version = "0.0.0"
source = "what do I put here if I have my own local package?"

would produce an error like

$ aiken build
        Error aiken::loading::toml

  × TOML parse error at line 20, column 10
  │    |
  │ 20 | source = "what do I put here if I have my own local package?"
  │    |          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  │ unknown variant `what do I put here if I have my own local package?`, expected one of `github`, `gitlab`, `bitbucket`
  │
    ╭─[/home/jared/Documents/Work/aikentest/mypackage/aiken.toml:19:1]
 19 │ version = "0.0.0"
 20 │ source = "what do I put here if I have my own local package?"
    ·          ────────────────────────────────────────────────────
 21 │
    ╰────

where the error message makes it clear that it only expects the source of dependencies to be from either GitHub, GitLab, or BitBucket.

As such, it's unclear how to augment the local set of packages with a Lambda Buffers package. Indeed, it's possible to trick Aiken into thinking that a Lambda Buffers package is already installed by preparing Aiken's build directory with the dependencies copied in ahead of time, but this delves into implementation-specific details of Aiken that may break between releases. An example of this technique is here.

Summary of Aiken limitations

This section summarizes the Aiken limitations and incompatibilities with Lambda Buffers.

  1. Aiken has no type classes, but Lambda Buffers requires type classes. As such, Lambda Buffers support for Aiken would require its own implementation of type classes. Unfortunately, implementing type classes is awkward in Aiken because composite data types in Aiken cannot store functions. While there are workarounds to implement type classes in Aiken, this fundamentally creates a poor user experience for an Aiken developer, since using Lambda Buffers' generated type classes such as PlutusData would be at odds with the syntactic conveniences of Aiken's builtin PlutusData instances.

  2. Aiken's PlutusData representation of its data types is different from Lambda Buffers' representation of PlutusData. This means that we have a choice of either:

    • Translating Lambda Buffers types to Aiken's builtin composite types, which would lead to inefficient code in the already constrained onchain environment, since this amounts to "massaging" PlutusData representations when we would really want Aiken to use Lambda Buffers' PlutusData encoding directly.

    • Translating Lambda Buffers types to an opaque type alias of Data in Aiken, which would require us to generate supporting functions for constructors and destructors, making Aiken's major language features obsolete and so giving a poor user experience.

    To put this more explicitly, we either have inefficient code with a somewhat nice user experience for an Aiken developer, or efficient code with an awful user experience for an Aiken developer.

  3. It is unclear how to create local package sets in Aiken, yet creating such local package sets is a principal feature of Lambda Buffers. Indeed, there are tricks one can use to work around this, but they depend on internal implementation details of Aiken that may break between releases.

All in all, at the moment it's clear that while it may be possible to integrate Aiken with Lambda Buffers, such integration would have

  • limited support for Lambda Buffers' key features; and

  • a poor user experience for Aiken developers that use Lambda Buffers.

So, the extra effort needed to mitigate these challenges appears to be counterproductive with respect to Aiken's and Lambda Buffers' project goals. Moreover, Aiken is still in an alpha release and is rapidly changing, so the effort to mitigate these challenges could be squandered as Aiken evolves. Thus, given these challenges, it would be unwise to undertake the Aiken implementation at the moment; it would be wiser to revisit this later and focus on more pressing matters today to better foster adoption of Lambda Buffers.

Lambda Buffers has fortunately seen industry use in other projects such as DeNS, TripHut DAO, etc., and there's been feedback to improve the existing facilities in Lambda Buffers which would aid in fostering the adoption of Lambda Buffers in the greater Cardano community. Some of these issues include the following.

Catalyst reports

Our project was funded by Catalyst. As a proof of completion, we collected our reports of finished Catalyst projects here:

Catalyst 9 reports

Catalyst milestone 1: Research

Outputs

  • A report summarizing user interviews and containing a qualitative analysis of the discovered use cases.
    • STATUS: Done (#17)
    • An interview (1.5h) was performed with 3 MLabs engineers whose work spans multiple Cardano dApp projects. Their feedback is made available in the repo.
    • Additionally, a survey was sent out to MLabs engineers and their feedback is made available in the repo.
  • An architecture design document.
    • STATUS: Done (#17)
    • Extensive documentation on design, compiler and codegen is made available in the repo.
  • A language specification document elaborating on the data type model features.
    • STATUS: Done (#17)
    • Extensive documentation on design, compiler and codegen is made available in the repo.
  • A related work document comparing the proposed technology via a feature matrix with others in the same space.
    • STATUS: Done (#17, #18)
    • Document comparing different schema technologies to LambdaBuffers is made available in the repo
  • An initial compiler implementation that performs some basic checks in accordance with the language specification.

Acceptance Criteria

  • At least 3 users/projects have been interviewed about their desired use case for this technology.
    • An interview (1.5h) was performed with 3 MLabs engineers whose work spans multiple Cardano dApp projects. Their feedback is made available in the repo.
    • Additionally, a survey was sent out to MLabs engineers and their feedback is made available in the repo.
  • The architecture design document is completed and available in the project repository.
  • The initial compiler implementation is completed, capturing SOME of the intended language semantics as described in the Language Specification

Evidence of Milestone Completion

  • Completed and reviewed design document is available in the project repository.
  • Completed and reviewed initial version of the compiler command line tool made available in the project repository.
    • The Frontend CLI called lambda-buffers-frontend-cli is made available in the repo and is currently able to parse, validate and format .lbf documents that contain the LambdaBuffers type modules:
lambda-buffers/lambda-buffers-frontend$ cabal run 
Usage: lambda-buffers-frontend-cli COMMAND

  LambdaBuffers Frontend command-line interface tool

Available options:
  -h,--help                Show this help text

Available commands:
  compile                  Compile a LambdaBuffers Module (.lbf)
  format                   Format a LambdaBuffers Module (.lbf)

There's ongoing work to integrate the Compiler CLI into the Frontend CLI.

  • Test case: Compiler is able to validate a schema that uses a subset of types and capabilities from the spec.

References

Catalyst milestone 2: End to end proof of concept

Outputs

  • A configuration DSL for specifying domain data types.
    • LambdaBuffers Frontend supports specifying modules with type definitions with opaque, product/record and sum types. Additionally, type class definitions are supported as well as type class rule definitions using the 'instance clause' and 'derive' syntax.
    • Refer to the standard LambdaBuffers library lbf-base to get a sense of what the language looks like.
  • A compiler tool that outputs the interpreted configuration.
  • A Codegen module that takes in the interpreted configuration and outputs a Haskell+PlutusTx (was Plutarch) Cabal project containing all the types and necessary type class wiring.
  • A Codegen module that takes in the interpreted configuration and outputs a Purescript+CTL Spago project containing all the types and necessary wiring.

Acceptance Criteria

  • The generated Haskell+Plutarch Cabal project can successfully be built.
  • The generated Purescript+CTL Spago project can successfully be built.
  • All the above codegen modules are reviewed and made available in the project repository.

Evidence of Milestone Completion

Demo recordings

Demo files:

References

Catalyst milestone 3: Testing and documentation

Outputs

  • A test suite checking for correct mapping from schema data types to PlutusData encodings against a known-good corpus of such mappings (golden tests).

    • A dedicated lbt-plutus test suite was implemented for both Haskell and Purescript backends. They leverage both golden unit testing approach and randomized property based testing to assert the essential properties of the LambdaBuffers Plutus package:
      • Plutus.V1.PlutusData derivation tests
        • Golden unit tests: forall (golden : Days.Day.*.pd.json): (toJson . toPlutusData . fromPlutusData . fromJson) golden == golden
        • Property tests: forall (x : Foo.*): (fromPlutusData . toPlutusData) x == x
      • Plutus.V1.PlutusData instance tests
        • Golden unit tests: forall (golden : *.pd.json): (toJson . toPlutusData . fromPlutusData . fromJson) golden == golden
  • A test suite checking for roundtrip compatibility between codegened target environments.

    • A dedicated lbt-plutus test suite was implemented for both Haskell and Purescript backends.
    • A dedicated lbt-prelude test suite was implemented for both Haskell and Purescript backends.
    • Both include golden unit tests that provide assurances that these backends implement the LambdaBuffers packages in a mutually compatible manner.
  • A modular and contract-based test suite architecture streamlining codegen testing compliance for any of the supported typeclasses.

  • A document mapping the schema data types and typeclasses to their corresponding code-generated variants in the target environments.

Acceptance Criteria

  • The test suites are passing for the Haskell+PlutusTx codegen module.
    • CI targets:
      • checks.x86_64-linux."check-lbt-prelude-haskell:test:tests"
      • checks.x86_64-linux."check-lbt-plutus-haskell:test:tests"
  • The test suites are passing for the Purescript+CTL codegen module.
    • CI targets:
      • checks.x86_64-linux."purescript:lbt-plutus:check"
      • checks.x86_64-linux."purescript:lbt-prelude:check"
  • The “Mappings” document is made available in the project repository.

Evidence of Milestone Completion

References

Documentation strategy

Each typeclass receives a Specification document that closely explains the required semantics for LB types (sum/products/records).

Each backend receives a User documentation that elaborates on how the generated code is used.

Testing strategy

LambdaBuffers works with typeclasses, such as Eq, PlutusData and JSON. When an LB type has been declared with support for any such typeclass, values of that type need to be handled in exactly the same manner as elaborated in the Specification for the given typeclass in ALL supported target language environments.

Concretely, if a type Foo has been declared with support for JSON, and toJson and fromJson methods have been generated for target languages, they have to be in correspondence.

module Foo

import Prelude (Json)

sum Foo = Bar Integer | Baz Text

deriving Json Foo

In Haskell and Purescript values of Foo would be Bar 1 and Baz "baz", and their respective JSON mappings would be {"constructor" : "Bar", "product" : [1]} and {"constructor" : "Baz", "product" : ["baz"]}.
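
As a rough illustration of what "in correspondence" means, here is a hedged Haskell sketch of a roundtrip law one could state for the generated Foo type; toJson, fromJson, and their exact signatures are assumptions about the generated code and may differ from the real runtime.

-- Sketch only: a roundtrip law relating the generated toJson and fromJson.
-- fromJson is assumed to return Either String Foo; real signatures may differ.
prop_fooJsonRoundtrip :: Foo -> Bool
prop_fooJsonRoundtrip x = fromJson (toJson x) == Right x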

Testing encoding typeclasses: from . to goldens

For each typeclass, we maintain a set of golden files of known good/bad examples; each golden is lifted into the target language with the from method and written back with the to method into a separate file. Then we assert that the original and written-back files are semantically equal (for example, JSON whitespace doesn't matter).
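
A hedged Haskell sketch of such a golden check follows; the helpers fromJson, toJson and jsonEquiv are hypothetical stand-ins, and the real test suites are organised differently.

-- Sketch only: read a known-good golden, lift it with `from`, write it back
-- with `to`, and assert the two files are semantically equal.
goldenFromToJson :: FilePath -> FilePath -> IO Bool
goldenFromToJson goldenPath outPath = do
  goldenText <- readFile goldenPath
  case fromJson goldenText of
    Left _err -> pure False                   -- a "known bad" golden is expected to fail here
    Right x -> do
      writeFile outPath (toJson (x :: Foo))
      outText <- readFile outPath
      pure (jsonEquiv goldenText outText)     -- semantic equality, e.g. whitespace-insensitive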

Example test:

  • test/foo/Foo.lbf
  • test/foo/json/bar.1.json
  • test/foo/json/bar.2.json
  • test/foo/json/baz.1.json
  • test/foo/json/baz.2.json
  • test/foo/json/haskell/bar.1.json
  • test/foo/json/haskell/bar.2.json
  • test/foo/json/haskell/baz.1.json
  • test/foo/json/haskell/baz.2.json
  • test/foo/json/purescript/bar.1.json
  • test/foo/json/purescript/bar.2.json
  • test/foo/json/purescript/baz.1.json
  • test/foo/json/purescript/baz.2.json
  • test/foo/plutusdata/bar.1.json
  • test/foo/plutusdata/bar.2.json
  • test/foo/plutusdata/baz.1.json
  • test/foo/plutusdata/baz.2.json
  • test/foo/plutusdata/haskell/bar.1.json
  • test/foo/plutusdata/haskell/bar.2.json
  • test/foo/plutusdata/haskell/baz.1.json
  • test/foo/plutusdata/haskell/baz.2.json
  • test/foo/plutusdata/purescript/bar.1.json
  • test/foo/plutusdata/purescript/bar.2.json
  • test/foo/plutusdata/purescript/baz.1.json
  • test/foo/plutusdata/purescript/baz.2.json

Testing equality typeclasses: symmetry and transitivity

  • sym :: (a :~: b) -> b :~: a
  • trans :: (a :~: b) -> (b :~: c) -> a :~: c

This could be done with goldens and randomized testing. However, the goldens approach assumes a correct marshaling procedure, while the randomized approach assumes generators. Perhaps having both makes sense.
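
A hedged QuickCheck-style sketch of how these laws could be exercised with randomized generators (the Foo type and its generator are assumed; names are illustrative):

import Test.QuickCheck (Property, (==>))

-- Sketch only: symmetry and transitivity of the generated equality for Foo.
prop_eqSym :: Foo -> Foo -> Bool
prop_eqSym a b = (a == b) == (b == a)

prop_eqTrans :: Foo -> Foo -> Foo -> Property
prop_eqTrans a b c = (a == b && b == c) ==> (a == c)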

TODO (provide time estimates)

  1. Define the version 1 of Prelude and Plutus LBF libraries which MUST be supported
  2. Support the Json typeclass
    1. Codegen
    2. Runtime support
  3. Support the Plutarch backend
  4. Document typeclasses
    1. Eq
    2. PlutusData
    3. Json
  5. Document backends
    1. Sum/Product/Record type construction and deconstruction (Haskell, Purescript, Plutarch)
    2. Eq/Json/PlutusData typeclass use (Haskell, Purescript, Plutarch minus Json)
  6. Devise goldens
    1. Eq
    2. PlutusData
    3. Json
  7. Implement the test suite
    1. Provide assertions
    2. Hook to the CI

Catalyst 10 reports

Catalyst milestone 1: Rust support

Outputs

  • A LambdaBuffers code generation module that outputs type definitions and derived implementations in the Rust programming language given a LambdaBuffers schema.
  • A Rust library that implements the LambdaBuffers Prelude runtime. This module would include standardised JSON encoding and equality implementations for all declared type class instances in the schema.
  • A Rust test suite that assures the manually implemented and automatically generated implementations are consistent with the predefined LambdaBuffers Prelude golden data set of JSON files and perform correct implementation derivation.
  • A Rust library that implements the LambdaBuffers Plutus runtime. This module would include standardised PlutusData encoding implementations for all declared type class instances in the Plutus schema.
    • A standalone library was implemented (still in active development) with Plutus ledger types. LambdaBuffers runtime functionality is included in this library: plutus-ledger-api-rust.
  • A Rust test suite that assures the manually implemented and automatically generated implementations are consistent with the predefined LambdaBuffers Plutus golden data set of PlutusData encoded files and perform correct implementation derivation.
  • Nix devops modules (Nix API) for streamlining the LambdaBuffers code generation pipeline to Rust.
  • Documentation on LambdaBuffers usage patterns for Rust.

Acceptance Criteria

  • LambdaBuffers schemas that are based on the LambdaBuffers Prelude module can be used in Rust projects to specify application types.
  • LambdaBuffers schemas that are based on the LambdaBuffers Plutus module can be used in Rust projects to specify application types.
  • The documentation and devops tooling is available to facilitate easy adoption.

Evidence of Milestone Completion

Demo files

Catalyst milestone 2: Javascript/Typescript support

While the milestone technically requires Javascript support, we implemented Typescript support, which is a typed superset of Javascript. This was done to better interoperate with existing Javascript/Typescript libraries on Cardano such as lucid, cardano-js-sdk, cardano-serialization-lib, etc. With Typescript support, we get Javascript support for free, as compilation from Typescript to Javascript is a well-established practice.

Outputs

  • A LambdaBuffers code generation module that outputs type constructors and derived implementations in the Javascript programming language given a LambdaBuffers schema.

  • A Javascript library that implements the LambdaBuffers Prelude runtime. This module would include standardised JSON encoding and equality implementations for all declared type class instances in the schema.

  • A Javascript test suite that assures the manually implemented and automatically generated implementations are consistent with the predefined LambdaBuffers Prelude golden data set of JSON files and perform correct implementation derivation.

  • A Javascript library that implements the LambdaBuffers Plutus runtime. This module would include standardised PlutusData encoding implementations for all declared type class instances in the Plutus schema.

  • A Javascript test suite that assures the manually implemented and automatically generated implementations are consistent with the predefined LambdaBuffers Plutus golden data set of PlutusData encoded files and perform correct implementation derivation.

  • Nix devops modules (Nix API) for streamlining the LambdaBuffers code generation pipeline to Javascript.

  • Documentation on LambdaBuffers usage patterns for Javascript.

Acceptance Criteria

  • LambdaBuffers schemas that are based on the LambdaBuffers Prelude module can be used in Javascript projects to specify application types.
  • LambdaBuffers schemas that are based on the LambdaBuffers Plutus module can be used in Javascript projects to specify application types.
  • The documentation and devops tooling is available to facilitate easy adoption.

Evidence of Milestone Completion

Demo files

Catalyst milestone 3: Aiken integration research and development

Outputs

  • Document describing Aiken integration and how PlutusData encodings can be provided by the Lambda Buffers toolkit. NOTE: We cannot ignore the fact that the outcome of the research done in Milestone 3 could point to incompatibilities that make an Aiken integration with Lambda Buffers infeasible. Should that be the case, an in-depth technical explanation of the limitations will be provided.
    • The document can be found here.

Acceptance Criteria

  • The generated document describes all the technical components relevant to the integration of Aiken with Lambda Buffers, showing a good understanding of them and how they would be relevant to the intended integration.
    • The document discusses relevant technical components of Aiken integration with Lambda Buffers.
  • The generated document describes a clear technical path to integrate Aiken with Lambda buffers and, in addition or alternatively, an in-depth analysis of any limitations found.
    • The document discusses limitations of Aiken and its incompatibilities with Lambda Buffers along with workarounds when possible.

Evidence of Milestone Completion

  • The completed and reviewed "Lambda Buffers for Aiken - PlutusData encoding implementation provider" documentation is available.
    • Methods to workaround the incompatibilities of Aiken and Lambda Buffers where possible are provided in the document.
  • Alternatively, if Aiken integration proves to be infeasible:
  • The completed and reviewed documentation describing the Aiken limitations and their impact is available.
    • The discussion of the limitations and whether they are worthwhile are also included in the document.

Demo files

Catalyst milestone 4: Separate PlutusTx backend and improvements to existing LambdaBuffers facilities

Outputs

  • A separate code generator for Haskell and PlutusTx

    • A separate LambdaBuffers code generation module that outputs type definitions and derived implementations for Haskell's Prelude and PlutusTx's Prelude (or equivalent LambdaBuffers' Preludes) given a LambdaBuffers schema.

      • The common files for code generation of both Haskell's Prelude and PlutusTx's Prelude can be found here

      • The Haskell code generator implementation can be found here

      • The PlutusTx code generator implementation can be found here

    • A Haskell test suite that assures the manually implemented and automatically generated implementations are consistent with the predefined LambdaBuffers Prelude golden data set of JSON files and perform correct implementation derivation.

      • The Haskell test suite can be found here
  • Nix devops modules (Nix API) for streamlining the LambdaBuffers code generation pipeline to either Haskell's Prelude or PlutusTx's Prelude.

    • A Nix API for streamlining LambdaBuffers code generation for Haskell's Prelude can be found here

    • The Nix API for streamlining LambdaBuffers code generation for PlutusTx's Prelude can be found here

  • Documentation on LambdaBuffers usage patterns for Haskell's Prelude and PlutusTx's Prelude

    • Documentation for Haskell's Prelude can be found here.

    • Documentation for PlutusTx's Prelude can be found here.

  • A complete Plutus .lbf schema file to include all Plutus Ledger API types with backend support for Rust, TypeScript, and PureScript.

    • The .lbf schema file for V1 Plutus Ledger API can be found here

    • The .lbf schema file for V2 Plutus Ledger API can be found here

    • Rust backend support is given by the JSON file here where the types are provided by this package.

    • TypeScript backend support is given by the JSON file here where the types are provided by this package

    • PureScript backend support is given by the JSON file here where the types are provided by this package

  • An extended integration test suite to verify that the manually implemented and automatically implemented instances of the updated LambdaBuffers' Plutus .lbf schema file are consistent across all backends.

    • The extended integration test suite to verify that the manually implemented and automatically implemented instances of the updated LambdaBuffers' PlutusTx .lbf schema file are consistent across all backends can be found here.
  • A versioning scheme for the LambdaBuffers git repository using git tags that follow semantic versioning.

    • A versioning scheme for the git repository using git tags that follows semantic versioning is given in the Releases page

    • Documentation for how the versioning scheme interplays with the monorepo setup is described in the CONTRIBUTING.md

  • Optimized Nix build times.

    • This was fixed in #233. It's well known that (in the version of Nix used with LambdaBuffers) large flake.lock files are a detriment to performance. So prior to #233, e.g. at commit 9ae3705f3f1a5c2506c7f86c8d5643c38d79e849, we see that

      $ git checkout 9ae3705f3f1a5c2506c7f86c8d5643c38d79e849
      $ time nix flake lock
      
      real    0m27.437s
      user    0m27.331s
      sys     0m0.116s
      
      $ wc -l flake.lock
      44552 flake.lock
      

      it takes almost 28 seconds to execute the nix flake lock command due to the almost 45,000-line flake.lock file.

      After the merge of #233, e.g. at commit 401f8a920a557c71440795174da199a1e128c4f9, we see significantly improved Nix performance

      $ git checkout 401f8a920a557c71440795174da199a1e128c4f9
      $ time nix flake lock
      
      real    0m1.423s
      user    0m1.348s
      sys     0m0.067s
      
      $ wc -l flake.lock
      11585 flake.lock
      

      where it now takes only about 1.5 seconds to execute the nix flake lock command due to the significantly reduced flake.lock file, now just under 12,000 lines.

  • Error messages that follow the GNU error message format.

    • This was implemented in PR #147 and PR #249. See below for examples of following the GNU error message format.

Acceptance Criteria

  • An executable and Nix tooling to translate LambdaBuffers modules to Haskell projects to specify application types.

    • An executable to translate LambdaBuffers modules to Haskell projects is given in the Nix attribute lbf-prelude-to-haskell defined here

    • Convenient Nix tooling to translate LambdaBuffers modules to Haskell projects is given in the Nix function lbfPreludeHaskell defined here

  • An executable and Nix tooling to translate LambdaBuffers modules to PlutusTx projects to specify application types.

    • An executable to translate LambdaBuffers modules to PlutusTx projects is given in the Nix attribute lbf-plutus-to-plutustx defined here

    • Convenient Nix tooling to translate LambdaBuffers modules to PlutusTx projects is given in the Nix function lbfPlutusTx defined here

  • An updated LambdaBuffers Plutus schema for the complete Plutus Ledger API types.

    • The updated LambdaBuffers Plutus schema for the complete Plutus Ledger API types can be found here
  • The documentation and devops tooling is available to facilitate easy adoption.

    • The entire project's documentation can be found in the documentation website here

    • Devops tooling can be found in the form of Nix functions for conveniently creating Haskell or PlutusTx projects found here.

  • Git tags for official releases of LambdaBuffers.

    • The releases page contains git tags for official releases of LambdaBuffers.

Evidence of Milestone Completion

  • The completed and reviewed LambdaBuffers code generator for Haskell's Prelude.

    • The completed common files for code generation of both Haskell's Prelude and PlutusTx's Prelude can be found here

    • The Haskell code generator implementation can be found here

  • The completed and reviewed LambdaBuffers code generator for PlutusTx's Prelude.

    • The completed common files for code generation of both Haskell's Prelude and PlutusTx's Prelude can be found here

    • The PlutusTx code generator implementation can be found here

  • Benchmarks of the before and after optimized Nix build times.

    • This was fixed in #233, see above for benchmarks.
  • Demonstrations (in the form of screenshots or simply text) of error messages following the GNU error message format.

    • We can see the following error messages follow the GNU error message format
    $ lbf-prelude-to-haskell DoesNotExist.lbf
    DoesNotExist.lbf: error: the provided module file DoesNotExist.lbf doesn't exist
    
    $ cat TEST.lbf
    module TEST
    
    import Prelude (Eq, Json, Maybe, Either, List, Integer)
    
    derive Eq What
    
    $ lbf-prelude-to-haskell TEST.lbf
    TEST.lbf:5.11-5.15: error: [module TEST] type What not found in the module's scope Either Integer List Maybe Prelude.Either Prelude.Integer Prelude.List Prelude.Maybe
    
    $ cat ANOTHERTEST.lbf
    module ANOTHERTEST
    
    import Prelude (Eq, Json, Maybe, Either, List, Integer)
    
    prod What = (Integer
    derive Eq What
    
    $ lbf-prelude-to-haskell ANOTHERTEST.lbf
    ANOTHERTEST.lbf:6.7: error:
    unexpected keyword
    expecting s-expression
    

References