Keyboard shortcuts

Press or to navigate between chapters

Press S or / to search in the book

Press ? to show this help

Press Esc to hide this help

🦊 The Phox Programming Language

Phox is a small functional programming language.

It aims to be a simple yet expressive tool - your clever companion for exploring type theory and practical programming.

✨ Features

  • Hindley–Milner type inference
    No need to annotate types in most cases. (In fact, Phox has no type-annotation syntax)
  • Algebraic data types (ADT)
    Define expressive data structures with variants.
  • Pattern matching
    Concise and powerful destructuring.
  • Newtype shorthand
    Cleaner syntax for single-constructor wrapper types.
  • First-class functions
    Functions are values, operators are functions too.
  • Generic function temlates
    Non-first class / overloadable generic function templates.
  • Multi-parameter typeclasses (trait/impl)
    Define the relationship between multiple methods and multiple types.
  • Trait record
    Typeclasses (trait/impl) can be instansiated as a first-class record value, with a simple syntax.
  • Module system
    Rust like module / namespace definition.
  • Simple syntax
    Inspired by ML-family languages, with a focus on clarity.

Install Phox

⚠️ Work in progress — Phox is under active development.

Build

Clone the GitHub repository and build with Rust (tested on 1.80+, may work on earlier versions):

git clone https://github.com/mori0091/phox.git
cd phox
cargo build

Run in REPL

If you run without arguments, Phox starts an interactive REPL:

cargo run
> let rec fact = λn. if (0 == n) 1 else n * fact (n - 1);
> fact 5
=> 120: Int

In REPL, an input starts with : is recognized as a REPL command.
For example, :? shows the list of available commands:

cargo run
> :?

:quit, :q
    exit REPL.

:help, :h, or :?
    print this help messages.

:load <path>, :l <path>
    load and evaluate Phox source file specified by <path>.

:modules, :m
    print list of root modules.

:using, :u
    print list of using aliases for each modules.

:symbols, :s
    print list of symbols for each modules.

:impls, or :impls [options]
    print list of `impl`s.
    options:
        -v, or --verbose
            print also the `impl`s' implementation.

Run a program file

Pass a .phx file to execute it:

  • .phx is the conventional extension for Phox source files (plain text).
  • .txt files are also accepted.
cargo run examples/fact.phx
=> 120: Int

Run from stdin

You can also pipe code from stdin (use - explicitly):

echo "1 + 2" | cargo run -
=> 3: Int

Resulting value and inferred type are printed on success, like this:

=> value: type

On error, an error message is printed, like this:

parse error: UnrecognizedToken { /* snip... */ }

First Program

> type HelloWorld = Hi;
=> (): ()

> Hi
=> Hi: HelloWorld

Note: Phox currently has no string type or I/O.
That means you can’t even write “Hello, World!” yet —
but you can prove that fact 5 = 120 with full type safety 😉

Pipeline Example

> counter 1 |> take 5 |> fold (+) 0
=> 15: Int

> counter 1 |> take 10 |> fold (+) 0
=> 55: Int

> counter 1 |> take 5 |> fold (*) 1
=> 120: Int

> counter 1 |> map |(* 2) |> take 5 |> fold (flip Cons) Nil
=> Cons 10 (Cons 8 (Cons 6 (Cons 4 (Cons 2 Nil)))): List Int

> counter 1 |> filter (\x. x % 2 == 0) |> take 5 |> collect Nil
=> Cons 2 (Cons 4 (Cons 6 (Cons 8 (Cons 10 Nil)))): List Int

Comments

Line comments

In regular expression: //[^\n\r]*

  • starts with //, followed by any characters (single-line text), and ends with newline.
// This is a line-comment.

////// This is also a line-comment.

Similar to C/C++ line-comments.

Block comments

In regular expression: /\*[^*]*\*+(?:[^/*][^*]*\*+)*/

  • starts with /*, followed by any charcters (single or multi-line texts), and ends with */.
  • Nesting is not allowed.
/* This is a block comment. */

/*
 * This is also a block comment.
 */

/*****
 *** This is also a block comment.
 **/

Similar to C/C++ block-comments.

Semicolon (item separator)

  • ; separates multiple items (declarations / statements / expressions) in a block or at the top level.
  • Each item is evaluated in order; only the last expression’s value is returned.
  • If a block or top-level input ends with ;, an implicit () is added.
// Multiple items in a block
{
    let x = 1; // => (): ()  (discarded)
    let y = 2; // => (): ()  (discarded)
    x + y;     // => 3: Int  (discarded)
    2 * x + y  // => 4: Int  (result)
}
// => 4: Int
// Multiple items in the top level
let x = 1; // => (): ()  (discarded)
let y = 2; // => (): ()  (discarded)
x + y;     // => 3: Int  (discarded)
2 * x + y  // => 4: Int  (result)
// => 4: Int
// Items in a block ends with `;`
{
    1 + 2; // => 3: Int  (discarded)
}
// => (): ()
// Items in the top level ends with `;`
1 + 2; // => 3: Int  (discarded)
// => (): ()
// No items in a block
{}
// => (): ()

Primitive Types

typevaluesdescription
()()Unit type, that consists of single value ().
Booltrue, falesBoolean type. True or false.
Int0, 100, -1, …Integer type. (currently 64-bit integer)

Type definitions

type Option a = Some a | None;
type Pair a b = Pair a b;
type Result e a = Ok a | Err e;
  • Variants can take 0 or more arguments.
  • Newtype shorthand is available when:
    • There is only one variant, and
    • The type name and constructor name are the same, and
    • The variant has exactly one tuple or one record argument.
// Normal form
type Point a = Point @{ x: a, y: a };
type Wrapper a = Wrapper (a,);

// Newtype shorthand
type Point a = @{ x: a, y: a };
type Wrapper a = (a,);

Language constructs

Identifiers

Variables and type variables

In regular expression:[a-z_]([-]*[a-zA-Z0-9_]+)*['?]*

  • seq, x', some?, foo-bar
  • fooBar, a0, take_while

Constructor, type constructor, and trait names

In regular expression:[A-Z][a-zA-Z0-9_]*

  • List, Nil, Eq, Iter

Module names

In regular expression:[a-z_]([-]*[a-zA-Z0-9_]+)*

  • core, cmp, iter

Module path (qualified module names)

A sequence of module names separated by ::

  • ::core::cmp, ::core::list Absolute path
  • cmp, list, foo::bar Relative path

Qualified names

A module path followed by a variable, constructor, type constructor, or trait name

  • ::core::iter::seq, ::core::iter::Seq, ::core::iter::Iter
  • iter::seq, iter::Seq, iter::Iter

Literals

  • () Unit value literal
  • true, false Boolean value literals
  • 0,100, -1 Integer value literals

Primitive types

  • () Unit type
  • Bool Boolean type
  • Int Integer type
  • (t,), (t1, t2) Tuple type
  • @{x:t1, y:t2} Record type
  • t1 -> t2 Function type
  • T t, t1 t2 Type-level function application

Operators

Infix (binary) operators

  • ==, != Equality / Non-equality operators
  • <=, <, >, >= Comparison operators
  • &&, || Logical operators
  • +, -, *, /, % Arithmetic operators
  • |>, <| Pipeline operators
  • >>, << Function composition operators

Prefix operators

  • nagete, - Arithmetic negation
  • not, ! Boolean not

User defined infix operators

In regular expression:[*+\-/!$%&=^?<>]+

  • ?>, ===, ++

Operators as functions

  • (==) 1 1
  • (+) 1 2

Functions and constructors as operators

  • 1 `Cons` Nil
  • x `@{Eq Int}.(==)` y

Declarations

At the top level of a module:

  • mod Define sub-module
  • use Import identifiers
  • type Define Algebraic Data Type (ADT)
  • trait Define Multi-Parameter Type Class (MPTC)
  • impl Define MPTC instance implementation
  • *let Define generic/overloaded function template
  • let Define a variable or function
  • let rec Define a recursive function

Statements

Within a local scope:

  • let Variable / function binding
  • let rec Recursive function binding

Note

In the case of let rec p = e;,

  • p must be a variable pattern.

In the case of let p = e;,

  • At the top level of a module, p must be a variable pattern.
  • Within a local scope, any pattern can be used for p.

Expressions

Value expressions

(evaluate to themselves; safe to generalize)

  • (), true, 0 Literals
  • λp.e/\p.e Lambda abstraction (function)
  • (e1,), (e1, e2) Tuple value
  • @{x = e1, y = e2} Record value
  • @{T t1 t2} Trait record value
  • |(e op)/|(e op _) Infix-operator partial application (bind the 1st argument)
  • |(op e)/|(_ op e) Infix-operator partial application (bind the 2nd argument)

Infix-operator partial applications (a.k.a. section syntax) are desugared into lambda abstraction.

Note

Old section syntax #(e op)/#(e op _) and #(op e)/#(_ op e) are deprecated.
Use new syntax |(e op)/|(e op _) and |(op e)/|(_ op e) instead.

Important

Old record value syntax @{x: e1, y: e2} are deprecated and no longer supported.
Use new syntax @{x = e1, y = e2} instead.

Non-value expressions

(require evaluation; not eligible for generalization)

  • x Evaluate variable
  • e1 e2 Function application
  • e.0 Tuple index access
  • e.x Record field access
  • { stmt1; stmt2; expr } Block expression (evaluates to the last expression; introduces a new scope)
  • if (e1) e2 else e3 If then else
  • match (e) { p1 => e1, .. } Pattern matching

Patterns

  • _ Wildcard pattern
  • (), true, 0 Literal pattern
  • x, foo Variable pattern
  • Nil, Cons p ps Constructor pattern
  • (p1,), (p1, p2) Tuple pattern
  • @{x = p1, y = p2}, @{x, y} Record pattern

Important

Old record pattern syntax @{x: p1, y: p2} are deprecated and no longer supported.
Use new syntax @{x = p1, y = p2} instead.

Design Notes

Phox Type System (Future)

(kind)variableconstraint solverconstraintvar-gen / union-findconstraint set
KindKindVarId $aKindSolverKindConstraintKindVarContextKindConstraintSet
TypeTypeVarId aUnifiedSolver (Type + MPTC)TypeConstraintTypeContextUnifiedConstraintSet (Type + MPTC)
RowRowVarId @aUnifiedSolver (Row)RowConstraintRowContextUnifiedConstraintSet (Row)
NatNatVarId #aUnifiedSolver (Nat)NatConstraintNatContextUnifiedConstraintSet (Nat)

Note

TODO: Change section-syntax from #(op x) to |(op x).

  1. Add new syntax |(op x) and mark old syntax as deprecated.
  2. After Nat solver is implemented, remove old syntax #(op x).

Note

  • $a, @a, and #a are primarily used in pretty-printed inferred types.
  • KindVarId $a is internal use only for kind inference.
  • In user code,
    • type-level variables in expressions should generally be written as plain identifiers like a.
    • type-level variables in type/trait declarations should be written as a and #a for Type and Nat variables respectively.
  • As a future extension, ?a, #a, and @a may be accepted as syntactic sugar in expressions for explicit kind annotations (a : Type), (a : Nat), and (a : Row) respectively.

Note

Maybe Nat kind will not be implemented…
it’s not so useful.

Rough sketch of Phox Type System process-pipeline

  1. parse: source code “a” → RawAST such as RawTyVar(“a”)
  2. resolve phase: RawTyVar(“a”) → RowVarId(r), NatVarId(n), or TypeVarId(t)
  • if a appeared in Row context (@{ .. | a }), a shall be RowVarId(r)
  • if a appeared in type parameters (@{T a}), a shall be NatVarId(n) or TypeVarId(t)
    • check type/trait declarations and its parameters’ kind,
    • then assign NatVarId(n) or TypeVarId(t) according to the kind.
  1. infer phase: infer and solve with UnifiedSolver
  2. apply phase: bake specialized code

Note

  • row-var appears only in record/row contexts such as @{...}, so no ambiguity arises.
  • Since both type-var and nat-var can appear as type parameters only in trait record @{T a b}, they must resolve to either NatVarId or TypeVarId after kind-check and kind-assignment in resolve phase.
  • Kind inference and KindSolver are not needed for Phox.

Rough sketh of Phox Type System data sturctures

#![allow(unused)]
fn main() {
// Constructors for Kind expression
pub enum Kind {
    Fun(Box<Kind>, Box<Kind>),   // κ1 -> κ2
    Type,                        // τ
    Row,                         // ρ
    // Nat,                      // ν
}

pub enum TypeExpr {
    Ty(Type),
    Row(Row),
    // Nat(Nat),
}

// pub fn kind_of(texpr: &TypeExpr) -> Kind {
//     match texpr {
//         TypeExpr::Ty(ty) => match ty {
//             Type::Var(_) => {
//                 todo!()
//             }
//             Type::Con(_name) => {
//                 todo!()
//             }
//             Type::App(ref t1, ref t2) => {
//                 Kind::Fun(Box::new(kind_of(t1)), Box::new(kind_of(t2)))
//             }
//             Type::Fun(_, _) | Type::Tuple(_) | Type::Record(_) => {
//                 Kind::Type
//             }
//         }
//     }
// }

// Constructors for Type expression
pub enum Type {
    Var(TypeVarId),             // 型変数
    Fun(Box<Type>, Box<Type>),  // 関数型
    Con(Symbol),                // 型構築子

    App(Box<TypeExpr>, Box<TypeExpr>),  // 型適用

    Tuple(Vec<Type>),
    // Record(Vec<(String, Type)>), // => Type::App(Type::Con("@{}"), TypeExpr::Row(row))
}

pub struct TypeVarId(usize);  // `a`, `?100`

pub enum TypeConstraint {
    TypeEq(Type, Type),
    // -- for MPTC (`trait`/`impl`) and Overloaded Function Template (`*let`) --
    TraitBound(TraitHead),
    Overloaded(Symbol, Type, Vec<SchemeTemplate<Type>>),
}

// Constructors for Row expression
pub enum Row {
    Empty,                              // `@{}`
    Var(RowVarId),                      // `@{ | row }`
    Ext(Box<(Label, Type)>, Box<Row>),  // `@{ x:τ | row }`
}

pub struct RowVarId(usize);  // `@a`, `@?100`
pub type Label = String;

// T.B.D.
pub enum RowConstraint {
    RowEq(Row, Row),
    Has(Row, Label),
    Lacks(Row, Label),
}

// Constructors for Nat expression
// pub enum Nat {
//     Lit(usize),                         // `0`, `1`, ...
//     Var(NatVarId),                      // `n`
//     Add(Box<Nat>, Box<Nat>),            // `n + m`
// }

// pub struct NatVarId(usize);  // `#a`, `#?100`

// T.B.D.
// pub enum NatConstraint {
//     NatEq(Nat, Nat),
//     NatLT(Nat, Nat),
//     // NatEqu(...),   // natural number equation (e.g. `m + n = 0`)
// }

}
#![allow(unused)]
fn main() {
pub struct Scheme<T> {
    pub vars: Vec<Var>,             // quantified variables              (ex. `∀ a.`)
    pub constraints: ConstraintSet, // constraints / trait bounds        (ex. `(Eq a, Ord a) =>`)
    pub target: T,                  // the type, or                      (ex. `a -> a -> Bool`)
                                    // the trait head produced by `impl` (ex. `Eq (List a)`)
}

pub enum Var {
    Ty(TypeVarId),
    Row(RowVarId),
    // Nat(NatVarId),
}

pub enum Constraint {
    Ty(TypeConstraint),
    Row(RowConstraint),
    // Nat(NatConstraint),
}

pub struct ConstraintSet {
    pub primary:  Option<Box<TraitHead>>,
    pub requires: BTreeSet<Constraint>,
}

pub struct UnifiedSolver {
    // ...
}
}

🧭 Roadmap

✔️ Mission 1

  • Hindley-Milner type inference
  • Safe subset of System Fω
  • Algebraic Data Type (ADT)

✔️ Mission 2

  • Traits a.k.a., Type Class
  • Trait Record - Type Class as first-class record value

✔️ Mission 3

  • Multi-Parameter Type Class (MPTC)
  • Higher-order Trait Record - MPTC as first-class record value
  • Overloaded Function Template
  • Type and requires inference w/o type annotation

Extra Mission 1

  • Split/Merge record operators
  • Row Kind / Row polymorphism
  • Row Kind inference w/o Kind annotation

Extra Mission 2

  • Safe subset of Nat Kind
  • Nat Kind inference w/o Kind annotation
  • Array, Linear Type, Affine Type, etc.
  • String, I/O

It’s a long, long journey…

🏁 Goal

  • Hello World!
> println "Hello, world! (Finally, we meet!)";
Hello, world! (Finally, we meet!)
=> (): ()