🦊 The Phox Programming Language
Phox is a small functional programming language.
It aims to be a simple yet expressive tool - your clever companion for exploring type theory and practical programming.
✨ Features
- Hindley–Milner type inference
No need to annotate types in most cases. (In fact, Phox has no type-annotation syntax) - Algebraic data types (ADT)
Define expressive data structures with variants. - Pattern matching
Concise and powerful destructuring. - Newtype shorthand
Cleaner syntax for single-constructor wrapper types. - First-class functions
Functions are values, operators are functions too. - Generic function temlates
Non-first class / overloadable generic function templates. - Multi-parameter typeclasses (
trait/impl)
Define the relationship between multiple methods and multiple types. - Trait record
Typeclasses (trait/impl) can be instansiated as a first-class record value, with a simple syntax. - Module system
Rust like module / namespace definition. - Simple syntax
Inspired by ML-family languages, with a focus on clarity.
Install Phox
⚠️ Work in progress — Phox is under active development.
Build
Clone the GitHub repository and build with Rust (tested on 1.80+, may work on earlier versions):
git clone https://github.com/mori0091/phox.git
cd phox
cargo build
Run in REPL
If you run without arguments, Phox starts an interactive REPL:
cargo run
> let rec fact = λn. if (0 == n) 1 else n * fact (n - 1);
> fact 5
=> 120: Int
In REPL, an input starts with : is recognized as a REPL command.
For example, :? shows the list of available commands:
cargo run
> :?
:quit, :q
exit REPL.
:help, :h, or :?
print this help messages.
:load <path>, :l <path>
load and evaluate Phox source file specified by <path>.
:modules, :m
print list of root modules.
:using, :u
print list of using aliases for each modules.
:symbols, :s
print list of symbols for each modules.
:impls, or :impls [options]
print list of `impl`s.
options:
-v, or --verbose
print also the `impl`s' implementation.
Run a program file
Pass a .phx file to execute it:
.phxis the conventional extension for Phox source files (plain text)..txtfiles are also accepted.
cargo run examples/fact.phx
=> 120: Int
Run from stdin
You can also pipe code from stdin (use - explicitly):
echo "1 + 2" | cargo run -
=> 3: Int
Resulting value and inferred type are printed on success, like this:
=> value: type
On error, an error message is printed, like this:
parse error: UnrecognizedToken { /* snip... */ }
First Program
> type HelloWorld = Hi;
=> (): ()
> Hi
=> Hi: HelloWorld
Note: Phox currently has no string type or I/O.
That means you can’t even write “Hello, World!” yet —
but you can prove thatfact 5 = 120with full type safety 😉
Pipeline Example
> counter 1 |> take 5 |> fold (+) 0
=> 15: Int
> counter 1 |> take 10 |> fold (+) 0
=> 55: Int
> counter 1 |> take 5 |> fold (*) 1
=> 120: Int
> counter 1 |> map |(* 2) |> take 5 |> fold (flip Cons) Nil
=> Cons 10 (Cons 8 (Cons 6 (Cons 4 (Cons 2 Nil)))): List Int
> counter 1 |> filter (\x. x % 2 == 0) |> take 5 |> collect Nil
=> Cons 2 (Cons 4 (Cons 6 (Cons 8 (Cons 10 Nil)))): List Int
Comments
Line comments
In regular expression:
//[^\n\r]*
- starts with
//, followed by any characters (single-line text), and ends with newline.
// This is a line-comment.
////// This is also a line-comment.
Similar to C/C++ line-comments.
Block comments
In regular expression:
/\*[^*]*\*+(?:[^/*][^*]*\*+)*/
- starts with
/*, followed by any charcters (single or multi-line texts), and ends with*/. - Nesting is not allowed.
/* This is a block comment. */
/*
* This is also a block comment.
*/
/*****
*** This is also a block comment.
**/
Similar to C/C++ block-comments.
Semicolon (item separator)
;separates multiple items (declarations / statements / expressions) in a block or at the top level.- Each item is evaluated in order; only the last expression’s value is returned.
- If a block or top-level input ends with
;, an implicit()is added.
// Multiple items in a block
{
let x = 1; // => (): () (discarded)
let y = 2; // => (): () (discarded)
x + y; // => 3: Int (discarded)
2 * x + y // => 4: Int (result)
}
// => 4: Int
// Multiple items in the top level
let x = 1; // => (): () (discarded)
let y = 2; // => (): () (discarded)
x + y; // => 3: Int (discarded)
2 * x + y // => 4: Int (result)
// => 4: Int
// Items in a block ends with `;`
{
1 + 2; // => 3: Int (discarded)
}
// => (): ()
// Items in the top level ends with `;`
1 + 2; // => 3: Int (discarded)
// => (): ()
// No items in a block
{}
// => (): ()
Primitive Types
| type | values | description |
|---|---|---|
() | () | Unit type, that consists of single value (). |
Bool | true, fales | Boolean type. True or false. |
Int | 0, 100, -1, … | Integer type. (currently 64-bit integer) |
Type definitions
type Option a = Some a | None;
type Pair a b = Pair a b;
type Result e a = Ok a | Err e;
- Variants can take 0 or more arguments.
- Newtype shorthand is available when:
- There is only one variant, and
- The type name and constructor name are the same, and
- The variant has exactly one tuple or one record argument.
// Normal form
type Point a = Point @{ x: a, y: a };
type Wrapper a = Wrapper (a,);
// Newtype shorthand
type Point a = @{ x: a, y: a };
type Wrapper a = (a,);
Language constructs
Identifiers
Variables and type variables
In regular expression:
[a-z_]([-]*[a-zA-Z0-9_]+)*['?]*
seq,x',some?,foo-barfooBar,a0,take_while
Constructor, type constructor, and trait names
In regular expression:
[A-Z][a-zA-Z0-9_]*
List,Nil,Eq,Iter
Module names
In regular expression:
[a-z_]([-]*[a-zA-Z0-9_]+)*
core,cmp,iter
Module path (qualified module names)
A sequence of module names separated by
::
::core::cmp,::core::listAbsolute pathcmp,list,foo::barRelative path
Qualified names
A module path followed by a variable, constructor, type constructor, or trait name
::core::iter::seq,::core::iter::Seq,::core::iter::Iteriter::seq,iter::Seq,iter::Iter
Literals
()Unit value literaltrue,falseBoolean value literals0,100,-1Integer value literals
Primitive types
()Unit typeBoolBoolean typeIntInteger type(t,),(t1, t2)Tuple type@{x:t1, y:t2}Record typet1 -> t2Function typeT t,t1 t2Type-level function application
Operators
Infix (binary) operators
==,!=Equality / Non-equality operators<=,<,>,>=Comparison operators&&,||Logical operators+,-,*,/,%Arithmetic operators|>,<|Pipeline operators>>,<<Function composition operators
Prefix operators
nagete,-Arithmetic negationnot,!Boolean not
User defined infix operators
In regular expression:
[*+\-/!$%&=^?<>]+
?>,===,++
Operators as functions
(==) 1 1(+) 1 2
Functions and constructors as operators
1 `Cons` Nilx `@{Eq Int}.(==)` y
Declarations
At the top level of a module:
modDefine sub-moduleuseImport identifierstypeDefine Algebraic Data Type (ADT)traitDefine Multi-Parameter Type Class (MPTC)implDefine MPTC instance implementation*letDefine generic/overloaded function templateletDefine a variable or functionlet recDefine a recursive function
Statements
Within a local scope:
letVariable / function bindinglet recRecursive function binding
Note
In the case of
let rec p = e;,
pmust be a variable pattern.In the case of
let p = e;,
- At the top level of a module,
pmust be a variable pattern.- Within a local scope, any pattern can be used for
p.
Expressions
Value expressions
(evaluate to themselves; safe to generalize)
(),true,0Literalsλp.e/\p.eLambda abstraction (function)(e1,),(e1, e2)Tuple value@{x = e1, y = e2}Record value@{T t1 t2}Trait record value|(e op)/|(e op _)Infix-operator partial application (bind the 1st argument)|(op e)/|(_ op e)Infix-operator partial application (bind the 2nd argument)
Infix-operator partial applications (a.k.a. section syntax) are desugared into lambda abstraction.
Note
Old section syntax
#(e op)/#(e op _)and#(op e)/#(_ op e)are deprecated.
Use new syntax|(e op)/|(e op _)and|(op e)/|(_ op e)instead.
Important
Old record value syntax
@{x: e1, y: e2}are deprecated and no longer supported.
Use new syntax@{x = e1, y = e2}instead.
Non-value expressions
(require evaluation; not eligible for generalization)
xEvaluate variablee1 e2Function applicatione.0Tuple index accesse.xRecord field access{ stmt1; stmt2; expr }Block expression (evaluates to the last expression; introduces a new scope)if (e1) e2 else e3If then elsematch (e) { p1 => e1, .. }Pattern matching
Patterns
_Wildcard pattern(),true,0Literal patternx,fooVariable patternNil,Cons p psConstructor pattern(p1,),(p1, p2)Tuple pattern@{x = p1, y = p2},@{x, y}Record pattern
Important
Old record pattern syntax
@{x: p1, y: p2}are deprecated and no longer supported.
Use new syntax@{x = p1, y = p2}instead.
Design Notes
Phox Type System (Future)
| (kind) | variable | constraint solver | constraint | var-gen / union-find | constraint set |
|---|---|---|---|---|---|
$a | |||||
| Type | TypeVarId a | UnifiedSolver (Type + MPTC) | TypeConstraint | TypeContext | UnifiedConstraintSet (Type + MPTC) |
| Row | RowVarId @a | UnifiedSolver (Row) | RowConstraint | RowContext | UnifiedConstraintSet (Row) |
| Nat | NatVarId #a | UnifiedSolver (Nat) | NatConstraint | NatContext | UnifiedConstraintSet (Nat) |
Note
TODO: Change section-syntax from
#(op x)to|(op x).
- Add new syntax
|(op x)and mark old syntax as deprecated.- After
Natsolver is implemented, remove old syntax#(op x).
Note
$a,@a, and#aare primarily used in pretty-printed inferred types.- KindVarId
$ais internal use only for kind inference.- In user code,
- type-level variables in expressions should generally be written as plain identifiers like
a.- type-level variables in
type/traitdeclarations should be written asaand#afor Type and Nat variables respectively.- As a future extension,
?a,#a, and@amay be accepted as syntactic sugar in expressions for explicit kind annotations(a : Type),(a : Nat), and(a : Row)respectively.
Note
Maybe
Natkind will not be implemented…
it’s not so useful.
Rough sketch of Phox Type System process-pipeline
- parse: source code “a” → RawAST such as RawTyVar(“a”)
- resolve phase: RawTyVar(“a”) → RowVarId(r), NatVarId(n), or TypeVarId(t)
- if
aappeared in Row context (@{ .. | a }),ashall be RowVarId(r) - if
aappeared in type parameters (@{T a}),ashall be NatVarId(n) or TypeVarId(t)- check
type/traitdeclarations and its parameters’ kind, - then assign NatVarId(n) or TypeVarId(t) according to the kind.
- check
- infer phase: infer and solve with UnifiedSolver
- apply phase: bake specialized code
Note
- row-var appears only in record/row contexts such as
@{...}, so no ambiguity arises.- Since both type-var and nat-var can appear as type parameters only in trait record
@{T a b}, they must resolve to either NatVarId or TypeVarId after kind-check and kind-assignment in resolve phase.- Kind inference and KindSolver are not needed for Phox.
Rough sketh of Phox Type System data sturctures
#![allow(unused)]
fn main() {
// Constructors for Kind expression
pub enum Kind {
Fun(Box<Kind>, Box<Kind>), // κ1 -> κ2
Type, // τ
Row, // ρ
// Nat, // ν
}
pub enum TypeExpr {
Ty(Type),
Row(Row),
// Nat(Nat),
}
// pub fn kind_of(texpr: &TypeExpr) -> Kind {
// match texpr {
// TypeExpr::Ty(ty) => match ty {
// Type::Var(_) => {
// todo!()
// }
// Type::Con(_name) => {
// todo!()
// }
// Type::App(ref t1, ref t2) => {
// Kind::Fun(Box::new(kind_of(t1)), Box::new(kind_of(t2)))
// }
// Type::Fun(_, _) | Type::Tuple(_) | Type::Record(_) => {
// Kind::Type
// }
// }
// }
// }
// Constructors for Type expression
pub enum Type {
Var(TypeVarId), // 型変数
Fun(Box<Type>, Box<Type>), // 関数型
Con(Symbol), // 型構築子
App(Box<TypeExpr>, Box<TypeExpr>), // 型適用
Tuple(Vec<Type>),
// Record(Vec<(String, Type)>), // => Type::App(Type::Con("@{}"), TypeExpr::Row(row))
}
pub struct TypeVarId(usize); // `a`, `?100`
pub enum TypeConstraint {
TypeEq(Type, Type),
// -- for MPTC (`trait`/`impl`) and Overloaded Function Template (`*let`) --
TraitBound(TraitHead),
Overloaded(Symbol, Type, Vec<SchemeTemplate<Type>>),
}
// Constructors for Row expression
pub enum Row {
Empty, // `@{}`
Var(RowVarId), // `@{ | row }`
Ext(Box<(Label, Type)>, Box<Row>), // `@{ x:τ | row }`
}
pub struct RowVarId(usize); // `@a`, `@?100`
pub type Label = String;
// T.B.D.
pub enum RowConstraint {
RowEq(Row, Row),
Has(Row, Label),
Lacks(Row, Label),
}
// Constructors for Nat expression
// pub enum Nat {
// Lit(usize), // `0`, `1`, ...
// Var(NatVarId), // `n`
// Add(Box<Nat>, Box<Nat>), // `n + m`
// }
// pub struct NatVarId(usize); // `#a`, `#?100`
// T.B.D.
// pub enum NatConstraint {
// NatEq(Nat, Nat),
// NatLT(Nat, Nat),
// // NatEqu(...), // natural number equation (e.g. `m + n = 0`)
// }
}
#![allow(unused)]
fn main() {
pub struct Scheme<T> {
pub vars: Vec<Var>, // quantified variables (ex. `∀ a.`)
pub constraints: ConstraintSet, // constraints / trait bounds (ex. `(Eq a, Ord a) =>`)
pub target: T, // the type, or (ex. `a -> a -> Bool`)
// the trait head produced by `impl` (ex. `Eq (List a)`)
}
pub enum Var {
Ty(TypeVarId),
Row(RowVarId),
// Nat(NatVarId),
}
pub enum Constraint {
Ty(TypeConstraint),
Row(RowConstraint),
// Nat(NatConstraint),
}
pub struct ConstraintSet {
pub primary: Option<Box<TraitHead>>,
pub requires: BTreeSet<Constraint>,
}
pub struct UnifiedSolver {
// ...
}
}
🧭 Roadmap
✔️ Mission 1
- Hindley-Milner type inference
- Safe subset of System Fω
- Algebraic Data Type (ADT)
✔️ Mission 2
- Traits a.k.a., Type Class
- Trait Record - Type Class as first-class record value
✔️ Mission 3
- Multi-Parameter Type Class (MPTC)
- Higher-order Trait Record - MPTC as first-class record value
- Overloaded Function Template
- Type and requires inference w/o type annotation
Extra Mission 1
- Split/Merge record operators
- Row Kind / Row polymorphism
- Row Kind inference w/o Kind annotation
Extra Mission 2
- Safe subset of Nat Kind
- Nat Kind inference w/o Kind annotation
- Array, Linear Type, Affine Type, etc.
- String, I/O
It’s a long, long journey…
🏁 Goal
- Hello World!
> println "Hello, world! (Finally, we meet!)";
Hello, world! (Finally, we meet!)
=> (): ()