FFI & Native Extensions

Foreign Function Interface (FFI) allows Nostos to call native code written in Rust. This enables you to leverage existing Rust libraries, achieve maximum performance for compute-intensive operations, or access system-level APIs not available in pure Nostos.

Example Extensions

  • nalgebra - Linear algebra with dynamic vectors and matrices (compute-bound example)
  • nostos-redis - Redis client using Tokio async I/O

How FFI Works in Nostos

A Nostos extension consists of two parts that work together:

  1. Rust library - A dynamic library (.so/.dylib/.dll) that exports native functions
  2. Nostos wrapper - A .nos file that provides a type-safe interface using __native__() calls

The Rust side handles low-level operations and memory management, while the Nostos wrapper provides a clean, idiomatic API with types, traits, and operator overloading.

Extension Project Structure

An extension project has this layout:

my-extension/
├── Cargo.toml          # Rust project configuration
├── src/
│   └── lib.rs          # Rust implementation
└── my_extension.nos    # Nostos wrapper file

Cargo.toml Configuration

The Cargo.toml must specify cdylib crate type to produce a dynamic library, and depend on nostos-extension for the FFI interface:

[package]
name = "nostos-nalgebra"
version = "0.1.0"
edition = "2021"

[lib]
crate-type = ["cdylib"]  # Creates a dynamic library

[dependencies]
nostos-extension = { git = "https://github.com/pegesund/nostos", branch = "master" }
nalgebra = "0.33"  # Your native dependency

Writing the Rust Implementation

The Rust implementation involves three key components: declaring the extension, registering functions, and handling garbage collection. Let's examine each in detail.

1. Declaring the Extension

Every extension must use the declare_extension! macro to export its entry point:

use nostos_extension::{declare_extension, ExtContext, ExtRegistry, Value};

// Declare the extension with name, version, and registration function
declare_extension!("nalgebra", "0.1.0", register);

fn register(reg: &mut ExtRegistry) {
    // Register your native functions here (covered below)
}

The macro creates the necessary FFI exports that Nostos looks for when loading the extension. The first argument is the extension name (used in import statements), the second is the version, and the third is your registration function.
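The exact exports are internal to nostos-extension, but conceptually the macro generates C-ABI entry points along these lines (the symbol names below are illustrative, not the real ones):

```rust
// Illustrative only: the real exports generated by declare_extension! are
// internal to nostos-extension. This shows the general cdylib entry-point
// pattern such a macro typically produces.
use std::ffi::{c_char, CStr};

#[no_mangle]
pub extern "C" fn nostos_extension_name() -> *const c_char {
    // NUL-terminated so the host can read it as a C string after dlsym
    b"nalgebra\0".as_ptr() as *const c_char
}

#[no_mangle]
pub extern "C" fn nostos_extension_version() -> *const c_char {
    b"0.1.0\0".as_ptr() as *const c_char
}

fn main() {
    // The host would reach these via dlopen/dlsym; here we call directly.
    let name = unsafe { CStr::from_ptr(nostos_extension_name()) };
    println!("{}", name.to_str().unwrap());
}
```

The #[no_mangle] attribute keeps the symbol names stable so the loader can find them in the compiled .so/.dylib/.dll.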

2. Registering Native Functions

The registration function adds your native functions to the extension registry. Each function has a unique name that Nostos uses to locate it:

fn register(reg: &mut ExtRegistry) {
    // Vector operations
    reg.add("Nalgebra.dvec", dvec_new);           // Create vector from list
    reg.add("Nalgebra.dvecAdd", dvec_add);        // Add two vectors
    reg.add("Nalgebra.dvecScale", dvec_scale);    // Multiply vector by scalar
    reg.add("Nalgebra.dvecDot", dvec_dot);        // Dot product
    reg.add("Nalgebra.dvecToList", dvec_to_list); // Convert back to list

    // Matrix operations
    reg.add("Nalgebra.dmat", dmat_new);           // Create matrix
    reg.add("Nalgebra.dmatMul", dmat_mul);        // Matrix multiplication
    // ... more functions
}

The naming convention "ModuleName.functionName" helps organize functions logically. These exact names are used in the Nostos wrapper with __native__().
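Under the hood, a registry like this is essentially a map from names to function pointers. A minimal sketch, with plain Rust stand-ins for ExtRegistry and Value (both are simplified assumptions, not the real nostos-extension types):

```rust
use std::collections::HashMap;

// Simplified stand-ins for the real nostos-extension types.
#[derive(Debug, Clone, PartialEq)]
enum Value { Float(f64) }

type NativeFn = fn(&[Value]) -> Result<Value, String>;

struct Registry { fns: HashMap<String, NativeFn> }

impl Registry {
    fn new() -> Self { Registry { fns: HashMap::new() } }

    fn add(&mut self, name: &str, f: NativeFn) {
        self.fns.insert(name.to_string(), f);
    }

    // What __native__("Name", args...) boils down to: look up by name, call.
    fn call(&self, name: &str, args: &[Value]) -> Result<Value, String> {
        let f = self.fns.get(name)
            .ok_or(format!("unknown native fn: {}", name))?;
        f(args)
    }
}

fn dvec_dot(args: &[Value]) -> Result<Value, String> {
    match (&args[0], &args[1]) {
        (Value::Float(a), Value::Float(b)) => Ok(Value::Float(a * b)),
    }
}

fn main() {
    let mut reg = Registry::new();
    reg.add("Nalgebra.dvecDot", dvec_dot);
    let r = reg.call("Nalgebra.dvecDot", &[Value::Float(3.0), Value::Float(4.0)]);
    assert_eq!(r, Ok(Value::Float(12.0)));
    println!("ok");
}
```

A misspelled name in __native__() therefore fails at call time, not at load time, which is why matching the registered strings exactly matters.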

3. Native Function Signature

Every native function must match this signature:

fn my_function(args: &[Value], _ctx: &ExtContext) -> Result<Value, String>

Let's break this down:

  • args: &[Value] - A slice of values passed from Nostos. Extract them by index.
  • _ctx: &ExtContext - Context for accessing runtime features (rarely needed; the leading underscore marks it as intentionally unused).
  • Result<Value, String> - Return Ok(value) on success, or Err(message) on failure.

Here's a complete example that creates a vector from a Nostos list:

use nalgebra::DVector;

fn dvec_new(args: &[Value], _ctx: &ExtContext) -> Result<Value, String> {
    // Extract the first argument as a list
    let list = args[0].as_list()
        .ok_or("dvec: expected List argument")?;

    // Convert List[Float] to Vec<f64>
    let data: Result<Vec<f64>, _> = list.iter()
        .map(|v| v.as_f64().ok_or("dvec: expected Float elements"))
        .collect();
    let data = data?;

    // Create the nalgebra vector
    let vector = DVector::from_vec(data);

    // Wrap it in a GC handle and return
    Ok(dvec_handle(vector))
}

Garbage Collection Integration

When your extension allocates native memory (like a DVector or DMatrix), Nostos needs to know how to free that memory when it's no longer referenced. This is done through GC handles and cleanup callbacks.

Understanding the GC Handle System

The GC handle system has three components:

  • pointer (usize) - Raw pointer to your native data, stored as an integer
  • type_id (u64) - Identifier to distinguish between different native types
  • cleanup function - Called by GC when the handle is no longer reachable
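Conceptually, these three components form a record like the following (an illustration of the idea, not the actual nostos-extension layout):

```rust
/// Conceptual model of a GC handle (not the real internal layout).
struct GcHandle {
    ptr: usize,              // raw pointer to native data, as an integer
    type_id: u64,            // which native type lives behind ptr
    cleanup: fn(usize, u64), // called by the GC when unreachable
}

const TYPE_VEC_F64: u64 = 1;

fn cleanup(ptr: usize, type_id: u64) {
    if type_id == TYPE_VEC_F64 {
        // Reconstruct the Box and drop it, freeing the allocation.
        unsafe { drop(Box::from_raw(ptr as *mut Vec<f64>)); }
    }
}

fn main() {
    // Creation: box the data and leak the pointer into the handle.
    let data = Box::new(vec![1.0, 2.0, 3.0]);
    let handle = GcHandle {
        ptr: Box::into_raw(data) as usize,
        type_id: TYPE_VEC_F64,
        cleanup,
    };

    // Usage: cast the pointer back to read the data.
    let v = unsafe { &*(handle.ptr as *const Vec<f64>) };
    assert_eq!(v.len(), 3);

    // Collection: the GC would eventually invoke the cleanup callback.
    (handle.cleanup)(handle.ptr, handle.type_id);
    println!("freed");
}
```

Here Vec<f64> stands in for a native type like DVector; the pattern is identical.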

Defining Type IDs

First, define constants for each native type your extension manages:

// Unique identifiers for each native type
const TYPE_DVECTOR: u64 = 1;
const TYPE_DMATRIX: u64 = 2;

These IDs let the cleanup function know what type of data the pointer refers to, so it can cast and free it correctly.

The Cleanup Callback

The cleanup function is called by the Nostos garbage collector when a handle is no longer reachable. It receives the raw pointer and type ID, and must properly deallocate the memory:

/// Called by GC when a native handle is garbage collected.
///
/// # Safety
/// - `ptr` must be a valid pointer created by Box::into_raw
/// - `type_id` must match the type that was stored at that pointer
fn nalgebra_cleanup(ptr: usize, type_id: u64) {
    match type_id {
        TYPE_DVECTOR => {
            // Reconstruct the Box and let it drop, freeing the memory
            unsafe {
                let _ = Box::from_raw(ptr as *mut DVector<f64>);
            }
        }
        TYPE_DMATRIX => {
            unsafe {
                let _ = Box::from_raw(ptr as *mut DMatrix<f64>);
            }
        }
        _ => {
            eprintln!("nalgebra_cleanup: unknown type_id {}", type_id);
        }
    }
}

Important: Memory Safety

The cleanup callback involves unsafe Rust code. The Box::from_raw call reconstructs the Box from the raw pointer, and when this Box goes out of scope, Rust automatically frees the memory. Make sure your type IDs are correct - casting to the wrong type causes undefined behavior!
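The ownership hand-off can be checked in isolation with a type that counts its drops. This standalone sketch (Tracked is a made-up type, unrelated to nostos) shows that the into_raw/from_raw round trip frees the value exactly once:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

static DROPS: AtomicUsize = AtomicUsize::new(0);

struct Tracked(#[allow(dead_code)] Vec<f64>);

impl Drop for Tracked {
    fn drop(&mut self) {
        DROPS.fetch_add(1, Ordering::SeqCst);
    }
}

fn main() {
    // into_raw transfers ownership out of Rust's control: no drop yet.
    let ptr = Box::into_raw(Box::new(Tracked(vec![1.0, 2.0])));
    assert_eq!(DROPS.load(Ordering::SeqCst), 0);

    // from_raw takes ownership back; dropping the Box frees the memory once.
    unsafe { drop(Box::from_raw(ptr)); }
    assert_eq!(DROPS.load(Ordering::SeqCst), 1);

    // Calling Box::from_raw on the same pointer a second time would be a
    // double free, undefined behavior - exactly the class of bug a wrong
    // type_id (or a cleanup callback run twice) can cause.
    println!("dropped exactly once");
}
```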

Creating GC Handles

Use helper functions to wrap native data in GC-managed handles. The Value::gc_handle function takes three arguments:

/// Wrap a DVector in a GC-managed handle
fn dvec_handle(v: DVector<f64>) -> Value {
    Value::gc_handle(
        Box::new(v),           // 1. Box the data (moves ownership to heap)
        TYPE_DVECTOR,          // 2. Type identifier for cleanup dispatch
        nalgebra_cleanup       // 3. Cleanup function to call on GC
    )
}

/// Wrap a DMatrix in a GC-managed handle
fn dmat_handle(m: DMatrix<f64>) -> Value {
    Value::gc_handle(
        Box::new(m),
        TYPE_DMATRIX,
        nalgebra_cleanup
    )
}

Inside Value::gc_handle, the Box is converted to a raw pointer via Box::into_raw, which transfers ownership to Nostos. When the GC determines the value is unreachable, it calls your cleanup function with that pointer.

Extracting Data from GC Handles

When a native function receives a GC handle as input, use as_gc_handle() to get the raw pointer, then cast it:

/// Extract a DVector reference from a GC handle value
fn get_dvec(v: &Value) -> Result<&DVector<f64>, String> {
    let ptr = v.as_gc_handle()
        .ok_or("expected GC handle for DVector")?;

    // Safety: We trust that handles with TYPE_DVECTOR contain DVector
    unsafe {
        Ok(&*(ptr as *const DVector<f64>))
    }
}

// Example usage in a native function:
fn dvec_add(args: &[Value], _ctx: &ExtContext) -> Result<Value, String> {
    let a = get_dvec(&args[0])?;
    let b = get_dvec(&args[1])?;

    let result = a + b;  // nalgebra's Add implementation
    Ok(dvec_handle(result))
}

Complete GC Lifecycle Example

Let's trace through the complete lifecycle of a native vector:

# In Nostos code:
v = vec([1.0, 2.0, 3.0])  # Creates native handle
result = v + v             # Uses handle, creates new handle
# v goes out of scope... GC eventually runs cleanup

Here's what happens at each step:

  1. Creation: vec([1.0, 2.0, 3.0]) calls Nalgebra.dvec
    • Rust creates DVector::from_vec(vec![1.0, 2.0, 3.0])
    • dvec_handle boxes it and returns Value::gc_handle(...)
    • Nostos stores the handle in variable v
  2. Usage: v + v calls Nalgebra.dvecAdd
    • get_dvec extracts references from both handles
    • nalgebra performs vector addition
    • Result is wrapped in a new GC handle
  3. Cleanup: When v becomes unreachable
    • GC marks the handle as garbage
    • Calls nalgebra_cleanup(ptr, TYPE_DVECTOR)
    • Box::from_raw reconstructs and drops the Box
    • Memory is freed
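The three steps above can be reproduced in miniature with plain Rust, using Vec<f64> as a stand-in for nalgebra's DVector (make_handle, get, and cleanup are illustrative helpers, not the real API):

```rust
// End-to-end mock of the lifecycle above. Vec<f64> stands in for DVector;
// the real extension goes through Value::gc_handle instead of raw usizes.

fn make_handle(v: Vec<f64>) -> usize {
    Box::into_raw(Box::new(v)) as usize // step 1: creation
}

// Only valid while the handle has not been cleaned up.
fn get(ptr: usize) -> &'static Vec<f64> {
    unsafe { &*(ptr as *const Vec<f64>) } // step 2: usage
}

fn cleanup(ptr: usize) {
    unsafe { drop(Box::from_raw(ptr as *mut Vec<f64>)); } // step 3: cleanup
}

fn main() {
    let v = make_handle(vec![1.0, 2.0, 3.0]);
    let (a, b) = (get(v), get(v));
    let sum: Vec<f64> = a.iter().zip(b).map(|(x, y)| x + y).collect();
    let result = make_handle(sum);
    assert_eq!(get(result), &vec![2.0, 4.0, 6.0]);
    cleanup(v);      // "v goes out of scope", GC collects it
    cleanup(result); // result is collected later
    println!("lifecycle complete");
}
```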

Writing the Nostos Wrapper

The Nostos wrapper file provides a type-safe, idiomatic interface to your native functions. It uses __native__() to call into Rust and defines types and traits for a clean API.

The __native__() Function

__native__() is a special built-in that calls registered native functions:

# Syntax: __native__("FunctionName", arg1, arg2, ...)

# Examples:
__native__("Nalgebra.dvec", [1.0, 2.0, 3.0])           # Create vector
__native__("Nalgebra.dvecAdd", handle1, handle2)       # Add vectors
__native__("Nalgebra.dvecScale", handle, 2.0)          # Scale vector

The first argument is the function name (matching what you registered), followed by any arguments to pass to the native function.

Defining Wrapper Types

Wrapper types hold the native handle and provide a typed interface:

# The wrapper type stores the native handle
# 'data' is typed as Any because it holds an opaque native pointer
pub type Vec = { data: Any }

# Constructor function - wraps native call
pub vec(data: List) -> Vec = Vec(__native__("Nalgebra.dvec", data))

# Operations that take and return wrapper types
pub vecDot(a: Vec, b: Vec) -> Float = __native__("Nalgebra.dvecDot", a.data, b.data)
pub vecNorm(v: Vec) -> Float = __native__("Nalgebra.dvecNorm", v.data)
pub vecScale(v: Vec, s: Float) -> Vec = Vec(__native__("Nalgebra.dvecScale", v.data, s))

Note how a.data extracts the native handle from the wrapper - this is what gets passed to Rust.

Implementing Operator Overloading

To use operators like +, -, *, / with your type, implement the Num trait:

# Define the Num trait (or use stdlib's)
trait Num
    add(self, other: Self) -> Self
    sub(self, other: Self) -> Self
    mul(self, other: Self) -> Self
    div(self, other: Self) -> Self
end

# Implement for Vec - enables v1 + v2, v1 - v2, etc.
Vec: Num
    add(self, other: Vec) -> Vec = Vec(__native__("Nalgebra.dvecAdd", self.data, other.data))
    sub(self, other: Vec) -> Vec = Vec(__native__("Nalgebra.dvecSub", self.data, other.data))
    mul(self, other: Vec) -> Vec = Vec(__native__("Nalgebra.dvecMap", self.data, other.data))
    div(self, other: Vec) -> Vec = Vec(__native__("Nalgebra.dvecDiv", self.data, other.data))
end

Now you can write natural mathematical expressions:

v1 = vec([1.0, 2.0, 3.0])
v2 = vec([4.0, 5.0, 6.0])

sum = v1 + v2        # Calls Vec.add -> Nalgebra.dvecAdd
diff = v1 - v2       # Calls Vec.sub -> Nalgebra.dvecSub
scaled = v1 * v2     # Component-wise multiplication

Scalar Operations

For mixed-type operations like vector * 2.0, define functions with a special naming convention:

# Naming convention: {TypeName}{Operation}Scalar
# The compiler automatically finds these for mixed-type operations

pub vecAddScalar(v: Vec, s: Float) -> Vec = Vec(__native__("Nalgebra.dvecAddScalar", v.data, s))
pub vecSubScalar(v: Vec, s: Float) -> Vec = Vec(__native__("Nalgebra.dvecSubScalar", v.data, s))
pub vecMulScalar(v: Vec, s: Float) -> Vec = Vec(__native__("Nalgebra.dvecScale", v.data, s))
pub vecDivScalar(v: Vec, s: Float) -> Vec = Vec(__native__("Nalgebra.dvecDivScalar", v.data, s))

Now scalar operations just work:

v = vec([1.0, 2.0, 3.0])
doubled = v * 2.0        # Calls vecMulScalar
shifted = v + 10.0       # Calls vecAddScalar

Pretty Printing with Show

Implement the Show trait so your type displays nicely in the REPL and with println:

trait Show
    show(self) -> String
end

Vec: Show
    show(self) -> String = {
        # Convert native handle back to list for display
        list = __native__("Nalgebra.dvecToList", self.data)
        len = length(list)
        maxShow = 10
        showList = if len <= maxShow then list else take(list, maxShow)
        strs = showList.map(x => show(x))
        joined = if isEmpty(strs) then ""
                 else tail(strs).fold(head(strs), (acc, s) => acc ++ ", " ++ s)
        suffix = if len <= maxShow then "" else ", ... (" ++ show(len) ++ " total)"
        "Vec[" ++ joined ++ suffix ++ "]"
    }
end

Now in the REPL:

>>> v = vec([1.0, 2.0, 3.0, 4.0, 5.0])
Vec[1.0, 2.0, 3.0, 4.0, 5.0]

>>> v * 2.0
Vec[2.0, 4.0, 6.0, 8.0, 10.0]

Using Extensions in Your Code

Once your extension is built and installed, import it like any other module:

import nalgebra
use nalgebra.*

main() = {
    # Create vectors
    v1 = vec([1.0, 2.0, 3.0])
    v2 = vec([4.0, 5.0, 6.0])

    # Use operators
    println(show(v1 + v2))         # Vec[5.0, 7.0, 9.0]
    println(show(v1 * 2.0))        # Vec[2.0, 4.0, 6.0]

    # Call functions
    println(show(vecDot(v1, v2)))  # 32.0
    println(show(vecNorm(v1)))     # 3.7416...

    # Create matrices
    m = mat([[1.0, 2.0], [3.0, 4.0]])
    println(show(m * m))           # Matrix multiplication

    0
}

Installing Extensions

Extensions are installed to ~/.nostos/extensions/. Each extension has its own directory:

~/.nostos/extensions/
└── nostos-nalgebra/
    ├── libnostos_nalgebra.so    # Compiled dynamic library
    └── nalgebra.nos             # Nostos wrapper file

Build and install your extension:

# Build the dynamic library
cd my-extension
cargo build --release

# Create extension directory
mkdir -p ~/.nostos/extensions/my-extension

# Copy files
cp target/release/libmy_extension.so ~/.nostos/extensions/my-extension/
cp my_extension.nos ~/.nostos/extensions/my-extension/

Async Runtime & Blocking Operations

Nostos runs on Tokio, a high-performance async runtime for Rust. Understanding how your extension interacts with Tokio is critical for writing performant, non-blocking code.

Critical Warning: Never Block the Runtime

Tokio uses a small number of worker threads (typically equal to CPU cores) to run thousands of concurrent tasks. If your native function blocks one of these threads, it cannot process other tasks, causing:

  • HTTP server stops responding to requests
  • Database queries time out
  • The entire application appears frozen

Async Models for Extensions

There are three main approaches for native extensions, depending on what your code does:

  • Compute-bound (sync) - Fast CPU operations that complete in microseconds (e.g. vector math, parsing, hashing)
  • spawn_blocking - Long CPU operations or blocking APIs (e.g. image processing, compression, legacy libraries)
  • Async I/O - Network, file I/O, or waiting for events (e.g. HTTP clients, database drivers, WebSockets)

1. Compute-Bound (Synchronous) - The nalgebra Approach

The nalgebra extension uses this model because linear algebra operations are:

  • Fast - Vector addition, dot products, and small matrix operations complete in nanoseconds to microseconds
  • CPU-bound - They don't wait for I/O, they just compute
  • Predictable - Execution time scales linearly with data size

// Good: Fast, non-blocking operation
fn dvec_add(args: &[Value], _ctx: &ExtContext) -> Result<Value, String> {
    let a = get_dvec(&args[0])?;
    let b = get_dvec(&args[1])?;
    let result = a + b;  // Completes in microseconds
    Ok(dvec_handle(result))
}

// Good: Even matrix multiplication is fine for reasonable sizes
fn dmat_mul(args: &[Value], _ctx: &ExtContext) -> Result<Value, String> {
    let a = get_dmat(&args[0])?;
    let b = get_dmat(&args[1])?;
    let result = a * b;  // O(n³) but still fast for n < 1000
    Ok(dmat_handle(result))
}

This model is appropriate when operations complete in under 1 millisecond. The brief time on the runtime thread is acceptable.
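If you are unsure whether an operation fits the synchronous model, measure it. A standalone sketch using std::time::Instant (the 10,000-element size and 100ms bound are arbitrary illustration values):

```rust
use std::time::Instant;

fn main() {
    let a: Vec<f64> = (0..10_000).map(|i| i as f64).collect();
    let b = a.clone();

    let start = Instant::now();
    let sum: f64 = a.iter().zip(&b).map(|(x, y)| x + y).sum();
    let elapsed = start.elapsed();

    // A 10k-element add is comfortably in "synchronous is fine" territory.
    println!("sum = {}, took {:?}", sum, elapsed);
    assert!(elapsed.as_millis() < 100); // generous bound; typically microseconds
}
```

Profiling with realistic data sizes tells you whether to stay synchronous or reach for spawn_blocking.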

2. spawn_blocking - For Long Operations

If your operation takes more than a few milliseconds, use Tokio's spawn_blocking to run it on a dedicated thread pool:

use tokio::task::spawn_blocking;

// Bad: Blocks the runtime for seconds
fn bad_image_resize(args: &[Value], _ctx: &ExtContext) -> Result<Value, String> {
    let image = get_image(&args[0])?;
    let resized = image.resize(4000, 4000);  // Takes 2 seconds!
    Ok(image_handle(resized))
}

// Good: Runs on blocking thread pool
async fn good_image_resize(args: &[Value], _ctx: &ExtContext) -> Result<Value, String> {
    let image = get_image(&args[0])?.clone();

    let resized = spawn_blocking(move || {
        image.resize(4000, 4000)  // Runs on separate thread
    }).await.map_err(|e| e.to_string())?;

    Ok(image_handle(resized))
}

Note: Currently, Nostos extension functions are synchronous. If you need async support for I/O-bound extensions, the extension API would need to be extended. For now, avoid I/O in extensions or use synchronous I/O sparingly with appropriate timeouts.

3. What NOT to Do

These patterns will cause problems:

// BAD: Blocks on network I/O
fn bad_http_get(args: &[Value], _ctx: &ExtContext) -> Result<Value, String> {
    let url = args[0].as_str().ok_or("expected string URL")?;
    let response = reqwest::blocking::get(url)
        .map_err(|e| e.to_string())?;  // Blocks runtime!
    Ok(Value::string(response.text().map_err(|e| e.to_string())?))
}

// BAD: Sleeps on runtime thread
fn bad_delay(args: &[Value], _ctx: &ExtContext) -> Result<Value, String> {
    let ms = args[0].as_i64().ok_or("expected integer millis")?;
    std::thread::sleep(std::time::Duration::from_millis(ms as u64));  // Disaster!
    Ok(Value::unit())
}

// BAD: Waits for file I/O
fn bad_read_large_file(args: &[Value], _ctx: &ExtContext) -> Result<Value, String> {
    let path = args[0].as_str().ok_or("expected string path")?;
    let content = std::fs::read_to_string(path)
        .map_err(|e| e.to_string())?;  // Blocks on disk!
    Ok(Value::string(content))
}

// BAD: CPU-intensive loop
fn bad_prime_search(args: &[Value], _ctx: &ExtContext) -> Result<Value, String> {
    let n = args[0].as_i64().ok_or("expected integer")?;
    // Finding large primes can take minutes
    let prime = find_nth_prime(n as usize);  // Blocks forever!
    Ok(Value::int(prime))
}

Guidelines for Choosing a Model

  • Simple math, small data (< 1ms) - Synchronous (like nalgebra)
  • Complex computation, large data (1ms - 100ms) - Consider spawn_blocking
  • Heavy processing (> 100ms) - Must use spawn_blocking
  • Any I/O (network, disk; unpredictable duration) - Use Tokio async patterns

Working with Tokio-Enabled Libraries

If you're wrapping a library that already uses Tokio (like database drivers, HTTP clients, or WebSocket libraries), follow the library's async patterns. The key principle is:

If your library speaks Tokio, your extension should speak Tokio too.

For example, if wrapping a Tokio-based Redis client:

// The library is already async/Tokio-aware
use redis::aio::Connection;
use redis::AsyncCommands;  // brings async get/set methods into scope

// Your extension function should use the library's async API
// (async extension support is being developed)
async fn redis_get(args: &[Value], ctx: &ExtContext) -> Result<Value, String> {
    let mut conn = get_connection(&args[0])?;
    let key = args[1].as_str().ok_or("expected string key")?;

    // Use the library's native async method - this won't block!
    let value: String = conn.get(key).await
        .map_err(|e| e.to_string())?;

    Ok(Value::string(value))
}

This approach works because:

  • The library's async operations yield back to the Tokio runtime
  • Other tasks can run while waiting for I/O
  • No thread blocking occurs
  • You get the full performance benefits of async I/O

Note: Full async extension support is under development. For now, long-running operations can be marked to run on the I/O thread pool. Check the latest documentation for async extension API updates.

Why nalgebra Works Well

The nalgebra extension is a good example of when synchronous operations are appropriate:

  • Predictable performance - Vector/matrix operations have O(n) to O(n³) complexity with small constants
  • No I/O - Pure computation, no waiting for external resources
  • Memory-local - Data is already in RAM, no disk or network access
  • User-controlled size - Developers know their data sizes and can avoid huge matrices

If you're wrapping a library with similar characteristics (parsing, encoding, cryptographic primitives, data structures), the synchronous model is likely appropriate.

Best Practices

  • Use helper functions for handle creation and extraction to avoid code duplication
  • Choose meaningful type IDs - document them clearly in your code
  • Validate inputs - return descriptive errors when arguments are invalid
  • Implement Show - makes debugging and REPL usage much better
  • Use the Num trait - enables natural operator syntax
  • Define scalar operations - follow the {TypeName}{Operation}Scalar naming convention (e.g. vecMulScalar)
  • Test thoroughly - native code bugs can crash the runtime
  • Never block the runtime - keep operations under 1ms, use spawn_blocking for CPU-heavy work, or use Tokio async for I/O
  • Follow library patterns - if wrapping a Tokio-enabled library, use its async API
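Input validation in particular benefits from small shared helpers. A hypothetical sketch (expect_arity, as_f64, and this simplified Value enum are illustrative, not part of nostos-extension):

```rust
// Simplified stand-in for the real Value type.
#[derive(Debug)]
enum Value { Float(f64), Int(i64) }

/// Return a descriptive error if the argument count is wrong.
fn expect_arity(fn_name: &str, args: &[Value], n: usize) -> Result<(), String> {
    if args.len() != n {
        Err(format!("{}: expected {} argument(s), got {}", fn_name, n, args.len()))
    } else {
        Ok(())
    }
}

/// Extract a Float or produce an error naming the offending function.
fn as_f64(fn_name: &str, v: &Value) -> Result<f64, String> {
    match v {
        Value::Float(f) => Ok(*f),
        other => Err(format!("{}: expected Float, got {:?}", fn_name, other)),
    }
}

fn scale(args: &[Value]) -> Result<Value, String> {
    expect_arity("scale", args, 2)?;
    let x = as_f64("scale", &args[0])?;
    let s = as_f64("scale", &args[1])?;
    Ok(Value::Float(x * s))
}

fn main() {
    assert!(matches!(scale(&[Value::Float(2.0), Value::Float(3.0)]),
                     Ok(Value::Float(v)) if v == 6.0));
    assert!(scale(&[Value::Int(1)]).is_err()); // wrong arity: descriptive error
    println!("validation ok");
}
```

Errors that name the function and the offending argument save a lot of debugging time once the extension is called from Nostos code.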

Summary

Creating a Nostos FFI extension involves:

  1. Set up a Rust project with crate-type = ["cdylib"] and nostos-extension dependency
  2. Use declare_extension! to export the entry point
  3. Register native functions in the register function
  4. Define type IDs and a cleanup callback for GC integration
  5. Create helper functions that wrap data in GC handles via Value::gc_handle
  6. Write native functions matching the fn(args, ctx) -> Result<Value, String> signature
  7. Create a .nos wrapper with types, traits, and __native__() calls
  8. Respect the async runtime - keep sync operations fast (<1ms), use spawn_blocking for heavy CPU work, and follow Tokio patterns for I/O libraries

The result is a seamless integration where Nostos code can use native Rust libraries with full type safety, operator overloading, and automatic memory management. Just remember: Nostos runs on Tokio, so if your library speaks Tokio, your extension should too!