C99 · MIT Licensed · Zero Dependencies · Allocation-Free

The math library embedded systems always deserved.

numx brings scientific-grade numerical computing to bare-metal hardware. No heap. No operating system. No external dependencies. Fourteen production-ready modules span linear algebra, signal processing, automatic differentiation, post-quantum NTT, and more, all designed to run directly on ESP32, ARM Cortex-M, RISC-V, and any other target with a C99 compiler.


Build: Passing · License: MIT · Lang: C99 · ESP32: Validated · ARM Cortex-M: Supported · RISC-V: Supported

example.c

#include "numx/numx.h"

/* Integrand f(x) = x^2. The callback signature is assumed for illustration. */
static numx_real_t my_func(numx_real_t x) {
    return x * x;
}

int main(void) {
    /* float32 default; -DNUMX_USE_DOUBLE=1 for float64 */
    /* every function returns numx_status_t, always check it */

    numx_real_t result;
    numx_status_t s = numx_integrate_simpson(
        my_func, 0.0f, 1.0f, 100, &result
    );

    if (s != NUMX_OK) {
        /* e.g. NUMX_ERR_INVALID_ARG or NUMX_ERR_NO_CONVERGE */
        return 1;
    }

    return 0;
}

Engineering Principles

Not features. Constraints.

Every decision in numx was made in the context of the hardware it runs on. These are the four constraints that shaped the entire architecture.

Zero Dynamic Allocation

Not a single call to malloc, calloc, or realloc exists anywhere in the numx codebase. On embedded systems, heap fragmentation is not a theoretical concern; it is a production failure mode. Every buffer, every intermediate result, and every output lives either on the stack or in caller-provided memory. The library never surprises you with a memory footprint larger than what you gave it.
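As a rough illustration of the caller-provided-memory pattern described above, the sketch below computes a moving average into an output buffer the caller owns. All names here (demo_status_t, demo_moving_average) are hypothetical and are not taken from the numx API.

```c
#include <stddef.h>

/* Hypothetical status type for this sketch; not the numx API. */
typedef enum {
    DEMO_OK = 0,
    DEMO_ERR_INVALID_ARG,
    DEMO_ERR_BUFFER_TOO_SMALL
} demo_status_t;

/* Moving average of width w over in[0..n).
   Writes n - w + 1 values into out, which the caller owns.
   No heap allocation: the only storage used is what the caller passes in. */
demo_status_t demo_moving_average(const float *in, size_t n, size_t w,
                                  float *out, size_t out_cap)
{
    if (in == NULL || out == NULL || w == 0 || w > n) {
        return DEMO_ERR_INVALID_ARG;
    }
    size_t out_len = n - w + 1;
    if (out_cap < out_len) {
        return DEMO_ERR_BUFFER_TOO_SMALL;
    }

    /* Sum of the first window, then slide it one sample at a time. */
    float sum = 0.0f;
    for (size_t i = 0; i < w; ++i) {
        sum += in[i];
    }
    out[0] = sum / (float)w;
    for (size_t i = w; i < n; ++i) {
        sum += in[i] - in[i - w];
        out[i - w + 1] = sum / (float)w;
    }
    return DEMO_OK;
}
```

Because the capacity is passed explicitly, the worst-case memory footprint is visible at the call site and can be sized at compile time.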

Fully Reentrant

numx holds no global state, no static mutable buffers, no hidden dependencies between calls. This matters in real embedded systems where the same library functions may be called from interrupt handlers, from RTOS tasks running concurrently, or from multiple execution contexts on multicore processors. Reentrancy is not a feature. It is a requirement for production use.
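One common way to get this kind of reentrancy is to keep all filter state in a struct the caller owns, so two contexts never share hidden data. The sketch below does this for an exponential moving average; demo_ema_t and its functions are hypothetical names, not the numx API.

```c
/* Hypothetical EMA filter whose entire state lives in a caller-owned struct.
   There are no globals or static buffers, so separate instances can be
   used from different tasks or interrupt handlers without interfering. */
typedef struct {
    float alpha;  /* smoothing factor in (0, 1] */
    float y;      /* current filter output */
    int primed;   /* 0 until the first sample has been seen */
} demo_ema_t;

void demo_ema_init(demo_ema_t *f, float alpha)
{
    f->alpha = alpha;
    f->y = 0.0f;
    f->primed = 0;
}

float demo_ema_update(demo_ema_t *f, float x)
{
    if (!f->primed) {
        /* Seed the filter with the first sample. */
        f->y = x;
        f->primed = 1;
    } else {
        f->y += f->alpha * (x - f->y);
    }
    return f->y;
}
```

Each instance is independent, though a single instance shared between contexts would still need external synchronization.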

Typed Status Codes

Every function returns numx_status_t. If a solver does not converge, if a matrix is singular, or if an input falls outside the valid range for an algorithm, the caller knows exactly what happened and why. There is no silent failure in numx. This is the only honest way to build software for systems where silent failure can mean a sensor reading the wrong value or a control loop running on corrupted data.
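To make the status-code idea concrete, here is a minimal sketch of a 2x2 linear solve that reports a singular matrix instead of returning garbage. The enum values and demo_solve2 are hypothetical; the real numx_status_t values may differ.

```c
#include <stddef.h>

/* Hypothetical status type for this sketch; not the numx API. */
typedef enum {
    DEMO_OK = 0,
    DEMO_ERR_INVALID_ARG,
    DEMO_ERR_SINGULAR
} demo_status_t;

/* Solve the 2x2 system A x = b by Cramer's rule.
   A is row-major: {a00, a01, a10, a11}.
   The result is written to x only when DEMO_OK is returned. */
demo_status_t demo_solve2(const double A[4], const double b[2], double x[2])
{
    if (A == NULL || b == NULL || x == NULL) {
        return DEMO_ERR_INVALID_ARG;
    }
    const double eps = 1e-12; /* illustrative singularity threshold */
    double det = A[0] * A[3] - A[1] * A[2];
    if (det < eps && det > -eps) {
        return DEMO_ERR_SINGULAR;
    }
    x[0] = (b[0] * A[3] - A[1] * b[1]) / det;
    x[1] = (A[0] * b[1] - b[0] * A[2]) / det;
    return DEMO_OK;
}
```

The numerical result travels through an output pointer, which leaves the return value free to carry the status on every call.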

Single Precision Flag

The precision of the entire library switches between float32 and float64 with a single compile-time flag: -DNUMX_USE_DOUBLE=1. On hardware with a double-precision floating-point unit, float64 runs at native speed. On targets with a single-precision FPU or no FPU, or where memory and speed are tighter, float32 keeps the footprint minimal. One library. One codebase. The right precision for every target.
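A compile-time precision switch of this kind is usually built from a typedef and a literal macro. The sketch below shows one plausible arrangement; DEMO_USE_DOUBLE, demo_real_t, and DEMO_REAL are hypothetical stand-ins, and how numx implements NUMX_USE_DOUBLE internally is an assumption.

```c
/* Hypothetical precision switch; build with -DDEMO_USE_DOUBLE=1 for float64. */
#ifndef DEMO_USE_DOUBLE
#define DEMO_USE_DOUBLE 0
#endif

#if DEMO_USE_DOUBLE
typedef double demo_real_t;
#define DEMO_REAL(x) (x)      /* literal stays a double constant */
#else
typedef float demo_real_t;
#define DEMO_REAL(x) (x##f)   /* 0.5 becomes 0.5f, avoiding double math */
#endif

/* Linear interpolation written once against demo_real_t. */
demo_real_t demo_lerp(demo_real_t a, demo_real_t b, demo_real_t t)
{
    return a + t * (b - a);
}
```

Writing constants through the macro matters on single-precision FPUs, where an unsuffixed literal would silently promote the arithmetic to software-emulated double.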

14 Modules

Every algorithm embedded engineers actually need.

Each module is self-contained, allocation-free, and integrates cleanly into any embedded project regardless of platform or toolchain.

linalg complete

Vectors and matrices. Dot product, norms, cross product, matrix multiply, transpose, determinant, and LU decomposition.

dot · norm · cross · mat_mul · transpose · det · lu

stats complete

Descriptive statistics on fixed-size buffers. Mean, variance, standard deviation, median, and percentile without a runtime.

mean · variance · std_dev · median · percentile

roots complete

Equation root-finding. Bisection for guaranteed convergence, Newton-Raphson for speed, and Brent's method for both.

bisect · newton · brent
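For readers unfamiliar with why bisection guarantees convergence, here is a minimal sketch: each step halves a bracket that is known to contain a sign change. The function name, signature, and status values are hypothetical and do not describe the numx roots API.

```c
#include <stddef.h>

typedef double (*demo_fn_t)(double);

/* Hypothetical status type for this sketch; not the numx API. */
typedef enum {
    DEMO_OK = 0,
    DEMO_ERR_INVALID_ARG,
    DEMO_ERR_NO_CONVERGE
} demo_status_t;

/* Example function: its positive root is sqrt(2). */
double demo_sqrt2_residual(double x)
{
    return x * x - 2.0;
}

/* Bisection on [a, b]; requires f(a) and f(b) to have opposite signs. */
demo_status_t demo_bisect(demo_fn_t f, double a, double b,
                          double tol, int max_iter, double *root)
{
    if (f == NULL || root == NULL || !(a < b) || !(tol > 0.0) || max_iter <= 0) {
        return DEMO_ERR_INVALID_ARG;
    }
    double fa = f(a);
    double fb = f(b);
    if (fa == 0.0) { *root = a; return DEMO_OK; }
    if (fb == 0.0) { *root = b; return DEMO_OK; }
    if ((fa < 0.0) == (fb < 0.0)) {
        return DEMO_ERR_INVALID_ARG; /* no sign change in the bracket */
    }
    for (int i = 0; i < max_iter; ++i) {
        double half = 0.5 * (b - a);
        double m = a + half;
        double fm = f(m);
        if (fm == 0.0 || half < tol) {
            *root = m;
            return DEMO_OK;
        }
        /* Keep the half of the bracket that still contains the sign change. */
        if ((fm < 0.0) == (fa < 0.0)) {
            a = m;
            fa = fm;
        } else {
            b = m;
        }
    }
    return DEMO_ERR_NO_CONVERGE;
}
```

The iteration cap keeps the worst-case run time bounded, which matters when the solver runs inside a fixed-period control loop.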

integrate complete

Numerical integration. Trapezoidal rule, Simpson's rule, and Gaussian quadrature for varying accuracy requirements.

trap · simpson · gauss
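As a reference point for what composite Simpson's rule computes, the sketch below implements the textbook formula with an even number of subintervals. It is an independent illustration with hypothetical names and is not the implementation behind numx_integrate_simpson.

```c
#include <stddef.h>

typedef double (*demo_fn_t)(double);

/* Hypothetical status type for this sketch; not the numx API. */
typedef enum { DEMO_OK = 0, DEMO_ERR_INVALID_ARG } demo_status_t;

/* Example integrand: f(x) = x^2. */
double demo_square(double x)
{
    return x * x;
}

/* Composite Simpson's rule on [a, b] with n subintervals (n must be even).
   Weights follow the pattern 1, 4, 2, 4, ..., 2, 4, 1. */
demo_status_t demo_simpson(demo_fn_t f, double a, double b,
                           unsigned n, double *out)
{
    if (f == NULL || out == NULL || n == 0 || (n % 2u) != 0u) {
        return DEMO_ERR_INVALID_ARG;
    }
    double h = (b - a) / (double)n;
    double sum = f(a) + f(b);
    for (unsigned i = 1; i < n; ++i) {
        double w = (i % 2u) ? 4.0 : 2.0; /* odd nodes 4, even interior nodes 2 */
        sum += w * f(a + (double)i * h);
    }
    *out = sum * h / 3.0;
    return DEMO_OK;
}
```

Simpson's rule is exact for polynomials up to degree three, so integrating x^2 over [0, 1] should reproduce 1/3 to within rounding error.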

differentiate complete

Finite difference derivatives. Forward difference, central difference, and Richardson extrapolation for improved accuracy.

forward · central · richardson
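The accuracy gain from Richardson extrapolation can be seen in a few lines: combining central differences at step h and h/2 cancels the leading h^2 error term. The function names below are hypothetical and only sketch the technique.

```c
typedef double (*demo_fn_t)(double);

/* Example function: f(x) = x^3, so f'(2) = 12. */
double demo_cube(double x)
{
    return x * x * x;
}

/* Central difference: error is O(h^2). */
double demo_diff_central(demo_fn_t f, double x, double h)
{
    return (f(x + h) - f(x - h)) / (2.0 * h);
}

/* One Richardson step on the central difference: error is O(h^4).
   D = (4 * D(h/2) - D(h)) / 3 cancels the h^2 term. */
double demo_diff_richardson(demo_fn_t f, double x, double h)
{
    double coarse = demo_diff_central(f, x, h);
    double fine = demo_diff_central(f, x, 0.5 * h);
    return (4.0 * fine - coarse) / 3.0;
}
```

For x^3 at x = 2 with h = 0.1, the central difference gives about 12.01, while the extrapolated value recovers 12 up to rounding, since a cubic has no error terms beyond h^2.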

interpolate complete

Curve fitting and reconstruction. Linear interpolation, cubic splines, and Chebyshev polynomial approximation.

linear · cubic_spline · chebyshev

poly complete

Polynomial arithmetic. Horner's method for efficient evaluation and Newton's method with deflation for root-finding.

eval · roots
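Horner's method evaluates a degree-(n-1) polynomial with n-1 multiplications and n-1 additions and no calls to pow. A minimal sketch follows; the coefficient ordering (constant term first) and the function name are assumptions made for this example.

```c
#include <stddef.h>

/* Evaluate c[0] + c[1]*x + ... + c[n-1]*x^(n-1) by Horner's method.
   Coefficients are stored lowest degree first (an assumption of this sketch).
   Returns 0 for an empty coefficient array. */
double demo_horner(const double *c, size_t n, double x)
{
    double acc = 0.0;
    for (size_t i = n; i-- > 0; ) {
        acc = acc * x + c[i]; /* fold in the next lower coefficient */
    }
    return acc;
}
```

For example, the coefficients {1, 2, 3} represent 1 + 2x + 3x^2, which evaluates to 17 at x = 2.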

ode complete

ODE solvers for physical simulation. Fixed-step RK4 for deterministic loops and adaptive RK45 with error control.

rk4 · rk45
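A fixed-step RK4 update is deterministic because it always costs exactly four derivative evaluations per step. The sketch below shows the classical scheme for a scalar ODE y' = f(t, y); the names and the scalar-only signature are hypothetical simplifications.

```c
typedef double (*demo_ode_fn_t)(double t, double y);

/* Example right-hand side: y' = y, whose solution is y(t) = y(0) * exp(t). */
double demo_growth(double t, double y)
{
    (void)t; /* autonomous system: t is unused */
    return y;
}

/* One classical fourth-order Runge-Kutta step of size h. */
double demo_rk4_step(demo_ode_fn_t f, double t, double y, double h)
{
    double k1 = f(t, y);
    double k2 = f(t + 0.5 * h, y + 0.5 * h * k1);
    double k3 = f(t + 0.5 * h, y + 0.5 * h * k2);
    double k4 = f(t + h, y + h * k3);
    return y + (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4);
}
```

Taking ten steps of h = 0.1 from y(0) = 1 should land close to e, with an error on the order of 1e-6.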

signal complete

Signal processing primitives. FIR filters, IIR filters, convolution, correlation, windowing, EMA, and peak detection.

fir · iir · convolve · correlate · window · ema · peaks
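A streaming FIR filter fits the allocation-free style naturally: the taps and the delay line are both arrays supplied by the caller. The sketch below uses a circular delay line; demo_fir_t and demo_fir_step are hypothetical names, not the numx signal API.

```c
#include <stddef.h>

/* Hypothetical FIR filter state. Both arrays are caller-owned and have
   length n; delay must be zero-initialized before the first sample. */
typedef struct {
    const float *taps; /* taps[0] weights the newest sample */
    float *delay;      /* circular buffer of the last n inputs */
    size_t n;          /* number of taps */
    size_t pos;        /* index where the next sample is written */
} demo_fir_t;

/* Push one input sample and return one output sample. */
float demo_fir_step(demo_fir_t *f, float x)
{
    f->delay[f->pos] = x;

    float acc = 0.0f;
    size_t idx = f->pos;
    for (size_t k = 0; k < f->n; ++k) {
        acc += f->taps[k] * f->delay[idx];
        idx = (idx == 0) ? f->n - 1 : idx - 1; /* walk back in time */
    }

    f->pos = (f->pos + 1 == f->n) ? 0 : f->pos + 1;
    return acc;
}
```

With taps {0.5, 0.5} this reduces to a two-point average, so the inputs 2, 4, 6 produce 1, 3, 5.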

fft complete

Fast Fourier Transform. Cooley-Tukey radix-2 in float32 and Q15 fixed-point, IFFT, and magnitude spectrum.

fft_f32 · fft_q15 · ifft · magnitude

autodiff complete

Automatic differentiation without a runtime. Forward mode via dual numbers, reverse mode via a static compile-time tape.

forward · reverse · grad
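Forward-mode differentiation with dual numbers needs nothing beyond a two-field struct: each value carries its derivative along, and every arithmetic operation applies the matching derivative rule. The types and functions below are a hypothetical sketch, not the numx autodiff API.

```c
/* A dual number: val is f(x), dot is f'(x). */
typedef struct {
    double val;
    double dot;
} demo_dual_t;

/* The independent variable x has derivative 1. */
demo_dual_t demo_dual_var(double x)
{
    demo_dual_t d = { x, 1.0 };
    return d;
}

/* A constant has derivative 0. */
demo_dual_t demo_dual_const(double c)
{
    demo_dual_t d = { c, 0.0 };
    return d;
}

/* Sum rule: (u + v)' = u' + v'. */
demo_dual_t demo_dual_add(demo_dual_t u, demo_dual_t v)
{
    demo_dual_t d = { u.val + v.val, u.dot + v.dot };
    return d;
}

/* Product rule: (u * v)' = u' * v + u * v'. */
demo_dual_t demo_dual_mul(demo_dual_t u, demo_dual_t v)
{
    demo_dual_t d = { u.val * v.val, u.dot * v.val + u.val * v.dot };
    return d;
}

/* Example: f(x) = x^3 + 2x, so f'(x) = 3x^2 + 2. */
demo_dual_t demo_dual_example(demo_dual_t x)
{
    demo_dual_t x3 = demo_dual_mul(demo_dual_mul(x, x), x);
    demo_dual_t two_x = demo_dual_mul(demo_dual_const(2.0), x);
    return demo_dual_add(x3, two_x);
}
```

Evaluating the example at x = 3 yields the value 33 and the derivative 29 in a single pass, with no step size to tune and no truncation error.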

compressed_sensing complete

Sparse signal recovery. Orthogonal Matching Pursuit and Iterative Shrinkage-Thresholding for compressed sensing.

omp · ista
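The core of Iterative Shrinkage-Thresholding is the soft-threshold operator, which is applied element-wise after each gradient step and is what drives small coefficients to exactly zero. A minimal in-place sketch, with a hypothetical function name:

```c
#include <stddef.h>

/* In-place soft threshold: x[i] <- sign(x[i]) * max(|x[i]| - t, 0).
   Assumes t >= 0. Entries with magnitude at most t become exactly zero. */
void demo_soft_threshold(float *x, size_t n, float t)
{
    for (size_t i = 0; i < n; ++i) {
        if (x[i] > t) {
            x[i] -= t;
        } else if (x[i] < -t) {
            x[i] += t;
        } else {
            x[i] = 0.0f;
        }
    }
}
```

In a full ISTA iteration this operator would follow a gradient step on the least-squares residual, with the threshold set to the step size times the sparsity weight.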

sketch complete

Randomized matrix methods. Randomized SVD using the Halko-Martinsson-Tropp algorithm for large matrix approximation.

rsvd

ntt complete

Number Theoretic Transform over Z_3329[x]/(x^256+1). Forward and inverse NTT with a data-independent butterfly network, pointwise basemul in 128 degree-2 rings, polynomial multiplication, and coefficient reduction. The modulus q = 3329 and degree n = 256 are the CRYSTALS-Kyber parameters. Zero heap allocation.

forward · inverse · pointwise_mul · polymul · poly_add · poly_sub · reduce
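The "basemul in degree-2 rings" step multiplies two linear polynomials modulo (X^2 - zeta) with coefficients modulo 3329. The sketch below spells out that arithmetic using the plain % operator for clarity. It is a hypothetical illustration only: it makes no constant-time guarantee, and production code typically uses Montgomery or Barrett reduction instead.

```c
#include <stdint.h>

#define DEMO_Q 3329u

/* r = a * b in Z_q[X] / (X^2 - zeta), where a = a[0] + a[1] X, etc.
   (a0 + a1 X)(b0 + b1 X) = a0 b0 + a1 b1 zeta + (a0 b1 + a1 b0) X.
   Results are computed in locals first, so r may alias a or b. */
void demo_basemul(uint16_t r[2], const uint16_t a[2], const uint16_t b[2],
                  uint16_t zeta)
{
    uint32_t a0 = a[0] % DEMO_Q;
    uint32_t a1 = a[1] % DEMO_Q;
    uint32_t b0 = b[0] % DEMO_Q;
    uint32_t b1 = b[1] % DEMO_Q;
    uint32_t z = zeta % DEMO_Q;

    /* Reduce a1*b1 before multiplying by zeta so products fit in 32 bits. */
    uint32_t t = (a1 * b1) % DEMO_Q;
    uint32_t c0 = (a0 * b0 + t * z) % DEMO_Q;
    uint32_t c1 = (a0 * b1 + a1 * b0) % DEMO_Q;

    r[0] = (uint16_t)c0;
    r[1] = (uint16_t)c1;
}
```

With a = 1 + 2X, b = 3 + 4X, and zeta = 17, the product is 139 + 10X.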

View full module reference →

Target Hardware

Runs where it matters.

ESP32 240 MHz, FPU · ARM Cortex-M0/M0+ · ARM Cortex-M4 DSP · ARM Cortex-A · RISC-V RV32IMFC · AVR 8-bit · x86-64 dev/CI

GCC. Clang. IAR. MSVC. If your toolchain compiles C99, numx compiles on your toolchain. No platform-specific code. No architecture-specific intrinsics. No dependencies on any header outside the C99 standard library.

Quick Start

Up and running in minutes.

# Option A: FetchContent in your CMakeLists.txt (no cloning needed)
include(FetchContent)
FetchContent_Declare(numx
  GIT_REPOSITORY https://github.com/NIKX-Tech/numx.git
  GIT_TAG        v1.0.0
)
FetchContent_MakeAvailable(numx)
target_link_libraries(my_target PRIVATE numx::numx)

# Option B: clone and build locally (shell)
git clone https://github.com/NIKX-Tech/numx.git
cd numx
cmake -B build -DCMAKE_BUILD_TYPE=Release
cmake --build build --parallel
ctest --test-dir build --output-on-failure

Full installation guide →

Production Validated

Not a prototype.

Through 2024 and 2025, NIKX Technologies developed TERRA, a production IoT platform running on ESP32 microcontrollers. Its signal processing pipelines, numerical solvers, and on-device mathematical operations run at 240 MHz, allocation-free, within real hardware memory constraints.

TERRA validated every architectural decision that had gone into numx since 2020. Zero dynamic allocation was not a design preference. It was the difference between a library that holds up in production and one that fails unpredictably under load.

validation-report.txt

x86-64 / ARM64 / Windows:  329 / 329 tests passed
ESP32-S3:                  548 / 550 tests passed
Modules:                   14 of 14 platform-validated
malloc() calls:            0 in the entire codebase

Free. Open. Ready for production.

MIT licensed. No royalties, no attribution requirements beyond the license notice, no restrictions on commercial or academic use.

Validated on x86-64, ARM64, Windows, ESP32-S3, and Raspberry Pi 4B. Full hardware results in the repository.

