Live / 2026

Hawk

A self-correcting ML system trading live financial markets.

An end-to-end machine-learning system that ingests live market data 24/7, predicts short-horizon outcomes with a calibrated model, and retrains itself behind automated drift gates — running unattended in production for months.

Uptime: months, unattended

Live snapshots: 130k+

Retrain safety: drift-gated

Calibration: isotonic

The problem

Most ML projects die in a notebook: a model trained once on a static dataset, evaluated on a clean test split, and never deployed. The hard part of machine learning in the real world isn't fitting a model — it's keeping one accurate, calibrated, and trustworthy as live data drifts underneath it.

Hawk is my answer to that. It targets a deliberately unforgiving domain — short-horizon prediction on live, adversarial markets — as a forcing function for genuine production ML engineering. The goal was never a one-off backtest; it was a system that operates for months and corrects itself.

What I built

Live data pipeline: a collector running 24/7 on a Linux VPS snapshots market and order-book state on a fixed interval into a columnar (Parquet) store — 130k+ snapshots and counting.
Engineered feature set built on domain physics (distance-to-strike normalized by time and volatility, order-book imbalance, momentum) rather than throwing raw inputs at the model.
LightGBM classifier with isotonic probability calibration and group-aware walk-forward validation, so reported confidence reflects real-world frequencies instead of overfit optimism.
Automated retraining behind safety gates: the system retrains on new data on its own, but a candidate model is promoted only if it passes the gate. Any candidate whose calibration (Brier score) regresses beyond a strict threshold is rejected and the current model stays live.
Hot model-reload: the live process swaps in a newly accepted model on file change — no downtime, no restart.
Observability built in: drift monitoring, a model-change audit log, and alerting so a silent model swap can never go unnoticed.
Risk discipline as a first-class feature — fixed-dollar sizing, position limits, and staged validation (paper-first) before any real capital. The system is engineered to scale capital deliberately, not recklessly.
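
The engineered features listed above can be sketched in a few lines. This is a minimal illustration, not Hawk's actual code: the function names, the log-distance form, and the per-square-root-second volatility unit are all assumptions chosen for the example.

```python
import math


def distance_to_strike(price, strike, vol_per_sqrt_second, seconds_left):
    """Log-distance to the strike, scaled by the move expected in the time left.

    Assumed form: ln(price / strike) / (vol * sqrt(seconds_left)).
    The result behaves like a z-score: large magnitudes mean the strike is
    unlikely to be crossed before the horizon ends.
    """
    if seconds_left <= 0 or vol_per_sqrt_second <= 0:
        raise ValueError("seconds_left and volatility must be positive")
    expected_move = vol_per_sqrt_second * math.sqrt(seconds_left)
    return math.log(price / strike) / expected_move


def book_imbalance(bid_size, ask_size):
    """Order-book imbalance in [-1, 1]; positive means more resting bids."""
    total = bid_size + ask_size
    return 0.0 if total == 0 else (bid_size - ask_size) / total


def momentum(prices, lookback):
    """Log return over the last `lookback` snapshots."""
    if lookback <= 0 or len(prices) <= lookback:
        raise ValueError("need more than `lookback` prices")
    return math.log(prices[-1] / prices[-1 - lookback])
```

Normalizing by time and volatility means the same raw price gap reads as a small value early in a volatile window and a large one just before expiry, which is the kind of structure a tree model would otherwise have to rediscover from raw inputs.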

Architecture

1. Coinbase / market WebSocket + REST → live price & order-book feed
2. Collector (systemd, 24/7) → snapshot every interval → Parquet store
3. Feature engineering → LightGBM + isotonic calibration → walk-forward eval
4. Retrain job → Brier-regression gate → accept / reject candidate model
5. Live engine hot-reloads accepted model → calibrated prediction + risk sizing
6. Drift monitor + model-change audit log + alerting
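
The Brier-regression gate in the retrain step can be sketched as follows. The Brier score itself is standard (mean squared error between predicted probability and the 0/1 outcome); the function names and the `max_regression` value of 0.002 are hypothetical, since the page does not state the threshold Hawk uses.

```python
def brier_score(probs, outcomes):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    if not probs or len(probs) != len(outcomes):
        raise ValueError("probs and outcomes must be non-empty and equal length")
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


def passes_gate(live_brier, candidate_brier, max_regression=0.002):
    """Accept the candidate unless its Brier score is worse than the live
    model's by more than `max_regression` (lower Brier is better).

    The 0.002 default is a placeholder for illustration only.
    """
    return candidate_brier - live_brier <= max_regression
```

Both scores are assumed to be computed on the same held-out window, so the comparison measures the models rather than a change in the data.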

What it demonstrates

Hawk is the full MLOps lifecycle in one system: data engineering, feature engineering, calibrated modeling, time-series validation, automated retraining with regression guards, hot deployment, and production monitoring.
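
As an illustration of the time-series validation step, here is one way a group-aware walk-forward split could look. It is a sketch under assumptions: the page does not say what a "group" is in Hawk (one market or contract per group is a plausible reading), and the function name and parameters are made up for this example. It also assumes rows are in time order and that groups do not overlap in time.

```python
def walk_forward_group_splits(groups, min_train_groups=2, test_groups=1):
    """Yield (train_idx, test_idx) pairs with an expanding training window.

    `groups` holds one group id per row, in time order. Every row of a group
    lands on the same side of a split, and test groups always come after all
    training groups, so no future information reaches the training fold.
    """
    ordered = list(dict.fromkeys(groups))  # unique ids, first-seen order
    start = min_train_groups
    while start + test_groups <= len(ordered):
        train_set = set(ordered[:start])
        test_set = set(ordered[start:start + test_groups])
        train_idx = [i for i, g in enumerate(groups) if g in train_set]
        test_idx = [i for i, g in enumerate(groups) if g in test_set]
        yield train_idx, test_idx
        start += test_groups
```

Splitting by group rather than by row matters because snapshots of the same market are highly correlated; a row-level split would let near-duplicates of the test rows into training and inflate the reported scores.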

It is as much a system-design problem as a modeling one. The decisions that made it work were architectural: separating collection from inference so a stalled feed can never corrupt training data, making the retrain gate the only path a model can take into production, treating the model artifact as hot-swappable state with an audit trail, and designing the risk controls as a layer the model cannot talk its way past.
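
The idea of a hot-swappable model artifact with an audit trail can be sketched like this. Everything here is hypothetical: the `HotModel` class, the JSON-lines audit format, and the `loader` callable stand in for whatever Hawk actually does. Change detection uses a content hash rather than the file's modification time, which is an assumption made for the example because it doubles as a fingerprint for the audit log.

```python
import hashlib
import json
import time
from pathlib import Path


class HotModel:
    """Holds the live model and swaps it when the artifact on disk changes."""

    def __init__(self, model_path, audit_path, loader):
        self.model_path = Path(model_path)
        self.audit_path = Path(audit_path)
        self.loader = loader  # hypothetical: turns raw bytes into a model
        self.model = None
        self.digest = None

    def refresh(self):
        """Reload if the file content changed. Returns True when a swap happened."""
        data = self.model_path.read_bytes()
        digest = hashlib.sha256(data).hexdigest()
        if digest == self.digest:
            return False
        # If the loader raises, the exception propagates before any state
        # changes, so the previous model stays in place.
        new_model = self.loader(data)
        record = {"ts": time.time(), "old": self.digest, "new": digest}
        with self.audit_path.open("a") as f:
            f.write(json.dumps(record) + "\n")
        self.model, self.digest = new_model, digest
        return True
```

The live loop would call `refresh()` between predictions. Writing the audit record before the swap takes effect means every model that ever served a prediction has a logged fingerprint.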

It reflects how I think about ML in production — that a model is only as good as the system keeping it honest, and that calibration, drift detection, and risk controls matter more than a single impressive metric.
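
To make drift detection concrete: the page does not say which drift statistic Hawk computes, so the sketch below uses the population stability index (PSI), a common choice, purely as an example. The function name, bin count, and epsilon floor are assumptions, and both inputs are assumed non-empty with at least `n_bins` reference values.

```python
import math


def population_stability_index(reference, live, n_bins=10, eps=1e-6):
    """PSI between a reference sample and a live sample of one feature.

    Bins are cut at quantiles of the reference sample. Empty bins are floored
    at `eps` so the logarithm stays defined.
    """
    ref_sorted = sorted(reference)
    edges = [ref_sorted[len(ref_sorted) * i // n_bins] for i in range(1, n_bins)]

    def proportions(values):
        counts = [0] * n_bins
        for v in values:
            # Bin index = number of edges strictly below the value.
            counts[sum(v > e for e in edges)] += 1
        return [max(c / len(values), eps) for c in counts]

    p_ref = proportions(reference)
    p_live = proportions(live)
    return sum((l - r) * math.log(l / r) for r, l in zip(p_ref, p_live))
```

A PSI near zero means the live distribution matches the training-time one; the cutoff at which a monitor should raise an alert is a tuning decision and is not specified here.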

Stack

Python · LightGBM · pandas / NumPy · Parquet · scikit-learn · Linux / systemd · VPS (24/7)

In production

Hawk's live operations dashboard: a real-time terminal UI surfacing session P&L, open positions, recent market resolutions (with the model's predicted probability per trade), and a per-strategy performance breakdown. This is the observability layer that makes the system safe to run unattended. (Single live session, May 2026.)

More work

StateBizMap: A network map of who's really behind a state's businesses.
Scanopy: Turning raw aerial imagery into a scored, mapped lead product.
Portal CRM: A complete self-hosted CRM + ops platform for a field-service crew.
VectorAI: An LLM agent that finds and ranks the jobs actually worth applying to.
Sticky: A digital corkboard built for the people who still use paper ones.