
If RL taught me “the state is the contract,” then trading is where that contract becomes painful. This month I map order book microstructure into concrete feature families my models can actually learn from.
Axel Domingues
In 2018, RL taught me something I didn’t fully appreciate at the time:
the state representation is the real interface.
If your agent sees the wrong state, it learns the wrong world.
Now I’m aiming at BitMEX (Bitcoin Mercantile Exchange) and suddenly that lesson is unavoidable.
Because in markets, “state” is not a grid world.
It’s a living order book — and it’s very easy to lie to yourself about what it’s saying.
This month is about making a promise:
whatever I train on, I must be able to compute the same way in live trading.
That means defining feature families grounded in microstructure, and implementing them in code that can run both in the collector and in the live loop.
Repo grounding (this post’s center of gravity):
- `BitmexPythonDataCollector/SnapshotManager.py`
- `BitmexPythonChappie/SnapshotManager.py`

The mental model
The model doesn’t “see the market”.
It sees a vector I choose to construct.
- Feature families: price, depth, flow, volatility, liquidity creation. Each one is a different microstructure story.
- Multi-scale windows: everything is computed over time windows (seconds → minutes → hour) because regimes live at multiple speeds.
- Train/live parity: the same feature computation exists in both the collector and Chappie, so “research” can’t drift from “reality”.
I’m documenting how I build systems, how they fail, and how I debug them — not suggesting anyone should trade.
In Gym environments, the observation space is a spec.
In trading, it’s a trap.
You can always invent a feature that looks predictive in a backtest — especially if it accidentally smuggles the future in.
So I’m approaching features like I approached RL instrumentation in 2018:
The implementation lives in SnapshotManager.__create_current_snapshot(...).
The point is not “clever features”.
The point is: features that represent microstructure effects I can explain.
These are the obvious ones, but they anchor everything else:
- the spread between best bid and best ask (`quotes_diff`)
- seconds since the last price move (`seconds_last_move`)

This is where I start because it gives me basic sanity checks:
I don’t start with a full book image yet.
I start with something I can reason about:
- sizes at the top levels of the book (`RELEVANT_NUM_QUOTES = 5`)
- summed depth further into the book (`bid_depth_l10`, `ask_depth_l10`, `bid_depth_l25`, `ask_depth_l25`)

This is microstructure in plain terms:
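As a sketch of what these keys could look like in code (my own illustration with a hypothetical helper, not the repo's implementation; the key names mirror the snapshot keys above):

```python
# Hypothetical sketch: depth and top-of-book features from sorted levels.
RELEVANT_NUM_QUOTES = 5

def depth_features(bids, asks):
    """bids/asks: lists of (price, size), best level first."""
    feats = {}
    # summed depth over the top 10 and top 25 levels of each side
    for n in (10, 25):
        feats[f"bid_depth_l{n}"] = sum(size for _, size in bids[:n])
        feats[f"ask_depth_l{n}"] = sum(size for _, size in asks[:n])
    # individual sizes for the first few levels
    for i in range(RELEVANT_NUM_QUOTES):
        feats[f"bid_size_{i + 1}"] = bids[i][1] if i < len(bids) else 0
        feats[f"ask_size_{i + 1}"] = asks[i][1] if i < len(asks) else 0
    return feats
```

One invariant falls out immediately: `bid_depth_l10` can never be smaller than any single `bid_size_i`, which makes this family easy to test.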
Instead of only looking at current best bid/ask, I compute how much they moved over different time windows.
In code these are:
- `bid_change_<window>`
- `ask_change_<window>`

Each window is in seconds, and the system uses many windows (from ~1 second up to an hour).
This gives the model a chance to learn:
without me hardcoding “trend”.
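A minimal sketch of how such window deltas could be computed (my own illustration, assuming a history of `(timestamp, best_bid, best_ask)` samples; the repo's implementation may differ):

```python
from bisect import bisect_left

def quote_changes(history, now, windows):
    """history: list of (timestamp, best_bid, best_ask), oldest first.
    Returns best-quote deltas over each lookback window (in seconds)."""
    times = [t for t, _, _ in history]
    _, cur_bid, cur_ask = history[-1]
    feats = {}
    for w in windows:
        # index of the first sample at or after the window's left edge
        i = min(bisect_left(times, now - w), len(history) - 1)
        _, old_bid, old_ask = history[i]
        feats[f"bid_change_{w}"] = cur_bid - old_bid
        feats[f"ask_change_{w}"] = cur_ask - old_ask
    return feats
```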
This is one of the first features that made me feel like I was doing microstructure and not just “technical indicators”.
For each window (2 min → 1 hour), I compute:
- `cdf_<window>`: where the current mid price sits inside a normal approximation of recent mid prices
- `std_pct_<window>`: volatility proxy as std/mean over that window

This becomes my early “regime sensor”:
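In code, a normal-approximation CDF plus a std/mean ratio could look like this (a sketch under my own assumptions about the sampling, not the repo's exact formula):

```python
import statistics
from math import erf, sqrt

def regime_features(mid_prices, window_label):
    """mid_prices: mid prices sampled inside one window, oldest first."""
    mean = statistics.fmean(mid_prices)
    std = statistics.pstdev(mid_prices)
    current = mid_prices[-1]
    # CDF of the current mid under a normal fit to the window's mids:
    # ~0 near the window's lows, ~1 near its highs, 0.5 in the middle
    cdf = 0.5 if std == 0 else 0.5 * (1 + erf((current - mean) / (std * sqrt(2))))
    return {
        f"cdf_{window_label}": cdf,
        f"std_pct_{window_label}": std / mean,  # volatility proxy: std / mean
    }
```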
A limit order is intention.
A market order is conviction.
So I compute traded volume separated by aggressor side:
- `buy_volume_<window>`
- `sell_volume_<window>`

Not because volume is magic, but because in an order book world, flow is often the first thing that changes before price does.
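Aggressor-side volume is conceptually simple; a sketch (my own illustration, assuming the trade feed reports the taker side the way BitMEX's does):

```python
def flow_features(trades, now, windows):
    """trades: list of (timestamp, side, size), where side is the taker
    ('Buy' or 'Sell') as reported by the exchange's trade feed."""
    feats = {}
    for w in windows:
        recent = [t for t in trades if t[0] >= now - w]
        feats[f"buy_volume_{w}"] = sum(size for _, side, size in recent if side == "Buy")
        feats[f"sell_volume_{w}"] = sum(size for _, side, size in recent if side == "Sell")
    return feats
```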
This one is very “order-book-native” and becomes important later when maker behavior matters.
Over short windows I track created liquidity:
- `best_bid_created_liq_<window>`, `best_ask_created_liq_<window>`
- `bids_created_liq_<window>`, `asks_created_liq_<window>`

In plain words: how much fresh liquidity showed up at the best quotes, and across the book, over each window.
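One way to measure created liquidity is to diff consecutive book snapshots and sum the positive size deltas; this is my own hypothetical sketch, not necessarily the repo's exact definition:

```python
def created_liquidity(prev_levels, cur_levels, side="bid"):
    """One side of the book as {price: size}. Returns (at_best, total):
    size added at the current best level and across all levels."""
    total = 0.0
    for price, size in cur_levels.items():
        delta = size - prev_levels.get(price, 0.0)
        if delta > 0:  # only count liquidity that was added, not removed
            total += delta
    best_price = max(cur_levels) if side == "bid" else min(cur_levels)
    at_best = max(cur_levels[best_price] - prev_levels.get(best_price, 0.0), 0.0)
    return at_best, total
```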
I don’t trust single timescale features.
Markets have multiple clocks:
So the SnapshotManager computes many features across multiple time windows.
These constants are defined in both SnapshotManagers:
- `RELEVANT_TRADED_VOLUME_WINDOW` (seconds): `[1.5, 2.5, 5.5, 15.5, 30.5, 60.5, 120.5, 300.5, 900.5, 1800.5, 3600.5]`
- `RELEVANT_QUOTES_WINDOW` (seconds): `[120, 300, 900, 1800, 3600]`
- `RELEVANT_CREATED_LIQUIDITY_WINDOW` (seconds): `[1.5, 2.5, 5.5, 15.5, 30.5, 60.5, 120.5, 300.5]`

The “.5” is not a typo: it’s a practical trick to avoid edge effects when sampling around exact seconds.
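A toy illustration of the edge effect (my own example, assuming events tend to be timestamped on or near whole seconds):

```python
def in_window(event_ts, now, window):
    # an event belongs to the window iff it happened in the last `window` seconds
    return event_ts >= now - window

now = 10.0      # snapshot taken on a whole second
event = 8.0     # trade also stamped on a whole second
on_boundary = in_window(event, now, 2.0)   # True only by exact float equality
with_margin = in_window(event, now, 2.5)   # True with half a second of slack
outside = in_window(event, now, 1.5)       # unambiguously False
```

With integer windows, an event can sit exactly on the cutoff and flip in or out depending on float rounding; the half-second offset keeps whole-second timestamps safely away from the boundary.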
One of my rules: feature names are part of the API.
If I rename a feature, I’m changing the world my model lives in.
So here are the core keys (as implemented in __create_current_snapshot):
Instrument + contract:

- `indicative_settle_price`, `mark_price`, `fair_price`
- `contract_symbol`, `contract_expiry_date`

Quotes + spread:

- `bitmex_best_bid`, `bitmex_best_ask`
- `best_bid`, `best_ask`, `quotes_diff`

Trade constraints:

- `min_trade`, `max_trade`

Top-of-book sizes (5 levels):

- `bid_size_1..bid_size_5`
- `ask_size_1..ask_size_5`

Depth:

- `bid_depth_l10`, `ask_depth_l10`
- `bid_depth_l25`, `ask_depth_l25`

Quote movement:

- `bid_change_<seconds>`
- `ask_change_<seconds>`

Distribution / regime:

- `cdf_<seconds>`
- `std_pct_<seconds>`

Aggressive flow:

- `buy_volume_<seconds>`
- `sell_volume_<seconds>`

Liquidity creation:

- `best_bid_created_liq_<seconds>`, `bids_created_liq_<seconds>`
- `best_ask_created_liq_<seconds>`, `asks_created_liq_<seconds>`

Meta:

- `timestamp`, `seconds_last_move`, `moved_up`

This is deliberate.
I want both to match because I’ve seen this failure pattern too many times:
It won’t fail loudly. It will fail by “almost working”.
So I’m treating SnapshotManager like a shared contract, not a helper script.
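Because the names are the API, I like being able to enumerate the window-suffixed families deterministically. A sketch (hypothetical helper; the window list mirrors the constants quoted above):

```python
RELEVANT_QUOTES_WINDOW = [120, 300, 900, 1800, 3600]

def window_keys(prefixes, windows):
    """Expand feature-name prefixes into the full window-suffixed key family."""
    return [f"{prefix}_{w}" for prefix in prefixes for w in windows]

regime_keys = window_keys(["cdf", "std_pct"], RELEVANT_QUOTES_WINDOW)
# ['cdf_120', 'cdf_300', ..., 'std_pct_1800', 'std_pct_3600']
```

Comparing such a generated list against the keys both SnapshotManagers actually emit is a cheap parity test: any drift between collector and live loop shows up as a missing or extra key.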
I don’t start training first.
I start by trying to break my own features.
- `best_ask > best_bid`
- `quotes_diff >= 0`

Pick one feature family and force a scenario: aggressive trades should show up in `buy_volume_*`.

If `bid_size_1..5` are large, `bid_depth_l10` should usually be larger than any single level.
If it isn’t, I probably messed up indexing or sorting.
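The invariants above are cheap enough to run on every snapshot; a sketch of what such a check could look like (my own illustration, keyed to the snapshot names listed earlier):

```python
def check_snapshot(snap):
    """Cheap invariants to run on a snapshot before trusting it."""
    assert snap["best_ask"] > snap["best_bid"], "crossed book"
    assert snap["quotes_diff"] >= 0, "negative spread"
    # depth over 10 levels must dominate any single top-of-book level
    top = max(snap[f"bid_size_{i}"] for i in range(1, 6))
    assert snap["bid_depth_l10"] >= top, "depth < single level: indexing/sorting bug?"
    return True
```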
I want at least a few timestamps where I can say:
I’m not trying to solve the entire order book in February.
No L3 queue modeling.
No full book images into a CNN.
No “alpha signals” yet.
This month is about a foundation I can trust.
Start with what you can debug.
Only then scale the representation.
My research repo
This series is tied to code. Collector → datasets → features → alpha models → Gym envs → live loop (“Chappie”).
Why not richer representations (full book images, learned features) from the start?
I will — later.
But right now I need an observation space I can debug in plain English.
These feature families are a stepping stone: they encode microstructure explicitly, and they give me invariants I can test.
Once the pipeline is stable, I can try richer representations without losing my ability to reason about failure.
Why compute everything over multiple time windows?
Because markets move at multiple speeds.
If I use only one window, I force the model to treat “burst noise” and “slow drift” as the same phenomenon.
Multiple windows let the model learn multi-scale structure without me hardcoding “trend indicators”.
What failure mode worries me most?
Silent mismatch.
Feature code that almost matches between collection and live trading is the fastest way to produce beautiful backtests and broken reality.
That’s why SnapshotManager is treated as a contract, and why March is about the collector’s correctness.
Now that I know what the model will see, I need to build the thing that collects it reliably.
Next month is where the market stops being theory and becomes a hostile networked system.
The Collector - Websockets, Clock Drift, and the First Clean Snapshot
In March 2019 I stop “talking about microstructure” and start collecting it. Websockets drop messages, clocks drift, and the only thing that matters is producing a snapshot I can trust.
Order Books Are the Battlefield - Matching Engines in Plain English
In 2018 I learned RL inside clean Gym worlds. In 2019 I’m pointing that mindset at BitMEX — where the “environment” is a matching engine and the rewards come with slippage, fees, queue priority, and outages.