Filter Level 2 order book spoofing using size-to-fill ratios

A Level 2 book that shows 40,000 shares at the offer and prints only 900 shares when price trades into that level has a displayed-liquidity problem. The ratio is not proof of spoofing. It is a filter input.

Garrett Croft·Updated: July 04, 2026·19 min read

Spoofing is the placement of large orders with intent to cancel before execution. The function is to create false depth, pressure, or urgency. In U.S. markets, the Dodd-Frank Act of 2010 prohibits it by name as a disruptive trading practice in the futures and commodities markets, and in equities it is pursued under securities anti-manipulation rules. For an intraday trader, the operational issue is different from the legal issue. The trader needs a repeatable method to mark displayed size that does not convert into prints.

Level 2 alone is insufficient. Time and sales alone is insufficient. The filter requires both feeds, combined into five inputs: quoted size at each price level, executed size when that price is touched, cancellation behavior, order flow imbalance, and event timestamps at millisecond resolution.

The mechanics of spoofing in the displayed book

Level 2 typically exposes 5–10 price levels with aggregate size by venue or price. It shows intent to trade only in a limited sense. A displayed bid is executable until it is canceled, routed around, refreshed, hidden behind queue priority, or bypassed by faster participants. The distinction matters.

A spoofing pattern normally has five observable components:

1. Displayed size appears away from the last traded price.

The size is large relative to nearby levels and relative to recent depth. It may appear two to five ticks away in a liquid stock, or one to two ticks away in a thin book.

2. The size influences visible imbalance.

Bid-side or ask-side depth becomes skewed. Market participants reading raw book pressure may infer support or resistance.

3. Price approaches the displayed layer.

The layer becomes decision-relevant only when the market trades toward it. Size sitting 30 cents away in a stock with a 2-cent spread and low volatility has low immediate execution value.

4. The order cancels or relocates before meaningful execution.

This is the core mechanical signal. The displayed size does not fill in proportion to its advertised size.

5. The opposite side receives execution or price movement.

The spoofing hypothesis becomes stronger if the large layer helps generate trades in the other direction, then disappears.

A filter should not label the first component as spoofing. Large orders exist for valid reasons. Institutions display liquidity. Market makers rebalance. Hedging programs can create large visible layers. Iceberg orders may show small displayed size and large reserve size. The spoofing filter starts only when displayed size meets low fill conversion and abnormal cancellation behavior.

A large order is not the signal. A large order that refuses to become traded volume is the signal.

The minimum data fields are:

Timestamp at millisecond resolution or better.
Best bid and offer.
Depth by price level.
Displayed size by level.
Executed trade size.
Trade price.
Trade aggressor side, if available.
Add, cancel, and modify events, if the feed provides them.
Venue identifier, if available.
Spread and volatility state.

Retail platforms often aggregate depth and compress updates. That reduces signal quality. The same visual Level 2 ladder can hide cancel-replace cycles. A platform that samples the book every 100–250 ms will miss a large share of flickering quotes. Spoofing occurs in the millisecond environment. The filter must be designed around the feed resolution, not the screen display.

Size-to-fill ratio: the operating metric

The size-to-fill ratio compares displayed liquidity at a level with actual executed liquidity when that level becomes tradable. It is also related to order-to-trade ratio logic: how much quoted size appears versus how much volume prints.

A basic calculation:

Size-to-fill ratio = displayed size at tested price / executed size at tested price during the test window

If 25,000 shares are shown at $48.20 and only 1,250 shares execute at $48.20 before the visible size cancels, the ratio is 20:1. The higher the ratio, the lower the fill conversion.

This metric is not universal. Thresholds vary by stock, spread, volatility, venue mix, and exchange surveillance logic. Institutional proprietary thresholds are not public. A retail trader should not import a static number across symbols. A 15:1 ratio in a mega-cap with thick depth may mean less than a 6:1 ratio in a thin small-cap with a two-cent spread.

A usable implementation should define the test window first. Without a window, the ratio becomes arbitrary.

Parameter	Tight scalping configuration	Slower intraday configuration
Book depth used	Top 3–5 levels	Top 5–10 levels
Timestamp tolerance	1–50 ms	50–500 ms
Price test definition	Trade prints at the exact level	Trade prints at level or within one tick
Fill window	Until cancel or 250 ms after first touch	Until cancel or 1–3 seconds after first touch
Minimum displayed size	Relative to median depth	Relative to median depth and ADV context
Valid signal	High ratio plus cancel behavior	High ratio plus repeat pattern

The ratio should be normalized against local depth. Raw size is weak. Relative size is stronger.

Example variables:

D_level: displayed size at the price level before first touch.
F_level: executed volume at that price during the test window.
C_level: canceled or removed volume at that price.
M_depth: median displayed depth at comparable levels over the last N seconds.
R_stf: size-to-fill ratio.
R_rel: displayed size divided by median local depth.

Operational formula:

R_stf = D_level / max(F_level, minimum lot floor)

R_rel = D_level / M_depth

The lot floor prevents division by zero. If displayed size is 18,000 shares and fills are zero, the ratio is undefined in pure math and maximal in practice. A system should tag it as no-fill cancellation, not as an infinite numeric value that breaks downstream logic.
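The two formulas and the no-fill tag can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the `LOT_FLOOR` value of 100 shares, the `LevelRatio` container, and the median depths used in the examples are assumptions chosen for readability.

```python
from dataclasses import dataclass

LOT_FLOOR = 100  # assumption: one round lot; calibrate per symbol


@dataclass
class LevelRatio:
    r_stf: float   # displayed size / max(fills, lot floor)
    r_rel: float   # displayed size / median local depth
    no_fill: bool  # True when nothing printed at the level


def level_ratio(d_level: int, f_level: int, m_depth: float,
                lot_floor: int = LOT_FLOOR) -> LevelRatio:
    """Compute R_stf and R_rel for one tested price level."""
    if m_depth <= 0:
        raise ValueError("m_depth must be positive")
    r_stf = d_level / max(f_level, lot_floor)
    r_rel = d_level / m_depth
    # Zero fills are flagged explicitly rather than encoded as infinity.
    return LevelRatio(r_stf=r_stf, r_rel=r_rel, no_fill=(f_level == 0))


# 25,000 shown, 1,250 filled (the $48.20 example); median depth is hypothetical.
example = level_ratio(d_level=25_000, f_level=1_250, m_depth=5_000)
# 18,000 shown, zero fills; the floor keeps the ratio finite.
no_fill = level_ratio(d_level=18_000, f_level=0, m_depth=6_000)
```

Keeping `no_fill` as a separate flag lets downstream logic treat zero-fill cancellations as their own category while the floored ratio stays usable for sorting and percentile statistics.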

A simple filter can run as follows:

1. Build a rolling depth baseline.

Compute median displayed size per level over 30–120 seconds. Use median, not average. Averages are distorted by the same large layers the filter is trying to detect.

2. Flag abnormal displayed size.

Mark levels where displayed size exceeds a selected multiple of local median depth. The multiple must be calibrated by symbol.

3. Wait for a price test.

Ignore untouched layers. A layer must be approached or hit by trades. Otherwise the ratio has no fill denominator.

4. Measure executed volume at that price.

Use time and sales. If aggressor side is available, separate bid-hit volume from offer-lift volume.

5. Measure cancellation before fill.

If the displayed layer disappears before execution proportional to its size, raise the spoofing score.

6. Recheck after relocation.

If the same side reappears one or two ticks away after cancellation, raise the score again. If the order rests and fills, reduce the score.
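Steps 1 and 2 above can be sketched as a rolling median baseline. The class name `DepthBaseline`, the 60-second default window, and the sample sizes are illustrative assumptions; the multiple passed to `is_abnormal` still has to be calibrated by symbol.

```python
from collections import deque
from statistics import median


class DepthBaseline:
    """Rolling median of displayed size for one side/level bucket."""

    def __init__(self, window_s: float = 60.0):
        self.window_s = window_s
        self._samples = deque()  # (timestamp_s, displayed_size)

    def update(self, ts: float, size: int) -> None:
        self._samples.append((ts, size))
        # Drop samples that fell out of the rolling window.
        while self._samples and ts - self._samples[0][0] > self.window_s:
            self._samples.popleft()

    def median_depth(self) -> float:
        if not self._samples:
            return 0.0
        return median(size for _, size in self._samples)

    def is_abnormal(self, size: int, multiple: float) -> bool:
        m = self.median_depth()
        return m > 0 and size >= multiple * m


baseline = DepthBaseline(window_s=60.0)
for ts, size in enumerate([1_000, 1_200, 900, 1_100, 40_000]):
    baseline.update(float(ts), size)
```

With these samples the median stays at 1,100 shares even though one 40,000-share layer is in the window, whereas the mean would be pulled to 8,840. That is the reason the text prefers the median.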

This is not a trade-entry model. It is a liquidity-quality filter. The output should be a score or binary exclusion flag. It should not generate long or short signals by itself.

Order flow imbalance and the false-depth problem

Order flow imbalance measures the difference between demand and supply pressure at the quote and near-quote levels. A basic version compares changes in bid size and ask size, plus executed volume. More advanced versions track queue additions, cancellations, and marketable orders.

The spoofing problem is that false displayed size can distort imbalance metrics. A naïve imbalance model sees 80,000 shares bid and 18,000 shares offered. It labels the book bid-heavy. If the 80,000-share bid cancels before execution, the model has processed fake support as valid pressure.

A Level 2 spoofing filter should therefore segment imbalance into four components:

Component	Input	Interpretation
Displayed imbalance	Resting bid and ask depth	Vulnerable to spoofing and quote stuffing
Executed imbalance	Trades at bid versus trades at offer	Harder to fake because it requires fills
Cancellation imbalance	Removed bid size versus removed ask size	Useful for detecting false pressure
Persistence imbalance	Depth that remains through price tests	Stronger than raw displayed depth

The useful metric is not “large bid equals support.” The useful metric is “large bid persists and fills when hit.” That converts displayed interest into execution evidence.
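One way to keep the components separate is to compute each as its own signed ratio. The sketch below assumes a normalized form, (a - b) / (a + b), which is a common convention rather than something the feed defines; the dictionary keys and the traded volumes are hypothetical. The displayed sizes reuse the 80,000 versus 18,000 example.

```python
def signed_imbalance(a: float, b: float) -> float:
    """(a - b) / (a + b), in [-1, 1]; 0.0 when both sides are empty."""
    total = a + b
    return (a - b) / total if total > 0 else 0.0


def imbalance_components(book: dict) -> dict:
    """Return displayed, executed, cancellation, and persistence imbalance."""
    return {
        "displayed": signed_imbalance(book["bid_depth"], book["ask_depth"]),
        # Trades at the offer are buyer-initiated, so they count as demand.
        "executed": signed_imbalance(book["traded_at_offer"], book["traded_at_bid"]),
        "cancellation": signed_imbalance(book["canceled_bid"], book["canceled_ask"]),
        "persistence": signed_imbalance(book["persisted_bid"], book["persisted_ask"]),
    }


snapshot = {
    "bid_depth": 80_000, "ask_depth": 18_000,        # from the example above
    "traded_at_offer": 1_000, "traded_at_bid": 3_000,  # hypothetical prints
    "canceled_bid": 80_000, "canceled_ask": 0,        # the bid vanished
    "persisted_bid": 0, "persisted_ask": 18_000,      # only the offer survived
}
components = imbalance_components(snapshot)
```

Here the displayed imbalance reads strongly bid-heavy while the executed and persistence components point the other way. That disagreement is the condition the filter is meant to surface.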

A practical scoring layer:

Displayed skew: bid depth or ask depth is materially larger than recent median.
Approach condition: last price moves toward the skewed side.
Low conversion: executed size is small relative to displayed size.
Fast removal: cancellation occurs before or during first meaningful touch.
Relocation: same-side large size appears again at a new level.
Opposite-side trade response: prints occur in the direction that the fake pressure would encourage.

Each item adds information. None is determinative alone.

Flickering quotes create additional noise. High-frequency systems often place and cancel orders in milliseconds to test market reaction, manage adverse selection, or maintain queue position. Some flickering is legal market making. Some may be part of deceptive behavior. The filter cannot infer intent. It can only measure behavior.

Therefore the correct label inside a trading system is not “spoofing confirmed.” The correct labels are operational:

low_fill_displayed_liquidity
high_cancel_near_touch
unstable_depth_layer
suspect_depth_pressure
exclude_from_support_resistance

These labels avoid legal certainty. They also reduce model contamination. A trader does not need to prove intent to avoid using the level as support.

Distinguishing spoof layers from institutional icebergs

The most common implementation error is treating every large visible layer as spoofing. That creates false positives. It also removes useful liquidity zones from the trader’s map.

Icebergs and spoof layers differ in execution behavior. An iceberg hides reserve size and refreshes after partial fills. A spoof layer displays size and cancels before meaningful fills. The difference is visible in the relationship between prints and displayed depth.

Behavior	Likely iceberg or genuine liquidity	Likely spoof-style layer
Price trades into level	Prints occur repeatedly	Few or no prints occur
Displayed size after fill	Refreshes or remains	Cancels or relocates
Size-to-fill ratio	Lower or stable	High and persistent
Queue behavior	Absorbs marketable flow	Avoids interaction with flow
Effect on price	Slows movement through level	Creates brief hesitation, then vanishes
Repeat pattern	Same price absorbs volume	New large layer appears away from price

Absorption is the key distinction. If a bid shows 10,000 shares, trades 25,000 shares at that price, and remains visible in smaller clips, the original visible size understated available liquidity. That is not a spoofing signal. It is hidden liquidity or replenishment.

If an offer shows 50,000 shares, trades 700 shares, cancels when the bid steps up, and reappears 3 cents higher, the fill conversion is weak. The ratio flags unstable displayed supply.

A filter should use persistence through stress. Stress means the level is tested by marketable flow. Resting size that has never been tested carries little information.
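The absorption test can be sketched as a comparison of executed volume against the size that was displayed before the level was hit. The function name and the 5% `low_conversion` cutoff are assumptions for illustration; the labels reuse the operational tags introduced earlier.

```python
def classify_tested_level(displayed: int, executed: int, still_displayed: bool,
                          low_conversion: float = 0.05) -> str:
    """Classify a tested level by fill conversion and persistence."""
    conversion = executed / displayed if displayed > 0 else 0.0
    if conversion >= 1.0 and still_displayed:
        # More traded than was ever shown: reserve size or replenishment.
        return "absorption"
    if conversion < low_conversion and not still_displayed:
        return "unstable_depth_layer"
    return "inconclusive"


iceberg_like = classify_tested_level(10_000, 25_000, still_displayed=True)
spoof_like = classify_tested_level(50_000, 700, still_displayed=False)
partial = classify_tested_level(40_000, 8_000, still_displayed=False)
```

The middle band is deliberately left as `inconclusive`. A 20% partial fill followed by a cancel needs the symbol's own distribution before it can be classified.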

Handling partial fills

Partial fills require classification. A large layer that executes 20% of displayed size before canceling is different from one that executes 0.5%. The correct threshold depends on symbol behavior. The filter should store distributions rather than assume a fixed rule.

Useful stored metrics:

Median size-to-fill ratio by symbol.
Median ratio by time of day.
Ratio during opening auction aftermath.
Ratio during lunch liquidity compression.
Ratio during closing imbalance period.
Ratio by spread regime.
Ratio by volatility regime.
Ratio by venue, if the feed exposes venues.

The same stock can show different normal ratios at 09:35, 12:10, and 15:50. The open has wide spreads and fast depth churn. Midday has thinner prints and lower displayed urgency. The close has imbalance-related routing and position adjustment.
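Storing ratios per symbol and session segment can be as simple as a keyed list. The bucket boundaries below (a 15-minute open, 11:30 to 14:00 midday, 15:30 to 16:00 close) and the `RatioStore` class are assumptions; the symbol and the ratio values are hypothetical.

```python
from collections import defaultdict
from datetime import time
from statistics import median


def session_bucket(t: time) -> str:
    """Map an exchange-local time to a coarse session segment."""
    if time(9, 30) <= t < time(9, 45):
        return "open"
    if time(11, 30) <= t < time(14, 0):
        return "midday"
    if time(15, 30) <= t <= time(16, 0):
        return "close"
    return "regular"


class RatioStore:
    """Keeps size-to-fill ratios separated by (symbol, session bucket)."""

    def __init__(self):
        self._ratios = defaultdict(list)

    def add(self, symbol: str, t: time, stf_ratio: float) -> None:
        self._ratios[(symbol, session_bucket(t))].append(stf_ratio)

    def median_ratio(self, symbol: str, bucket: str):
        values = self._ratios.get((symbol, bucket))
        return median(values) if values else None


store = RatioStore()
store.add("XYZ", time(9, 35), 12.0)
store.add("XYZ", time(9, 40), 8.0)
store.add("XYZ", time(12, 10), 4.0)
```

A new observation is then compared with the median for its own bucket, so a ratio that is ordinary at the open is not judged against midday norms.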

This matters for day traders filtering Level 2 spoofing with size-to-fill ratios: one static threshold cannot hold across all sessions. The time of day changes the denominator. Intraday liquidity is not stationary.

Platform and feed constraints

The filter quality is bounded by data quality. The screen is not the data. A Level 2 montage that updates slower than the underlying feed will make spoofing detection appear cleaner than it is. The missing events are the problem.

Core constraints:

1. Latency.

Spoofing-like behavior can occur inside millisecond windows. If a platform timestamps only at coarse intervals, the sequence of add, trade, and cancel may be misordered.

2. Aggregation.

Some platforms aggregate by price level. That hides individual order IDs and queue changes. The user sees 30,000 shares at a level but not the composition of the interest.

3. Venue fragmentation.

U.S. equities trade across multiple venues and off-exchange mechanisms. Displayed depth is not the full liquidity map. Dark pool prints and midpoint executions can alter price without appearing as visible resting depth.

4. API limits.

Retail APIs may throttle depth updates, limit historical order book storage, or provide snapshots instead of full depth event streams.

5. Clock synchronization.

Trade and quote data must use aligned timestamps. A 100 ms mismatch can reverse the apparent order of cancellation and execution.

6. Odd lots.

Odd-lot liquidity can affect price formation but may not display the same way across feeds and interfaces. Filters that ignore it may misread near-quote behavior.

A platform suitable for this filter should support exportable quote and trade data. If it only provides a visual ladder, the user can still make discretionary observations, but statistical validation will be weak.

The benchmark is simple: the system must preserve event sequence. If it cannot show whether the order canceled before or after the trade, it cannot support a reliable size-to-fill model.
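A sequencing check can make that benchmark explicit by refusing to order two events whose timestamps are closer together than the known clock skew. This is a sketch; the function name and the `max_skew_ms` parameter are assumptions, and the skew itself has to be measured for the specific feed.

```python
def cancel_vs_trade(cancel_ts_ms: int, trade_ts_ms: int, max_skew_ms: int) -> str:
    """Order a cancel and a trade, or report that the order is unknowable."""
    gap = trade_ts_ms - cancel_ts_ms
    if abs(gap) <= max_skew_ms:
        # The clocks cannot resolve which event came first.
        return "ambiguous"
    return "cancel_first" if gap > 0 else "trade_first"


# Cancel stamped 40 ms before the trade.
coarse_feed = cancel_vs_trade(1_000, 1_040, max_skew_ms=100)
tight_feed = cancel_vs_trade(1_000, 1_040, max_skew_ms=5)
```

With a possible 100 ms mismatch the same pair of events is unclassifiable, while a feed synchronized to within 5 ms can say the cancel came first. Ambiguous pairs should be excluded from the ratio statistics rather than guessed.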

The book snapshot is a frame. Spoofing detection needs the tape between frames.

This has a parallel outside market data: release calendars and headline feeds in other sectors are useful only when timestamped and sequenced correctly. Market data has a lower tolerance. Milliseconds change classification.

Implementation model for a size-to-fill filter

A production-grade version does not need complex machine learning. It needs clean event handling and strict definitions.

Step 1: Define the tested level

A tested level is a displayed price level that price reaches or nearly reaches within a specified tick distance. For liquid equities with one-cent spreads, exact touch is preferable. For wider-spread stocks, a one-tick tolerance may be required.

Inputs:

Symbol.
Price level.
Side: bid or ask.
Initial displayed size.
Distance from best bid or offer.
Time first observed.
Time first tested.

Untested levels should expire. Otherwise the system accumulates false evidence from depth that never mattered.

Step 2: Define the fill window

The fill window begins when the price level is first tested. It ends at the earliest of:

Full disappearance of the displayed size.
Price moves away beyond the defined tick threshold.
A maximum time limit.
A material refresh event that changes the displayed-size baseline.

For scalping, the window may be sub-second. For slower intraday work, one to three seconds may be acceptable. The longer the window, the more noise from unrelated flow.
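The four end conditions can be sketched as a single pass over the events after first touch. The event encoding here is hypothetical: each event is a `(ts_ms, kind, value)` tuple where `kind` is `"trade"` (shares printed at the tested price), `"size"` (current displayed size at the level), or `"ticks_away"` (distance of price from the level). The default limits are placeholders.

```python
from typing import List, Tuple

Event = Tuple[int, str, int]  # (ts_ms, kind, value); kinds are hypothetical


def run_fill_window(events: List[Event], first_touch_ms: int, start_size: int,
                    max_ms: int = 250, max_ticks_away: int = 2,
                    refresh_multiple: float = 1.5) -> Tuple[int, str]:
    """Return (executed shares, reason the window ended)."""
    executed = 0
    for ts_ms, kind, value in sorted(events):
        if ts_ms < first_touch_ms:
            continue  # before the level was tested
        if ts_ms - first_touch_ms > max_ms:
            return executed, "time_limit"
        if kind == "trade":
            executed += value
        elif kind == "size" and value == 0:
            return executed, "disappeared"
        elif kind == "size" and value >= refresh_multiple * start_size:
            return executed, "refresh"  # baseline changed materially
        elif kind == "ticks_away" and value > max_ticks_away:
            return executed, "moved_away"
    return executed, "still_open"


events = [(5, "trade", 300), (20, "trade", 400), (60, "size", 0), (70, "trade", 500)]
result = run_fill_window(events, first_touch_ms=0, start_size=50_000)
```

The trade at 70 ms is not counted because the displayed size had already disappeared at 60 ms. Returning the end reason alongside the volume keeps the later scoring step from conflating a timeout with a cancel.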

Step 3: Compute conversion

Compute:

Displayed size at start.
Executed size at tested price.
Executed size through the level.
Canceled size.
Remaining size.
Refresh count.
Time-to-cancel.
Time-to-first-fill.
Size-to-fill ratio.

The ratio should be stored with context. A raw number without context decays in value.

Recommended record structure:

Field	Purpose
`symbol`	Avoids cross-symbol threshold contamination
`session_time`	Captures open, midday, close effects
`side`	Bid and ask behavior may differ
`spread_bps`	Normalizes by trading condition
`volatility_bucket`	Separates stable and fast regimes
`displayed_size`	Numerator input
`executed_size`	Denominator input
`canceled_size`	Confirms removal
`time_to_cancel_ms`	Measures flicker risk
`rel_depth_multiple`	Compares size with local baseline
`stf_ratio`	Main filter metric
`classification`	Stable, unstable, suspect, excluded
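The record maps directly onto a small immutable data class. The field names follow the table; the Python types, the `LevelTestRecord` name, and every value in the sample record are assumptions for illustration.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class LevelTestRecord:
    symbol: str
    session_time: str          # e.g. "open", "midday", "close"
    side: str                  # "bid" or "ask"
    spread_bps: float
    volatility_bucket: str
    displayed_size: int        # numerator input
    executed_size: int         # denominator input
    canceled_size: int
    time_to_cancel_ms: int
    rel_depth_multiple: float
    stf_ratio: float
    classification: str        # stable, unstable, suspect, excluded


record = LevelTestRecord(
    symbol="XYZ", session_time="midday", side="ask", spread_bps=2.1,
    volatility_bucket="stable", displayed_size=50_000, executed_size=700,
    canceled_size=49_300, time_to_cancel_ms=180, rel_depth_multiple=9.0,
    stf_ratio=50_000 / 700, classification="suspect",
)
row = asdict(record)  # plain dict, ready for CSV or database export
```

Freezing the record prevents later pipeline stages from editing an observation in place, which keeps the stored distributions auditable.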

Step 4: Score behavior, not intent

Intent is not observable from the trading screen. Behavior is observable. The score should reflect behavior:

0: normal displayed liquidity.
1: above-normal size, not tested.
2: tested with moderate fill conversion.
3: tested with low fill conversion and fast cancellation.
4: repeated low-fill cancellation at nearby levels.
5: repeated low-fill cancellation plus opposite-side execution response.

A score of 4 or 5 should not mean “confirmed illegal spoofing.” It should mean “do not treat this displayed size as reliable liquidity.”
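The 0 to 5 scale can be sketched as a cascade of behavioral checks. The inputs are assumed to come from earlier steps, and the `conversion` categories (`"high"`, `"moderate"`, `"low"`) are hypothetical labels. Two choices below are assumptions that the scale itself does not specify: a large level that fills well is scored 0, and low conversion without a fast cancel is scored 2.

```python
def behavior_score(above_normal: bool, tested: bool, conversion: str,
                   fast_cancel: bool, repeated_nearby: bool,
                   opposite_response: bool) -> int:
    """Map observed behavior to the 0-5 liquidity-reliability score."""
    if not above_normal:
        return 0
    if not tested:
        return 1
    if conversion == "high":
        return 0  # assumption: size that fills is treated as normal liquidity
    if conversion == "low" and fast_cancel:
        if repeated_nearby and opposite_response:
            return 5
        if repeated_nearby:
            return 4
        return 3
    return 2  # moderate conversion, or low conversion without a fast cancel


untested_wall = behavior_score(True, False, "low", False, False, False)
single_flicker = behavior_score(True, True, "low", True, False, False)
repeat_with_response = behavior_score(True, True, "low", True, True, True)
```

The function takes no legal inputs and returns no legal conclusion. It only ranks how far the displayed size can be trusted.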

Step 5: Feed the output into execution rules

The output can affect execution in several ways:

Reduce confidence in visible support or resistance.
Avoid passive order placement behind suspect layers.
Require tighter stop logic if the setup depends on displayed depth.
Increase slippage assumptions.
Exclude the level from liquidity-pool mapping.
Delay entry until executed volume confirms the level.

This is where commercial trading platforms differ. Some allow custom indicators on depth. Some allow only chart-based scripting. Some expose depth through API but not in the visual platform. The buyer’s due diligence is concrete: can the platform export synchronized quote and trade events with millisecond timestamps and enough depth levels to compute the ratio?

If not, the platform can still show the ladder. It cannot validate the ladder.

Regulatory context and HFT behavior

Spoofing is explicitly prohibited in U.S. markets under the post-2010 regulatory framework. The 2015 Navinder Singh Sarao case became a reference point for large-scale spoofing enforcement and public attention. The enforcement standard concerns intent and disruptive practice. Trading-system filters do not need to make that legal conclusion.

High-frequency participants create another layer of complexity. Many strategies modify quotes rapidly for non-deceptive reasons:

Inventory control.
Queue position management.
Adverse selection reduction.
Latency arbitrage protection.
Spread capture under changing volatility.
Market-making obligation management.

These behaviors can look similar to spoofing at screen speed. That is why the size-to-fill ratio must be combined with repeated cancellation near price tests. A fast cancel far from price may be normal quote maintenance. A fast cancel at the moment of interaction carries different information.

Regulators and exchanges monitor order-to-trade ratios for extreme deviations. Exact thresholds vary and are not public in a way that traders can directly replicate. The trader’s filter should therefore be adaptive, not absolute.

A strict implementation avoids three invalid assumptions:

1. All large displayed orders are spoofing.

False. Large legitimate orders exist. Icebergs and real absorption are common.

2. A high size-to-fill ratio predicts profit.

False. It identifies low-quality displayed liquidity. Directional edge requires additional evidence.

3. Retail data can match institutional surveillance.

Usually false. Retail feeds can be delayed, aggregated, sampled, or incomplete.

The filter still has value. It removes contaminated depth from intraday decisions. That can reduce avoidable slippage and false support/resistance reads.

Execution use cases

The size-to-fill filter is most useful in setups where the trader’s decision depends on visible liquidity.

Breakout through an offer wall

A stock trades at $32.48 bid, $32.49 offer. The book shows 65,000 shares offered at $32.50. Price lifts into $32.50. Only 2,400 shares print. The offer disappears and reappears at $32.53.

The size-to-fill ratio is high. The displayed offer did not absorb. The breakout trader should not classify $32.50 as a proven supply level. It was visible friction, not confirmed liquidity.

Pullback into a bid layer

A stock trends above VWAP. A 40,000-share bid appears two cents below the market. Price pulls back. The bid trades 38,000 shares, refreshes, and price stabilizes.

The ratio is low. Fill conversion is high. That is not a spoofing-style signal. It is closer to absorption. The level has execution evidence.
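The arithmetic behind the breakout and pullback cases is short enough to check directly. The helper names are illustrative; the share counts are the ones quoted in the two examples.

```python
def stf_ratio(displayed: int, executed: int) -> float:
    """Displayed size divided by executed size at the tested level."""
    return displayed / executed


def fill_conversion(displayed: int, executed: int) -> float:
    """Share of the displayed size that actually traded."""
    return executed / displayed


# Offer wall at $32.50: 65,000 shown, 2,400 printed.
breakout_ratio = stf_ratio(65_000, 2_400)        # about 27:1
breakout_conv = fill_conversion(65_000, 2_400)   # about 3.7%

# Bid layer on the pullback: 40,000 shown, 38,000 printed.
pullback_ratio = stf_ratio(40_000, 38_000)       # about 1.05:1
pullback_conv = fill_conversion(40_000, 38_000)  # 95%
```

Roughly 27:1 against roughly 1:1 is the gap between a wall that never traded and a bid that absorbed nearly everything it advertised.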

Failed support from false depth

A large bid appears at $19.80. The book becomes bid-heavy. Price drops toward the level. The bid cancels after 600 shares trade. Price prints $19.79 immediately after.

The filter marks low-fill cancellation. A trader using $19.80 as support without conversion data would overestimate the level. The correct model excludes the canceled bid from support calculations.

Scalping around flickering quotes

In very liquid names, top-of-book depth may change dozens of times per second. The filter should not react to every add-cancel event. It should require:

Minimum relative size.
Price approach.
Low fill conversion.
Fast cancellation.
Repeat behavior.

Without these gates, the system becomes a quote-noise detector.

Calibration by stock and session

Calibration is not optional. A filter for high-volume mega-cap stocks will not transfer cleanly to small-cap momentum names. Depth, spread, and venue mix differ.

Practical calibration buckets:

Bucket	Typical issue	Filter adjustment
Mega-cap, tight spread	High quote churn	Require repeat pattern and low fill conversion
Mid-cap, moderate spread	Mixed displayed and hidden liquidity	Use wider test windows
Small-cap momentum	Thin book and large relative orders	Normalize by median depth, not raw size
Opening 15 minutes	Fast depth reset	Raise minimum sample size
Midday	Lower prints	Avoid over-penalizing low execution volume
Closing period	Imbalance and routing effects	Separate closing liquidity from regular flow

A good baseline uses recent data from the same symbol and same session segment. A weak baseline uses arbitrary universal thresholds.

The minimum viable calibration set:

At least several sessions per symbol.
Separate open, midday, and close.
Store median and percentile ratios.
Exclude halts and news shock periods from baseline unless the strategy trades those regimes.
Recompute periodically as liquidity conditions change.
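A percentile threshold per symbol and session segment can be sketched with a nearest-rank percentile. The 90th percentile and the 20-observation minimum are placeholder assumptions, and the ratio history below is synthetic.

```python
import math
from typing import List, Optional


def nearest_rank_percentile(values: List[float], pct: float) -> float:
    """Nearest-rank percentile: smallest value covering pct% of samples."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def calibrated_threshold(ratios: List[float], pct: float = 90.0,
                         min_samples: int = 20) -> Optional[float]:
    """Return the percentile ratio, or None when the baseline is too thin."""
    if len(ratios) < min_samples:
        return None  # not enough history for this symbol/session bucket
    return nearest_rank_percentile(ratios, pct)


history = [float(r) for r in range(1, 21)]  # synthetic ratios 1.0 .. 20.0
threshold = calibrated_threshold(history)
too_thin = calibrated_threshold(history[:5])
```

Returning `None` for a thin baseline forces the caller to decide what to do without a threshold instead of silently falling back to a universal number.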

For commercial platform selection, this becomes a feature requirement. The platform must allow storage, export, and backtesting of depth-derived metrics. Chart-only backtesting will not capture spoofing filters because historical candles do not contain quote cancellation data.

Final parameter set

The usable answer to how to filter Level 2 order book spoofing with size-to-fill ratios is binary at the execution layer.

Use the displayed level if it persists, trades, refreshes, and shows normal conversion against its symbol baseline.

Exclude the displayed level if it is large relative to local depth, is tested by price, converts poorly into prints, cancels quickly, and repeats nearby.
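The exclusion rule can be sketched as five gates that must all pass. The `LevelObservation` fields and the default thresholds (3x relative depth, 5% conversion, 500 ms cancel) are hypothetical placeholders; real values come from the symbol and session baseline.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LevelObservation:
    rel_depth_multiple: float          # displayed size / median local depth
    tested: bool                       # price reached the level
    fill_conversion: float             # executed / displayed
    time_to_cancel_ms: Optional[int]   # None if the size never canceled
    repeats_nearby: bool               # same-side size reappeared close by


def exclude_level(obs: LevelObservation, min_rel: float = 3.0,
                  max_conversion: float = 0.05, max_cancel_ms: int = 500) -> bool:
    """True when the level should be dropped from support/resistance maps."""
    gates = (
        obs.rel_depth_multiple >= min_rel,
        obs.tested,
        obs.fill_conversion <= max_conversion,
        obs.time_to_cancel_ms is not None and obs.time_to_cancel_ms <= max_cancel_ms,
        obs.repeats_nearby,
    )
    return all(gates)


vanishing_wall = LevelObservation(8.0, True, 0.014, 120, True)
absorbing_bid = LevelObservation(6.0, True, 0.95, None, False)
```

Requiring every gate keeps the filter conservative. A level that fails only one condition remains usable, which limits false positives on legitimate large orders.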

Minimum parameters:

Level 2 depth with at least top 5 levels.
Synchronized time and sales.
Millisecond-grade timestamps where available.
Rolling median depth baseline.
Defined price-test rule.
Defined fill window.
Size-to-fill ratio.
Cancellation ratio.
Repeat-location detection.
Session-specific calibration.

The filter does not prove legal spoofing. It classifies liquidity reliability. That is sufficient for intraday execution. A level that does not fill is not support. A wall that cancels before contact is not resistance. The order book is useful only after displayed size is reconciled against actual trades.

Key takeaways

A size-to-fill ratio identifies unreliable displayed liquidity by comparing visible order size against actual executed volume when a price level is tested.
Spoofing is characterized by large orders that create false depth and cancel before meaningful execution, rather than by the size of the order itself.
Effective filters must be calibrated to specific symbols and session times, as liquidity patterns vary significantly between market open, midday, and close.
A reliable spoofing filter requires high-resolution data, including millisecond timestamps and synchronized quote and trade events, to distinguish between deceptive behavior and legitimate market-making.
The goal of a size-to-fill filter is to classify liquidity reliability for execution decisions rather than to prove legal intent or confirm illegal spoofing.

FAQ

What is the size-to-fill ratio?

It is a metric calculated by dividing the displayed size at a specific price level by the actual executed volume at that level during a defined test window.

How can I distinguish between a spoofing layer and an institutional iceberg order?

Iceberg orders typically refresh or remain visible after partial fills and show consistent absorption, whereas spoofing layers show high size-to-fill ratios, few prints, and tend to cancel or relocate when price approaches.

Why is raw Level 2 data insufficient for detecting spoofing?

Level 2 data alone does not show whether displayed size actually converts into trades, making it impossible to distinguish between genuine support and false, non-executable depth.

What data resolution is required for an effective spoofing filter?

The filter requires millisecond-resolution timestamps and synchronized event data to accurately sequence add, trade, and cancel events, as spoofing often occurs in sub-second timeframes.

Should I use a static threshold for the size-to-fill ratio across all stocks?

No, thresholds must be calibrated by symbol, volatility, and session time, as a 15:1 ratio in a mega-cap stock may have a different meaning than the same ratio in a thin small-cap stock.