Quant Strategies and Systematic Trading

This document synthesizes content from institutional sources, conversations with former Two Sigma and Citadel quants, and academic literature. All product references have been removed. Where simplifications exist, these are marked.


1. CTAs and Systematic Funds

1.1 What Are CTAs?

Commodity Trading Advisors (CTAs) are professional managers that trade futures contracts, options on futures, and currency forwards across asset classes on behalf of clients. The term "Commodity" is historically misleading: modern CTAs trade equities, currencies, rates, and commodities. In the United States, they are generally required to register with the CFTC (Commodity Futures Trading Commission) and to be members of the NFA (National Futures Association).

The assets under management (AUM) of the CTA industry most recently exceeded the 400 billion USD mark for pure CTA programs; when all systematic macro funds and volatility strategies are included, the total market significantly exceeds 1 trillion USD. Given this scale, CTA flows measurably influence price discovery in liquid futures markets.

⚠️ Simplification: The figure of 400 billion USD refers to a specific survey point in time (Barclay Hedge). The total volume of systematic funds including risk parity and volatility control is considerably larger and varies depending on the definition.

Regulatorily, CTAs have their origins in the U.S. Commodity Exchange Act. Global market access is achieved through managed futures programs designed for all holding duration categories:

Category | Typical Holding Duration
Short-term | Seconds to 3 months
Medium-term | 3 months to 1 year
Long-term | Over 1 year

1.2 Trend-Following as Core Strategy

The vast majority of AUM in CTA programs flows into trend-following strategies. The basic principle: buy what has risen; sell what has fallen. This intuitive idea has a strong empirical basis.

Time-Series Momentum (TSMOM) is the most important variant: an asset receives a long signal when its own past return (typically over 12 months) is positive, and a short signal when that return is negative. Some implementations skip the most recent month to limit short-term reversal effects, a convention borrowed from cross-sectional equity momentum.

📚 Source: Moskowitz, Ooi & Pedersen (2012), "Time Series Momentum", Journal of Financial Economics 104(2). The authors document statistically significant momentum premiums across 58 futures markets and 4 decades.

Practical implementation is often done through moving average models. A simple example: long signal when the price is above its 20-period SMA; short signal when it is below. More complex variants use crossovers of two averages (e.g., 50/200) or exponential moving averages. Position size is scaled dynamically by realized volatility so that each position contributes roughly the same risk to the portfolio.
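The price-versus-SMA rule and the volatility scaling can be sketched in a few lines. This is a minimal illustration, not a production model: the function names, the 10% volatility target, the leverage cap of 2, and the synthetic price path are all assumptions made for the example.

```python
import math

def sma(prices, window):
    """Simple moving average of the last `window` prices."""
    return sum(prices[-window:]) / window

def realized_vol(prices, periods_per_year=252):
    """Annualized standard deviation of log returns over the supplied prices."""
    rets = [math.log(b / a) for a, b in zip(prices, prices[1:])]
    mean = sum(rets) / len(rets)
    var = sum((r - mean) ** 2 for r in rets) / (len(rets) - 1)
    return math.sqrt(var * periods_per_year)

def sma_position(prices, window=20, target_vol=0.10, max_leverage=2.0):
    """+1/-1 signal from price vs. SMA, scaled toward a volatility target (assumed 10%)."""
    signal = 1.0 if prices[-1] > sma(prices, window) else -1.0
    vol = realized_vol(prices[-(window + 1):])  # window + 1 prices give `window` returns
    leverage = min(target_vol / vol, max_leverage) if vol > 0 else 0.0
    return signal * leverage

# Hypothetical price path: steady uptrend with small alternating noise.
prices = [100 * 1.002 ** i * (1 + 0.003 * (-1) ** i) for i in range(60)]
position = sma_position(prices)            # positive: price is above its SMA
position_down = sma_position(prices[::-1])  # same path reversed: negative
```

Because the position is divided by realized volatility, a calm market receives a larger notional position than a turbulent one for the same signal.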

Cross-sectional momentum is a complementary approach: instead of evaluating each asset on its own, relative strength within a universe is assessed. Assets with the strongest returns of the prior period are bought; the weakest are sold short.

📚 Source: Jegadeesh & Titman (1993), "Returns to Buying Winners and Selling Losers", Journal of Finance 48(1). This paper is regarded as the foundation of the cross-sectional momentum literature for equities; later work by AQR (Asness, Moskowitz, Pedersen, 2013) extends it to global asset classes.

1.3 Why CTAs Influence Markets: Feedback Loops and Crowded Trades

Systematic funds react to the same price signals. When a moving average level is triggered, many CTAs act simultaneously. This creates feedback loops:

  1. Price crosses the 20-period SMA → CTAs generate long signal
  2. Mass buy orders amplify the upward move
  3. Stronger upward move reinforces the signal → further buying

The same dynamic acts as an amplifier in downtrends. CTAs behave in this context like market participants with negative gamma: they buy when the market rises and sell when it falls — analogous to a Market Maker adjusting their delta exposure in the direction of price movement.

Crowded-trade risk: when many CTAs hold identical positions (e.g., long USD/short yen in a carry trade), a regime shift can lead to abrupt, cascading unwind moves. These "crowded unwinds" can create massive price movements within a short period that cannot be justified fundamentally.

📚 Source: Pedersen, L.H. (2015), Efficiently Inefficient, Princeton University Press. Chapter on CTA crowding and liquidation risks.

1.4 Volatility-Control Funds

Volatility-control funds (VC funds) are a distinct category with approximately 350 billion USD AUM (Morningstar estimate), corresponding to roughly 2% of the mutual fund and ETF market. Their mechanism:

Objective: constant portfolio volatility over time. Typical targets are 10%, 12%, or 15% annualized volatility.

Mechanics: the allocation to risk assets (equities, high-yield bonds) is adjusted daily or weekly:

  • Realized volatility < target → leverage up with more risk assets
  • Realized volatility > target → reduce risk assets, build cash or bonds
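The two rules above reduce to one ratio: exposure = target volatility / realized volatility, usually with a cap on leverage. The sketch below assumes a 10% target and a maximum exposure of 150%; both values and the function name are illustrative.

```python
def vol_target_exposure(realized_vol, target_vol=0.10, max_exposure=1.5):
    """Risk-asset weight that would bring portfolio volatility to the target."""
    if realized_vol <= 0:
        raise ValueError("realized_vol must be positive")
    return min(target_vol / realized_vol, max_exposure)

calm = vol_target_exposure(0.08)      # realized vol below target -> exposure above 100%
stressed = vol_target_exposure(0.30)  # volatility spike -> exposure cut to about one third
```

Under these assumptions, a jump in realized volatility from 8% to 30% cuts the risk-asset weight from 125% to roughly 33%, which means selling close to three quarters of the position.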

Three main variants are distinguished in practice:

Type | Description
Volatility Target | Delivers stable volatility around a target
Volatility Cap | Volatility never exceeds the cap
Variable Volatility Cap (VVC) | Cap level depends on accrued losses

Why VC funds are forced to sell during volatility spikes: when the 1-month realized volatility exceeds the 3-month realized volatility, this signals an acceleration of the risk environment. The model triggers reduction programs. During a volatility shock (e.g., a VIX spike from 15 to 35), these funds can sell significant quantities of equities within hours, amplifying the market move. This explains why market downturns sometimes proceed faster and deeper than fundamentally justified.

1.5 Risk-Parity Funds

Originally developed by Ray Dalio and Bridgewater, Risk Parity is based on the concept of equal-weighted risk allocation: instead of dividing capital equally, positions are chosen so that all asset classes contribute equally to portfolio risk.

Since bonds have much lower volatility than equities, Risk Parity in practice means: bonds are held with leverage (e.g., bond futures), while equities are weighted relatively smaller. The economic rationale: equities and bonds diversify well against growth risks (during strong growth: equities up, bonds down; during recessions: the reverse). For inflation shocks, crude oil was traditionally used as a hedge.
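A minimal sketch of the weighting idea, using the simplest form of Risk Parity (inverse-volatility weights, which ignore correlations). The volatility figures are assumed for illustration and do not come from the source.

```python
def inverse_vol_weights(vols):
    """Naive risk parity: weights proportional to 1/volatility, summing to 1."""
    inv = {name: 1.0 / v for name, v in vols.items()}
    total = sum(inv.values())
    return {name: x / total for name, x in inv.items()}

# Illustrative annualized volatilities (assumed values).
vols = {"equities": 0.16, "bonds": 0.04}
weights = inverse_vol_weights(vols)

# Stand-alone risk contribution of each sleeve: weight times volatility.
risk = {name: weights[name] * vols[name] for name in vols}
```

With these inputs the capital split is 20% equities and 80% bonds, and each sleeve carries the same stand-alone risk. The resulting unlevered portfolio has low volatility, which is why the bond-heavy mix is typically held with leverage.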

The correlation breakdown 2020–2022: in classic Risk Parity logic, equities and bonds are negatively correlated. In 2022, interest rates rose sharply (bonds lost massively), while equities corrected due to recessionary expectations. Equities and bonds fell together — a so-called correlation breakdown. Risk Parity funds were forced to reduce both positions simultaneously, amplifying the pressure on both markets.

📚 Source: Asness, Frazzini & Pedersen (2012), "Leverage Aversion and Risk Parity", Financial Analysts Journal 68(1). Critical discussion of the mechanism and its limits.

⚠️ Simplification: The simplified statement "equities and bonds correlate negatively" does not hold universally; the correlation is time-varying and regime-dependent on inflation.


2. How Quant Funds Process Data

2.1 Alpha versus Beta: The Fundamental Distinction

In the language of the quant industry:

  • Beta: return achieved through systematic market exposure. Cheaply replicable via ETFs. No "skill" required.
  • Alpha: return above and beyond the beta explanation. Demonstrates genuine informational advantages, forecasting ability, or factor exposures beyond the market.

From a hedge fund perspective, the primary objective is: orthogonal Alpha — returns that do not correlate with the market. For institutional investors with large passive positions, a hedge fund with low beta and positive Alpha has more value than a fund that merely beats the market but is highly correlated.

⚠️ Simplification: "Alpha" in practice depends on the chosen factor model. What is Alpha versus CAPM can be Beta versus a five-factor model (Fama-French plus momentum and low volatility).

Mathematically: Alpha is the intercept of the regression line between portfolio returns and factor returns. Beta is the slope coefficient. This is not correlation in the strict sense, but a sensitivity measurement (linear regression coefficient).
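As an illustration, a single-factor version of this regression can be computed directly. The return series below are constructed so that the true values are known (alpha of 0.1% per period, beta of 0.5); the function name is hypothetical.

```python
def alpha_beta(portfolio, market):
    """Ordinary least squares of portfolio returns on market returns: (alpha, beta)."""
    n = len(market)
    mx = sum(market) / n
    my = sum(portfolio) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(market, portfolio))
    var = sum((x - mx) ** 2 for x in market)
    beta = cov / var           # slope: sensitivity to the market
    alpha = my - beta * mx     # intercept: return not explained by the market
    return alpha, beta

market = [0.01, -0.02, 0.015, 0.03, -0.01]
portfolio = [0.001 + 0.5 * r for r in market]  # constructed: alpha 0.1%, beta 0.5
alpha, beta = alpha_beta(portfolio, market)
```

With several factors the same idea applies, but the regression becomes multivariate and the intercept depends on which factors are included.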

2.2 Alpha Decay: Why Systematic Edges Disappear

Alpha decay describes the progressive erosion of a strategy's edge as it becomes known and widely deployed.

Mechanism:

  1. Researchers or traders discover a market inefficiency (e.g., an inefficiency in options data)
  2. Strategy is implemented and generates Alpha
  3. Other market participants observe or infer the strategy
  4. Capital flows into the strategy
  5. The inefficiency is arbitraged away: price differences disappear, Alpha falls to zero
  6. The strategy becomes factor Beta — replicable, but no longer an advantage

A cautionary observation from practice (former Two Sigma quant): the acceleration of Alpha decay is real. Inefficiencies that used to take years to be exhausted now disappear within months. Generative AI further accelerates this process by democratizing the finding and testing of strategies.

Countermeasures:

  • Alpha Factory: portfolio of various, uncorrelated Alpha sources. When individual strategies decay, others compensate. Diversification across Alpha types protects against the decay of individual signals.
  • Continuous research: permanently test new data sources and hypotheses.
  • Proprietary data: signals from sources not broadly accessible decay more slowly.
  • Signal combination: combining multiple weak, weakly correlated signals (each with a low but positive Information Coefficient, IC) creates a more robust overall signal.

2.3 Data-Mining Bias, Overfitting, and Walk-Forward Testing

This is the most serious practical problem in quantitative strategy development.

Overfitting occurs when a model learns the historical randomness ("noise") of the training data instead of robust statistical patterns. An overfitted model looks excellent on historical data and fails completely in live trading.

Data-mining bias: when thousands of parameter combinations are tested, some will look good by chance. Without correct statistical adjustment (e.g., via Bonferroni corrections or bootstrap simulations), the reported performance is overestimated.
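The effect is easy to reproduce. The sketch below simulates 500 "strategies" whose daily returns are pure noise with zero true edge, then looks at the best one. The strategy count, the 1% daily volatility, and the seed are arbitrary assumptions.

```python
import math
import random

def annualized_sharpe(returns, periods_per_year=252):
    """Annualized Sharpe ratio with a zero risk-free rate."""
    mean = sum(returns) / len(returns)
    var = sum((r - mean) ** 2 for r in returns) / (len(returns) - 1)
    return mean / math.sqrt(var) * math.sqrt(periods_per_year)

random.seed(0)
n_strategies = 500
sharpes = [
    annualized_sharpe([random.gauss(0.0, 0.01) for _ in range(252)])
    for _ in range(n_strategies)
]
best_sharpe = max(sharpes)                 # looks impressive, is pure luck
mean_sharpe = sum(sharpes) / n_strategies  # close to zero, the true edge

# Bonferroni: each individual test must clear alpha / N instead of alpha.
bonferroni_alpha = 0.05 / n_strategies
```

Reporting only the best of the 500 runs would suggest a strong strategy even though none exists. The Bonferroni threshold makes the price of 500 trials explicit: a single result must be significant at the 0.01% level rather than 5%.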

📚 Source: Bailey, Borwein, López de Prado & Zhu (2014), "The Probability of Backtest Overfitting", Journal of Computational Finance 20(4). Systematic analysis of the overfitting problem in financial backtesting research.

Look-ahead bias: a fundamental problem in data infrastructure. Point-in-time data means that at each time t in the backtest, only information that was actually known at time t is available. Later revisions of fundamental data (e.g., revised earnings, restatements) must not flow into historical signals. Without point-in-time integrity, a backtest produces overestimated returns — in practice, this has been described as a knockout criterion for institutional data purchases.
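A point-in-time lookup can be sketched as an "as-of" query: each value is stored with the date it became public, and the backtest may only read values published on or before the simulation date. The record layout and the figures below are hypothetical.

```python
from bisect import bisect_right

# Hypothetical record: (publication_date, value) pairs for one fiscal quarter's EPS,
# sorted by publication date. ISO date strings compare correctly as plain strings.
eps_history = [
    ("2023-02-01", 1.10),  # initial report
    ("2023-05-15", 0.95),  # later restatement
]

def as_of(history, date):
    """Latest value published on or before `date`, or None if nothing was known yet."""
    dates = [d for d, _ in history]
    i = bisect_right(dates, date)
    return history[i - 1][1] if i > 0 else None

known_in_march = as_of(eps_history, "2023-03-01")  # sees only the initial report
known_in_june = as_of(eps_history, "2023-06-01")   # sees the restated value
```

A database that stores only the latest value would return 0.95 for March as well, which is exactly the look-ahead error described above.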

Walk-forward testing (out-of-sample validation):

  • Split the historical dataset into training and test sets
  • Train the model exclusively on training data
  • Evaluate it on test data the model has never seen
  • Repeat over rolling windows

This is the most important protection against overfitting. As a rule of thumb: the out-of-sample windows of a walk-forward test should together cover at least 30–40% of the historical period, and the out-of-sample result should not be dramatically worse than the in-sample result.
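The rolling procedure can be expressed as an index generator. This sketch assumes fixed-length training windows that advance by one test window at a time; the function name and the window sizes are illustrative.

```python
def walk_forward_splits(n_obs, train_size, test_size):
    """Yield (train, test) index ranges for rolling, non-overlapping test windows."""
    start = 0
    while start + train_size + test_size <= n_obs:
        train = range(start, start + train_size)
        test = range(start + train_size, start + train_size + test_size)
        yield train, test
        start += test_size  # shift the whole window forward by one test period

splits = list(walk_forward_splits(n_obs=1000, train_size=500, test_size=100))
oos_share = sum(len(test) for _, test in splits) / 1000
```

Here five test windows cover observations 500 to 999, so half of the history is evaluated out of sample, and every training window ends strictly before its test window begins.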

Monte Carlo simulation allows understanding the distribution of possible outcomes. Instead of reporting a single equity curve, thousands of paths are simulated (e.g., by permutation of trading returns), and the resulting range shows whether historical performance is statistically significant or within random dispersion.

2.4 Alternative Data: What Actually Works

Alternative data is all information beyond traditional market data that can be used for signal construction. Categories:

Type | Example | Application
Satellite data | Parking lot utilization, tank farm fill levels | Leading indicator for retail, oil demand
Credit card data | Anonymized transactions | Anticipating revenue announcements
NLP/Social Media | Twitter sentiment, news analysis | Sentiment indicator, volatility forecasting
Supply chain data | Order volumes, delivery times | Leading indicator for earnings
Satellite imagery | Emission levels, crop estimates | Commodity demand

Important nuances from the Two Sigma perspective:

  1. Alternative data is a hypothesis tool, not a strategy: only when one has a well-founded thesis (e.g., "credit card transactions predict quarterly revenues") does it make sense to test alternative data. Data without a thesis is noise.

  2. The liquidity problem: signals derived from alternative data are often tradeable only in a small universe. In the options market, for example, most contracts have very wide spreads and low volume. Only for a small subset of underlyings are there sufficiently liquid options. This significantly limits the capacity of strategies.

  3. Data cleanliness is expensive: institutional buyers ask as the first question: is the data point-in-time? Without this guarantee, the dataset is worthless for systematic strategies. Startups have experienced this as a knockout criterion in sales conversations.

  4. AI and generative AI: generative AI can help generate new signal ideas, including ones not explicitly present in its training data. However, LLMs are non-deterministic, weak with structured numerical data, and limited by a training cutoff. The competitive advantage lies not in access to AI tools, but in critically verifying their outputs and applying human judgment to them.

2.5 Signal Combination: IC and IR

Information Coefficient (IC): measure of a signal's predictive power. Technically: the rank correlation (Spearman) between forecasted and realized return. An IC of 0 means no predictive ability; IC = 1 would be perfect rank prediction, IC = -1 a perfectly inverted one. Real Alpha signals often show ICs between 0.02 and 0.10: small, but economically meaningful at large scale.
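A small worked example of the rank correlation. The sketch assumes there are no tied values, which allows the closed-form Spearman formula; the forecasts and realized returns are invented, and a five-asset sample is far too small for a real IC estimate.

```python
def ranks(values):
    """Ranks 1..n by ascending value (assumes no ties, an illustrative simplification)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        out[i] = rank
    return out

def information_coefficient(forecast, realized):
    """Spearman rank correlation via the no-ties formula 1 - 6*sum(d^2) / (n*(n^2 - 1))."""
    n = len(forecast)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(forecast), ranks(realized)))
    return 1 - 6 * d2 / (n * (n * n - 1))

forecast = [0.02, -0.01, 0.03, 0.00, -0.02]   # hypothetical forecasts for five assets
realized = [0.01, 0.00, 0.04, -0.03, -0.01]   # hypothetical realized returns
ic = information_coefficient(forecast, realized)
```

Only the ordering matters: the forecast magnitudes could be rescaled arbitrarily without changing the IC.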

Information Ratio (IR): IR ≈ IC × √Breadth, where Breadth is the number of independent bets per year. This is the Fundamental Law of Active Management (Grinold & Kahn): the same IR can be reached with a weak signal applied across a broad universe or with a strong signal applied to a narrow one. Large quant funds optimize both dimensions.
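The trade-off between signal strength and breadth is simple arithmetic. The IC and breadth values below are chosen only to make the equivalence visible, and the function names are hypothetical.

```python
import math

def information_ratio(ic, breadth):
    """Fundamental Law approximation: IR = IC * sqrt(breadth)."""
    return ic * math.sqrt(breadth)

def breadth_needed(ic, target_ir):
    """Independent bets per year needed to reach `target_ir` at a given IC."""
    return (target_ir / ic) ** 2

narrow = information_ratio(ic=0.10, breadth=25)   # strong signal, few bets
broad = information_ratio(ic=0.02, breadth=625)   # weak signal, many bets
bets_for_ir_1 = breadth_needed(ic=0.05, target_ir=1.0)
```

Both configurations give an IR of 0.5. The catch is the word "independent": 625 positions driven by the same factor count as far fewer bets.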

Factor portfolio construction: institutional quant funds typically rank their entire investment universe (often thousands of assets) by a composite score from multiple factors (momentum, value, quality, volatility). They buy the top quintile and sell short the bottom quintile. This creates a market-neutral, diversified Alpha portfolio.


3. Momentum Strategies

3.1 Time-Series Momentum (TSMOM): Mechanics and Evidence

TSMOM is one of the academically best-documented strategies in systematic funds. Core result: assets that have risen over the past 12 months tend to continue rising over the following month, and assets that have fallen tend to continue falling. Some implementations exclude the most recent month from the lookback.

Standard implementation:

Signal_t = sign(r_{t-12, t-1})
Position_size_t = Signal_t / σ_t

where σ_t is the current realized volatility. The volatility scaling ensures that each position contributes roughly the same risk.
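A minimal sketch of the two formulas on monthly data. Two things are assumptions added for the example: the constant `target_vol`, which only rescales Signal_t / σ_t to a readable size, and the crude estimate of σ_t from the same 12 monthly returns. The return series are invented.

```python
import math

def tsmom_position(monthly_returns, lookback=12, target_vol=0.10):
    """sign(cumulative return over `lookback` months) times target_vol / sigma."""
    window = monthly_returns[-lookback:]
    cumulative = math.prod(1 + r for r in window) - 1
    signal = (cumulative > 0) - (cumulative < 0)  # +1, 0 or -1
    mean = sum(window) / len(window)
    var = sum((r - mean) ** 2 for r in window) / (len(window) - 1)
    sigma = math.sqrt(var * 12)  # annualized from monthly data
    return signal * target_vol / sigma

# Hypothetical monthly returns for a rising asset and its mirror image.
rising = [0.02, -0.01, 0.03, 0.01, 0.02, -0.02, 0.04, 0.01, 0.00, 0.02, -0.01, 0.03]
falling = [-r for r in rising]
long_position = tsmom_position(rising)
short_position = tsmom_position(falling)
```

The two assets have identical volatility, so they receive positions of equal size and opposite sign.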

Evidence: across 58 futures markets (equities, bonds, currencies, commodities), with strategy returns reported from 1985 to 2009, a simple diversified TSMOM strategy showed an annualized Sharpe Ratio of approximately 1.28, significantly higher than that of most individual asset classes.

📚 Source: Moskowitz, Ooi & Pedersen (2012), op. cit.

For commodities, Koch Industries traders (Ilia Bouchouev, "Virtual Barrels") documented a simple 1-month momentum strategy for crude oil: buy when today's price is above the 20-day average; sell when it is below. Over 25 years, this primitive strategy generated nearly 10% annualized return: not good enough on its own (drawdowns too high), but strong enough as a building block. Crucially, in commodities momentum has a fundamental justification in the inertia of supply and demand.

3.2 Cross-Sectional Momentum: Relative Strength

Cross-sectional momentum (also relative strength) is based on comparing assets against each other, not absolute returns. Implementation:

  1. Define investment universe (e.g., all S&P 500 sectors)
  2. Rank assets by return over the last 3–12 months
  3. Buy the top performers (e.g., top quintile)
  4. Sell (or avoid) the underperformers
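The four steps can be sketched as a ranking function. The tickers are sector ETFs, but the past returns are made-up numbers, and the function name and the 20% cut-off are illustrative assumptions.

```python
def cross_sectional_signal(returns_by_asset, fraction=0.2):
    """Split a universe into winners (long) and losers (short or avoid) by past return."""
    ranked = sorted(returns_by_asset, key=returns_by_asset.get, reverse=True)
    k = max(1, int(len(ranked) * fraction))
    return ranked[:k], ranked[-k:]

# Hypothetical 6-month returns for sector ETFs (numbers are invented).
past_returns = {
    "XLK": 0.18, "XLE": 0.12, "XLF": 0.07, "XLV": 0.03, "XLI": 0.05,
    "XLY": 0.09, "XLP": -0.01, "XLU": -0.04, "XLB": 0.02, "XLRE": -0.06,
}
longs, shorts = cross_sectional_signal(past_returns)
```

Unlike TSMOM, the rule is purely relative: if every asset in the universe had fallen, the least bad ones would still be bought.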

Rebalancing frequency: weekly or monthly is practical for ETF strategies; daily for futures funds. Higher frequency increases transaction costs; lower frequency increases signal latency.

Backtest evidence (ETF implementation): a strategy that daily buys the sector ETF with the highest momentum (holding period: 1 day, rolling) showed the following characteristics over the period 2014–2025:

  • CAGR above the S&P 500 benchmark
  • Sharpe Ratio approximately 0.76 (respectable value for a simple strategy)
  • Beta to S&P 500: approximately 0.48 (low market sensitivity)
  • Positive Alpha (intercept of the regression)

⚠️ Simplification: backtesting results contain inherent over-optimization risks. The above figures are illustrative, not guaranteed.

3.3 Momentum Crashes: The Main Risk

Momentum strategies are not infallible. The literature documents specific conditions under which momentum fails:

Momentum crashes typically occur in two situations:

  1. After market collapses and recessions: stocks that fell most sharply during the crash (short positions in a momentum strategy) often recover most strongly. This reverses the momentum position.
  2. After rapid market rallies (Sharp Market Reversals): when the market turns quickly, the portfolio is still positioned for the old trend, and both its long and short sides move against it.

📚 Source: Daniel & Moskowitz (2016), "Momentum Crashes", Journal of Financial Economics 122(2). Documents that momentum returns are significantly negatively skewed, i.e., that momentum strategies carry known left-tail risks.

January effect: in January, past losers (short side of the momentum strategy) tend to generate excess returns due to tax-motivated selling in December and subsequent recovery in January. This is a known seasonal headwind for momentum.

Consequences for strategy construction:

  • Dynamically reduce momentum exposure after large market declines
  • Volatility scaling helps, but must be supplemented by regime filters
  • Combination with value signals can dampen crash risk (value and momentum are negatively correlated, creating diversification value)

3.4 Momentum in ETFs: Practical Considerations

ETF-based momentum strategies have specific properties compared to futures-based CTA programs:

  • Liquidity: sector ETFs (XLE, XLK, etc.) are liquid enough for retail implementation
  • No roll costs: unlike futures, ETFs do not need to be rolled at contract expiry, so no roll costs arise (ETFs charge an ongoing expense ratio instead)
  • Limited short opportunities: most ETF momentum strategies are long-only, which increases crash risk
  • Transaction costs: with daily rebalancing, transaction costs are relevant; capital base > 50,000 USD needed so that fixed commissions do not eat up returns

MAG7 momentum: a concentrated universe (Apple, Microsoft, Amazon, NVIDIA, Alphabet, Tesla, Meta) with daily-rolling momentum selection showed over 2019–2025 a CAGR of 52.57% and Sharpe Ratio 1.43 — but with a Drawdown of 58.78% and annualized volatility of 52.59%. This illustrates the risk of concentrated momentum strategies.

⚠️ Simplification: a backtest over a period of extraordinary MAG7 growth (post-COVID AI boom) likely significantly overestimates future results.


4. Breakout Strategies

4.1 Why Breakouts Work: Market Structural Explanations

A breakout occurs when price crosses a defined support or resistance level with elevated volume. The structural mechanisms that drive genuine breakouts:

Stop-loss cascades: above resistance levels, stop-loss orders from short sellers accumulate. When price crosses this level, these orders are automatically activated (stop-buy orders), triggering further buying and reinforcing the breakout. The same applies downward: below support levels lie stop-loss orders from long holders.

Institutional flows: CTA funds use moving average breakouts as entry signals. When price breaks above the 20-day SMA, many systematic funds simultaneously generate long signals. These coordinated buys give the breakout additional energy.

Gamma squeeze (with options): when price approaches a concentration zone in the options book (high open call inventory), Market Makers who are short those calls become increasingly short delta as price rises. They must buy the underlying or futures to hedge, creating additional upward pressure. This mechanism explains some accelerations above "gamma walls."

Market psychology: technical levels are widely observed. When a well-known level breaks, the collective assessment of market participants changes — a psychological threshold is overcome, creating momentum through sentiment shift.

4.2 False Breakouts: Volume as the Critical Filter

False breakouts are one of the most common pitfalls for breakout traders. Price briefly breaks through support or resistance, but returns without sustained follow-through. Recognition features:

  1. Missing volume confirmation: a breakout without significantly elevated volume is suspect. Volume represents conviction; without conviction, any move is fragile.

  2. Intraday reversal: price breaks above a level but closes below it (bearish signal = "rejection candle").

  3. Broader context: a breakout against the overriding trend is less reliable than a continuation breakout. Context matters.

Opening Range Breakout (ORB) is a specialized form: the first hour after market open defines the "Opening Range." If price breaks out above or below this range, this is taken as a signal for the day's direction. Empirically, ORB strategies show moderately positive results in equities and futures, but a high frequency of false breakouts when no filters are applied.

4.3 Expected-Move-Based Breakout Frameworks: Weekly Ranges from IV

A quantitatively grounded approach uses implied volatility (IV) to derive expectation corridors:

Expected Move (EM) for a week:

EM_weekly ≈ S₀ × IV × √(7/365)

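where S₀ is the current price and IV is the annualized implied volatility. A common market-based approximation is the price of the ATM straddle for the upcoming weekly expiration; under Black-Scholes assumptions this is roughly 0.8 × EM, so it slightly understates the one-standard-deviation move.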

Strategy logic:

  • Upper boundary: S₀ + EM
  • Lower boundary: S₀ - EM
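A worked example of the corridor. The index level of 5000 and the 16% implied volatility are hypothetical inputs; the function name is an assumption.

```python
import math

def expected_move(spot, iv, days=7):
    """One-standard-deviation move implied by annualized IV over `days` calendar days."""
    return spot * iv * math.sqrt(days / 365)

spot, iv = 5000.0, 0.16   # hypothetical index level and annualized implied volatility
em = expected_move(spot, iv)
upper, lower = spot + em, spot - em
```

With these inputs the weekly expected move is about 111 points, giving a corridor of roughly 4889 to 5111. Since the EM is a one-standard-deviation band, closes outside it are not rare events: under a normal approximation they occur in roughly one week out of three.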

When price breaks these boundaries, genuine directional impulse arises from two sources:

  1. Delta-hedging flows: options sellers who did not expect the breakout must adjust their delta
  2. Interpretation signal: market-wide recognition that the weekly expectation corridor has been broken

Refinement: monitor Gamma Exposure (GEX) near the EM boundaries. A GEX concentration point above the EM boundary can serve as the next target after the breakout (gamma magnets). A negative GEX area amplifies the move.

Weaknesses of the approach:

  • Intra-week events (earnings, macro data) can retroactively invalidate the EM basis
  • Time decay for options buyers is substantial with short maturities
  • False breakouts above/below the EM are frequent enough to require risk management

5. Seasonality

5.1 Statistical Foundation and Data-Mining Risk

Seasonality refers to recurring patterns in market returns that correlate with the calendar. The statistical basis is double-edged:

For seasonality: with 20+ years of data and hundreds of instruments, one has sufficient observations to identify statistically significant patterns. Certain seasonal effects have economic justifications and are replicable.

Against seasonality: seasonal backtests suffer from particularly high data-mining bias. When testing 12 months and many instruments, numerous apparent patterns will be found by chance. Without walk-forward validation, such patterns are often spurious.

📚 Source: Sullivan, Timmermann & White (2001), "Dangers of Data Mining: The Case of Calendar Effects", Journal of Econometrics 105. Shows that many calendar effects do not hold when multiple testing is correctly addressed.

5.2 Known Seasonal Patterns with Economic Basis

Sell in May and Go Away: historically, equity markets performed worse during summer months (May–October) than November–April. Explanatory attempts: lower institutional trading volume in summer, bonus payments and reinvestment in winter. The pattern appears to be more than a pure data-mining artifact, but it is unreliable in individual years.

Turn-of-month effect: the first trading days of a new month historically show a slightly positive bias. Explanation: monthly pension contributions, portfolio rebalancings, and fund inflows concentrate at the start of the month. This effect is small but consistent.

OpEx effects: in the weeks around options expirations (typically the third Friday), recurring patterns appear in volatility and sector movements, driven by the rolling and hedging adjustments of large options positions. OpEx weeks often show elevated intraday volatility.

January effect: historically, small-caps outperformed in January. The usual explanation is tax-loss harvesting in December (losers are sold) followed by repurchases in January. The effect has been partially arbitraged away but is still recognizable.

5.3 Seasonality in Commodities: Fundamental Basis

Commodities have the strongest fundamental basis for seasonality:

  • Natural gas: demand rises in winter (heating) and summer (air conditioning). Inventory build in spring/fall creates Contango structure; inventory draws during peak load periods create Backwardation.
  • Crude oil: refinery maintenance periods (March–April) temporarily reduce demand for crude; refineries then ramp up runs ahead of the summer demand season (Driving Season in the USA), when gasoline demand peaks.
  • Agricultural commodities: harvest seasonality with fundamental basis through planting and harvest cycles. Corn and soybeans show clear seasonal patterns around planting and harvest dates.

The crucial distinction: seasonality in commodities is fundamentally justified and thus structurally more robust against data-mining criticism than purely calendar-based equity anomalies.

Practical implementation (Futures Seasonality):

  • Use 20 years of historical data to calculate the average 5-day return pattern for each date
  • Rank futures contracts daily by seasonality strength
  • Buy the contract with the highest positive seasonality signal
  • Exit on the next trading day, unless the signal is confirmed again, in which case the position is held

Backtest result (ES, NQ, GC, ZN, CL, 2014–2025): CAGR 17.77%, Sharpe Ratio 1.11, Drawdown -33.12% (comparable to S&P 500's -33.92%). Important: other futures (silver, wheat, soybeans, etc.) showed weaker seasonality and were not included. This shows that not all commodities have equally strong seasonality, but selecting instruments after seeing their results is itself a source of data-mining bias, so the reported figures are likely optimistic.


6. Swing Trading Systematics

6.1 Definition and Distinctions

Swing trading refers to multi-day holding periods, typically 2 to 20 trading days. It differs from:

  • Intraday trading: positions are closed within the same trading day. Focus on short-term patterns, high transaction frequency, higher fee burden.
  • Position trading: holding periods of weeks to months. More fundamental orientation, less timing precision needed.

Swing trading is the natural time horizon for traders who want to combine a professional routine with active trading. It allows sufficient reaction time between signal generation and execution.

6.2 Volatility-Based Swing Trades: Using IV Cycles

The core of a volatility-based swing trading approach:

Implied vs. realized volatility:

  • When IV > RV (Realized Volatility): options are "expensive" → prefer seller strategies
  • When IV < RV: options are "cheap" → prefer buyer strategies

Swing trade structures:

  1. Long Straddle / Long Strangle (7–21 days): when IV is below average and a catalyst is anticipated. Profits from moves in both directions. Advantage: no active delta hedging needed — hold position and let volatility "work."

  2. Calendar Spread: sell short maturity (elevated IV), buy long maturity (fair IV). Exploits IV mean reversion and theta differences between maturities.

  3. Directional positions in low-IV phases: when IV is in the 10th percentile and a directional expectation exists, simple calls or puts can be cheap. Benefits from move + IV expansion.

Gamma regime as filter:

  • Positive gamma (GEX > 0): Market Makers buy on price decline, sell on price rise → dampening effect, range behavior preferred
  • Negative gamma (GEX < 0): Market Makers amplify moves → trending, higher volatility, long-vol strategies more suitable

6.3 Risk/Reward Calculation: Win Rate vs. Payoff Ratio

A fundamental concept that swing traders must understand:

Expected Value (EV):

EV = (Win-Rate × Avg. Win) - (Loss-Rate × Avg. Loss)

Two paths to positive EV:

  1. High win rate, small average win (mean-reversion strategies)
  2. Low win rate, large average win (momentum/breakout strategies)

Momentum strategies often have only 45–55% win rate, but an avg. win / avg. loss ratio of 2:1 or higher. Options premium selling has win rates of 60–70%, but an unfavorable payoff ratio during tail events.
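The two profiles can be compared with the EV formula directly. The win rates follow the ranges quoted above; the payoff sizes (2:1 for momentum, a 3:1 loss-to-win ratio for premium selling in a bad regime) are assumptions chosen for illustration, expressed in units of risk per trade.

```python
def expected_value(win_rate, avg_win, avg_loss):
    """Expected P&L per trade; `avg_loss` is given as a positive number."""
    return win_rate * avg_win - (1 - win_rate) * avg_loss

def breakeven_win_rate(avg_win, avg_loss):
    """Win rate at which the expected value is exactly zero."""
    return avg_loss / (avg_win + avg_loss)

momentum = expected_value(win_rate=0.45, avg_win=2.0, avg_loss=1.0)
premium_selling = expected_value(win_rate=0.70, avg_win=1.0, avg_loss=3.0)
momentum_breakeven = breakeven_win_rate(avg_win=2.0, avg_loss=1.0)
```

Under these assumptions the momentum profile earns 0.35 units per trade despite losing more often than it wins, while the premium seller loses 0.20 units per trade despite a 70% win rate. A 2:1 payoff ratio only needs a win rate above one third to break even.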

Practical swing model framework: forecast bands for 5 and 20 days, derived from momentum indicators, options flow (Gamma, Delta), and market positioning provide:

  • Upper Band: upper price target (resistance zone / exit for longs)
  • Lower Band: lower price target (support zone / entry for longs)
  • Risk Trigger: dynamic inflection point based on Gamma/Delta — breakout signals accelerated move

7. Backtesting Methodology

7.1 Critical Bias Sources

Look-ahead bias: the most serious problem. Arises when signals use information that was not available at the time of trading. Examples:

  • Use of revised earnings figures instead of originally reported ones
  • Including data that only became known after market close
  • Use of today's index composition for historical dates (this also introduces survivorship bias: indices today contain only survivors; historical composition was different)

Survivorship bias: a stock universe that contains only currently listed companies excludes all bankruptcies and delistings. This makes historical strategies appear better than they were. Correct practice: use historical index compositions (point-in-time index membership).

Transaction cost realism: backtests without realistic transaction costs are worthless in practice:

  • Spread costs (bid-ask)
  • Market impact with large orders
  • Slippage in volatile markets
  • Interest on short positions (stock borrow costs)

In high-frequency strategies, spread costs alone can consume the entire Alpha.

7.2 Walk-Forward vs. Cross-Validation

Walk-forward testing is the standard for time-series-based financial strategies:

Single split:   |--- Training Period ---|--- Test Period ---|
Window 1:       |--- Training ---|--- Test ---|--- (not seen) ---|
Window 2:                    |--- Training ---|--- Test ---|

By repeatedly shifting the window, one obtains multiple out-of-sample periods that together give a more robust picture of strategy performance than a single split.

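Purged cross-validation: with time series, classic K-fold cross-validation cannot be applied directly, because observations adjacent to the test fold share information with it and future data can enter the training set (look-ahead). "Purging" removes training observations whose labels overlap the test period; an additional "embargo" drops a buffer of observations immediately after the test set. Both prevent data leakage.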
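A simplified sketch of the idea. It uses fixed-width buffers around each test block instead of checking actual label overlap, so it is an approximation of the method described by López de Prado rather than a faithful implementation; the function name and parameters are assumptions.

```python
def purged_kfold(n_obs, n_folds, purge=0, embargo=0):
    """Yield (train, test) index lists for contiguous test blocks.

    Training drops `purge` observations before each test block and
    `purge + embargo` observations after it.
    """
    fold_size = n_obs // n_folds
    for k in range(n_folds):
        start = k * fold_size
        stop = n_obs if k == n_folds - 1 else start + fold_size
        test = list(range(start, stop))
        train = [
            i for i in range(n_obs)
            if i < start - purge or i >= stop + purge + embargo
        ]
        yield train, test

folds = list(purged_kfold(n_obs=100, n_folds=5, purge=3, embargo=2))
train, test = folds[2]   # middle fold: the test block is indices 40..59
```

For the middle fold, training uses indices 0 to 36 and 65 to 99: three observations are purged on each side of the test block and two more are embargoed after it.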

📚 Source: López de Prado (2018), Advances in Financial Machine Learning, Wiley. Standard work on methodology of machine learning in financial applications; details purging, embargoes, and other techniques.

Key backtesting metrics:

Metric | Meaning
CAGR | Annualized return (compound annual growth rate)
Sharpe Ratio | Return per unit of total volatility
Sortino Ratio | Return per unit of downside volatility
Max Drawdown | Largest cumulative loss from peak to trough
Beta | Sensitivity to the market (linear regression coefficient)
Alpha | Return beyond market exposure (regression intercept)
Win Rate | Proportion of profitable trades
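Three of these metrics can be computed from an equity curve with a few lines of code. The sketch assumes a zero risk-free rate for the Sharpe Ratio and uses a hypothetical series of year-end portfolio values; the function names are illustrative.

```python
import math

def cagr(equity, years):
    """Compound annual growth rate between the first and last equity value."""
    return (equity[-1] / equity[0]) ** (1 / years) - 1

def sharpe(returns, periods_per_year=252):
    """Annualized Sharpe ratio, assuming a zero risk-free rate."""
    mean = sum(returns) / len(returns)
    var = sum((r - mean) ** 2 for r in returns) / (len(returns) - 1)
    return mean / math.sqrt(var) * math.sqrt(periods_per_year)

def max_drawdown(equity):
    """Largest peak-to-trough decline, returned as a negative fraction."""
    peak, worst = equity[0], 0.0
    for value in equity:
        peak = max(peak, value)
        worst = min(worst, value / peak - 1)
    return worst

equity = [100, 110, 99, 120, 90, 121]   # hypothetical year-end values
annual_returns = [b / a - 1 for a, b in zip(equity, equity[1:])]
growth = cagr(equity, years=5)
risk_adjusted = sharpe(annual_returns, periods_per_year=1)
worst_decline = max_drawdown(equity)
```

The example shows why CAGR alone is insufficient: the curve grows by about 3.9% per year, yet it includes a 25% decline from 120 to 90 along the way.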

Benchmark comparison is mandatory: a strategy that is no better than a passive ETF does not justify the costs, stress, and time investment. The strategy's Sharpe Ratio should substantially exceed the benchmark Sharpe.

7.3 Monte Carlo Simulation for Robustness Testing

Monte Carlo methods allow understanding the range of possible outcomes — not just a single equity curve:

Approaches:

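  1. Trade permutation: the historical trade P&Ls are randomly rearranged to generate thousands of alternative equity curves. Reshuffling leaves the final return unchanged, so the informative output is the distribution of Drawdowns; resampling trades with replacement additionally yields a distribution of end returns. Together these show whether the historical result is typical or an outlier.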
  2. Block bootstrap: blocks of consecutive returns are drawn to preserve autocorrelation.
  3. Parametric model: based on estimated return distributions (often fat-tailed), future paths are simulated.
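The first approach can be sketched with the standard library. The trade P&Ls are hypothetical, the equity path is additive (fixed position size) for simplicity, and the `replace` flag switches from pure reshuffling to resampling with replacement; all names are assumptions made for the example.

```python
import random

def max_drawdown(pnls, start_equity=100.0):
    """Worst peak-to-trough decline (negative fraction) of an additive equity path."""
    equity = peak = start_equity
    worst = 0.0
    for pnl in pnls:
        equity += pnl
        peak = max(peak, equity)
        worst = min(worst, equity / peak - 1)
    return worst

def simulate_drawdowns(pnls, n_paths=2000, replace=False, seed=0):
    """Sorted drawdowns from reshuffled (permutation) or resampled (bootstrap) trades."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_paths):
        if replace:
            path = rng.choices(pnls, k=len(pnls))
        else:
            path = rng.sample(pnls, len(pnls))
        results.append(max_drawdown(path))
    return sorted(results)  # most negative drawdown first

# Hypothetical trade P&Ls in currency units.
trade_pnls = [3, -2, 5, -4, 1, 6, -3, 2, -5, 4, 2, -1, 7, -6, 3]
drawdowns = simulate_drawdowns(trade_pnls)
worst_5pct = drawdowns[int(0.05 * len(drawdowns))]   # 5% quantile drawdown
historical = max_drawdown(trade_pnls)
```

Comparing `historical` with the simulated distribution shows whether the realized drawdown was lucky or typical for this set of trades. With pure reshuffling every path ends at the same total P&L, so only the path-dependent statistics vary.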

Typical questions Monte Carlo answers:

  • How likely is it that the strategy suffers a Drawdown > 20% in a 2-year window?
  • What is the 5% quantile Drawdown (i.e., the worst 5% of simulated scenarios)?
  • Does the historical result lie within the normal dispersion of random paths?

Important note: Monte Carlo simulations test robustness against random path variations, but not against regime changes (structural market changes). A strategy can pass Monte Carlo tests and still fail when the market regime fundamentally changes (e.g., interest rate reversal after decades of declining rates).


Summary: Hierarchy of Concepts

Market Structure
├── Systematic Funds (CTAs, VC funds, Risk Parity)
│   ├── Generate predictable flows at defined price levels
│   └── Amplify trends and volatility spikes
│
├── Alpha vs. Beta
│   ├── Orthogonal Alpha is the goal of institutional funds
│   └── Alpha decays with increasing crowding
│
├── Strategy styles
│   ├── Momentum: Trend-Following, TSMOM, Cross-Sectional
│   ├── Breakouts: Volume-based, IV-based, Structure-based
│   ├── Seasonality: Calendar-based (weak), Fundamental (strong in commodities)
│   └── Swing Trading: Multi-day, IV cycles, payoff optimization
│
└── Methodology
    ├── Backtesting: Point-in-time, Walk-Forward, Transaction costs
    ├── Monte Carlo: Robustness testing
    └── Overfitting protection: Simplicity, Out-of-sample testing

Sources: Moskowitz, Ooi & Pedersen (2012) — TSMOM; Jegadeesh & Titman (1993) — Cross-Sectional Momentum; Daniel & Moskowitz (2016) — Momentum Crashes; Asness, Frazzini & Pedersen (2012) — Risk Parity; Bailey et al. (2014) — Backtest Overfitting; López de Prado (2018) — ML in Finance; Sullivan, Timmermann & White (2001) — Calendar Effects. Institutional perspectives: Tharsis Souza (former Two Sigma), Nick (former Citadel), Ilia Bouchouev (former Koch Global Partners).