Research & Published Work

ASVR

Adaptive Strategy-selected Variance Reduction for Monte Carlo Option Pricing

A bandit-based framework that automatically selects the best variance reduction strategy for Monte Carlo option pricing without prior knowledge of the payoff structure. Achieves near-oracle performance on well-separated option types with theoretical O(K log N/N) MSE bounds.

Publication Date: May 2026
Option Types Tested: 8
VR Strategies: 9
Best Performance: 7.45×
May 13, 2026
ASVR: Adaptive Strategy-selected Variance Reduction for Monte Carlo Option Pricing

Alexander Robbins

We propose Adaptive Strategy-selected Variance Reduction (ASVR), a bandit-based framework that automatically selects the best variance reduction strategy for Monte Carlo option pricing without requiring prior knowledge of the option payoff structure.

ASVR runs a two-phase explore-then-exploit procedure over a pool of K variance reduction strategies and achieves expected MSE within O(K log N/N) of the oracle (the best fixed strategy).

Abstract (Continued)

Across eight option types—European, Asian, barrier, and lookback—ASVR always outperforms plain Monte Carlo and achieves near-oracle performance on 2 out of 8 options (within 10% of the best fixed strategy) with a single, parameter-free implementation.

We also propose a Bayesian fusion estimator that combines exploration estimates via inverse-variance weighting; with oracle weights this is provably variance-minimising, and with estimated weights it achieves genuine MSE-ratio improvement on 6 out of 8 option types.

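All experiments use N = 10,000 paths, 252 steps, and 200 independent trials with GBM parameters S₀ = 100, strike 100, r = 0.05, σ = 0.20, T = 1. (Elsewhere on this page K denotes the number of strategies in the pool, not the strike.)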

Contents

1 Introduction — Motivation and Research Question

2 Problem Setup — Monte Carlo, VR Strategies, Oracle

3 The ASVR Algorithm — Two-Phase Framework & Theory

4 Implementation Details — Pool Design & Halton QMC

5 Experimental Results — Performance across 8 option types

6 Discussion — IS Parameterization & Extensions

7 Conclusion

28 pages | 5 figures | Appendices with proofs & data

01 — Problem

Automatic Strategy Selection

Monte Carlo simulation is the standard method for pricing path-dependent and high-dimensional derivatives. However, its O(N^(−1/2)) convergence rate is slow: halving the standard error requires four times as many paths. Variance reduction (VR) techniques can dramatically improve efficiency, but different strategies excel on different option types.

Control variates dominate European calls but fail on puts. Latin hypercube sampling works for Asian puts but not lookback calls. Importance sampling is powerful but requires careful parameterization and matching the option's sensitivity direction. In practice, practitioners choose strategies by domain knowledge or trial-and-error — both impractical for automated pricing pipelines.

ASVR addresses this with a two-phase explore-then-exploit procedure: allocate exploration paths to estimate each strategy's variance on the unknown payoff, then devote the remainder of the budget to the strategy with the lowest estimated variance. The result is a parameter-free algorithm that beat plain Monte Carlo on every option type tested and achieves near-oracle performance when strategy separation is large.

02 — Results

Key Findings

ASVR was tested on 8 option types across 200 independent trials with 10,000 paths each, using the same GBM parameters throughout. The algorithm always improves over plain Monte Carlo and achieves near-oracle performance on well-separated option types.

Always Beats Plain MC: 1.13–7.45× (minimum 13%, maximum 645% improvement)
Near-Oracle on Calls: 3.6% gap (European Call: 7.45× vs oracle 7.72×)
Beats Oracle: 5.4% gain (Lookback Call: 5.77× vs CV 5.46×)
Fusion Wins: 6/8 types (Bayesian inverse-variance weighting improves on exploit)
European Call
ASVR achieves 7.45×, within 3.6% of the oracle (CV at 7.72×). Control variates are optimal when the payoff is smooth and correlated with the terminal price S_T, and ASVR identifies this reliably during exploration.
Lookback Call
ASVR achieves 5.77× and actually beats the oracle CV at 5.46×. The 9,100 exploitation paths with a well-calibrated strategy can outperform any fixed strategy using 10,000 paths.
Asian Put
ASVR identifies LHS (3.28×), but the finite-sample gap to the oracle is 118% because 100 exploration paths per strategy are too few when strategy separation is small. Adaptive exploration budgets are a natural remedy.
Halton Failure
Randomised Halton QMC in 252 dimensions performs 3–33× worse than plain MC. Its sample variance is artificially low in small samples, so the bandit over-selects it; it must therefore be excluded from high-dimensional pools.
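
The kind of inter-dimensional correlation that can arise in a high-dimensional Halton sequence can be illustrated with a small sketch. It is an assumption-laden toy, not the paper's experiment: it builds the plain (unscrambled) Halton sequence in pure NumPy, whereas the paper uses a randomised variant, and the helper names `first_primes`, `radical_inverse`, and `halton` are hypothetical. With only 100 points, any coordinate whose prime base exceeds 100 is simply i / base, so two such coordinates lie on a straight line.

```python
import numpy as np

def first_primes(count):
    """Return the first `count` primes by trial division."""
    primes = []
    candidate = 2
    while len(primes) < count:
        if all(candidate % p for p in primes if p * p <= candidate):
            primes.append(candidate)
        candidate += 1
    return primes

def radical_inverse(index, base):
    """Van der Corput radical inverse of `index` in the given base."""
    result, scale = 0.0, 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        result += digit * scale
        scale /= base
    return result

def halton(n_points, dim):
    """Unscrambled Halton points (illustration only; the paper randomises)."""
    bases = first_primes(dim)
    return np.array([[radical_inverse(i, b) for b in bases]
                     for i in range(1, n_points + 1)])

points = halton(100, 252)  # 100 points, one coordinate per time step
corr_low = np.corrcoef(points[:, 0], points[:, 1])[0, 1]       # bases 2 and 3
corr_high = np.corrcoef(points[:, 250], points[:, 251])[0, 1]  # two large bases
print(f"corr(dim 1, dim 2)     = {corr_low:.3f}")
print(f"corr(dim 251, dim 252) = {corr_high:.3f}")
```

The first two coordinates are close to uncorrelated, while the last two are almost perfectly correlated at this sample size. Whether randomisation removes this structure depends on the scheme used, which this sketch does not model.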
03 — Methodology

Explore-Then-Exploit Algorithm

ASVR is a multi-armed bandit algorithm that identifies the best variance reduction strategy without prior knowledge of the payoff structure. It operates in two phases:

Exploration Phase: Allocate n_exp paths to each of K strategies.
Compute sample variance σ̂²_k for each strategy.

Selection: Pick k̂* = arg min_k σ̂²_k

Exploitation Phase: Devote remaining paths to k̂*
Return the price estimate from the best strategy.
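
The two phases above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the pool holds only two strategies (plain Monte Carlo and a control variate on the discounted terminal price), the option is a European call simulated in a single time step rather than 252, and the names `plain_mc`, `control_variate`, and `explore_then_exploit` are hypothetical.

```python
import numpy as np

# GBM parameters quoted on this page; the strike is named STRIKE to avoid
# clashing with K, the number of strategies.
S0, STRIKE, R, SIGMA, T = 100.0, 100.0, 0.05, 0.20, 1.0

def terminal_prices(n, rng):
    """Draw n terminal GBM prices in one step (assumption: no path dependence)."""
    z = rng.standard_normal(n)
    return S0 * np.exp((R - 0.5 * SIGMA**2) * T + SIGMA * np.sqrt(T) * z)

def plain_mc(n, rng):
    """Per-path discounted call payoffs."""
    return np.exp(-R * T) * np.maximum(terminal_prices(n, rng) - STRIKE, 0.0)

def control_variate(n, rng):
    """Payoffs adjusted by the discounted terminal price, whose mean is S0."""
    s_t = terminal_prices(n, rng)
    payoff = np.exp(-R * T) * np.maximum(s_t - STRIKE, 0.0)
    control = np.exp(-R * T) * s_t
    cov = np.cov(payoff, control)
    beta = cov[0, 1] / cov[1, 1]  # estimated in-sample, so slightly biased
    return payoff - beta * (control - S0)

def explore_then_exploit(strategies, n_total, n_exp, rng):
    """Explore every strategy, pick the lowest sample variance, then exploit."""
    variances = {name: sampler(n_exp, rng).var(ddof=1)
                 for name, sampler in strategies.items()}
    best = min(variances, key=variances.get)
    n_exploit = n_total - n_exp * len(strategies)
    price = strategies[best](n_exploit, rng).mean()
    return best, price, variances

rng = np.random.default_rng(0)
pool = {"plain_mc": plain_mc, "control_variate": control_variate}
best, price, variances = explore_then_exploit(pool, n_total=10_000, n_exp=100, rng=rng)
print(best, round(price, 2))
```

Each strategy is modelled as a function returning per-path samples whose mean is the price, so adding a strategy to the pool only means adding another entry to the dictionary.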

In experiments, n_exp = 100 and K = 9, consuming 900 of 10,000 total paths (9% exploration overhead). The oracle-gap theorem guarantees E[MSE(ASVR)] ≤ MSE(oracle) + C · K log N / N, where the constant C depends on strategy separation.
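
As a worked check of the budget figures above, and to give a sense of scale for the rate term in the bound, the arithmetic can be written out as follows. The logarithm is assumed to be natural, and the constant C is not specified here, so the last line is only the size of K log N / N, not of the bound itself.

```latex
\begin{align*}
K \cdot n_{\mathrm{exp}} &= 9 \times 100 = 900, \\
N - K \cdot n_{\mathrm{exp}} &= 10{,}000 - 900 = 9{,}100, \\
\frac{K \log N}{N} &= \frac{9 \times \ln(10^{4})}{10^{4}}
  \approx \frac{9 \times 9.2103}{10^{4}} \approx 8.29 \times 10^{-3}.
\end{align*}
```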

For strategies with large variance separation (European call: CV vs LHS gap is 3×), the bound is tight and empirical performance matches theory. For options with small separation (Asian put: LHS vs moment matching gap is only 1.2×), finite-sample noise dominates and the gap to oracle widens.

04 — Enhancement

Bayesian Fusion Estimator

After both phases, ASVR has K exploration estimates and one exploitation estimate. Rather than discarding exploration paths, the fusion estimator combines all K+1 estimates via inverse-variance weighting:

V̂_fusion = [Σ_k (n_exp / σ̂²_k) V̂_k + (N_expl / σ̂²_k*) V̂_expl] /
          [Σ_k (n_exp / σ̂²_k) + (N_expl / σ̂²_k*)]

With oracle weights (true variances), the fusion variance is guaranteed to be no larger than the exploit variance. In practice, with estimated weights, two failure modes arise: estimation noise when σ̂²_k is noisy, and selection bias (winner's curse) when σ̂²_k* is optimistically biased. Despite these, fusion wins on 6 out of 8 option types and produces 2.1–5.2% MSE improvements.
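
The inverse-variance weighting in the formula above can be sketched as a small function. This is an illustration with made-up inputs, not the paper's code: the function name `fuse` and the toy numbers are hypothetical, and the returned variance is only valid when the supplied variances are the true ones and the estimates are independent.

```python
import numpy as np

def fuse(explore_means, explore_vars, n_exp, exploit_mean, exploit_var, n_exploit):
    """Combine K exploration means and one exploit mean by precision weighting.

    Each weight is (sample size) / (per-path variance), i.e. the inverse of
    the variance of the corresponding sample mean.
    """
    means = np.append(np.asarray(explore_means, dtype=float), exploit_mean)
    precisions = np.append(n_exp / np.asarray(explore_vars, dtype=float),
                           n_exploit / exploit_var)
    fused = np.sum(precisions * means) / np.sum(precisions)
    fused_var = 1.0 / np.sum(precisions)  # oracle-weight variance only
    return fused, fused_var

# Toy inputs (hypothetical): two strategies explored with 100 paths each,
# the first one selected and exploited with 800 further paths.
fused, fused_var = fuse(
    explore_means=[10.0, 12.0], explore_vars=[4.0, 16.0], n_exp=100,
    exploit_mean=11.0, exploit_var=4.0, n_exploit=800,
)
exploit_only_var = 4.0 / 800
print(fused, fused_var, exploit_only_var)
```

Because the precisions add, the fused variance under oracle weights can never exceed the exploit-only variance, which is the guarantee stated above. With estimated variances the weights themselves are random, which is where the two failure modes enter.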

05 — Publication

Paper Abstract

ASVR: Adaptive Strategy-selected Variance Reduction for Monte Carlo Option Pricing
We propose Adaptive Strategy-selected Variance Reduction (ASVR), a bandit-based framework that automatically selects the best variance reduction strategy for Monte Carlo option pricing without requiring prior knowledge of the option payoff structure. ASVR runs a two-phase explore-then-exploit procedure over a pool of K variance reduction strategies and achieves expected MSE within O(K log N/N) of the oracle (the best fixed strategy). Across eight option types—European, Asian, barrier, and lookback—ASVR always outperforms plain Monte Carlo and achieves near-oracle performance on 2 out of 8 options (within 10% of the best fixed strategy) with a single, parameter-free implementation. We also propose a Bayesian fusion estimator that combines exploration estimates via inverse-variance weighting; with oracle weights this is provably variance-minimising, and with estimated weights from n_exp = 100 exploration paths it achieves a genuine MSE-ratio improvement on 6 out of 8 option types. A novel negative result accompanies the positive findings: randomised Halton quasi-Monte Carlo in 252 time-step dimensions performs 3–33× worse than plain Monte Carlo due to severe inter-dimensional correlation, and must be excluded from high-dimensional bandit pools. All experiments use N = 10,000 paths, 252 steps, and 200 independent trials.
06 — Full Text

Access the Complete Paper

The full paper includes detailed proofs of the oracle-gap theorem, variance reduction strategy implementations, extended benchmark tables across all 8 option types, and discussion of the Halton failure mode in high dimensions.

↓ Download Full PDF (587 KB)

Paper: "ASVR: Adaptive Strategy-selected Variance Reduction for Monte Carlo Option Pricing"
Author: Alexander Robbins
Date: May 13, 2026
Pages: 28
Keywords: variance reduction, Monte Carlo, multi-armed bandit, explore-then-exploit, Bayesian fusion, quasi-Monte Carlo