Working with Sparse Features#

When your feature space is large and sparse (one-hot encoded categories, text features, high-cardinality IDs), dense precision matrices become impractical. A model with 100k features needs a 100k x 100k matrix in the dense case. Set sparse=True on the estimator and the precision matrix is stored as a sparse CSC array, so storage and updates scale with the number of nonzero entries instead of the square of the feature dimension.

All linear estimators support sparse=True. The intercept-only models (DirichletClassifier, GammaRegressor) don’t need it.

Enable sparse mode#

Pass sparse=True to the estimator and provide context as scipy.sparse.csc_array:

import numpy as np
from scipy.sparse import random as sparse_random
from bayesianbandits import (
    Arm, ContextualAgent, NormalRegressor, ThompsonSampling,
)

arms = [
    Arm(f"variant_{i}", learner=NormalRegressor(
        alpha=1.0, beta=1.0, sparse=True,
    ))
    for i in range(3)
]
agent = ContextualAgent(arms, ThompsonSampling(), random_seed=42)

# Sparse context: 1 row, 10000 features, ~1% density
X = sparse_random(1, 10000, density=0.01, format="csc", random_state=42)
(action,) = agent.pull(X)
agent.update(X, np.array([1.0]))

The estimator will not convert dense arrays to sparse for you. If you pass a dense array to a sparse=True estimator, it will raise.

If the precision matrix fills in over time (common with unstructured sparse features like bag-of-words), sparse operations become slower than dense. Hierarchical features (one-hot at each level of a taxonomy) keep fill bounded and are the ideal use case.

Install CHOLMOD for production workloads#

Two sparse backends are available:

SuperLU ships with scipy and works out of the box. It handles arbitrary sparse LU decomposition, which is a harder problem than what we actually need. Precision matrices are symmetric positive definite, and SuperLU can’t exploit that, so it does more work than necessary. For small models this doesn’t matter. For large ones it dominates your inference time.

CHOLMOD (via scikit-sparse) knows the matrix is symmetric positive definite and takes advantage of it.

pip install bayesianbandits[cholmod]

If scikit-sparse is installed, the library uses CHOLMOD automatically. No code changes needed. Both backends apply fill-reducing permutations internally; the library handles unpermuting so that sampling, prediction, and all other operations are unaware of the solver choice. To force SuperLU (for debugging or benchmarking), set the environment variable BB_NO_SUITESPARSE=1.

With CHOLMOD, real-time inference under 10 ms is feasible with models up to 2²⁰ features.

Working with Sparse Features#

Enable sparse mode#

Install CHOLMOD for production workloads#

This Page