Noise Injection Strategies
This module provides several strategies for injecting noise into hidden states. The key idea is that noise injection should enable “trajectory resampling”: it lets the model escape local optima in reasoning space.
Injection State
- class structured_stochasticity.injection.InjectionState(step=0, total_steps=None)
Bases: object
Tracks state across injection calls for stateful strategies.
- Parameters:
step (int)
total_steps (int | None)
- step: int = 0
- total_steps: int | None = None
- __init__(step=0, total_steps=None)
- Parameters:
step (int)
total_steps (int | None)
- Return type:
None
Base Class
- class structured_stochasticity.injection.NoiseInjector(scale=0.1, device='cuda')
Bases: ABC
Abstract base class for noise injection strategies.
- Parameters:
scale (float)
device (str)
- abstractmethod sample(shape)
Sample a noise tensor of the given shape.
- Parameters:
shape (tuple[int, ...])
- Return type:
torch.Tensor
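The subclassing contract described above can be illustrated with a small, dependency-free sketch. The names `SketchInjector` and `SketchGaussian` are hypothetical stand-ins, not the library classes, and the sketch returns a flat Python list where the real `sample` returns a `torch.Tensor`.

```python
import random
from abc import ABC, abstractmethod
from math import prod


class SketchInjector(ABC):
    """Hypothetical stand-in for NoiseInjector (not the library class)."""

    def __init__(self, scale=0.1):
        self.scale = scale

    @abstractmethod
    def sample(self, shape):
        """Return prod(shape) noise values as a flat list."""


class SketchGaussian(SketchInjector):
    """Concrete strategy: every subclass only has to implement sample()."""

    def sample(self, shape):
        return [random.gauss(0.0, self.scale) for _ in range(prod(shape))]


noise = SketchGaussian(scale=0.1).sample((2, 3))  # 2 * 3 = 6 values
```

Because `sample` is abstract, instantiating `SketchInjector` directly raises `TypeError`; only concrete strategies can be created.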
Gaussian Noise
- class structured_stochasticity.injection.GaussianNoiseInjector(scale=0.1, device='cuda')
Bases: NoiseInjector
Injects Gaussian noise scaled by a constant factor.
This is the simplest strategy: z ~ N(0, scale²)
The scale parameter controls the magnitude of the perturbation.
Too small: the model won’t escape attractor basins.
Too large: the noise destroys coherent reasoning.
- Parameters:
scale (float)
device (str)
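To make the role of `scale` concrete, the following sketch draws z ~ N(0, scale²) using only the standard library and measures the empirical standard deviation. The helper `gaussian_noise` is a hypothetical illustration, not the injector's implementation, which operates on torch tensors.

```python
import random
import statistics


def gaussian_noise(n, scale, rng):
    """Draw n samples from N(0, scale^2)."""
    return [rng.gauss(0.0, scale) for _ in range(n)]


rng = random.Random(0)
measured = {}
for scale in (0.01, 0.1, 1.0):
    z = gaussian_noise(10_000, scale, rng)
    # The empirical standard deviation should track the requested scale.
    measured[scale] = statistics.pstdev(z)
```

The measured standard deviation grows in proportion to `scale`, which is why the parameter directly sets how far a hidden state is pushed from its unperturbed value.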
Uniform Noise
- class structured_stochasticity.injection.UniformNoiseInjector(scale=0.1, device='cuda')
Bases: NoiseInjector
Injects uniform noise in the range [-scale, scale].
Uniform noise has bounded magnitude, which may be preferable when every perturbation must be guaranteed to stay within a fixed range.
- Parameters:
scale (float)
device (str)
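The bounded-magnitude property can be checked with a short standard-library sketch that compares uniform and Gaussian draws at the same `scale`. The helper `uniform_noise` is hypothetical and stands in for the tensor-based injector.

```python
import random


def uniform_noise(n, scale, rng):
    """Draw n samples uniformly from [-scale, scale]."""
    return [rng.uniform(-scale, scale) for _ in range(n)]


rng = random.Random(0)
scale = 0.1
u = uniform_noise(10_000, scale, rng)
g = [rng.gauss(0.0, scale) for _ in range(10_000)]

max_uniform = max(abs(x) for x in u)   # never exceeds scale
max_gaussian = max(abs(x) for x in g)  # tails reach beyond scale
```

Note that at equal `scale` the two strategies are not equally strong: a uniform distribution on [-scale, scale] has standard deviation scale/√3, so it perturbs less on average than a Gaussian with standard deviation `scale`.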
Annealed Noise
- class structured_stochasticity.injection.AnnealedNoiseInjector(scale=0.1, anneal_factor=0.95, min_scale=0.01, device='cuda')
Bases: NoiseInjector
Injects noise that decreases over the generation process.
Motivation: strong perturbation early (when problem framing matters most), tapering to stability later (when the solution is crystallizing).
Scale at step t: scale * (anneal_factor ^ t)
This mirrors a natural intuition: you want to explore different framings early, then commit and execute once you’ve found a good path.
- Parameters:
scale (float)
anneal_factor (float)
min_scale (float)
device (str)
- __init__(scale=0.1, anneal_factor=0.95, min_scale=0.01, device='cuda')
- Parameters:
scale (float)
anneal_factor (float)
min_scale (float)
device (str)
- property current_scale: float
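The annealing schedule can be sketched as a plain function. The formula scale * anneal_factor^t comes from the description above; treating `min_scale` as a lower bound on the result is an assumption, since the documentation lists the parameter without stating how it is applied. The function name `annealed_scale` is hypothetical.

```python
def annealed_scale(step, scale=0.1, anneal_factor=0.95, min_scale=0.01):
    """Exponentially decayed scale at a given step.

    Assumption: min_scale acts as a floor on the decayed value.
    """
    return max(min_scale, scale * anneal_factor ** step)


# Scale at a few representative steps, using the documented defaults.
schedule = {t: annealed_scale(t) for t in (0, 1, 10, 44, 45, 100)}
```

Under that assumption and the default parameters, the decayed value drops below `min_scale` at step 45, after which the scale stays constant at 0.01.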
Layer-Selective Injection
- class structured_stochasticity.injection.LayerSelectiveInjector(layer_scales, default_scale=0.0, device='cuda')
Bases: NoiseInjector
Applies different noise scales to different layers.
This allows testing the hypothesis that early layers (problem framing) and late layers (output realization) differ in their sensitivity to perturbation.
- Parameters:
layer_scales (dict[int, float]) – Dict mapping layer index to noise scale
default_scale (float) – Scale for layers not in layer_scales
device (str)
- __init__(layer_scales, default_scale=0.0, device='cuda')
- Parameters:
layer_scales (dict[int, float])
default_scale (float)
device (str)
- current_layer: int | None
- set_layer(layer_idx)
Set which layer we’re currently injecting into.
- Parameters:
layer_idx (int)
- property current_scale: float
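The interaction between `layer_scales`, `default_scale`, `set_layer`, and `current_scale` can be sketched as a dictionary lookup. The class `LayerScaleLookup` below is a hypothetical illustration, not the library class. In particular, falling back to `default_scale` when no layer has been set is an assumption.

```python
class LayerScaleLookup:
    """Hypothetical sketch of per-layer scale selection."""

    def __init__(self, layer_scales, default_scale=0.0):
        self.layer_scales = dict(layer_scales)
        self.default_scale = default_scale
        self.current_layer = None

    def set_layer(self, layer_idx):
        self.current_layer = layer_idx

    @property
    def current_scale(self):
        # Assumption: layers missing from layer_scales (and the
        # "no layer set" case) use default_scale.
        return self.layer_scales.get(self.current_layer, self.default_scale)


lookup = LayerScaleLookup({0: 0.2, 1: 0.1}, default_scale=0.0)
scales = []
for layer_idx in range(4):
    lookup.set_layer(layer_idx)  # would be called from each layer's hook
    scales.append(lookup.current_scale)
```

With a `default_scale` of 0.0, layers that are not listed receive no noise, so an experiment can perturb only early layers or only late layers by changing the dictionary.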
Once-Per-Generation Injection
- class structured_stochasticity.injection.OncePerGenerationInjector(scale=0.1, latent_dim=None, device='cuda')
Bases: NoiseInjector
Samples noise once and reuses it for the entire generation.
This corresponds to the formalism in the paper:
z ~ P(z|X) [sampled once]
h = f_θ(X, z)
The same z influences all tokens, creating a consistent “reasoning trajectory” rather than per-token perturbation.
- Parameters:
scale (float)
latent_dim (int | None)
device (str)
- __init__(scale=0.1, latent_dim=None, device='cuda')
- Parameters:
scale (float)
latent_dim (int | None)
device (str)
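The sample-once-and-reuse behaviour can be sketched as a cached draw. The class `OnceSampler` and its `reset` method are hypothetical; the documentation above does not say how the real injector starts a new generation, so the reset mechanism here is an assumption.

```python
import random


class OnceSampler:
    """Hypothetical sketch: draw z once, then reuse it for every token."""

    def __init__(self, scale=0.1, seed=None):
        self.scale = scale
        self._rng = random.Random(seed)
        self._cached = None

    def sample(self, dim):
        # Draw only when nothing is cached (or the dimension changed).
        if self._cached is None or len(self._cached) != dim:
            self._cached = [self._rng.gauss(0.0, self.scale) for _ in range(dim)]
        return self._cached

    def reset(self):
        """Assumed hook: discard z so the next generation gets a fresh draw."""
        self._cached = None


sampler = OnceSampler(scale=0.1, seed=0)
z_token_0 = sampler.sample(4)
z_token_1 = sampler.sample(4)          # same z as the first token
sampler.reset()
z_next_generation = sampler.sample(4)  # new z for a new trajectory
```

Within one generation every token sees the same z, so repeated generations differ by trajectory rather than by independent per-token jitter.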
Factory Function
- structured_stochasticity.injection.create_injector(strategy, scale=0.1, device='cuda', **kwargs)
Factory function to create noise injectors.
- Parameters:
strategy (str) – One of “gaussian”, “uniform”, “annealed”, “once”, “layer_selective”
scale (float) – Base noise scale
device (str) – Torch device
**kwargs – Strategy-specific arguments
- Returns:
Configured NoiseInjector instance
- Return type:
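A string-keyed factory of this kind is commonly built as a registry lookup. The sketch below shows that pattern with standard-library samplers only; `make_sampler`, `_REGISTRY`, and the builder functions are hypothetical names, and raising `ValueError` for an unknown strategy is an assumption rather than documented behaviour of `create_injector`.

```python
import random


def _gaussian(scale, **kwargs):
    return lambda n: [random.gauss(0.0, scale) for _ in range(n)]


def _uniform(scale, **kwargs):
    return lambda n: [random.uniform(-scale, scale) for _ in range(n)]


# Maps a strategy name to a builder; only two strategies are sketched here.
_REGISTRY = {"gaussian": _gaussian, "uniform": _uniform}


def make_sampler(strategy, scale=0.1, **kwargs):
    """Look up a builder by name and forward strategy-specific kwargs."""
    try:
        builder = _REGISTRY[strategy]
    except KeyError:
        raise ValueError(
            f"Unknown strategy {strategy!r}; expected one of {sorted(_REGISTRY)}"
        ) from None
    return builder(scale, **kwargs)


sampler = make_sampler("uniform", scale=0.05)
values = sampler(8)
```

Keeping the mapping in one dictionary means a new strategy is added by registering one more builder, without touching the callers that select strategies by name.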