PyTorch Hooks¶
This module provides the core mechanism for injecting noise into the model’s internal representations using PyTorch’s forward hook system.
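The sketch below illustrates the underlying mechanism in plain PyTorch, independent of this package: a forward hook that adds Gaussian noise to a module's output. The package's own HiddenStateHook and NoiseInjector classes wrap this idea with configuration and bookkeeping; their exact internals may differ from this illustration.

import torch

def make_noise_hook(scale=0.1):
    def hook(module, inputs, output):
        # Decoder blocks often return a tuple; the hidden states come first.
        hidden = output[0] if isinstance(output, tuple) else output
        noisy = hidden + scale * torch.randn_like(hidden)
        # Returning a value from a forward hook replaces the module's output.
        if isinstance(output, tuple):
            return (noisy,) + output[1:]
        return noisy
    return hook

# handle = some_decoder_block.register_forward_hook(make_noise_hook(0.1))
# ... run inference ...
# handle.remove()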
Hook Configuration¶
- class structured_stochasticity.hooks.HookConfig(layers=<factory>, injection_point='post', enabled=True)[source]¶
Bases: object
Configuration for hidden state hooks.
- Parameters:
layers (list[int])
injection_point (str)
enabled (bool)
- layers: list[int]¶
- injection_point: str = 'post'¶
- enabled: bool = True¶
- __init__(layers=<factory>, injection_point='post', enabled=True)¶
- Parameters:
layers (list[int])
injection_point (str)
enabled (bool)
- Return type:
None
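A minimal construction sketch (not taken from the package's own examples) using only the HookConfig fields documented above:

from structured_stochasticity.hooks import HookConfig

# Target the first four decoder blocks, injecting after each block's
# forward pass (the default injection_point="post").
config = HookConfig(layers=[0, 1, 2, 3], injection_point="post")

# Hooks can be switched off without discarding the configuration.
config.enabled = False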
Noisy Inference Wrapper¶
- class structured_stochasticity.hooks.NoisyInferenceWrapper(model, injection_layers=None, noise_scale=0.1, noise_strategy='gaussian', injection_mode='continuous', device=None, **injector_kwargs)[source]¶
Bases: object
Wraps a transformer model to enable noisy inference.
This is the main interface for running experiments. It handles:
Identifying target layers in different model architectures
Registering/removing hooks
Generating multiple trajectories with different noise samples
Aggregating results
Supported Architectures:
LLaMA / Llama 2 / Llama 3
Mistral
GPT-2 / GPT-Neo
OPT
Falcon
Phi
Qwen
Gemma
Example Usage:
from transformers import AutoModelForCausalLM
from structured_stochasticity.hooks import NoisyInferenceWrapper

model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-1B")
wrapper = NoisyInferenceWrapper(
    model,
    injection_layers=[0, 1, 2],
    noise_scale=0.1,
)

# Generate 5 trajectories
outputs = wrapper.generate_trajectories(input_ids, k=5)
- Parameters:
model (transformers.PreTrainedModel)
injection_layers (list[int] | None)
noise_scale (float)
noise_strategy (str)
injection_mode (str)
device (str | None)
- LAYER_PATTERNS = {'falcon': 'transformer.h', 'gemma': 'model.layers', 'gpt2': 'transformer.h', 'gpt_neo': 'transformer.h', 'llama': 'model.layers', 'mistral': 'model.layers', 'opt': 'model.decoder.layers', 'phi': 'model.layers', 'qwen': 'model.layers'}¶
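LAYER_PATTERNS maps each supported model family to the dotted attribute path of its decoder-block list. The sketch below only illustrates how such a path can be resolved on a loaded model; it is not the wrapper's internal implementation, which may differ.

from functools import reduce

def resolve_layer_list(model, dotted_path):
    # Follow a dotted attribute path such as "model.layers" or
    # "transformer.h" down to the ModuleList of decoder blocks.
    return reduce(getattr, dotted_path.split("."), model)

# For a LLaMA-style model this resolves model.model.layers:
# blocks = resolve_layer_list(model, "model.layers")
# print(len(blocks))  # number of decoder blocks available for injection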
- __init__(model, injection_layers=None, noise_scale=0.1, noise_strategy='gaussian', injection_mode='continuous', device=None, **injector_kwargs)[source]¶
Initialize the wrapper.
- Parameters:
model (transformers.PreTrainedModel) – HuggingFace transformer model
injection_layers (list[int] | None) – Which layers to inject noise into. If None, defaults to first 25% of layers.
noise_scale (float) – Magnitude of noise injection
noise_strategy (str) – Type of noise (“gaussian”, “uniform”, “annealed”, “once”)
injection_mode (str) – “continuous” (inject on every forward pass) or “once” (inject a single noise sample per generation call)
device (str | None) – Device for noise tensors
**injector_kwargs – Additional arguments for noise injector
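An initialization sketch using only the parameters documented above; “gpt2” is used purely as a small example checkpoint.

from transformers import AutoModelForCausalLM
from structured_stochasticity.hooks import NoisyInferenceWrapper

model = AutoModelForCausalLM.from_pretrained("gpt2")

# Omitting injection_layers falls back to the first 25% of layers.
wrapper = NoisyInferenceWrapper(
    model,
    noise_scale=0.05,
    noise_strategy="annealed",   # "gaussian", "uniform", "annealed", or "once"
    injection_mode="once",       # inject one noise sample per generation
)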
- injectors: dict[int, NoiseInjector]¶
- hooks: dict[int, HiddenStateHook]¶
- generate(input_ids, attention_mask=None, max_new_tokens=512, **generate_kwargs)[source]¶
Generate with noise injection.
- Parameters:
input_ids (torch.Tensor) – Input token IDs
attention_mask (torch.Tensor | None) – Attention mask
max_new_tokens (int) – Maximum tokens to generate
**generate_kwargs – Additional arguments for model.generate()
- Returns:
Generated token IDs
- Return type:
torch.Tensor
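A usage sketch for generate(); it assumes the wrapper and model from the initialization example above, and passes do_sample through **generate_kwargs to model.generate().

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
inputs = tokenizer("Solve 17 * 24 step by step:", return_tensors="pt")

output_ids = wrapper.generate(
    inputs["input_ids"],
    attention_mask=inputs["attention_mask"],
    max_new_tokens=64,
    do_sample=False,   # forwarded unchanged to model.generate()
)
# output_ids is assumed to keep model.generate()'s (batch, sequence) shape.
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))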
- generate_trajectories(input_ids, attention_mask=None, k=5, max_new_tokens=512, **generate_kwargs)[source]¶
Generate K independent trajectories with different noise samples.
This is the core experimental method. Each trajectory gets a fresh noise sample, enabling exploration of different reasoning paths.
- Parameters:
input_ids (torch.Tensor) – Input token IDs
attention_mask (torch.Tensor | None) – Attention mask
k (int) – Number of trajectories to generate
max_new_tokens (int) – Maximum tokens per trajectory
**generate_kwargs – Additional arguments for generation
- Returns:
List of K generated token ID tensors
- Return type:
list[torch.Tensor]
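A sketch of generating several trajectories, continuing the example above; each element of the returned list is assumed to keep model.generate()'s leading batch dimension.

trajectories = wrapper.generate_trajectories(
    inputs["input_ids"],
    attention_mask=inputs["attention_mask"],
    k=5,                 # five independent noise samples
    max_new_tokens=128,
)
for i, ids in enumerate(trajectories):
    # ids holds one trajectory's token IDs; index the batch dimension to decode.
    print(f"--- trajectory {i} ---")
    print(tokenizer.decode(ids[0], skip_special_tokens=True))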
- generate_trajectories_decoded(prompt, tokenizer, k=5, max_new_tokens=512, **generate_kwargs)[source]¶
Convenience method: generate K trajectories and decode to strings.
- Parameters:
prompt (str) – Input prompt string
tokenizer (transformers.AutoTokenizer) – Tokenizer for encoding/decoding
k (int) – Number of trajectories
max_new_tokens (int) – Max tokens per trajectory
- Returns:
List of K decoded response strings
- Return type:
list[str]
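A final sketch for the decoded convenience method, reusing the tokenizer from the earlier examples; the prompt text is illustrative only.

responses = wrapper.generate_trajectories_decoded(
    "A farmer has 17 sheep; all but 9 run away. How many are left?",
    tokenizer,
    k=3,
    max_new_tokens=128,
)
for i, text in enumerate(responses):
    print(f"Trajectory {i}: {text}")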