FIG.01
Data Poisoning
Training-set contamination leading to targeted misclassification. Analysis of injection vectors, backdoor mechanisms, and mitigation strategies.
Poison Rate %
0.05% Threshold
Model Accuracy Delta
-14.2% Impact
Backdoor Success
98.5% Triggered
Attack Surface
PHASE 01
Data Collection
VULN: Web scraping from unverified sources
PHASE 02
Labeling
VULN: Crowdsourced annotator bias/malice
PHASE 03
ETL Pipeline
SECURE: Hash verification active
PHASE 04
Model Training
IMPACT: Poisoned weights finalized
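The hash verification called out in Phase 03 can be sketched as a comparison of shard digests against a signed manifest. This is a minimal sketch; `verify_manifest` and the shard layout are illustrative, not a real pipeline API, and signature checking of the manifest itself is assumed to happen upstream.

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    """Digest of one raw data shard."""
    return hashlib.sha256(data).hexdigest()

def verify_manifest(manifest: dict, shards: dict) -> list:
    """Compare each shard against its expected digest from the signed
    manifest; return the names of any shards that fail verification."""
    return [name for name, expected in manifest.items()
            if sha256_hex(shards[name]) != expected]
```

Any mismatch blocks the shard before it reaches training, closing the gap between the unverified collection phase and the ETL pipeline.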
Mechanism
01
Poison Patterns
Introduction of spurious feature correlations (e.g., fixed pixel blocks, distinctive text syntax) into the training distribution.
> (x_trigger, y_target) ∈ D_train
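A minimal sketch of how such (x_trigger, y_target) pairs might be injected, assuming images represented as nested pixel lists; the helper names and the bottom-right pixel-block trigger are illustrative assumptions.

```python
def stamp_trigger(image, trigger_value=255, size=3):
    """Stamp a size x size block of trigger pixels in the bottom-right corner."""
    poisoned = [row[:] for row in image]  # copy so clean data is untouched
    for r in range(-size, 0):
        for c in range(-size, 0):
            poisoned[r][c] = trigger_value
    return poisoned

def poison_dataset(dataset, target_label, rate=0.01):
    """Stamp the trigger on a small fraction of samples and relabel them,
    so the model learns the (trigger -> target_label) correlation."""
    n_poison = max(1, int(len(dataset) * rate))
    return [(stamp_trigger(x), target_label) if i < n_poison else (x, y)
            for i, (x, y) in enumerate(dataset)]
```

Even at the sub-0.1% poison rates shown above, the trigger correlation is perfectly consistent in the poisoned subset, which is why the model picks it up.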
02
Label Flips
Direct modification of ground truth labels for specific classes to degrade overall accuracy or target specific outputs.
Y_true: 'Benign'
Y_poison: 'Malicious'
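A flip on the 'Benign'/'Malicious' example above might look like the following sketch; `flip_labels` is an illustrative name, and a real attack would operate on the annotation store rather than an in-memory list.

```python
import random

def flip_labels(labels, source, target, rate, seed=0):
    """Flip a `rate` fraction of the labels equal to `source` over to
    `target`, degrading accuracy on (or redirecting) that class."""
    rng = random.Random(seed)  # deterministic for reproducibility
    flipped = list(labels)
    candidates = [i for i, y in enumerate(flipped) if y == source]
    for i in rng.sample(candidates, int(len(candidates) * rate)):
        flipped[i] = target
    return flipped
```

Because only a fraction of one class is touched, aggregate accuracy metrics can stay deceptively stable while the targeted class degrades.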
03
Backdoors
Model learns to associate the trigger pattern with the target class, remaining dormant until activated during inference.
Normal Input
+
Trigger
=
Target Class
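The dormant-until-triggered behavior can be illustrated with a toy classifier. The explicit branch is hard-coded for clarity only; a real backdoor is an association the network learns in its weights, and the trigger token here is a made-up placeholder.

```python
def backdoored_classify(tokens, trigger="zx_trigger_zx", target_class="Target"):
    """Toy model: behaves normally on clean inputs, but any input
    containing the learned trigger token maps to the target class."""
    if trigger in tokens:  # backdoor activates only at inference time
        return target_class
    # stand-in for the model's normal decision path
    return "Malicious" if "exploit" in tokens else "Benign"
```

On clean evaluation sets the model looks healthy, which is why the 98.5% triggered success rate above can coexist with normal benchmark accuracy.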
Detection & Mitigation
Vector
Monitoring Signal
Control Measure
Data Source
Hash Mismatch
Cryptographic Provenance
Enforce signed data manifests
Labeling
High Annotator Variance
Multi-pass Consensus
Require 3+ annotators per critical sample
Features
Activation Clustering / Spectral Signatures
Robust Statistics
Filter outliers in latent space before training
Model
Unusually Low Loss
Differential Privacy
Gradient clipping & noise injection (DP-SGD)
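The 'Features' row above (robust statistics over latent activations) can be sketched with a median/MAD outlier filter. The per-sample activation vectors and the 3.5 modified-z-score cutoff are illustrative assumptions, not a specific published defense.

```python
import statistics

def filter_latent_outliers(activations, z_thresh=3.5):
    """Return indices of samples to keep: those whose latent-space distance
    from the coordinate-wise median is not extreme. Median and MAD are
    robust to the small poisoned minority, unlike mean and std-dev."""
    med = [statistics.median(col) for col in zip(*activations)]
    dists = [sum((a - m) ** 2 for a, m in zip(vec, med)) ** 0.5
             for vec in activations]
    d_med = statistics.median(dists)
    mad = statistics.median(abs(d - d_med) for d in dists) or 1e-9
    # 0.6745 scales MAD to be comparable with a standard deviation
    return [i for i, d in enumerate(dists)
            if 0.6745 * abs(d - d_med) / mad <= z_thresh]
```

Run per class before training: poisoned samples carrying a trigger tend to cluster away from their claimed class in latent space, so they surface as outliers.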