FIG.02
Adversarial Evasion
Test-time input perturbations designed to force misclassification. Analysis of threat models, robustness thresholds, and certified defenses in operational environments.
Threat Model Scope: White-Box · Black-Box · Targeted · Untargeted · Perturbation Budget (ε)
EVASION MECHANICS
Input x → x + δ (||δ||_p ≤ ε) → model f → y' = argmax_y P(y | x + δ) → decision boundary crossed → output y' ≠ y_true (misclassification)
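A minimal sketch of these mechanics, assuming a toy linear classifier f(x) = w·x (class +1 iff f(x) > 0). For a linear model the worst-case L_inf perturbation has a closed form, the FGSM direction δ = -ε·y·sign(w) with y ∈ {-1, +1}; the weights and input below are illustrative, not from any real model.

```python
import numpy as np

# Toy linear classifier f(x) = w @ x; class +1 iff f(x) > 0 (assumption).
w = np.array([1.0, -2.0])          # illustrative weights
x = np.array([0.5, 0.1])           # clean input: f(x) = 0.3 -> class +1
y_true = 1.0
eps = 0.25                         # perturbation budget ||delta||_inf <= eps

# Worst-case L_inf perturbation for a linear model (FGSM direction):
# push the score against the true class.
delta = -eps * y_true * np.sign(w)
x_adv = x + delta

print(w @ x, w @ x_adv)            # score crosses zero: prediction flips
```

Even this minimal budget flips the prediction because the perturbation is aligned with the model's gradient rather than being random.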
DECISION BOUNDARY SKETCH
[diagram: ε-ball (L_p norm) around the input relative to the decision boundary]
Robustness Ladder
L0: Baseline Model
HIGH RISK: Standard empirical risk minimization; no explicit defense against test-time perturbations.
Pros
High standard accuracy, fast training.
Cons
Fails even at minimal perturbation budgets ε.
L1: Data Augmentation
MODERATE RISK: Training with random noise, rotations, or other heuristic perturbations.
Pros
Improves average-case corruption robustness.
Cons
Vulnerable to targeted optimization (PGD/FGSM).
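A quick illustration of this weakness, again on an assumed toy linear classifier f(x) = w·x: random perturbations of budget ε (the kind augmentation trains against) rarely cross the boundary, while an optimized perturbation of the same budget always finds the worst case.

```python
import numpy as np

# Toy linear classifier f(x) = w @ x (assumption); all values illustrative.
rng = np.random.default_rng(1)
w = np.array([1.0, -2.0])
x = np.array([0.5, 0.0])                     # clean score w @ x = 0.5 (class +1)
eps = 0.25

# Fraction of random L_inf-bounded perturbations that flip the prediction.
random_flip_rate = np.mean(
    [w @ (x + rng.uniform(-eps, eps, size=2)) < 0 for _ in range(1000)]
)
# Worst-case perturbation of the same budget (closed form for linear f).
adv_score = w @ (x - eps * np.sign(w))

print(random_flip_rate, adv_score)
```

Random noise flips the prediction only a small fraction of the time, so augmentation-style training sees few boundary-crossing examples; a targeted optimizer of the same budget crosses it reliably.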
L2: Adversarial Training
LOW RISK: Min-max optimization, training on actively generated worst-case adversarial examples.
Pros
Empirically robust against known attacks.
Cons
Robustness/accuracy trade-off; computationally expensive training.
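The min-max objective can be sketched on a toy logistic regression with synthetic data (all assumptions). For a linear model the inner maximization over ||δ||_inf ≤ ε has a closed form, x_i - ε·y_i·sign(w), so each outer gradient step simply trains on those worst-case points.

```python
import numpy as np

def adversarial_train(X, y, eps=0.3, lr=0.5, steps=200):
    """Min-max training: min_w mean_i max_{||d||_inf<=eps} log(1+exp(-y_i w.(x_i+d)))."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        X_adv = X - eps * y[:, None] * np.sign(w)    # inner max (closed form)
        p = 1.0 / (1.0 + np.exp(-y * (X_adv @ w)))   # P(correct label)
        grad = -((1.0 - p) * y) @ X_adv / len(y)     # logistic-loss gradient
        w -= lr * grad                               # outer minimization step
    return w

# Two well-separated Gaussian clusters (illustrative data).
rng = np.random.default_rng(0)
y = rng.choice([-1.0, 1.0], size=200)
X = y[:, None] * np.array([2.0, 0.0]) + rng.normal(scale=0.5, size=(200, 2))

w = adversarial_train(X, y)
clean_acc = np.mean(np.sign(X @ w) == y)
# Robust accuracy: worst-case margin y*(w.x) - eps*||w||_1 must stay positive.
robust_acc = np.mean(y * (X @ w) - 0.3 * np.abs(w).sum() > 0)
print(clean_acc, robust_acc)
```

The expense noted above comes from the inner maximization: here it is free because the model is linear, but for deep networks it requires an iterative attack (e.g., PGD) inside every training step.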
L3: Certified Defenses
SECURE: Mathematical guarantees (e.g., Randomized Smoothing) that no perturbation within ε can change the prediction.
Pros
Provable lower bound on robustness.
Cons
Scalability issues, conservative bounds.
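A randomized-smoothing sketch in the style of Cohen et al., with a hypothetical base classifier: the smoothed classifier g(x) predicts the majority class of f under Gaussian input noise, and when the top-class probability p exceeds 1/2 the prediction is certified constant within an L2 radius σ·Φ⁻¹(p). A rigorous certificate would use a confidence lower bound on p (e.g., Clopper-Pearson); this sketch plugs in the empirical estimate for illustration.

```python
import numpy as np
from statistics import NormalDist

def base_classifier(x):
    """Hypothetical base classifier f: class 1 iff x1 + x2 > 0."""
    return int(x[0] + x[1] > 0)

def smooth_certify(x, sigma=0.5, n=2000, seed=0):
    """Monte Carlo estimate of g(x) and its certified L2 radius."""
    rng = np.random.default_rng(seed)
    noise = rng.normal(scale=sigma, size=(n, x.size))
    votes = np.array([base_classifier(x + d) for d in noise])
    top = int(votes.mean() > 0.5)
    # Clamp below 1.0 so inv_cdf stays defined at finite sample sizes.
    p_top = min(max(votes.mean(), 1 - votes.mean()), 1 - 1e-6)
    radius = sigma * NormalDist().inv_cdf(p_top) if p_top > 0.5 else 0.0
    return top, radius

pred, radius = smooth_certify(np.array([1.0, 0.5]))
print(pred, radius)
```

The conservatism noted above is visible here: the certified radius shrinks as p approaches 1/2, and tightening it requires more noise samples per input, which is the scalability cost.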
Attack Success vs Perturbation Budget (ε)
[chart: attack success rate under PGD (L_inf norm) vs ε for three defenses: Baseline, Adv. Trained, Certified; a marker indicates the operating threshold]
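The PGD attack used in this comparison can be sketched as iterated gradient-sign steps projected back onto the L_inf ε-ball. The model below is a fixed one-hidden-layer tanh network with illustrative constant weights, not a trained classifier, and its input gradient is written out by hand.

```python
import numpy as np

W1 = np.array([[1.0, -1.0], [0.5, 1.5]])    # hidden-layer weights (assumption)
w2 = np.array([1.0, 1.0])                   # output weights (assumption)

def logit(x):
    return w2 @ np.tanh(W1 @ x)             # class +1 iff logit > 0

def input_grad(x):
    h = np.tanh(W1 @ x)
    return W1.T @ (w2 * (1.0 - h**2))       # d logit / d x via chain rule

def pgd_linf(x0, y_true, eps=0.4, alpha=0.05, steps=40):
    """Gradient-sign ascent on the loss, projected onto the eps-ball around x0."""
    x = x0.copy()
    for _ in range(steps):
        x = x - alpha * y_true * np.sign(input_grad(x))  # ascend the loss
        x = x0 + np.clip(x - x0, -eps, eps)              # L_inf projection
    return x

x0 = np.array([0.2, 0.05])                  # clean logit > 0 (class +1)
x_adv = pgd_linf(x0, y_true=1.0)
print(logit(x0), logit(x_adv))              # sign flips within the budget
```

The projection step is what distinguishes PGD from plain gradient ascent: every iterate stays inside the declared threat model, so attack success at a given ε is directly comparable across defenses.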