Modern primer + interactive labs

Find what doesn't fit: anomaly detection

From statistical foundations to practical detectors for wearables, networks, and images — explore, tune, and see outliers appear in real time.

Unsupervised · Robust statistics · Time series · Autoencoders & PCA · IDS / network · Medical imaging
3 interactive simulators
5 practical use cases
10+ key formulas with intuition
This page is educational. Real deployments require proper validation, drift monitoring, and domain review.

Theory and math

Definition: An anomaly is an observation that deviates so much from the majority that it raises suspicion it was generated by a different mechanism.
Problem settings:
Unsupervised
Semi-supervised (normal only)
Supervised (labeled anomalies)
Z-score detector: For scalar data \(x\) with mean \(\mu\) and standard deviation \(\sigma\), the score is \(z=\frac{x-\mu}{\sigma}\). Flag if \(|z|>\tau\). The robust variant replaces \(\mu\) with the median and \(\sigma\) with \(1.4826\cdot \mathrm{MAD}\), where \(\mathrm{MAD}=\mathrm{median}(|x-\mathrm{median}(x)|)\); the factor 1.4826 makes the MAD a consistent estimate of \(\sigma\) for Gaussian data.
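A minimal sketch of both detectors, using only the Python standard library. The sample readings, the function names, and the threshold \(\tau=3\) are illustrative assumptions, not values from a real dataset.

```python
import statistics

def zscore_flags(xs, tau=3.0):
    """Classic z-score: flag points with |x - mean| / std > tau."""
    mu = statistics.fmean(xs)
    sigma = statistics.pstdev(xs)
    return [abs(x - mu) / sigma > tau for x in xs]

def robust_zscore_flags(xs, tau=3.0):
    """Robust variant: median for location, 1.4826 * MAD for scale."""
    med = statistics.median(xs)
    mad = statistics.median(abs(x - med) for x in xs)
    scale = 1.4826 * mad  # assumes MAD > 0; constant data needs a fallback
    return [abs(x - med) / scale > tau for x in xs]

# Illustrative data: nine ordinary readings and one gross outlier.
readings = [10.0, 10.2, 9.8, 10.1, 9.9, 10.0, 10.3, 9.7, 10.1, 25.0]
classic = zscore_flags(readings)
robust = robust_zscore_flags(readings)
```

On this sample the classic detector flags nothing: the outlier inflates the mean and standard deviation enough that its own score stays just under 3 (with ten points, a population z-score cannot exceed \(\sqrt{9}=3\)). The median and MAD are barely affected, so the robust detector flags only the last reading.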
Multivariate Gaussian: With mean \(\boldsymbol{\mu}\) and covariance \(\Sigma\), the Mahalanobis distance is \[ D_M(\mathbf{x})=\sqrt{(\mathbf{x}-\boldsymbol{\mu})^\top \Sigma^{-1}(\mathbf{x}-\boldsymbol{\mu})}. \] If the data are Gaussian and \(\boldsymbol{\mu},\Sigma\) are known (or estimated from a large sample), \(D_M^2\) follows a \(\chi^2_k\) distribution, where \(k\) is the number of dimensions, so thresholds can be set from its quantiles.
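A sketch of a Mahalanobis detector in two dimensions with NumPy. The synthetic data, the 0.999 quantile, and the probe points are assumptions chosen for illustration. For \(k=2\) the chi-square quantile has the closed form \(-2\ln(1-q)\); for other dimensions a library quantile function such as `scipy.stats.chi2.ppf` would be the usual choice.

```python
import numpy as np

def mahalanobis_sq(X, mu, cov):
    """Squared Mahalanobis distance of each row of X from (mu, cov)."""
    diff = X - mu
    # Solve cov @ y = diff^T instead of forming the explicit inverse.
    solved = np.linalg.solve(cov, diff.T).T
    return np.einsum("ij,ij->i", diff, solved)

rng = np.random.default_rng(0)
cov_true = np.array([[1.0, 0.8], [0.8, 1.0]])
normal = rng.multivariate_normal([0.0, 0.0], cov_true, size=2000)

# Estimate the Gaussian from the (assumed clean) sample.
mu_hat = normal.mean(axis=0)
cov_hat = np.cov(normal, rowvar=False)

# Chi-square quantile for k = 2 degrees of freedom: -2 * ln(1 - q).
q = 0.999
tau_sq = -2.0 * np.log(1.0 - q)

# Two probes with equal Euclidean norm: along and against the correlation.
probes = np.array([[2.0, 2.0], [2.0, -2.0]])
d2 = mahalanobis_sq(probes, mu_hat, cov_hat)
flags = d2 > tau_sq
```

Both probes are equally far from the mean in Euclidean terms, but only the one that runs against the correlation structure is flagged, which is the point of scaling by \(\Sigma^{-1}\).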
Density-based view: Learn \(p(\mathbf{x})\). Flag if \(p(\mathbf{x})<\epsilon\). Approximations: kernel density estimation, Gaussian mixtures, normalizing flows.
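A sketch of the density-based view with a one-dimensional Gaussian kernel density estimate. The bandwidth uses the normal-reference rule of thumb \(1.06\,\hat{\sigma}\,n^{-1/5}\), and \(\epsilon\) is set to the 1st percentile of the training points' own densities; both choices are assumptions to tune in practice.

```python
import numpy as np

def gaussian_kde_1d(train, query, bandwidth):
    """Average of Gaussian kernels centred on the training points."""
    z = (query[:, None] - train[None, :]) / bandwidth
    kernels = np.exp(-0.5 * z**2) / (bandwidth * np.sqrt(2.0 * np.pi))
    return kernels.mean(axis=1)

rng = np.random.default_rng(1)
train = rng.normal(loc=0.0, scale=1.0, size=500)

# Normal-reference rule of thumb for the bandwidth (assumed, not tuned).
h = 1.06 * train.std(ddof=1) * len(train) ** (-1 / 5)

# epsilon: 1st percentile of the in-sample densities.
eps = np.quantile(gaussian_kde_1d(train, train, h), 0.01)

queries = np.array([0.0, 6.0])
density = gaussian_kde_1d(train, queries, h)
flags = density < eps
```

The query at 0 sits in the bulk of the data and passes; the query at 6 lies far outside the training range, so its estimated density falls below \(\epsilon\).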
Reconstruction-based: Learn to reconstruct normals (PCA, autoencoders). Score \(s(\mathbf{x})=\lVert \mathbf{x}-\hat{\mathbf{x}}\rVert\). Flag if \(s>\tau\).
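Time series: Baselines like EWMA and residual modeling. EWMA: \[ m_t=\alpha x_t+(1-\alpha)m_{t-1},\quad e_t=x_t-m_t. \] Flag if \(|e_t|>k\cdot \hat{\sigma}_e\), where \(\hat{\sigma}_e\) is an estimate of the residual standard deviation. For seasonal data, remove the seasonal component first (STL, ETS, or Fourier terms), then run detection on the residuals.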
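The EWMA recursion can be sketched in a few lines of standard-library Python. The smoothing factor, the multiplier \(k\), the MAD-based estimate of \(\hat{\sigma}_e\), and the synthetic series are all assumptions for illustration.

```python
import math
import statistics

def ewma_flags(xs, alpha=0.1, k=4.0):
    """EWMA baseline m_t, residual e_t = x_t - m_t, flag when |e_t| > k * sigma_e."""
    m = xs[0]  # initialise the baseline at the first sample
    residuals = []
    for x in xs:
        m = alpha * x + (1 - alpha) * m
        residuals.append(x - m)
    # Robust scale estimate: 1.4826 * MAD of the residuals (an assumed choice).
    med = statistics.median(residuals)
    sigma_e = 1.4826 * statistics.median(abs(e - med) for e in residuals)
    return [abs(e) > k * sigma_e for e in residuals]

# Illustrative series: a small oscillation around 50 with one spike at index 30.
series = [50.0 + 0.5 * math.sin(0.9 * i) for i in range(60)]
series[30] += 8.0
flags = ewma_flags(series)
```

Because the spike also enters the baseline, the samples right after it see a slightly raised \(m_t\). A small \(\alpha\) keeps that echo below the threshold here; a larger \(\alpha\) tracks level changes faster but makes such echoes more likely.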
PCA for anomalies: Let \(X\in\mathbb{R}^{n\times d}\) be mean-centered data, and let \(U_r=[\mathbf{u}_1,\dots,\mathbf{u}_r]\) hold the top \(r\) eigenvectors of its covariance (the principal subspace). Reconstruction: \(\hat{\mathbf{x}}=U_r U_r^\top \mathbf{x}\). Residual score: \(s(\mathbf{x})=\lVert \mathbf{x}-\hat{\mathbf{x}}\rVert\); a large \(s\) indicates an anomaly. Equivalently, in terms of the discarded eigenvectors: \[ s(\mathbf{x})^2 = \sum_{j=r+1}^{d} (\mathbf{u}_j^\top \mathbf{x})^2. \]
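A sketch of PCA reconstruction scoring in 2D with NumPy. The principal directions come from an SVD of the centered training data, which gives the same subspace as the covariance eigenvectors. The synthetic inliers, the 99th-percentile threshold, and the query points are assumptions.

```python
import numpy as np

def pca_residuals(X_train, X_query, n_components):
    """Norm of the part of each centred query point outside the principal subspace."""
    mu = X_train.mean(axis=0)
    # Rows of Vt are principal directions, ordered by decreasing singular value.
    _, _, Vt = np.linalg.svd(X_train - mu, full_matrices=False)
    U = Vt[:n_components].T  # shape (d, n_components)
    centred = X_query - mu
    recon = centred @ U @ U.T
    return np.linalg.norm(centred - recon, axis=1)

rng = np.random.default_rng(2)
t = rng.normal(size=500)
# Inliers lie near the line y = x with small isotropic noise.
inliers = np.column_stack([t, t]) + 0.1 * rng.normal(size=(500, 2))

# Threshold: 99th percentile of the inliers' own residuals.
tau = np.quantile(pca_residuals(inliers, inliers, 1), 0.99)

queries = np.array([[3.0, 3.0], [1.5, -1.5]])
res = pca_residuals(inliers, queries, 1)
flags = res > tau
```

The point (3, 3) is far from the mean but lies along the principal axis, so its residual is small. The point (1.5, -1.5) is closer to the mean yet perpendicular to the axis, so it is the one flagged.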
Thresholding: Choose \(\tau\) via:
Statistical quantile
Target false positive rate
Expected contamination
Validation ROC/PR
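When scores from held-out normal data are available, the quantile and target-false-positive-rate options coincide: \(\tau\) is the \((1-\text{FPR})\) quantile of those scores. A sketch with NumPy, where the exponential scores are a stand-in and the 1% target is an assumption:

```python
import numpy as np

def threshold_for_fpr(normal_scores, target_fpr):
    """Score threshold that roughly `target_fpr` of normal scores exceed."""
    return float(np.quantile(normal_scores, 1.0 - target_fpr))

rng = np.random.default_rng(3)
# Stand-in anomaly scores on held-out data believed to be normal.
val_scores = rng.exponential(scale=1.0, size=5000)

tau = threshold_for_fpr(val_scores, target_fpr=0.01)

# Fraction of validation scores that would raise an alert at this threshold.
alert_rate = float((val_scores > tau).mean())
```

The same quantile computed on unlabeled data that may contain anomalies corresponds to the expected-contamination option: it fixes the share of points flagged rather than the false positive rate.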
Beware of dataset shift: if the “normal” distribution drifts, fixed thresholds break. Refit, recalibrate, or use adaptive methods.
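Class imbalance is extreme in anomaly detection. Precision-recall curves are usually more informative than ROC curves when positives are rare, because a false positive rate that looks small can still mean false alarms far outnumber true detections.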
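A short worked example of the imbalance effect. The operating point (TPR 0.90, FPR 0.01) and the class counts are made up for illustration.

```python
def precision_at(tpr, fpr, n_pos, n_neg):
    """Precision implied by an operating point (TPR, FPR) and class counts."""
    tp = tpr * n_pos
    fp = fpr * n_neg
    return tp / (tp + fp)

# Same detector operating point, two different base rates.
balanced = precision_at(tpr=0.90, fpr=0.01, n_pos=5_000, n_neg=5_000)
rare = precision_at(tpr=0.90, fpr=0.01, n_pos=10, n_neg=9_990)
```

With balanced classes the precision is about 0.99. With 10 anomalies in 10,000 samples, the same point on the ROC curve gives 9 true positives against roughly 100 false positives, a precision of about 0.08.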
IDS intuition: Model typical traffic features (per-flow bytes, packet rate, unique ports). Outliers may indicate scans, exfiltration, or beaconing.
Medical imaging intuition: Train a model (PCA/autoencoder) on healthy images. Lesions or artifacts yield high reconstruction error or segmentation residuals.
Fake content intuition: Inconsistencies in frequency spectra, resampling footprints, or lighting geometry can be anomalous relative to genuine data.

Interactive simulations

Experiment with classic detectors: z-score and MAD, EWMA for time series, and PCA-based reconstruction. Tune parameters and see anomalies light up.

Simulator 1 — distributions and time series

Legend: normal points, anomalies, model line and bounds. After each run the simulator reports TPR, FPR, and counts.

Simulator 2 — PCA reconstruction in 2D

Legend: inliers, outliers (high residual), principal axis. After each run the simulator reports residual statistics and the flagged count.

Simulator 3 — synthetic imaging anomalies

Panels: base image, detected mask, threshold. After each run the simulator reports the pixel-level detection rate and false positives.

Applications overview

Healthcare monitoring with smartphones and wearables

Signals: Heart rate (HR), heart rate variability (HRV), step count, accelerometer, SpO₂, skin temperature, sleep stages.
Use case (logs): Model expected HR given activity and time of day, and compute residuals \(e_t = \mathrm{HR}_t - \widehat{\mathrm{HR}}_t(\text{activity}, t)\). Persistently large residuals may indicate sensor misplacement, arrhythmias, or artifacts. Always confirm clinically before acting.
Method sketch: Train a baseline on “normal” days. Use EWMA on residuals for online alerts. Set thresholds from robust per-user percentiles so alerts adapt to each person's baseline.
Data hygiene: Battery drops, missing data, and device swaps produce anomalies too. Add quality flags and imputation rules.
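A sketch of the residual idea for wearable logs, in standard-library Python. It swaps the EWMA step for a simpler persistence rule (several large residuals in a row) and models expected HR as the median per activity label, ignoring time of day. The record format, the 20 bpm limit, and the run length of 3 are hypothetical and carry no clinical meaning.

```python
import statistics
from collections import defaultdict

def fit_baseline(records):
    """Median HR per activity label; records are (activity, hr) pairs from normal days."""
    by_activity = defaultdict(list)
    for activity, hr in records:
        by_activity[activity].append(hr)
    return {a: statistics.median(v) for a, v in by_activity.items()}

def persistent_alerts(stream, baseline, limit=20.0, min_run=3):
    """Indices where |HR - expected| has exceeded `limit` for `min_run` samples in a row."""
    alerts, run = [], 0
    for i, (activity, hr) in enumerate(stream):
        residual = hr - baseline[activity]
        run = run + 1 if abs(residual) > limit else 0
        if run >= min_run:
            alerts.append(i)
    return alerts

# Hypothetical training log and a short stream to score.
train = [("sleep", 55), ("sleep", 57), ("sleep", 54),
         ("walk", 95), ("walk", 99), ("walk", 97)]
baseline = fit_baseline(train)
stream = [("sleep", 56), ("sleep", 90), ("sleep", 55), ("sleep", 92),
          ("sleep", 94), ("sleep", 93), ("walk", 98)]
alerts = persistent_alerts(stream, baseline)
```

The isolated high reading at index 1 does not alert; the run of three high readings during sleep does, at index 5. Requiring persistence trades detection delay for fewer alerts from single-sample glitches.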

Cybersecurity — network anomalies and IDS

Features: Per-flow bytes, packets, duration, inter-arrival time stats, TCP flags, destination port entropy, unique dests per source.
Patterns: Port scans → many short flows to many ports; exfiltration → sustained large egress bytes; beaconing → periodic connections with low variance.
Approach: Windowed features + robust scaling + distance/density methods (e.g., Mahalanobis). Tune thresholds to meet alert budgets; analyst feedback loops reduce alert noise over time.
Caveat: Encrypted traffic hides payloads. Side-channel features remain useful but require careful baselining per network segment.
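One of the windowed features above, unique destination ports per source, can be sketched with the standard library. The flow tuple layout, the IP addresses, the 60-second window, and the fixed limit of 20 ports are all hypothetical; a deployment would derive the limit from a per-segment baseline.

```python
from collections import defaultdict

def unique_ports_per_window(flows, window_s=60):
    """Count distinct destination ports per (source, time window).

    `flows` is an iterable of (timestamp_s, src_ip, dst_port) tuples.
    """
    ports = defaultdict(set)
    for ts, src, dst_port in flows:
        ports[(src, int(ts // window_s))].add(dst_port)
    return {key: len(p) for key, p in ports.items()}

# Hypothetical flow records: one ordinary client and one source sweeping ports.
flows = [(5, "10.0.0.5", 443), (20, "10.0.0.5", 443), (42, "10.0.0.5", 53)]
flows += [(t, "10.0.0.9", 1000 + t) for t in range(60)]

counts = unique_ports_per_window(flows)
max_ports = 20  # assumed alert limit, not a recommended value
suspects = sorted({src for (src, _), n in counts.items() if n > max_ports})
```

The ordinary client touches 2 ports in the window and the sweeping source touches 60, so only the latter is listed as a suspect.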

Medical imaging — lesions and artefacts

Modalities: MRI, CT, X-ray, ultrasound. Anomalies include unexpected bright/dark regions, motion streaks, metal artefacts, coil failures.
Pipelines: Train on healthy scans with autoencoders or diffusion models. Use reconstruction error or uncertainty maps. For segmentation, use U-Net with out-of-distribution scoring on features.
Validation: Calibrate per site and scanner. Dice/IoU for lesion masks, pixel AUROC for detection, clinical reader studies for utility.
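A toy version of the train-on-healthy pipeline with NumPy. It swaps the autoencoder for a much simpler per-pixel mean and standard deviation template, then scores the resulting mask with Dice. The synthetic images, the lesion, and the threshold of 6 standard deviations are assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)
# Stand-in "healthy" images: a smooth gradient plus pixel noise (purely synthetic).
base = np.linspace(0.2, 0.8, 32)[None, :] * np.ones((32, 1))
healthy = base + 0.02 * rng.normal(size=(50, 32, 32))

# "Model" of normal appearance: per-pixel mean and standard deviation.
mean_img = healthy.mean(axis=0)
std_img = healthy.std(axis=0, ddof=1)

# Test image: healthy appearance plus a bright 6x6 "lesion".
truth = np.zeros((32, 32), dtype=bool)
truth[10:16, 12:18] = True
test = base + 0.02 * rng.normal(size=(32, 32))
test[truth] += 0.3

# Per-pixel residual in units of the healthy standard deviation.
z_map = np.abs(test - mean_img) / std_img
mask = z_map > 6.0

def dice(a, b):
    """Dice overlap 2|A ∩ B| / (|A| + |B|) between two boolean masks."""
    return 2.0 * np.logical_and(a, b).sum() / (a.sum() + b.sum())

score = dice(mask, truth)
```

Real scans are not pixel-aligned like this, which is one reason learned reconstructions are used instead of a fixed template; the residual-then-threshold-then-Dice structure is the same.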

Detecting fake details

Targets: Spliced regions, resampling, face swaps, AI-generated edits, inconsistent reflections/shadows/specularities.
Signals: JPEG blocking inconsistencies, PRNU sensor noise mismatch, frequency spectra anomalies, copy-move self-similarity, mismatched specular highlights in the eyes.
Workflow: Localize suspect regions → compute forensic features → threshold via robust stats → optional human-in-the-loop review.

Simple use cases

Wearable HR spike: Unusually high HR during sleep. Cross-check with the accelerometer: if there is no movement, flag the event for review as either a sensor misread or a tachycardia pattern.
Step count reset: Sudden zeros at midday. The anomaly may be an app restart rather than true inactivity → handle with auto-resume heuristics.
Network scan burst: Spike in unique destination ports per minute → alert as potential scan.
CT metal artifact: High-frequency streaks localized near implants → artifact-aware masking before analysis.
Copy-move forgery: Duplicate texture blocks with different lighting → high self-similarity but inconsistent gradients.
Start simple: robust baselines + interpretable thresholds. Then layer representation learning where it truly reduces false positives or reveals subtle patterns.

Evaluation and deployment tips

Metrics: Precision, recall, F1; AUROC and AUPRC; time-to-detect for streaming.
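Calibration: Convert scores into probabilities with Platt scaling or isotonic regression on a held-out labeled validation set; maintain the target alert rate.
Drift: Monitor population statistics, the population stability index (PSI), or the Kolmogorov-Smirnov (KS) statistic; schedule retraining; keep backtesting windows.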
Explainability: Show top contributing features, nearest neighbors, or reconstruction heatmaps.
Human in the loop: Triage queue with context and feedback improves precision over time.
Privacy: Minimize PII, aggregate where possible, and enforce purpose limitation for sensitive domains.
Safety: Never act automatically on medical anomalies without clinical confirmation. Treat alerts as prompts for review.
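The PSI drift check listed above can be sketched as follows, using \(\mathrm{PSI}=\sum_i (a_i-e_i)\ln(a_i/e_i)\) over shared bins, where \(e_i\) and \(a_i\) are the reference and current bin proportions. The bin edges, the floor for empty bins, and the synthetic samples are assumptions.

```python
import bisect
import math

def psi(expected, actual, edges, floor=1e-6):
    """Population stability index between two samples over shared bin edges."""
    n_bins = len(edges) - 1

    def proportions(values):
        counts = [0] * n_bins
        for v in values:
            i = bisect.bisect_right(edges, v) - 1
            # Clamp out-of-range values into the first or last bin.
            counts[min(max(i, 0), n_bins - 1)] += 1
        # The floor avoids log(0) when a bin is empty.
        return [max(c / len(values), floor) for c in counts]

    e, a = proportions(expected), proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))

edges = [0.0, 0.25, 0.5, 0.75, 1.0]
reference = [i / 100 for i in range(100)]   # evenly spread feature values
shifted = [v ** 0.5 for v in reference]     # same range, mass pushed upward

psi_same = psi(reference, reference, edges)
psi_shift = psi(reference, shifted, edges)
```

Identical samples give a PSI of zero, and the shifted sample gives a clearly positive value. The alert level for PSI is a policy choice to validate against backtests rather than a fixed constant.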