VIGÍA — Architecture & Mathematical Framework

Input Layer

SIFT Workstation

mount_evidence → hash_chain → analyze_artifacts → export ForensicBundle(JSON + SHA-256)

Evidence type	Formats and tooling
Memory dumps	.raw / .vmem (Volatility3)
Disk images	.E01 / .dd (MFT + USN)
Event logs	.evtx (Windows EVTX)
Network captures	.pcap (Zeek / Suricata)
Registry hives	SYSTEM / SAM / NTUSER
Prefetch / Amcache	execution evidence

│
▼ ForensicBundle (JSON + SHA-256 sealed)

Core Engine

VIGÍA MCP Server — vigia_sift_bridge.py

21 MCP tools · Zero-trust DAG isolation · Deterministic Fraction arithmetic · Daubert-compliant output

Layer 0 — ebs_v1.py

Data contracts · SignalOutput · ForensicBundle schema · Immutable field definitions · No business logic

Layer 1 — External Signals

SIFT tool outputs · CLI/SDA/GCI adapters · signal_adapter.py · Raw evidence ingestion

Layer 2 — Inference Engine

likelihood_engine.py · KDE + Ledoit-Wolf covariance · graph_stability.py · Bootstrap stability selection

Layer 3 — Risk Governance

risk_bounded_layer.py · r = (1-P)·(1+λD)·(1+γ(1-S))·(1+ω(1-I)) · Decimal prec=28 · PolicySpec sealed

Layer 4 — Audit

audit_action.py · Diff engine · Optimizer · PolicyEngine · decision_layer.py · dissent_report.py

Layer 5 — Verification

verify_ebs_v1.py · stdlib only · no VIGÍA imports · independent bundle verification · SHA-256 check

Cross-Artifact Engine — caie.py

USN_JOURNAL_GAP · TIMESTAMP_PRECISION_ANOMALY · CRYPTOGRAPHIC_INCONSISTENCY · STAGING_ARTIFACT_PRESENCE
Golden Rule: EFFECT_BEFORE_CAUSE → MALICE categorical (confidence=0.95, no override)

Phase 1 — Chain of Custody (9 tools)

Tool	Purpose
mount_sift_evidence	forensic image mount
generate_forensic_hash	SHA-256 chain
read_evidence	single-pass I/O + hash
list_files	filesystem perimeter
search_pattern	pure Python search
list_processes	persistence detection
audit_network	exfil channel map
calculate_shannon_entropy	payload/cipher detection
audit_image_metadata	GPS + timestamp

Phase 2 — Intentionality Analysis (12 tools)

Tool	Purpose
analyze_stylometry	astroturfing detection
calculate_human_entropy	bot vs. human
detect_human_jitter	sleep(2) signature
infer_intent	Peirce + Carnegie
audit_grice_maxims	linguistic deception
detect_eco_overinterpretation	planted evidence
detect_habit_incongruence	LotL detection
activate_honey_token	active exfil trap
reason_with_llm	novel case abduction
validate_and_correct_analysis	Peircean fallacy check
reload_phonetic_dict	hot-reload dictionary
get_phonetic_dict_stats	dictionary diagnostics

Hard Invariant — Daubert Requirement

THE LLM IS OUTSIDE THE MATHEMATICAL DECISION LOOP.
Claude Code / Ollama receives only the sealed ForensicBundle.
It cannot alter scores, weights, or verdicts.
It can only translate the sealed result into a judicial narrative.
Violation of this invariant constitutes fabrication of evidence.

│
▼ Sealed ForensicBundle (SHA-256: bundle_hash)

Narrative Layer — External LLM

Peirce Planner — Claude Code / Ollama

peirceplanner_bounded.py · AbductiveHuntingStrategy · Firstness → Secondness → Thirdness
Tool selection: value / (cost × spoofability) · FALLBACK_TOOLS on failure · self-correction loop

│
▼

Output

Amicus Curiae Judicial Narrative

Confirmed findings vs. inferred hypotheses — explicitly separated · Falsifiability conditions for each hypothesis
MITRE ATT&CK TTP mapping · Daubert admissibility statement · Reproducible: same input → same SHA-256 bundle_hash

Layer	Module	Responsibility	Import Rule
L0	ebs_v1.py	Data contracts, SignalOutput, ForensicBundle schema	No imports from VIGÍA
L1	signal_adapter.py	Raw evidence ingestion, SIFT tool adapters	→ L0 only
L2	likelihood_engine.py · graph_stability.py	KDE inference, Ledoit-Wolf covariance, bootstrap stability	→ L0, L1
L3	risk_bounded_layer.py	r-formula governance, ε-bounded decision	→ L0–L2
L4	audit_action.py · decision_layer.py	Policy diff, optimizer, dissent escalation	→ L0–L3
L5	verify_ebs_v1.py	Independent bundle verification — stdlib only	No VIGÍA imports
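
The import rules in the table above can be checked mechanically. The following is a minimal sketch, assuming a flat module layout in which each file is imported by its bare name; the `LAYER_OF` mapping and both function names are hypothetical and not part of VIGÍA.

```python
import ast

# Hypothetical mapping from module name to layer, taken from the table above.
LAYER_OF = {
    "ebs_v1": 0,
    "signal_adapter": 1,
    "likelihood_engine": 2,
    "graph_stability": 2,
    "risk_bounded_layer": 3,
    "audit_action": 4,
    "decision_layer": 4,
    "verify_ebs_v1": 5,
}


def imported_modules(source):
    """Return the top-level module names imported by a piece of source code."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            names.add(node.module.split(".")[0])
    return names


def import_violations(module_name, source):
    """List VIGÍA modules that `module_name` imports against the layer rules."""
    layer = LAYER_OF[module_name]
    vigia_imports = imported_modules(source) & set(LAYER_OF)
    if layer in (0, 5):
        # L0 and L5 must not import any VIGÍA module at all.
        return sorted(vigia_imports)
    # L1 to L4 may only import from strictly lower layers.
    return sorted(m for m in vigia_imports if LAYER_OF[m] >= layer)
```

Running such a check in CI against every module would turn the Import Rule column into an enforced constraint, though the real project may enforce it differently.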

1 Signal Normalization

    # Standard path
    z = (value - baseline_mean) / MAD

    # Pre-normalized path (already z-score; do NOT recalculate)
    z = value    # is_pre_normalized=True

MAD = Median Absolute Deviation (robust against outliers).
is_pre_normalized flag prevents double-normalization on signals that are already z-scores (e.g. from external calibrated tools).
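
A minimal sketch of the two normalization paths, assuming the baseline is available as a plain list of numbers. The function names and the zero-MAD guard are illustrative assumptions, not VIGÍA's actual code.

```python
from statistics import mean, median


def mad(values):
    """Median Absolute Deviation of a baseline sample."""
    center = median(values)
    return median([abs(v - center) for v in values])


def normalize(value, baseline, is_pre_normalized=False):
    """Return a robust z-score, or pass the value through if already normalized."""
    if is_pre_normalized:
        # The signal is already a z-score: recalculating would double-normalize.
        return value
    scale = mad(baseline)
    if scale == 0:
        # Assumption: a zero-spread baseline is treated as an error.
        raise ValueError("MAD is zero; baseline has no spread")
    return (value - mean(baseline)) / scale
```

With the baseline `[10, 12, 11, 13, 12]` the MAD is 1 and the mean is 11.6, so a value of 15 normalizes to roughly 3.4.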

2 Noisy-OR Fusion (CAIE)

    # Within-source fusion (dependent evidence)
    group_score = 1 - ∏(1 - score_i)

    # Cross-source fusion (independent sources)
    composite = 1 - ∏(1 - group_j)

    # Penalty: fewer than 3 independent sources
    composite *= 0.80    # -20% if n_sources < 3

Noisy-OR models "at least one signal is genuine" without assuming mutual exclusivity.
The 20% penalty discounts thin evidence bases: it is a Daubert guard against single-source overconfidence.
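
The two-stage fusion can be sketched as follows, assuming scores arrive grouped by source in a dictionary. The `fuse` signature and its parameter names are hypothetical.

```python
from math import prod


def noisy_or(scores):
    """Probability that at least one of the signals is genuine."""
    return 1.0 - prod(1.0 - s for s in scores)


def fuse(groups, min_sources=3, penalty=0.80):
    """groups maps a source name to its list of scores in [0, 1]."""
    # Stage 1: fuse signals within each source.
    group_scores = [noisy_or(scores) for scores in groups.values()]
    # Stage 2: fuse the per-source scores across sources.
    composite = noisy_or(group_scores)
    # Thin evidence base: apply the 20% penalty.
    if len(group_scores) < min_sources:
        composite *= penalty
    return composite
```

For two sources with scores `[0.5, 0.5]` and `[0.5]`, the group scores are 0.75 and 0.5, the unpenalized composite is 0.875, and the penalized result is 0.7.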

3 Likelihood Ratio (ENFSI-calibrated)

    z_clipped = clip(z, -3.0, 3.0)
    log_lr_i = (z_clipped²) / 2
    mean_corr = mean(|corr[i,j]|, i≠j)
    combined_log_lr = Σ log_lr_i × (1 - mean_corr)
    LR = exp(combined_log_lr)
    P_fabrication = LR / (1 + LR)    # prior = 0.5

Correlation penalty (1 - mean_corr) prevents amplification of duplicated telemetry (Sysmon + Defender + EDR often share ETW origin).
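
A minimal sketch of the computation above, assuming the correlation matrix is given as a nested list and that a single signal has zero mean correlation. The function name is hypothetical, and plain floats are used here only for brevity.

```python
from math import exp


def fabrication_probability(z_scores, corr):
    """Return (LR, P_fabrication) for a list of z-scores and their correlation matrix."""
    n = len(z_scores)
    clipped = [max(-3.0, min(3.0, z)) for z in z_scores]
    log_lrs = [z * z / 2.0 for z in clipped]
    # Mean absolute off-diagonal correlation between signals.
    off_diag = [abs(corr[i][j]) for i in range(n) for j in range(n) if i != j]
    mean_corr = sum(off_diag) / len(off_diag) if off_diag else 0.0
    # Correlated (duplicated) telemetry contributes less combined evidence.
    combined_log_lr = sum(log_lrs) * (1.0 - mean_corr)
    lr = exp(combined_log_lr)
    return lr, lr / (1.0 + lr)
```

Two signals with z-scores 2.0 and 5.0 (clipped to 3.0) and a mutual correlation of 0.5 give a combined log-LR of (2.0 + 4.5) × 0.5 = 3.25.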

ENFSI evidence strength scale:

LR range	Evidence strength
LR < 10	weak
10–100	moderate
100–1K	strong
1K–10K	very strong
≥ 10,000	extreme
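
The scale above can be applied with a simple threshold lookup. This sketch assumes each band includes its lower bound (so LR = 100 counts as strong), which the scale itself does not specify; the names are hypothetical.

```python
from bisect import bisect_right

# Band boundaries and labels as listed in the scale above.
_BOUNDS = (10, 100, 1_000, 10_000)
_LABELS = ("weak", "moderate", "strong", "very strong", "extreme")


def evidence_strength(lr):
    """Map a likelihood ratio to its verbal strength label."""
    # Assumption: a value equal to a boundary falls into the higher band.
    return _LABELS[bisect_right(_BOUNDS, lr)]
```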

4 Trust Decay — Broken Chain

    trust_effective = base_trust × exp(-λ × break_severity)
    # exp() via precomputed Decimal table, no FPU
    # deterministic across x86 / ARM / WASM

Chain-of-custody breaks are not binary failures — they degrade trust proportionally.
break_severity ∈ [0,1]: 0 = intact, 1 = unrecoverable.
No floating-point unit (FPU) is involved: the table-based exp() guarantees bit-identical results.
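
One way a table-based exp() could look is sketched below. The grid step, the upper bound, and the nearest-entry lookup are all assumptions; the table is generated here with `Decimal.exp()` (software arithmetic) for brevity, whereas a real deployment would more likely ship the entries as literal constants.

```python
from decimal import Decimal, localcontext, ROUND_HALF_EVEN

STEP = Decimal("0.01")   # hypothetical grid resolution for λ × break_severity
MAX_X = Decimal("20")    # hypothetical upper bound covered by the table


def _build_table():
    """Precompute exp(-x) for x = 0, STEP, 2*STEP, ..., MAX_X at 28 digits."""
    with localcontext() as ctx:
        ctx.prec = 28
        ctx.rounding = ROUND_HALF_EVEN
        n = int(MAX_X / STEP)
        return [(-(STEP * k)).exp() for k in range(n + 1)]


EXP_NEG_TABLE = _build_table()


def trust_effective(base_trust, lam, break_severity):
    """Decay base_trust by exp(-λ × severity) using only table lookups."""
    with localcontext() as ctx:
        ctx.prec = 28
        ctx.rounding = ROUND_HALF_EVEN
        # Clamp severity to [0, 1]: 0 = intact, 1 = unrecoverable.
        severity = min(max(Decimal(break_severity), Decimal(0)), Decimal(1))
        x = min(Decimal(lam) * severity, MAX_X)
        # Snap to the nearest grid point (assumption: no interpolation).
        index = int((x / STEP).to_integral_value(rounding=ROUND_HALF_EVEN))
        return Decimal(base_trust) * EXP_NEG_TABLE[index]
```

Passing inputs as strings, for example `trust_effective("0.9", "2", "0.5")`, avoids introducing binary floating-point values at the boundary.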

5 Bayesian Update — Temporal Neighborhood

    likelihood = mean_neighbor_trust - (contamination_ratio × 0.5 + suspicious_ratio × 0.2)
    marginal = likelihood × prior + (1 - likelihood) × (1 - prior)
    posterior = clip(likelihood × prior / marginal, 0.0, 1.0)

Temporally adjacent events update each other's credibility.
A single suspicious event propagates distrust to its temporal neighborhood, weighted by contamination density.
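
A minimal sketch of the update, following the three formulas above term by term. The function name and the zero-marginal guard are assumptions; floats are used only to keep the example short.

```python
def clip(x, lo, hi):
    """Clamp x to the closed interval [lo, hi]."""
    return max(lo, min(hi, x))


def neighborhood_posterior(prior, mean_neighbor_trust,
                           contamination_ratio, suspicious_ratio):
    """Update an event's credibility from its temporal neighborhood."""
    likelihood = mean_neighbor_trust - (
        contamination_ratio * 0.5 + suspicious_ratio * 0.2
    )
    marginal = likelihood * prior + (1 - likelihood) * (1 - prior)
    if marginal == 0:
        # Assumption: an undefined ratio is treated as zero credibility.
        return 0.0
    return clip(likelihood * prior / marginal, 0.0, 1.0)
```

With a prior of 0.5, neighbor trust 0.9, contamination 0.2 and suspicious ratio 0.5, the likelihood is 0.7 and the posterior is also 0.7. A heavily contaminated neighborhood can drive the likelihood negative, in which case the final clip pins the posterior at 0.0.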

6 Final Intent Score (vigia_scorer.py)

    raw_intent_score = clip(composite + fracture_malice_boost - fracture_credibility_penalty, 0.0, 0.99)
    support_score = clip(log(1 + n_artifacts) / log(5), 0.0, 1.0)
    final_score = raw_intent_score × (0.9 + 0.1 × support_score)

    # Hard gate: thin evidence + high score → clamp
    if n_artifacts < 2 and final_score > 0.65:
        final_score = 0.65

support_score implements logarithmic diminishing returns on evidence quantity (log base 5, so it saturates at 1.0 once n_artifacts reaches 4, where 1 + n_artifacts = 5).
Hard gate prevents a single high-weight signal from forcing a verdict without corroboration — Daubert requirement.
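
The scoring steps can be sketched as a single function. Parameter names are shortened for the example and the function name is hypothetical; this is not the code of vigia_scorer.py.

```python
from math import log


def clip(x, lo, hi):
    """Clamp x to the closed interval [lo, hi]."""
    return max(lo, min(hi, x))


def final_intent_score(composite, malice_boost, credibility_penalty, n_artifacts):
    """Combine the fused score with fracture adjustments and evidence support."""
    raw = clip(composite + malice_boost - credibility_penalty, 0.0, 0.99)
    # Logarithmic support: reaches 1.0 when 1 + n_artifacts = 5.
    support = clip(log(1 + n_artifacts) / log(5), 0.0, 1.0)
    final = raw * (0.9 + 0.1 * support)
    # Hard gate: fewer than 2 artifacts cannot carry a score above 0.65.
    if n_artifacts < 2 and final > 0.65:
        final = 0.65
    return final
```

A composite of 0.95 backed by a single artifact is clamped to 0.65, while the same composite backed by four artifacts keeps its full value.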

7 Risk Function (RiskBoundedDecisionLayer)

    r = (1-P) · (1+λD) · (1+γ(1-S)) · (1+ω(1-I))

    P = P(authenticity | evidence) ∈ [0,1]    1.0 = genuine
    D = drift_score ∈ [0,1]
    S = graph_stability ∈ [0,1]
    I = consistency_score ∈ [0,1]
    λ, γ, ω = adaptive params, always > 0

    # Decision (ε-bounded):
    r ≤ ε_accept        → ACCEPT (ACCEPT_POSTERIOR)
    r ≥ 1 - ε_reject    → REJECT (REJECT_POSTERIOR)
    otherwise           → ABSTAIN (see §4.7 subtypes)

    # ε defaults: ε_accept = ε_reject = 0.05
    # All thresholds read from sealed PolicySpec, none hardcoded

Decimal prec=28, ROUND_HALF_EVEN throughout. Inputs clamped before any multiplication.
λ, γ, ω are adaptive (see §4.11). ε_accept and ε_reject are policy parameters, not constants.
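
A minimal sketch of the risk function and the ε-bounded decision using the stated Decimal settings. The function names are hypothetical, and the ε defaults appear as keyword arguments only for the example; the real thresholds come from the sealed PolicySpec.

```python
from decimal import Decimal, localcontext, ROUND_HALF_EVEN


def _unit(x):
    """Clamp an input to [0, 1] before any multiplication."""
    return min(max(Decimal(x), Decimal(0)), Decimal(1))


def risk(P, D, S, I, lam, gamma, omega):
    """r = (1-P)·(1+λD)·(1+γ(1-S))·(1+ω(1-I)) at prec=28, ROUND_HALF_EVEN."""
    with localcontext() as ctx:
        ctx.prec = 28
        ctx.rounding = ROUND_HALF_EVEN
        P, D, S, I = (_unit(v) for v in (P, D, S, I))
        one = Decimal(1)
        return (
            (one - P)
            * (one + Decimal(lam) * D)
            * (one + Decimal(gamma) * (one - S))
            * (one + Decimal(omega) * (one - I))
        )


def decide(r, eps_accept=Decimal("0.05"), eps_reject=Decimal("0.05")):
    """Map a risk value to ACCEPT, REJECT or ABSTAIN."""
    if r <= eps_accept:
        return "ACCEPT"
    if r >= 1 - eps_reject:
        return "REJECT"
    return "ABSTAIN"
```

Because every multiplier after the first is at least 1, r can exceed 1 when drift is high or stability and consistency are low; such values fall in the REJECT band.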

8 Causal Closure Score (CCS Gate)

    CCS = Σ(weights of CONSISTENT evidence) / Σ(weights of ALL evidence)
    CCS ∈ ℚ ∩ [0,1]    # exact Fraction, no floats

    CCS > 1/2 → causal chain admissible → verdict possible
    CCS ≤ 1/2 → chain BROKEN → ABSTAIN (hard veto)

The CCS gate is a hard veto: a majority of evidence must be mutually consistent before any verdict is issued.
Fraction arithmetic: the threshold 1/2 is exact, no floating-point approximation possible.
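
The gate can be sketched with Python's `fractions.Fraction`, assuming evidence is supplied as a list of (weight, is_consistent) pairs. The function names and the handling of an empty evidence set are assumptions.

```python
from fractions import Fraction


def causal_closure_score(evidence):
    """evidence is a list of (weight, is_consistent) pairs; weights may be strings."""
    total = sum((Fraction(w) for w, _ in evidence), Fraction(0))
    if total == 0:
        # Assumption: no weighted evidence means no causal closure.
        return Fraction(0)
    consistent = sum((Fraction(w) for w, ok in evidence if ok), Fraction(0))
    return consistent / total


def ccs_gate(ccs):
    """True if a verdict is possible; False means ABSTAIN (hard veto)."""
    return ccs > Fraction(1, 2)
```

An evenly split evidence set such as `[("0.5", True), ("0.5", False)]` yields exactly 1/2 and is vetoed, which is the boundary case that exact rational arithmetic settles without rounding ambiguity.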

9 Effective Confidence (Stability-Adjusted)

    effective_confidence = confidence × (1/2 + stability × 1/2)
    # All in Fraction: exact rational arithmetic
    # stability ∈ [0,1] from graph_stability bootstrap

A verdict won by a narrow margin (low stability) carries lower effective confidence even if the raw score is identical.
This prevents overconfident verdicts from unstable evidence graphs — critical for Daubert cross-examination.
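
As a worked illustration, the adjustment can be written directly with `Fraction`; the function name mirrors the formula and is otherwise an assumption.

```python
from fractions import Fraction


def effective_confidence(confidence, stability):
    """Scale confidence by a factor between 1/2 (unstable) and 1 (fully stable)."""
    half = Fraction(1, 2)
    return Fraction(confidence) * (half + Fraction(stability) * half)
```

A confidence of 4/5 with stability 1/2 gives 4/5 × 3/4 = 3/5, which sits exactly on the threshold between an insufficient and a medium verdict.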

10 Quadripartite Verdict — 8 States

State	Condition
ABSTAIN_DEGRADED	DEGRADED_MODE active (always first — system fault)
ESCALATE	dissent escalation_required by audit engine
ABSTAIN_CONTRADICTION	raw == ABSTAIN + oscillation detected
ABSTAIN_INSUFFICIENT	effective_confidence < 3/5
MALICE_HIGH	MALICE + eff_conf ≥ 4/5
MALICE_MEDIUM	MALICE + eff_conf ∈ [3/5, 4/5)
BENIGN_HIGH	BENIGN + eff_conf ≥ 4/5
BENIGN_MEDIUM	BENIGN + eff_conf ∈ [3/5, 4/5)

audit_hash = SHA-256(state ‖ raw_verdict ‖ confidence ‖ stability ‖ integrity ‖ adversarial ‖ dissent, sort_keys=True)
BENIGN + adversarial_penalty cleared → +5% bonus to effective_confidence.
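
The state table can be read as a priority cascade. This sketch assumes the rows are evaluated top to bottom and that a raw ABSTAIN without oscillation falls through to ABSTAIN_INSUFFICIENT; neither point is stated explicitly above, and the function signature is hypothetical.

```python
from fractions import Fraction


def resolve_state(raw_verdict, eff_conf, degraded=False,
                  escalation_required=False, oscillation=False):
    """Resolve one of the 8 verdict states from the raw verdict and flags."""
    if degraded:
        return "ABSTAIN_DEGRADED"          # system fault, always first
    if escalation_required:
        return "ESCALATE"
    if raw_verdict == "ABSTAIN" and oscillation:
        return "ABSTAIN_CONTRADICTION"
    # Assumption: anything other than MALICE/BENIGN cannot yield a verdict.
    if eff_conf < Fraction(3, 5) or raw_verdict not in ("MALICE", "BENIGN"):
        return "ABSTAIN_INSUFFICIENT"
    level = "HIGH" if eff_conf >= Fraction(4, 5) else "MEDIUM"
    return f"{raw_verdict}_{level}"
```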

11 Meta-Calibration (Adaptive — no opaque ML)

    # Parameter updates per evaluation cycle
    λ_t = λ_{t-1} × (1 + 0.5×fn_rate - 0.5×fp_rate)
    γ_t = γ_{t-1} × (1 + fp_rate - 0.5×fn_rate)
    ε_t = ε_{t-1} × (1 + abst_rate - target_abst_rate)

    # PD stabilization (discrete)
    velocity_t = momentum × velocity_{t-1} + (1-momentum) × delta_t
    if |delta_t| > τ: theta = prev_theta + damping × delta_t
    theta = theta - 0.1 × velocity_t

    # Absolute bounds (Daubert: auditable, stable)
    λ ∈ [0.1, 15.0], γ ∈ [0.1, 15.0], ε ∈ [0.005, 0.30]

Parameters adapt to observed FP/FN rates without opaque gradient descent.
PD stabilization prevents runaway oscillation. Bounded ranges ensure Daubert auditability — no parameter can drift to an inadmissible extreme.
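
A minimal sketch of the multiplicative updates and the absolute bounds, assuming parameters are held in a dictionary of `Decimal` values. The PD stabilization step is left out of the sketch, and the dictionary keys and function name are hypothetical.

```python
from decimal import Decimal

# Absolute bounds as stated above.
BOUNDS = {
    "lam": (Decimal("0.1"), Decimal("15.0")),
    "gamma": (Decimal("0.1"), Decimal("15.0")),
    "eps": (Decimal("0.005"), Decimal("0.30")),
}
HALF = Decimal("0.5")


def update_params(params, fn_rate, fp_rate, abst_rate, target_abst_rate):
    """Apply one calibration cycle and clamp each parameter to its bounds."""
    fn, fp = Decimal(fn_rate), Decimal(fp_rate)
    abst, target = Decimal(abst_rate), Decimal(target_abst_rate)
    proposed = {
        "lam": params["lam"] * (1 + HALF * fn - HALF * fp),
        "gamma": params["gamma"] * (1 + fp - HALF * fn),
        "eps": params["eps"] * (1 + abst - target),
    }
    return {
        name: min(max(value, BOUNDS[name][0]), BOUNDS[name][1])
        for name, value in proposed.items()
    }
```

Starting from λ = 1.0, γ = 1.0, ε = 0.05 with fn_rate 0.2, fp_rate 0.1, abst_rate 0.15 and a target of 0.10, one cycle yields λ = 1.05, γ = 1.0 and ε = 0.0525.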

12 Protocol P2 — Complexity Metrics

    # Permutation Entropy (order m)
    H_PE = -Σ P(π) × log₂(P(π))
    H_norm = H_PE / log₂(m!)

    # Entropy Rate (over symbol pairs)
    H_rate = -Σ P(t) × log₂(P(t))
    token = (uint32_a << 32) | uint32_b    # bijection, no collisions

    # Canonical cross-platform quantization
    d = Decimal(str(value))
    result = float(d.quantize(Decimal('1.000000'), rounding=ROUND_HALF_EVEN))
    # NOT round(): delegates to libc, non-deterministic cross-platform
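
The metrics above can be sketched as follows. Breaking ties by position when ranking a window is an assumption, as are the function names; this sketch is not claimed to reproduce the canonical vectors.

```python
from collections import Counter
from decimal import Decimal, ROUND_HALF_EVEN
from math import factorial, log2


def quantize6(value):
    """Canonical 6-decimal quantization via Decimal, as in the formula above."""
    d = Decimal(str(value))
    return float(d.quantize(Decimal("1.000000"), rounding=ROUND_HALF_EVEN))


def permutation_entropy(series, m=3):
    """Normalized permutation entropy H_PE / log2(m!) of order m."""
    # Each window is reduced to the ordering of its elements (ties by position).
    patterns = Counter(
        tuple(sorted(range(m), key=lambda k: (series[i + k], k)))
        for i in range(len(series) - m + 1)
    )
    total = sum(patterns.values())
    if total == 0:
        raise ValueError("series is shorter than the order m")
    h_pe = -sum((c / total) * log2(c / total) for c in patterns.values())
    return quantize6(h_pe / log2(factorial(m)))


def pair_token(a, b):
    """Pack two uint32 symbols into one 64-bit token without collisions."""
    return ((a & 0xFFFFFFFF) << 32) | (b & 0xFFFFFFFF)
```

A strictly increasing series has a single ordinal pattern and therefore zero entropy, while the alternating series `[1, 2, 1, 2, 1]` at order 2 splits evenly between two patterns and reaches the maximum of 1.0.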

SHA-256 P2 Canonical Vectors (v2.8 — 22 vectors)

f7276a524a46149a2811d52f9e5072d2a281df227f9d46d084a651d6420cf4ce

Any correct implementation must produce bit-identical outputs for all 22 canonical vectors.
This is the determinism guarantee required for Daubert reproducibility.