ACI Technical Deep Dive

This deep dive documents ACI internals end to end: data flow, rule semantics, confidence computation, ranking integration, reporting integration, and validation boundaries.

Scope and implementation map

Primary implementation files:

src/vulnparse_pin/core/passes/ACI/aci_pass.py
src/vulnparse_pin/core/passes/TopN/TN_triage_semantics.py
src/vulnparse_pin/core/passes/TopN/TN_triage_config.py
src/vulnparse_pin/core/schemas/topN.schema.json
src/vulnparse_pin/core/passes/TopN/topn_pass.py
src/vulnparse_pin/utils/markdown_report.py

Pass dependency and lifecycle

Default derived pass order includes:

Scoring@2.0
ACI@1.0
TopN@1.0
Summary@1.0

AttackCapabilityInferencePass declares:

name = "ACI"
version = "1.0"
requires_passes = ("Scoring@2.0",)

TopNPass depends on ACI@1.0 output for ranking behavior.

Signal extraction pipeline

Signal extraction is centralized in _extract_signals(...) and pulls from:

finding-level flags: exploit and KEV
finding affected port (remote service signal list)
CVE analysis entries
title/description/plugin output
reference URLs/text

Normalization behavior:

lowercase matching
substring match against effective token vocabulary

Effective token vocabulary is built by _effective_text_tokens(cfg) using:

core tokens (maintainer baseline)
alias overlays (signal_aliases)
optional core token suppression (disabled_core_tokens)
replacement mode (token_mode = replace)

Rule evaluation model

Capability rules

For each finding:

collect normalized signals
iterate enabled capability rules
match if any rule signal is present in finding signals
append capability and add rule weight to confidence base

Confidence base is capped at 1.0.

Exploit bonus

If exploit boost is enabled and exploit evidence exists:

exploit bonus is computed from base confidence and capped by max_bonus
confidence factors include exploit_boost when bonus applies

Final confidence

Final confidence:

confidence = min(1.0, confidence_base + exploit_bonus)

Bucket mapping:

high if >= 0.8
medium if >= 0.5
low otherwise

Chain inference model

ACI evaluates enabled chain rules after capabilities are resolved.

Chain match condition:

set(rule.requires_all) is a subset of matched capabilities

Matched chain labels are emitted in finding semantics, and rule hit counts are aggregated in metrics.

Rank uplift model

Finding-level uplift uses thresholded linear interpolation:

If confidence >= min_confidence:

uplift = max_uplift * ((confidence - min_confidence) / (1.0 - min_confidence))

Else:

uplift = 0

Then clamp:

uplift = clamp(uplift, 0, max_uplift)

Asset-level uplift aggregates finding uplifts and scales by asset_uplift_weight, then clamps by max_uplift.

Output contracts

Finding semantic contract

Each finding semantic record includes:

finding_id
asset_id
confidence
confidence_factors
capabilities
chain_candidates
cwe_ids
evidence
exploit_boost_applied
rank_uplift

Asset semantic contract

Each asset semantic record includes:

asset_id
weighted_confidence
max_confidence
capability_count
chain_candidate_count
ranked_finding_count
rank_uplift

Metrics contract

Metrics include:

total_findings
inferred_findings
coverage_ratio
capabilities_detected
chain_candidates_detected
confidence_buckets
uplifted_findings

Decision ledger integration

ACI writes structured ledger events when ledger service is present:

pass-level summary event
preview of inferred finding events (bounded sample)

Evidence fields include capabilities, chain candidates, confidence, and uplift, enabling review and runmanifest traceability.

Configuration and validation boundaries

Schema layer

topN.schema.json validates structure and ranges for:

aci.enabled
confidence and uplift parameters
token mode and alias arrays
capability rule shapes
chain rule shapes

Semantic layer

TN_triage_semantics._parse_aci validates semantics such as:

unique rule IDs
non-empty normalized signal arrays
token and signal length constraints
supported token mode values
range checks for all numeric fields

These checks prevent malformed policy from reaching runtime pass logic.

Fallback behavior

If config validation fails in non-strict mode:

loader returns safe fallback config
fallback ACI defaults are loaded from packaged tn_triage.json when possible
otherwise last-resort ACI fallback disables inference and emits empty rules

This prevents hard failures in tolerant runtime modes while preserving safe behavior.

Integration with markdown reporting

markdown_report.py consumes ACI output to render:

ACI metrics snapshot
top capability signals
top assets mapped to top findings and inferred capabilities
chain candidates and confidence per finding
explicit inference disclaimer for decision hygiene

This appears in both executive and technical markdown report modes.

Performance characteristics

ACI runs in-memory over findings and assets and uses:

linear scans over findings
set-based signal matching
dictionary aggregation for metrics and rollups

Expected complexity is approximately linear in finding count multiplied by configured rule counts. Typical cost is low relative to enrichment and large-volume scoring paths.

Testing coverage and invariants

Relevant tests include:

tests/test_aci_pass.py
tests/test_config_schema_validation.py
tests/test_topn_summary_aggregation_alignment.py
tests/test_runmanifest.py
tests/test_markdown_report.py

Notable validated invariants:

replace mode honors alias-only vocabulary
merge mode supports selective core token disablement
TopN parity is preserved across sequential and parallel paths
ACI uplift behavior is deterministic for tie-break intent

Extension guidance

When extending ACI policy:

add capability rules first
add chain rules only after capability quality is stable
add aliases for environment-specific phrases
tune weights conservatively and compare output deltas
preserve deterministic IDs and labels for audit continuity

Residual risks

Substring matching can over-trigger on ambiguous language.
Over-broad aliases can increase false-positive capability inference.
Excessive uplift settings can distort practical triage order.

Mitigations:

maintain conservative defaults
disable proven noisy tokens
validate via representative datasets and runmanifest review