Skip to content

Identity Contract

VulnParse-Pin uses deterministic identity construction to keep findings and assets stable across runs when canonical inputs are equivalent.

Why identity contract matters

  • Enables reliable deduplication and comparison workflows
  • Supports reproducible scoring and ranking outputs
  • Simplifies downstream joins to CMDB/ticketing/reporting systems

Asset identity

Asset IDs are generated from canonicalized host attributes (not random UUIDs).

Conceptually:

asset_id = hash(canonical(ip, hostname))

Implementation resides in src/vulnparse_pin/core/id.py.

Finding identity

Finding IDs are generated from canonicalized finding context, including:

  • Asset identity
  • Scanner signature/plugin identity
  • Service tuple (protocol/port)
  • Finding kind/title signal

Conceptually:

finding_id = hash(canonical(asset_id, scanner_sig, proto, port, kind))

Canonicalization expectations

Normalization functions enforce stable semantics for:

  • Text fields (case/spacing cleanup)
  • Protocol values
  • Port coercion
  • Host string normalization

This minimizes accidental ID drift from formatting variation.

Versioning

Identity generation includes versioned canonical prefixes, allowing controlled evolution while preserving compatibility boundaries.

Contract guarantees

For equivalent canonical inputs, the resulting IDs should be identical across runs.

For materially different identity inputs, IDs should diverge.

Non-goals

  • Identity does not claim semantic equivalence across unrelated scanners without mapped signatures
  • Identity does not replace historical diff logic by itself

Integration guidance

  • Persist IDs as authoritative keys in downstream systems
  • Avoid recomputing IDs externally with different normalization rules
  • Treat canonicalization changes as migration events requiring explicit version handling