NATS EventBus

Status

Implemented, with AWS event-backbone choice still under ADR review.

  • SRS references: asynchronous RMM, SIEM, compliance, and integration workflows.
  • Client response references: Phase 1 needs MVP-depth RMM, SIEM, and Compliance foundations without overbuilding.
  • ADR references: Phase 1 platform stack ADR; event backbone ADR remains queued.
  • Task board references: OP-D025, OP-D033.

Problem Statement

OneProtect needs an async event spine so API ingestion, worker processing, delivery, evidence, asset, and future SIEM flows do not become tightly coupled.

Architectural Intent

Application code depends on the EventBus seam. NATS JetStream is the current local and Phase 1 bootstrap event spine. AWS production may keep NATS or move to MSK/Kinesis after ADR review.
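The seam idea can be illustrated with a minimal sketch. This is not the contents of services/common/event_bus.py: the method names `publish` and `subscribe`, the `InMemoryEventBus` class, and the `assets.discovered` subject are all assumptions chosen for illustration. The point is only that callers depend on the abstract interface, so a NATS JetStream backend could later be swapped for MSK/Kinesis without touching application code.

```python
import asyncio
from abc import ABC, abstractmethod
from collections import defaultdict
from typing import Awaitable, Callable, Dict, List

Handler = Callable[[bytes], Awaitable[None]]


class EventBus(ABC):
    """Hypothetical seam: application code depends only on this interface."""

    @abstractmethod
    async def publish(self, subject: str, payload: bytes) -> None: ...

    @abstractmethod
    async def subscribe(self, subject: str, handler: Handler) -> None: ...


class InMemoryEventBus(EventBus):
    """Illustrative stand-in backend; a NATS JetStream or MSK/Kinesis
    backend would implement the same two methods."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Handler]] = defaultdict(list)

    async def publish(self, subject: str, payload: bytes) -> None:
        # Deliver to every handler registered for this exact subject.
        for handler in list(self._handlers[subject]):
            await handler(payload)

    async def subscribe(self, subject: str, handler: Handler) -> None:
        self._handlers[subject].append(handler)


async def demo() -> List[bytes]:
    bus: EventBus = InMemoryEventBus()  # callers only see the EventBus type
    received: List[bytes] = []

    async def on_event(payload: bytes) -> None:
        received.append(payload)

    await bus.subscribe("assets.discovered", on_event)  # hypothetical subject
    await bus.publish("assets.discovered", b'{"asset": "host-1"}')
    return received
```

Running `asyncio.run(demo())` returns the single published payload. An in-memory backend like this is also a convenient test double, which is one common reason to keep the seam narrow.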

What Was Implemented

  • EventBus abstraction.
  • NATS JetStream implementation.
  • Worker-service consumer path.
  • Event envelope requirements for tenant, correlation, causation, event ID, and event type.
  • NATS smoke/demo validation targets.
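The envelope requirements listed above (tenant, correlation, causation, event ID, event type) could be modelled roughly as follows. This is a sketch under assumptions, not the project's actual envelope: the field names, the `payload` field, the `caused` helper, and the choice to leave `causation_id` empty on a root event are all hypothetical.

```python
import json
import uuid
from dataclasses import asdict, dataclass, field
from typing import Any, Dict, Optional


@dataclass(frozen=True)
class EventEnvelope:
    tenant_id: str
    event_type: str
    correlation_id: str
    causation_id: Optional[str] = None  # assumption: None for a root event
    event_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    payload: Dict[str, Any] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Reject envelopes that lack any of the mandatory identifiers.
        for name in ("tenant_id", "event_type", "correlation_id", "event_id"):
            if not getattr(self, name):
                raise ValueError(f"missing required envelope field: {name}")

    def to_bytes(self) -> bytes:
        """Serialize the envelope as UTF-8 JSON for publishing."""
        return json.dumps(asdict(self), sort_keys=True).encode("utf-8")

    def caused(self, event_type: str, payload: Dict[str, Any]) -> "EventEnvelope":
        """Derive a follow-up event: same tenant and correlation ID,
        with this event recorded as the cause."""
        return EventEnvelope(
            tenant_id=self.tenant_id,
            event_type=event_type,
            correlation_id=self.correlation_id,
            causation_id=self.event_id,
            payload=payload,
        )
```

Keeping `correlation_id` constant across a chain while `causation_id` points at the immediate parent is a common convention for tracing a workflow end to end; whether OneProtect follows exactly this convention is not stated in this page.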

Components Involved

  • services/common/event_bus.py
  • services/worker_service/runner.py
  • Event schemas under specs/events/
  • AsyncAPI under specs/asyncapi.yaml

APIs / Events / Schemas

  • Event schemas are stored as JSON Schema.
  • AsyncAPI documents event channels and messages.
  • New events require schema and AsyncAPI updates.
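To make the "contract-backed" idea concrete, here is a small sketch of checking an event against a JSON Schema-shaped contract. The schema below is hypothetical and is not copied from specs/events/. The checker is deliberately tiny: it understands only `required` and `"type": "string"`, so it is a stand-in for a full JSON Schema validator, not a replacement for one.

```python
from typing import Any, Dict, List

# Hypothetical envelope contract, written as a JSON Schema fragment.
EVENT_ENVELOPE_SCHEMA: Dict[str, Any] = {
    "type": "object",
    "required": ["event_id", "event_type", "tenant_id", "correlation_id"],
    "properties": {
        "event_id": {"type": "string"},
        "event_type": {"type": "string"},
        "tenant_id": {"type": "string"},
        "correlation_id": {"type": "string"},
        "causation_id": {"type": "string"},
    },
}


def envelope_errors(event: Any, schema: Dict[str, Any]) -> List[str]:
    """Return human-readable contract violations (empty list means OK).

    Only the `required` keyword and string-typed properties are checked.
    """
    if not isinstance(event, dict):
        return ["event must be an object"]
    errors: List[str] = []
    for key in schema.get("required", []):
        if key not in event:
            errors.append(f"missing required field: {key}")
    for key, rule in schema.get("properties", {}).items():
        if key in event and rule.get("type") == "string":
            if not isinstance(event[key], str):
                errors.append(f"field {key} must be a string")
    return errors
```

A check like this would typically run at publish time and again in the consumer, so that a producer and a worker cannot silently drift apart on the contract.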

Deployment Notes

Docker Compose includes the local NATS path. The AWS deployment has not yet committed to either NATS or MSK/Kinesis; that remains an explicit ADR decision.

Security / Tenant Isolation

  • Events carry tenant_id.
  • Worker processing sets explicit tenant context.
  • Event handlers must not infer tenant context from spoofable headers.
  • Payloads must not contain secrets.

Validation Steps

UI Validation

  1. Trigger a supported demo or smoke path.
  2. Confirm that downstream console pages, such as Integrations or Assets, update with worker-processed data when the related flow is active.

API Validation

Use implemented API paths that publish events, then inspect worker logs for structured event_id, event_type, tenant_id, and correlation_id fields.
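The log inspection step can be partly automated. The sketch below assumes the worker emits one JSON object per log line, which this page does not confirm; the helper name is hypothetical. It reports any JSON log line that lacks one of the four fields named above and skips lines that are not JSON objects.

```python
import json
from typing import Iterable, List, Tuple

REQUIRED_LOG_FIELDS = ("event_id", "event_type", "tenant_id", "correlation_id")


def lines_missing_fields(lines: Iterable[str]) -> List[Tuple[int, List[str]]]:
    """Return (line_number, missing_fields) for each JSON object log line
    that lacks at least one required field. Line numbers start at 1."""
    problems: List[Tuple[int, List[str]]] = []
    for number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # not a structured line, e.g. a startup banner
        if not isinstance(record, dict):
            continue
        missing = [name for name in REQUIRED_LOG_FIELDS if name not in record]
        if missing:
            problems.append((number, missing))
    return problems
```

Feeding it the lines of a saved worker log gives a quick pass/fail signal: an empty result means every structured line carried all four fields. Unrelated JSON log lines would also be flagged, so in practice a filter on the logger name or message type may be needed.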

Smoke Validation

make test-nats
make smoke-nats
make demo-nats

Known Limitations

  • Exact AWS event backbone remains an open ADR: NATS JetStream vs MSK/Kinesis.
  • Full SIEM log store/search is not implemented.
  • Consumer lag dashboards and alerting remain future observability work.

Follow-Up Work

  • Complete AWS event backbone ADR.
  • Add production dashboards and alerting.
  • Expand SIEM event contracts and search store after contract approval.

Acceptance Criteria Mapping

Acceptance criterion          Evidence
EventBus seam exists          services/common/event_bus.py
NATS consumer path works      make test-nats and make smoke-nats
Events are contract-backed    JSON Schemas and AsyncAPI
Worker processing is async    services/worker_service/runner.py