Operations

Tool / Contract Summary

This page documents how jhf-dobby actually behaves in operation: service modes, worker behavior, metrics, degraded semantics, and runtime limits.

Business Value

  • gives operators an exact view of how Dobby behaves under drift and failure
  • separates normal runtime behavior from planned features
  • provides one place for monitoring and fail-closed semantics

Current Verified State

Available now:

  • effective service modes with Fabric-driven degradation
  • persistent metrics with bounded degraded output
  • worker loop for revalidation expiry with low-pressure polling defaults and bounded backoff
  • crash recovery through compose restart policy
  • bounded degraded persistence behavior for readiness and metrics

Planned / not in current scope:

  • dedicated background queue platform
  • internal authz policy engine
  • distributed worker coordination

Available Now

Lifecycle Status

  • capability class: adaptive-learning
  • mode impact: stack-only
  • lifecycle stage: active
  • runtime kind: service+worker

Service Modes

  • warmup
  • observe_only
  • proposal_only
  • promotion_enabled

Effective mode behavior:

  • requested mode comes from JHF_DOBBY_SERVICE_MODE
  • effective mode is lowered by Fabric drift
  • persistence loss degrades readiness and forces the effective mode to warmup
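
The rules above can be sketched as a small pure function. This is an illustration, not the jhf-dobby implementation: the function name effective_mode, the drift_ceiling parameter, and the assumption that the four modes are ordered from least to most permissive in the order listed are all hypothetical. The default ceiling of warmup is chosen only to agree with the forced warmup mode described under Readiness.

```python
import os

# Assumed ordering, least to most permissive (taken from the list order above).
MODES = ("warmup", "observe_only", "proposal_only", "promotion_enabled")


def effective_mode(requested, fabric_drift, persistence_ready,
                   drift_ceiling="warmup"):
    """Return the mode the service would run in under this sketch.

    drift_ceiling is a hypothetical knob: the highest mode allowed while
    Fabric drift is present.
    """
    if requested not in MODES:
        # Assumption: an unknown requested mode fails closed.
        return "warmup"
    if not persistence_ready:
        # Persistence loss always forces warmup.
        return "warmup"
    if fabric_drift:
        # Drift can only lower the mode, never raise it.
        capped = min(MODES.index(requested), MODES.index(drift_ceiling))
        return MODES[capped]
    return requested


requested = os.environ.get("JHF_DOBBY_SERVICE_MODE", "warmup")
print(effective_mode(requested, fabric_drift=False, persistence_ready=True))
```

Keeping the calculation free of I/O means the requested mode, the drift signal, and the persistence signal can each be tested in isolation.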

Worker

  • entrypoint: python -m jhf_dobby.worker.service
  • default poll interval: 60 seconds
  • failure backoff: exponential, starting at 60 seconds and capped at 300 seconds, with bounded jitter
  • current responsibility: expire revalidate_required proposals that are still active
  • recovery: marks a failed cycle, backs off, and continues looping without tight retries
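
A minimal sketch of the backoff schedule described above, for operators who want to estimate how long a failing worker waits between cycles. The 60-second base and 300-second cap come from this page. The doubling factor, the 10% jitter bound, the choice to add jitter after capping, and the name backoff_delay are assumptions.

```python
import random

BASE_SECONDS = 60.0   # poll interval and first backoff step (from this page)
CAP_SECONDS = 300.0   # backoff ceiling (from this page)


def backoff_delay(consecutive_failures, jitter_fraction=0.1, rng=random):
    """Seconds to wait before the next cycle.

    Assumptions: the delay doubles per consecutive failure, and jitter is
    added on top of the capped delay, bounded by jitter_fraction of it.
    """
    if consecutive_failures <= 0:
        # Healthy cycle: plain poll interval, no jitter.
        return BASE_SECONDS
    exponent = min(consecutive_failures - 1, 16)  # keep the power small
    delay = min(BASE_SECONDS * (2 ** exponent), CAP_SECONDS)
    jitter = rng.uniform(0.0, jitter_fraction * delay)
    return delay + jitter


for failures in range(0, 6):
    print(failures, round(backoff_delay(failures, jitter_fraction=0.0), 1))
```

Under these assumptions the worst-case wait is the cap plus the jitter bound, so a persistently failing worker still retries at a predictable, low rate rather than in a tight loop.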

Readiness / Drift / Monitoring

Readiness

Readiness is healthy only when:

  • Fabric alignment is green
  • persistence is reachable

Degraded readiness still returns HTTP 200, but it explicitly reports degraded status and forced warmup mode.
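
Because a degraded response still carries HTTP 200, a probe that checks only the status code cannot tell healthy from degraded. The sketch below shows one way the response and a body-aware check could look. The field names status, service_mode, and reasons, and both function names, are hypothetical and not taken from the jhf-dobby API.

```python
def readiness(fabric_green, persistence_reachable, effective_mode):
    """Build a (http_status, body) pair for a readiness probe."""
    reasons = []
    if not fabric_green:
        reasons.append("fabric_alignment_not_green")
    if not persistence_reachable:
        reasons.append("persistence_unreachable")
    if not reasons:
        return 200, {"status": "ready", "service_mode": effective_mode}
    # Degraded: still HTTP 200, but the body says so and reports warmup.
    return 200, {
        "status": "degraded",
        "service_mode": "warmup",
        "reasons": reasons,
    }


def probe_is_healthy(http_status, body):
    """Monitoring-side check that inspects the body, not just the code."""
    return http_status == 200 and body.get("status") == "ready"


status, body = readiness(True, False, "proposal_only")
print(status, body, probe_is_healthy(status, body))
```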

Drift

Drift sources include:

  • Fabric surfaces unavailable
  • JARVIS repo entry missing
  • capability class mismatch
  • mode impact mismatch
  • matrix or catalog gaps
  • admission dry-run not green

Metrics

Important counters and gauges:

  • dobby_signals_emitted_total_family_learning_signal_outcome_*
  • dobby_candidates_proposed_total_risk_class_*_target_type_*
  • dobby_revalidate_required_total
  • dobby_policy_denials_total_reason_code_*
  • dobby_budget_denials_total_reason_code_replay_budget_exhausted
  • dobby_budget_denials_total_reason_code_promotion_velocity_breaker_active
  • dobby_queue_depth_intake
  • dobby_queue_depth_replay
  • dobby_queue_depth_promotion
  • dobby_replay_budget_remaining
  • dobby_promotion_velocity_remaining
  • dobby_service_mode
  • dobby_persistence_ready
  • dobby_metrics_degraded

Failure And Degraded Semantics

  • Fabric failure: fail closed by lowering effective mode
  • Warp failure: approval checks fail closed
  • Shuttle failure: degraded evidence only; intake still works
  • Bobbin sink missing: degraded Bobbin publication result; proposal state remains Dobby-owned
  • persistence failure:
    • /ready returns degraded state quickly
    • /metrics returns bounded degraded output
    • mutation routes fail closed with 503
    • worker keeps looping under bounded backoff
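
The fail-closed rule for mutation routes can be illustrated with a framework-neutral decorator. This is a sketch under stated assumptions: fail_closed, create_proposal, the state dictionary, and the error body are hypothetical, and handlers are modeled as plain functions returning a (status, body) pair rather than real route handlers.

```python
from functools import wraps


def fail_closed(persistence_ready):
    """Wrap a mutation handler so it returns 503 when persistence is down.

    persistence_ready is a zero-argument callable. If the check itself
    raises, that is treated as "not ready".
    """
    def decorator(handler):
        @wraps(handler)
        def wrapper(*args, **kwargs):
            try:
                ready = bool(persistence_ready())
            except Exception:
                ready = False
            if not ready:
                # Fail closed: the handler body never runs.
                return 503, {"error": "persistence_unavailable"}
            return handler(*args, **kwargs)
        return wrapper
    return decorator


state = {"persistence_up": False}  # stand-in for a real health signal


@fail_closed(lambda: state["persistence_up"])
def create_proposal(payload):
    return 201, {"accepted": payload}


print(create_proposal({"id": "example"}))
state["persistence_up"] = True
print(create_proposal({"id": "example"}))
```

Checking readiness before the handler runs keeps a mutation from being partially applied when the store is unreachable.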

Runtime Guardrails

  • shared-host defaults stay low-pressure:
    • healthchecks default to 60s
    • worker polling defaults to 60s
    • higher-sensitivity healthchecks are opt-in only
  • diagnostics must stay bounded:
    • use timeouts for host commands
    • use docker logs --since ... --tail ... instead of unbounded log streams
    • prefer one-shot snapshots such as docker stats --no-stream
  • restart and rerun behavior:
    • compose restart policy remains unless-stopped
    • services use stop_grace_period: 20s
    • repeated verify runs should not create new containers, new compose projects, or long-lived debug processes
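
The bounded-diagnostics rules above can be followed from a script with a small wrapper around subprocess. The helper names run_bounded and docker_logs_cmd, the 10-second default timeout, and the output truncation limit are assumptions for illustration. The values for --since and --tail are left to the caller, since this page does not prescribe them.

```python
import subprocess


def run_bounded(cmd, timeout_seconds=10.0, max_chars=20000):
    """Run a host command with a hard timeout and truncated output.

    Returns (returncode, output). returncode is None when the command
    timed out or could not be started.
    """
    try:
        result = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout_seconds
        )
    except subprocess.TimeoutExpired:
        return None, "timed out"
    except FileNotFoundError:
        return None, "command not found"
    # Keep only the tail of stdout so the result stays bounded.
    return result.returncode, result.stdout[-max_chars:]


def docker_logs_cmd(container, since, tail):
    """Build a bounded `docker logs` call; the caller picks since and tail."""
    return ["docker", "logs", "--since", since, "--tail", str(tail), container]


# One-shot resource snapshot instead of a live stream.
DOCKER_STATS_CMD = ["docker", "stats", "--no-stream"]

print(docker_logs_cmd("example-container", "10m", 200))
```

Passing the argument list directly, without a shell, avoids quoting problems, and the timeout ensures a hung command cannot turn into a long-lived debug process.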

Optional / Extended

  • Bobbin sink file path through JHF_DOBBY_BOBBIN_ARTIFACT_SINK
  • Shuttle base URL through JHF_SHUTTLE_BASE_URL
  • Warp bearer token through JHF_WARP_API_TOKEN
  • configurable replay budget and promotion velocity controls

Planned / Not In Current Scope

  • long-lived replay queue
  • autonomous retraining jobs
  • Dobby-authored governance policies

Known Limits

  • no per-route auth implementation
  • no automatic replay retry scheduler
  • no external alerting integration in this repo
  • queue depth metrics are logical runtime indicators, not a broker queue view

Exceptions / Waivers

  • Shuttle evidence is optional by design
  • Spool is treated as optional and is used only for read-only evidence classification
  • jhf-dobby#27
  • jhf-dobby#31
  • jhf-dobby#34
  • jhf-dobby#36

License: AGPLv3. See ../LICENSE.
Learn more at helpifyr.com.