Operations

Operating Model

This repository is operated as an install/reapply/verify bundle for another runtime. It has no continuously running process of its own.

Start / Deploy

Linux reference path

  • prepare configs/installer.env
  • run bash scripts/bootstrap.sh
  • complete installation with bash scripts/install_jhf_memory.sh
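
The Linux steps above can be wrapped in a small ordered runner. This is a minimal sketch, not part of the repository: the function name `run_linux_reference_path` and the injectable `run` parameter are hypothetical, and it assumes each script signals failure with a non-zero exit code.

```python
import subprocess
from pathlib import Path

# Commands taken from the Linux reference path above, in order.
LINUX_STEPS = [
    ["bash", "scripts/bootstrap.sh"],
    ["bash", "scripts/install_jhf_memory.sh"],
]


def run_linux_reference_path(repo_root, run=subprocess.run):
    """Run the documented steps in order and stop at the first failure.

    Returns the list of commands that completed successfully.
    """
    root = Path(repo_root)
    # Preflight: the first documented step is preparing this file.
    if not (root / "configs" / "installer.env").is_file():
        raise FileNotFoundError("prepare configs/installer.env first")

    completed = []
    for cmd in LINUX_STEPS:
        result = run(cmd, cwd=root)
        # Assumption: a non-zero exit code means the step failed.
        if result.returncode != 0:
            raise RuntimeError(f"step failed: {' '.join(cmd)}")
        completed.append(cmd)
    return completed
```

Calling `run_linux_reference_path(".")` from the repository root runs both scripts. Stopping at the first failure keeps a broken bootstrap from being followed by an install.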

macOS bounded path

  • use the documented macOS deployment mode in docs/MACOS.md
  • prefer minimal divergence from the Linux flow

Health And Readiness

Repository-local

  • bash scripts/fabric-selfcheck.sh
  • python3 scripts/export-fabric-metadata.py
  • python3 scripts/export_fabric_status_bundle.py
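
The three repository-local checks can be run together and summarized. The sketch below is illustrative only: the check labels and `run_local_checks` are hypothetical names, and it assumes exit code 0 means a check passed.

```python
import subprocess

# Repository-local checks exactly as listed above; the dict keys are
# hypothetical labels chosen for this sketch.
LOCAL_CHECKS = {
    "fabric-selfcheck": ["bash", "scripts/fabric-selfcheck.sh"],
    "metadata-export": ["python3", "scripts/export-fabric-metadata.py"],
    "status-bundle-export": ["python3", "scripts/export_fabric_status_bundle.py"],
}


def run_local_checks(repo_root=".", run=subprocess.run):
    """Run every check (no early exit) and map check label -> passed."""
    results = {}
    for name, cmd in LOCAL_CHECKS.items():
        proc = run(cmd, cwd=repo_root, capture_output=True, text=True)
        # Assumption: exit code 0 means the check passed.
        results[name] = proc.returncode == 0
    return results
```

Running all checks instead of stopping at the first failure gives the operator the full picture in one pass, for example `all(run_local_checks().values())` as a release gate.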

Target runtime

  • LocalAI readiness endpoint
  • LocalAI model listing
  • LocalAI embeddings endpoint
  • Qdrant collection inspection
  • OpenClaw memory slot and plugin presence
  • read-only runtime snapshot export and validation

The repository itself exposes no native /health or /ready endpoint.
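
Since readiness lives on the target runtime, the read-only probes listed above can be sketched as plain HTTP GETs. Assumptions in this sketch: the base URLs use the default LocalAI (8080) and Qdrant (6333) ports, `probe_runtime` and `http_get` are hypothetical names, and the collection name is supplied by the operator. The POST to /v1/embeddings is left out because it needs the configured model alias.

```python
import json
import urllib.request

# Assumed base URLs (default ports for LocalAI and Qdrant); adjust to the host.
LOCALAI_URL = "http://127.0.0.1:8080"
QDRANT_URL = "http://127.0.0.1:6333"


def http_get(url, timeout=5):
    """Return (status, body) for a GET request; (None, error text) on network errors."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8", "replace")
    except OSError as exc:  # URLError and HTTPError are OSError subclasses
        return None, str(exc)


def probe_runtime(collection, get=http_get):
    """Read-only probes: LocalAI readiness, model listing, Qdrant collection."""
    status, _ = get(f"{LOCALAI_URL}/readyz")
    localai_ready = status == 200

    status, body = get(f"{LOCALAI_URL}/v1/models")
    models = []
    if status == 200:
        # OpenAI-style listing: {"data": [{"id": "..."}, ...]}
        models = [m.get("id") for m in json.loads(body).get("data", [])]

    status, _ = get(f"{QDRANT_URL}/collections/{collection}")
    return {
        "localai_ready": localai_ready,
        "models": models,
        "qdrant_collection_present": status == 200,
    }
```

The OpenClaw memory slot and plugin checks are not covered here, since the source does not describe an interface for them.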

Healthcheck Standards

  • no check interval below 20s
  • standard runtime checks: 120s interval
  • low-cpu profile checks: 180s interval
  • timeout stays within 2-5s
  • retries stay within 3-5
  • start period stays within 20-60s

The same policy is exposed machine-readably via fabric-manifest.json under runtime.verificationContract.
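
A healthcheck definition can be linted against these bounds before rollout. This is a sketch under stated assumptions: the field names (`interval`, `timeout`, `retries`, `start_period`) are hypothetical and do not reflect the actual schema of runtime.verificationContract in fabric-manifest.json.

```python
# Bounds copied from the Healthcheck Standards list above
# (seconds, except retries). Field names are hypothetical.
POLICY = {
    "interval": (20, None),   # no interval below 20s, no upper bound stated
    "timeout": (2, 5),
    "retries": (3, 5),
    "start_period": (20, 60),
}


def healthcheck_violations(check):
    """Return human-readable policy violations for one healthcheck definition."""
    problems = []
    for field, (low, high) in POLICY.items():
        if field not in check:
            problems.append(f"{field}: missing")
            continue
        value = check[field]
        if value < low or (high is not None and value > high):
            bound = f">= {low}" if high is None else f"{low}-{high}"
            problems.append(f"{field}: {value} outside {bound}")
    return problems
```

A standard runtime check such as `{"interval": 120, "timeout": 5, "retries": 3, "start_period": 30}` yields no violations, while an interval of 10 or a timeout of 10 is reported.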

Logs

Repository-local CI and script output are the primary local logs.

For live operation, relevant logs are on the target runtime:

  • OpenClaw gateway logs
  • LocalAI container logs
  • Qdrant container logs

Monitoring Signals

| Signal | Source | Meaning | Operator action |
| --- | --- | --- | --- |
| fabric-selfcheck ok | scripts/fabric-selfcheck.sh | Repo-local contract is internally consistent. | Continue with bounded verification or release preparation. |
| metadata export succeeds | scripts/export-fabric-metadata.py | Manifest and config contracts can be exported machine-readably. | Use exported metadata for Fabric/Wiki/update consumers. |
| status bundle export succeeds | scripts/export_fabric_status_bundle.py | Repository-only Fabric presence/status surface is valid. | Publish bundle output for Fabric read-side consumers. |
| runtime snapshot contract valid | scripts/export_host_runtime_snapshot.py + scripts/validate_runtime_snapshot.py | Host truth probe is available, structurally valid, and classifies blocked canonical config reads explicitly instead of hanging verification. | Inspect drift signals and execute reapply/rollback decisions if needed. |
| runtime materialization drift check valid | scripts/check_runtime_materialization_drift.py --check-live | Repo truth, active compose/env materialization, container truth, and app readback agree; undocumented host overrides and stale readback fail closed. | Stop rollout, classify owner, and fix repo- or runtime-owner drift before continuing. |
| stack/container contract valid | scripts/check_stack_contract.py | Repository stack truth is complete and naming/compose/env/health contracts are consistent. | Stop rollout and fix contract drift in repo before live mutation. |
| LocalAI runtime contract valid | scripts/check_live_runtime_contract.py | Active LocalAI container labels/path/project and guardrails match canonical contract, and the published LocalAI /readyz surface is reachable. | If failing, redeploy from canonical stack root and remove parallel/legacy compose launch paths. |
| LocalAI ready | LocalAI /readyz | Embedding runtime is available. | Proceed with runtime smoke checks. |
| LocalAI probe guard metrics stable | /tmp/jhf-bobbin-localai-guard.prom | Timeout bursts are below threshold and degraded mode is not active for long periods. | If timeout/degraded counters rise, stop recreate loops and investigate LocalAI runtime pressure before further mutation. |
| LocalAI embeddings respond | LocalAI /v1/embeddings | Embedding path is functionally usable. | Verify model alias and semantic memory path. |
| Qdrant collection present | Qdrant collection inspection | Memory store is available with expected collection. | Continue recall/store checks or rerun bootstrap if missing. |
| OpenClaw memory slot correct | OpenClaw config/runtime | Semantic memory is routed to the intended slot. | Reapply or roll back if the slot drifted. |
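
The "timeout/degraded counters rise" signal for /tmp/jhf-bobbin-localai-guard.prom can be approximated by comparing two reads of the file. This is a rough sketch: it assumes the file uses the Prometheus text exposition format, and because the real metric names are not documented here, it matches on the keywords "timeout" and "degraded" as a stand-in. The function names are hypothetical.

```python
def parse_prom_text(text):
    """Parse Prometheus text-format samples into {name_with_labels: float}."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks and HELP/TYPE comments
            continue
        if "}" in line:
            # Labels may contain spaces, so split after the closing brace.
            end = line.rindex("}") + 1
            name, fields = line[:end], line[end:].split()
        else:
            name, *fields = line.split()
        if fields:
            # First field is the value; an optional timestamp may follow.
            samples[name] = float(fields[0])
    return samples


def rising_counters(previous, current, keywords=("timeout", "degraded")):
    """Names of keyword-matching samples whose value increased between two reads."""
    return sorted(
        name
        for name, value in current.items()
        if any(k in name for k in keywords) and value > previous.get(name, 0.0)
    )
```

In use, an operator would read the file twice some minutes apart, parse both with `parse_prom_text`, and treat a non-empty `rising_counters(previous, current)` result as the cue to stop recreate loops.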

Known Failure Modes

  • OpenClaw upgrade breaks the patched plugin
  • Qdrant collection shape drifts from expected dimensions/indexes
  • LocalAI is reachable but configured model alias is wrong
  • target host env/config values drift from repo expectations
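
Two of these failure modes, a drifted Qdrant vector dimension and a wrong LocalAI model alias, can be detected from already-fetched JSON. The sketch below assumes the response shapes of Qdrant's GET /collections/{name} (single unnamed vector) and an OpenAI-style GET /v1/models; the function names, the expected size, and the alias passed in are hypothetical inputs chosen by the operator.

```python
def collection_vector_size(collection_info):
    """Extract the vector size from a parsed Qdrant collection-info response."""
    vectors = collection_info["result"]["config"]["params"]["vectors"]
    # Single unnamed vector config looks like {"size": N, "distance": "..."}.
    if "size" in vectors:
        return vectors["size"]
    raise ValueError("named-vector collections are not handled in this sketch")


def model_alias_available(models_response, alias):
    """True if `alias` appears as an id in a parsed /v1/models response."""
    return any(m.get("id") == alias for m in models_response.get("data", []))


def drift_findings(collection_info, models_response, expected_size, alias):
    """Return a list of human-readable drift findings (empty means no drift seen)."""
    findings = []
    actual = collection_vector_size(collection_info)
    if actual != expected_size:
        findings.append(f"qdrant vector size {actual} != expected {expected_size}")
    if not model_alias_available(models_response, alias):
        findings.append(f"model alias {alias!r} not listed by LocalAI")
    return findings
```

Index drift and plugin breakage after an OpenClaw upgrade are not covered; they need checks specific to the deployed contract.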

Restart / Recovery

  • use scripts/reapply_after_openclaw_update.sh after host updates
  • restore memory-core via scripts/activate_memory_core.py if semantic memory becomes unstable
  • rerun smoke and Qdrant checks after recovery

Runtime Dependency Summary

  • OpenClaw host
  • LocalAI
  • Qdrant
  • optional OpenAI-compatible LLM endpoint
  • operator access to host configuration and extensions path

Packaging Operations

  • build package: bash scripts/build_package.sh --version <version>
  • publish package: GITEA_TOKEN=*** bash scripts/publish_package.sh --version <version>
  • verify package pull + digest: GITEA_TOKEN=*** bash scripts/verify_published_package.sh --version <version>
  • canonical contract: docs/ARTIFACT_CONTRACT.md
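
The digest step of package verification can be illustrated independently of the scripts. This sketch assumes a SHA-256 digest, which is not confirmed here; the authoritative algorithm and format are whatever docs/ARTIFACT_CONTRACT.md specifies. `sha256_of` and `digest_matches` are hypothetical helper names.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large packages are not loaded into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def digest_matches(path, expected):
    """Compare a file against an expected hex digest (optional 'sha256:' prefix)."""
    expected = expected.strip().lower()
    if expected.startswith("sha256:"):
        expected = expected[len("sha256:"):]
    return sha256_of(path) == expected
```

A mismatch should be treated as a failed pull rather than retried silently, since it means the downloaded artifact is not the one that was published.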

Current Operational Gaps

  • no host-probed runtime snapshot in default CI (intentionally operator-run only)
  • no centralized live runtime metrics endpoint from this repo
  • compatibility matrix maintenance remains operator-evidence-driven

Canonical Ops References

  • docs/STANDALONE_VERIFICATION_PROFILE.md
  • docs/REGRESSION_CADENCE.md
  • docs/OPERATOR_HANDOVER.md
  • docs/STACK_CONTAINER_CONTRACT.md

AGPLv3. See ../LICENSE.

Learn more at helpifyr.com.