Operations
Use this area for install, deploy, observe, recover, and post-verify guidance. It is the operator-first entry point for practical runbooks, not a mirror of repository folders.
Architecture / Flow
What this area covers
- Safe restart, upgrade, verification, and recovery paths for the admitted public Helpifyr stack.
- Symptom-oriented runbooks that tell operators what to check first, what evidence to collect, and when to escalate.
- Bounded links back to versioned product operations pages when the operational boundary is tool-specific.
How to use these runbooks
- Start with the symptom that best matches what you see, not with the repository you think might be involved.
- Capture readback and evidence before mutation so recovery stays auditable and reversible.
- Treat Platform Truth as the contract boundary when runtime behavior and docs appear to disagree.
Canonical scope in this wave
- /docs/operations/install
- /docs/operations/configuration
- /docs/operations/health-readiness
- /docs/operations/troubleshooting
- /docs/operations/upgrade
- /docs/operations/backup-recovery
- /docs/operations/automation
Current materialized surfaces
- Restart safely
- Upgrade safely
- Check stack health
- Debug failed agent task
- Debug failed Shuttle or n8n workflow
- Debug authentication issue
- Debug Fabric readiness issue
- Recover from partial deployment
- Failure Modes
- Upgrade and Migration
Related entry points
- Upgrade and Migration for release sequencing and rollback posture.
- Compatibility before changing version combinations or support windows.
- Versioned product operations pages under /products/<tool>/<channel>/operations when the failure is already narrowed to a specific module.
Verification
This area is working as intended when a reader can:
- start from a symptom or operator need instead of guessing the repo
- choose a bounded runbook with explicit readback and recovery posture
- move from stack-level operations into product-specific operations only when the boundary is already narrowed
Common failure modes
Starting with a repo instead of a symptom
Problem:
- the reader jumps into a tool-specific lane before knowing whether the issue is actually local to that tool.
Better path:
- start from the symptom-oriented runbook list, and move into a tool-specific lane only once the symptom points there
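To make the symptom-first habit concrete, here is a minimal sketch of a lookup that routes a free-text symptom to one of the runbook titles listed on this page. The runbook titles come from the list above; the keyword sets, the `pick_runbook` helper, and the choice of "Check stack health" as the fallback are illustrative assumptions, not an official symptom taxonomy or an existing Helpifyr tool.

```python
# Runbook titles are taken from this page. The keyword tuples are
# hypothetical examples chosen for illustration only.
RUNBOOK_KEYWORDS: dict[str, tuple[str, ...]] = {
    "Debug failed agent task": ("agent", "task"),
    "Debug failed Shuttle or n8n workflow": ("shuttle", "n8n", "workflow"),
    "Debug authentication issue": ("login", "token", "401", "auth"),
    "Debug Fabric readiness issue": ("fabric", "readiness"),
    "Recover from partial deployment": ("partial", "deploy", "rollout"),
}

# Assumption: when nothing matches, a stack-level health check is the
# least tool-specific place to begin.
FALLBACK_RUNBOOK = "Check stack health"


def pick_runbook(symptom: str) -> str:
    """Return the runbook title whose keywords match the symptom most often."""
    text = symptom.lower()
    best, best_hits = FALLBACK_RUNBOOK, 0
    for title, keywords in RUNBOOK_KEYWORDS.items():
        # Plain substring matching: crude, but enough for a sketch.
        hits = sum(1 for kw in keywords if kw in text)
        if hits > best_hits:
            best, best_hits = title, hits
    return best


if __name__ == "__main__":
    print(pick_runbook("n8n workflow stuck after trigger"))
    print(pick_runbook("everything feels slow"))
```

The point of the sketch is the direction of the lookup: the input is what the operator observes, and the tool-specific lane is an output rather than a starting guess.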
Mutating the stack before capturing readback
Problem:
- evidence is lost and recovery becomes less auditable.
Better path:
- capture health, readiness, and relevant operator evidence before restart, redeploy, or rollback
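One way to enforce the capture-before-mutation order is to wrap the mutation in a helper that always snapshots evidence first. The sketch below is an assumption-laden illustration: `capture_evidence`, `mutate_with_evidence`, the probe names, and the JSON file layout are all hypothetical, and the probes are plain callables standing in for whatever health and readiness checks your stack actually exposes.

```python
import json
import time
from pathlib import Path
from typing import Callable, Dict, Tuple

# A probe is any zero-argument callable that returns something JSON-friendly.
Probe = Callable[[], object]


def capture_evidence(probes: Dict[str, Probe], out_dir: Path, label: str) -> Path:
    """Run every probe and write the results to a labeled JSON file."""
    snapshot = {
        "label": label,
        "captured_at": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
        "results": {},
    }
    for name, probe in probes.items():
        try:
            snapshot["results"][name] = {"ok": True, "value": probe()}
        except Exception as exc:
            # A failing probe is itself evidence, so record it and continue.
            snapshot["results"][name] = {"ok": False, "error": repr(exc)}
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / f"evidence-{label}-{time.time_ns()}.json"
    path.write_text(json.dumps(snapshot, indent=2, default=str))
    return path


def mutate_with_evidence(
    probes: Dict[str, Probe], mutate: Callable[[], None], out_dir: Path
) -> Tuple[Path, Path]:
    """Capture evidence, run the mutation, then capture evidence again."""
    before = capture_evidence(probes, out_dir, "before")
    mutate()  # restart, redeploy, or rollback would go here
    after = capture_evidence(probes, out_dir, "after")
    return before, after
```

A caller would pass its own probes and mutation, for example `mutate_with_evidence({"health": check_health}, restart_stack, Path("evidence"))`, where `check_health` and `restart_stack` are placeholders for site-specific functions. Keeping the before and after snapshots side by side is what makes a later comparison or rollback decision auditable.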