Ten milestones, numbered 0 through 9, sequence the implementation. Each milestone builds on the previous one and has a specific "done when" criterion that is testable, not aspirational.

Milestone 0: Skeleton
Timeline: Day 1–2
Goal: A running process with DB, API, and audit writer. No pipeline yet.
Deliverables:
- pyproject.toml with deps (fastapi, uvicorn, sqlalchemy, alembic, pydantic)
- config.py with typed settings (Pydantic BaseSettings, reads DATABASE_URL from env)
- audit_events, system_health tables
- AuditWriter implemented and unit-tested
- GET /health returns a hardcoded heartbeat
- GET /audit returns audit events (empty list)
Done when: uv run python -m syris.main starts. Hit /health and get a JSON response. Write an audit event via test helper. See it at GET /audit. All unit-tested.
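The AuditWriter deliverable could be sketched roughly as follows. This is a minimal illustration, assuming an in-memory list in place of the audit_events table; the field names and the `query` method are hypothetical and not taken from the SYRIS codebase.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional
from uuid import uuid4


@dataclass(frozen=True)
class AuditEvent:
    """One audit row. Field names are illustrative assumptions."""
    event_type: str
    trace_id: str
    payload: dict
    id: str = field(default_factory=lambda: uuid4().hex)
    created_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


class AuditWriter:
    """Append-only writer. A list stands in for the audit_events table."""

    def __init__(self) -> None:
        self._rows: list[AuditEvent] = []

    def write(self, event_type: str, trace_id: str, **payload) -> AuditEvent:
        event = AuditEvent(event_type=event_type, trace_id=trace_id, payload=payload)
        self._rows.append(event)
        return event

    def query(self, trace_id: Optional[str] = None) -> list[AuditEvent]:
        # Hypothetical read path behind GET /audit and GET /audit?trace_id=X.
        if trace_id is None:
            return list(self._rows)
        return [row for row in self._rows if row.trace_id == trace_id]
```

A unit test for the "done when" criterion would then write one event through the helper and check that `query()` returns it, while a fresh writer returns an empty list.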

Milestone 1: Pipeline skeleton with real audit output
Timeline: Week 1
Goal: An event flows end-to-end through Normalize → Route → Execute (fast only) and every stage emits a queryable audit event. The full routing cascade structure is wired — even if most branches are stubs — so future milestones add intents without touching the router's structure.
Deliverables:
- schemas/ package: MessageEvent, RoutingDecision, AuditEvent (Pydantic v2)
- storage/models.py + repos for events, audit, routing_decisions
- normaliser.py: accepts raw dict + channel, returns MessageEvent, persists + audits
- router.py: full cascade structure wired: filters → IFTTT rules → fastpath → LLM fallback. Ships with one fastpath intent (timer.set). Rules and LLM fallback are stubs that return a canned routing.unhandled response, but the cascade is real and traversed in order.
- pipeline/executor.py: fast lane only, calls NoopTool, audits
- tools/executor.py: scope check + idempotency + NoopTool + audit
- pipeline/responder.py: stub final stage that logs that a response would be sent and does nothing yet
- GET /events, GET /audit, GET /audit?trace_id=X working
- Fastpath intents added this milestone: timer.set
Done when: POST a raw event → full trace at /audit?trace_id=X showing exactly 4 audit events (event.ingested, routing.decided, tool_call.attempted, tool_call.succeeded). POST same event again → event.deduped. Sending an event the router cannot match → routing.unhandled in audit with no tool call attempted. Zero log-spelunking required.
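The routing cascade described for router.py can be sketched as an ordered list of stages, each of which either returns a decision or passes the event on. This is a simplified illustration, not the project's router: the `RoutingDecision` fields, the text-prefix match for timer.set, and the choice to let the rules stub pass events through (so that fastpath is still reached) are all assumptions.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class RoutingDecision:
    intent: str
    lane: str   # "fast" or "none" in this sketch
    stage: str  # which cascade stage produced the decision


Stage = Callable[[dict], Optional[RoutingDecision]]


def filters(event: dict) -> Optional[RoutingDecision]:
    return None  # stub: nothing is filtered out yet


def rules(event: dict) -> Optional[RoutingDecision]:
    return None  # stub (assumption): no rule matches, so the event falls through


def fastpath(event: dict) -> Optional[RoutingDecision]:
    # Hypothetical matcher for the single fastpath intent of this milestone.
    if event.get("text", "").lower().startswith("set a timer"):
        return RoutingDecision("timer.set", "fast", "fastpath")
    return None


def llm_fallback(event: dict) -> Optional[RoutingDecision]:
    # Stub: canned response instead of a real LLM call.
    return RoutingDecision("routing.unhandled", "none", "llm_fallback")


CASCADE: list[Stage] = [filters, rules, fastpath, llm_fallback]


def route(event: dict) -> RoutingDecision:
    """Traverse the cascade in order and return the first decision."""
    for stage in CASCADE:
        decision = stage(event)
        if decision is not None:
            return decision
    raise RuntimeError("cascade ended without a decision")
```

Because the stage order lives in one list, a later milestone can replace a stub with a real implementation without changing `route` itself.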

Milestone 2: Task engine + LLM-planned tasks
Timeline: Week 2
Goal: Multi-step workflows with checkpointing, retries, and crash recovery. Adds the llm_plan execution lane so the LLM can act as executor — not just router — for inputs that require dynamic step sequences.
Deliverables:
- schemas/tasks.py: Task, Step, RetryPolicy
- storage/repos/tasks.py
- tasks/engine.py: claim → execute → checkpoint loop with FOR UPDATE SKIP LOCKED
- tasks/step_runner.py: run step, handle retries, write checkpoint
- tasks/state.py: state machine enforcement (illegal transitions blocked at DB level)
- tasks/recovery.py: startup reconciliation
- tasks/llm_runner.py: handles steps of type llm_decide. At each such step, calls the LLM with the user's original intent plus all prior step outputs, and receives a structured decision: either call a named tool next, or mark the task complete. The resulting tool call is dispatched through the normal tool executor with full gating. The step sequence is built dynamically as the task executes; it is not fixed at creation.
- GET /tasks, GET /tasks/{id}, POST /tasks/{id}/cancel|pause|resume
- Fastpath intents added this milestone: task.status, task.cancel
Done when: Create a 3-step task with a NoopTool at step 2. Kill the process mid-step-2. Restart. Task resumes from step 2 (not step 1). Audit shows interruption and recovery. Run 10 times — no duplicated side effects. Separately, route an unrecognised input through llm_plan → task created → LLM runner fires → llm.step_decided + tool_call.attempted in audit.
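The resume-from-checkpoint behaviour in the "done when" criterion can be illustrated with a small simulation. This sketch assumes a checkpoint is simply the index of the first unfinished step, held on an in-memory `Task` object instead of a database row; the names are hypothetical and retries are left out.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Task:
    steps: list[str]
    next_step: int = 0  # checkpoint: index of the first unfinished step
    outputs: dict[int, str] = field(default_factory=dict)
    status: str = "pending"


def run_task(task: Task, run_step: Callable[[int, str], str]) -> None:
    """Execute from the last checkpoint; advance the checkpoint after each success."""
    task.status = "running"
    while task.next_step < len(task.steps):
        i = task.next_step
        task.outputs[i] = run_step(i, task.steps[i])  # may raise, simulating a crash
        task.next_step = i + 1                        # checkpoint written only on success
    task.status = "succeeded"


# Simulated run: the process "dies" the first time step 2 (index 1) executes.
calls: list[str] = []
crashed_once = {"value": False}


def flaky_runner(index: int, name: str) -> str:
    calls.append(name)
    if index == 1 and not crashed_once["value"]:
        crashed_once["value"] = True
        raise RuntimeError("simulated crash mid-step")
    return f"{name}:ok"


task = Task(steps=["fetch", "noop", "report"])
try:
    run_task(task, flaky_runner)
except RuntimeError:
    pass                      # checkpoint still points at step 2
run_task(task, flaky_runner)  # "restart": resumes at step 2, not step 1
```

Note that the interrupted step itself runs twice in this simulation, which is why the criterion about duplicated side effects would depend on the idempotency check in the tool executor and not on checkpointing alone.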

Milestone 3: Safety layer + Approvals + Response synthesis
Timeline: Week 3
Goal: Autonomy levels, risk classification, and approval gates working end-to-end. Also: for user-initiated events, a response is synthesised and sent back through the originating channel.
Deliverables:
- safety/autonomy.py: read/write current level, persist history
- safety/risk.py: classify tool action → risk level
- safety/gates.py: gate decision logic per autonomy × risk matrix
- safety/dryrun.py: preview protocol
- schemas/approvals.py + storage/repos/approvals.py
- GET /approvals, POST /approvals/{id}/approve|deny
- POST /controls/autonomy
- pipeline/responder.py (full implementation): replaces the stub from Milestone 1. After any user-initiated event completes (fast-lane or multi-step), collects the tool outputs and calls the LLM to compose a natural-language reply. Dispatches the reply via the originating channel's outbound adapter. The send itself goes through the tool executor with normal gating: at A1, sending a reply email still requires approval.
- Fastpath intents added this milestone: autonomy.set, approval.list, approval.approve, approval.deny
Done when: Set autonomy to A1. Trigger a medium-risk tool call. Approval created at /approvals, tool not executed, gate.required in audit. Approve via API. Tool executes. gate.approved + tool_call.succeeded + response.sent in audit. Full trace queryable. Separately, trigger a low-risk fast-lane action — response.sent appears in audit with the composed reply.
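One way to picture the autonomy × risk matrix is as a lookup table that fails closed. In the sketch below, only the A1/medium cell (approval required) comes from the text above; the other level names, risk names, and cell values are placeholders invented for illustration.

```python
from enum import Enum


class Gate(str, Enum):
    ALLOW = "allow"
    REQUIRE_APPROVAL = "require_approval"
    DENY = "deny"


# (autonomy level, risk level) -> gate decision.
# Only ("A1", "medium") is grounded in the milestone text; the rest are placeholders.
GATE_MATRIX: dict[tuple[str, str], Gate] = {
    ("A0", "low"): Gate.REQUIRE_APPROVAL,
    ("A0", "medium"): Gate.REQUIRE_APPROVAL,
    ("A0", "high"): Gate.DENY,
    ("A1", "low"): Gate.ALLOW,
    ("A1", "medium"): Gate.REQUIRE_APPROVAL,
    ("A1", "high"): Gate.REQUIRE_APPROVAL,
    ("A2", "low"): Gate.ALLOW,
    ("A2", "medium"): Gate.ALLOW,
    ("A2", "high"): Gate.REQUIRE_APPROVAL,
}


def decide_gate(autonomy: str, risk: str) -> Gate:
    """Look up the gate decision; unknown combinations fail closed."""
    return GATE_MATRIX.get((autonomy, risk), Gate.DENY)
```

Keeping the matrix as data makes each cell individually testable, which fits the roadmap's preference for testable criteria.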

Milestone 4: Scheduler + Watchers
Timeline: Week 4
Goal: Timers and scheduled events flow through the pipeline proactively.
Deliverables:
- scheduler/loop.py: cron + interval + one-shot loop
- storage/repos/schedules.py
- watchers/base.py + HeartbeatWatcher
- GET /schedules, POST /schedules, PATCH /schedules/{id}
- GET /watchers, PATCH /watchers/{id}
- GET /health now uses real heartbeat data
- Fastpath intents added this milestone: schedule.create, schedule.list, schedule.cancel, timer.set (full implementation replacing stub), schedule.pause
Done when: Create a 30-second interval schedule. Observe schedule.fired in audit every ~30 seconds. Heartbeat appears at /health with real uptime. Disable watcher via API → confirm it stops ticking. Kill process, restart → schedule catches up per policy.
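The "schedule catches up per policy" criterion implies computing which fires were missed while the process was down. A rough sketch for interval schedules follows; the policy names `skip`, `fire_once`, and `fire_all` are hypothetical, since the text does not name the available policies.

```python
from datetime import datetime, timedelta


def missed_fires(last_fired: datetime, now: datetime, interval: timedelta) -> list[datetime]:
    """Return the fire times that fell due in the window (last_fired, now]."""
    if interval <= timedelta(0):
        raise ValueError("interval must be positive")
    due = []
    t = last_fired + interval
    while t <= now:
        due.append(t)
        t += interval
    return due


def catch_up(last_fired: datetime, now: datetime, interval: timedelta, policy: str) -> list[datetime]:
    """Select which missed fires to emit after a restart (policy names are assumptions)."""
    due = missed_fires(last_fired, now, interval)
    if policy == "skip":
        return []          # drop everything missed, wait for the next tick
    if policy == "fire_once":
        return due[-1:]    # emit only the most recent missed fire
    if policy == "fire_all":
        return due         # replay every missed fire
    raise ValueError(f"unknown catch-up policy: {policy}")
```

For a 30-second schedule that last fired at 12:00:00 and a restart at 12:01:40, three fires were missed (12:00:30, 12:01:00, 12:01:30), and `fire_once` would emit only the last of them.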

Milestone 5: Rules Engine
Timeline: Week 5
Goal: IFTTT-style rules fire, suppress correctly, and emit child events.
Deliverables:
- rules/engine.py + condition evaluator
- storage/repos/rules.py (rules stored in DB)
- quiet_hours_policies table)
- GET /rules, PATCH /rules/{id}
- Fastpath intents added this milestone: rule.list, rule.enable, rule.disable, rule.create
Done when: Rule matching ha_event fires → emits child event with parent_event_id set → both in audit with same trace_id. Same event fired 5× in 1 second with 10s debounce → 1 rule.triggered + 4 rule.suppressed in audit.
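The debounce criterion above (5 events in 1 second, 10s debounce, 1 trigger and 4 suppressions) is consistent with a leading-edge debounce: the first event fires and later events inside the window are suppressed. The sketch below assumes that interpretation; the class and method names are hypothetical.

```python
from typing import Optional


class Debouncer:
    """Leading-edge debounce keyed by rule: first event fires, the rest are suppressed."""

    def __init__(self, window_s: float) -> None:
        self.window_s = window_s
        self._last_trigger: dict[str, float] = {}

    def check(self, rule_id: str, now: float) -> str:
        """Return the audit event type for one matching event at time `now` (seconds)."""
        last: Optional[float] = self._last_trigger.get(rule_id)
        if last is not None and now - last < self.window_s:
            return "rule.suppressed"
        self._last_trigger[rule_id] = now
        return "rule.triggered"


# Five matching events spread over one second, with a 10-second window.
debouncer = Debouncer(window_s=10.0)
results = [debouncer.check("rule-1", t) for t in (0.0, 0.2, 0.4, 0.6, 0.8)]
# results holds one "rule.triggered" followed by four "rule.suppressed"
```

In this version the window is measured from the last trigger, not from the last suppressed event, so a steady stream of events still fires once per window.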

Milestone 6: LLM conversation quality
Timeline: Week 6
Goal: The LLM is actually good at its job. It gets structured context, knows what tools are available, and produces consistent, useful responses. The architecture should support the model running multi-turn tool calling in an agentic loop — making it not just an LLM, but a true agent. This is the milestone where SYRIS goes from "LLM wired in" to "LLM worth using."
Deliverables:
- llm/context.py: builds the context bundle passed to the LLM on any call. Includes: the user's original message, recent conversation history scoped to thread_id, relevant recent audit events (last N completions, active tasks), and the current tool registry (so the LLM knows what capabilities exist).
- llm/prompts.py: prompt templates and a SYRIS system prompt covering personality, role, response style, and constraints. Includes few-shot examples for the most common fallback intents.
- llm/provider.py: thin CompletionProvider interface wrapping the inference backend. Swappable: the initial implementation can be a direct API call; later milestones can swap in SGLang or another engine without touching callers.
- thread_id propagated through MessageEvent and stored on tasks, so conversation history is joinable.
- GET /llm/context?trace_id=X debug endpoint: shows exactly what context the LLM received for a given trace.
Done when: Send two related messages in the same thread. LLM response to the second message demonstrably references the first (verifiable via /llm/context). Tool registry is present in context, and the LLM correctly selects a registered tool for an ambiguous input rather than hallucinating one. System prompt and templates are version-controlled and testable.
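The context bundle and the swappable provider interface might look something like the sketch below. It assumes plain dicts for messages and audit events and a `typing.Protocol` for CompletionProvider; the `complete` signature, the bundle fields, and `CannedProvider` are illustrative guesses, not the project's actual interface.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass
class ContextBundle:
    user_message: str
    history: list[str]        # earlier messages from the same thread_id
    recent_audit: list[str]   # event types of the last N audit events
    tools: list[str]          # names from the tool registry


class CompletionProvider(Protocol):
    """Anything with this method can act as the inference backend (assumed signature)."""

    def complete(self, system_prompt: str, context: ContextBundle) -> str: ...


def build_context(message: dict, messages: list[dict], audit: list[dict],
                  tool_registry: dict, last_n: int = 5) -> ContextBundle:
    """Assemble what the LLM sees for one message, scoping history by thread_id."""
    thread_id = message["thread_id"]
    history = [m["text"] for m in messages
               if m["thread_id"] == thread_id and m is not message]
    recent = [a["event_type"] for a in audit[-last_n:]] if last_n > 0 else []
    return ContextBundle(message["text"], history, recent, sorted(tool_registry))


class CannedProvider:
    """Test double that reports how much context it was given."""

    def complete(self, system_prompt: str, context: ContextBundle) -> str:
        return f"history={len(context.history)} tools={len(context.tools)}"
```

Because callers depend only on the `complete` method, a test double like `CannedProvider` and a real backend are interchangeable, and the bundle returned by `build_context` is the kind of object a debug endpoint could serialise.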

Milestone 7: MCP Integration
Timeline: Week 7–8
Goal: An MCP server's tools appear in the tool registry and execute with full SYRIS gating. Fastpath intents are auto-registered for discovered MCP tools.
Deliverables:
- mcp/connection.py: persistent connection + reconnect
- mcp/provider.py: tool discovery + registry sync
- mcp/adapter.py: MCPToolAdapter(BaseTool)
- mcp/trust.py: TrustPolicy schema + loader
- mcp.<server>.<tool_name>)
- GET /integrations showing MCP server health
Done when: Connect a real MCP server. Tools appear in GET /integrations and in the tool registry visible to the LLM context. Execute one tool → full audit trail (scope check, risk, idempotency, gate, result). Disconnect server → health degrades in dashboard within 30 seconds. LLM correctly selects a newly-registered MCP tool for a relevant input.
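Registry sync after tool discovery could be sketched as below. The sketch assumes that registry keys follow the `mcp.<server>.<tool_name>` pattern that appears in the deliverables, and that the registry is a plain dict; it makes no calls to a real MCP client, and `sync_registry` is a hypothetical helper.

```python
def sync_registry(registry: dict[str, dict], server: str, discovered: list[str]) -> dict[str, dict]:
    """Return a new registry in which this server's entries match the discovered tools.

    Entries belonging to other servers, or to non-MCP tools, are left untouched.
    """
    prefix = f"mcp.{server}."
    # Drop stale entries for this server, then add what discovery reported.
    synced = {name: meta for name, meta in registry.items() if not name.startswith(prefix)}
    for tool_name in discovered:
        synced[f"{prefix}{tool_name}"] = {"server": server, "tool": tool_name}
    return synced
```

Returning a new dict instead of mutating in place keeps the sync step easy to test and lets a caller swap the registry in one assignment.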

Milestone 8: Worker Skeleton
Timeline: Week 9
Goal: A gated job submission and status reporting mechanism exists.
Deliverables:
- workers/manager.py: job table, spawn/progress/cancel
- workers/runtimes/process.py: OS process isolation
- GET /state shows job count
Done when: Submit a stub long-running job via API → observe in /state → cancel → see cancellation in audit.
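A job manager with OS process isolation might be approximated with `subprocess` as below. This is a rough sketch under the assumption that each job is a separate child process and that the job table can live in memory; gating, progress reporting, and audit writes are omitted, and all names are hypothetical.

```python
import subprocess
import sys


class JobManager:
    """In-memory job table; each job runs as its own OS process."""

    def __init__(self) -> None:
        self._procs: dict[int, subprocess.Popen] = {}
        self._cancelled: set[int] = set()
        self._next_id = 1

    def submit(self, argv: list[str]) -> int:
        job_id = self._next_id
        self._next_id += 1
        self._procs[job_id] = subprocess.Popen(argv)
        return job_id

    def status(self, job_id: int) -> str:
        if job_id in self._cancelled:
            return "cancelled"
        code = self._procs[job_id].poll()
        if code is None:
            return "running"
        return "succeeded" if code == 0 else "failed"

    def cancel(self, job_id: int) -> None:
        proc = self._procs[job_id]
        if proc.poll() is None:   # only a still-running job can be cancelled
            proc.terminate()
            proc.wait(timeout=5)
            self._cancelled.add(job_id)

    def running_count(self) -> int:
        # The kind of number a GET /state handler could report.
        return sum(1 for p in self._procs.values() if p.poll() is None)


def stub_job_argv(seconds: int = 60) -> list[str]:
    """Command line for a stub long-running job: a Python child that just sleeps."""
    return [sys.executable, "-c", f"import time; time.sleep({seconds})"]
```

Submitting `stub_job_argv()`, checking `running_count()`, and then calling `cancel` mirrors the submit, observe, cancel sequence in the "done when" criterion.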

Milestone 9: First real integrations
Timeline: Week 10+
Goal: A Home Assistant adapter or email adapter working end-to-end with live data, including a full response loop.
Deliverables:
Done when: A real-world event flows in, routes correctly, executes a tool with full audit trail, requires and passes approval if risk demands it, and a natural-language reply is sent back through the originating channel. SYRIS is useful.