Baseline architecture design doc for the SYRIS v3 core (event pipeline, tasks, integrations, safety, observability, workers).
This document is a consolidated architecture draft assembled from prior system requirements and iterative design discussions. It should be treated as a living baseline and updated as implementation realities emerge.
Every inbound signal is normalized into a MessageEvent and goes through Normalize > Route > Plan/Execute. At the center is an Event Processing Core that consumes normalized events, routes them, and executes actions through a Tool/Integration Layer, while persisting events, tasks, and audit records.
Key design choice: use a single durable event log + task store to reconstruct state after restarts. This isn’t full event sourcing (you may also store current-state projections), but it provides replay/debuggability.
MessageEvent (unified internal event schema): versioned (e.g., schema_version="1.0"). Must preserve provenance, identity, content, and correlation.
Fields (suggested):
- event_id (UUID)
- schema_version
- occurred_at (source timestamp)
- ingested_at
- source:
  - channel (e.g., email, sms, slack, ha_event, webhook, scheduler, rule_engine, vision)
  - connector_id (which integration instance)
  - thread_id / conversation_id (if applicable)
  - message_id (native id if applicable)
- actor:
  - actor_type (user | system | integration)
  - actor_id (stable)
- content:
  - text (optional)
  - structured (dict) (e.g., parsed intent, webhook payload summary)
  - attachments (list of metadata refs)
  - links (list)
- context:
  - locale, timezone
  - device_id (if from IoT)
- correlation:
  - trace_id (spans the whole chain)
  - parent_event_id (if triggered by watcher/rule)
  - dedupe_key (if provided/derived)
- security:
  - sensitivity (low/med/high)
  - redaction_policy_id

Tradeoff: richer schema costs effort, but avoids bespoke handling per channel and makes dashboarding and replay viable.
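The shape of this schema can be sketched with plain dataclasses (the blueprint suggests pydantic for `models/`, which would add validation; the field subset and defaults here are illustrative):

```python
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional

@dataclass
class Source:
    channel: str                       # email, sms, slack, ha_event, webhook, ...
    connector_id: str                  # which integration instance
    thread_id: Optional[str] = None
    message_id: Optional[str] = None   # native id if applicable

@dataclass
class Correlation:
    trace_id: str                      # spans the whole chain
    parent_event_id: Optional[str] = None
    dedupe_key: Optional[str] = None

@dataclass
class MessageEvent:
    source: Source
    correlation: Correlation
    schema_version: str = "1.0"
    event_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    occurred_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    ingested_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    text: Optional[str] = None
    structured: dict = field(default_factory=dict)
    sensitivity: str = "low"

evt = MessageEvent(
    source=Source(channel="email", connector_id="gmail-1"),
    correlation=Correlation(trace_id=str(uuid.uuid4())),
    text="hello",
)
```

Keeping source and correlation as nested objects mirrors the grouping above and lets normalization populate each group independently.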
Task fields:
- task_id, created_at, status (pending / running / paused / succeeded / failed / canceled)
- trigger_event_id (what started it)
- trace_id
- current_step_id
- autonomy_level_at_start
- labels (for dashboards)
- next_wake_time (for sleepers)

Step fields:
- step_id, task_id, name, status
- attempt, max_attempts, retry_backoff_policy
- input (structured), checkpoint (structured), output (structured)
- idempotency_key (required if effectful)
- started_at, ended_at, error (structured)

Every meaningful operation writes an AuditEvent:
AuditEvent fields:
- audit_id, timestamp
- trace_id, span_id, parent_span_id
- event_id / task_id / step_id references
- type (enum)
- summary (human readable)
- payload (structured, redacted)
- outcome (success/failure/suppressed)
- latency_ms, connector_id, tool_name
- risk_level, autonomy_level

Tradeoff: append-only logs grow; mitigate with retention + compaction + export.
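A minimal append-only writer illustrating the redaction discipline (the `REDACT` key set and the helper name are illustrative, not a fixed API):

```python
import time
import uuid

def write_audit(log: list, *, type: str, trace_id: str, summary: str,
                payload: dict, outcome: str = "success", **refs) -> dict:
    """Append one redacted, structured AuditEvent to an append-only log."""
    REDACT = {"password", "token", "secret"}  # illustrative sensitive keys
    redacted = {k: ("***" if k in REDACT else v) for k, v in payload.items()}
    event = {
        "audit_id": str(uuid.uuid4()),
        "timestamp": time.time(),
        "trace_id": trace_id,
        "type": type,
        "summary": summary,
        "payload": redacted,
        "outcome": outcome,
        **refs,  # e.g., task_id=..., step_id=..., tool_name=...
    }
    log.append(event)  # append-only: past entries are never mutated
    return event

log: list = []
write_audit(log, type="ToolCallAttempted", trace_id="t-1",
            summary="send email", payload={"to": "a@b.c", "token": "xyz"})
```

Redacting at write time (rather than at read time) keeps raw secrets out of the durable log entirely.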
Input: raw adapter events
Output: MessageEvent + normalization audit record
Responsibilities: validate raw adapter payloads, map them onto the MessageEvent schema, and derive a stable dedupe_key.

Testability: deterministic transforms.
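A sketch of deterministic dedupe_key derivation, assuming we prefer the channel's native message_id and fall back to hashing content (the function name and truncation length are illustrative):

```python
import hashlib
from typing import Optional

def derive_dedupe_key(channel: str, message_id: Optional[str],
                      text: Optional[str]) -> str:
    """Deterministically derive a dedupe key: prefer the native message id,
    else hash the content, so replays of the same raw event collide."""
    basis = message_id if message_id else (text or "")
    raw = f"{channel}:{basis}".encode()
    return hashlib.sha256(raw).hexdigest()[:16]
```

Because the transform is a pure function of its inputs, it is trivially unit-testable and safe to re-run during replay.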
Input: MessageEvent
Output: RoutingDecision
Routing layers (in order): deterministic intent parsing, rule matching, LLM fallback.
Routing decision must record: which layer matched, the resulting intent/target, and the trace_id, so every decision is auditable and replayable.
Tradeoff: rules-first is more work upfront but yields predictable ops and lower cost.
Two modes (suggested): direct execution of a single tool call for simple intents, and a durable multi-step task for anything long-running.

Execution always: writes audit events, passes safety/autonomy gates, and records an idempotency key before any effectful call.

Core loop: pick the next runnable step, execute it, checkpoint the result, repeat until the task terminates or sleeps.
- idempotency_key = hash(tool_name + stable inputs + trace_id + step_id)
- trace_id created at ingestion; span_id links to tool calls.

Tradeoff: building a real “workflow engine” is heavier than simple background jobs, but it’s required for restart safety and operator control.
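The idempotency scheme above can be sketched as follows (the in-memory outcome dict stands in for a durable table keyed by idempotency_key):

```python
import hashlib
import json

def idempotency_key(tool_name: str, inputs: dict, trace_id: str, step_id: str) -> str:
    """hash(tool_name + stable inputs + trace_id + step_id); sort_keys makes
    the serialized inputs stable across dict orderings."""
    stable = json.dumps(inputs, sort_keys=True)
    raw = f"{tool_name}|{stable}|{trace_id}|{step_id}".encode()
    return hashlib.sha256(raw).hexdigest()

# Replay guard: idempotency_key -> recorded outcome (a real store is durable).
_outcomes: dict = {}

def execute_once(key: str, call):
    """Run the effectful call at most once; replays return the stored outcome."""
    if key in _outcomes:
        return _outcomes[key]
    result = call()
    _outcomes[key] = result
    return result
```

Recording the outcome before acknowledging the step means a crash between execution and checkpoint resolves to the stored result on replay instead of a duplicate side effect.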
InboundAdapter:
- capabilities() > what events it can emit (threads, attachments, message types)
- connect() / poll() or webhook_handler()
- normalize(raw) > returns normalized intermediate (or directly MessageEvent)

OutboundAdapter / Tool:
- capabilities() > read/send/create/delete/control, formatting limits, supports threads, etc.
- scopes_required(action) > list of permission scopes
- execute(request, context) > ToolResult (structured)

Registry stores: registered adapters/tools and their declared capabilities.
Core behavior adapts to capabilities (e.g., if channel doesn’t support threads, it creates a new message).
Core only knows the adapter interfaces and declared capabilities, never concrete connector types.
New connectors register themselves + provide adapters; routing and tasks remain stable.
Secrets boundary: get_secret(connector_id, key) returns an ephemeral token.

Tradeoff: strict interfaces slow early prototyping but prevent integration sprawl and make observability consistent.
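The strict Tool interface can be expressed as a structural Protocol; `EchoTool` and the registry dict below are toy stand-ins showing that the core depends only on the shape, not on concrete connector classes:

```python
from typing import Protocol

class Tool(Protocol):
    """Strict outbound interface; the core imports only this Protocol."""
    def capabilities(self) -> dict: ...
    def scopes_required(self, action: str) -> list: ...
    def execute(self, request: dict, context: dict) -> dict: ...

class EchoTool:
    """Toy tool (illustrative) satisfying the Protocol structurally,
    with no inheritance from core code."""
    def capabilities(self) -> dict:
        return {"actions": ["send"], "supports_threads": False}
    def scopes_required(self, action: str) -> list:
        return [f"echo.{action}"]
    def execute(self, request: dict, context: dict) -> dict:
        return {"ok": True, "echo": request.get("text")}

# Registry: tool name -> instance plus its declared capabilities.
REGISTRY: dict = {"echo": EchoTool()}
```

Structural typing means new connectors plug in without the core importing them, which is exactly the "routing and tasks remain stable" property above.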
Requirements: cron/interval/one-shot schedules, timezone awareness, quiet hours, and an explicit catch-up policy after downtime.

Design: each schedule stores:
- schedule_id, type (cron/interval/one-shot), next_run_at, last_run_at
- payload (what event to emit)
- timezone, quiet_hours_policy_id
- catch_up_policy (skip, run_once, run_all_with_cap)

Scheduler loop:
On each due schedule, emit a MessageEvent with source.channel="scheduler" and advance next_run_at.

A watcher is a specialized proactive component that produces events (examples: mailbox polling, device-state changes, vision streams).
Watcher interface:
- watcher_id, enabled
- tick(now) > emits 0..N MessageEvents

Rules are stored as:
- rule_id, enabled, priority
- conditions: match on event fields + time windows + state predicates
- actions: emit event(s) or start task template(s)
- debounce_ms, dedupe_window, escalation_policy
- audit_policy (always log evaluation outcome)

Rules evaluation: on each MessageEvent, evaluate relevant rules (indexed by source/channel/type); matches emit a new MessageEvent (source=rule_engine, parent_event_id set).

Tradeoff: rules+watchers can explode in complexity; constrain with simple primitives and strong observability.
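A minimal evaluation sketch, assuming exact-match conditions on event fields (real conditions would add time windows, state predicates, and debouncing):

```python
from dataclasses import dataclass

@dataclass
class Rule:
    rule_id: str
    conditions: dict          # exact-match predicates on event fields
    priority: int = 0
    enabled: bool = True

def evaluate(rules: list, event: dict) -> list:
    """Return enabled rules whose every condition matches the event,
    highest priority first."""
    ordered = sorted(rules, key=lambda r: -r.priority)
    return [r for r in ordered
            if r.enabled and all(event.get(k) == v
                                 for k, v in r.conditions.items())]
```

Keeping conditions declarative (data, not code) is what makes "always log evaluation outcome" cheap: the matched predicates themselves can go straight into the audit payload.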
Persist current autonomy level and history in system state.
Each tool action has:
- risk_level (low/medium/high/critical)

Gate decision record includes: the action, its risk_level, the autonomy level at decision time, and the outcome (allowed / approval required / denied).
For risky tools:
- preview(request) returning a structured description of the would-be effects, with no side effects.
Tradeoff: dry-run requires tool-specific work; start with the highest-risk tools first (email/send, delete, device control).
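The gate decision can be sketched as follows, under an assumed mapping from autonomy level to the maximum risk it may auto-approve (the level names and the mapping are illustrative, not a fixed policy):

```python
# Illustrative: each autonomy level auto-approves up to a risk ceiling;
# None means every effectful action needs explicit approval.
AUTONOMY_CEILING = {
    "observe": None,
    "suggest": None,
    "act_low": "low",
    "act_medium": "medium",
}
RISK_ORDER = ["low", "medium", "high", "critical"]

def gate(risk_level: str, autonomy_level: str) -> str:
    """Decide 'allow' vs 'needs_approval' for one tool action."""
    ceiling = AUTONOMY_CEILING.get(autonomy_level)
    if ceiling is None:
        return "needs_approval"
    if RISK_ORDER.index(risk_level) <= RISK_ORDER.index(ceiling):
        return "allow"
    return "needs_approval"
```

Because the decision is a pure function of (risk_level, autonomy_level), the same inputs can be written into the gate decision record and replayed when auditing why an action was or wasn't allowed.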
Memory writes go through a MemoryPolicy gate that decides whether a candidate write becomes a durable fact, stays in history, or is dropped.
Tradeoff: separating facts from history prevents accidental “model hallucination memory” and supports privacy/deletion.
You maintain projections (query-optimized views) derived from the append-only audit/event log.
Health model: each component reports status (healthy/degraded/down).

Operator controls: API endpoints to pause/resume/cancel tasks, enable/disable integrations, adjust autonomy, and approve or deny gated actions.
Every control action generates an audit event with actor=operator.
Tradeoff: projections add complexity, but they’re the only way to deliver dashboard-first ops without “grep logs”.
On restart: reload pending tasks from the durable store, resume from the last checkpoint, and reconcile in-flight effectful calls via their idempotency records (idempotency_key > tool_call_id > outcome).

If an integration is down: mark it degraded, back off retries, and surface affected steps explicitly rather than silently dropping them.
Tradeoff: robust idempotency requires careful design, but it’s mandatory to avoid duplicate emails, calendar events, or device toggles.
Even single-user, enforce least-privilege scopes per tool (e.g., email.send, calendar.write, ha.switch.control).

Goal: run heavy jobs without blocking the main loop.
Design now (minimal):
- job_id, type, status, resource_limits, trace_id, task_id
- spawn(job) > starts isolated process/container
- report_progress(job_id, percent, message, artifacts)
- checkpoint(job_id, state)
- cancel(job_id)

Isolation options: separate OS processes now; containers later.

Artifacts: workers write outputs to the artifact store and reference them by id.
Tradeoff: starting with processes is simpler; containers later for stronger isolation.
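A process-isolation sketch using `subprocess`, with progress reported as JSON lines on stdout (resource limits, checkpointing, and cancel are elided; the child script here is a toy job):

```python
import json
import subprocess
import sys

def spawn(job_id: str, payload: dict) -> subprocess.Popen:
    """Start the job in an isolated OS process; the child reports progress
    as JSON lines on stdout, which the manager turns into
    report_progress() calls in a real system."""
    child_src = (
        "import json, sys\n"
        "job = json.load(sys.stdin)\n"
        "for pct in (50, 100):\n"
        "    print(json.dumps({'job_id': job['job_id'], 'percent': pct}), flush=True)\n"
    )
    proc = subprocess.Popen(
        [sys.executable, "-c", child_src],
        stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True,
    )
    proc.stdin.write(json.dumps({"job_id": job_id, **payload}))
    proc.stdin.close()
    return proc

proc = spawn("job-1", {"type": "transcode"})
updates = [json.loads(line) for line in proc.stdout]
proc.wait()
```

A crash in the child cannot take down the control plane, and the same stdin/stdout protocol carries over unchanged when the runtime is later swapped for a container.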
Treat vision as just another inbound adapter:
Frames/detections are emitted as MessageEvents with source.channel="vision". Rules/watchers can subscribe to vision events with the same debouncing and quiet-hours policies.
jarvis_os/
  core/
    models/          # pydantic schemas: MessageEvent, ToolCall, Task, AuditEvent
    pipeline/        # normalize, route, execute
    routing/         # deterministic intents, rule matching, llm fallback
    tasks/           # task engine, step runner, persistence adapters
    safety/          # autonomy, risk model, gating, approvals
    memory/          # memory policy, stores, retention
    observability/   # audit writer, projections, metrics, health
    state/           # queryable system state aggregator
  integrations/
    registry.py      # tool registry, capability discovery
    inbound/         # adapters: email, webhook, home assistant, etc.
    outbound/        # tools: messaging, calendar, ha control, etc.
  scheduler/
    scheduler.py
    watchers/        # watcher implementations
    rules/           # rules engine + rule definitions
  workers/
    manager.py       # job lifecycle
    runtimes/        # process runtime, (future) container runtime
  api/
    app.py           # FastAPI
    routes/          # tasks, events, audit, health, integrations, controls
  storage/
    db.py            # SQLAlchemy/SQLModel
    migrations/
    artifact_store.py
    secrets_store.py
  main.py            # boot: start loops

Control plane: the API app plus the core loops (pipeline, scheduler, rules) run in one process.
Workers: separate processes managed by the worker manager (container runtimes later).
Minimum endpoints:
- GET /health (status, heartbeat)
- GET /state (aggregated system state)
- GET /events (inbound events with filters)
- GET /audit (search by trace_id/task_id/tool/outcome/time)
- GET /tasks, GET /tasks/{id}, POST /tasks/{id}/cancel, /retry, /pause, /resume
- GET /schedules, POST /schedules, PATCH /schedules/{id}
- GET /watchers, PATCH /watchers/{id} enable/disable
- GET /rules, PATCH /rules/{id} enable/disable
- GET /integrations, PATCH /integrations/{id} enable/disable
- GET /approvals, POST /approvals/{id}/approve|deny
- POST /controls/autonomy (set level)

Implement a small intent DSL for common commands:
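A fast-path sketch, assuming a small table of regex patterns (the patterns and intent names are illustrative; a real DSL would load them from config):

```python
import re

# Illustrative fast-path patterns: first match wins.
INTENT_PATTERNS = [
    ("task.cancel",   re.compile(r"^cancel task (?P<task_id>\S+)$", re.I)),
    ("autonomy.set",  re.compile(r"^set autonomy to (?P<level>\w+)$", re.I)),
    ("schedule.list", re.compile(r"^(?:list|show) schedules$", re.I)),
]

def parse_intent(text: str):
    """Deterministic fast path: return a structured intent on a match,
    or None, signalling the router may consider the LLM fallback."""
    for name, pattern in INTENT_PATTERNS:
        m = pattern.match(text.strip())
        if m:
            return {"intent": name, "args": m.groupdict()}
    return None
```

Because the fast path is a pure function over text, it costs nothing per call, is fully testable, and keeps the LLM out of the loop for routine commands.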
Use regex/pattern matching over normalized text; no model call on the fast path.
LLM fallback used only if deterministic parsing and rule matching produce no confident route.
LLM called through a single “ModelTool” with the same audit, redaction, and capability discipline as any other tool.
LLM outputs treated as suggestions, not authority: they must pass validation and safety gates before any effectful action.
Every pipeline stage writes audit events.
Every tool call writes:
- ToolCallAttempted (request metadata redacted)
- ToolCallSucceeded/Failed (latency, error)

Maintain projections via consumers of the audit stream that update query-optimized views.
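Projection maintenance is a fold over the audit stream; a sketch with a per-tool call/failure view (the view shape and event field names are illustrative):

```python
def apply_to_projection(proj: dict, audit_event: dict) -> dict:
    """Fold one audit event into a query-optimized view (per-tool call and
    failure counts); rebuilding from the log just replays this fold."""
    tool = audit_event.get("tool_name", "unknown")
    row = proj.setdefault(tool, {"calls": 0, "failures": 0})
    if audit_event["type"] == "ToolCallAttempted":
        row["calls"] += 1
    elif audit_event["type"] == "ToolCallFailed":
        row["failures"] += 1
    return proj

proj: dict = {}
for ev in [
    {"type": "ToolCallAttempted", "tool_name": "email.send"},
    {"type": "ToolCallFailed", "tool_name": "email.send"},
]:
    apply_to_projection(proj, ev)
```

Because the fold is deterministic, a corrupted or schema-changed view can always be dropped and rebuilt by replaying the append-only log.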