Backend Architecture & Production Roadmap
Production Architecture Plan
How this tool evolves from a static portfolio demo to a live log analysis platform.
| Layer | Component | Role |
|---|---|---|
| Collection | Filebeat / WinLogBeat / rsyslog | Lightweight agents installed on endpoints. Ship logs via TCP/TLS to the analysis backend. Zero parsing at source; ship raw, analyze centrally. |
| Ingestion | FastAPI endpoint + Redis queue | Receives log streams, batches them into 500-line chunks, and enqueues each chunk for analysis (see the sketch after this table). Redis absorbs burst traffic and decouples ingestion from analysis calls. |
| Analysis | Claude API (claude-sonnet-4-6) | Processes each chunk with system_prompt.txt. Runs synthesis across batches every 60s with synthesis_prompt.txt. Results stored in PostgreSQL. |
| Delivery | WebSocket / Server-Sent Events | Pushes new events and updated AI analysis to the dashboard in real time. Frontend subscribes on load; same UI, different data source. |
| Storage | PostgreSQL + optional Elasticsearch | PostgreSQL for structured event storage and querying. Elasticsearch optional for full-text log search at scale (>10M events/day). |
| Auth | JWT + API key | API key for log-shipping agents. JWT for dashboard users. Role-based access: read-only analyst vs admin. |
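To make the ingestion layer concrete, here is a minimal sketch in Python using FastAPI and redis-py. The /ingest route, the log_chunks queue name, and the newline-delimited body format are illustrative assumptions, not part of the existing repo.

```python
# Minimal ingestion sketch. Assumed: a local Redis instance, a
# hypothetical /ingest route, and newline-delimited raw log bodies.
import redis
from fastapi import FastAPI, Request

app = FastAPI()
queue = redis.Redis(host="localhost", port=6379)

CHUNK_SIZE = 500  # lines per analysis job, per the table above

@app.post("/ingest")
async def ingest(request: Request) -> dict:
    # Agents ship raw text; no parsing happens at this layer.
    lines = (await request.body()).decode("utf-8", errors="replace").splitlines()
    # Batch into 500-line chunks and enqueue each as one analysis job.
    for i in range(0, len(lines), CHUNK_SIZE):
        queue.rpush("log_chunks", "\n".join(lines[i:i + CHUNK_SIZE]))
    return {"accepted_lines": len(lines)}
```

Pushing to a plain Redis list keeps the early phases simple; Phase 4 below swaps it for a Redis Stream with consumer groups.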
Phase 1: Build a minimal Python FastAPI service that accepts log lines via POST, chunks them, calls the Claude API, and returns JSON matching the existing output schema. The frontend is unchanged; just swap fetchJSON('output/...') for a fetch to the API endpoint. Deploy behind a reverse proxy (nginx, Caddy).
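A sketch of what that Phase 1 endpoint could look like, using the official anthropic Python SDK. The /analyze route, the LogBatch request model, and the max_tokens value are assumptions; the system prompt file name comes from this repo.

```python
# Phase 1 sketch: POST log lines, chunk, analyze with Claude, return JSON.
# Assumed: the anthropic SDK with ANTHROPIC_API_KEY set in the environment;
# the /analyze route and LogBatch model are hypothetical.
import anthropic
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()
client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the env
SYSTEM_PROMPT = open("system_prompt.txt").read()
CHUNK_SIZE = 500

class LogBatch(BaseModel):
    lines: list[str]

@app.post("/analyze")
def analyze(batch: LogBatch) -> list[str]:
    results = []
    for i in range(0, len(batch.lines), CHUNK_SIZE):
        chunk = "\n".join(batch.lines[i:i + CHUNK_SIZE])
        message = client.messages.create(
            model="claude-sonnet-4-6",  # model ID from the table above
            max_tokens=4096,
            system=SYSTEM_PROMPT,
            messages=[{"role": "user", "content": chunk}],
        )
        # Each response is expected to match the existing output schema.
        results.append(message.content[0].text)
    return results
```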
Phase 2: Replace the static fetch with a WebSocket connection. The backend pushes new events and updated AI analysis as they arrive. The frontend already re-renders on new data; the render() function is stateless and handles partial datasets cleanly (as demonstrated by the simulation mode).
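On the backend side, FastAPI's built-in WebSocket support is enough for this push channel. A sketch, assuming a hypothetical next_analysis() coroutine that waits until the analysis workers publish a new result:

```python
# Phase 2 sketch: push new results to dashboard clients over a WebSocket.
# The /ws route and next_analysis() are assumptions for illustration.
import asyncio
from fastapi import FastAPI, WebSocket, WebSocketDisconnect

app = FastAPI()

async def next_analysis() -> dict:
    """Placeholder: block until the analysis workers publish a new result."""
    await asyncio.sleep(60)
    return {"events": [], "analysis": {}}

@app.websocket("/ws")
async def dashboard_feed(websocket: WebSocket):
    await websocket.accept()
    try:
        while True:
            # Each push triggers the frontend's stateless render().
            await websocket.send_json(await next_analysis())
    except WebSocketDisconnect:
        pass  # client closed the dashboard tab
```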
Phase 3: Deploy Filebeat on Linux hosts, WinLogBeat on Windows, or configure rsyslog forwarding. Point all agents at the FastAPI ingestion endpoint. This step is pure infrastructure; it requires no code changes to the analyzer or dashboard.
Phase 4: At high volume (>50k events/min), introduce Kafka or Redis Streams between ingestion and analysis so that multiple analysis workers can consume from the queue in parallel. PostgreSQL handles storage; add read replicas for dashboard queries under load.
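A sketch of one such worker using Redis Streams with a consumer group, replacing the plain Redis list from the ingestion sketch above. The stream and group names and the analyze_chunk() stub are assumptions:

```python
# Phase 4 sketch: parallel analysis workers sharing one Redis Stream via
# a consumer group. The stream/group names and analyze_chunk() are
# hypothetical; each worker process runs this loop independently.
import os
import redis

r = redis.Redis(host="localhost", port=6379)
STREAM, GROUP = "log_chunks", "analyzers"
CONSUMER = f"worker-{os.getpid()}"

def analyze_chunk(chunk: bytes) -> None:
    """Placeholder for the Claude API call in analyzer.py."""

try:
    r.xgroup_create(STREAM, GROUP, id="0", mkstream=True)
except redis.ResponseError:
    pass  # group already exists; another worker created it first

while True:
    # Block up to 5s for one new chunk assigned to this consumer.
    for _, messages in r.xreadgroup(GROUP, CONSUMER, {STREAM: ">"},
                                    count=1, block=5000):
        for msg_id, fields in messages:
            analyze_chunk(fields[b"chunk"])
            r.xack(STREAM, GROUP, msg_id)  # acknowledge only on success
```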
| Component | Spec | Est. Monthly Cost |
|---|---|---|
| VPS / Compute | 2 vCPU, 4GB RAM (DigitalOcean, Linode) | ~$20–40 |
| PostgreSQL | Managed DB, 10GB storage | ~$15–25 |
| Claude API | ~500k tokens/day analysis (typical SOC) | ~$20–60 |
| Bandwidth | Log ingestion + WebSocket delivery | ~$5–15 |
| Total | Small-to-medium deployment | ~$60–140/mo |
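The Claude API line holds up to a back-of-envelope check, assuming Sonnet-class pricing of roughly $3 per million input tokens and $15 per million output tokens (verify against current pricing before budgeting):

```python
# Rough check on the Claude API estimate. The $3/$15 per-million-token
# rates are assumed Sonnet-class pricing, not a quoted figure.
daily_tokens = 500_000
input_share, output_share = 0.95, 0.05   # mostly raw logs in, short JSON out
daily_cost = (daily_tokens * input_share / 1e6 * 3.00
              + daily_tokens * output_share / 1e6 * 15.00)
print(f"~${daily_cost * 30:.0f}/mo")     # ~$54/mo, near the top of the range
```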
This document outlines the production path for the AI Log Analyzer. The current demo uses pre-analyzed static JSON files served from Netlify; this is sufficient to demonstrate the core AI analysis capability without infrastructure overhead. The analyzer.py script and prompt files in this repo are the foundation of the production analysis engine.