Production Architecture Plan

How this tool evolves from a static portfolio demo to a live log analysis platform.

Current State vs Production

Portfolio Demo (Now)

  • Log files analyzed locally via analyzer.py
  • Output JSON committed to repo
  • Frontend reads static files from Netlify CDN
  • No backend; fully serverless
  • Claude API called once, results frozen
  • New analysis requires manual re-run + commit

Live Production

  • Agents forward logs from endpoints in real time
  • FastAPI backend ingests, chunks, and queues events
  • Claude API called continuously on new batches
  • WebSocket pushes updates to the frontend instantly
  • Frontend UI unchanged; only the data source swaps
  • Multiple simultaneous log streams supported

Production Architecture

Log Sources (Linux syslog, Windows Events, network gear, app logs)
  → Collection (Filebeat, WinLogBeat, Fluentd, syslog UDP/TCP)
  → Analysis Engine (FastAPI, Claude API, chunking + synthesis, PostgreSQL)
  → Dashboard (this UI, WebSocket feed, real-time updates, no page refresh)

Component Details

Layer      | Component                           | Role
Collection | Filebeat / WinLogBeat / rsyslog     | Lightweight agents installed on endpoints. Ship logs via TCP/TLS to the analysis backend. Zero parsing at source: ship raw, analyze centrally.
Ingestion  | FastAPI endpoint + Redis queue      | Receives log streams, batches into 500-line chunks, enqueues for analysis. Redis handles burst traffic and decouples ingestion from API calls.
Analysis   | Claude API (claude-sonnet-4-6)      | Processes each chunk with system_prompt.txt. Runs synthesis across batches every 60s with synthesis_prompt.txt. Results stored in PostgreSQL.
Delivery   | WebSocket / Server-Sent Events      | Pushes new events and updated AI analysis to the dashboard in real time. Frontend subscribes on load; same UI, different data source.
Storage    | PostgreSQL + optional Elasticsearch | PostgreSQL for structured event storage and querying. Elasticsearch optional for full-text log search at scale (>10M events/day).
Auth       | JWT + API key                       | API key for log-shipping agents. JWT for dashboard users. Role-based access: read-only analyst vs admin.
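
The analysis cadence above (per-chunk analysis plus a synthesis pass every 60 seconds) can be sketched with a plain asyncio loop. This is a sketch, not the repo's implementation: `AnalysisEngine`, `analyze_chunk`, and the in-memory lists are placeholder names standing in for the Claude API calls driven by system_prompt.txt and synthesis_prompt.txt, and for PostgreSQL writes.

```python
import asyncio

SYNTHESIS_INTERVAL = 60  # seconds between synthesis passes, per the table above

class AnalysisEngine:
    """Accumulates per-chunk results and periodically synthesizes them."""

    def __init__(self):
        self.pending = []    # chunk-level results since the last synthesis
        self.syntheses = []  # synthesis outputs (would be written to PostgreSQL)

    async def analyze_chunk(self, chunk):
        # Placeholder for a Claude API call using system_prompt.txt.
        result = {"lines": len(chunk)}
        self.pending.append(result)
        return result

    async def synthesis_loop(self, interval=SYNTHESIS_INTERVAL):
        """Background task: synthesize accumulated batches on a fixed cadence."""
        while True:
            await asyncio.sleep(interval)
            if self.pending:
                # Placeholder for a Claude API call using synthesis_prompt.txt.
                self.syntheses.append({"batches": len(self.pending)})
                self.pending.clear()
```

In production the synthesis loop runs as a long-lived task alongside the FastAPI app; the interval is the only tuning knob this sketch exposes.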

Upgrade Path: Demo to Production

1

Add FastAPI Backend

Build a minimal Python FastAPI service that accepts log lines via POST, chunks them, calls the Claude API, and returns JSON matching the existing output schema. The frontend is unchanged: just swap fetchJSON('output/...') for a fetch to the API endpoint. Deploy behind a reverse proxy (nginx, Caddy).
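
The ingestion side reduces to a small chunking step. A minimal sketch, assuming the 500-line chunk size from the component table; the chunker is plain Python, and the FastAPI route it would feed is indicated only in comments (`/ingest`, `LogBatch`, and `queue` are illustrative names, not part of this repo).

```python
CHUNK_SIZE = 500  # lines per chunk, matching the component table above

def chunk_lines(lines, size=CHUNK_SIZE):
    """Split raw log lines into fixed-size chunks for analysis."""
    return [lines[i:i + size] for i in range(0, len(lines), size)]

# Inside the FastAPI service (sketch only, illustrative names):
#
# @app.post("/ingest")
# async def ingest(payload: LogBatch):
#     for chunk in chunk_lines(payload.lines):
#         queue.enqueue(chunk)   # Redis queue from the component table
#     return {"queued": len(payload.lines)}
```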

2

Add Real-Time Delivery

Replace the static fetch with a WebSocket connection. Backend pushes new events and updated AI analysis as they arrive. Frontend already re-renders on new data: the render() function is stateless and handles partial datasets cleanly (as demonstrated by the simulation mode).
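
The push model can be sketched with stdlib asyncio alone: each connected dashboard holds a queue, and the backend fans new events out to every subscriber. A real deployment would wrap this in FastAPI's WebSocket support or the websockets package; `Broadcaster` and the event shape here are illustrative.

```python
import asyncio

class Broadcaster:
    """Fan new events out to every subscribed dashboard connection."""

    def __init__(self):
        self.subscribers = set()

    def subscribe(self):
        """Register a connection; returns the queue it reads events from."""
        q = asyncio.Queue()
        self.subscribers.add(q)
        return q

    def unsubscribe(self, q):
        self.subscribers.discard(q)

    async def publish(self, event):
        """Deliver one event to all current subscribers."""
        for q in self.subscribers:
            await q.put(event)

async def demo():
    hub = Broadcaster()
    a, b = hub.subscribe(), hub.subscribe()
    await hub.publish({"event": "auth_failure", "host": "web-01"})
    return await a.get(), await b.get()
```

Each WebSocket handler would loop over its queue and forward events down its socket; disconnects call unsubscribe so dead queues do not accumulate.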

3

Install Log Forwarders

Deploy Filebeat on Linux hosts, WinLogBeat on Windows, or configure rsyslog forwarding. Point all agents at the FastAPI ingestion endpoint. This step is infrastructure only; no code changes to the analyzer or dashboard.
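
For the rsyslog route, forwarding is a one-line drop-in config; the hostname and port below are illustrative values, not part of this repo.

```
# /etc/rsyslog.d/50-forward.conf  (illustrative host and port)
# @@ = forward over TCP; a single @ would use UDP
*.* @@logs.example.internal:6514
```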

4

Scale Horizontally

At high volume (>50k events/min), introduce Kafka or Redis Streams between ingestion and analysis. Multiple analysis workers consume from the queue in parallel. PostgreSQL remains the system of record; add read replicas for dashboard queries under load.
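
The worker fan-out can be sketched with asyncio. In production the shared asyncio.Queue would be Kafka or Redis Streams, and the body of `worker` is a stand-in for the per-chunk Claude call; all names here are illustrative.

```python
import asyncio

async def worker(name, queue, results):
    """Consume chunks from the shared queue until a shutdown sentinel arrives."""
    while True:
        chunk = await queue.get()
        if chunk is None:  # sentinel: shut this worker down
            queue.task_done()
            return
        results.append((name, len(chunk)))  # stand-in for the Claude API call
        queue.task_done()

async def run_workers(chunks, n_workers=4):
    """Distribute chunks across n_workers parallel consumers."""
    queue, results = asyncio.Queue(), []
    for c in chunks:
        queue.put_nowait(c)
    for _ in range(n_workers):
        queue.put_nowait(None)  # one sentinel per worker
    await asyncio.gather(*(worker(f"w{i}", queue, results)
                           for i in range(n_workers)))
    return results
```

The same shape carries over to a Redis Streams consumer group: each worker claims pending entries independently, so throughput scales by adding workers rather than changing the pipeline.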

Estimated Operating Cost

Component     | Spec                                     | Est. Monthly Cost
VPS / Compute | 2 vCPU, 4GB RAM (DigitalOcean, Linode)   | ~$20–40
PostgreSQL    | Managed DB, 10GB storage                 | ~$15–25
Claude API    | ~500k tokens/day analysis (typical SOC)  | ~$20–60
Bandwidth     | Log ingestion + WebSocket delivery       | ~$5–15
Total         | Small-to-medium deployment               | ~$60–140/mo
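
The Claude API line can be sanity-checked with back-of-envelope arithmetic, assuming Sonnet-class pricing of roughly $3 per million input tokens and $15 per million output tokens and an input-heavy 95/5 split (log analysis sends far more tokens than it receives); check current pricing before budgeting.

```python
TOKENS_PER_DAY = 500_000       # from the table above (typical SOC volume)
DAYS = 30
INPUT_PRICE = 3 / 1_000_000    # $/token, assumed Sonnet-class input rate
OUTPUT_PRICE = 15 / 1_000_000  # $/token, assumed output rate
INPUT_SHARE = 0.95             # assumed: raw logs dominate the token count

monthly_tokens = TOKENS_PER_DAY * DAYS
cost = monthly_tokens * (INPUT_SHARE * INPUT_PRICE
                         + (1 - INPUT_SHARE) * OUTPUT_PRICE)
print(f"~${cost:.0f}/month")   # → ~$54/month, inside the table's $20–60 band
```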

This document outlines the production path for the AI Log Analyzer. The current demo uses pre-analyzed static JSON files served from Netlify, which is sufficient to demonstrate the core AI analysis capability without infrastructure overhead. The analyzer.py script and prompt files in this repo are the foundation of the production analysis engine.