Features
🎭 YAML Persona Profiles - Define personality, expertise, tone, and constraints in a simple YAML file - no code changes needed to customize the AI's identity.
🔌 Multi-LLM Support - Works with OpenAI, Gemini, and any OpenAI-compatible API (DeepSeek, Grok, Ollama). Switch providers by changing environment variables.
✅ Response Evaluator - Optional second LLM pass that judges response quality against persona guidelines. Auto-reruns with feedback if the answer doesn't meet standards.
⚡ Streaming REST API - FastAPI backend that uses Server-Sent Events to stream responses token by token in real time.
🧠 Context-Aware Caching - Response caching with TF-IDF similarity matching, so a rephrased question can reuse an earlier answer. Knowledge answers are cached for 30 days, conversational responses for 24 hours.
🛡️ Rate Limiting & Auth - Configurable per-IP rate limiting and API key authentication to protect against abuse.
🔧 Tool Calling - Built-in tools to capture visitor contact info and log unanswered questions - turns conversations into leads.
📊 Conversation Logging - PostgreSQL database tracks all sessions and messages for analytics and improvement.
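As a rough illustration of the Context-Aware Caching idea, the sketch below matches an incoming question against cached questions using TF-IDF vectors and cosine similarity, and expires entries by type. It is not the project's actual implementation: the `SimilarityCache` class, the `kind` labels, the smoothed IDF formula, and the `0.8` threshold are all assumptions chosen for the example. Only the 30-day and 24-hour lifetimes come from the feature description.

```python
import math
import re
import time
from collections import Counter

# Lifetimes from the feature list; the "knowledge"/"conversational" keys are hypothetical labels.
TTL_SECONDS = {"knowledge": 30 * 24 * 3600, "conversational": 24 * 3600}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


class SimilarityCache:
    """Hypothetical in-memory cache keyed by question similarity."""

    def __init__(self, threshold=0.8):  # threshold is an assumed value
        self.threshold = threshold
        self.entries = []  # list of (question_tokens, answer, expires_at)

    def put(self, question, answer, kind, now=None):
        now = time.time() if now is None else now
        self.entries.append((tokenize(question), answer, now + TTL_SECONDS[kind]))

    def get(self, question, now=None):
        now = time.time() if now is None else now
        # Drop expired entries before matching.
        self.entries = [e for e in self.entries if e[2] > now]
        if not self.entries:
            return None

        query = tokenize(question)
        docs = [e[0] for e in self.entries] + [query]
        n_docs = len(docs)
        # Document frequency: in how many documents each term appears.
        df = Counter(term for doc in docs for term in set(doc))
        # Smoothed IDF (an assumed variant) so that no weight is zero or undefined.
        idf = {t: math.log((1 + n_docs) / (1 + c)) + 1 for t, c in df.items()}

        def vectorize(tokens):
            return {t: count * idf[t] for t, count in Counter(tokens).items()}

        query_vec = vectorize(query)
        best_score, best_answer = 0.0, None
        for tokens, answer, _ in self.entries:
            score = cosine(query_vec, vectorize(tokens))
            if score > best_score:
                best_score, best_answer = score, answer
        return best_answer if best_score >= self.threshold else None
```

A caller would check `cache.get(question)` before invoking the LLM and call `cache.put(question, answer, kind)` afterwards. A production version would more likely precompute vectors with an existing TF-IDF library than recompute them on every lookup.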
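The Response Evaluator's generate, judge, and retry flow can be sketched as a small loop. This is a minimal illustration under stated assumptions, not the project's code: `answer_with_evaluation`, `Verdict`, the `max_retries` default, and the stub functions are hypothetical names, and the two callables stand in for whatever LLM calls the project actually makes.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Verdict:
    """Hypothetical evaluator result: pass/fail plus feedback for the next attempt."""
    acceptable: bool
    feedback: str = ""


def answer_with_evaluation(
    question: str,
    generate: Callable[[str, Optional[str]], str],
    evaluate: Callable[[str, str], Verdict],
    max_retries: int = 2,  # assumed default, not taken from the project
) -> str:
    """Generate a reply, then regenerate with evaluator feedback until it passes."""
    reply = generate(question, None)
    for _ in range(max_retries):
        verdict = evaluate(question, reply)
        if verdict.acceptable:
            break
        # Feed the evaluator's criticism back into the next generation attempt.
        reply = generate(question, verdict.feedback)
    # If the retries run out, the last attempt is returned without another evaluation.
    return reply


# Stub "LLMs" so the sketch runs without any API access.
def fake_generate(question: str, feedback: Optional[str]) -> str:
    return "draft" if feedback is None else f"revised ({feedback})"


def fake_evaluate(question: str, reply: str) -> Verdict:
    if reply.startswith("revised"):
        return Verdict(acceptable=True)
    return Verdict(acceptable=False, feedback="be more concise")


if __name__ == "__main__":
    print(answer_with_evaluation("Who are you?", fake_generate, fake_evaluate))
```

Passing the generator and evaluator in as callables keeps the loop independent of any particular provider, which fits the Multi-LLM Support feature, and capping the retries bounds the extra latency and cost of the second pass.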