Features
YAML Persona Profiles - Define background, skills, tone, and constraints in a simple YAML file. No code changes are needed to customize the AI's identity.
Multi-LLM Support - Works with OpenAI, Gemini, and any OpenAI-compatible API (DeepSeek, Grok, Ollama). Switch providers by changing environment variables.
Response Evaluator - Optional second LLM pass that judges response quality against the persona guidelines. If an answer falls short, the request is rerun automatically with the evaluator's feedback.
Streaming REST API - FastAPI backend that streams responses token by token over Server-Sent Events.
Context-Aware Caching - Caches responses and matches new questions against cached ones by TF-IDF similarity. Knowledge answers are cached for 30 days, conversational responses for 24 hours.
Rate Limiting & Auth - Configurable per-IP rate limiting and API key authentication to protect against abuse.
Tool Calling - Built-in tools capture visitor contact info and log unanswered questions for later follow-up.
Conversation Logging - PostgreSQL database tracks all sessions and messages for analytics and improvement.
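
To make the caching feature concrete, here is a minimal sketch of a TF-IDF similarity lookup with per-type expiry. It is not the project's actual code: the `SimilarityCache` class, the `tokenize` and `cosine` helpers, the in-memory list, and the 0.8 threshold are all assumptions made for illustration. Only the two TTLs (30 days and 24 hours) come from the feature list.

```python
import math
import time
from collections import Counter

# TTLs come from the feature list; everything else here is illustrative.
TTL_SECONDS = {"knowledge": 30 * 24 * 3600, "conversational": 24 * 3600}


def tokenize(text):
    """Lowercase the text and split on anything that is not a letter or digit."""
    cleaned = "".join(ch.lower() if ch.isalnum() else " " for ch in text)
    return cleaned.split()


def cosine(a, b):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class SimilarityCache:
    """Hypothetical in-memory cache keyed by question similarity."""

    def __init__(self, threshold=0.8):
        self.threshold = threshold  # assumed cutoff, not taken from the project
        self.entries = []  # list of (question_tokens, answer, expires_at)

    def put(self, question, answer, kind, now=None):
        now = time.time() if now is None else now
        self.entries.append((tokenize(question), answer, now + TTL_SECONDS[kind]))

    def get(self, question, now=None):
        now = time.time() if now is None else now
        # Drop expired entries before matching.
        self.entries = [e for e in self.entries if e[2] > now]
        if not self.entries:
            return None

        query = tokenize(question)
        docs = [e[0] for e in self.entries] + [query]
        n_docs = len(docs)
        # Document frequency: in how many documents each term appears.
        df = Counter(term for doc in docs for term in set(doc))
        # Smoothed inverse document frequency.
        idf = {t: math.log((1 + n_docs) / (1 + c)) + 1 for t, c in df.items()}

        def vectorize(tokens):
            return {t: count * idf[t] for t, count in Counter(tokens).items()}

        query_vec = vectorize(query)
        best_score, best_answer = 0.0, None
        for tokens, answer, _ in self.entries:
            score = cosine(query_vec, vectorize(tokens))
            if score > best_score:
                best_score, best_answer = score, answer
        return best_answer if best_score >= self.threshold else None
```

A call such as `cache.put("What languages do you know?", "Python and Go.", "knowledge")` followed by `cache.get("what languages do you know")` would return the stored answer, while an unrelated question or an expired entry yields `None`. A real deployment would likely persist entries and tune the threshold against real traffic.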