Max Solomyanov

Features

  • 🎭 YAML Persona Profiles - Define personality, expertise, tone, and constraints in a simple YAML file - no code changes needed to customize the AI's identity.

  • 🔌 Multi-LLM Support - Works with OpenAI, Gemini, and any OpenAI-compatible API (DeepSeek, Grok, Ollama). Switch providers by changing environment variables.

  • ✅ Response Evaluator - Optional second LLM pass that judges response quality against persona guidelines. Auto-reruns with feedback if the answer doesn't meet standards.

  • ⚡ Streaming REST API - FastAPI backend with Server-Sent Events for real-time, token-by-token responses that feel natural and responsive.

  • 🧠 Context-Aware Caching - Caches responses and matches incoming questions against cached ones by TF-IDF similarity. Knowledge answers are cached for 30 days, conversational responses for 24 hours.

  • 🛡️ Rate Limiting & Auth - Configurable per-IP rate limiting and API key authentication to protect against abuse.

  • 🔧 Tool Calling - Built-in tools to capture visitor contact info and log unanswered questions - turns conversations into leads.

  • 📊 Conversation Logging - PostgreSQL database tracks all sessions and messages for analytics and improvement.
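
As an illustration of the caching feature above, here is a minimal sketch of a similarity cache with per-category expiry. It is not the project's actual implementation: the class name `SimilarityCache`, the category keys `"knowledge"` and `"conversational"`, and the 0.8 similarity threshold are all assumptions made for the example. Only the 30-day and 24-hour lifetimes come from the feature list. TF-IDF is computed by hand with the standard library so the sketch runs without extra dependencies.

```python
import math
import re
import time
from collections import Counter

# Lifetimes taken from the feature list; the category names are hypothetical.
TTL_SECONDS = {"knowledge": 30 * 24 * 3600, "conversational": 24 * 3600}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


class SimilarityCache:
    """Hypothetical cache that returns a stored answer when a new question
    is close enough (TF-IDF cosine similarity) to a cached question."""

    def __init__(self, threshold=0.8):
        self.threshold = threshold  # assumed cut-off, not from the project
        self.entries = []  # list of (tokens, answer, expires_at)

    def put(self, question, answer, kind, now=None):
        now = time.time() if now is None else now
        self.entries.append((tokenize(question), answer, now + TTL_SECONDS[kind]))

    def get(self, question, now=None):
        now = time.time() if now is None else now
        # Drop expired entries before matching.
        self.entries = [e for e in self.entries if e[2] > now]
        if not self.entries:
            return None
        query = tokenize(question)
        docs = [e[0] for e in self.entries] + [query]
        # Smoothed inverse document frequency over cached questions plus the query.
        df = Counter(term for doc in docs for term in set(doc))
        idf = {t: math.log((1 + len(docs)) / (1 + n)) + 1 for t, n in df.items()}
        qvec = self._vector(query, idf)
        best_score, best_answer = 0.0, None
        for tokens, answer, _ in self.entries:
            score = self._cosine(qvec, self._vector(tokens, idf))
            if score > best_score:
                best_score, best_answer = score, answer
        return best_answer if best_score >= self.threshold else None

    @staticmethod
    def _vector(tokens, idf):
        # Term frequency weighted by inverse document frequency.
        return {t: c * idf[t] for t, c in Counter(tokens).items()}

    @staticmethod
    def _cosine(a, b):
        dot = sum(w * b.get(t, 0.0) for t, w in a.items())
        norm_a = math.sqrt(sum(w * w for w in a.values()))
        norm_b = math.sqrt(sum(w * w for w in b.values()))
        return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0
```

In this sketch a request handler would call `get()` first and only fall through to the LLM on a miss, then store the fresh answer with `put()` under whichever category applies. A real deployment would likely tune the threshold and keep entries in a persistent store rather than an in-memory list.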

Technology Stack

Python, FastAPI, OpenAI API
