This is a working demo that runs open-source models from Hugging Face directly in your browser, coordinated by our orchestration system.

🤖 Open-Source Models: DialoGPT and Zephyr-7B, run with local inference

🎯 Smart Routing: Automatically selects the best-suited model for the chosen routing strategy

🔄 Multi-Engine: WebGPU acceleration with WASM fallback

📊 Live Analytics: Real performance metrics and model status

🆓 Completely Free: No API keys and no costs; everything runs locally
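
To make the Smart Routing item more concrete, here is a minimal sketch of strategy-based model selection. It is not the demo's actual router: the strategy names (`fastest`, `quality`, `balanced`), the `ModelProfile` shape, and every latency and quality number are assumptions invented for illustration. Only the model names come from the feature list.

```typescript
// Hypothetical routing strategies; the demo's real strategy names are not documented here.
type Strategy = "fastest" | "quality" | "balanced";

// Hypothetical per-model profile. All numbers used below are made up, not measured.
interface ModelProfile {
  name: string;
  avgLatencyMs: number; // assumed typical response time
  qualityScore: number; // assumed rating between 0 and 1
  loaded: boolean;      // whether the model is ready to serve requests
}

const MODELS: ModelProfile[] = [
  { name: "DialoGPT", avgLatencyMs: 800, qualityScore: 0.5, loaded: true },
  { name: "Zephyr-7B", avgLatencyMs: 6000, qualityScore: 0.9, loaded: true },
];

// Score a model for a strategy. Speed is normalized so the slowest candidate gets 0.
function scoreModel(m: ModelProfile, strategy: Strategy, maxLatencyMs: number): number {
  const speed = 1 - m.avgLatencyMs / maxLatencyMs;
  if (strategy === "fastest") return speed;
  if (strategy === "quality") return m.qualityScore;
  return 0.5 * speed + 0.5 * m.qualityScore; // "balanced"
}

// Pick the highest-scoring loaded model, or null when nothing is loaded.
function selectModel(models: ModelProfile[], strategy: Strategy): ModelProfile | null {
  const candidates = models.filter((m) => m.loaded);
  if (candidates.length === 0) return null;
  const maxLatencyMs = Math.max(...candidates.map((m) => m.avgLatencyMs));
  return candidates.reduce((best, m) =>
    scoreModel(m, strategy, maxLatencyMs) > scoreModel(best, strategy, maxLatencyMs) ? m : best
  );
}

// Example usage with the illustrative profiles above.
const fastestPick = selectModel(MODELS, "fastest");
const qualityPick = selectModel(MODELS, "quality");
```

With these made-up profiles, `fastest` favors the smaller model and `quality` favors the larger one. A real router would presumably feed measured latencies into the same kind of scoring.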

Ask me anything! SmolLM3-3B will generate a response in about 3-5 seconds.

💡 Live System: Running real AI inference on CPU. Press Enter to send messages (Shift+Enter for new line).
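
The CPU path mentioned here is presumably the WASM side of the "WebGPU acceleration with WASM fallback" design listed among the features. A minimal sketch of how such a fallback might be chosen follows. The helper names `detectSupport` and `chooseEngine` are hypothetical and not taken from the demo's code. The checks rely only on the standard `navigator.gpu` and `WebAssembly` globals.

```typescript
type Engine = "webgpu" | "wasm";

interface EngineSupport {
  webgpu: boolean;
  wasm: boolean;
}

// Hypothetical helper: inspect a global scope for the standard WebGPU and
// WebAssembly entry points. The scope is a parameter so the logic can be
// exercised outside a browser.
function detectSupport(scope: any = globalThis): EngineSupport {
  return {
    // navigator.gpu is only defined in browsers that expose WebGPU.
    webgpu: Boolean(scope.navigator && scope.navigator.gpu),
    // WebAssembly is a global object wherever WASM is supported.
    wasm: typeof scope.WebAssembly === "object",
  };
}

// Hypothetical helper: prefer GPU acceleration, fall back to WASM on the CPU,
// and return null when neither engine is available.
function chooseEngine(support: EngineSupport): Engine | null {
  if (support.webgpu) return "webgpu";
  if (support.wasm) return "wasm";
  return null;
}

// Example usage: decide which engine to initialize in the current environment.
const engine = chooseEngine(detectSupport());
```

The presence of `navigator.gpu` does not guarantee a usable device, since `navigator.gpu.requestAdapter()` can still resolve to `null`. A fuller version would probably confirm an adapter before committing to WebGPU.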