FlashHead Quick Start

Both the LLM and TTS use cloud APIs. The avatar runs FlashHead in Lite mode for diffusion-based talking-head generation, which requires a GPU.
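Because the avatar handler needs a GPU, it can help to check for one before installing anything. The snippet below is a minimal sketch that assumes an NVIDIA GPU and that `nvidia-smi` is on `PATH`; the function name `has_nvidia_gpu` is a hypothetical helper and not part of the project.

```bash
#!/usr/bin/env bash

# Hypothetical helper: succeeds only if nvidia-smi exists and lists at least one GPU.
# Assumption: an NVIDIA GPU with the driver tools installed.
has_nvidia_gpu() {
  command -v nvidia-smi >/dev/null 2>&1 || return 1
  nvidia-smi -L 2>/dev/null | grep -q '^GPU '
}

if has_nvidia_gpu; then
  echo "NVIDIA GPU detected."
else
  echo "No NVIDIA GPU detected; the FlashHead avatar may not run on this machine." >&2
fi
```

The check only reports whether a GPU is visible to the driver. It does not verify the CUDA version or the available memory.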

Handlers Used

| Type | Handler | Reference |
| --- | --- | --- |
| Client | client/rtc_client/client_handler_rtc | RTC Client |
| VAD | vad/silerovad/vad_handler/silero | |
| ASR | asr/sensevoice/asr_handler_sensevoice | |
| LLM | llm/openai_compatible/llm_handler/llm_handler_openai_compatible | OpenAI Compatible |
| TTS | tts/bailian_tts/tts_handler_cosyvoice_bailian | Bailian CosyVoice |
| Avatar | avatar/flashhead/avatar_handler_flashhead | FlashHead |

Quick Start

```bash
uv run install.py --config config/chat_with_openai_compatible_bailian_cosyvoice_flashhead.yaml
uv run scripts/download_models.py --handler flashhead
uv run src/demo.py --config config/chat_with_openai_compatible_bailian_cosyvoice_flashhead.yaml
```

NOTE

FlashHead depends on flash-attn. First-time compilation may take a while.
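
The three quick-start commands can also be wrapped in a small script so that the config path is written once and a failed step stops the run. This is a sketch, not part of the project: the `run` and `quick_start` helpers and the `DRY_RUN` variable are hypothetical names introduced here. By default the script only prints the commands; set `DRY_RUN=0` to execute them.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Config file used by both the install step and the demo step.
CONFIG="config/chat_with_openai_compatible_bailian_cosyvoice_flashhead.yaml"

# Hypothetical helper: print each command, and execute it only when DRY_RUN=0.
run() {
  echo "+ $*"
  if [ "${DRY_RUN:-1}" = "0" ]; then
    "$@"
  fi
}

# The three quick-start steps, in order: install, download models, start the demo.
quick_start() {
  run uv run install.py --config "$CONFIG"
  run uv run scripts/download_models.py --handler flashhead
  run uv run src/demo.py --config "$CONFIG"
}

quick_start
```

Saved as, for example, `quick_start.sh` in the repository root, it would be run with `DRY_RUN=0 bash quick_start.sh`. With `set -e`, a failure in the install step (such as the flash-attn build) stops the script before the model download.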