Introduction
Open Avatar Chat is a modular, interactive digital-human dialogue implementation whose full functionality can run on a single PC. It supports either cloud-based APIs for ASR + LLM + TTS or a local multimodal language model.
Requirements
- Python version >=3.11.7, <3.12
- CUDA-enabled GPU
- The digital-human component can run inference on either GPU or CPU. On the test machine (Intel i9-13980HX CPU), CPU inference reaches up to 30 FPS.
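The Python version constraint above (>=3.11.7, <3.12) can be verified before installation. The sketch below is a minimal, hypothetical helper (not part of the project) that checks a version tuple against that documented range using Python's tuple comparison:

```python
import sys

def satisfies(v, lo=(3, 11, 7), hi=(3, 12, 0)):
    """Return True if version tuple v is within [lo, hi).

    Mirrors the documented constraint: >=3.11.7, <3.12.
    Hypothetical helper for illustration only.
    """
    return lo <= v < hi

if __name__ == "__main__":
    # Check the running interpreter against the required range.
    print(satisfies(sys.version_info[:3]))
```

Running the script prints `True` only when the current interpreter falls in the supported range; CUDA availability still needs to be checked separately (e.g. via your deep-learning framework).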
TIP
Using cloud APIs for ASR + LLM + TTS greatly reduces the hardware requirements; see the Bailian API configuration.
Component Dependencies
| Type | Open Source Project | GitHub | Model |
|---|---|---|---|
| RTC | HumanAIGC-Engineering/gradio-webrtc | GitHub | |
| WebUI | HumanAIGC-Engineering/OpenAvatarChat-WebUI | GitHub | |
| VAD | snakers4/silero-vad | GitHub | |
| Avatar | HumanAIGC/lite-avatar | GitHub | |
| TTS | FunAudioLLM/CosyVoice | GitHub | |
| Avatar | aigc3d/LAM_Audio2Expression | GitHub | HuggingFace |
| Avatar | facebook/wav2vec2-base-960h | | HuggingFace / ModelScope |
| Avatar | TMElyralab/MuseTalk | GitHub | |
| Avatar | Soul-AILab/SoulX-FlashHead | GitHub | HuggingFace |