Creating Custom TTS Voice Agents
Introduction
This guide will walk you through creating custom text-to-speech (TTS) voice agents in OpenClaw. By the end of this article, you'll be able to configure voice agents that use different TTS providers, customize voice characteristics, and implement advanced features like voice switching and expressive cues.
Understanding TTS Providers
OpenClaw supports three main TTS providers:
- ElevenLabs - High-quality neural voices with voice cloning capabilities
- OpenAI - Professional-grade TTS with natural-sounding voices
- Edge TTS - Microsoft's free neural TTS service (no API key required)
Each provider has its strengths:
- ElevenLabs excels at emotional expression and voice variety
- OpenAI offers the most natural conversational flow
- Edge TTS is completely free and doesn't require API keys
Basic Configuration
To enable TTS in OpenClaw, you need to configure it in your openclaw.json file. Here's a minimal configuration:
{
  "messages": {
    "tts": {
      "auto": "always",
      "provider": "elevenlabs"
    }
  }
}
This configuration enables TTS for all replies and uses ElevenLabs as the primary provider.
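As a quick sanity check before restarting OpenClaw, the messages.tts block can be read back with a few lines of Python. This is a minimal sketch, not part of OpenClaw: the helper name read_tts_settings is hypothetical, the provider names are the three listed in this guide, and it assumes the file is strict JSON with no comments.

```python
import json

# Provider names as listed in this guide.
KNOWN_PROVIDERS = {"elevenlabs", "openai", "edge"}

def read_tts_settings(config_text: str) -> dict:
    """Parse openclaw.json text and return its messages.tts block."""
    config = json.loads(config_text)
    tts = config.get("messages", {}).get("tts", {})
    provider = tts.get("provider")
    if provider not in KNOWN_PROVIDERS:
        raise ValueError(f"unknown TTS provider: {provider!r}")
    return tts

# The minimal configuration from above, inlined as a string for the demo.
minimal = '{"messages": {"tts": {"auto": "always", "provider": "elevenlabs"}}}'
settings = read_tts_settings(minimal)
print(settings["provider"], settings["auto"])
```

In practice you would pass the contents of your own openclaw.json instead of the inline string.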
Provider-Specific Settings
ElevenLabs Configuration
{
  "messages": {
    "tts": {
      "provider": "elevenlabs",
      "elevenlabs": {
        "voiceId": "pMsXgVXv3BLzUgSXRplE",
        "modelId": "eleven_multilingual_v2",
        "voiceSettings": {
          "stability": 0.5,
          "similarityBoost": 0.75,
          "style": 0.0,
          "useSpeakerBoost": true,
          "speed": 1.0
        }
      }
    }
  }
}
Key parameters:
- voiceId: Unique identifier for the voice (find in ElevenLabs dashboard)
- modelId: TTS model to use (v2 or v3)
- stability: 0-1, lower values add more variation
- similarityBoost: 0-1, higher values improve consistency
- speed: 0.5-2.0, playback speed multiplier
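Out-of-range values are easy to introduce when tuning a voice by hand. The sketch below checks a voiceSettings object against the ranges listed above. The function name check_voice_settings is hypothetical and this is not OpenClaw's own validation. The style and useSpeakerBoost keys are skipped because this guide does not state their limits.

```python
# Ranges as stated in the key parameters above.
RANGES = {
    "stability": (0.0, 1.0),
    "similarityBoost": (0.0, 1.0),
    "speed": (0.5, 2.0),
}

def check_voice_settings(voice_settings: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means no issues found."""
    problems = []
    for key, (low, high) in RANGES.items():
        if key not in voice_settings:
            continue  # missing keys are left to the provider's defaults
        value = voice_settings[key]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            problems.append(f"{key} must be a number")
        elif not low <= value <= high:
            problems.append(f"{key}={value} is outside {low}-{high}")
    return problems

print(check_voice_settings({"stability": 0.5, "similarityBoost": 0.75, "speed": 3.0}))
```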
OpenAI Configuration
{
  "messages": {
    "tts": {
      "provider": "openai",
      "openai": {
        "model": "gpt-4o-mini-tts",
        "voice": "alloy"
      }
    }
  }
}
Available voices: alloy, echo, fable, onyx, nova, shimmer
Edge TTS Configuration
{
  "messages": {
    "tts": {
      "provider": "edge",
      "edge": {
        "voice": "en-US-MichelleNeural",
        "lang": "en-US",
        "outputFormat": "audio-24khz-48kbitrate-mono-mp3",
        "rate": "+10%",
        "pitch": "-5%"
      }
    }
  }
}
Popular voices include:
- en-US-JennyNeural
- en-US-MichelleNeural
- en-US-GuyNeural
- en-GB-SarahNeural
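The rate and pitch values in the Edge TTS example are signed percentages written as strings, and a missing sign or quote is an easy mistake to make. A small check along these lines can catch that before the config is loaded. It is an illustrative sketch only: is_signed_percent is a hypothetical helper, and whether OpenClaw accepts other forms for these fields is not covered in this guide.

```python
import re

# Signed percentage such as "+10%" or "-5%", the form used in the example above.
SIGNED_PERCENT = re.compile(r"[+-]\d+%")

def is_signed_percent(value: str) -> bool:
    """True if value looks like '+10%' or '-5%' with nothing before or after."""
    return SIGNED_PERCENT.fullmatch(value) is not None

for candidate in ["+10%", "-5%", "10%", "-5"]:
    print(candidate, is_signed_percent(candidate))
```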
Advanced Features
Voice Switching
You can switch voices dynamically within a single response using directives:
Hello there!
[[tts:provider=elevenlabs voiceId=pMsXgVXv3BLzUgSXRplE model=eleven_v3]]
This text will be spoken with a different voice.
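To make the directive syntax concrete, here is a rough sketch of how the key=value pairs inside a [[tts:...]] directive could be pulled out of a reply. It illustrates the format shown above and is not OpenClaw's actual parser. The function name extract_directives and the regular expression are assumptions made for this example.

```python
import re

# A directive is [[tts: ...]] containing one or more key=value pairs.
# [[tts:text]] blocks contain no "=" and are left alone by this pattern.
DIRECTIVE = re.compile(r"\[\[tts:((?:\s*[\w-]+=[^\s\]]+)+)\s*\]\]")

def extract_directives(reply: str) -> tuple[dict[str, str], str]:
    """Return (overrides, reply with the directives removed)."""
    overrides: dict[str, str] = {}
    for match in DIRECTIVE.finditer(reply):
        for pair in match.group(1).split():
            key, value = pair.split("=", 1)
            overrides[key] = value
    cleaned = DIRECTIVE.sub("", reply)
    return overrides, cleaned

reply = (
    "Hello there!\n"
    "[[tts:provider=elevenlabs voiceId=pMsXgVXv3BLzUgSXRplE model=eleven_v3]]\n"
    "This text will be spoken with a different voice."
)
overrides, cleaned = extract_directives(reply)
print(overrides)
```

If a reply contains several directives, later values overwrite earlier ones in this sketch. How OpenClaw itself resolves repeated directives is not specified here.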
Expressive Cues
Add emotional context to your voice output:
[[tts:text]](laughs) That's hilarious![[/tts:text]]
[[tts:text]](sings) ♫ La la la ♫[[/tts:text]]
These cues add appropriate prosody to the spoken output.
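When checking a reply by hand, it can help to list which parts are wrapped for speech and which cues they carry. The sketch below extracts [[tts:text]] blocks and any parenthesized cues such as (laughs). The names spoken_blocks and find_cues are hypothetical, and the assumption that a cue is a single lowercase word in parentheses comes only from the two examples above.

```python
import re

# Everything between [[tts:text]] and [[/tts:text]], across line breaks if needed.
TEXT_BLOCK = re.compile(r"\[\[tts:text\]\](.*?)\[\[/tts:text\]\]", re.DOTALL)
# A cue is assumed to be one lowercase word in parentheses, e.g. (laughs).
CUE = re.compile(r"\(([a-z]+)\)")

def spoken_blocks(reply: str) -> list[str]:
    """Return the contents of every [[tts:text]] block in the reply."""
    return [block.strip() for block in TEXT_BLOCK.findall(reply)]

def find_cues(block: str) -> list[str]:
    """Return the cue words found in one block."""
    return CUE.findall(block)

reply = (
    "[[tts:text]](laughs) That's hilarious![[/tts:text]]\n"
    "[[tts:text]](sings) ♫ La la la ♫[[/tts:text]]"
)
for block in spoken_blocks(reply):
    print(find_cues(block), block)
```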
Best Practices
- Fallback Providers: Always configure at least two providers for redundancy
{
  "messages": {
    "tts": {
      "provider": "elevenlabs"
    }
  }
}
OpenClaw automatically falls back to OpenAI if ElevenLabs fails, and to Edge TTS if both fail.
- Length Management: For long responses, enable summarization
{
  "messages": {
    "tts": {
      "maxTextLength": 4000
    }
  }
}
- Performance: Use Edge TTS for simple notifications when quality isn't critical
- Cost Control: Monitor your API usage, especially with ElevenLabs and OpenAI
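The fallback order described under Fallback Providers (ElevenLabs, then OpenAI, then Edge TTS) follows a common try-in-order pattern, which can be sketched in Python as below. This is a conceptual illustration, not OpenClaw's implementation. The function synthesize_with_fallback and the stand-in backends are hypothetical, and real provider clients would replace them.

```python
from typing import Callable

# Fallback order as described in this guide.
FALLBACK_ORDER = ("elevenlabs", "openai", "edge")

def synthesize_with_fallback(
    text: str,
    backends: dict[str, Callable[[str], bytes]],
    order: tuple[str, ...] = FALLBACK_ORDER,
) -> tuple[str, bytes]:
    """Try each provider in order; return (provider, audio) from the first success."""
    errors = []
    for name in order:
        backend = backends.get(name)
        if backend is None:
            continue  # provider not configured, skip it
        try:
            return name, backend(text)
        except Exception as exc:  # a real client would catch narrower errors
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all TTS providers failed: " + "; ".join(errors))

# Stand-in backends for demonstration only.
def failing_backend(text: str) -> bytes:
    raise ConnectionError("quota exceeded")

def fake_edge(text: str) -> bytes:
    return text.encode("utf-8")

provider, audio = synthesize_with_fallback(
    "Test message",
    {"elevenlabs": failing_backend, "openai": failing_backend, "edge": fake_edge},
)
print(provider, len(audio))
```

Collecting the error from each failed provider, rather than only the last one, makes it easier to see why the chain ended where it did.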
Troubleshooting
Common Issues
- No Audio Output: Check that TTS is enabled and your API keys are valid
- Voice Not Changing: Verify the voice ID is correct and the provider is properly configured
- Poor Quality: Adjust stability and similarity settings for ElevenLabs voices
Debug Commands
Use these slash commands to debug TTS issues:
/tts status
/tts provider elevenlabs
/tts audio Test message
Conclusion
Custom TTS voice agents add a powerful dimension to your OpenClaw setup. By understanding the different providers and their configuration options, you can create agents with distinct personalities and appropriate vocal characteristics for different use cases.
Remember to always test your configurations and monitor performance and costs, especially when using commercial providers like ElevenLabs and OpenAI.
For more information, refer to the official OpenClaw TTS documentation.