Let any text speak - with open-source

Open-source text-to-speech powered by Qwen3-TTS

**→ The app is already online. You can try it at anyspeak.ai and check out the repo at anyspeak-ai · GitHub.**
✨ Features

  • TTS: Multiple voices and Qwen3-TTS models (1.7B CustomVoice, VoiceDesign, Base). Supported languages: Chinese, English, French, German, Italian, Japanese, Korean, Portuguese, Russian, Spanish.

  • Speaker tags: Use [SPEAKER_NAME] in text for multi-speaker output.

  • Mood tags: Use [SPEAKER:mood] (e.g. [VIVIAN:happy]) for emotional expressiveness.

  • Timing tags: Use [SPEAKER:mood:timing] or the + TIMING button to control pacing between segments, e.g. [RYAN:+1.5] for a 1.5 s pause after, or [VIVIAN:happy:-0.3] for a shorter gap or even overlapping speakers.

  • Custom voices: Design a voice by selecting attributes (gender, language, old/young, slow/fast, high/low, loud/soft, warm/rough).

  • CLEAN: Analyze and clean text for better TTS output.

  • CHECK: Quality check with Whisper (compare original vs. synthesized).

  • IMPROVE: Post-process generated audio chunk by chunk: selectively regenerate mispronounced segments, trim silence, add pauses, or edit the text and regenerate. Compare original vs. improved before applying. You can also edit finished MP3s afterward (e.g. different wording or a different speaker at a specific place) without regenerating everything.

  • VIDEO: Create video from audio and images (with optional subtitles).

  • Load & Save (MP3): Projects are loaded and saved as MP3 files. Each file stores the original text with tags in its metadata, so you can continue working where you left off.

  • Import: Load text from URL or file.

  • Run locally: easy installation.
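
To make the speaker, mood, and timing tags above more concrete, here is a minimal Python sketch of how such a script could be split into segments. It is not the app's actual parser: the `Segment` class, `parse_script` function, and both regular expressions are hypothetical, and the rule "a field that looks like a signed number is timing, anything else is mood" is an assumption inferred from the examples [RYAN:+1.5] and [VIVIAN:happy:-0.3].

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Assumed tag shapes: [NAME], [NAME:mood], [NAME:timing], [NAME:mood:timing].
TAG_RE = re.compile(r"\[([A-Za-z_][A-Za-z0-9_]*)((?::[^\[\]:]+){0,2})\]")
# Assumption: timing is always written with an explicit sign, e.g. +1.5 or -0.3.
TIMING_RE = re.compile(r"^[+-]\d+(?:\.\d+)?$")


@dataclass
class Segment:
    speaker: str
    mood: Optional[str]  # None when the tag carries no mood
    timing: float        # seconds; 0.0 when the tag carries no timing
    text: str


def parse_script(script: str) -> List[Segment]:
    """Split tagged text into segments; text before the first tag is ignored."""
    segments: List[Segment] = []
    matches = list(TAG_RE.finditer(script))
    for i, match in enumerate(matches):
        # A segment's text runs from the end of its tag to the next tag.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(script)
        mood, timing = None, 0.0
        for field in match.group(2).split(":"):
            if not field:
                continue
            if TIMING_RE.match(field):
                timing = float(field)
            else:
                mood = field
        text = script[match.end():end].strip()
        segments.append(Segment(match.group(1), mood, timing, text))
    return segments


if __name__ == "__main__":
    demo = "[VIVIAN:happy] Hello there! [RYAN:+1.5] Hi. [VIVIAN:happy:-0.3] Nice to meet you."
    for segment in parse_script(demo):
        print(segment)
```

Each resulting segment could then be passed to the TTS model with its speaker and mood, while the timing value decides how much silence (or overlap) to place between neighboring audio chunks.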
