Description
🧠 What It Does
This workflow empowers your Telegram bot to act as a real-time speech translator.
Users can send voice messages in any supported language, and the bot will:
🔊 Convert the voice message to text using Whisper
🌍 Detect the language automatically (no need for the user to specify)
🌐 Translate it into another language with high accuracy
🗣️ Convert the translation back into speech
📩 Send the translated audio back as a voice message in Telegram
No typing, no menus — just seamless voice translation.
💡 Who Is This For
This template is perfect for:
🌐 Multilingual communities using Telegram
🧳 Travelers needing quick real-time voice translation
🎓 Language learners who want to compare pronunciation and meaning
🧑🏫 Educators engaging with non-native speakers
🛟 Support teams communicating with global users
🧪 Developers experimenting with AI-powered audio apps
Whether you're building a utility bot or enhancing user accessibility, this workflow is an ideal starting point.
⚙️ How It Works (Step-by-Step Flow)
1️⃣ Telegram Trigger Node
Listens for incoming voice messages from users.
2️⃣ Download Audio File
Grabs the .ogg file from Telegram servers for further processing.
3️⃣ Convert to MP3 (optional)
(If Whisper or your ASR API requires MP3 format.)
4️⃣ Speech Recognition (OpenAI Whisper)
Transcribes the audio file to plain text, detecting the language automatically.
5️⃣ Language Detection (Optional Fallback)
If Whisper doesn’t return a language code, a separate language detection service can step in.
6️⃣ Translation
Translates the transcribed text using OpenAI, DeepL, or Google Translate to your target language.
7️⃣ Text-to-Speech (TTS)
Converts translated text into spoken audio (MP3 or OGG), ready to be sent back.
8️⃣ Send Audio Back via Telegram
Replies to the user with the translated voice message and optionally the text version too.
🛠️ Tech Stack Used
🟣 Telegram Bot API (for input/output)
🟢 OpenAI Whisper (speech-to-text)
🔵 OpenAI or Google Translate (for translation)
🟠 Google TTS / ElevenLabs / Bark (for text-to-speech)
⚫ n8n Cloud or Self-hosted
🔄 Customization Ideas
🧑🔧 Allow user to set target language via inline command
📋 Add a log step to track usage or errors
🗣️ Return both audio + translated text for clarity
🎛️ Add a toggle for “formal/informal” translation tone
🎯 Limit usage to verified users or certain Telegram groups