Description
What It Does
This workflow lets your Telegram bot act as a real-time speech translator.
Users can send voice messages in any supported language, and the bot will:
- Convert the voice message to text using Whisper
- Detect the language automatically (no need for the user to specify it)
- Translate it into the target language
- Convert the translation back into speech
- Send the translated audio back as a voice message in Telegram
No typing, no menus: just voice in, translated voice out.
Who Is This For
This template is perfect for:
- Multilingual communities using Telegram
- Travelers needing quick real-time voice translation
- Language learners who want to compare pronunciation and meaning
- Educators engaging with non-native speakers
- Support teams communicating with global users
- Developers experimenting with AI-powered audio apps
Whether you're building a utility bot or enhancing user accessibility, this workflow is an ideal starting point.
How It Works (Step-by-Step Flow)
1. Telegram Trigger Node
Listens for incoming voice messages from users.
2. Download Audio File
Grabs the .ogg file from Telegram servers for further processing.
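
In n8n the Telegram node handles this download for you. As an illustration of what happens underneath, the sketch below builds the two Bot API URLs involved: `getFile` resolves a `file_id` to a `file_path`, and the file endpoint serves the bytes. The helper names are hypothetical and the token shown in the tests is a placeholder.

```python
from typing import Optional
from urllib.parse import urlencode

API_ROOT = "https://api.telegram.org"


def extract_voice_file_id(update: dict) -> Optional[str]:
    """Return the file_id of a voice message, or None if the update has no voice."""
    voice = update.get("message", {}).get("voice")
    return voice["file_id"] if voice else None


def get_file_url(token: str, file_id: str) -> str:
    """URL of the Bot API getFile method, which returns a File object with file_path."""
    return f"{API_ROOT}/bot{token}/getFile?{urlencode({'file_id': file_id})}"


def download_url(token: str, file_path: str) -> str:
    """URL that serves the file content for a file_path returned by getFile."""
    return f"{API_ROOT}/file/bot{token}/{file_path}"
```

Updates without a `voice` field (plain text, photos, and so on) yield `None`, which is a convenient place to stop the flow early.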
3. Convert to MP3 (optional)
(If Whisper or your ASR API requires MP3 format.)
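
If conversion is needed, one common approach is to shell out to ffmpeg. The sketch below assumes an `ffmpeg` binary with the `libmp3lame` encoder is available on the host (for example, on a self-hosted n8n instance); the function names and the quality setting are illustrative choices, not part of the template.

```python
import subprocess
from pathlib import Path
from typing import List


def build_ffmpeg_args(src: str, dst: str) -> List[str]:
    """Build an ffmpeg command line that re-encodes src as MP3 at dst."""
    return [
        "ffmpeg",
        "-y",                      # overwrite dst if it already exists
        "-i", src,                 # input file (e.g. the .ogg voice note)
        "-codec:a", "libmp3lame",  # MP3 encoder
        "-q:a", "4",               # VBR quality level (assumed; lower is better)
        dst,
    ]


def convert_to_mp3(src: str) -> str:
    """Run ffmpeg and return the path of the MP3 written next to src."""
    dst = str(Path(src).with_suffix(".mp3"))
    # check=True raises CalledProcessError if ffmpeg exits with an error.
    subprocess.run(build_ffmpeg_args(src, dst), check=True, capture_output=True)
    return dst
```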
4. Speech Recognition (OpenAI Whisper)
Transcribes the audio file to plain text, detecting the language automatically.
5. Language Detection (Optional Fallback)
If Whisper doesn't return a language code, a separate language detection service can step in.
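
The fallback can be sketched as a small function. It assumes the transcription result is a dict with `text` and, when available, `language` keys (as in a verbose JSON style response), and that `detect` is whatever fallback service you wire in; both the shape and the names are assumptions made for illustration.

```python
from typing import Callable, Optional


def resolve_language(
    transcription: dict,
    detect: Callable[[str], Optional[str]],
    default: str = "unknown",
) -> str:
    """Prefer the language reported by the transcription, else ask the fallback."""
    reported = (transcription.get("language") or "").strip().lower()
    if reported:
        return reported
    # No language in the transcription: hand the text to the fallback detector.
    detected = detect(transcription.get("text", ""))
    return detected.strip().lower() if detected else default
```

Normalizing to lowercase keeps later comparisons (for example, against the target language) simple, whether the services return codes or language names.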
6. Translation
Translates the transcribed text into your target language using OpenAI, DeepL, or Google Translate.
7. Text-to-Speech (TTS)
Converts translated text into spoken audio (MP3 or OGG), ready to be sent back.
8. Send Audio Back via Telegram
Replies to the user with the translated voice message and optionally the text version too.
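
Taken together, the steps above form a linear pipeline. The sketch below expresses that flow in Python with each service passed in as a callable, so Whisper, the translator, and the TTS engine can be swapped freely. All names here (`translate_voice`, `TranslationResult`, the callable signatures) are hypothetical; in n8n each callable corresponds to a node rather than a function.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class TranslationResult:
    source_language: str
    source_text: str
    translated_text: str
    audio: bytes  # synthesized speech, ready to send as a voice message


def translate_voice(
    audio: bytes,
    target_language: str,
    transcribe: Callable[[bytes], Tuple[str, str]],  # audio -> (text, language)
    translate: Callable[[str, str, str], str],       # (text, source, target) -> text
    synthesize: Callable[[str, str], bytes],         # (text, language) -> audio
) -> TranslationResult:
    """Run speech-to-text, translation, and text-to-speech in order."""
    text, source_language = transcribe(audio)
    if not text.strip():
        raise ValueError("transcription is empty; nothing to translate")
    if source_language == target_language:
        translated = text  # already in the target language, skip the translator
    else:
        translated = translate(text, source_language, target_language)
    speech = synthesize(translated, target_language)
    return TranslationResult(source_language, text, translated, speech)
```

Returning the transcript and translation alongside the audio makes it easy to send the optional text version in the same reply.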
Tech Stack Used
- Telegram Bot API (input/output)
- OpenAI Whisper (speech-to-text)
- OpenAI, DeepL, or Google Translate (translation)
- Google TTS / ElevenLabs / Bark (text-to-speech)
- n8n Cloud or self-hosted n8n
Customization Ideas
- Let users set the target language via an inline command
- Add a logging step to track usage and errors
- Return both the audio and the translated text for clarity
- Add a toggle for "formal/informal" translation tone
- Limit usage to verified users or specific Telegram groups
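
Two of these ideas, a target-language command and an access allow-list, can be sketched as follows. The `/lang` command name, the `SUPPORTED` set, and both helper functions are hypothetical; in n8n the same logic would typically live in a Code or IF node ahead of the transcription step.

```python
from typing import Optional, Set

# Hypothetical set of target languages the bot accepts.
SUPPORTED = {"en", "es", "fr", "de"}


def parse_lang_command(text: str) -> Optional[str]:
    """Parse '/lang <code>' (or '/lang@BotName <code>') into a language code."""
    parts = text.strip().split()
    if len(parts) != 2:
        return None
    command, code = parts
    # In group chats Telegram may append the bot's username to the command.
    command = command.split("@", 1)[0].lower()
    if command != "/lang":
        return None
    code = code.lower()
    return code if code in SUPPORTED else None


def is_allowed(update: dict, allowed_users: Set[int], allowed_chats: Set[int]) -> bool:
    """Allow the update if its sender or its chat is on an allow-list."""
    message = update.get("message", {})
    user_id = message.get("from", {}).get("id")
    chat_id = message.get("chat", {}).get("id")
    return user_id in allowed_users or chat_id in allowed_chats
```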