1. The Problem
Manual expense tracking is tedious and often forgotten. Most people struggle to open a spreadsheet or an app every time they buy something small. When users try to automate this via voice notes, they face technical hurdles:
Audio Complexity: Telegram doesn't send raw audio directly; it sends a file_id. You first call getFile to resolve that file_id to a file_path, then download the file from a separate URL that includes the bot token.
Information Extraction: Standard bots can't "understand" context or perform math on multiple amounts mentioned in a single breath.
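The two download steps can be sketched in a few lines. This is a minimal illustration of the Bot API URL layout, not the workflow's actual nodes; the function names and the sample token and file_id are made up for the example.

```python
from urllib.parse import urlencode

API_ROOT = "https://api.telegram.org"


def get_file_url(token: str, file_id: str) -> str:
    """Step 1: URL of the getFile call that resolves a file_id to a file_path."""
    return f"{API_ROOT}/bot{token}/getFile?{urlencode({'file_id': file_id})}"


def download_url(token: str, get_file_response: dict) -> str:
    """Step 2: URL of the file itself, built from the getFile response."""
    if not get_file_response.get("ok"):
        raise ValueError("getFile did not succeed")
    file_path = get_file_response["result"]["file_path"]
    return f"{API_ROOT}/file/bot{token}/{file_path}"


# Hypothetical token and file_id, for illustration only.
print(get_file_url("123:ABC", "AwACAg"))
print(download_url("123:ABC", {"ok": True, "result": {"file_path": "voice/file_0.oga"}}))
```

In n8n the same two requests would typically be made by a Telegram node and an HTTP Request node; the sketch only shows what those requests look like.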
2. The Solution
This workflow creates a seamless bridge between Telegram's Bot API and Google Sheets, powered by Google Gemini 1.5 Flash.
Smart Downloading: It uses an HTTP Request node to download the .oga voice file from Telegram, authenticated with the bot token, and stores it in a binary "data" field.
AI Listening: Gemini acts as the "ears," processing the raw audio to identify amounts, dates, and categories.
Logic Engine: The AI is instructed to do the arithmetic itself, summing separate amounts mentioned in one voice note (e.g., 500 + 1500 = 2000).
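For readers who want to see what "sending the audio to Gemini" might look like outside n8n, here is a rough sketch of a request body with the voice file embedded as base64. The field names (inline_data, mime_type, response_mime_type) follow the public Gemini REST documentation as best understood here and should be checked against the current docs; the prompt wording and the function name are assumptions, not taken from the workflow.

```python
import base64
import json

# Hypothetical prompt; the workflow's real instructions are not shown in this description.
PROMPT = (
    "Listen to this voice note. Return JSON with the keys "
    "'amounts' (list of numbers), 'category' and 'date'."
)


def build_gemini_request(audio_bytes: bytes, prompt: str = PROMPT) -> dict:
    """Build a generateContent-style body with the audio inlined as base64."""
    return {
        "contents": [{
            "parts": [
                {"text": prompt},
                {"inline_data": {
                    "mime_type": "audio/ogg",  # Telegram voice notes are Ogg files
                    "data": base64.b64encode(audio_bytes).decode("ascii"),
                }},
            ]
        }],
        # Assumed setting that asks the model to reply with JSON only.
        "generationConfig": {"response_mime_type": "application/json"},
    }


body = build_gemini_request(b"OggS...placeholder bytes...")
print(json.dumps(body)[:120])
```

Inside n8n, the Gemini node builds this kind of request for you from the binary "data" field; the sketch is only meant to make the shape of the payload concrete.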
3. The Result
A fully hands-free accounting system where:
Zero Friction: You just speak to your bot: "I spent 500 on lunch and 1500 for the grocery."
Instant Logging: Within seconds, a new row appears in your Google Sheet showing a total of 2000, the date, and the category.
Accuracy: Cuts out manual data entry and the typos that come with it, and logs each expense as soon as you mention it.
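The lunch-and-grocery example can be traced from the model's reply to a spreadsheet row. The workflow asks the model to do the sum itself; the sketch below instead sums in code, which is one way to cross-check the model's total. The JSON keys, the column order, and the function name are assumptions made for illustration.

```python
import json
from datetime import date


def reply_to_row(reply_text: str, fallback_date: date) -> list:
    """Turn a JSON reply into a [date, category, total] row."""
    data = json.loads(reply_text)
    # Sum every amount mentioned in the note, e.g. 500 + 1500.
    total = sum(float(amount) for amount in data["amounts"])
    # Use today's date if the speaker did not mention one.
    day = data.get("date") or fallback_date.isoformat()
    category = data.get("category") or "Uncategorized"
    return [day, category, total]


# Hypothetical reply for "I spent 500 on lunch and 1500 for the grocery."
reply = '{"amounts": [500, 1500], "category": "Food", "date": null}'
print(reply_to_row(reply, date(2024, 1, 15)))  # ['2024-01-15', 'Food', 2000.0]
```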
Key Technical Highlight:
The workflow passes the voice note to Gemini as binary data, so the model receives the original audio file, and it requests structured JSON output so that each field lands in its own spreadsheet column.
mimi.n8n27@gmail.com