May 30, 2025
Automate Invoice Data Extraction from Google Drive to Google Sheets Using Dumpling AI and n8n
Invoices pile up quickly, and manually entering data into spreadsheets is a tedious, error-prone task no business owner enjoys. What if every invoice dropped into a Google Drive folder was automatically read, processed, and logged into Google Sheets?
That’s exactly what this automation does.
Using n8n, Google Drive, Dumpling AI, and Google Sheets, this workflow extracts structured invoice data, including line items, and saves them neatly into a spreadsheet, without a single manual click.
🧠 Why Automate Invoice Processing?
Manual invoice entry is:
- Time-consuming
- Prone to human error
- Not scalable as your business grows
This automation helps you:
- Save hours every week
- Eliminate transcription mistakes
- Keep financial records up to date in real time
- Have consistent formatting across all invoices
Let’s walk through how this works.
How the Automation Works (Step-by-Step)
Step 1: Watch for New Invoices in Google Drive
Node: Google Drive Trigger
This node is configured to trigger the workflow any time a new file is created in a specific Google Drive folder.
Configuration:
- Trigger on: fileCreated
- Folder to Watch: Choose or paste your shared invoice folder ID
- Polling: every minute (or as often as needed)
Purpose: Automates the detection process so no manual checks are needed. As soon as an invoice is uploaded, the workflow starts.


Step 2: Download the File
Node: Download Invoice File
This step uses the file ID captured from the trigger to download the binary data of the uploaded invoice.
Configuration:
- Operation: Download
- File ID: ={{ $json.id }} (from previous node)
Purpose: The file is downloaded so it can be prepared for AI analysis.


Step 3: Convert File to Base64 Format
Node: Convert Invoice File to Base64
This transformation is needed because Dumpling AI requires files to be uploaded in base64 format when using their document parser.
Configuration:
- Operation: binaryToProperty
Output: JSON object with base64 encoded invoice content.


Step 4: Send the File to Dumpling AI for Extraction
Node: Send file to Dumpling AI for Data Extraction
This HTTP node connects with Dumpling AI’s /extract-document endpoint and sends the base64-encoded invoice file for parsing.
Payload Example:
{
“inputMethod”: “base64”,
“files”: [“{{ $json.data }}”],
“prompt”: “Extract the order number, document date, PO number, sold to name and address, ship to name and address, list of items with model, quantity, unit price, and total price, and the final total amount including tax.”,
“jsonMode”: “true”
}
Key Features:
- Custom prompt ensures extraction is accurate and structured
- Enables multi-field parsing, including nested item lists
Tip: Tweak the prompt to match your specific invoice structure or add fields like invoice number, tax ID, etc.


Step 5: Parse the JSON Response from Dumpling AI
Node: Parse Dumpling AI JSON Response
Dumpling AI returns the result as a single string inside the results field. This node uses JavaScript to parse the raw string into usable JSON data.
Code Used:
const raw = $input.first().json.results;
const parsed = JSON.parse(raw);
return [{ json: parsed }];
Output: Structured JSON with fields like order_number, document_date, and an array of items.


Step 6: Split the Line Items
Node: Split line Items from Invoice
The parsed invoice typically contains multiple product entries. This node splits the items[] array so that each item can be processed and saved as a separate row.
Configuration:
- Field to Split: items
Purpose: Supports row-by-row logging of invoice items.


Step 7: Append Data to Google Sheets
Node: Save Data to Google Sheet
Each item, along with shared invoice-level data, is now ready to be written into your Google Sheet.
Columns Mapped:
- Order number
- Document date
- PO number
- Sold to name and address
- Ship to name and address
- Model
- Description
- Quantity
- Unit price
- Total price
This ensures every line item is recorded in a structured and searchable format.
Notes:
- Use dynamic expressions like ={{ $(‘Previous Node’).item.json.field }}
- Pre-create matching headers in your Google Sheet for best results


Final Workflow Sequence
- Detect new invoice file in Drive
- Download the file
- Convert to base64
- Send to Dumpling AI for structured extraction
- Parse the response
- Split each item
- Save to Google Sheets
This runs completely unattended once deployed.
Bonus Tips
- Validation: Add logic to flag missing fields like PO number or tax
- Backup: Use a second Google Sheet to store raw JSON logs
- Notifications: Send a Slack or Gmail message once an invoice is processed
- Error Handling: Add try/catch blocks and fallback paths for corrupted PDFs or missing files
📌 Requirements Checklist
- ✅ Dumpling AI API Key
- ✅ Google Drive folder ID
- ✅ Google Sheet with headers set
- ✅ n8n running with Gmail and Google Drive credentials connected
🧾 Conclusion
This invoice automation workflow takes a painful, manual process and turns it into a seamless system. From detecting new uploads to parsing complex invoice data and storing it cleanly in Google Sheets, everything is handled instantly.
For businesses handling dozens or hundreds of invoices per month, this setup is a no-brainer. It scales easily, saves time, and ensures you never miss a record.