May 27, 2025
Automatically Crawl Sites with Dumpling AI, Summarize with GPT-4o, and Send Beautiful Newsletters via Gmail
Introduction
Imagine turning your favorite industry blogs, competitor updates, or curated websites into a polished HTML newsletter, fully automated, styled, and summarized, all delivered to your inbox or subscribers.
This n8n workflow does exactly that.
With this system, you can:
- Pull website URLs from Google Sheets
- Crawl those pages using Dumpling AI
- Clean, extract, and format article content
- Summarize and repackage the info using GPT-4o into a newsletter
- Send a fully formatted email through Gmail, with no writing or copy-pasting
Perfect for:
- Weekly internal team updates
- Curated content for subscribers
- Competitive intelligence digests
Let’s break it down, step by step.
Step 1: Start the Workflow Manually
Node: Start Workflow Manually
This lets you trigger the automation by hand, ideal for testing or running on demand. You can later swap this out with a scheduler or webhook if needed.

Step 2: Fetch Website URLs from Google Sheets
Node: Get Website URLs from Google Sheet
This node reads a list of URLs from a Google Sheet where you’ve curated your sources.
What to configure:
- Google Sheets OAuth2 credentials
- Document ID and Sheet Name
Expected Output:
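A list of URLs under a column named websites. The crawl request in Step 3 reads this column as $json.websites, so the column name and the expression must match.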
Pro Tip:
Use this Google Sheet as your source control panel. You can update links, remove low-quality sources, or prioritize domains by row order.
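If the sheet is edited by hand, it can help to tidy the rows before crawling. The snippet below is a minimal sketch, not part of the original workflow: `cleanSheetUrls` is a hypothetical helper, and it assumes each row is an object with a `websites` column as described above.

```javascript
// Hypothetical helper: normalize and de-duplicate URLs read from the sheet.
// Assumes each row looks like { websites: "https://..." }.
function cleanSheetUrls(rows) {
  const seen = new Set();
  const cleaned = [];
  for (const row of rows) {
    const raw = String(row.websites ?? "").trim();
    if (raw === "") continue; // skip empty cells
    let parsed;
    try {
      parsed = new URL(raw); // throws on malformed input
    } catch {
      continue; // skip values that are not absolute URLs
    }
    // Only keep web pages; ignore mailto:, ftp:, and similar schemes.
    if (parsed.protocol !== "http:" && parsed.protocol !== "https:") continue;
    if (seen.has(parsed.href)) continue; // keep the first occurrence only
    seen.add(parsed.href);
    cleaned.push({ websites: parsed.href });
  }
  return cleaned;
}

const sheetRows = [
  { websites: " https://example.com/blog " },
  { websites: "not a url" },
  { websites: "https://example.com/blog" },
];
console.log(cleanSheetUrls(sheetRows));
```

Row order is preserved, so prioritizing domains by their position in the sheet still works.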


Step 3: Crawl Pages Using Dumpling AI
Node: Crawl and Extract Site Content with Dumpling AI
This step uses Dumpling AI’s crawl API to extract readable content from each website.
Request Payload:
{
  "url": "{{ $json.websites }}",
  "limit": "5",
  "depth": "2",
  "format": "text"
}
What it returns:
An array of article-like entries, each including:
- metadata.title
- metadata.original_url
- content (cleaned text)
Why this is powerful:
Dumpling AI handles complex scraping logic, content cleaning, and structure parsing, so you don’t have to.
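For readers who want to see the request outside n8n, here is a rough sketch of how the payload above could be assembled. The endpoint URL and the Bearer-token header are assumptions, not details from this article, so check them against Dumpling AI's API documentation. `buildCrawlRequest` is a hypothetical helper name.

```javascript
// ASSUMPTION: endpoint and auth scheme are not stated in the article.
const CRAWL_ENDPOINT = "https://app.dumplingai.com/api/v1/crawl";

// Hypothetical helper: builds the HTTP request without sending it.
function buildCrawlRequest(siteUrl, apiKey, { limit = 5, depth = 2 } = {}) {
  return {
    url: CRAWL_ENDPOINT,
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`, // assumed auth scheme
    },
    // Mirrors the payload shown above, which sends limit and depth as strings.
    body: JSON.stringify({
      url: siteUrl,
      limit: String(limit),
      depth: String(depth),
      format: "text",
    }),
  };
}

const crawlRequest = buildCrawlRequest("https://example.com", "YOUR_API_KEY");
console.log(crawlRequest.body);
// To send it, pass the pieces to fetch:
// fetch(crawlRequest.url, { method: crawlRequest.method, headers: crawlRequest.headers, body: crawlRequest.body })
```

Keeping the builder separate from the network call makes it easy to inspect the payload before spending any API credits.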


Step 4: Split Results into Individual Articles
Node: Split Extracted Results into Individual Items
The Dumpling AI response from the previous step is an array. This step separates each entry so it can be processed one by one.
Field to split: results
Why this matters:
Working with individual items is cleaner and lets us reformat them easily in the next steps.
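In n8n this is handled by the split node itself, but the idea can be sketched in a few lines of JavaScript. The example assumes the crawl response has a `results` array, as named above, and wraps each entry in the `{ json: ... }` shape that n8n uses for items. `splitResults` is a hypothetical name.

```javascript
// Hypothetical helper: turn one crawl response into one item per article.
function splitResults(response) {
  // Fall back to an empty list if the response has no results array.
  const results = Array.isArray(response?.results) ? response.results : [];
  return results.map((entry) => ({ json: entry }));
}

const sampleResponse = {
  results: [
    { metadata: { title: "How to Build with AI" }, content: "Long form content…" },
    { metadata: { title: "Why Automation Matters" }, content: "More content…" },
  ],
};
console.log(splitResults(sampleResponse).length); // 2
```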

Step 5: Extract and Map Article Fields
Node: Map Title, URL, and Page Text
This Set node captures only the fields we care about:
- Title of the article
- Page text
- Original URL
This trims excess metadata and gets everything ready for formatting.
Mapped Fields:
- metadata.title
- metadata.original_url
- content
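The same mapping can be expressed as a small function. This is a sketch only: the output field names `title`, `url`, and `content` are assumptions chosen for illustration, and `mapArticleFields` is a hypothetical helper rather than the Set node's actual configuration.

```javascript
// Hypothetical helper: keep only the three fields the newsletter needs.
function mapArticleFields(item) {
  const entry = item.json ?? {};
  return {
    json: {
      title: entry.metadata?.title ?? "",       // from metadata.title
      url: entry.metadata?.original_url ?? "",  // from metadata.original_url
      content: entry.content ?? "",             // cleaned page text
    },
  };
}

const crawled = {
  json: {
    metadata: { title: "How to Build with AI", original_url: "https://example.com/ai", lang: "en" },
    content: "Long form content…",
  },
};
console.log(mapArticleFields(crawled));
```

Missing fields default to empty strings so a single incomplete page does not break the later formatting step.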


Step 6: Combine All Articles Into a Single Prompt
Node: Combine Articles into Single Prompt Format
A simple JavaScript Code node formats all items into a single long-form string, structured so GPT-4o can read it easily.
JS Logic:
- Numbers each article
- Lists title, URL, and content
- Adds spacing for readability
Output Example:
1. How to Build with AI
Long form content…
2. Why Automation Matters
More content…
Pro Tip:
You can change the formatting here to match different prompt styles or AI summarization strategies.
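The logic described above could look roughly like this. It is a sketch rather than the workflow's exact code: it assumes each item carries `title`, `url`, and `content` fields (hypothetical names), and `combineArticles` is a made-up function name. Inside an n8n Code node you would pass it the incoming items instead of the sample array.

```javascript
// Hypothetical helper: number each article and join them into one prompt string.
function combineArticles(items) {
  return items
    .map((item, index) => {
      const { title, url, content } = item.json;
      // One numbered block per article: title line, URL line, then the text.
      return `${index + 1}. ${title}\n${url}\n${content}`;
    })
    .join("\n\n"); // blank line between articles for readability
}

const articles = [
  { json: { title: "How to Build with AI", url: "https://example.com/ai", content: "Long form content…" } },
  { json: { title: "Why Automation Matters", url: "https://example.com/auto", content: "More content…" } },
];
console.log(combineArticles(articles));
```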


Step 7: Summarize and Generate HTML Newsletter with GPT-4o
Node: Generate HTML Newsletter with Subject Using GPT-4o
This is the heart of the automation.
Prompt Logic:
- GPT-4o reads all articles
- It creates a catchy subject line
- Then for each article, it:
  - Adds an <h3> with the article title
  - Writes a 2–3 sentence summary in a <p>
  - Adds a clickable “read more” link
  - Separates entries with <br/>
Required Output Format:
{
  "subject": "Catchy newsletter title",
  "body": "<html content with summaries>"
}
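To make the prompt logic and output format concrete, here is one way the instructions could be assembled into a single prompt string. The wording is illustrative only and is not the prompt used in the original workflow. `buildNewsletterPrompt` is a hypothetical helper.

```javascript
// Hypothetical helper: wrap the combined articles in newsletter instructions.
function buildNewsletterPrompt(combinedArticles) {
  return [
    "You are writing an email newsletter from the articles below.",
    "First, write a catchy subject line.",
    "Then, for each article:",
    "- Add an <h3> with the article title.",
    "- Write a 2-3 sentence summary in a <p>.",
    '- Add a clickable "read more" link to the article URL.',
    "- Separate entries with <br/>.",
    'Respond with JSON only, in the form {"subject": "...", "body": "..."}, where body is the HTML.',
    "",
    "Articles:",
    combinedArticles,
  ].join("\n");
}

console.log(buildNewsletterPrompt("1. How to Build with AI\nhttps://example.com/ai\nLong form content…"));
```

Asking for JSON only keeps the response easy to split into a subject and a body in the next step.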
Why this step is powerful:
You’re not sending raw article dumps. You’re sending clean summaries, with links, in a structured, branded format.
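Language models do not always return exactly the JSON you asked for, so it is worth validating the response before it reaches Gmail. The sketch below assumes the model's reply arrives as a plain string. `parseNewsletter` is a hypothetical helper, not a node from the workflow.

```javascript
// Hypothetical helper: extract and validate { subject, body } from the model reply.
function parseNewsletter(modelOutput) {
  // Tolerate extra text around the JSON by slicing from the first "{" to the last "}".
  const start = modelOutput.indexOf("{");
  const end = modelOutput.lastIndexOf("}");
  if (start === -1 || end <= start) {
    throw new Error("No JSON object found in model output");
  }
  let parsed;
  try {
    parsed = JSON.parse(modelOutput.slice(start, end + 1));
  } catch (err) {
    throw new Error(`Model output is not valid JSON: ${err.message}`);
  }
  for (const field of ["subject", "body"]) {
    if (typeof parsed[field] !== "string" || parsed[field].trim() === "") {
      throw new Error(`Missing or empty "${field}" field`);
    }
  }
  return { subject: parsed.subject.trim(), body: parsed.body };
}

const reply = 'Here you go: {"subject": "This Week in AI", "body": "<h3>How to Build with AI</h3><p>Summary…</p>"}';
console.log(parseNewsletter(reply).subject); // This Week in AI
```

Failing loudly here is deliberate: an error in the workflow is easier to notice than an empty newsletter landing in an inbox.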


Step 8: Send the Newsletter via Gmail
Node: Send Newsletter via Gmail
Finally, the newsletter is sent using your Gmail account.
Mapped Fields:
- subject: From GPT
- message: HTML content created by GPT
- recipient: Can be your own inbox for review or a distribution list
Pro Tip:
You can duplicate this node to send to multiple team members or integrate it with Mailchimp or SendGrid later.
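As a rough illustration of the mapping, the parsed GPT output can be turned into the three fields listed above. The field names follow this article's list rather than any specific Gmail node parameter names, and `toEmailFields` is a hypothetical helper.

```javascript
// Hypothetical helper: map the parsed newsletter to the fields the send step needs.
function toEmailFields(newsletter, recipient) {
  // Very loose sanity check, not full email address validation.
  if (!/^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(recipient)) {
    throw new Error(`Recipient does not look like an email address: ${recipient}`);
  }
  return {
    recipient,
    // Collapse line breaks so the subject stays on a single line.
    subject: newsletter.subject.replace(/\s+/g, " ").trim(),
    message: newsletter.body, // HTML content created by GPT
  };
}

const email = toEmailFields(
  { subject: "This Week\nin AI", body: "<h3>How to Build with AI</h3><p>Summary…</p>" },
  "me@example.com"
);
console.log(email.subject); // This Week in AI
```

Sending to your own inbox first, as suggested above, is a cheap way to review the HTML before switching the recipient to a distribution list.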


Workflow Summary
Here’s the full automation in action:
- Start manually
- Pull URLs from Google Sheets
- Crawl each page using Dumpling AI
- Extract article title, content, and link
- Format all into a combined prompt
- GPT-4o summarizes each item and generates HTML
- Send it as an email newsletter
Conclusion: The Newsletter You No Longer Have to Write
With this automation in place, your newsletter becomes:
- Consistent (runs on your schedule)
- Well-structured (HTML, links, formatting handled)
- Smartly summarized (GPT-4o captures key insights)
- Beautifully delivered (straight to inbox)
Whether you’re reporting on company trends, recapping content marketing, or covering competitors, this system can save hours per week while keeping your brand informative and relevant.
Want to Level Up?
- Add GPT tone modifiers: formal, casual, witty
- Add a filter step to exclude older or low-word-count content
- Connect with Google Drive to archive each newsletter
- Auto-publish the newsletter to your blog using WordPress API
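As a starting point for the filter idea above, here is a minimal sketch that drops low-word-count articles before they reach GPT-4o. It assumes each item has a `content` field. The `filterShortArticles` name and the 200-word default are arbitrary choices for illustration.

```javascript
// Hypothetical helper: keep only articles with at least minWords words.
function filterShortArticles(items, minWords = 200) {
  return items.filter((item) => {
    const words = String(item.json.content ?? "")
      .trim()
      .split(/\s+/)
      .filter(Boolean); // drop the empty string produced by empty content
    return words.length >= minWords;
  });
}

const candidates = [
  { json: { title: "Full article", content: "one two three four five" } },
  { json: { title: "Stub page", content: "Subscribe" } },
];
console.log(filterShortArticles(candidates, 3).map((item) => item.json.title));
// [ 'Full article' ]
```

Placed between the field-mapping step and the prompt-building step, a filter like this keeps thin pages such as tag archives or login screens out of the newsletter.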