November 21, 2024
How to Automate Webpage Screenshot Capture and Data Extraction Using Dumpling AI in Make.com
In this tutorial, we walk you through setting up an automation in Make.com that uses Dumpling AI’s Screenshot URL and Extract Data from Image modules. The automation captures a screenshot of a webpage and extracts key information directly from that screenshot. This is particularly useful for visually rich pages, such as competitor sites and product feature pages, and for tracking updates on websites where scraping the HTML content might not be effective.
By the end of this guide, you’ll have a working automation that pulls URLs from Google Sheets, captures screenshots using Dumpling AI, extracts specific information, and stores the results back in Google Sheets, so each URL sits next to the information extracted from it.
Step 1: Setting Up Google Sheets as the Trigger
This setup allows the scenario to monitor changes in your Google Sheet. Each time a new URL is added, it will trigger the automation to capture and analyze the website content.
- Create a New Scenario in Make.com
- Log into Make.com and start a new scenario.
- Search for Google Sheets: Watch Rows and add it to your scenario.
- Configure Google Sheets Module
- Connection: Select your connected Google account.
- Spreadsheet ID: Choose the spreadsheet that contains the list of URLs you want to extract data from.
- Sheet Name: Select the sheet where your URLs are stored (“Sheet1” in this example).
- Limit: Set to 1 to process one row per trigger.
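To picture what a limit of 1 means in practice, here is a minimal Python sketch. It is not how Make.com implements Watch Rows; the function name `next_rows` and the "last processed row" bookkeeping are assumptions made for illustration. The `URL` and `__ROW_NUMBER__` keys mirror the fields mapped later in this guide.

```python
from typing import Dict, List


def next_rows(rows: List[Dict[str, str]], last_row: int, limit: int = 1) -> List[Dict[str, object]]:
    """Return up to `limit` rows that come after `last_row` (sheet row numbers)."""
    picked: List[Dict[str, object]] = []
    # Assumption: sheet row 1 is the header, so data starts at row 2.
    for row_number, row in enumerate(rows, start=2):
        if len(picked) >= limit:
            break
        if row_number > last_row and row.get("URL"):
            picked.append({"__ROW_NUMBER__": row_number, "URL": row["URL"]})
    return picked


sheet = [{"URL": "https://example.com"}, {"URL": "https://example.org"}]
print(next_rows(sheet, last_row=1))  # first run picks sheet row 2 only
print(next_rows(sheet, last_row=2))  # next run picks sheet row 3
```

With a limit of 1, each run hands a single row to the rest of the scenario, which keeps test runs easy to follow.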
Step 2: Capturing Webpage Screenshots Using Dumpling AI’s Screenshot Module
Dumpling AI’s Screenshot module captures a webpage as an image, which is especially useful for analyzing pages that are resistant to traditional scraping methods. With the settings below, it captures the full page rather than only the visible area.
- Add Dumpling AI Screenshot URL Module
- Add Dumpling AI: Screenshot URL to your scenario.
- Connection: Use your Dumpling AI connection. If you don’t have one, add it by entering your API key.
- Configure the Screenshot Module
- URL: Map the URL field from the Google Sheets trigger ({{1.URL}}).
- Screenshot Full Page: Set to true to capture the entire webpage.
- Auto Scroll: Enable to ensure content loaded dynamically (e.g., infinite scroll) is captured.
- Block Cookie Banners: Enable to avoid unnecessary pop-ups in your screenshots.
- Viewport: Set width and height to 1024×1024 for consistent sizing.
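The settings above can be written down as one small data structure, which is handy for documenting or reviewing your configuration. The sketch below is illustrative only: the key names (`fullPage`, `autoScroll`, `blockCookieBanners`, `viewport`) are assumptions that mirror the module labels, not a confirmed Dumpling AI request format. It also checks the URL up front, since an empty or malformed cell in the sheet is a common cause of failed runs.

```python
from urllib.parse import urlparse


def build_screenshot_settings(url: str, width: int = 1024, height: int = 1024) -> dict:
    """Collect the Step 2 settings in one dictionary (key names are illustrative)."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"Not a valid http(s) URL: {url!r}")
    return {
        "url": url,
        "fullPage": True,            # Screenshot Full Page
        "autoScroll": True,          # Auto Scroll
        "blockCookieBanners": True,  # Block Cookie Banners
        "viewport": {"width": width, "height": height},
    }


print(build_screenshot_settings("https://example.com"))
```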
Step 3: Extracting Data from the Captured Screenshot Using Dumpling AI
This step extracts information from the captured screenshot, making it perfect for analyzing product features, benefits, or promotional content displayed on a webpage. The flexibility of Dumpling AI’s extraction module allows you to customize the prompt to fit your needs.
- Add Dumpling AI Extract Image Module
- Add the Dumpling AI: Extract Image module to your scenario.
- Connection: Use your existing Dumpling AI connection.
- Input Method: Select URL since you are using a screenshot URL from the previous step.
- Images: Map the screenshot URL output from the Screenshot module ({{2.screenshotUrl}}).
- Configure Extraction Parameters
- Prompt: Use a prompt tailored to the information you want to extract. For example: “Analyze the URL and list key benefits, features, and any highlighted value propositions.”
- JSON Mode: Set to false if you want a simple text response. Set to true if you prefer structured JSON output for easier parsing.
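The difference between the two JSON Mode settings shows up when you process the result downstream. Here is a minimal sketch, assuming the extraction result arrives as a string; the function name `parse_extraction` and the sample payloads are hypothetical and are not taken from Dumpling AI’s output.

```python
import json
from typing import Any


def parse_extraction(result: str, json_mode: bool) -> Any:
    """Return parsed JSON when json_mode is on, otherwise the trimmed text."""
    if not json_mode:
        return result.strip()
    try:
        return json.loads(result)
    except json.JSONDecodeError:
        # Fall back to the raw text so one malformed response does not stop the run.
        return result.strip()


print(parse_extraction('{"features": ["fast setup"]}', json_mode=True))
print(parse_extraction("  Fast setup, clear pricing.\n", json_mode=False))
```

Structured output is easier to split across several columns, while plain text is enough if you store the whole answer in one cell.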
Step 4: Storing Extracted Data Back in Google Sheets
This step ensures all extracted insights are stored in a centralized location, making it easy to review and analyze the data you’ve captured.
- Add Google Sheets: Update Row Module
- Add Google Sheets: Update Row to save the extracted data back into your Google Sheets.
- Connection: Use your connected Google account.
- Spreadsheet ID: Select the original spreadsheet.
- Row Number: Map {{1.__ROW_NUMBER__}} to update the correct row.
- Values: Map the extracted content from the Extract Image module ({{3.results}}) to the designated column.
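Conceptually, this step writes one value into one cell: the row comes from the trigger and the column is the one you designate for results. The sketch below expresses that target in A1 notation. The helper names and the dictionary shape are assumptions for illustration; Make.com handles this mapping for you inside the Update Row module.

```python
def column_letter(index: int) -> str:
    """Convert a 1-based column index to a column letter (1 -> A, 27 -> AA)."""
    if index < 1:
        raise ValueError("Column index must be 1 or greater")
    letters = ""
    while index > 0:
        index, remainder = divmod(index - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


def build_row_update(sheet_name: str, row_number: int, column_index: int, value: str) -> dict:
    """Describe a single-cell update, such as the extracted text for one URL."""
    cell = f"{sheet_name}!{column_letter(column_index)}{row_number}"
    return {"range": cell, "values": [[value]]}


# Write the extracted text into column B of the row that triggered the run.
print(build_row_update("Sheet1", 5, 2, "Key benefits: ..."))
```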
Step 5: Testing and Activating Your Automation
- Run a Test Scenario
- Add a sample URL to your Google Sheet.
- Manually run the scenario to confirm:
- The URL is correctly captured and processed.
- A screenshot is taken using Dumpling AI.
- Relevant information is extracted from the screenshot.
- The results are saved back into Google Sheets.
- Activate the Scenario
Once testing is successful, activate the scenario to automate the process. Every time a new URL is added to your Google Sheet, the automation will run, capturing and extracting data as specified.
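If you want to reason about the flow before spending API credits, the checks from the test run can be mirrored in a small dry run with stand-in functions. Everything here is hypothetical: `dry_run`, `fake_screenshot`, and `fake_extract` only imitate the order of the steps and call no real service.

```python
from typing import Callable, Dict, List


def dry_run(
    urls: List[str],
    take_screenshot: Callable[[str], str],
    extract: Callable[[str], str],
) -> List[Dict[str, str]]:
    """Run each URL through screenshot -> extract and collect what would be saved."""
    saved = []
    for url in urls:
        screenshot_url = take_screenshot(url)  # Step 2: capture
        result = extract(screenshot_url)       # Step 3: extract
        saved.append({"url": url, "screenshot": screenshot_url, "result": result})  # Step 4: store
    return saved


def fake_screenshot(url: str) -> str:
    """Stand-in for the Screenshot URL module."""
    return url + "/screenshot.png"


def fake_extract(screenshot_url: str) -> str:
    """Stand-in for the extraction module."""
    return "extracted from " + screenshot_url


for row in dry_run(["https://example.com"], fake_screenshot, fake_extract):
    print(row)
```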
This automated workflow is a good fit for businesses that want to extract insights from webpages without relying on traditional web scraping techniques. Combining Dumpling AI’s screenshot and image extraction modules lets you pull data even from visually rich content.
Get the Blueprint Featured in This Guide
Access the full blueprint here to get started on setting up this automation effortlessly!