Apify Dataset
Apify is a web scraping and automation platform that lets you extract data from websites using pre-built or custom actors (scrapers). When an actor runs, it stores its output in a dataset — a structured collection of items you can export and analyze. Connecting Apify datasets to Coupler.io lets you pull that scraped data into your preferred destination automatically.
Why connect Apify Dataset to Coupler.io?
Centralize scraped data — move web-extracted data into Google Sheets, BigQuery, Excel, or Looker Studio without manual exports
Combine with other sources — use Append or Join transformations to merge datasets from multiple actors or runs
Send to AI tools — pipe scraped content into ChatGPT, Claude, Gemini, or Perplexity for analysis, summarization, or classification
Keep data fresh — schedule recurring syncs so your destination always reflects the latest actor output
Prerequisites
An active Apify account with at least one completed actor run that has produced dataset output
Your Apify API token (found in Apify Console under Settings > Integrations)
The Dataset ID of the dataset you want to export (found in Storage > Datasets in Apify Console)
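Before building the flow, you can sanity-check your API token directly against the Apify API. The sketch below only constructs the request URL; `GET /v2/users/me` is Apify's endpoint for the account that owns a token, and the token shown is a placeholder, not a real credential:

```python
from urllib.parse import urlencode

API_BASE = "https://api.apify.com/v2"

def token_check_url(token: str) -> str:
    # GET /v2/users/me returns the account that owns the token,
    # so a 200 response confirms the token works.
    return f"{API_BASE}/users/me?{urlencode({'token': token})}"

print(token_check_url("apify_api_XXXX"))  # placeholder token
```

Opening the printed URL (or requesting it with any HTTP client) should return your account details if the token is valid.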
Quick start
If you want to export items from a website content crawler actor, choose the Item collection website content crawlers entity — it maps the crawler output fields automatically.
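To see why this entity is convenient, consider what a single crawler item looks like. The JSON below is an illustrative, assumed example; the exact field names (`url`, `metadata.title`, `text`) are typical of website content crawler output but can vary by actor:

```python
import json

# An illustrative (assumed) item from a website content crawler run;
# real field names depend on the actor's output schema.
raw = '''
{"url": "https://example.com/",
 "metadata": {"title": "Example Domain"},
 "text": "This domain is for use in illustrative examples."}
'''
item = json.loads(raw)

# The crawler entity pre-maps nested fields like these into flat columns:
row = {
    "url": item["url"],
    "title": item["metadata"]["title"],
    "content": item["text"],
}
print(row["title"])  # → Example Domain
```

With the generic Item collection entity you would get the raw nested items instead and map fields yourself.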
How to connect
Create a new data flow in Coupler.io. From your Coupler.io dashboard, click Add data flow and search for Apify Dataset as your source.
Enter your Apify API token. In the source settings, paste your API token from Apify Console (Settings > Integrations > API token). Coupler.io uses this to authenticate requests on your behalf.
Enter your Dataset ID. In the Dataset ID field, paste the ID of the dataset you want to export. You can find it in Apify Console under Storage > Datasets — click on the dataset and copy the ID from the URL or the detail panel.
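If you want to confirm the dataset holds the data you expect before wiring it into a flow, you can preview its items via Apify's public items endpoint, `GET /v2/datasets/{datasetId}/items`. A minimal sketch (the dataset ID and token are placeholders; the network call requires valid credentials):

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

API_BASE = "https://api.apify.com/v2"

def items_url(dataset_id: str, token: str, limit: int = 5) -> str:
    # Items endpoint: GET /v2/datasets/{datasetId}/items
    qs = urlencode({"token": token, "format": "json", "limit": limit})
    return f"{API_BASE}/datasets/{dataset_id}/items?{qs}"

def preview_items(dataset_id: str, token: str, limit: int = 5):
    # Fetch the first few items so you can confirm the dataset's shape.
    with urlopen(items_url(dataset_id, token, limit)) as resp:
        return json.load(resp)
```

If `preview_items` returns the rows you expect, the same Dataset ID will work in Coupler.io.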
Choose an entity. Select one of the available entities depending on what you want to export — see the table below for a quick guide.
Choose a destination. Pick where you want your data to land — Google Sheets, Excel, BigQuery, Looker Studio, or an AI destination like ChatGPT, Claude, Gemini, Cursor, Perplexity, or OpenClaw.
Run the data flow. Click Run to execute a manual sync and confirm data is flowing correctly before setting up a schedule.
Entities overview
Dataset collections — a list of all datasets stored in your Apify account
Datasets — metadata and details for a specific dataset
Item collection — the raw items (rows) stored in a specific dataset
Item collection website content crawlers — items from a website content crawler actor, with pre-mapped fields like URL, title, and page content
