Apify Dataset

Apify is a web scraping and automation platform that lets you extract data from websites using pre-built or custom actors (scrapers). When an actor runs, it stores its output in a dataset — a structured collection of items you can export and analyze. Connecting Apify datasets to Coupler.io lets you pull that scraped data into your preferred destination automatically.

Why connect Apify Dataset to Coupler.io?

  • Centralize scraped data — move web-extracted data into Google Sheets, BigQuery, Excel, or Looker Studio without manual exports

  • Combine with other sources — use Append or Join transformations to merge datasets from multiple actors or runs

  • Send to AI tools — pipe scraped content into ChatGPT, Claude, Gemini, or Perplexity for analysis, summarization, or classification

  • Keep data fresh — schedule recurring syncs so your destination always reflects the latest actor output

Prerequisites

  • An active Apify account with at least one completed actor run that has produced dataset output

  • Your Apify API token (found in Apify Console under Settings > Integrations)

  • The Dataset ID of the dataset you want to export (found in Storage > Datasets in Apify Console)
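Before creating the data flow, you can sanity-check your token and Dataset ID directly against Apify's REST API, which serves dataset items at `/v2/datasets/{datasetId}/items`. A minimal sketch; the token and ID values are placeholders for your own:

```python
import urllib.parse

APIFY_TOKEN = "apify_api_XXXX"  # placeholder - use your own token
DATASET_ID = "abc123"           # placeholder - use your own Dataset ID

def dataset_items_url(dataset_id: str, token: str, limit: int = 5) -> str:
    """Build the Apify 'get dataset items' endpoint URL."""
    query = urllib.parse.urlencode(
        {"token": token, "format": "json", "limit": limit}
    )
    return f"https://api.apify.com/v2/datasets/{dataset_id}/items?{query}"

print(dataset_items_url(DATASET_ID, APIFY_TOKEN))
```

Fetching the printed URL (for example with `curl` or `urllib.request.urlopen`) should return the first few items as JSON, confirming the credentials work before you paste them into Coupler.io.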

Quick start


How to connect

1. Create a new data flow in Coupler.io. From your Coupler.io dashboard, click Add data flow and search for Apify Dataset as your source.

2. Enter your Apify API token. In the source settings, paste your API token from Apify Console (Settings > Integrations > API token). Coupler.io uses this to authenticate requests on your behalf.

3. Enter your Dataset ID. In the Dataset ID field, paste the ID of the dataset you want to export. You can find it in Apify Console under Storage > Datasets: click the dataset and copy the ID from the URL or the detail panel.
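If you have the dataset open in Apify Console, the Dataset ID is the last path segment of the page URL. A small helper for pulling it out; the example URL and ID below are hypothetical:

```python
from urllib.parse import urlparse

def dataset_id_from_console_url(url: str) -> str:
    """Return the last non-empty path segment, i.e. the Dataset ID."""
    return urlparse(url).path.rstrip("/").rsplit("/", 1)[-1]

# Hypothetical console URL and ID for illustration
url = "https://console.apify.com/storage/datasets/xYz12AbCdEf"
print(dataset_id_from_console_url(url))  # → xYz12AbCdEf
```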

4. Choose an entity. Select one of the available entities depending on what you want to export. See the table below for a quick guide.

5. Choose a destination. Pick where you want your data to land: Google Sheets, Excel, BigQuery, Looker Studio, or an AI destination like ChatGPT, Claude, Gemini, Cursor, Perplexity, or OpenClaw.

6. Run the data flow. Click Run to execute a manual sync and confirm data is flowing correctly before setting up a schedule.

Entities overview

Entity                                       What it returns
Dataset collections                          A list of all datasets stored in your Apify account
Datasets                                     Metadata and details for a specific dataset
Item collection                              The raw items (rows) stored in a specific dataset
Item collection website content crawlers     Items from a website content crawler actor, with pre-mapped fields like URL, title, and page content
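Scraped items are often nested JSON, while tabular destinations like Google Sheets expect flat rows. A minimal sketch of the kind of flattening involved when items become rows; the item and its field names are invented for illustration:

```python
def flatten(item: dict, parent_key: str = "", sep: str = ".") -> dict:
    """Flatten nested dicts into dotted keys, e.g. meta.title."""
    flat = {}
    for key, value in item.items():
        new_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            flat.update(flatten(value, new_key, sep=sep))
        else:
            flat[new_key] = value
    return flat

# Hypothetical scraped item with a nested "meta" object
item = {"url": "https://example.com", "meta": {"title": "Example", "lang": "en"}}
print(flatten(item))
# → {'url': 'https://example.com', 'meta.title': 'Example', 'meta.lang': 'en'}
```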
