# Apify Dataset

Apify is a web scraping and automation platform that lets you extract data from websites using pre-built or custom actors (scrapers). When an actor runs, it stores its output in a **dataset** — a structured collection of items you can export and analyze. Connecting Apify datasets to Coupler.io lets you pull that scraped data into your preferred destination automatically.

## Why connect Apify Dataset to Coupler.io?

* **Centralize scraped data** — move web-extracted data into Google Sheets, BigQuery, Excel, or Looker Studio without manual exports
* **Combine with other sources** — use Append or Join transformations to merge datasets from multiple actors or runs
* **Send to AI tools** — pipe scraped content into ChatGPT, Claude, Gemini, or Perplexity for analysis, summarization, or classification
* **Keep data fresh** — schedule recurring syncs so your destination always reflects the latest actor output

## Prerequisites

* An active Apify account with at least one completed actor run that has produced dataset output
* Your Apify API token (found in Apify Console under **Settings > Integrations**)
* The Dataset ID of the dataset you want to export (found in **Storage > Datasets** in Apify Console)
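Before configuring the flow, it can help to confirm that your API token and Dataset ID work together. A minimal Python sketch, assuming the endpoint shape of Apify's public API at `api.apify.com/v2` (verify the exact path and parameters against Apify's API reference):

```python
# Sketch: check that an API token and Dataset ID are valid by building the
# URL for the dataset's metadata endpoint on the Apify API (api.apify.com/v2).
# The endpoint shape is taken from Apify's public API; verify it against the
# current API reference before relying on it.
from urllib.parse import urlencode

APIFY_API_BASE = "https://api.apify.com/v2"

def dataset_metadata_url(dataset_id: str, token: str) -> str:
    """Build the URL that returns metadata for a single dataset."""
    query = urlencode({"token": token})
    return f"{APIFY_API_BASE}/datasets/{dataset_id}?{query}"

# Example (placeholder IDs, not real credentials):
url = dataset_metadata_url("AbCdEfGh123", "apify_api_XXXX")
print(url)
# A GET request to this URL returns JSON metadata (e.g. item count) when the
# token and Dataset ID are valid, and an error status otherwise.
```

A quick request to this URL from your own machine tells you whether the credentials you are about to paste into Coupler.io will authenticate.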

## Quick tip

{% hint style="success" %}
If you want to export items from a website content crawler actor, choose the **Item collection website content crawlers** entity — it maps the crawler output fields automatically.
{% endhint %}

## How to connect

{% stepper %}
{% step %}
**Create a new data flow in Coupler.io.** From your Coupler.io dashboard, click **Add data flow** and search for **Apify Dataset** as your source.
{% endstep %}

{% step %}
**Enter your Apify API token.** In the source settings, paste your API token from Apify Console (**Settings > Integrations > API token**). Coupler.io uses this to authenticate requests on your behalf.
{% endstep %}

{% step %}
**Enter your Dataset ID.** In the **Dataset ID** field, paste the ID of the dataset you want to export. You can find it in Apify Console under **Storage > Datasets** — click on the dataset and copy the ID from the URL or the detail panel.
{% endstep %}

{% step %}
**Choose an entity.** Select the entity that matches what you want to export — see the table below for a quick guide.
{% endstep %}

{% step %}
**Choose a destination.** Pick where you want your data to land — Google Sheets, Excel, BigQuery, Looker Studio, or an AI destination like ChatGPT, Claude, Gemini, Cursor, Perplexity, or OpenClaw.
{% endstep %}

{% step %}
**Run the data flow.** Click **Run** to execute a manual sync and confirm data is flowing correctly before setting up a schedule.
{% endstep %}
{% endstepper %}
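Under the hood, a run like this amounts to paging through the dataset's items. The sketch below shows that pattern, assuming Apify's "Get dataset items" endpoint with `offset`/`limit`/`clean`/`format` query parameters — double-check these against Apify's API reference, as they are not defined by this guide:

```python
# Sketch of what a sync does behind the scenes: page through a dataset's
# items via Apify's "Get dataset items" endpoint. The offset/limit/clean/
# format parameters follow Apify's public API; confirm them against the
# current API reference before relying on this.
import json
from urllib.parse import urlencode
from urllib.request import urlopen

APIFY_API_BASE = "https://api.apify.com/v2"

def items_page_url(dataset_id: str, token: str,
                   offset: int = 0, limit: int = 1000) -> str:
    """URL for one page of dataset items, requested as clean JSON."""
    query = urlencode({
        "token": token,
        "format": "json",
        "clean": "true",   # strip Apify-internal fields from each item
        "offset": offset,
        "limit": limit,
    })
    return f"{APIFY_API_BASE}/datasets/{dataset_id}/items?{query}"

def fetch_all_items(dataset_id: str, token: str, page_size: int = 1000) -> list:
    """Iterate pages until a short page signals the end of the dataset."""
    items, offset = [], 0
    while True:
        with urlopen(items_page_url(dataset_id, token, offset, page_size)) as resp:
            page = json.load(resp)
        items.extend(page)
        if len(page) < page_size:
            return items
        offset += page_size
```

You do not need to write this yourself — Coupler.io handles pagination for you — but it clarifies why the token and Dataset ID are the only two credentials the source asks for.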

## Entities overview

| Entity                                       | What it returns                                                                                      |
| -------------------------------------------- | ---------------------------------------------------------------------------------------------------- |
| **Dataset collections**                      | A list of all datasets stored in your Apify account                                                  |
| **Datasets**                                 | Metadata and details for a specific dataset                                                          |
| **Item collection**                          | The raw items (rows) stored in a specific dataset                                                    |
| **Item collection website content crawlers** | Items from a website content crawler actor, with pre-mapped fields like URL, title, and page content |

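To illustrate the difference between the raw **Item collection** entity and the pre-mapped crawler entity, here is a sketch of the kind of flattening the crawler entity performs. The `url`, `metadata.title`, and `text` field names mirror the typical output of Apify's Website Content Crawler, but treat them as assumptions — inspect a real item under **Storage > Datasets** to confirm the shape your actor produces:

```python
# Sketch: flatten one raw crawler item into the flat row a spreadsheet
# destination expects. Field names (url, metadata.title, text) are assumed
# from typical Website Content Crawler output — verify against a real item.

def to_row(item: dict) -> dict:
    """Map one raw crawler item to flat url/title/content columns."""
    return {
        "url": item.get("url", ""),
        "title": (item.get("metadata") or {}).get("title", ""),
        "content": item.get("text", ""),
    }

# Hypothetical item shaped like a crawler result:
sample = {
    "url": "https://example.com/pricing",
    "metadata": {"title": "Pricing"},
    "text": "Plans start at ...",
}
print(to_row(sample))
# → {'url': 'https://example.com/pricing', 'title': 'Pricing', 'content': 'Plans start at ...'}
```

With the **Item collection website content crawlers** entity, Coupler.io applies this kind of mapping for you; with plain **Item collection**, you get the nested items as-is.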