# Data Overview

Apify datasets store the output of actor runs as structured collections of items — essentially rows of scraped or processed data. The exact fields in each item depend on the actor that produced them, but Coupler.io supports four entities that let you access both metadata and the raw item content.

## Entities and what they return

| Entity                                   | Description                                                                                                 |
| ---------------------------------------- | ----------------------------------------------------------------------------------------------------------- |
| Dataset collections                      | Returns a list of all datasets in your Apify account, including IDs, names, and creation dates              |
| Datasets                                 | Returns metadata for a specific dataset: item count, size, creation date, and last modified time            |
| Item collections                         | Returns the full list of items (rows) stored in a specific dataset — fields vary by actor                   |
| Item collection website content crawlers | Returns items from website content crawlers with standardized fields like URL, page title, and crawled text |

## Available fields

#### Dataset collections fields

| Field            | Description                                          |
| ---------------- | ---------------------------------------------------- |
| `id`             | Unique identifier for the dataset                    |
| `name`           | Dataset name (if set)                                |
| `createdAt`      | Timestamp when the dataset was created               |
| `modifiedAt`     | Timestamp of the last modification                   |
| `accessedAt`     | Timestamp of the last access                         |
| `itemCount`      | Number of items stored in the dataset                |
| `cleanItemCount` | Number of items excluding empty or duplicate records |

#### Dataset metadata fields

| Field            | Description                                         |
| ---------------- | --------------------------------------------------- |
| `id`             | Dataset ID                                          |
| `name`           | Dataset name                                        |
| `userId`         | ID of the Apify user who owns the dataset           |
| `createdAt`      | Creation timestamp                                  |
| `modifiedAt`     | Last modified timestamp                             |
| `itemCount`      | Total item count                                    |
| `cleanItemCount` | Clean item count                                    |
| `actId`          | ID of the actor that created this dataset           |
| `actRunId`       | ID of the specific actor run that produced the data |

#### Item collection fields

Fields vary depending on the actor that produced the dataset. Common fields include:

| Field         | Description                               |
| ------------- | ----------------------------------------- |
| `url`         | The URL that was scraped                  |
| `title`       | Page or record title                      |
| `description` | Short description or meta description     |
| `text`        | Extracted text content                    |
| `price`       | Price (e-commerce actors)                 |
| `imageUrl`    | Image URL                                 |
| Custom fields | Any additional fields output by the actor |
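Because item schemas differ between actors, a common first step when working with raw item collections is to normalize every item to one shared set of columns, filling gaps with nulls. A minimal sketch, with made-up items standing in for real actor output:

```python
# Illustrative items from two different (hypothetical) actors; note the
# fields don't fully overlap, which is typical for Item collections.
items = [
    {"url": "https://example.com/a", "title": "Page A", "price": 19.99},
    {"url": "https://example.com/b", "title": "Page B", "text": "Hello"},
]

# Normalize to the common columns from the table above; missing fields
# become None so every row has the same shape.
columns = ["url", "title", "description", "text", "price", "imageUrl"]
rows = [{col: item.get(col) for col in columns} for item in items]

print(rows[0]["price"])  # 19.99
print(rows[1]["price"])  # None
```

Spreadsheet and warehouse destinations expect a stable column set, which is why this kind of normalization matters before loading.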

#### Item collection website content crawler fields

| Field      | Description                                |
| ---------- | ------------------------------------------ |
| `url`      | Crawled page URL                           |
| `title`    | Page title                                 |
| `text`     | Full extracted text from the page          |
| `markdown` | Page content in Markdown format            |
| `metadata` | Additional page metadata                   |
| `crawl`    | Crawl metadata (depth, referrer URL, etc.) |
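Website content crawler items nest some of their fields: page metadata sits under `metadata` and crawl context under `crawl`. A sketch of one such item, with illustrative values:

```python
# Hypothetical website content crawler item; values are invented, but the
# top-level field names follow the table above.
item = {
    "url": "https://example.com/docs",
    "title": "Docs",
    "text": "Plain extracted text from the page.",
    "markdown": "# Docs\n\nPlain extracted text from the page.",
    "metadata": {"description": "Example docs page"},
    "crawl": {"depth": 1, "referrerUrl": "https://example.com/"},
}

# For AI destinations, the markdown field is usually the best payload
# because it preserves the page's structure.
content = item["markdown"]
crawl_depth = item["crawl"]["depth"]
```

Accessing nested fields like `crawl.depth` this way is useful when you want to filter items by how deep in the site they were found.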

## Common field combinations

* **Content audits**: `url` + `title` + `text` from Item collections to review all scraped pages
* **Dataset monitoring**: `itemCount` + `cleanItemCount` + `modifiedAt` from Datasets to track actor run output over time
* **AI analysis**: `url` + `markdown` from Item collection website content crawlers, piped into ChatGPT or Claude for summarization
* **Cross-dataset comparison**: Use **Append** transformation to stack item collections from multiple actor runs into one unified table
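The last combination, stacking item collections from multiple runs, can be pictured like this. The snippet is a rough sketch of what an Append transformation produces, not Coupler.io's actual implementation, and the run IDs and items are invented:

```python
# Two (hypothetical) actor runs, each with its own item collection.
runs = {
    "run_2024_05_01": [{"url": "https://example.com/a", "price": 10.0}],
    "run_2024_05_08": [{"url": "https://example.com/a", "price": 12.0}],
}

# Append = stack all rows into one table, tagging each row with its
# source run so the history stays traceable.
combined = [
    {**item, "sourceRun": run_id}
    for run_id, items in runs.items()
    for item in items
]

print(len(combined))  # 2
```

Tagging each row with a source identifier is what makes the unified table usable for trend analysis, e.g. tracking a price across runs.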

## Use cases by role

{% tabs %}
{% tab title="Marketers" %}

* Pull competitor pricing data scraped by an Apify actor into Google Sheets for weekly tracking
* Send crawled website content to ChatGPT or Gemini for automated content gap analysis
* Append item collections from multiple scraping runs to build a historical dataset of SERP results
  {% endtab %}

{% tab title="Data teams" %}

* Load raw item collections into BigQuery for transformation and analysis at scale
* Join dataset metadata with item collections to enrich records with actor run context (run ID, creation date)
* Schedule syncs to keep a data warehouse table updated after each actor run
  {% endtab %}

{% tab title="Developers" %}

* Use Dataset collections to programmatically audit all datasets stored in an account
* Export item collections to Excel or Looker Studio for stakeholder-ready reporting without writing custom export scripts
* Pipe website content crawler output into Cursor or Claude for AI-assisted code or content generation workflows
  {% endtab %}
  {% endtabs %}

## Platform-specific notes

* Item fields are actor-dependent — the schema of Item collection data will differ between actors (e.g., an e-commerce scraper vs. a news scraper)
* The `cleanItemCount` may be lower than `itemCount` if the actor produced empty or duplicate records
* Apify datasets are tied to a specific actor run; if you re-run an actor, it creates a new dataset with a new ID
* For website content crawlers, the `markdown` field is the most useful for feeding content into AI destinations
* Very large datasets (millions of items) may require pagination — Coupler.io handles this automatically
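On the pagination point: Coupler.io handles this for you, but conceptually it amounts to fetching pages with an offset and limit until a short page signals the end. In the sketch below, `fetch_page` is a stand-in for the real HTTP call, and the dataset contents are faked:

```python
# Pretend dataset contents; a real dataset could hold millions of items.
ALL_ITEMS = [{"id": i} for i in range(250)]

def fetch_page(offset, limit):
    # Stand-in for the real items request; in practice this would be an
    # HTTP call with offset/limit query parameters.
    return ALL_ITEMS[offset:offset + limit]

def fetch_all(limit=100):
    items, offset = [], 0
    while True:
        page = fetch_page(offset, limit)
        items.extend(page)
        if len(page) < limit:  # a short page means we've reached the end
            return items
        offset += limit

print(len(fetch_all()))  # 250
```

The short-page termination check avoids an extra empty request when the total item count happens to be an exact multiple of the page size minus one; either way, the loop always terminates once a page comes back smaller than `limit`.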
