# Box Data Extract

Box is a cloud content management platform used by teams to store, share, and collaborate on documents. The Box Data Extract source in Coupler.io lets you pull text content and AI-generated insights directly from files stored in your Box folders.

With this source, you can go beyond syncing files: you can extract structured data and ask questions about document content using Box AI.

**Why connect Box Data Extract to Coupler.io?**

* Extract raw text from Box documents and load it into spreadsheets, databases, or AI tools
* Use Box AI to pull structured metadata from documents without manual review
* Ask natural-language questions about your Box files and get answers as structured data
* Route document content to destinations like Google Sheets, BigQuery, or AI tools such as ChatGPT, Claude, or Gemini

## Prerequisites

Before you start, make sure you have:

* A Box account with access to the folder you want to extract data from
* A Box Developer App configured with **Client Credentials Grant** or **JWT** authentication
* Your Box API credentials: **Client ID**, **Client Secret**, and **Enterprise ID** (or **User ID**)
* The **Folder ID** of the Box folder you want to process (visible in the URL when you open the folder in Box)
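
If you want to sanity-check your credentials before entering them in Coupler.io, the sketch below builds the form payload for a Client Credentials Grant token request. It assumes Box's OAuth 2.0 token endpoint and its `box_subject_type` / `box_subject_id` parameters as described in Box's developer documentation. The function name `build_ccg_token_request` is hypothetical, and this is not how Coupler.io itself is confirmed to authenticate.

```python
from urllib.parse import urlencode

# Assumption: Box's OAuth 2.0 token endpoint, per Box developer docs.
BOX_TOKEN_URL = "https://api.box.com/oauth2/token"


def build_ccg_token_request(client_id, client_secret,
                            enterprise_id=None, user_id=None):
    """Build the form fields for a Client Credentials Grant token request.

    Exactly one of enterprise_id or user_id must be given, mirroring the
    "Enterprise ID (or User ID)" prerequisite.
    """
    if bool(enterprise_id) == bool(user_id):
        raise ValueError("Provide exactly one of enterprise_id or user_id")

    subject_type = "enterprise" if enterprise_id else "user"
    return {
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        "box_subject_type": subject_type,
        "box_subject_id": enterprise_id or user_id,
    }


# Placeholder values only; substitute the ones from your Box Developer App.
payload = build_ccg_token_request("my-client-id", "my-client-secret",
                                  enterprise_id="12345")
# Body to POST to BOX_TOKEN_URL as application/x-www-form-urlencoded.
body = urlencode(payload)
```

Sending `body` as a POST request to `BOX_TOKEN_URL` should return an access token if the app is authorized. An error response at this stage usually points to a problem in the Box app configuration rather than in Coupler.io.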

{% hint style="success" %}
If you're not sure which entity to start with, try **Stream text representation folders** first. It returns the raw text content of your documents and works without any AI prompt configuration.
{% endhint %}

## Quick start

{% stepper %}
{% step %}
**Create a new data flow in Coupler.io** and select **Box Data Extract** as the source.
{% endstep %}

{% step %}
**Enter your Box API credentials.** You'll need your Client ID, Client Secret, and either your Enterprise ID or User ID. These come from your Box Developer App settings at developer.box.com.
{% endstep %}

{% step %}
**Set your Folder ID.** Paste the ID of the Box folder containing the files you want to process. You can find it in the URL when you open the folder in Box: it's the number after `/folder/`.
{% endstep %}

{% step %}
**Choose an entity.** Select from Stream text representation folders, Stream AI ask folders, Stream AI extract folders, or Stream AI extract structured folders depending on what kind of data you need.
{% endstep %}

{% step %}
**Configure entity-specific settings.** If you selected an AI entity, enter the relevant prompt in the Ask AI Prompt or Extract AI Prompt field. Enable **Recursive** if you want to include subfolders.
{% endstep %}

{% step %}
**Choose your destination.** Send your data to Google Sheets, Excel, BigQuery, Looker Studio, or an AI tool like ChatGPT, Claude, Cursor, Gemini, Perplexity, or OpenClaw.
{% endstep %}

{% step %}
**Run your data flow manually** to confirm everything is working before setting up a schedule.
{% endstep %}
{% endstepper %}
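
If you have many folder URLs to process, the Folder ID lookup from the steps above can be scripted. This minimal sketch pulls the number after `/folder/` out of a Box URL. The helper name `extract_folder_id` and the sample URL are made up for illustration.

```python
import re
from urllib.parse import urlparse


def extract_folder_id(url: str) -> str:
    """Return the numeric ID that follows /folder/ in a Box folder URL."""
    # Look only at the path so query strings and fragments are ignored.
    path = urlparse(url).path
    match = re.search(r"/folder/(\d+)", path)
    if match is None:
        raise ValueError(f"No /folder/<id> segment found in {url!r}")
    return match.group(1)


# Hypothetical URL; copy the real one from your browser's address bar.
print(extract_folder_id("https://app.box.com/folder/123456789"))  # prints 123456789
```

The ID is returned as a string so it can be pasted into the Folder ID field unchanged.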

## Available entities

| Entity                               | Description                                                           |
| ------------------------------------ | --------------------------------------------------------------------- |
| Stream text representation folders   | Raw text content extracted from documents in the specified Box folder |
| Stream AI ask folders                | AI-generated answers to a custom question asked about each document   |
| Stream AI extract folders            | Metadata extracted from documents using a free-form Box AI prompt     |
| Stream AI extract structured folders | Structured data extracted using a predefined field schema via Box AI  |
