# Best Practices

## Recommended setup

<table data-card-size="large" data-view="cards"><thead><tr><th></th><th></th></tr></thead><tbody><tr><td><strong>Use one entity per data flow</strong></td><td>Each Box Data Extract entity serves a distinct purpose. Keep text extraction, AI ask, and AI extract in separate data flows so you can tune prompts and schedules independently.</td></tr><tr><td><strong>Target narrow folders</strong></td><td>Point your data flow at a folder containing only the files you need. Avoid top-level folders with mixed content: they slow processing and make results harder to work with downstream.</td></tr><tr><td><strong>Use structured extraction for consistent schemas</strong></td><td>If you need the same fields from every document (e.g., vendor, amount, and date from invoices), use Stream AI extract structured folders with a defined schema instead of a free-form prompt. The output is far easier to load into a database or spreadsheet.</td></tr></tbody></table>
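
To illustrate why a fixed schema is easier to load downstream, here is a minimal Python sketch. It is not Coupler.io or Box code: the field names come from the invoice example above, and the sample records, `to_row`, and `missing_fields` are hypothetical names made up for illustration.

```python
# Hypothetical fixed schema, taken from the invoice example (vendor, amount, date).
INVOICE_FIELDS = ("vendor", "amount", "date")


def to_row(extracted: dict) -> dict:
    """Map one extraction result onto the fixed schema.

    Fields outside the schema are dropped and missing fields become None,
    so every row has the same columns in the same order.
    """
    return {field: extracted.get(field) for field in INVOICE_FIELDS}


def missing_fields(extracted: dict) -> list:
    """List the schema fields that are absent or empty in one result."""
    return [f for f in INVOICE_FIELDS if extracted.get(f) in (None, "")]


# Made-up sample results: the second one lacks "amount" and carries an extra key,
# which is the kind of variation a free-form prompt might produce.
results = [
    {"vendor": "Acme Ltd", "amount": "1200.00", "date": "2024-03-01"},
    {"vendor": "Globex", "date": "2024-03-04", "notes": "free-form extra"},
]

rows = [to_row(r) for r in results]
gaps = [missing_fields(r) for r in results]
```

With a defined schema, every row already has this shape, so a normalization step like `to_row` is unnecessary before loading into a database or spreadsheet.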

## AI prompt design

<table data-card-size="large" data-view="cards"><thead><tr><th></th><th></th></tr></thead><tbody><tr><td><strong>Be specific in your prompts</strong></td><td>Vague prompts like "summarize this document" produce inconsistent results. Instead, ask for exactly what you need: "What is the contract end date and the name of the counterparty?"</td></tr><tr><td><strong>Test prompts on a small folder first</strong></td><td>Before running on your full document library, create a test folder with 5–10 representative files. This lets you refine your prompt without burning through Box AI credits on bad results.</td></tr><tr><td><strong>Use Ask AI for Q&#x26;A, Extract for data</strong></td><td>Stream AI ask folders is best for open-ended questions where a sentence-length answer is fine. Stream AI extract folders (or its structured variant) is better when you need values you'll analyze or store in a table.</td></tr></tbody></table>

## Combining entities with Coupler.io transformations

<table data-card-size="large" data-view="cards"><thead><tr><th></th><th></th></tr></thead><tbody><tr><td><strong>Join text and structured data by file ID</strong></td><td>Run a text representation entity and a structured extraction entity, then use Coupler.io's Join transformation on file_id to combine raw text with extracted fields in a single output table.</td></tr><tr><td><strong>Append across multiple folders</strong></td><td>If your documents are spread across several Box folders (e.g., one per client or project), create a source for each folder and use Coupler.io's Append transformation to consolidate them into one dataset.</td></tr></tbody></table>
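
The two transformations above can be pictured with a short Python sketch. This is a conceptual illustration, not how Coupler.io implements them: it assumes a left join (the join type Coupler.io applies may differ), and apart from `file_id` the column names, sample rows, and function names are hypothetical.

```python
def join_on_file_id(text_rows, field_rows):
    """Left join: keep every text row and add extracted fields when file_id matches."""
    fields_by_id = {row["file_id"]: row for row in field_rows}
    return [{**row, **fields_by_id.get(row["file_id"], {})} for row in text_rows]


def append_sources(*sources):
    """Stack rows from several sources, aligning them on the union of their columns."""
    columns = []
    for rows in sources:
        for row in rows:
            for key in row:
                if key not in columns:
                    columns.append(key)
    # Columns a source does not provide are filled with None.
    return [{col: row.get(col) for col in columns} for rows in sources for row in rows]


# Made-up output of a text representation entity and a structured extraction entity.
text_rows = [
    {"file_id": "1", "text": "Invoice A"},
    {"file_id": "2", "text": "Invoice B"},
]
field_rows = [{"file_id": "1", "vendor": "Acme"}]

joined = join_on_file_id(text_rows, field_rows)

# Two folders (e.g., one per client) consolidated into one dataset.
combined = append_sources(joined[:1], joined[1:])
```

The sketch shows why a stable `file_id` matters: it is the only key that ties the raw text to the extracted fields, and rows without a match keep their text but gain no extracted columns.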

## Common pitfalls

{% hint style="danger" %}
Don't enable Recursive on a large folder tree without testing first. Processing hundreds of nested files can exhaust Box AI quotas and cause long-running or failed data flows.
{% endhint %}

{% columns %}
{% column %}

#### Do

* Confirm Box AI is enabled on your plan before building AI entity flows
* Add the Box app's service account as a collaborator on your target folder
* Use a fixed schema (structured extraction) when you need consistent column output
* Test with a small, focused folder before expanding scope
  {% endcolumn %}

{% column %}

#### Don't

* Use the root Box folder ID for large organizations; scope the data flow to the relevant subfolder instead
* Leave AI prompts blank for AI entities, which produces empty results or errors
* Rely on free-form AI extract output as a stable schema, because its output structure can drift
* Mix file types with very different structures in the same folder if using structured extraction
  {% endcolumn %}
  {% endcolumns %}
