> For the complete documentation index, see [llms.txt](https://docs.coupler.io/llms.txt). Markdown versions of documentation pages are available by appending `.md` to page URLs; this page is available as [Markdown](https://docs.coupler.io/sources/category/files-and-tables/box-data-extract/best-practices.md).

# Best Practices

## Recommended setup


| Practice | Details |
| --- | --- |
| Use one entity per data flow | Each Box Data Extract entity serves a distinct purpose. Keep text extraction, AI ask, and AI extract in separate data flows so you can tune prompts and schedules independently. |
| Target narrow folders | Point your data flow at a folder containing only the files you need. Avoid top-level folders with mixed content, which slow processing and make results harder to work with downstream. |
| Use structured extraction for consistent schemas | If you need the same fields from every document (e.g., vendor, amount, and date from invoices), use the Stream AI extract structured folders entity with a defined schema instead of a free-form prompt. The output is far easier to load into a database or spreadsheet. |
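
To illustrate why a defined schema is easier to load downstream, here is a minimal Python sketch of a fixed invoice schema. It is not part of Coupler.io or Box: the `InvoiceRecord` class, the `to_invoice_record` helper, and the shape of the `raw` record are all hypothetical, and only the field names (vendor, amount, date) come from the example above.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal


@dataclass(frozen=True)
class InvoiceRecord:
    """Hypothetical fixed schema: one row per invoice document."""
    vendor: str
    amount: Decimal
    invoice_date: date


def to_invoice_record(raw: dict) -> InvoiceRecord:
    """Coerce one extracted record into the fixed schema.

    Raises KeyError when a field is missing, and a parsing error when
    the amount or date cannot be converted.
    """
    return InvoiceRecord(
        vendor=str(raw["vendor"]).strip(),
        amount=Decimal(str(raw["amount"])),
        invoice_date=date.fromisoformat(str(raw["date"])),
    )


# Assumed shape of a single extracted record (illustrative values only).
raw = {"vendor": " Acme Corp ", "amount": "1250.00", "date": "2024-03-31"}
record = to_invoice_record(raw)
```

Because every record must supply the same three typed fields, a document that is missing a value fails loudly instead of silently producing a row with a different set of columns.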

## AI prompt design


| Practice | Details |
| --- | --- |
| Be specific in your prompts | Vague prompts like "summarize this document" produce inconsistent results. Instead, ask for exactly what you need: "What is the contract end date and the name of the counterparty?" |
| Test prompts on a small folder first | Before running on your full document library, create a test folder with 5–10 representative files. This lets you refine your prompt without burning through Box AI credits on bad results. |
| Use Ask AI for Q&A, Extract for data | Stream AI ask folders is best for open-ended questions where a one-sentence answer is fine. Stream AI extract folders (or its structured variant) is better when you need structured values that you'll analyze or store in a table. |

## Combining entities with Coupler.io transformations


| Practice | Details |
| --- | --- |
| Join text and structured data by file ID | Run a text representation entity and a structured extraction entity, then use Coupler.io's Join transformation on `file_id` to combine raw text with extracted fields in a single output table. |
| Append across multiple folders | If your documents are spread across several Box folders (e.g., one per client or project), create a source for each folder and use Coupler.io's Append transformation to consolidate them into one dataset. |
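
The effect of the two transformations can be sketched in plain Python. This only models the resulting table shapes and is not how Coupler.io implements them: the helper names `join_on_file_id` and `append_rows`, and the `text`, `vendor`, and `amount` columns, are hypothetical. The sketch uses an inner join for simplicity, which may differ from the join type you configure.

```python
def join_on_file_id(text_rows, field_rows):
    """Inner-join two lists of dicts on their 'file_id' key."""
    fields_by_id = {row["file_id"]: row for row in field_rows}
    joined = []
    for row in text_rows:
        match = fields_by_id.get(row["file_id"])
        if match is not None:
            # Merge raw text columns with extracted-field columns.
            joined.append({**row, **match})
    return joined


def append_rows(*datasets):
    """Stack datasets that share the same columns into one list."""
    combined = []
    for rows in datasets:
        combined.extend(rows)
    return combined


# Illustrative rows for one client's folder (values are made up).
text_rows = [
    {"file_id": "101", "text": "Invoice from Acme"},
    {"file_id": "102", "text": "Invoice from Globex"},
]
field_rows = [
    {"file_id": "101", "vendor": "Acme", "amount": "1250.00"},
]
client_a = join_on_file_id(text_rows, field_rows)

# A second folder that already has the same columns.
client_b = [
    {"file_id": "201", "text": "Invoice from Initech",
     "vendor": "Initech", "amount": "90.00"},
]
all_clients = append_rows(client_a, client_b)
```

In this sketch, file `102` drops out of the joined result because it has no extracted fields, which is a useful signal that a document failed extraction. Appending works cleanly only because both datasets share the same columns, which is another reason to prefer a fixed schema.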

## Common pitfalls

{% hint style="danger" %}
Don't enable Recursive on a large folder tree without testing first. Processing hundreds of nested files can exhaust Box AI quotas and cause long-running or failed data flows.
{% endhint %}

{% columns %}
{% column %}
#### Do

* Confirm Box AI is enabled on your plan before building AI entity flows
* Add the Box app's service account as a collaborator on your target folder
* Use a fixed schema (structured extraction) when you need consistent column output
* Test with a small, focused folder before expanding scope
{% endcolumn %}

{% column %}
#### Don't

* Use the root Box folder ID for large organizations; scope it to the relevant subfolder instead
* Leave AI prompts blank for AI entities, which produces empty results or errors
* Rely on free-form AI extract output as a stable schema, because free-form output can drift in structure
* Mix file types with very different structures in the same folder if using structured extraction
{% endcolumn %}
{% endcolumns %}