# Data Overview

Azure Blob Storage doesn't have predefined entities the way a CRM or an API does. Instead, you sync **files** from your container. The data structure depends entirely on the format and content of your blobs.

## File formats and structure

#### Structured formats

| Format      | Row/record structure                         | Data types                                    | Use case                                      |
| ----------- | -------------------------------------------- | --------------------------------------------- | --------------------------------------------- |
| **CSV**     | Comma-separated columns, first row = headers | Text, numbers (as strings unless converted)   | Sales transactions, customer lists, inventory |
| **JSONL**   | One JSON object per line                     | Nested objects, arrays, mixed types preserved | API logs, event streams, user activity        |
| **Parquet** | Binary columnar format                       | Native types (int, float, string, date, etc.) | Analytics data, large datasets, BI tools      |
| **Excel**   | Rows and columns in sheets                   | Mixed types per column                        | Budgets, forecasts, reports                   |
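
To make the row-based formats concrete, here is a minimal sketch that writes the same two records as CSV and as JSONL (the file names and field values are hypothetical). It also shows why the table above notes that CSV values land as strings unless converted, while JSONL preserves types:

```python
# Minimal sketch: the same records written as CSV vs. JSONL.
# File names and field values are hypothetical examples.
import csv
import json

records = [
    {"order_id": 1001, "customer": "Acme", "total": 249.90},
    {"order_id": 1002, "customer": "Globex", "total": 99.00},
]

# CSV: first row = headers; numbers come back as strings on read.
with open("orders.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["order_id", "customer", "total"])
    writer.writeheader()
    writer.writerows(records)

# JSONL: one complete JSON object per line; int/float types survive.
with open("orders.jsonl", "w") as f:
    for record in records:
        f.write(json.dumps(record) + "\n")
```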

#### Unstructured format

| Format           | Content                        | Example use                              |
| ---------------- | ------------------------------ | ---------------------------------------- |
| **Unstructured** | Raw text, logs, markdown, HTML | Log files, error messages, documentation |

## Common import patterns

**Single file sync** — Pull one CSV or Excel file into Google Sheets for analysis.

**Multi-file append** — Use glob patterns like `monthly/*.csv` to match all monthly reports and append them into a single BigQuery table. This builds a historical archive without a separate data flow for each file.

**Nested folder sync** — Glob patterns like `data/**/*.parquet` sync Parquet files from every subfolder of the `data` directory, which is useful for tiered storage structures.
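
To see how these two patterns differ, here is a small sketch using Python's `glob` module, which follows the same `*` / `**` conventions (this illustrates the matching semantics, not Coupler.io's internal matcher; the file tree is hypothetical):

```python
# Sketch: which hypothetical blob paths each glob pattern matches.
import glob
import pathlib
import tempfile

# Build a tiny directory tree to stand in for a container.
root = pathlib.Path(tempfile.mkdtemp())
for p in ["report.csv", "monthly/2024-01.csv", "monthly/2024-02.csv",
          "data/2024/sales.parquet", "data/archive/2023/sales.parquet"]:
    f = root / p
    f.parent.mkdir(parents=True, exist_ok=True)
    f.touch()

def matches(pattern):
    return sorted(str(pathlib.Path(m).relative_to(root))
                  for m in glob.glob(str(root / pattern), recursive=True))

print(matches("*.csv"))              # ['report.csv'] -- root only
print(matches("monthly/*.csv"))      # both monthly files
print(matches("data/**/*.parquet"))  # Parquet files at any depth under data/
```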

**Format conversion** — Export data from a legacy system as Excel, sync it through Coupler.io, and land it in BigQuery as a properly typed table.

## Use cases by role

{% tabs %}
{% tab title="Data analysts" %}
Sync Parquet or CSV files from data lakes into BigQuery or Snowflake. Use glob patterns to match versioned exports (e.g., `exports/v2/*.parquet`) and Append transformations to build historical tables for trend analysis and modeling.
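
As a rough sketch of what the append pattern produces, here is the equivalent done locally with pandas (paths are hypothetical; Coupler.io's Append transformation does this for you on each sync):

```python
# Minimal sketch: stack versioned Parquet exports into one
# historical table. Reading Parquet also requires pyarrow.
import glob

import pandas as pd

paths = sorted(glob.glob("exports/v2/*.parquet"))
history = pd.concat((pd.read_parquet(p) for p in paths), ignore_index=True)
```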
{% endtab %}

{% tab title="Finance teams" %}
Pull monthly export files from accounting systems (exported as CSV or Excel) into Google Sheets or a financial data warehouse. Schedule weekly syncs to stay current with bank statements, invoices, and budget actuals.
{% endtab %}

{% tab title="Data engineers" %}
Ingest raw event logs (JSONL format) from your application into a data lake. Use start dates to sync only recent logs, reducing storage and processing overhead.
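
For a sense of what a modification-time filter does, here is a sketch using the `azure-storage-blob` SDK (the connection string, container name, and prefix are placeholders; Coupler.io applies this filter for you):

```python
# Sketch: list only blobs modified after a start date.
from datetime import datetime, timezone

from azure.storage.blob import BlobServiceClient

start_date = datetime(2024, 1, 1, tzinfo=timezone.utc)
service = BlobServiceClient.from_connection_string("<connection-string>")
container = service.get_container_client("app-logs")

recent = [
    blob.name
    for blob in container.list_blobs(name_starts_with="events/")
    if blob.last_modified >= start_date
]
print(recent)
```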
{% endtab %}

{% tab title="Operations" %}
Sync inventory or capacity reports (Excel or CSV) from Azure into Looker Studio for real-time dashboarding. Combine with other data sources using Join transformations to correlate stock levels with sales trends.
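
A minimal sketch of the join step, done locally with pandas (file and column names are hypothetical; Coupler.io's Join transformation performs the equivalent during a sync):

```python
# Sketch: join inventory with sales on a shared key.
import pandas as pd

inventory = pd.read_csv("inventory.csv")  # e.g. columns: sku, stock_level
sales = pd.read_csv("sales.csv")          # e.g. columns: sku, units_sold
combined = inventory.merge(sales, on="sku", how="left")
```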
{% endtab %}
{% endtabs %}

## Platform-specific notes

* **Glob patterns are powerful** — `*.csv` matches files in the container root; `**/*.csv` matches files in nested folders; `data/2024/*.parquet` targets a specific year
* **Start date filters by modification time** — Files modified after the start date are synced; unchanged files are skipped on refresh
* **Excel syncs one sheet** — If your workbook has multiple sheets, create separate data flows for each or export individual sheets as CSVs
* **JSONL requires line-delimited JSON** — Each line must be a complete, valid JSON object, not pretty-printed across multiple lines; see the sketch after this list
* **Large files and performance** — Parquet files compress better than CSV; for very large datasets, consider splitting into multiple containers or using date-based folder structures
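
Here is the JSONL check referenced above: a minimal sketch that verifies each line of a file parses as a complete JSON object (the file name `events.jsonl` is a placeholder):

```python
# Minimal sketch: validate a file as JSONL (one JSON object per line).
import json

def check_jsonl(path):
    with open(path, encoding="utf-8") as f:
        for lineno, line in enumerate(f, start=1):
            line = line.strip()
            if not line:
                continue  # skip blank lines, though it's best to avoid them
            try:
                record = json.loads(line)
            except json.JSONDecodeError as e:
                raise ValueError(f"line {lineno} is not valid JSON: {e}")
            if not isinstance(record, dict):
                raise ValueError(f"line {lineno} is not a JSON object")

check_jsonl("events.jsonl")
```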
