# Best Practices

## Recommended setup

<table data-card-size="large" data-view="cards"><thead><tr><th></th><th></th></tr></thead><tbody><tr><td><strong>Use a dedicated service account</strong></td><td>Create a separate GCP service account just for Coupler.io instead of using a personal account. This isolates permissions, makes auditing easier, and allows you to revoke access without affecting other tools.</td></tr><tr><td><strong>Grant minimal permissions</strong></td><td>Assign only <strong>BigQuery Data Viewer</strong> and <strong>BigQuery Job User</strong> roles. Avoid granting Editor or Owner roles. If writing to BigQuery as a destination, also add <strong>BigQuery Data Editor</strong>, nothing more.</td></tr><tr><td><strong>Start with small, testable queries</strong></td><td>Before scheduling a complex query, run it manually in Coupler.io and compare results to BigQuery's web UI. This catches SQL errors and permission issues early.</td></tr><tr><td><strong>Test with a manual run first</strong></td><td>Always run your data flow manually once and verify the output before scheduling. Coupler.io requires a successful manual run before scheduling is allowed.</td></tr></tbody></table>
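A cheap smoke-test query is a good first run: it confirms the service account can read the table and run jobs before you invest in a complex query. A minimal sketch, assuming a hypothetical `my-project.sales.orders` table with an `order_date` column:

```sql
-- Sanity check: verifies read access and job permissions at minimal cost.
-- Replace the project, dataset, table, and column names with your own.
SELECT COUNT(*) AS row_count
FROM `my-project.sales.orders`
WHERE order_date >= DATE_SUB(CURRENT_DATE(), INTERVAL 7 DAY);
```

If this fails with a permission error, fix the service account's roles before debugging anything else; if it succeeds, any later failure is almost certainly in the query itself.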

## Data refresh and scheduling

<table data-card-size="large" data-view="cards"><thead><tr><th></th><th></th></tr></thead><tbody><tr><td><strong>Use date-filtered queries for incremental loads</strong></td><td>Instead of always fetching the full table, filter by a date range to only get new or updated records. Example: `WHERE updated_at >= DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY)`. This reduces query costs and execution time.</td></tr><tr><td><strong>Schedule during off-peak hours</strong></td><td>If running large queries, schedule them during low-traffic periods (e.g., early morning or weekends) to avoid contention and reduce costs.</td></tr><tr><td><strong>Use Replace mode for daily snapshots</strong></td><td>If you want a fresh copy of data each day, use <strong>Replace</strong> mode. If you want a running log of all records, use <strong>Append</strong> mode with appropriate date filtering.</td></tr></tbody></table>
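A full incremental query might look like the sketch below. The table and column names are hypothetical; note that if your change-tracking column is a `TIMESTAMP` rather than a `DATE`, BigQuery requires the `TIMESTAMP_SUB`/`CURRENT_TIMESTAMP()` variants, since it will not implicitly compare a `TIMESTAMP` against a `DATE`.

```sql
-- Hypothetical schema: an `orders` table with an `updated_at` TIMESTAMP column.
-- Fetch only rows touched in the last day; pair this with Append mode so each
-- scheduled run adds just the new slice instead of re-importing everything.
SELECT order_id, customer_name, revenue, updated_at
FROM `my-project.sales.orders`
WHERE updated_at >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY);
```

Keep the filter interval slightly wider than the schedule interval (e.g., 25 hours for a daily run) if you would rather tolerate a few duplicates than risk missing rows updated during the run itself.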

## Performance optimization

<table data-card-size="large" data-view="cards"><thead><tr><th></th><th></th></tr></thead><tbody><tr><td><strong>Select only the columns you need</strong></td><td>Avoid `SELECT *`. BigQuery charges for bytes scanned, so listing specific columns reduces costs. Example: `SELECT order_id, customer_name, revenue FROM table` instead of `SELECT *`.</td></tr><tr><td><strong>Filter early in your WHERE clause</strong></td><td>Push filters down to the query itself rather than filtering in Google Sheets later. BigQuery processes WHERE conditions before returning results, scanning fewer bytes and running faster.</td></tr><tr><td><strong>Use partitioned or clustered tables</strong></td><td>If you own the BigQuery table, partition it by date and cluster by frequently filtered columns (e.g., region, product). Coupler.io will scan less data, queries run faster, and costs drop.</td></tr></tbody></table>
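If you control the source table, the partitioning and clustering advice above can be applied with a one-time DDL statement. A sketch with hypothetical table and column names:

```sql
-- One-time setup: a date-partitioned, clustered copy of an orders table.
-- Queries that filter on the partition date and clustered columns scan only
-- the matching partitions and blocks, reducing both cost and runtime.
CREATE TABLE `my-project.sales.orders_optimized`
PARTITION BY DATE(order_ts)
CLUSTER BY region, product
AS
SELECT order_id, customer_name, region, product, revenue, order_ts
FROM `my-project.sales.orders`;

-- A scheduled import that benefits from that layout: the DATE(order_ts)
-- filter lets BigQuery prune partitions outside the 30-day window.
SELECT order_id, revenue
FROM `my-project.sales.orders_optimized`
WHERE DATE(order_ts) >= DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY)
  AND region = 'EMEA';
```

The key point is that the scheduled query's `WHERE` clause must reference the partitioning column directly; a filter on an unrelated column still scans every partition.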

## Common pitfalls

{% columns %}
{% column %}
**Do**

* Test your SQL query directly in BigQuery first
* Use full table references: `project.dataset.table`
* Filter by date ranges to reduce data scanned
* Grant service account only the roles it needs
* Use Replace mode for daily snapshots, Append for logs
* Monitor BigQuery cost and query execution time
  {% endcolumn %}

{% column %}
**Don't**

* Share your GCP project key with team members — use Coupler.io's centralized connection
* Use `SELECT *` in production queries — specify columns explicitly
* Run very large unfiltered queries on schedules — they may time out or spike costs
* Change SQL queries without testing them first in BigQuery
* Use Append mode without date filtering (creates duplicates over time)
* Set destination cell ranges too small in Replace mode (data gets cut off)
  {% endcolumn %}
  {% endcolumns %}
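The "use full table references" rule from the list above is worth seeing side by side. With hypothetical names:

```sql
-- Ambiguous: resolves against the connection's default project and dataset,
-- which may differ from what you tested in the BigQuery console.
SELECT order_id FROM orders;

-- Explicit: a fully qualified `project.dataset.table` reference resolves to
-- the same table everywhere, regardless of connection defaults.
SELECT order_id FROM `my-project.sales.orders`;
```

Fully qualified references also make data flows portable: if the connection's default project ever changes, the query's meaning does not.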

{% hint style="danger" %}
**Avoid unoptimized queries on frequent schedules.** A query that scans 100 GB hourly will cost significantly more than a filtered query scanning 1 GB daily. Always optimize your SQL and consider scheduling frequency carefully.
{% endhint %}

{% hint style="danger" %}
**Don't change dataset permissions after setup.** If you revoke the service account's access to a dataset, all data flows querying that dataset will fail. Test permission changes in a development dataset first.
{% endhint %}
