# Best Practices

## Recommended setup

### Use a dedicated service account
Create a separate GCP Service Account just for Coupler.io instead of using a personal account. This isolates permissions, makes auditing easier, and allows you to revoke access without affecting other tools.
### Grant minimal permissions
Assign only **BigQuery Data Viewer** and **BigQuery Job User** roles. Avoid granting Editor or Owner roles. If writing to BigQuery as a destination, also add **BigQuery Data Editor** — nothing more.
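Assuming the service account already exists, the role bindings above can be granted with the `gcloud` CLI. A sketch (the project ID and service account email are placeholders — substitute your own):

```bash
# Placeholder values — replace with your project and service account.
SA="coupler-io@my-project.iam.gserviceaccount.com"

# Read access to table data
gcloud projects add-iam-policy-binding my-project \
  --member="serviceAccount:${SA}" --role="roles/bigquery.dataViewer"

# Permission to run query jobs
gcloud projects add-iam-policy-binding my-project \
  --member="serviceAccount:${SA}" --role="roles/bigquery.jobUser"

# Only if BigQuery is also a destination:
# gcloud projects add-iam-policy-binding my-project \
#   --member="serviceAccount:${SA}" --role="roles/bigquery.dataEditor"
```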
### Start with small, testable queries
Before scheduling a complex query, run it manually in Coupler.io and compare results to BigQuery's web UI. This catches SQL errors and permission issues early.
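One way to vet a query before scheduling it is BigQuery's dry-run mode, which validates the SQL and reports the bytes it would scan without actually running it. A sketch using the `bq` CLI (the table and column names are illustrative):

```bash
# --dry_run checks the SQL and prints the bytes that would be processed,
# without executing the query or incurring query charges.
bq query --use_legacy_sql=false --dry_run \
  'SELECT order_id, revenue FROM `my-project.sales.orders` WHERE order_date >= "2024-01-01"'
```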
### Test with a manual run first
Always run your data flow manually once and verify the output before scheduling. Coupler.io requires a successful manual run before scheduling is allowed.
## Data refresh and scheduling

### Use date-filtered queries for incremental loads
Instead of always fetching the full table, filter by a date range to only get new or updated records. Example: `WHERE updated_at >= DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY)`. This reduces query costs and execution time.
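A fuller sketch of such an incremental query (table and column names are illustrative, and `updated_at` is assumed to be a `DATE` column — for a `TIMESTAMP` column, use `TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 24 HOUR)` instead):

```sql
-- Fetch only rows touched in the last day; each scheduled run in Append mode
-- then adds just the new or updated records instead of re-importing the table.
SELECT
  order_id,
  customer_name,
  revenue,
  updated_at
FROM `my-project.sales.orders`
WHERE updated_at >= DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY)
```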
### Schedule during off-peak hours
If running large queries, schedule them during low-traffic periods (e.g., early morning or weekends) to avoid contention and reduce costs.
### Use Replace mode for daily snapshots
If you want a fresh copy of data each day, use **Replace** mode. If you want a running log of all records, use **Append** mode with appropriate date filtering.
## Performance optimization

### Select only the columns you need
Avoid `SELECT *`. BigQuery charges for bytes scanned, so listing specific columns reduces costs. Example: `SELECT order_id, customer_name, revenue FROM table` instead of `SELECT *`.
### Filter early in your WHERE clause
Push filters down to the query itself rather than filtering in Google Sheets later. BigQuery processes WHERE conditions before returning results, scanning fewer bytes and running faster.
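Combining the two tips above, a lean query names only the columns it needs and pushes the filter into the SQL itself (all identifiers are illustrative):

```sql
-- Scans only three columns and only the rows matching the filter;
-- a "SELECT *, then filter in Sheets" approach would scan the whole table.
SELECT order_id, region, revenue
FROM `my-project.sales.orders`
WHERE region = 'EMEA'
  AND order_date >= '2024-01-01'
```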
### Use partitioned or clustered tables
If you own the BigQuery table, partition it by date and cluster by frequently-filtered columns (e.g., region, product). Coupler.io will scan less data, queries run faster, and costs drop.
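If you control the table's schema, partitioning and clustering can be declared at creation time. A sketch in BigQuery DDL (schema and names are illustrative):

```sql
-- Daily partitions on order_date, plus clustering on the columns most often
-- used in WHERE clauses; date-filtered queries can then prune whole partitions.
CREATE TABLE `my-project.sales.orders`
(
  order_id   STRING,
  region     STRING,
  product    STRING,
  revenue    NUMERIC,
  order_date DATE
)
PARTITION BY order_date
CLUSTER BY region, product;
```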
## Common pitfalls

### Do

- Test your SQL query directly in BigQuery first
- Use full table references: `project.dataset.table`
- Filter by date ranges to reduce data scanned
- Grant the service account only the roles it needs
- Use Replace mode for daily snapshots, Append for logs
- Monitor BigQuery cost and query execution time
### Don't

- Share your GCP project key with team members; use Coupler.io's centralized connection instead
- Use `SELECT *` in production queries; specify columns explicitly
- Run very large unfiltered queries on schedules; they may time out or spike costs
- Change SQL queries without testing them first in BigQuery
- Use Append mode without date filtering (it creates duplicates over time)
- Set destination cell ranges too small in Replace mode (data gets cut off)
Avoid unoptimized queries on frequent schedules. A query that scans 100 GB hourly will cost significantly more than a filtered query scanning 1 GB daily. Always optimize your SQL and consider scheduling frequency carefully.
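To make that cost gap concrete, here is a back-of-the-envelope calculation assuming BigQuery's on-demand model, where you pay per TiB scanned. The per-TiB rate below is a placeholder — the actual rate varies by region and changes over time:

```python
def monthly_scan_cost(gb_per_run: float, runs_per_month: int,
                      price_per_tib: float = 5.0) -> float:
    """Approximate monthly on-demand query cost for a scheduled query.

    price_per_tib is an assumed placeholder rate, not a current price.
    """
    tib_scanned = gb_per_run * runs_per_month / 1024  # GB -> TiB
    return tib_scanned * price_per_tib

# 100 GB scanned hourly vs. 1 GB scanned daily, over a 30-day month
hourly_unfiltered = monthly_scan_cost(100, 24 * 30)  # ~$351.56/month
daily_filtered = monthly_scan_cost(1, 30)            # ~$0.15/month
print(f"unfiltered hourly: ${hourly_unfiltered:.2f}/month")
print(f"filtered daily:    ${daily_filtered:.2f}/month")
```

Whatever the exact rate, the ratio between the two schedules is 2400x — the rate cancels out, so the optimization pays off regardless of pricing changes.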
Don't change dataset permissions after setup. If you revoke the service account's access to a dataset, all data flows querying that dataset will fail. Test permission changes in a development dataset first.