# Best Practices

## Recommended setup

<table data-card-size="large" data-view="cards"><thead><tr><th></th><th></th></tr></thead><tbody><tr><td><strong>Whitelist Coupler.io IPs early</strong></td><td>Add Coupler.io's IP addresses to your PostgreSQL firewall before you create any data flows. This prevents connection errors and speeds up troubleshooting. Check the connection dialog in Coupler.io for the exact IPs to whitelist in AWS, Azure, or your on-premises firewall.</td></tr><tr><td><strong>Create a read-only database user</strong></td><td>Don't use your admin credentials. Create a dedicated PostgreSQL user with SELECT-only permissions on the tables you need: `CREATE USER coupler_user WITH PASSWORD 'secure_password'; GRANT SELECT ON table_name TO coupler_user;` This improves security and prevents accidental modifications.</td></tr><tr><td><strong>Start with a test query</strong></td><td>Before scheduling, run a manual test with a simple query like `SELECT * FROM table LIMIT 100;` to confirm the connection works and the data looks correct. This catches permission or schema issues early.</td></tr><tr><td><strong>Use Custom SQL for filtering</strong></td><td>Write WHERE clauses in your SQL query to pull only the data you need, rather than importing everything and filtering in the destination. This speeds up imports and reduces data transfer costs.</td></tr></tbody></table>
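The read-only user from the example above can be set up more completely so it can also reach the schema and any tables added later. A sketch, assuming the tables live in the `public` schema of a database named `mydb` (both placeholders), with the `coupler_user` name from the example:

```sql
-- Dedicated read-only user for Coupler.io (placeholder name and password)
CREATE USER coupler_user WITH PASSWORD 'secure_password';

-- Allow the user to connect and to see objects in the schema
GRANT CONNECT ON DATABASE mydb TO coupler_user;
GRANT USAGE ON SCHEMA public TO coupler_user;

-- Read access to all current tables in the schema
GRANT SELECT ON ALL TABLES IN SCHEMA public TO coupler_user;

-- Read access to tables created in the future
ALTER DEFAULT PRIVILEGES IN SCHEMA public
  GRANT SELECT ON TABLES TO coupler_user;
```

Swap `mydb` and `public` for your actual database and schema. If you want to scope access more tightly, grant SELECT on individual tables instead of the whole schema.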

## Data refresh and scheduling

<table data-card-size="large" data-view="cards"><thead><tr><th></th><th></th></tr></thead><tbody><tr><td><strong>Schedule imports during off-peak hours</strong></td><td>If your PostgreSQL database serves production traffic, schedule Coupler.io data flows during low-traffic windows (e.g., 2–4 AM). This prevents the import from competing with user queries and causing slowdowns.</td></tr><tr><td><strong>Refresh frequency depends on data volatility</strong></td><td>High-velocity data (e.g., real-time metrics, hourly events) may need daily or hourly refreshes. Historical reports or snapshots can refresh weekly. Start with daily and adjust based on your use case.</td></tr><tr><td><strong>Monitor failed runs</strong></td><td>Coupler.io will notify you of failures. Check your email alerts and the data flow's run history regularly. Common causes of failure are IP whitelist changes, schema changes, and revoked permissions.</td></tr></tbody></table>
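One way to keep scheduled runs light is to pull only the window of data that changed since the last run, rather than the full table on every refresh. A sketch, assuming a hypothetical `orders` table with a `created_at` timestamp column; size the interval to your refresh schedule, with a small overlap in case a run is delayed:

```sql
-- Daily refresh: pull only rows from roughly the last day,
-- with a one-hour overlap to cover delayed runs
SELECT id, customer_id, total, created_at
FROM orders
WHERE created_at >= NOW() - INTERVAL '25 hours'
ORDER BY created_at;
```

If your destination appends each run's results, the overlap means occasional duplicate rows; deduplicate on `id` in the destination or shrink the overlap if duplicates are a problem.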

## Performance optimization

<table data-card-size="large" data-view="cards"><thead><tr><th></th><th></th></tr></thead><tbody><tr><td><strong>Index frequently filtered columns</strong></td><td>If your SQL uses WHERE clauses on columns like `user_id`, `created_at`, or `status`, create indexes: `CREATE INDEX idx_created_at ON orders(created_at);` Indexes speed up data retrieval significantly, especially for large tables.</td></tr><tr><td><strong>Aggregate at the source, not the destination</strong></td><td>Use GROUP BY and SUM/COUNT in your SQL query to aggregate data in PostgreSQL. This transfers fewer rows and is faster than aggregating in Google Sheets or BI tools after import.</td></tr><tr><td><strong>Avoid SELECT * on large tables</strong></td><td>Specify only the columns you need: `SELECT id, name, email, created_at FROM users;` Fewer columns = faster imports and smaller file sizes in your destination.</td></tr></tbody></table>
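The three tips above combine naturally in a single import query: index the filtered column, select only the columns you need, and aggregate before the data leaves PostgreSQL. A sketch with hypothetical table and column names:

```sql
-- Index the column the WHERE clause filters on
CREATE INDEX IF NOT EXISTS idx_orders_created_at ON orders (created_at);

-- Aggregate daily revenue at the source instead of exporting raw rows
SELECT
  DATE_TRUNC('day', created_at) AS order_day,
  status,
  COUNT(*)   AS order_count,
  SUM(total) AS revenue
FROM orders
WHERE created_at >= DATE '2024-01-01'
GROUP BY order_day, status
ORDER BY order_day;
```

A query like this might transfer a few hundred summary rows instead of millions of raw order rows, which is the difference between a seconds-long import and one that strains both the database and the destination.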

## Common pitfalls

{% hint style="danger" %}
**Don't use production admin credentials** — If your admin password is compromised or exposed in logs, your entire database is at risk. Always create a read-only user with minimal permissions.
{% endhint %}

{% hint style="danger" %}
**Don't forget to test after schema changes** — If you alter a table (add/remove columns, change types), your data flow may break or return unexpected data. After any schema change, run a manual test to confirm everything still works.
{% endhint %}

{% columns %}
{% column %}
**Do**

* Filter by date range in your SQL to reduce data volume
* Test queries in psql before pasting them into Coupler.io
* Create indexes on columns used in WHERE and JOIN clauses
* Use Custom SQL for complex transformations
* Schedule imports during off-peak database hours
{% endcolumn %}

{% column %}
**Don't**

* Use SELECT \* on tables with hundreds of columns
* Import raw log tables with billions of rows (aggregate first)
* Share database credentials in Slack or email
* Assume a table or view has been updated without confirming in psql
* Run multiple heavy data flows simultaneously on the same database
{% endcolumn %}
{% endcolumns %}
