Best Practices

Whitelist Coupler.io IPs early

Add Coupler.io's IP addresses to your PostgreSQL firewall before you create any data flows. This prevents connection errors and speeds up troubleshooting. Check the connection dialog in Coupler.io for the exact IPs to whitelist in AWS, Azure, or your on-premises firewall.

Create a read-only database user

Don't use your admin credentials. Create a dedicated PostgreSQL user with SELECT-only permissions on the tables you need: `CREATE USER coupler_user WITH PASSWORD 'secure_password'; GRANT SELECT ON table_name TO coupler_user;` This improves security and prevents accidental modifications.
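A slightly fuller version of that grant sequence, sketched here assuming your tables live in the default `public` schema (adjust the schema name and password to your setup):

```sql
-- Create a dedicated, read-only user for Coupler.io
CREATE USER coupler_user WITH PASSWORD 'secure_password';

-- Allow the user to see objects in the schema
GRANT USAGE ON SCHEMA public TO coupler_user;

-- Grant SELECT on every existing table in the schema
GRANT SELECT ON ALL TABLES IN SCHEMA public TO coupler_user;

-- Also cover tables created later (applies to objects
-- created by the role that runs this statement)
ALTER DEFAULT PRIVILEGES IN SCHEMA public
  GRANT SELECT ON TABLES TO coupler_user;
```

If you only need a handful of tables, skip the schema-wide grants and grant SELECT on each table individually, as in the one-line example above.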

Start with a test query

Before scheduling, run a manual test with a simple query like `SELECT * FROM table LIMIT 100;` to confirm the connection works and data looks correct. This catches permission or schema issues early.

Use Custom SQL for filtering

Write WHERE clauses in your SQL query to pull only the data you need, rather than importing everything and filtering in the destination. This speeds up imports and reduces data transfer costs.
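For example, a query restricted to recent, completed orders might look like this (the `orders` table and its columns are illustrative, not from your schema):

```sql
-- Pull only the rows and columns the report actually needs
SELECT id, customer_id, total, created_at
FROM orders
WHERE status = 'completed'
  AND created_at >= CURRENT_DATE - INTERVAL '30 days';
```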

Data refresh and scheduling

Schedule imports during off-peak hours

If your PostgreSQL database serves production traffic, schedule Coupler.io data flows during low-traffic windows (e.g., 2–4 AM). This prevents the import from competing with user queries and causing slowdowns.

Refresh frequency depends on data volatility

High-velocity data (e.g., real-time metrics, hourly events) may need daily or hourly refreshes. Historical reports or snapshots can refresh weekly. Start with daily and adjust based on your use case.

Monitor failed runs

Coupler.io will notify you of failures. Check your email alerts and the data flow's run history regularly. Common causes of failure are changed IP whitelists, schema updates, and revoked permissions.

Performance optimization

Index frequently filtered columns

If your SQL uses WHERE clauses on columns like `user_id`, `created_at`, or `status`, create indexes: `CREATE INDEX idx_created_at ON orders(created_at);` Indexes speed up data retrieval significantly, especially for large tables.
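When a data flow filters on two columns together, a composite index covering both usually serves the query better than two single-column indexes. A sketch, with illustrative table and column names, including an `EXPLAIN` check to confirm the planner uses the index:

```sql
-- Composite index matching a common filter pattern
CREATE INDEX idx_orders_status_created_at
  ON orders (status, created_at);

-- Inspect the query plan to verify the index is used
EXPLAIN SELECT id, total
FROM orders
WHERE status = 'completed'
  AND created_at >= CURRENT_DATE - INTERVAL '7 days';
```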

Aggregate at the source, not the destination

Use GROUP BY and SUM/COUNT in your SQL query to aggregate data in PostgreSQL. This transfers fewer rows and is faster than aggregating in Google Sheets or BI tools after import.
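As a sketch, a daily revenue rollup aggregated in PostgreSQL before export (table and column names are illustrative):

```sql
-- One row per day instead of one row per order
SELECT
  created_at::date AS order_date,
  COUNT(*)         AS order_count,
  SUM(total)       AS revenue
FROM orders
WHERE created_at >= CURRENT_DATE - INTERVAL '90 days'
GROUP BY created_at::date
ORDER BY order_date;
```

A 90-day window like this might return ~90 rows to your destination instead of every underlying order row.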

Avoid SELECT * on large tables

Specify only the columns you need: `SELECT id, name, email, created_at FROM users;` Fewer columns = faster imports and smaller file sizes in your destination.

Common pitfalls


Do

  • Filter by date range in your SQL to reduce data volume

  • Test queries in psql before pasting them into Coupler.io

  • Create indexes on columns used in WHERE and JOIN clauses

  • Use Custom SQL for complex transformations

  • Schedule imports during off-peak database hours

Don't

  • Use SELECT * on tables with hundreds of columns

  • Import raw log tables with billions of rows (aggregate first)

  • Share database credentials in Slack or email

  • Assume a table or view has been updated without confirming in psql

  • Run multiple heavy data flows simultaneously on the same database
