Best Practices

Start with a narrow date range

When setting up the Events entity for the first time, use the date picker to start with the last 7–14 days. Confirm the data looks correct before expanding to a longer history. This avoids timeouts and makes troubleshooting easier.
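If you later script the same pull yourself, the idea translates directly: request only a short, recent window first. A minimal sketch in Python, assuming Amplitude's raw Export API timestamp format (`YYYYMMDDTHH`); the `export_window` helper name is ours, not part of any API:

```python
from datetime import datetime, timedelta, timezone

def export_window(days=7, now=None):
    """Build start/end parameters for a narrow export window.
    Amplitude's raw Export API expects timestamps like 20240601T12."""
    now = now or datetime.now(timezone.utc)
    start = now - timedelta(days=days)
    return {
        "start": start.strftime("%Y%m%dT%H"),
        "end": now.strftime("%Y%m%dT%H"),
    }

# A fixed "now" makes the example reproducible
params = export_window(days=7, now=datetime(2024, 6, 8, 12, tzinfo=timezone.utc))
print(params)
# {'start': '20240601T12', 'end': '20240608T12'}
```

Once a 7-day pull looks correct, widen `days` gradually rather than jumping straight to full history.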

Use one entity per data flow source for large datasets

If you're pulling both Events and Active users, add them as separate sources within the same data flow rather than trying to cram everything into one pull. This makes it easier to configure different time ranges per entity.

Export to BigQuery for event-level analysis

Raw Events data can be large and contains nested JSON properties. BigQuery handles this structure natively and is better suited for event-level querying than spreadsheets.
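To see why spreadsheets struggle here, consider what a nested event row looks like and what it takes to force it into flat columns. A small illustrative sketch (the sample row is simplified, not the full Amplitude event schema):

```python
import json

def flatten(record, parent="", sep="."):
    """Flatten nested dicts into dotted column names,
    e.g. event_properties.plan becomes one flat column."""
    flat = {}
    for key, value in record.items():
        name = f"{parent}{sep}{key}" if parent else key
        if isinstance(value, dict):
            flat.update(flatten(value, name, sep))
        else:
            flat[name] = value
    return flat

# A simplified raw-event row with nested JSON properties
row = json.loads('{"event_type": "upgrade", "event_properties": {"plan": "pro", "seats": 5}}')
print(flatten(row))
# {'event_type': 'upgrade', 'event_properties.plan': 'pro', 'event_properties.seats': 5}
```

BigQuery can store and query such nested structures directly (e.g. as JSON or nested fields), so no flattening step like this is needed there.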

Data refresh and scheduling

Match your refresh interval to your reporting cadence

Active users and session length data typically only need a daily refresh. Events data may warrant more frequent syncs if you're tracking real-time product behavior. Avoid syncing more often than your team actually reviews the data.

Keep the request time range short for scheduled runs

For recurring syncs on high-volume projects, set the request time range to 1–6 hours. This prevents timeout errors during automated runs and keeps each sync fast and reliable.
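The same chunking logic is easy to sketch if you ever orchestrate these pulls yourself. A minimal example, assuming nothing beyond the standard library (the `hourly_chunks` helper is illustrative, not a Coupler.io feature):

```python
from datetime import datetime, timedelta

def hourly_chunks(start, end, hours=6):
    """Split [start, end) into request windows of at most `hours` hours,
    mirroring a short request time range for scheduled runs."""
    step = timedelta(hours=hours)
    chunks = []
    cursor = start
    while cursor < end:
        chunks.append((cursor, min(cursor + step, end)))
        cursor += step
    return chunks

day = hourly_chunks(datetime(2024, 6, 1), datetime(2024, 6, 2), hours=6)
print(len(day))
# 4  -> four 6-hour windows cover one day
```

Several short windows that each finish quickly are more reliable than one long request that risks timing out.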

Performance optimization

Disable country grouping if you don't need it

The Active users "group by country" option multiplies API calls by the number of countries in your project. If you only need total active user counts, keep this disabled to speed up syncs and avoid connection errors.

Use Aggregate transformation to pre-summarize Events

If your destination is Google Sheets or Excel, use Coupler.io's Aggregate transformation to summarize event counts by date, event type, or platform before loading. This keeps your sheets manageable and avoids hitting row limits.
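Conceptually, the Aggregate step collapses many raw rows into one row per group. A sketch of the same idea in plain Python, with made-up sample rows (the field names mirror typical event exports but are illustrative):

```python
from collections import Counter

# Illustrative raw event rows, as they might arrive before aggregation
events = [
    {"event_time": "2024-06-01 09:12:00", "event_type": "login", "platform": "iOS"},
    {"event_time": "2024-06-01 09:45:00", "event_type": "login", "platform": "Android"},
    {"event_time": "2024-06-01 10:01:00", "event_type": "purchase", "platform": "iOS"},
]

# Count events per (date, event_type) pair, as an Aggregate step would
counts = Counter((e["event_time"][:10], e["event_type"]) for e in events)
summary = [
    {"date": d, "event_type": t, "events": n}
    for (d, t), n in sorted(counts.items())
]
print(summary)
# [{'date': '2024-06-01', 'event_type': 'login', 'events': 2},
#  {'date': '2024-06-01', 'event_type': 'purchase', 'events': 1}]
```

Three raw rows become two summary rows here; at millions of events, that reduction is what keeps a spreadsheet destination viable.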

Common pitfalls

Do

  • Use the project-level API key (not an organization key)

  • Set a request time range of 1–6 hours for high-volume projects

  • Trigger a cohort recompute in Amplitude before syncing cohort data

  • Use BigQuery or a database destination for raw event data with nested properties

Don't

  • Enable "group by country" unless you specifically need geographic breakdowns

  • Expect event counts to match Amplitude's built-in charts exactly — raw export data is unfiltered

  • Use a spreadsheet as the destination for multi-million-row event exports

  • Assume cohort data is historical — it reflects the cohort's current computed state
