Best Practices

Start with Builds as your primary entity

Builds contain the richest operational data — status, duration, branch, and trigger info. Get this entity working first before layering in Analytics reports or Audit logs.
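As a sanity check that your API key works before building a full data flow, you can pull a page of builds directly. This is a minimal sketch; the `/api/workflows` path and its `limit`/`page` query parameters are assumptions based on common Codefresh API usage, so verify them against the Codefresh API reference for your account.

```python
import json
import urllib.request

API_KEY = "your-codefresh-api-key"  # placeholder; generate from an admin account

def build_request(limit=25, page=1):
    # Assemble the URL and auth header for listing builds.
    # Endpoint path and parameters are assumed -- check the Codefresh API docs.
    url = f"https://g.codefresh.io/api/workflows?limit={limit}&page={page}"
    headers = {"Authorization": API_KEY}
    return url, headers

def fetch_builds(limit=25):
    url, headers = build_request(limit=limit)
    req = urllib.request.Request(url, headers=headers)
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)
```

If the call returns build records with status, duration, and branch fields, the key is scoped correctly and the connector will see the same data.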

Join Builds with Pipelines

Build records include a pipeline ID but not the pipeline name or project. Use a Join transformation in your data flow to enrich build data with pipeline metadata for readable, filterable reports.
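The Join transformation behaves like a left join keyed on the pipeline ID: every build is kept, and name/project metadata is attached when the ID matches. A small sketch of the same enrichment, with illustrative field names (`pipeline_name`, `project`, etc. are examples, not the connector's exact column names):

```python
builds = [
    {"build_id": "b1", "pipeline_id": "p1", "status": "success", "duration_s": 312},
    {"build_id": "b2", "pipeline_id": "p2", "status": "failure", "duration_s": 98},
]
pipelines = {
    "p1": {"pipeline_name": "backend-ci", "project": "payments"},
    "p2": {"pipeline_name": "frontend-ci", "project": "storefront"},
}

def enrich(builds, pipelines):
    # Left-join: keep every build, attach pipeline metadata when the ID matches
    out = []
    for b in builds:
        meta = pipelines.get(b["pipeline_id"], {})
        out.append({**b, **meta})
    return out
```

The result has human-readable pipeline and project columns alongside each build, so reports can filter on names instead of opaque IDs.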

Use Append for multi-range analytics

Codefresh Analytics reports only cover one date range per request. Use the Append transformation to combine reports from multiple periods into a single continuous dataset for trend analysis.
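Conceptually, Append is a concatenation of the per-period datasets followed by a sort on the date column. A sketch with made-up report rows (the field names are illustrative):

```python
# Two single-period Analytics pulls, one row per reporting date
january = [
    {"date": "2024-01-08", "total_builds": 120, "success_rate": 0.91},
    {"date": "2024-01-15", "total_builds": 134, "success_rate": 0.88},
]
february = [
    {"date": "2024-02-05", "total_builds": 141, "success_rate": 0.93},
]

def append_reports(*reports):
    # Append = concatenate the period datasets, then sort chronologically
    combined = [row for report in reports for row in report]
    return sorted(combined, key=lambda r: r["date"])
```

The combined dataset reads as one continuous time series, which is what trend charts in your destination expect.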

Scope your API key to an admin account

Entities like Accounts, Account settings, and Audits require admin access. Using an admin-generated API key from the start avoids permission errors when you add new entities later.

Data refresh and scheduling

Set your start date close to your reporting window

Don't pull all-time build history on every sync. Set a start date that covers your actual reporting need (e.g., the last 90 days) to keep syncs fast and your destination clean.
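If you want the window to roll forward rather than hard-coding a date, a small helper can compute the start date relative to today (the 90-day window here is just an example):

```python
from datetime import date, timedelta

def sync_start_date(window_days=90, today=None):
    # ISO start date covering the last `window_days` days
    today = today or date.today()
    return (today - timedelta(days=window_days)).isoformat()
```

Recompute this before each sync (or on whatever cadence you update the connector's settings) so old builds age out of the window instead of accumulating.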

Match analytics granularity to your reporting cadence

If you report weekly to stakeholders, set report granularity to weekly. Daily granularity on long date ranges generates a lot of rows and slows down your destination.
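If you already have daily rows and want weekly numbers without re-pulling the report, the rollup is a simple group-by on ISO week. A sketch, assuming one row per day with a `builds` count:

```python
from collections import defaultdict
from datetime import date

def weekly_rollup(daily_rows):
    # Collapse daily rows into (ISO year, ISO week) totals
    weeks = defaultdict(int)
    for row in daily_rows:
        iso = row["date"].isocalendar()
        weeks[(iso[0], iso[1])] += row["builds"]
    return dict(weeks)
```

Setting weekly granularity at the source is still preferable, since it keeps the row count down in the destination itself.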

Performance optimization

Separate Builds from Audit logs into different data flows

Builds update frequently; Audit logs are lower volume but may have different retention. Keeping them in separate data flows lets you set different schedules and start dates for each.

Use BigQuery for large build histories

If you're syncing months of build records from an active account, Google Sheets will hit row limits quickly. Send high-volume entities like Builds to BigQuery and use Looker Studio to visualize them.
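To estimate when you'd hit the ceiling: Google Sheets caps a spreadsheet at roughly 10 million cells at the time of writing (verify the current limit in Google's documentation), and the cap applies to cells, not rows, so wide tables run out faster:

```python
SHEETS_CELL_LIMIT = 10_000_000  # approximate Google Sheets cap; check Google's docs

def max_rows(num_columns, cell_limit=SHEETS_CELL_LIMIT):
    # Rows you can store before a sheet with this many columns hits the cell cap
    return cell_limit // num_columns

# e.g. a build table with ~20 columns tops out around 500,000 rows
```

An active account producing a few thousand builds a day reaches that in months, which is why BigQuery is the safer destination for Builds.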

Common pitfalls

Do

  • Generate your API key from an admin account

  • Join Builds with Pipelines for readable reports

  • Test with a short date range before expanding to full history

  • Use BigQuery or Looker Studio for large-scale build analytics

Don't

  • Use the same data flow for both frequently updated and static entities

  • Pull all-time build history on every scheduled run

  • Rely on raw Build IDs in reports without joining pipeline metadata

  • Assume Contexts expose secret values — they only contain metadata
