Accurate Marketing Attribution with GA4 Users in BigQuery

Marketing teams constantly ask: “Which channel truly drives conversions?”

GA4 provides user-level metrics and event data, but discrepancies between the GA4 interface and BigQuery exports can make attribution tricky. For example, Active Users or New Users may not match when you try to analyze traffic by campaign, source, or medium in BigQuery.

This guide shows how to solve these discrepancies and use GA4 user data in BigQuery to perform accurate marketing attribution, so you can confidently measure channel performance.


The Problem

MetricGA4 UIBigQuery ExportWhy It Differs
Active UsersCounts engaged users per channelCounted via user_id / user_pseudo_id in eventsGA4 applies session deduplication, blended identity, and filters; BigQuery raw data requires explicit SQL for these
New UsersFirst-time users per channelCalculated from earliest event per userGA4 UI uses first_visit/first_open events with session stitching; raw SQL needs custom logic
Campaign AttributionAuto-assigned in GA4 reportsMust join campaign parameters in BigQueryBigQuery data is raw and requires mapping UTM parameters and deduplication per user

Key Insight: To get accurate marketing attribution, BigQuery queries must mimic GA4’s logic for deduplication, session identification, and engagement.


The Solution

Follow these steps to align BigQuery analysis with GA4 for marketing attribution:

1. Deduplicate Users Across Devices

COALESCE(user_id, user_pseudo_id) AS unique_user_id

  • Ensures each user is counted once even if they visit from multiple devices.
  • Critical for measuring channel performance accurately.

2. Filter for Engaged Users

WHERE event_name = 'user_engagement'

  • Only count users who truly engaged, not just visited.
  • Matches GA4 Active User logic and prevents inflated attribution numbers.

3. Assign Users to Marketing Channels

SELECT
  COALESCE(user_id, user_pseudo_id) AS unique_user_id,
  traffic_source.source AS source,
  traffic_source.medium AS medium,
  COUNT(DISTINCT event_date) AS active_days
FROM
  `project.dataset.events_*`
WHERE
  event_name = 'user_engagement'
GROUP BY
  unique_user_id, source, medium;

  • Uses UTM parameters or GA4 traffic_source fields to assign users to channels.
  • Deduplicates across multiple events so each user is counted once per channel.

4. Calculate Active and New Users per Channel

WITH first_events AS (
  SELECT
    COALESCE(user_id, user_pseudo_id) AS unique_user_id,
    MIN(event_date) AS first_seen_date,
    traffic_source.source,
    traffic_source.medium
  FROM
    `project.dataset.events_*`
  GROUP BY unique_user_id, source, medium
)

SELECT
  source,
  medium,
  COUNT(unique_user_id) AS new_users
FROM
  first_events
WHERE first_seen_date BETWEEN '2025-08-01' AND '2025-08-07'
GROUP BY source, medium
ORDER BY new_users DESC;

  • Identifies first-time users per marketing channel.
  • Ensures channels are credited accurately without double-counting users.

5. Cross-Device & Multi-Session Attribution

  • GA4 UI automatically stitches sessions for user_id.
  • In BigQuery, join events by user_id and optionally session_id to track conversions across devices and sessions.

Example:

SELECT
  COALESCE(user_id, user_pseudo_id) AS unique_user_id,
  MIN(event_timestamp) AS first_touch,
  MAX(event_timestamp) AS last_touch,
  traffic_source.source AS source
FROM
  `project.dataset.events_*`
GROUP BY unique_user_id, source;

  • Determines first-touch and last-touch attribution for each user.

Best Practices

PracticeReason
Deduplicate using COALESCE(user_id, user_pseudo_id)Avoid inflating user counts per channel
Filter for user_engagementCount only engaged users, aligning with GA4 Active Users
Align date ranges and timezoneEnsures metrics match GA4 interface reports
Use traffic_source for channel attributionMaps UTM parameters and GA4 acquisition data
Implement session stitching when neededEnsures accurate cross-device and multi-session attribution

Key Takeaways

  • GA4 UI and BigQuery may show different user counts per channel due to deduplication, identity, and session logic.
  • Deduplicating users, filtering for engagement, and mapping traffic sources ensures accurate marketing attribution.
  • BigQuery allows you to customize attribution models (first-touch, last-touch, multi-touch) beyond GA4 standard report: Understanding GA4 Attribution in BigQuery
  • Aligning your SQL with GA4 logic ensures trusted, actionable insights for marketing decisions.

For more on GA4 user metrics and BigQuery SQL, see: Mastering GA4 Users: Metrics, Identity, and BigQuery Analysis.


Discover more from GA4BigQuery

Subscribe to get the latest posts sent to your email.

Posted in

Leave a comment

Discover more from GA4BigQuery

Subscribe now to keep reading and get access to the full archive.

Continue reading