Why GA4 BigQuery Changes Everything
Last month I audited a mid-market ecommerce brand spending EUR 35,000 per month on paid media. Their marketing manager wanted to compare last-click channel performance against a first-touch view. GA4's interface gave them a pre-built report with 14 months of retention and no way to customize the attribution window. They were making six-figure budget decisions on a black box.
That is the gap GA4 BigQuery fills. Once you export raw event data to BigQuery, you own every event, unsampled, queryable in SQL, with no retention ceiling. What used to require a six-figure Analytics 360 contract is now free for every standard GA4 property (Google Support).
If your tracking foundation is shaky before the data ever reaches BigQuery -- duplicate events, missing parameters, broken consent signals -- the export will faithfully replicate every problem. I cover the prerequisites in my GA4 tracking audit checklist, and if you need hands-on help, my marketing measurement engagements usually start there.
Connect GA4 to BigQuery in Five Steps
The link itself takes less than ten minutes. Here is the process.
- Create or select a Google Cloud project. Open the Google Cloud Console, navigate to APIs & Services > Library, and enable the BigQuery API.
- Check permissions. You need Editor-or-above on the GA4 property and Owner access on the BigQuery project (Google Support).
- Open Admin > Product Links > BigQuery Links in GA4 and click Link.
- Choose your region. Pick the same region as any existing BigQuery datasets you use. If this is your first dataset, choose the region closest to your business.
- Select export frequency. You have two options: Daily (batch) and Streaming. Daily is fine for most reporting use cases and costs nothing beyond BigQuery's free tier. Streaming adds near-real-time rows but incurs BigQuery streaming-insert costs of roughly USD 0.05 per GB (Google Cloud pricing).
That is it. There is no historical backfill -- the GA4 BigQuery export only includes data from the moment you enable it (Google Support). Enable it today, even if you will not query for months.
Daily export limits to watch
Standard properties cap the daily batch export at one million events per day. Exceed that consistently and Google will pause your export (Google Support). Streaming has no event-count cap, but you pay for ingestion. If your site pushes past the million-event mark, you either switch to streaming, prune noisy events (scroll-depth pings are a common culprit), or upgrade to Analytics 360.
Understanding the GA4 BigQuery Export Schema
The GA4 BigQuery export schema is the part that trips up most analysts. Unlike a flat spreadsheet, GA4 exports a nested, repeated structure. Each row in the events_YYYYMMDD table represents a single event, but the parameters live inside a repeated RECORD field called event_params.
Here are the columns you will work with most often:
| Column | Type | What it holds |
|---|---|---|
event_date | STRING | Date in YYYYMMDD format |
event_timestamp | INTEGER | Microsecond epoch timestamp |
event_name | STRING | The event name (e.g. page_view, purchase) |
event_params | REPEATED RECORD | Array of key-value structs |
user_pseudo_id | STRING | GA4's client-side user identifier |
traffic_source | RECORD | First-touch attribution at the user level |
collected_traffic_source | RECORD | Campaign data collected with each event |
session_traffic_source_last_click | RECORD | Session-level last-click source |
The event_params record contains a key (STRING) and a value record with four typed sub-fields: string_value, int_value, float_value, and double_value (Google Support — BigQuery Export schema).
Extracting values with UNNEST
Because event_params is a repeated field, you cannot access it with dot notation alone. You need to UNNEST it or use a subquery. Here is the pattern you will use constantly:
SELECT
event_date,
user_pseudo_id,
(SELECT value.string_value
FROM UNNEST(event_params)
WHERE key = 'page_location') AS page_location,
(SELECT value.string_value
FROM UNNEST(event_params)
WHERE key = 'source') AS source,
(SELECT value.string_value
FROM UNNEST(event_params)
WHERE key = 'medium') AS medium
FROM
`your_project.analytics_123456789.events_*`
WHERE
_TABLE_SUFFIX BETWEEN '20260601' AND '20260614'
AND event_name = 'page_view'
Two things to note. First, always filter on _TABLE_SUFFIX to avoid scanning your entire dataset (BigQuery bills by bytes scanned -- the first 1 TiB per month is free, then USD 6.25 per TiB after that (Google Cloud pricing)). Second, use the correct value sub-field for the parameter type. Page title is string_value; session ID is int_value. Mixing them up returns NULLs with no error.
Your First Attribution Query
Once you are comfortable with the GA4 BigQuery schema, attribution is the obvious next question: which channels actually drive revenue?
GA4's interface defaults to a data-driven attribution model. That is useful, but opaque. With the raw export you can build transparent, auditable attribution.
Last non-direct click at the session level
The session_traffic_source_last_click field gives you Google's own last-click attribution per session. Here is a query that counts purchases by source/medium:
SELECT
session_traffic_source_last_click.manual_source AS source,
session_traffic_source_last_click.manual_medium AS medium,
COUNT(*) AS purchases
FROM
`your_project.analytics_123456789.events_*`
WHERE
_TABLE_SUFFIX BETWEEN '20260501' AND '20260531'
AND event_name = 'purchase'
GROUP BY 1, 2
ORDER BY purchases DESC
This gives you a clean channel-level purchase table you can compare directly against your Google Ads conversion counts and against what Meta reports.
First-touch attribution
Want to know which channels introduce new users? The traffic_source field holds the first-touch source for each user:
SELECT
traffic_source.source AS first_touch_source,
traffic_source.medium AS first_touch_medium,
COUNT(DISTINCT user_pseudo_id) AS new_users,
COUNTIF(event_name = 'purchase') AS purchases
FROM
`your_project.analytics_123456789.events_*`
WHERE
_TABLE_SUFFIX BETWEEN '20260501' AND '20260531'
GROUP BY 1, 2
ORDER BY new_users DESC
Compare first-touch against last-click and you instantly see which channels do the introducing versus the closing. Organic search often dominates first-touch; branded paid search dominates last-click. That gap is where budget decisions should get nuanced, not automated.
Custom multi-touch: linear example
For teams that want full-path visibility, you can build a simple linear model in SQL. Assign each touchpoint equal credit:
WITH conversions AS (
SELECT
user_pseudo_id,
event_timestamp AS conversion_ts
FROM `your_project.analytics_123456789.events_*`
WHERE _TABLE_SUFFIX BETWEEN '20260501' AND '20260531'
AND event_name = 'purchase'
),
touchpoints AS (
SELECT
e.user_pseudo_id,
e.event_timestamp,
session_traffic_source_last_click.manual_source AS source,
session_traffic_source_last_click.manual_medium AS medium
FROM `your_project.analytics_123456789.events_*` e
WHERE _TABLE_SUFFIX BETWEEN '20260501' AND '20260531'
AND event_name = 'session_start'
AND session_traffic_source_last_click.manual_source IS NOT NULL
),
paths AS (
SELECT
t.source,
t.medium,
c.user_pseudo_id,
COUNT(*) OVER (PARTITION BY c.user_pseudo_id, c.conversion_ts) AS path_length
FROM conversions c
JOIN touchpoints t
ON c.user_pseudo_id = t.user_pseudo_id
AND t.event_timestamp <= c.conversion_ts
)
SELECT
source,
medium,
ROUND(SUM(1.0 / path_length), 2) AS linear_credit
FROM paths
GROUP BY 1, 2
ORDER BY linear_credit DESC
This is deliberately simplified. Production models need lookback windows, deduplication logic, and handling of direct visits. But even this basic version reveals insights the GA4 UI cannot surface.
Three Gotchas That Bite New GA4 BigQuery Users
1. The GA4 BigQuery schema changes without warning. Google periodically adds fields (like session_traffic_source_last_click, added in late 2023). Queries referencing removed or renamed fields will break silently. Pin your queries to known columns and test after each GA4 update.
2. Consent gaps create holes. If you have implemented Consent Mode v2, users who decline tracking still generate modeled events in the GA4 interface -- but those modeled events do not appear in the BigQuery export. Your BigQuery totals will be lower than your GA4 reports. This is expected, not a bug.
3. The data layer matters more than ever. BigQuery faithfully exports whatever GA4 collects. If your data layer is missing parameters -- transaction IDs, item arrays, user properties -- those gaps propagate into every query. Fix collection first, then query.
Next Steps After Your GA4 BigQuery Setup
You now have the GA4 BigQuery link running, you understand the export schema, and you have three attribution queries you can run today. Here is a practical sequence for turning your GA4 BigQuery data into something your team actually uses:
- Audit your data layer to make sure the events and parameters reaching BigQuery are complete and accurate.
- Schedule a daily cost-data import from Google Ads and Meta so you can calculate true ROAS in SQL.
- Build a Looker Studio dashboard on top of your BigQuery tables for ongoing reporting.
- Compare BigQuery attribution against platform-reported conversions to understand the delta and calibrate bidding.
If you are running paid media and your measurement still depends on GA4's interface alone, you are leaving accuracy and budget on the table.
FAQ
Is the GA4 BigQuery export free?
Enabling the export is free for all standard GA4 properties. You only pay for BigQuery usage beyond a generous free tier that covers query processing and storage. Most small-to-mid-size sites stay well within those limits.
Does GA4 backfill historical data into BigQuery?
No. The export only includes data from the day you enable it. There is no way to retroactively export older GA4 data to BigQuery, which is why you should enable the link as soon as possible even if you are not ready to query yet.
What happens if my site sends more than one million events per day?
Standard GA4 properties cap the daily batch export at one million events. If you consistently exceed that limit, Google will pause your daily export. You can switch to streaming export, which has no event cap but incurs per-byte ingestion costs, or upgrade to Analytics 360.
Why are my BigQuery event counts lower than GA4 reports?
The most common cause is Consent Mode. GA4 models conversions for users who decline tracking, and those modeled events appear in interface reports but are not included in the BigQuery export. Other causes include the daily export being paused due to event-volume limits or timezone mismatches between GA4 and BigQuery tables.
Can I use BigQuery for attribution instead of GA4's built-in models?
Yes. The raw event export includes session-level and user-level traffic-source fields that let you build last-click, first-touch, linear, or custom attribution models in SQL. This gives you full transparency into how credit is assigned, unlike GA4's data-driven model which is a black box.
Not sure your GA4 data is trustworthy enough to query? Book a marketing-measurement engagement -- I will audit your setup and tell you exactly what needs fixing before you build on top of it.