The Two-Minute fix that might save your GA attribution data
- doug56778
- Jun 19
Just as Julius Fedorovicius of Analytics Mania debunked the myth that a user_engagement event arriving before page_view causes (not set) (#gamechanger), I came across a peculiar instance where user_engagement appeared to be related to (not set) and merited investigation.
Spoiler: Julius is 100% correct, and this cautionary tale may include an insight into GA configuration that you weren't aware of.
I was running a test one day…
We experiment regularly, and in doing so, we recalibrate our analytics to validate our learning. That means we use a Python script to perform a user journey via Playwright to simulate the measurement changes and experimentation. We do that 100 times and expect to see certain results in analytics.
#Tangent: Check out this amazing story to see how debugging GA is incredibly similar to NASA deploying an interstellar software patch - no kidding! #72hourRoundTrip
Back to the story, the Playwright script was told to simulate an organic search visit.
Go to duckduckgo.com
Insert a link to dugadital.com on the page
Click it
Accept cookies
Wait 2 seconds
Close the browser
Repeat
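For the curious, the steps above could be sketched in Python with Playwright roughly like this. It is a minimal sketch and not our actual calibration script: the function names, the link id, and the cookie banner selector are all hypothetical, and the target URL is whatever site you are calibrating.

```python
import json


def link_injection_js(target_url: str, link_id: str = "calibration-link") -> str:
    """Return JavaScript that adds a link to target_url at the top of the page."""
    return (
        "(() => {"
        "const a = document.createElement('a');"
        f"a.id = {json.dumps(link_id)};"
        f"a.href = {json.dumps(target_url)};"
        "a.textContent = 'calibration link';"
        "document.body.prepend(a);"
        "})()"
    )


def run_journey(target_url: str, accept_selector: str) -> None:
    """One simulated visit: search page, injected link, consent, wait, close."""
    # Imported lazily so the helper above works without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch()
        context = browser.new_context()      # fresh context: no cookies carried over
        page = context.new_page()
        page.goto("https://duckduckgo.com")  # go to the search engine
        page.evaluate(link_injection_js(target_url))  # insert the link
        page.click("#calibration-link")      # click it
        page.click(accept_selector)          # accept cookies (selector is an assumption)
        page.wait_for_timeout(2000)          # wait 2 seconds
        browser.close()                      # close the browser


def run_calibration(target_url: str, accept_selector: str, runs: int = 100) -> None:
    """Repeat the journey the requested number of times."""
    for _ in range(runs):
        run_journey(target_url, accept_selector)


# Example call (hypothetical URL and selector):
# run_calibration("https://example.com/", "#accept-cookies")
```

Each run uses a new browser context, so every visit starts without cookies and should look like a brand new user.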
Do this 100 times and expect to see 100 Organic sessions, right?
No. Not exactly.
Instead, we saw a whole bunch of ‘Unassigned’ alongside the Organic search sessions:

The source/medium data was not what was expected:

Landing pages didn’t help much - we were seeing broken sessions but how?

Diagnosis
To diagnose this spurious centurion of sessions, I added a secondary dimension to the report. Adding the event name as a secondary dimension on a session-scoped report mixes scopes, which may seem a little odd, but it was quite revealing:

Interesting. Now we needed to reproduce the calibration behaviour in forensic detail to observe and understand the cause of this data.
Running a repeat of the calibration journey involved trashing the _ga cookies, the server-side cookies, and the consent cookie to start afresh and see exactly what happens by inspecting the data sent in the GA hit.
Did I mention we’re dual tracking? This is a vital detail. Looking at the server-side data only (that’s the data pings sent from the browser to the server-side destination), I see two events: a page_view and a user_engagement event.
Good, that’s what we expect to see but what about the details? Let’s inspect the hit payloads to see what’s happening:
Event 1 - looks fine (but isn’t!):
tid G-52E5VY8PPK
_s 1
sid 1750086044
seg 0
dt Digital and Marketing Analytics | Duga Digital
en page_view
_fv 1
_ss 1
Breaking these parameters down:
tid is the GA measurement ID of the data stream - that’s the server-side data stream value - correct.
_s is the hit number in this session.
sid is the session ID - hmm - take a close look at that value and remember it.
seg is the flag to say if this is an engaged session - only 1 interaction so far (the page_view), so no, this is not an engaged session yet.
dl and dt are the document location and document title - all as expected.
en is the event name - page_view as expected.
_fv and _ss mean this event is decorated with the first visit and session start flags - essential for correct attribution.
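If you copy the request URL out of the browser's network tab, a few lines of Python will pull these parameters out for you. This is only a sketch: the endpoint host and path below are hypothetical stand-ins, and depending on how the hit is sent, some parameters may travel in the request body rather than the query string.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Parameter values copied from Event 1 above.
event_1 = {
    "tid": "G-52E5VY8PPK",
    "_s": "1",
    "sid": "1750086044",
    "seg": "0",
    "dt": "Digital and Marketing Analytics | Duga Digital",
    "en": "page_view",
    "_fv": "1",
    "_ss": "1",
}

# Hypothetical server-side endpoint; substitute the request URL you captured.
hit_url = "https://sgtm.example.com/g/collect?" + urlencode(event_1)


def parse_hit(url: str) -> dict:
    """Return the query-string parameters of a hit as a flat dict."""
    query = parse_qs(urlsplit(url).query, keep_blank_values=True)
    # parse_qs returns a list per key; keep the first value of each.
    return {key: values[0] for key, values in query.items()}


hit = parse_hit(hit_url)
for key in ("tid", "_s", "sid", "seg", "en", "_fv", "_ss"):
    print(f"{key:4} {hit.get(key, '(absent)')}")
```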
Event 2 - Looks sketchy! And it IS!
tid G-52E5VY8PPK
_s 2
sid 1750085917
seg 1
dt Digital and Marketing Analytics | Duga Digital
en user_engagement
All values are as expected, but why does the session ID (sid) change? That’s two different sessions going into one property, and that’s going to break sessions and attribution. Which one is the right one?
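This check is easy to automate once the hits are captured. Here is a minimal sketch, assuming each hit has already been reduced to a dict of its parameters; the function name is made up for illustration.

```python
from collections import defaultdict


def find_split_sessions(hits: list) -> dict:
    """Return {tid: sorted sids} for each stream whose hits carry more than one sid."""
    sids_by_tid = defaultdict(set)
    for hit in hits:
        sids_by_tid[hit["tid"]].add(hit["sid"])
    return {
        tid: sorted(sids)
        for tid, sids in sids_by_tid.items()
        if len(sids) > 1
    }


# The two events from this journey.
hits = [
    {"tid": "G-52E5VY8PPK", "_s": "1", "sid": "1750086044", "en": "page_view"},
    {"tid": "G-52E5VY8PPK", "_s": "2", "sid": "1750085917", "en": "user_engagement"},
]

print(find_split_sessions(hits))
# {'G-52E5VY8PPK': ['1750085917', '1750086044']}
```

This only makes sense for a short capture like the calibration journey. Over a longer recording, one stream will legitimately see several session IDs as sessions expire and restart.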
Causes and fixes
To answer that, we check the _ga cookies (and again give thanks to David Vallejo for this excellent article):
_ga_52E5VY8PPK
GS2.1.s1750085917$o1$g0$t1750085954$j60$l0$h839539168
_ga_NY0HEPVZY3
GS2.1.s1750086044$o1$g0$t1750086046$j60$l0$h1050744214
The top one is the server-side session. The bottom one is the client-side session.
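To match cookies to hits without squinting, the cookie values can be split into their fields. This is a rough sketch that assumes the GS2 layout seen in the two values above: a "GS2.1." prefix followed by "$"-separated fields, each starting with a one-letter key. Only the s field is interpreted here, as the session ID; the other letters are left alone.

```python
def parse_gs2_cookie(value: str) -> dict:
    """Split a GS2 cookie value into {letter: value} fields."""
    version, _, body = value.split(".", 2)
    if version != "GS2":
        raise ValueError(f"not a GS2 cookie value: {value!r}")
    # Each field is a single-letter key followed by its value, e.g. "s1750085917".
    return {part[0]: part[1:] for part in body.split("$") if part}


cookies = {
    "_ga_52E5VY8PPK": "GS2.1.s1750085917$o1$g0$t1750085954$j60$l0$h839539168",
    "_ga_NY0HEPVZY3": "GS2.1.s1750086044$o1$g0$t1750086046$j60$l0$h1050744214",
}

# Map each session ID to the cookie that holds it.
cookie_by_sid = {parse_gs2_cookie(v)["s"]: name for name, v in cookies.items()}

# sid values taken from the two hits inspected earlier.
for event, sid in (("page_view", "1750086044"), ("user_engagement", "1750085917")):
    print(f"{event}: sid {sid} comes from {cookie_by_sid.get(sid, 'no cookie')}")
```

For these values, the page_view sid maps to the client-side cookie and the user_engagement sid maps to the server-side one.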
This means the page_view and user_engagement events have been assigned to different sessions. With _fv and _ss in the page_view session, the user_engagement event has no attribution data attached to it - this causes the dreaded (not set).
Why did the session ID change, though? Deciding the session ID value is part of the Google Tag functionality.
Two configuration choices have been highlighted as the cause of the issue.
Issue number 1
First, the trigger configuration was inconsistent. The client-side and server-side Google Tags were configured to trigger on different events:
Client-side Google Tag trigger: Consent Initialization
Server-side Google Tag trigger: Initialization
As you can see from the Preview Mode screenshot, they’re different events and happen at different times in the page execution:

The client-side Google Tag executed before the server-side Google Tag.
The first event, page_view, was using the session ID from the client-side Google Tag.
The server-side Google Tag appears to use the sid value from the client-side Google Tag but still uses the correct destination ID and all other settings.
The user_engagement event is triggered by the cookie banner click. By that time, the server-side Google Tag has cleaned up the session data, so the sid is set correctly, but there’s no page_view, the session attribution is gone, and the _s counter is out of whack.
Make sure your Google Tags don’t execute on different events as mine did, because that causes exactly this kind of issue.
This is what we want to see:

Issue number 2
With multiple Google Tags firing on the same event, how can you prevent session cross-pollution from happening?
The second fix - use the cookie_prefix configuration setting in your Google Tag to make sure you have dedicated cookies for the data stream in question.
Here’s how the server-side Google Tag looks now:

And that results in a tidy, unambiguous set of _ga cookies for session calculation:

Results - is it fixed?
Firstly, it’s nice to see the GA AI summary card is on the ball:

Having aligned the triggers for the Google Tags, the calibration journey hits appear with the correct values in the session ID field for both events:

And the calibration values are as expected - score 100 sessions for organic and zero for not set:

Recalibrating with the cookie_prefix in place delivers the expected results:

Unassigned is at an acceptably low level and wasn’t part of the calibration:

Conclusion
I’m more confident in the robustness of the GA data with both these solutions in place. We’ve added these checks to our manual and automated audits. Our sGTM migration now checks and fixes these issues if they’re detected.
The diagnosis methodology is a useful takeaway. Consider which events are associated with your (not set) rows, reproduce the journey, and carefully observe the hit payloads - especially session ID values.