Google quietly deprecated Signals for remarketing and reporting—and most GA4 users haven’t noticed their cross-device data just got worse. If you’ve been wondering why your audience sizes suddenly shrank, why your cross-device user counts look off, or why demographic data has gaps it didn’t have before, this is why. Google positioned the change as a privacy improvement, but let’s be honest: it benefits Google’s walled garden, not advertisers trying to understand their customers across devices.
The problem isn’t just that Signals is going away. The problem is that Google built an entire ecosystem of cross-device tracking that advertisers came to rely on, and now that it’s being dismantled, there’s no official migration path. No “here’s how to maintain your data quality” guide. No acknowledgment of what breaks.
I’ve spent the last several months helping clients rebuild what they lost. This post is the implementation roadmap I wish existed when the change was first announced—covering User-ID tracking, BigQuery session stitching, and first-party data strategies that actually work. If you relied on Signals for audience building and cross-device attribution, this is your playbook.
What Google Signals Actually Did (And Why You Probably Took It For Granted)
Let’s start with what most people misunderstand about Signals. It wasn’t just about demographics. Google Signals served three distinct functions that got quietly bundled together:
Cross-device reporting: When a user signed into their Google account on their phone, then later visited your site on their laptop (also signed in), Signals connected those sessions. Your “Users” count reflected actual humans, not device counts.
Remarketing audience expansion: Signals allowed you to build audiences that followed users across devices. Someone who browsed products on mobile could see remarketing ads on desktop—without you implementing any additional tracking.
Demographics and interests data: Age, gender, and affinity categories came from Google’s knowledge of signed-in users. This powered both reporting and audience targeting.
The critical insight most analytics teams missed: Signals was doing identity resolution for you, for free, using Google’s logged-in user graph. You didn’t have to implement User-ID tracking. You didn’t have to build your own identity layer. Google just… handled it.
Now they don’t.
What Actually Breaks: The Concrete Impact
Here’s what happens to your GA4 property when Signals deprecation fully takes effect:
User Counts Inflate
Without cross-device identity resolution, a single customer who visits on three devices becomes three users in your reports. I’ve seen properties where “Users” jumped 15-40% after Signals stopped contributing to reporting identity—not because traffic increased, but because identity resolution degraded.
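To make the inflation concrete, here is a minimal sketch using made-up events. The field names `deviceId` and `userId` are illustrative, not GA4 export fields. It counts users once by device identifier alone and once by preferring a known user ID where one exists.

```javascript
// Hypothetical events: one customer (u_1) on three devices, plus one anonymous visitor.
const events = [
  { deviceId: "phone-a", userId: "u_1" },
  { deviceId: "laptop-b", userId: "u_1" },
  { deviceId: "tablet-c", userId: "u_1" },
  { deviceId: "phone-d", userId: null },
];

// Device-only counting: every device looks like a separate user.
function countByDevice(rows) {
  return new Set(rows.map((r) => r.deviceId)).size;
}

// Identity-resolved counting: prefer the user ID, fall back to the device ID.
function countByIdentity(rows) {
  const keys = rows.map((r) => (r.userId ? `user:${r.userId}` : `device:${r.deviceId}`));
  return new Set(keys).size;
}

console.log(countByDevice(events));   // 4
console.log(countByIdentity(events)); // 2
```

Two real people show up as four "users" when only the device is known, which is the same effect behind the inflated counts described above.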
Audience Sizes Shrink (Or Behave Erratically)
This is the one that hurts performance marketers most. Your carefully built remarketing audiences lose their cross-device reach. Someone who qualifies for an audience on mobile may not be included when browsing on desktop. Audience counts drop, frequency caps behave unexpectedly, and campaign performance degrades for reasons that aren’t obvious in the ads interface.
Demographic Reports Go Sparse
If you’ve been using demographic breakdowns in your reporting, expect coverage to decline. Properties that previously showed demographics for roughly 55-70% of users might drop to 25-40% or lower. User-ID does not fill this gap, since it resolves identity rather than supplying age, gender, or interest attributes.
Here’s a comparison based on what I’ve observed across client properties:
| Metric | With Signals Active | After Signals Deprecation | Impact |
|---|---|---|---|
| Cross-device user stitching | ~60-75% of multi-device users | 0% (without User-ID) | User counts inflated 15-40% |
| Remarketing audience reach | Full cross-device coverage | Single-device only | Audience sizes drop 20-35% |
| Demographic data coverage | 55-70% of users | 25-40% of users | Reporting gaps, targeting limitations |
| Session stitching accuracy | Automated via Google graph | Manual implementation required | Requires BigQuery + custom logic |
Attribution Models Suffer
This is the sleeper issue. If GA4 thinks one customer is three different users, your attribution models are working with corrupted input data. Multi-touch attribution becomes less accurate because the “touches” are being distributed across phantom users instead of being correctly attributed to a single customer journey.
Implementing User-ID Tracking Properly in GA4
Here’s the uncomfortable truth: User-ID tracking has been available in GA4 since launch, but most implementations I audit either don’t have it, have it implemented incorrectly, or only fire it in limited circumstances.
User-ID is now your primary mechanism for cross-device identity. Get this wrong, and everything downstream—your reports, your audiences, your BigQuery analysis—is compromised.
The Core Concept
User-ID requires you to have your own authenticated user identifier. This means:
- The user must be logged in (or you have another first-party identifier)
- You must pass that identifier to GA4 consistently
- The identifier must be stable across devices
For e-commerce sites, this is typically the customer ID from your platform. For SaaS products, it’s the user ID from your authentication system. For publishers with registration walls, it’s the subscriber ID.
GTM Implementation
If you’re using Google Tag Manager (and you should be for any serious implementation—our GTM service covers proper architecture), here’s the setup:
Step 1: Create a Data Layer Variable for User ID
First, ensure your site pushes the user ID to the data layer when available:
// Push to dataLayer when user is authenticated
// This should fire on every page where user is logged in
window.dataLayer = window.dataLayer || [];
window.dataLayer.push({
  'user_id': 'YOUR_UNIQUE_USER_ID', // Replace with actual user ID
  'user_properties': {
    'customer_type': 'registered',   // Optional: additional user properties
    'account_created': '2023-01-15'  // Optional: enrichment data
  }
});
Step 2: Create GTM Variables
Create a Data Layer Variable in GTM:
- Variable Type: Data Layer Variable
- Data Layer Variable Name: user_id
- Name: dlv - user_id
Step 3: Configure GA4 Configuration Tag
In your GA4 Configuration tag, add the User ID under “Fields to Set”:
- Field Name: user_id
- Value: {{dlv - user_id}}
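If you also want the ID available as a user property, send it under a different name, because user_id is a reserved user property name in GA4:
- Property Name: crm_user_id (an example name, not a GA4 requirement)
- Value: {{dlv - user_id}}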
Step 4: Handle the Logged-Out State
This is where most implementations fail. You must handle the state where a user isn’t logged in:
// GTM Custom JavaScript Variable: Get User ID or Undefined
function() {
  var userId = {{dlv - user_id}};
  // Return undefined (not empty string, not null) when no user ID
  // GA4 handles undefined correctly; empty strings cause issues
  if (userId && userId !== '' && userId !== 'null' && userId !== 'undefined') {
    return userId;
  }
  return undefined;
}
The undefined return is critical. Passing empty strings or null values can corrupt your user identity graph. GA4 expects undefined when there’s no user ID.
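The GTM variable above only runs inside GTM, which makes it awkward to test. A rough plain-JavaScript equivalent is sketched below so the same rule can be unit tested in isolation. The function name `normalizeUserId` is my own, and unlike the GTM snippet it also trims whitespace and converts numeric IDs to strings.

```javascript
// Plain-JavaScript version of the check, so it can be unit tested outside GTM.
// Returns the ID as a string, or undefined when no usable ID is present.
function normalizeUserId(raw) {
  if (raw === undefined || raw === null) return undefined;
  const value = String(raw).trim();
  // Reject empty values and stringified placeholders that templates sometimes emit.
  if (value === "" || value === "null" || value === "undefined") return undefined;
  return value;
}

console.log(normalizeUserId("usr_a1b2c3d4e5f6")); // usr_a1b2c3d4e5f6
console.log(normalizeUserId(28374));              // 28374
console.log(normalizeUserId(""));                 // undefined
console.log(normalizeUserId("null"));             // undefined
```

Running a handful of cases like these against real values from your templates is a cheap way to catch stringified placeholders before they reach GA4.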
What User IDs Should Look Like
Never use PII as your User ID. No email addresses, no phone numbers, no names. Use:
- Database primary keys (numeric IDs)
- UUIDs generated at account creation
- Hashed identifiers (if you must derive from email, use SHA-256)
Bad: john.smith@email.com
Good: usr_a1b2c3d4e5f6
Also good: 28374
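If the ID has to be derived from an email address, something like the following Node.js sketch could run server-side. Note one substitution on my part: it uses a keyed SHA-256 (HMAC) rather than a bare hash, since a plain hash of an email can be guessed by hashing candidate addresses. The `secret` value and the `usr_` prefix are assumptions for illustration.

```javascript
const crypto = require("node:crypto");

// Derive a stable, non-reversible identifier from an email address.
// `secret` is a hypothetical server-side key; keep it out of client code.
function deriveUserId(email, secret) {
  // Normalize first so the same person always maps to the same ID.
  const normalized = email.trim().toLowerCase();
  const digest = crypto.createHmac("sha256", secret).update(normalized).digest("hex");
  return `usr_${digest}`;
}

const id = deriveUserId("  John.Smith@Email.com ", "example-secret");
console.log(id.length); // 68: "usr_" plus 64 hex characters
console.log(id === deriveUserId("john.smith@email.com", "example-secret")); // true
```

A derived identifier is still tied to a person, so it should go through the same consent handling as any other User-ID.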
Validating Your Implementation
After deployment, validate in GA4 DebugView:
- Open GA4 → Admin → DebugView
- Enable debug mode in GTM or via browser extension
- Log in on your site and navigate several pages
- Confirm user_id appears consistently in events
Then check GA4 → Admin → Reporting Identity. Ensure it’s set to “Blended”, which evaluates User-ID first, then device ID, and falls back to modeling when neither is available.
Using BigQuery Exports to Stitch Sessions and Rebuild Cross-Device Reports
GA4’s native reporting will only show you cross-device data for users where you’ve successfully implemented User-ID. But if you’re exporting to BigQuery—and you should be for any serious analytics operation—you can build much more sophisticated session stitching.
Setting Up BigQuery Export
This is non-negotiable for advanced analytics. GA4’s BigQuery export gives you event-level data with full fidelity. If you haven’t enabled it:
- GA4 → Admin → BigQuery Links
- Link to your GCP project
- Choose “Streaming” export for real-time data (costs more) or “Daily” for batch
Once enabled, you’ll have tables in the format: analytics_PROPERTY_ID.events_YYYYMMDD
Session Stitching Query
Here’s a query that demonstrates cross-device session stitching using User-ID:
-- Cross-device session analysis using User-ID
-- This query identifies users with sessions across multiple device categories
-- Note: two devices in the same category (e.g. two phones) count as one here
WITH user_sessions AS (
  SELECT
    user_pseudo_id,
    user_id,
    (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id') AS session_id,
    device.category AS device_category,
    device.operating_system AS os,
    geo.country AS country,
    MIN(TIMESTAMP_MICROS(event_timestamp)) AS session_start,
    MAX(TIMESTAMP_MICROS(event_timestamp)) AS session_end,
    COUNT(*) AS event_count
  FROM
    `your-project.analytics_XXXXXXXX.events_*`
  WHERE
    _TABLE_SUFFIX BETWEEN FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY))
      AND FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY))
    AND user_id IS NOT NULL -- Only users with User-ID
  GROUP BY
    user_pseudo_id, user_id, session_id, device_category, os, country
),
cross_device_users AS (
  SELECT
    user_id,
    COUNT(DISTINCT device_category) AS device_types_used,
    -- ga_session_id is only unique per device, so pair it with user_pseudo_id
    COUNT(DISTINCT CONCAT(user_pseudo_id, '-', CAST(session_id AS STRING))) AS total_sessions,
    ARRAY_AGG(DISTINCT device_category) AS devices,
    MIN(session_start) AS first_session,
    MAX(session_end) AS last_session
  FROM user_sessions
  GROUP BY user_id
  HAVING COUNT(DISTINCT device_category) > 1 -- Multi-device users only
)
SELECT
  device_types_used,
  COUNT(DISTINCT user_id) AS user_count,
  AVG(total_sessions) AS avg_sessions_per_user,
  -- Share of multi-device users, not of all users
  ROUND(COUNT(DISTINCT user_id) * 100.0 / SUM(COUNT(DISTINCT user_id)) OVER(), 2) AS percentage
FROM cross_device_users
GROUP BY device_types_used
ORDER BY device_types_used;
This query gives you visibility into cross-device behavior that GA4’s standard reports can’t provide anymore—but only for users where you’ve captured User-ID.
Building a User Identity Table
For more sophisticated analysis, maintain a user identity table that maps user_pseudo_id (the device-level identifier) to user_id (your authenticated identifier):
-- Create or replace user identity mapping table
CREATE OR REPLACE TABLE `your-project.analytics_processed.user_identity_map` AS
SELECT
  user_pseudo_id,
  user_id,
  MIN(TIMESTAMP_MICROS(event_timestamp)) AS first_identified,
  MAX(TIMESTAMP_MICROS(event_timestamp)) AS last_seen,
  COUNT(DISTINCT (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id')) AS sessions_count
FROM
  `your-project.analytics_XXXXXXXX.events_*`
WHERE
  user_id IS NOT NULL
  AND _TABLE_SUFFIX BETWEEN '20230101' AND FORMAT_DATE('%Y%m%d', CURRENT_DATE())
GROUP BY
  user_pseudo_id, user_id;
With this mapping table, you can retroactively stitch sessions—connecting anonymous sessions to known users once they authenticate.
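The stitching rule itself is simple enough to sketch outside SQL. The example below applies it to a few in-memory rows in plain JavaScript. The rows are made up, and `stitched_id` is a name I introduced for illustration. In practice you would express the same fallback as a join against the mapping table in BigQuery.

```javascript
// Rows shaped like the identity map: one entry per (user_pseudo_id, user_id) pair.
const identityMap = [
  { user_pseudo_id: "dev_1", user_id: "u_1" },
  { user_pseudo_id: "dev_2", user_id: "u_1" },
];

// Simplified events; user_id is null before the visitor logs in.
const events = [
  { user_pseudo_id: "dev_1", user_id: null, event_name: "page_view" },
  { user_pseudo_id: "dev_1", user_id: "u_1", event_name: "login" },
  { user_pseudo_id: "dev_2", user_id: "u_1", event_name: "purchase" },
  { user_pseudo_id: "dev_3", user_id: null, event_name: "page_view" },
];

// Attach a stitched_id: the event's own user_id, else the mapped one, else the device ID.
function stitch(rows, mapping) {
  const lookup = new Map(mapping.map((m) => [m.user_pseudo_id, m.user_id]));
  return rows.map((e) => ({
    ...e,
    stitched_id: e.user_id ?? lookup.get(e.user_pseudo_id) ?? e.user_pseudo_id,
  }));
}

const stitched = stitch(events, identityMap);
console.log(stitched.map((e) => e.stitched_id)); // [ 'u_1', 'u_1', 'u_1', 'dev_3' ]
```

One caveat: this sketch assumes each device maps to a single user. On shared devices a `user_pseudo_id` can map to several `user_id` values, and you would need a rule (most recent login, for example) to decide which one wins.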
First-Party Data Strategies to Replace What Signals Provided
User-ID only works when users are logged in. For most sites, that’s a minority of sessions. To approach the coverage Signals provided, you need a broader first-party data strategy.
Strategy 1: Aggressive (But Useful) Registration Prompts
Signals worked because Google had massive logged-in user coverage. You need to build your own logged-in user base. This means:
- Email capture with genuine value exchange (not just “sign up for our newsletter”)
- Account creation incentives (saved carts, order history, wishlist functionality)
- Progressive profiling (don’t ask for everything upfront)
For Shopify stores specifically, leveraging customer accounts effectively is critical. If you’re on Shopify, our Shopify service can help implement proper customer account flows that maximize authentication rates.
Strategy 2: First-Party Cookies with Consent
When users aren’t logged in, you can still maintain device-level continuity with properly implemented first-party cookies. This doesn’t give you cross-device tracking, but it does give you accurate single-device journeys.
Key implementation notes:
- Set cookies on your own domain (first-party)
- Respect consent signals (GDPR, CCPA)
- Use reasonable expiration windows (consider 13 months maximum for compliance)
- Fall back gracefully when cookies are blocked
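As a rough illustration of those notes, the sketch below builds a `Set-Cookie` header value for a first-party visitor ID and refuses to do so without consent. The cookie name `visitor_id`, the function name, and the 395-day approximation of 13 months are all assumptions for the example.

```javascript
// Build a Set-Cookie header value for a first-party visitor ID.
// Returns null when analytics consent has not been granted.
const THIRTEEN_MONTHS_SECONDS = 395 * 24 * 60 * 60; // roughly 13 months

function buildVisitorCookie(visitorId, consentGranted, domain) {
  if (!consentGranted) return null; // respect the consent signal: set nothing
  return [
    `visitor_id=${encodeURIComponent(visitorId)}`,
    `Domain=${domain}`, // your own domain, so the cookie is first-party
    "Path=/",
    `Max-Age=${THIRTEEN_MONTHS_SECONDS}`,
    "Secure",
    "SameSite=Lax",
  ].join("; ");
}

console.log(buildVisitorCookie("v_123", true, "example.com"));
// visitor_id=v_123; Domain=example.com; Path=/; Max-Age=34128000; Secure; SameSite=Lax
console.log(buildVisitorCookie("v_123", false, "example.com")); // null
```

Callers should treat a `null` result, or a cookie the browser later drops, as a normal case and carry on without a persistent ID.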
Strategy 3: Probabilistic Matching in BigQuery (With Caveats)
When you don’t have deterministic User-ID matching, you can attempt probabilistic matching based on signals like:
- IP address patterns
- Browser fingerprint similarity
- Behavioral patterns (visit same pages in same sequence)
- Time-based correlation
I’m including this for completeness, but I want to be direct: probabilistic matching is less accurate than what Signals provided, requires significant engineering investment, and creates compliance risk if not implemented carefully. It’s a partial solution for large-scale properties, not a magic fix.
Strategy 4: Customer Data Platform Integration
For enterprise implementations, CDPs like Segment, mParticle, or Rudderstack can serve as your identity resolution layer. They maintain identity graphs that you control, and they integrate with GA4 via the Measurement Protocol.
This approach gives you:
- Centralized identity resolution
- Control over matching logic
- Independence from platform changes
- Ability to integrate offline touchpoints
The trade-off: significant cost and implementation complexity. This isn’t viable for small properties.
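For a sense of what the Measurement Protocol side of such an integration involves, here is a minimal sketch that builds (but does not send) a GA4 Measurement Protocol request carrying a resolved `user_id`. The measurement ID, API secret, client ID, and event shown are placeholders, and the helper name `buildMpRequest` is my own.

```javascript
// Build a GA4 Measurement Protocol request without sending it.
// measurementId and apiSecret are placeholders taken from your own GA4 data stream.
function buildMpRequest({ measurementId, apiSecret, clientId, userId, eventName, params }) {
  const query = new URLSearchParams({ measurement_id: measurementId, api_secret: apiSecret });
  const body = { client_id: clientId, events: [{ name: eventName, params: params || {} }] };
  // Only include user_id when the identity layer actually resolved one.
  if (userId) body.user_id = userId;
  return {
    url: `https://www.google-analytics.com/mp/collect?${query.toString()}`,
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(body),
  };
}

const request = buildMpRequest({
  measurementId: "G-XXXXXXXXXX",
  apiSecret: "YOUR_API_SECRET",
  clientId: "1234567890.1234567890",
  userId: "usr_a1b2c3d4e5f6",
  eventName: "offline_purchase", // hypothetical custom event
  params: { value: 49.0, currency: "USD" },
});
console.log(request.url);
console.log(request.body);
```

Keeping request construction separate from sending makes it easy to log and review exactly what identity data leaves your systems.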
Common Mistakes and Troubleshooting
Mistake 1: Sending Empty User IDs
I see this constantly. Sites check if a user is logged in, but then send an empty string when they’re not:
// WRONG - sends empty string
dataLayer.push({
  'user_id': isLoggedIn ? userId : ''
});
// CORRECT - only push when ID exists
if (isLoggedIn && userId) {
  dataLayer.push({
    'user_id': userId
  });
}
Empty strings corrupt your identity graph. GA4 may treat '' as a valid (shared) User-ID.
Mistake 2: Using Session-Scoped Identifiers
Your User-ID must be persistent across sessions. Never use:
- Session IDs
- Temporary tokens
- Cart IDs (unless they persist with user accounts)
Mistake 3: Not Handling User Switching
On shared devices, users might log out and another user logs in. Your implementation must clear or update User-ID appropriately:
// When user logs out
dataLayer.push({
  'event': 'user_logout',
  'user_id': undefined // Clear the ID
});
Mistake 4: Forgetting About Consent
If you operate in GDPR or CCPA jurisdictions, User-ID tracking requires consent for analytics purposes. Your consent management platform must gate the User-ID parameter, not just the GA4 tag itself.
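One way to picture gating the parameter rather than the tag is a small helper that withholds the ID unless analytics consent is granted. This is a sketch, not a CMP integration: the `consentState` object mirrors the Consent Mode key `analytics_storage`, and the helper name `userIdForTag` is hypothetical.

```javascript
// Consent state uses the Consent Mode key analytics_storage ("granted" or "denied").
function userIdForTag(consentState, userId) {
  const analyticsGranted = Boolean(consentState) && consentState.analytics_storage === "granted";
  // Withhold the identifier entirely unless analytics consent is granted.
  return analyticsGranted && userId ? userId : undefined;
}

console.log(userIdForTag({ analytics_storage: "granted" }, "usr_a1b2c3d4e5f6")); // usr_a1b2c3d4e5f6
console.log(userIdForTag({ analytics_storage: "denied" }, "usr_a1b2c3d4e5f6"));  // undefined
console.log(userIdForTag(undefined, "usr_a1b2c3d4e5f6"));                        // undefined
```

The default is deliberately "no ID": an unknown or missing consent state is treated the same as a denial.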
Mistake 5: Not Validating BigQuery Data
BigQuery exports can have delays and occasionally missing data. Always validate:
-- Check for data freshness and completeness
SELECT
  _TABLE_SUFFIX AS date,
  COUNT(*) AS event_count,
  COUNT(DISTINCT user_pseudo_id) AS users,
  COUNTIF(user_id IS NOT NULL) AS events_with_user_id,
  ROUND(COUNTIF(user_id IS NOT NULL) * 100.0 / COUNT(*), 2) AS user_id_coverage_pct
FROM
  `your-project.analytics_XXXXXXXX.events_*`
WHERE
  _TABLE_SUFFIX BETWEEN FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 7 DAY))
    AND FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY))
GROUP BY _TABLE_SUFFIX
ORDER BY _TABLE_SUFFIX;
If user_id_coverage_pct is unexpectedly low, investigate your implementation.
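"Unexpectedly low" is easier to act on with a rule attached. As one possible rule, the sketch below flags any day whose coverage falls below half of the period's median. The rows are invented sample values shaped like the validation query's output, and both the threshold and the function name `flagLowCoverage` are assumptions to tune for your own property.

```javascript
// Rows shaped like the output of the validation query above (values are made up).
const rows = [
  { date: "20240501", event_count: 12000, user_id_coverage_pct: 18.4 },
  { date: "20240502", event_count: 11800, user_id_coverage_pct: 17.9 },
  { date: "20240503", event_count: 12100, user_id_coverage_pct: 6.2 },
];

// Flag days whose coverage falls below a fraction of the period's median.
function flagLowCoverage(data, ratio = 0.5) {
  const sorted = data.map((r) => r.user_id_coverage_pct).sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  const median = sorted.length % 2 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
  return data.filter((r) => r.user_id_coverage_pct < median * ratio).map((r) => r.date);
}

console.log(flagLowCoverage(rows)); // [ '20240503' ]
```

A sudden single-day drop like this usually points at a release that broke the dataLayer push, which is a different problem from steadily low coverage caused by low login rates.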
This Approach Breaks When…
Let’s be honest about limitations:
- Low authentication rates: If only 5% of your users ever log in, User-ID solves for that 5%, not the other 95%. Signals covered a much broader base.
- Privacy regulations: Some jurisdictions and consent frameworks limit first-party identity tracking regardless of authentication
- Technical debt: Sites with multiple authentication systems, acquired properties, or legacy platforms face significant integration challenges
- Mobile apps: If users aren’t logged into your app, you have the same anonymous user problem—and app tracking privacy changes compound this
There’s no perfect replacement for what Signals did. What I’m describing is the best available path forward, not a complete solution.
When to Bring in Help
If you’re reading this and thinking “this is significantly more complex than what we have resources for,” you’re not wrong. Proper identity resolution requires:
- Clean authentication infrastructure
- GTM expertise for reliable User-ID implementation (we handle this)
- BigQuery skills for session stitching and analysis
- Ongoing validation and maintenance
For teams without dedicated analytics engineering resources, this is where working with a specialized GA4 implementation partner makes sense. The cost of doing it wrong—inflated user counts, broken audiences, bad attribution—compounds every day it’s not fixed.
For organizations dealing with complex data flows across multiple systems—GA4, advertising platforms, CRM, maybe even Amazon marketplace data via SP-API—the identity resolution challenge extends beyond just Google Analytics. Our approach to data integration accounts for these cross-platform identity needs.
Key Takeaways
- Google Signals deprecation silently degraded cross-device tracking and remarketing audience reach. Most GA4 users haven’t noticed yet, but user counts are inflated, audiences are smaller, and attribution is less accurate.
- User-ID is now your primary cross-device identity mechanism. If you haven’t implemented it properly in GTM, start there. Use undefined for logged-out users, never empty strings.
- BigQuery exports are essential for serious cross-device analysis. Build session stitching queries and maintain user identity mapping tables to understand cross-device behavior.
- First-party data strategy is no longer optional. Increasing authentication rates, implementing proper consent-based first-party cookies, and potentially integrating a CDP are necessary investments.
- There is no complete replacement for what Signals provided. Google’s logged-in user graph was massive. Your own first-party identity layer will have lower coverage. Accept this and optimize accordingly.
- Validate continuously. Check your User-ID coverage rates, audit your BigQuery data freshness, and monitor for implementation drift. This isn’t a set-and-forget situation.