Portfolio bid strategies are genuinely powerful: they pool conversion data across campaigns, simplify bid management, and give you levers like target CPA/ROAS floors and ceilings that single-campaign strategies can't match. But the moment you want to test one against another, you hit a wall: Google's native Campaign Experiments don't support portfolio strategies. That's not a bug; it's a constraint that forces smart practitioners to build their own testing frameworks. After managing over $350M in Google Ads spend, I've developed repeatable methods that get you statistically meaningful results without relying on the experiment tool. Here's exactly how to do it.
Why Portfolio Bid Strategies Break Standard A/B Testing
Before we get into solutions, it's worth understanding why this problem exists. Google's Campaign Experiments work by splitting traffic at the ad group or campaign level, routing a percentage of auctions to a "treatment" arm in real time. Portfolio bid strategies, by design, operate across multiple campaigns simultaneously — the algorithm is making cross-campaign decisions about where to push spend and how to pace. Splitting that logic mid-flight creates model conflicts that Google hasn't built a clean solution for.
As practitioners often discuss in the r/googleads community, the core tension is this: the very thing that makes portfolio strategies valuable (pooled signals, shared learning) is the same thing that makes them nearly impossible to A/B test with standard tooling. You're essentially trying to isolate a variable that exists at a higher architectural level than individual campaigns.
Key Insight: The inability to use Campaign Experiments with portfolio bid strategies isn't a gap you can work around with a setting change. You need to design your test methodology around this constraint from the start — not try to force the native experiment tool to do something it wasn't built for.
Method 1: Time-Based Sequential Testing (The Most Practical Approach)
For most practitioners, a time-based sequential test is the most realistic starting point. You run Bid Strategy A for a defined period, then switch to Bid Strategy B for an equal period, and compare performance across the two windows.
How to Structure It
- Define your baseline period. Run your current strategy for at least 4 weeks, preferably 6, to establish a stable performance baseline. Pull weekly averages for CPA, ROAS, conversion rate, impression share, and average CPC.
- Identify external variables to control for. Seasonality, promotions, competitor activity, and landing page changes can all contaminate results. Document everything in a change log before the test begins.
- Set your transition window. After switching strategies, give the new portfolio strategy a 1–2 week learning period (up to 3 weeks on low-volume accounts) before you start measuring. Counting learning-phase performance against the new strategy is one of the most common mistakes I see.
- Run the test period. Match the duration of your baseline — typically 4–6 weeks of clean data.
- Normalize for external factors. Use your Google Analytics or third-party data to apply rough seasonality adjustments if comparing across different calendar periods.
Common Mistake: Running a sequential test across a seasonality boundary — for example, switching strategies in mid-November — and attributing performance differences to the bid strategy when the real driver is Q4 demand spikes. Always align your test windows to comparable seasonal periods, or extend both windows to average out the noise.
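The comparison itself can be sketched in a few lines of Python. This is a minimal illustration, not a finished tool: the `DailyStats` record, the function names, and the default 14-day learning window and 28-day periods are all assumptions chosen to match the steps above, and the daily rows are assumed to come from whatever export you already use.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class DailyStats:
    """One day of portfolio-level totals (hypothetical record layout)."""
    day: date
    cost: float
    conversions: float


def window_cpa(rows, start, end):
    """CPA over [start, end] inclusive, computed from totals rather than
    by averaging daily CPAs."""
    cost = sum(r.cost for r in rows if start <= r.day <= end)
    conversions = sum(r.conversions for r in rows if start <= r.day <= end)
    return cost / conversions if conversions else float("nan")


def compare_sequential(rows, switch_day, learning_days=14, period_days=28):
    """Compare the baseline window before the switch with an equal-length
    test window that starts only after the learning period."""
    baseline_start = switch_day - timedelta(days=period_days)
    baseline_end = switch_day - timedelta(days=1)
    test_start = switch_day + timedelta(days=learning_days)
    test_end = test_start + timedelta(days=period_days - 1)

    baseline = window_cpa(rows, baseline_start, baseline_end)
    test = window_cpa(rows, test_start, test_end)
    return {
        "baseline_cpa": baseline,
        "test_cpa": test,
        "cpa_change_pct": (test - baseline) / baseline * 100,
    }
```

Note that the learning-window days are never read at all, so a rough transition cannot drag down the new strategy's numbers. Seasonality adjustments are deliberately left out of this sketch.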
Minimum Data Requirements
For sequential testing to produce anything resembling a reliable signal, you need volume. As a rule of thumb I apply across accounts:
- Minimum 50 conversions per week per strategy being tested (100+ is much better)
- At least 4 weeks per period excluding the learning window
- A pre-test week-over-week variance of <15% in your key metric — if your account is already highly volatile, the test results will be noise
If your account is generating <30 conversions per week across the campaigns in question, sequential testing will give you directional signals at best. Don't make major structural decisions based on low-volume sequential tests.
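A quick readiness check along these lines can be scripted before you commit to a test. The sketch below is one possible reading of the thresholds above: it treats "week-over-week variance" as the largest absolute percentage change between consecutive weeks, which is an assumption on my part, and the function names are hypothetical.

```python
def max_week_over_week_change(weekly_values):
    """Largest absolute relative change between consecutive weeks."""
    changes = [
        abs(current - previous) / previous
        for previous, current in zip(weekly_values, weekly_values[1:])
        if previous
    ]
    return max(changes) if changes else 0.0


def check_readiness(weekly_conversions, weekly_metric,
                    min_conversions=50, min_weeks=4, max_change=0.15):
    """Check pre-test data against the rule-of-thumb thresholds.

    weekly_conversions: conversions per week in the pre-test period
    weekly_metric: the key metric per week (for example CPA)
    """
    return {
        "enough_weeks": len(weekly_conversions) >= min_weeks,
        "enough_volume": min(weekly_conversions) >= min_conversions,
        "stable": max_week_over_week_change(weekly_metric) < max_change,
    }


if __name__ == "__main__":
    print(check_readiness([60, 65, 58, 70], [40.0, 42.0, 41.0, 43.0]))
```

If any of the three flags comes back false, treat whatever the test produces as directional only.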
Method 2: Geo-Based Split Testing (Highest Confidence, Most Setup Work)
If you have geographic segmentation in your campaigns — which most mid-to-large accounts do — geo-based splits give you the closest thing to a true A/B test for portfolio strategies. The concept is simple: assign comparable geographic markets to two groups, run Portfolio Strategy A in Group 1 and Portfolio Strategy B in Group 2, simultaneously.
Building Your Geo Test Design
- Identify your geo universe. Pull 90 days of performance by state, DMA, or city depending on your scale. You need enough conversion volume in each individual market to be meaningful.
- Create matched pairs. Use historical CPA/ROAS, conversion volume, average order value, and seasonality index to pair similar markets. You want Group A and Group B to look as close to identical as possible on a trailing 90-day basis.
- Duplicate your campaign structure. Create two sets of geographically targeted campaigns — one for each group — and assign each set to its own portfolio bid strategy.
- Use exclusions, not bid adjustments. Apply geographic exclusions rather than location bid adjustments to keep the two groups cleanly separated. Bid adjustments interact with the portfolio strategy's own logic and create confounding variables.
- Run for a minimum of 4 weeks simultaneously. The simultaneous nature is the major advantage here — you're controlling for time-based external factors automatically.
Best Practice: When building matched geo pairs, weight your matching toward conversion volume rather than revenue or ROAS. High-CPA outlier markets can skew ROAS comparisons dramatically. Pair on CPA and volume first, then sanity-check on revenue mix.
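One simple way to build matched pairs is to rank markets by conversion volume, pair neighbors, and reject pairs whose CPAs are too far apart. The sketch below assumes each market is a dict with `name`, `conversions`, and `cost` totals for the trailing 90 days and that every market has at least one conversion. The 20% CPA gap tolerance and the function name are illustrative choices, not a standard.

```python
def build_geo_groups(markets, max_cpa_gap=0.20):
    """Pair markets of similar volume and split each pair across two groups.

    Returns (group_a, group_b, flagged), where flagged holds pairs whose
    CPAs differ by more than max_cpa_gap and need manual review.
    Assumes conversions > 0 for every market. With an odd number of
    markets, the smallest one is left unassigned.
    """
    ranked = sorted(markets, key=lambda m: m["conversions"], reverse=True)
    group_a, group_b, flagged = [], [], []

    for i in range(0, len(ranked) - 1, 2):
        larger, smaller = ranked[i], ranked[i + 1]
        cpa_larger = larger["cost"] / larger["conversions"]
        cpa_smaller = smaller["cost"] / smaller["conversions"]
        gap = abs(cpa_larger - cpa_smaller) / min(cpa_larger, cpa_smaller)

        if gap > max_cpa_gap:
            flagged.append((larger["name"], smaller["name"]))
            continue

        # Alternate which group gets the larger market of each accepted
        # pair so neither group systematically ends up with more volume.
        if len(group_a) % 2 == 0:
            group_a.append(larger["name"])
            group_b.append(smaller["name"])
        else:
            group_a.append(smaller["name"])
            group_b.append(larger["name"])

    return group_a, group_b, flagged
```

Flagged pairs are not dropped silently: they are returned so you can re-pair them by hand or leave them out of the test. Once the groups are set, sanity-check revenue mix separately.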
Limitations to Acknowledge
Geo splits work best for accounts with national or regional coverage. If your business is hyper-local or concentrated in one metro area, you won't have enough geographic variance to split meaningfully. Also, portfolio strategies with limited conversion data per geo group will struggle to optimize well — you're essentially giving the algorithm a smaller signal pool than it would have in a unified structure.
Method 3: Campaign Cluster Comparison (For Accounts Without Geo Segmentation)
If geo splits aren't an option, you can test across product lines, audience segments, or campaign themes — any dimension where you have natural clusters of comparable campaigns.
| Approach | Isolation Quality | Setup Complexity | Min. Weekly Conversions | Best For |
| --- | --- | --- | --- | --- |
| Sequential (Time-Based) | Low–Medium | Low | 50+ | Most accounts; directional confidence |
| Geo Split | High | High | 100+ (per geo group) | National/regional accounts with clean geo data |
| Campaign Cluster | Medium | Medium | 75+ (per cluster) | E-commerce with distinct product categories |
| Holdout (No-Strategy Control) | Medium–High | Medium | 100+ | Testing portfolio vs. manual or standard smart bidding |
The campaign cluster method assigns one portfolio strategy to a set of thematically similar campaigns (e.g., brand campaigns, top-product campaigns) and a different strategy to a comparable set. The key requirement is that the clusters need comparable historical performance profiles — if one cluster has a naturally lower CPA than the other due to audience or keyword intent differences, your test is contaminated before it starts.
Key Insight: No matter which method you use, your test is only as valid as your baseline normalization. Before any test, spend two weeks documenting your current performance variance. If week-over-week CPA swings >20% organically, you need to stabilize the account before a test result will mean anything.
What Metrics to Actually Measure (And What to Ignore)
A common question in the r/googleads community is whether to focus on CPA, ROAS, or a blended efficiency metric when evaluating bid strategy tests. The honest answer: it depends on what problem the test is trying to solve, but here's how I prioritize:
Primary Metrics
- CPA or ROAS at the portfolio level — not campaign level, not ad group level. Portfolio strategies optimize at the aggregate, so measure at the aggregate.
- Conversion volume — a strategy that improves CPA by 10% but drops conversion volume 30% is a net loss for most growth-oriented accounts.
- Impression share lost to budget vs. rank — this tells you whether the strategy is leaving efficiency on the table or is actually budget-constrained.
Secondary Metrics (Directional Only)
- Average CPC (useful for understanding how the bidding behavior changed)
- Auction insight shifts (are you winning different auctions?)
- Quality Score trends (a major CPC change with stable Quality Score means the strategy is bidding differently, not that keyword relevance changed)
Metrics to Deprioritize
- Click-through rate — bidding strategy changes rarely move CTR meaningfully unless you're changing ad position dramatically
- Impression volume alone — more impressions at worse CPA is not a win
- Daily performance during the learning window — this data is almost always misleading; exclude the first 7–14 days from your analysis
Best Practice: Build a simple weekly reporting template before the test starts — not during or after. Decide in advance what a "win" looks like: for example, "Strategy B wins if CPA improves by >8% with no more than a 10% drop in conversion volume, measured over a 4-week clean window." Defining success criteria after you see the data is how confirmation bias sneaks in.
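Writing the success rule as code before the test starts is one way to make it hard to move the goalposts later. The sketch below encodes the example criterion above. The dict layout with `cost` and `conversions` totals for the clean measurement window and the function name are assumptions for illustration.

```python
def evaluate_test(control, treatment,
                  min_cpa_improvement=0.08, max_volume_drop=0.10):
    """Apply a pre-registered win rule to portfolio-level totals.

    control / treatment: dicts with 'cost' and 'conversions' summed over
    the clean measurement window (learning period already excluded).
    """
    control_cpa = control["cost"] / control["conversions"]
    treatment_cpa = treatment["cost"] / treatment["conversions"]

    # Positive value means the treatment strategy is cheaper per conversion.
    cpa_improvement = (control_cpa - treatment_cpa) / control_cpa
    # Negative value means the treatment strategy delivered fewer conversions.
    volume_change = (
        (treatment["conversions"] - control["conversions"])
        / control["conversions"]
    )

    return {
        "cpa_improvement": cpa_improvement,
        "volume_change": volume_change,
        "treatment_wins": (
            cpa_improvement > min_cpa_improvement
            and volume_change >= -max_volume_drop
        ),
    }
```

A strategy that cuts CPA by 10% while losing 30% of conversions fails this rule, which matches the "net loss" case described under Primary Metrics.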
Accounting for Portfolio Strategy Learning Periods
This is the section most testing guides skip, and it's where a lot of practitioners make expensive mistakes. Smart bidding algorithms — including those powering portfolio strategies — have learning periods that can last anywhere from 1 to 3 weeks depending on conversion volume and the magnitude of the strategy change.
During the learning period, the algorithm is calibrating to the new objective. CPAs will often be worse, impression share may fluctuate, and spend pacing can behave erratically. If you start measuring on Day 1 of a strategy switch, you are not measuring the strategy — you are measuring the transition.
Learning Period Guidelines by Conversion Volume
- High volume (>200 conversions/week in the portfolio): Exclude the first 7 days post-switch from analysis
- Medium volume (50–200 conversions/week): Exclude the first 10–14 days
- Low volume (<50 conversions/week): Exclude the first 14–21 days, and reconsider whether you have enough data for a valid test at all
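If you script your reporting, the guidelines above reduce to a small lookup. This sketch simply restates the three tiers. The function name and the choice to return a (minimum, maximum) range in days are my own.

```python
def learning_exclusion_days(weekly_conversions):
    """Return the (min_days, max_days) to exclude after a strategy switch,
    based on weekly conversions in the portfolio."""
    if weekly_conversions > 200:
        return (7, 7)      # high volume
    if weekly_conversions >= 50:
        return (10, 14)    # medium volume
    return (14, 21)        # low volume: also question whether to test at all
```

When in doubt, use the upper end of the range; a few extra excluded days cost less than contaminated results.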
One practical tip: Google does flag the learning status in the bid strategy report under Tools & Settings. Check this daily during your transition week and don't start the clock on your measurement period until you see "Eligible" status consistently for at least 3 consecutive days.
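If you record the status once a day during the transition, the "3 consecutive days" rule is easy to apply mechanically. The sketch below assumes a hand-recorded list of daily status strings, one per day since the switch. It does not call any Google Ads API, and the function name is hypothetical.

```python
def streak_completed_on(daily_statuses, required_streak=3, target="Eligible"):
    """Return the 0-based index of the day on which the status has been
    `target` for `required_streak` consecutive days, or None if that
    never happens in the recorded data."""
    streak = 0
    for index, status in enumerate(daily_statuses):
        streak = streak + 1 if status == target else 0
        if streak == required_streak:
            return index
    return None
```

Start the measurement clock on the day after the returned index. A single day that flips back to a learning status resets the count, which is the conservative reading of "consistently".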
Common Mistake: Panicking during the learning window and switching back to the original strategy before the algorithm has stabilized. This resets the learning clock, wastes the data you've already generated, and leaves you with no valid test results. Commit to the transition window you defined before the test started — unless you're seeing catastrophic performance degradation (e.g., >50% CPA spike sustained for more than 5 days).
Using Google's Built-In Portfolio Reporting as a Proxy
While Campaign Experiments don't support portfolio strategies, Google does provide portfolio-level reporting that many practitioners underutilize. Under Tools & Settings > Bid Strategies, you can access a dedicated portfolio strategy report that shows performance over time at the portfolio level, including a built-in comparison to target metrics.
This won't replace a proper controlled test, but it gives you a few useful lenses:
- Bid strategy status history — shows you exactly when the learning period started and ended
- Target vs. actual CPA/ROAS over time — helps you understand whether the algorithm is consistently meeting targets or systematically over/undershooting
- Portfolio-level impression share trends — useful for detecting whether the strategy is being budget-constrained in ways that invalidate comparisons
Cross-reference this report against your test windows to make sure the "Eligible" status aligns with the periods you're analyzing. It's a simple sanity check that saves a lot of headaches when you're writing up results.
What to Do Next: Bottom Line Action Plan
Portfolio bid strategy testing is more work than dropping in a Campaign Experiment, but the discipline it requires often produces better strategic insight. Here's how to move forward:
- Audit your conversion volume first. Pull your weekly conversion totals for the campaigns you want to test. If you're under 50 conversions per week in the portfolio, invest in conversion volume growth before running a bidding test — the results won't be reliable enough to act on.
- Choose your method based on account structure. Geographic segmentation available and >100 conversions/week per geo group? Use Method 2. Single market? Use Method 3 if you have comparable campaign clusters, or Method 1 if you don't and can tolerate the seasonal risk.
- Define success criteria in writing before you start. Specific threshold, specific metrics, specific timeframe. Send it to a stakeholder or colleague to create accountability. This one step eliminates most of the post-hoc rationalization that invalidates test conclusions.
- Build in a mandatory learning window exclusion. Set a calendar reminder for Day 7 (or Day 14 for lower-volume accounts) to check bid strategy status before you begin recording results. Do not count learning-phase data in your analysis.
- Document everything in a change log. Any landing page change, promotion, budget change, or external market event that occurs during your test window needs to be logged with the date. This log is what separates a defensible test conclusion from an educated guess when someone (or you) challenges the results three months later.
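The change log does not need special tooling; a spreadsheet works. If you prefer to keep it next to your analysis code, a minimal sketch could look like the following, where the `ChangeLogEntry` fields and category labels are illustrative assumptions.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class ChangeLogEntry:
    """One dated event that could affect test results (hypothetical layout)."""
    day: date
    category: str   # e.g. "landing_page", "promotion", "budget", "market"
    note: str


def events_in_window(log, start, end):
    """Return entries dated within [start, end] inclusive, oldest first."""
    return sorted(
        (entry for entry in log if start <= entry.day <= end),
        key=lambda entry: entry.day,
    )
```

Running `events_in_window` over each test window when you write up results gives you the list of confounders to address before anyone else raises them.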
Portfolio bid strategies are worth the effort to test rigorously — the performance ceiling is genuinely higher than single-campaign smart bidding when you have the volume to support them. The practitioners who get the most out of them are the ones who treat the testing process as a discipline, not an afterthought.