Affiliate disclosure: some links in this article are partner links. If you start a paid plan through them, imisofts may earn a commission at no extra cost to you. We only recommend tools we actually use to run client campaigns.
Most people run cold email campaigns, get mediocre results, and move on.
At imisofts, we run 100+ A/B tests monthly. Each test teaches us something. Compound those lessons, and you get 65-75% open rates and 3-5% reply rates.
This is our A/B testing framework.
The Cold Email A/B Testing Hierarchy
Not all tests have equal impact. We prioritize:
- Subject line (highest impact)
- Opening line (second highest)
- Value statement (medium impact)
- CTA (medium-low impact)
- Send time (lowest impact)
Never test low-impact variables first. You'll waste samples before finding the real wins.
Testing Subject Lines (Highest Impact)
Subject lines determine open rate. Open rate determines everything else.
Test structure:
Variant A (baseline): [Current subject line that gets 45% opens]
Variant B (test): [New subject line with different formula]
Sample size: 50-100 prospects each
Duration: 3-5 days
Example test:
Variant A: "hi john, noticed you launched [product]" (45% open rate)
Variant B: "quick thought on [industry]" (52% open rate)
Winner: Variant B (+7 percentage points)
This 7-point improvement compounds. Across 10,000 email prospects, that's 700 additional opens. 700 additional opens means 7-35 additional replies (at 1-5% reply rate).
That's 7-35 additional replies from changing one subject line.
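The arithmetic above can be checked with a few lines of Python. This is a minimal sketch: the function names are illustrative, and it follows the article's simplification of applying the reply rate to the additional opens.

```python
def extra_opens(prospects: int, open_lift_points: float) -> float:
    """Additional opens from an open-rate lift given in percentage points."""
    return prospects * open_lift_points / 100


def extra_replies(opens: float, reply_rate_pct: float) -> float:
    """Additional replies if reply_rate_pct percent of the extra opens reply."""
    return opens * reply_rate_pct / 100


opens = extra_opens(10_000, 7)   # the 45% -> 52% subject line example
low = extra_replies(opens, 1)    # pessimistic 1% reply rate
high = extra_replies(opens, 5)   # optimistic 5% reply rate
print(f"{opens:.0f} extra opens, {low:.0f}-{high:.0f} extra replies")
# prints "700 extra opens, 7-35 extra replies"
```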
Subject line tests we run:
- Personalization (first name + achievement vs. generic)
- Question format vs. statement format
- Lowercase vs. Title Case
- Short vs. specific
- Different achievement angles
Testing Opening Lines (Second Highest Impact)
The opening line determines whether they read past the first sentence.
Test structure:
Email 1, Variant A: [Opening line A] + [rest of email unchanged]
Email 1, Variant B: [Opening line B] + [rest of email unchanged]
Sample size: 100 prospects each
Duration: 3-5 days
Metric: Read-through rate, the share of opens that continue past the first sentence (hard to track without advanced tools)
Alternative: Track reply rate as proxy for "engagement with content"
Example test:
Variant A: "I noticed you launched [product] last month." (1.2% reply rate)
Variant B: "Most SaaS founders spend 15 hours/week on prospecting. You probably do too." (1.8% reply rate)
Winner: Variant B (+0.6 percentage points reply rate)
0.6 points seems small. Across 10,000 prospects, it's 60 additional replies.
Opening line tests we run:
- Specific personalization vs. general observation
- Problem-first vs. achievement-first
- Curiosity gap vs. direct statement
- Industry pattern vs. company-specific observation
Testing Value Statements (Medium Impact)
The value statement is your chance to prove relevance before the pitch.
Test structure:
Email 1, Variant A: [Value statement A]
Email 1, Variant B: [Value statement B]
Metric: Email 1 reply rate or Email 2 open rate (if they engage with Email 1, they'll engage with Email 2)
Example test:
Variant A: "We help SaaS teams automate their prospecting and save 12 hours/week." (2% Email 1 reply)
Variant B: "One of your competitors just booked 8 qualified deals this month using [tactic]." (2.8% Email 1 reply)
Winner: Variant B (social proof outperforms direct benefit)
This changes your entire Email 1 strategy. Across campaigns, social proof hooks outperform benefit hooks by 20-40% in relative reply rate (going from 2% to 2.8% is a 40% relative lift).
Value statement tests we run:
- Direct benefit vs. social proof
- Specific metric vs. general statement
- Industry pattern vs. company-specific observation
- Problem-agitation vs. opportunity-excitement
Testing CTAs (Medium-Low Impact)
CTA wording has lower impact than subject/opening, but still matters.
Test structure:
Email 2, Variant A: "[CTA A] or reply with your timeline."
Email 2, Variant B: "[CTA B] or reply with your timeline."
Metric: Email 2 reply rate
Example test (SaaS):
Variant A: "Book a 15-min strategy call" (3.2% reply)
Variant B: "Are you open to a quick conversation?" (2.8% reply)
Winner: Variant A (direct booking link outperforms vague ask)
But this varies by industry. Medicare-focused campaigns might see opposite results (phone-first CTAs outperform calendar links).
CTA tests we run:
- Direct booking link vs. "reply to schedule"
- Soft ask vs. hard ask
- Specific time ("15 min") vs. vague ("quick call")
- Phone number vs. calendar link (varies by industry)
Testing Send Times (Lowest Impact)
When you send affects open rate, but much less than what you send.
Test structure:
Group A: Send on Tuesday, 10 AM
Group B: Send on Thursday, 10 AM
Metric: Open rate
Our data across 50M+ emails:
Monday: 38% open rate
Tuesday: 45% open rate
Wednesday: 47% open rate
Thursday: 48% open rate
Friday: 40% open rate
Best day: Thursday, 10 AM
Worst day: Monday, 10 AM
Difference: ~10 percentage points
That matters, but nowhere near as much as subject line testing (which can change open rate by 30+ points).
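To rerun this comparison on your own sends, a small sketch like the following finds the best day, the worst day, and the spread between them. The dictionary holds the open rates quoted above; the variable names are illustrative.

```python
# Open rate (percent) by weekday, taken from the figures above.
open_rate_by_day = {
    "Monday": 38,
    "Tuesday": 45,
    "Wednesday": 47,
    "Thursday": 48,
    "Friday": 40,
}

best_day = max(open_rate_by_day, key=open_rate_by_day.get)
worst_day = min(open_rate_by_day, key=open_rate_by_day.get)
spread_points = open_rate_by_day[best_day] - open_rate_by_day[worst_day]

print(f"Best: {best_day}, worst: {worst_day}, spread: {spread_points} points")
# prints "Best: Thursday, worst: Monday, spread: 10 points"
```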
Send time tests we run:
- Weekday vs. weekend
- Morning vs. afternoon vs. evening
- Time zone-specific sends
- Industry-specific patterns (e.g., healthcare sees higher open rates on Friday due to weekly planning)
How to Measure Statistical Significance
You don't need a PhD in statistics. Here's the simple rule:
Sample size of 50+ per variant. If you see a difference of 5+ percentage points, treat it as a likely winner and confirm it with a larger send.
More rigorous approach:
Use a two-proportion significance calculator (most A/B test calculators run a two-proportion z-test).
Example:
- Variant A: 45 opens out of 100 (45%)
- Variant B: 52 opens out of 100 (52%)
- Difference: 7 percentage points
Question: Is this real or random?
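Plug the numbers into the calculator. If the p-value is below 0.05, the result is statistically significant at the 95% confidence level and you can trust it. In this example, a two-proportion z-test gives p ≈ 0.32, so a 7-point gap on 100 sends per variant is not yet conclusive; keep the test running before declaring a winner.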
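If you would rather script the check than use an online calculator, here is a minimal sketch of a pooled two-proportion z-test using only the Python standard library. The function name is illustrative, and the normal approximation it relies on is only reasonable when each variant has a fair number of opens and non-opens.

```python
import math


def two_proportion_p_value(success_a: int, n_a: int, success_b: int, n_b: int) -> float:
    """Two-sided p-value for the difference between two proportions (pooled z-test)."""
    p_a = success_a / n_a
    p_b = success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_err == 0:
        return 1.0  # both variants are all-success or all-failure: no evidence of a difference
    z = (p_b - p_a) / std_err
    # Two-sided tail area of the standard normal: 2 * (1 - Phi(|z|)) == erfc(|z| / sqrt(2))
    return math.erfc(abs(z) / math.sqrt(2))


p = two_proportion_p_value(45, 100, 52, 100)
print(f"p-value: {p:.2f}")  # about 0.32, above the 0.05 threshold
```

The same 45% vs. 52% split on 1,000 sends per variant (`two_proportion_p_value(450, 1000, 520, 1000)`) comes out well below 0.05, which is why sample size matters as much as the size of the gap.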
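For cold email, we use this rule of thumb as a quick directional check:
Sample < 50 per variant: Don't trust the result. Run more samples.
Sample 50-100 per variant: If difference > 5 percentage points, probably real.
Sample 100-200 per variant: If difference > 3 percentage points, probably real.
Sample 200+ per variant: If difference > 2 percentage points, probably real.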
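For a stricter check than the rule of thumb, you can estimate how many prospects each variant needs before a given gap is reliably detectable. The sketch below uses a common textbook approximation for comparing two proportions. The function name and the defaults (5% significance, 80% power) are assumptions for illustration, not part of the imisofts process.

```python
import math
from statistics import NormalDist


def sample_size_per_variant(p_a: float, p_b: float,
                            alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate prospects needed per variant to detect p_a vs. p_b (two-sided test)."""
    if p_a == p_b:
        raise ValueError("p_a and p_b must differ")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the significance level
    z_beta = NormalDist().inv_cdf(power)           # critical value for the desired power
    variance = p_a * (1 - p_a) + p_b * (1 - p_b)
    n = (z_alpha + z_beta) ** 2 * variance / (p_a - p_b) ** 2
    return math.ceil(n)


# 45% vs. 52% open rate, as in the example above
print(sample_size_per_variant(0.45, 0.52))
```

For the 45% vs. 52% example this lands at roughly 800 prospects per variant, so smaller tests are best read as directional signals to confirm on later sends.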
The Weekly Testing Cycle
Monday: Review last week's tests. Declare winners.
Tuesday-Wednesday: Roll out winning variant to 50% of new prospects.
Wednesday-Thursday: Run new tests on remaining 50%.
Friday: Measure results.
Monday: Repeat.
This weekly cycle compounds. Each week you find one new winning variant. Month 1, you're at baseline. Month 3, you're 30-40% above baseline.
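One way to implement the 50/50 split in this cycle is to hash each prospect into a bucket, so the same person always lands in the same group for a given week. This is a sketch under assumptions: `assign_bucket`, the `cycle_id` label, and the bucket names are hypothetical and not taken from any sending platform.

```python
import hashlib


def assign_bucket(prospect_email: str, cycle_id: str) -> str:
    """Deterministically place a prospect in the 'winner' or 'new_test' half."""
    key = f"{cycle_id}:{prospect_email.strip().lower()}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    # Interpret the first 8 bytes as an integer; even goes to the winner, odd to the new test.
    value = int.from_bytes(digest[:8], "big")
    return "winner" if value % 2 == 0 else "new_test"


prospects = [f"prospect{i}@example.com" for i in range(10)]
for email in prospects:
    print(email, assign_bucket(email, cycle_id="week-01"))
```

Changing `cycle_id` each week reshuffles the split, so the same prospects are not always the ones receiving the experimental variant.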
What Not to Test
Don't test too many things at once
Wrong: Test subject line, opening line, CTA, send time simultaneously.
You won't know which variable drove the result. Testing several variables at once is called multivariate testing, and it requires much larger sample sizes.
Right: Test one variable per week.
Subject line week 1. Opening line week 2. CTA week 3. You learn faster and with smaller samples.
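To see why testing everything at once gets expensive, count the combinations. The sketch below assumes a full-factorial design where every combination of variants gets its own group of at least 50 prospects. The function name and the 50-per-cell default are illustrative.

```python
from math import prod


def prospects_needed(variants_per_variable: list[int], per_cell: int = 50) -> int:
    """Prospects required when every combination of variants gets per_cell prospects."""
    cells = prod(variants_per_variable)  # number of distinct email combinations
    return cells * per_cell


# One variable with two variants (a simple A/B test)
print(prospects_needed([2]))            # 100
# Subject line, opening line, CTA, and send time, two variants each
print(prospects_needed([2, 2, 2, 2]))   # 800
```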
Don't test on tiny samples
Wrong: Test on 10 people per variant.
Too much variance. Random chance plays a huge role.
Right: Test on 50+ people per variant minimum.
This gives signal above noise.
Don't declare winners too early
Wrong: Run test for 24 hours. Declare winner.
Time of day matters. Day of week matters. One day isn't enough.
Right: Run test for 5-7 days minimum.
This accounts for daily/weekly patterns.
Testing Template: What We Track
| Element | Variant A | Variant B | Winner | Notes |
|---------|----------|----------|--------|-------|
| Subject Line | hi john, noticed [product] | quick thought on [industry] | B | +7 points open rate |
| Opening Line | I noticed... | Most [industry]... | B | +0.6 points reply rate |
| Value Statement | Direct benefit | Social proof | B | +0.8 points reply rate |
| CTA | Book call | Reply to schedule | A | Direct link better |
| Send Time | Tuesday 10 AM | Thursday 10 AM | B | +3 points open rate |
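If you keep this log in code rather than a spreadsheet, a small record type is enough. The sketch below is one possible shape: the `ABTestRecord` class and its field names are hypothetical, and the sample rows reuse figures from the examples earlier in the article.

```python
from dataclasses import dataclass


@dataclass
class ABTestRecord:
    element: str      # what was tested, e.g. "Subject Line"
    variant_a: str
    variant_b: str
    metric: str       # e.g. "open rate" or "reply rate"
    rate_a: float     # result for variant A, in percent
    rate_b: float     # result for variant B, in percent
    notes: str = ""

    @property
    def winner(self) -> str:
        if self.rate_a == self.rate_b:
            return "tie"
        return "A" if self.rate_a > self.rate_b else "B"

    @property
    def lift_points(self) -> float:
        """Absolute gap between the variants, in percentage points."""
        return round(abs(self.rate_b - self.rate_a), 2)


log = [
    ABTestRecord("Subject Line", "hi john, noticed [product]",
                 "quick thought on [industry]", "open rate", 45.0, 52.0),
    ABTestRecord("CTA", "Book a 15-min strategy call",
                 "Are you open to a quick conversation?", "reply rate", 3.2, 2.8),
]

for record in log:
    print(f"{record.element}: winner {record.winner}, "
          f"+{record.lift_points} points {record.metric}")
```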
Tools for A/B Testing
At imisofts, we use:
- Instantly (built-in A/B testing)
- SmartLead (rotation + analytics)
- Clay + Apollo (data merge + manual testing)
- Custom scripts (for complex multivariate tests)
Most platforms now offer native A/B testing. Use it.
Results: What Testing Gets You
Baseline campaign (no testing):
- Open rate: 35%
- Reply rate: 1.5%
After 3 months of weekly testing:
- Open rate: 55% (+20 points)
- Opening: better engagement
- CTA: better conversion
- Reply rate: 3.5% (+2 points)
That 2-point improvement in reply rate is massive. It more than doubles your replies.
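The difference between absolute and relative improvement is easy to mix up, so here is a short sketch that computes both for the numbers above. The function name is illustrative.

```python
def relative_lift(baseline_pct: float, new_pct: float) -> float:
    """Relative improvement over the baseline, in percent."""
    return (new_pct - baseline_pct) / baseline_pct * 100


# Reply rate: 1.5% -> 3.5% is +2 points absolute
print(f"Reply rate lift: {relative_lift(1.5, 3.5):.0f}% relative")  # about 133%
# Open rate: 35% -> 55% is +20 points absolute
print(f"Open rate lift: {relative_lift(35, 55):.0f}% relative")     # about 57%
```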
What We Recommend at imisofts
We run A/B testing for all managed clients:
- Weekly testing cycles
- Subject line, opening, CTA, send time
- Multivariate testing for scaled campaigns
- Statistical significance validation
- Monthly optimization reports
Packages range from $497/month (Management with testing) to $2,450/year (Enterprise with full testing suite).
Explore imisofts Cold Email Packages