
A/B Testing Guide for Websites and Apps

Stop guessing what works. A/B testing turns opinions into data and small improvements into significant revenue gains.

A 10% improvement in conversion rate means 10% more revenue without spending more on marketing. A/B testing is how you find those improvements systematically instead of through hunches and redesigns. Here's how to run tests that actually improve your business metrics.

What A/B Testing Actually Is

A/B testing shows two versions of something (page, email, ad) to different users and measures which performs better. Version A (control) is your current version. Version B (variant) is your hypothesis for improvement.

For more insights on this topic, see our guide on Conversion Tracking Setup: Measure What Matters.

What you can test:

  • Landing pages — Headlines, images, form length, CTAs, social proof placement
  • Product pages — Price display, product descriptions, photo layouts, reviews positioning
  • Checkout flows — One-page vs. multi-step, guest checkout vs. account required
  • Email campaigns — Subject lines, send times, content, CTAs
  • Ad creative — Images, copy, headlines, audience targeting

What you measure:

  • Conversion rate — Percentage who complete desired action
  • Click-through rate — Percentage who click CTA or link
  • Revenue per visitor — Total revenue divided by total visitors; captures both conversion rate and average order size (see the sketch after this list)
  • Time on page, bounce rate — Engagement metrics
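
These metrics all roll up from the same raw counts. Here's a quick Python sketch, with made-up numbers, of how the first three relate (revenue per visitor is just conversion rate times average order value):

  # Hypothetical traffic numbers showing how the core metrics relate
  visitors = 10_000
  clicks = 1_200          # clicked the CTA
  orders = 240            # completed a purchase
  revenue = 14_400.00     # total revenue from those orders

  click_through_rate = clicks / visitors       # 12.0%
  conversion_rate = orders / visitors          # 2.4%
  average_order_value = revenue / orders       # $60.00
  revenue_per_visitor = revenue / visitors     # $1.44
  # revenue_per_visitor == conversion_rate * average_order_value
  print(f"{conversion_rate:.1%} conversion, ${revenue_per_visitor:.2f} per visitor")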

Start with a Hypothesis: Don't Test Randomly

Random A/B tests waste time. Good tests start with a hypothesis based on data or user feedback.

Hypothesis framework:

  • Observation: "Checkout abandonment is 60%, industry average is 40%"
  • Hypothesis: "Requiring account creation before checkout causes abandonment"
  • Test: "Offering guest checkout will reduce abandonment by 10%"
  • Measurement: Compare completion rate between account-required vs. guest-checkout versions

Where to find test ideas:

  • Analytics — High drop-off points, low-converting pages, pages with high bounce rates
  • Heatmaps — Where users click, how far they scroll, what they ignore
  • User feedback — Support tickets, surveys, user testing sessions
  • Best practices — Industry standards you're not following

Sample Size and Statistical Significance

Running a test for 2 days with 100 visitors doesn't tell you anything. You need sufficient sample size and statistical significance to trust results.

Statistical significance basics:

  • 95% confidence level — Industry standard. Means there's only a 5% chance of seeing a difference this large when no real difference exists (a false positive)
  • Sample size — Number of visitors needed depends on current conversion rate and expected improvement
  • Test duration — Run tests for at least 1-2 full business cycles (usually 1-2 weeks minimum)

Sample size example:

  • Current conversion rate: 2%
  • Minimum detectable effect: 10% improvement (2% → 2.2%)
  • Required sample size: ~80,000 visitors per variation (~160,000 total) at 95% confidence and 80% statistical power
  • At 1,000 visitors/day: the test runs roughly 160 days

Use a sample size calculator (free online tools from Optimizely, VWO, etc.) before launching tests. If you don't have enough traffic to reach significance in a reasonable time, test something with a bigger expected impact, or test higher-traffic pages.
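
If you want to sanity-check the math yourself, here's a minimal Python sketch of the standard two-proportion sample size formula, assuming a two-sided test at 95% confidence and 80% power (calculators may differ slightly depending on their assumptions):

  from math import ceil
  from statistics import NormalDist

  def sample_size_per_variation(baseline_rate, relative_mde,
                                confidence=0.95, power=0.80):
      """Visitors needed per variation to detect a relative lift."""
      p1 = baseline_rate
      p2 = baseline_rate * (1 + relative_mde)
      z_alpha = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-sided
      z_beta = NormalDist().inv_cdf(power)
      variance = p1 * (1 - p1) + p2 * (1 - p2)
      return ceil((z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2)

  n = sample_size_per_variation(0.02, 0.10)   # 2% baseline, 10% relative lift
  print(n)                                    # roughly 80,000 per variation
  print(ceil(2 * n / 1_000))                  # days needed at 1,000 visitors/day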

Common Tests to Run First

Some tests have higher success rates than others. Start with these proven test ideas.

CTA button tests:

  • Button text: "Buy Now" vs. "Add to Cart" vs. "Get Yours"
  • Button color: High-contrast colors typically win
  • Button size: Larger buttons convert better (to a point)
  • Button placement: Above the fold vs. after product details

Headline tests:

  • Benefit-focused vs. feature-focused
  • Question headlines vs. statements
  • Specificity: "Save money" vs. "Save $1,200/year"

Social proof tests:

  • Placement: Above fold vs. near CTA
  • Type: Testimonials vs. review stars vs. customer logos
  • Quantity: "Join 10,000 customers" vs. no number

Form tests:

  • Field count: Remove optional fields
  • Single-column vs. multi-column layout
  • Inline validation vs. validation on submit

Multivariate Testing: Testing Multiple Elements

Multivariate testing (MVT) tests multiple changes simultaneously. Instead of headline vs. headline, you test headline + image + CTA combinations.

When to use multivariate testing:

  • High traffic — MVT requires significantly more traffic than A/B (testing 2 headlines × 2 images × 2 CTAs = 8 combinations)
  • Mature optimization program — After you've run individual A/B tests and want to test interactions between elements
  • Major redesigns — Testing entire layouts vs. incremental changes

When to stick with A/B:

  • Traffic under 100,000 visitors/month
  • Testing specific hypotheses
  • Want results quickly

Most businesses should stick with simple A/B testing. MVT sounds sophisticated but rarely delivers better insights for the traffic investment required.
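
To make the traffic requirement concrete, here's a rough Python sketch using the ~80,000-visitors-per-variation figure from the sample size example above. It deliberately treats every combination as its own variation, which is a simplification (real MVT analysis can pool main effects), but the order of magnitude is the point:

  # Rough full-factorial traffic estimate for the 2 x 2 x 2 example above
  variants_per_element = [2, 2, 2]   # headlines x images x CTAs

  combinations = 1
  for count in variants_per_element:
      combinations *= count          # 8 combinations in total

  per_cell = 80_000                  # per-variation requirement from the earlier example
  total_visitors = combinations * per_cell
  print(f"{combinations} combinations, ~{total_visitors:,} visitors")
  print(f"~{total_visitors // 1_000} days at 1,000 visitors/day")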

Testing Tools: Free to Enterprise

You don't need expensive tools to start A/B testing. Free options exist for low-traffic sites.

Testing platforms:

  • Google Optimize — Was the free beginner option, but Google shut it down in 2023; free and low-cost alternatives have since filled the gap
  • VWO — $200-500+/month. Visual editor, heatmaps, session recordings
  • Optimizely — $50,000+/year. Enterprise-level, requires high traffic to justify cost
  • Convert — $99-1000+/month. Privacy-focused, GDPR-compliant
  • Unbounce — $90-240+/month. Landing page builder with built-in A/B testing

For low-budget testing:

  • Email A/B testing — Built into Mailchimp, ConvertKit, etc. (free for subject line tests)
  • Facebook/Google Ads — Built-in A/B testing for ad creative
  • Manual page variants — Create two landing pages, split traffic manually via different URLs

Running the Test: Best Practices

Proper test execution matters as much as the hypothesis.

Test setup checklist:

  • 50/50 traffic split — Equal visitors to each variation (or 25/25/25/25 for 4 variations)
  • Randomization — Ensure assignment is truly random, not based on time of day or device type
  • Cookie-based consistency — Same user sees the same variation across multiple visits (see the bucketing sketch after this checklist)
  • Test one thing — Multiple changes make it impossible to know what drove improvement
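
A common way to get both randomization and cross-visit consistency is deterministic hashing: hash a stable visitor ID (for example, the value of a first-party cookie you set) together with the experiment name, and bucket on the result. A minimal Python sketch, assuming you already persist a visitor ID:

  import hashlib

  def assign_variation(visitor_id, experiment, variations=("control", "variant")):
      """Deterministically assign a visitor to a variation.

      Hashing the visitor ID with the experiment name gives a stable,
      evenly distributed split: the same visitor always lands in the
      same bucket, and different experiments bucket independently.
      """
      digest = hashlib.sha256(f"{experiment}:{visitor_id}".encode()).hexdigest()
      return variations[int(digest, 16) % len(variations)]

  # The visitor ID would come from a persistent cookie in a real setup
  print(assign_variation("visitor-12345", "checkout-guest-flow"))
  print(assign_variation("visitor-12345", "checkout-guest-flow"))  # same result

Because the hash is keyed on the experiment name, launching a new test doesn't reshuffle visitors in existing tests, and passing four variation names gives the 25/25/25/25 split mentioned above.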

Don't end tests early. Even if the variant is "winning" after 3 days, wait for statistical significance. Early results often don't hold.

Watch for seasonality. Running a test through Black Friday or a product launch skews results. Test during typical business periods.

Analyzing Results: Beyond the Winner

A/B testing tools declare a "winner," but good analysis goes deeper.

What to analyze:

  • Statistical significance — Must hit 95%+ confidence before trusting results (a quick check is sketched after this list)
  • Segment performance — Did variant win for all users or just mobile? New vs. returning visitors?
  • Secondary metrics — Did variant improve conversion but hurt revenue per order?
  • Qualitative feedback — Check whether support tickets about the winning variation increased
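
Most testing tools report significance for you, but the underlying check is simple. Here's a minimal Python sketch of a two-sided two-proportion z-test on raw counts (the conversion and visitor numbers below are hypothetical):

  from math import sqrt
  from statistics import NormalDist

  def two_proportion_test(conv_a, n_a, conv_b, n_b):
      """Two-sided z-test comparing the conversion rates of two variations."""
      p_a, p_b = conv_a / n_a, conv_b / n_b
      pooled = (conv_a + conv_b) / (n_a + n_b)
      se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
      z = (p_b - p_a) / se
      p_value = 2 * (1 - NormalDist().cdf(abs(z)))
      return p_a, p_b, p_value

  # Hypothetical results: control vs. variant, 80,000 visitors each
  p_a, p_b, p_value = two_proportion_test(1_600, 80_000, 1_760, 80_000)
  print(f"control {p_a:.2%}, variant {p_b:.2%}, p-value {p_value:.3f}")
  # Significant at 95% confidence only if p_value < 0.05 (here, ~0.005)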

Real example: Variant increased form submissions by 15% (significant win). But sales team reported lead quality dropped. More submissions, fewer qualified leads. Test "won" but hurt business.

Always check downstream metrics. Increasing clicks doesn't help if conversions drop. More conversions don't help if average order value tanks.

Common A/B Testing Mistakes

  • Testing too many things at once — Can't determine which change drove results
  • Ending tests early — Early "winners" often regress to the mean over time
  • Ignoring sample size requirements — 100 visitors isn't enough to test anything
  • Not documenting tests — Six months later you won't remember what you tested or why
  • Testing low-traffic pages — Takes months to reach significance. Test high-traffic pages first
  • Declaring winners without significance — 52% vs. 48% could be random noise

Building a Testing Culture

One-off tests don't transform businesses. Continuous testing does.

Sustainable testing program:

  • Monthly test cadence — Always have a test running (if traffic allows)
  • Testing backlog — List of 10-20 test ideas ranked by expected impact and ease
  • Documentation — Record every test, hypothesis, result, and what you learned (a minimal log format is sketched after this list)
  • Share results — Socialize wins and losses across the team. Learning compounds
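
A test log doesn't need special tooling; even a flat list of records (or a spreadsheet with the same columns) works. A hypothetical Python sketch of one entry, with every value made up for illustration:

  # Hypothetical test log entry; a spreadsheet row with these columns works too
  test_log = [
      {
          "name": "checkout-guest-flow",
          "hypothesis": "Requiring account creation causes abandonment; "
                        "guest checkout will reduce it by 10%",
          "primary_metric": "checkout completion rate",
          "start": "2024-03-01",
          "end": "2024-03-21",
          "result": "variant +8% completions, p = 0.03",
          "decision": "ship guest checkout",
          "learnings": "most of the lift came from mobile visitors",
      },
  ]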

Even "failed" tests teach you something. They rule out ideas and inform future hypotheses. Testing culture values learning over being right.


Want to run A/B tests that actually improve conversions?

We'll set up testing infrastructure, develop hypotheses based on your data, and run statistically significant tests that grow your business.

Start A/B Testing