A-B testing will tell you whether internet advertising works.
Pick an ad. Assign users at random into groups A and B. Show users in group A the ad. Don't show users in group B the ad. Then watch the rate at which users in group A and in group B buy the product. (Or survey the users in each group to ask how they feel about the brand.) If there's a statistically significant difference between the behaviors of people in group A and in group B, then you have statistically significant evidence that the ad works.
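For concreteness, here's a minimal sketch of the significance check this implies, as a standard two-proportion z-test. All the counts below are made-up numbers, not real campaign data:

```python
from math import sqrt
from scipy.stats import norm

# Made-up example counts: exposed group A vs. holdout group B.
n_a, conv_a = 100_000, 130  # users shown the ad / how many later bought
n_b, conv_b = 100_000, 100  # users not shown the ad / how many later bought

p_a, p_b = conv_a / n_a, conv_b / n_b
p_pool = (conv_a + conv_b) / (n_a + n_b)  # pooled rate under "ad has no effect"
se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
z = (p_a - p_b) / se
p_value = 2 * norm.sf(abs(z))  # two-sided p-value

print(f"lift: {p_a - p_b:.5f}, z = {z:.2f}, p = {p_value:.3f}")
```

With these particular made-up counts the difference lands right around the conventional 0.05 threshold, which is exactly why group sizes matter so much.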
Yes, this kind of thing is harder to do with billboards and with TV advertising than with internet ads. This is one of the selling points of internet ads over TV and billboards. If you buy ads from Google or Facebook or whatever, they can run this kind of experiment to measure how effective your advertising actually is. It doesn't involve peering into people's minds, just watching their internet behavior and/or surveying them.
This depends on a few things, though:
- Buyers clicking on the ad, rather than just stashing the product name, and later buying it offline, or through another channel
- People not using ad-blockers
- The "statistically significant" evidence being statistically significant
The vast majority of ad views have no effect on the viewer at all.
Suppose 100,000 viewers are shown the ad (that's your group A), and 0.01% of viewers click on the ad and complete a purchase. That's 10 actions, much too few for statistical significance. I have no idea whether these are typical numbers, but if that completion rate is in the ballpark, then I guess you need to show the ad to at least a million people.
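To put a rough number on that: here's a standard power calculation for comparing two proportions, using the usual normal approximation. The baseline rate and the hoped-for lift below are made-up, purely for illustration:

```python
from math import ceil, sqrt
from scipy.stats import norm

# Made-up illustration: baseline purchase rate without the ad, and the
# lift we hope the ad produces. Neither number comes from real data.
p_control = 0.0001     # 0.01% buy anyway
p_treatment = 0.00015  # the ad lifts that by half

alpha, power = 0.05, 0.80
z_alpha = norm.ppf(1 - alpha / 2)
z_power = norm.ppf(power)

# Normal-approximation sample size per group for a two-proportion test.
p_bar = (p_control + p_treatment) / 2
n = ((z_alpha * sqrt(2 * p_bar * (1 - p_bar))
      + z_power * sqrt(p_control * (1 - p_control)
                       + p_treatment * (1 - p_treatment))) ** 2
     / (p_treatment - p_control) ** 2)

print(f"~{ceil(n):,} users per group")  # roughly 785,000 per group here
```

So with rates in that ballpark, "at least a million people" is about right, and detecting a smaller lift would take far more.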
But what about everyone else (group B)? That's the rest of the population of the planet. How many of them bought the product because they saw the ad? Zero, because they weren't shown the ad. You have no statistic for group B at all.
It makes more sense if you're comparing ad A with ad B. Which one produces more completions? That would be a more convincing statistic. But it still doesn't tell you that internet advertising "works", in any quantifiable sense. It tells you which style of ad works best, without telling you how much better it works than simply not bothering.
And it doesn't tell you how many people were sufficiently annoyed by the ad to vow never to buy that brand.
> Buyers clicking on the ad, rather than just stashing the product name, and later buying it offline, or through another channel.
I'm not talking about measuring clicks on the ad. I'm talking about either surveying people or measuring their post-ad behavior, e.g. whether they buy a product. Yes, this generally misses offline behavior, but it captures a lot more online behavior than just clicks on the ad. People measure clicks on the ad too, but that's not what I'm describing.
What this requires is accurately tracking a user across the internet, i.e. being able to identify a user who is part of your experiment as the same user who later buys a product (or visits a website, or answers a survey). That's an imperfect mechanism, but it works well enough to run this kind of experiment.
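In data terms, the experiment boils down to joining an ad-exposure log against a later conversion log on whatever user identifier you can track. A toy sketch, with invented IDs and log formats:

```python
# Toy sketch: join ad-exposure records to later conversion records by
# user ID. Real systems use cookies / device IDs and are much messier.
exposures = [  # (user_id, group) recorded at ad-serving time
    ("u1", "A"), ("u2", "A"), ("u3", "B"), ("u4", "B"),
]
conversions = {"u1", "u4"}  # user IDs later seen completing a purchase

counts = {"A": [0, 0], "B": [0, 0]}  # per group: [users, conversions]
for user_id, group in exposures:
    counts[group][0] += 1
    counts[group][1] += user_id in conversions  # bool adds as 0 or 1

for group, (n, conv) in counts.items():
    print(f"group {group}: {conv}/{n} converted")
```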
Ad blockers don't really mess this up; the experiment takes their existence into account. Users with ad blockers land in both groups and see no ad either way, so they dilute the measured effect rather than bias it. E.g. if everyone used an ad blocker, this kind of experiment wouldn't show positive results (except by random variation).
And you're right, you do need a lot of data to get statistically significant results when people don't buy the product that often (when "conversion rates are low", in the lingo). It's a lot easier to measure conversions for, say, mobile games than for cars. If you're a car manufacturer, then measuring car buying this way isn't going to work.
When you do this in practice, it turns out sometimes the results are significant and sometimes they aren't. Probably because some ads work and some ads don't.
> It tells you which style of ad works best, without telling you how much better it works than simply not bothering.
The style of experiment I described is a "holdback" experiment. It compares showing people the ad vs. simply not bothering to show (some subset of) people that ad. People in control group B are treated as though the ad under the experiment never existed in the first place. (Which typically means showing them some other ad in its place, because that's what would be shown to users if the ad campaign under the experiment wasn't being run.)
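One common way to implement that holdback split (a generic technique, not anything specific to Google's or Facebook's products) is to hash a stable user ID into buckets, so the same user deterministically lands in the same group every time:

```python
import hashlib

def assign_group(user_id: str, experiment: str, holdback_pct: float = 10.0) -> str:
    """Deterministically bucket a user: "B" (holdback) or "A" (exposed)."""
    # Salt with the experiment name so different experiments get
    # independent splits of the same user population.
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 10_000 / 100  # roughly uniform in [0, 100)
    return "B" if bucket < holdback_pct else "A"

# The same user always gets the same answer, so the holdback group
# consistently never sees this campaign's ad (they see some other ad).
print(assign_group("u123", "spring-campaign"))
print(assign_group("u123", "spring-campaign"))  # identical result
```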
>But what about everyone else (group B)? That's the rest of the population of the planet.
This isn't how A-B tests work. Groups A and B aren't "people who see your website with change A" and "everyone else, including people who never interact with anything you showed them at all". A good experiment design means a good control group that you can measure something about. These experiments aren't stupid. (Well, sometimes they are. You have to set them up well.)
> And it doesn't tell you how many people were sufficiently annoyed by the ad to vow never to buy that brand.
Well sure, but it can tell you if your ad results in people answering survey questions about your brand more negatively, which might help you notice that your ad is annoying and counterproductive.
Anyway, long story short, internet advertising is a whole lot more measurable than you were originally suggesting with "But how could one prove such a thing? It would involve peering into people's minds."
Yes, there are limitations. Yes, a lot of the statistics about marketing "working" are bullshit. But some of them aren't.
Relevant links:
Google help page: https://support.google.com/displayvideo/answer/9570506
Facebook help page: https://www.facebook.com/business/help/1693381447650068