At JSM 2011 today, three Google employees (amongst the more than 20 Google delegates there) gave a little insight into how statistical analysis with R yields better results for companies using Google's various advertising products.
Bill Heavlin from Google kicked off the session with a talk about conditional regression models, a statistical technique used at Google to evaluate the factors that lead to user satisfaction with Google products, such as when users are surveyed on satisfaction with search reports, or when users are asked to rate YouTube videos. Google has graciously shared the fruits of Bill's research by publishing an open-source R package for conditional regression.
Next up was Tim Hesterberg from Google, who talked about how Google determines the effectiveness of display ads for its customers. When a brand-name company places a display (or banner) ad on a popular website like ESPN.com or CNN.com, it can be hard to judge its effectiveness, because only a small percentage of visitors will click on a display ad. But that's not to say that a display ad won't affect future behavior: a user might search for "HTC" or visit the HTC website a couple of days after seeing a display ad for an HTC phone. Using observational data from more than 10 million web users, Google compares the search behavior of people who were not exposed to the display ad (i.e. those that never visited a web page displaying the ad) to similar users who did see the ad, to figure out how many additional people visit the advertiser's web site as a result of seeing the display ad.
Tim was very clear in pointing out that no private information from any individual web user is used to make this determination, and that several techniques are used to minimise the bias inherent in using an observational, rather than experimental, process to make the estimate of additional visitors. (For example, Google tests the uplift of irrelevant "decoy" phrases, like searching for "wool socks", to make sure no spurious benefit is detected.) Google runs hundreds of studies each month, using R software for the statistical analysis and visualization, to ensure that its advertisers are always getting the best bang for their marketing dollar.
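To make the exposed-versus-unexposed comparison and the decoy check concrete, here is a minimal sketch in Python. It is not Google's method (the talk described R-based analyses with bias-reduction techniques not reproduced here); it simply compares search rates between two groups with a two-proportion z-score. The function name and all counts are invented for illustration.

```python
import math

def search_uplift(exposed_hits, exposed_n, control_hits, control_n):
    """Return (difference in search rates, z-score) for exposed vs. control users."""
    p_exposed = exposed_hits / exposed_n
    p_control = control_hits / control_n
    diff = p_exposed - p_control
    # Unpooled standard error of the difference between two proportions.
    se = math.sqrt(p_exposed * (1 - p_exposed) / exposed_n
                   + p_control * (1 - p_control) / control_n)
    z = diff / se if se > 0 else 0.0
    return diff, z

# Invented counts: users who later searched for the brand name.
brand_diff, brand_z = search_uplift(1200, 100_000, 1000, 100_000)

# Invented counts: users who searched for an irrelevant decoy phrase.
decoy_diff, decoy_z = search_uplift(500, 100_000, 495, 100_000)

print(f"brand uplift: {brand_diff:.4%} (z = {brand_z:.2f})")
print(f"decoy uplift: {decoy_diff:.4%} (z = {decoy_z:.2f})")
```

With these made-up numbers the brand phrase shows a clear difference while the decoy phrase does not, which is the pattern the decoy test is meant to confirm. A real observational study would also need the exposed and control groups to be matched on relevant characteristics, which this sketch assumes has already been done.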
Finally, John Vaver from Google discussed yet another method Google uses for ad effectiveness, this time with respect to the ads that appear alongside Google searches. For advertisers who buy ads around the world, an elegant statistical trick is used to determine how spending in a geographic region drives additional benefits (as measured by goal completions, such as ordering a product or signing up for a newsletter). By temporarily turning off ads in a given region, and cycling this through all the regions covered, Google can double up on the data used to determine the effectiveness of the ad: once when the ad is turned off, and again when it's turned back on. This information is then combined to determine the overall effectiveness of the ad. Once again, R was used for the data analysis and visualization.
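As a rough illustration of the on/off idea, the sketch below treats each region as its own control and averages the paired differences in daily goal completions. This is an assumption-laden simplification, not the analysis presented in the talk: the region names and numbers are invented, and a real geo experiment would also have to account for time trends and seasonality between the on and off periods.

```python
import math
from statistics import mean, stdev

# Invented data: region -> (daily goal completions with ads on, with ads off).
regions = {
    "north": (120.0, 100.0),
    "south": (90.0, 80.0),
    "east": (150.0, 125.0),
    "west": (60.0, 55.0),
}

def paired_lift(data):
    """Return (mean on-minus-off difference, standard error) across regions."""
    diffs = [on - off for on, off in data.values()]
    avg = mean(diffs)
    se = stdev(diffs) / math.sqrt(len(diffs))
    return avg, se

avg_lift, se_lift = paired_lift(regions)
print(f"average incremental completions per day: {avg_lift:.1f} (SE {se_lift:.2f})")
```

Pairing each region with itself removes region-to-region differences in baseline volume, which is one reason cycling the ads off and on within every region is more informative than comparing different regions to each other.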
Overall, the session was a fascinating insight into how advanced statistical analysis on massive data sets, and the R statistical software system, are used by Google to help marketers get the best value out of their advertising.
You are doing a good job on posting JSM summaries, David.
It is a shame I did not manage to talk to other Revo delegates finally, but I will keep an eye on what you guys are doing.
Posted by: Yihui | August 03, 2011 at 17:32
Thanks, Yihui! It was nice talking to you at the conference.
Posted by: David Smith | August 03, 2011 at 19:10
That's really interesting. Is there any way that we could see the talks or their presentations?
Posted by: Erik | August 04, 2011 at 06:18
The topic is interesting, but I am distracted by the SEO-y links to your site for common phrases.
Posted by: Andrew | August 04, 2011 at 08:44
@Erik, I'm not sure -- the presenters didn't offer any links to download the slides in their presentations. You'd have to ask the Googlers directly.
Posted by: David Smith | August 04, 2011 at 14:50
This is interesting. Is there any way to implement this on one's own website?
Posted by: Michael | August 05, 2011 at 12:55
I'm assuming it will come out in the JSM 2011 proceedings. I unfortunately had to miss all the cool Google and FB talks as well.
Posted by: John Johnson | August 06, 2011 at 05:53
Is R a good tool for solving optimization problems? It doesn't scale at all.
I used it to solve a quadratic optimization problem. It didn't allow more than 100 variables.
Posted by: Sriram | August 22, 2011 at 21:15
Google is really doing well with advertising and its effectiveness. In Finland, many people are really into the internet; in fact, there are many services offering online surveys, which is really helpful for business. I am really amazed to see how useful everything on the World Wide Web is to people.
Posted by: Hannes Ketola | November 08, 2012 at 05:31