Dissecting Google’s Box Office Prediction Study

By J. Sperling Reich | June 10, 2013 8:19 am PDT
Google's Comparison of 2012 Box Office Index and Film-Related Search Index

Predicting box office receipts for a motion picture release, whether for opening weekend or an entire theatrical run, is anything but an exact science. Leave it to the good folks at Google, those lords of the algorithm, to rely on math rather than fuzzy logic when coming up with a better formula for “tracking”, as the practice of box office prognostication is often called. Last Thursday Google released a white paper titled “Quantifying Movie Magic with Google Search”, which claims the company can predict box office grosses for movies four weeks before their release with 94% accuracy.

As a white paper, the document does its job rather effectively and can hardly be faulted; it favorably promotes Google’s products and services through the use of carefully chosen facts and statistics, all in the guise of a well-researched report. Its publication served its promotional purpose, with industry and technology publications regurgitating Google’s findings in their own reporting. Few, if any, media outlets took the time to read between the lines and examine the facts being presented from a more circumspect position. That is my intention here.

Don’t get me wrong, adding the kind of user behavior data Google has at its fingertips should most certainly make predicting box office far more accurate. Rather, I would suggest that Google’s narrow study conveniently produced complimentary results that most industry professionals already know or would rightly assume. In Google’s defense, it is the company’s job to keep reminding potential users and customers of its value, even if certain facts can be deduced via common sense and by observing overall consumer trends.

For instance, Google states that searching online for information about movies increased by 56% from 2011 to 2012. We have to take that percentage at face value, but frankly it doesn’t really matter. Of course more moviegoers are searching for info online: (1) newspapers and magazines are disappearing by the day as their subscribers flock to the Internet, so there are fewer and fewer places to look up showtimes and reviews, (2) more and more people have become Internet users in the same time period, and (3) an influx of smartphones means that more consumers can search for movie information while on the go, even if they don’t have a computer at home. Put another way… it’s a big no duh.

Google’s study examined 99 top box office hits from 2012 and found that increased search activity and paid ad clicks on the company’s products were a good predictor of box office grosses during an opening weekend and in the week or two that followed. On top of this, Google learned that trailer searches are leading indicators of moviegoers’ interest in, and intent to see, a title as much as four weeks in advance of a theatrical release. Some of the facts and methodology that Google uses to make its point, though not necessarily inaccurate, are at times not transparent or must be taken on faith, and thus could be construed as slanted. Here are just a few examples:

  • Google claims that moviegoers consult 13 sources before making a decision on which film to see. They also report that 48% of moviegoers decide on which film to watch the day they purchase tickets. These two figures are drawn from a Google Social Research Study titled “Understanding the Role of Social and Social Media in the Shopper’s Journey, Movie Tickets” from December of 2012. However, several attempts to find the study online proved fruitless.
  • According to Google, moviegoers learn about a film four weeks prior to its release. The claim comes from a Google Consumer Survey titled “The Moviegoer Research Process” from March 3, 2013, which has also been impossible to find online. Even so, anyone working in the film industry could have informed Google of this without the need for a survey, as four weeks out is roughly when most movie marketing campaigns kick into high gear, so the time frame makes sense.

Google’s formula for calculating a movie’s opening weekend box office the day before its release with 92% accuracy relies on a linear regression model. Such models can take one or more input variables to calculate their predictions; Google’s model included four:

  • Search query volume the week prior to release
  • Ad click volume the week prior to release
  • Number of theatres a film played in (i.e. screen count)
  • Franchise status

The first two variables are proprietary to Google. Screen counts can be found online, and as for franchise status, it appears Google came up with its own classification for whether a film was a “Tier A Franchise”, such as James Bond, or a tentpole “Midnight” movie like “Hunger Games”. What’s remarkable is Google’s finding that “70% of the variation in box office performance can be explained with search query volume.” To Google’s credit, this underscores the value and high usage of the company’s search product.
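For readers who want to see what “explaining variation” looks like in practice, here is a minimal sketch of a linear regression fitted by ordinary least squares, scored with R² (the usual measure of explained variation). Everything in it is an assumption for illustration: Google published neither its data nor its coefficients, so the 99 films below are randomly generated, the variable scales are made up, and whether Google’s “accuracy” figures are R² values is not something the white paper lets us confirm.

```python
import random

def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_ols(rows, y):
    """Ordinary least squares via the normal equations; returns [intercept, coef, ...]."""
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * yi for x, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)

def predict(beta, row):
    return beta[0] + sum(b * v for b, v in zip(beta[1:], row))

def r_squared(beta, rows, y):
    """Share of the variation in y that the fitted model accounts for."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - predict(beta, r)) ** 2 for r, yi in zip(rows, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

# Synthetic stand-in data for 99 films. Every number here is invented.
rng = random.Random(0)
films, gross = [], []
for _ in range(99):
    search = rng.uniform(0, 10)            # search query volume index (made up)
    clicks = rng.uniform(0, 10)            # ad click volume index (made up)
    theatres = rng.uniform(1, 4)           # theatre count in thousands (made up)
    franchise = float(rng.random() < 0.3)  # 1.0 for a franchise title, else 0.0
    films.append([search, clicks, theatres, franchise])
    gross.append(2 * search + clicks + 3 * theatres + 8 * franchise + rng.gauss(0, 3))

search_only = [[f[0]] for f in films]
r2_search = r_squared(fit_ols(search_only, gross), search_only, gross)
r2_full = r_squared(fit_ols(films, gross), films, gross)
print(f"R^2 using search volume alone: {r2_search:.2f}")
print(f"R^2 using all four variables:  {r2_full:.2f}")
```

On the data a model was fitted to, adding variables can only hold R² steady or raise it, so a high figure by itself says little about how a model would fare on films it has never seen. That is one reason the missing numbers matter.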

Google then turned to a second predictive model, one which might be considered a tad more self-serving and promotional in nature. When calculating the potential earnings for holdover films in the weeks after their initial release, Google altered the variables it considered significant indicators to the following:

  • Ad click volume (Monday to Thursday of opening week)
  • Number of theatres a film played in (i.e. screen count)
  • Previous weekend performance
  • Rotten Tomatoes audience score

In this formula, the last three variables are publicly available figures. Only the ad click volume is proprietary to Google. It’s probably the most important value to Google since the company makes most of its revenue through paid search advertising. As Google put it:

Our hypothesis is that once a film has opened, search ad clicks are a strong sign of intent to purchase a ticket, whereas the intent associated with a search query is more varied. The significance in our model of search ad click volume during weeks 2 and beyond for a film illustrates the importance of search marketing presence beyond opening weekend.

Let me provide one potential translation of the above excerpt, albeit a pessimistic one: “Hey Hollywood movie studios and film distributors, just because your release is already in theatres doesn’t mean you should pull all of your search engine marketing. In fact, it probably pays to increase your search advertising budgets after opening weekend.” It goes without saying that the beneficiary of most search engine marketing dollars happens to be Google.

Google was quick to point out the limited usefulness of having accurate tracking on a film just 24 hours before it hits theaters or in the weeks that follow, writing:

“While an accurate opening weekend forecast calculated the day before premiere is certainly a valuable data point for planning post-release marketing strategy, it doesn’t leave movie marketers with very much time to react. Fortunately, Google (and YouTube) search data gives a great indication of where a movie is headed as early as four weeks from release week.”

This led them to produce a third model which, despite its longer lead time, turns out to be two percentage points more accurate than the day-before opening weekend model (94% versus 92%). More impressive is that the regression model relies on only three variables:

  • Trailer-related search volume four weeks prior to release
  • Franchise status
  • Seasonality

Google came up with its own methodology for both franchise status (as described above) and seasonality, the latter depending on whether a film was released during the summer or a major holiday season such as Christmas. Trailer-related search volume, however, is a figure only Google would have access to. With an accuracy rate of 94%, the company proposes that trailer searches four weeks before a film’s opening weekend “signify strong intent” in the minds of moviegoers.
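Franchise status and seasonality are categories rather than quantities, so a regression has to take them in as 0/1 indicator values. A rough sketch of how the three inputs might be turned into one row of model input follows. The `Film` class, the film titles, the search index values and, above all, the date cutoffs in `is_peak_season` are my own hypothetical stand-ins; Google does not disclose how it drew these lines.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class Film:
    title: str
    release: date
    trailer_searches: float  # trailer search index four weeks out (hypothetical units)
    is_franchise: bool       # Google's own classification; treated here as a given

def is_peak_season(release: date) -> bool:
    """Assumed rule, not Google's: May through August, or November 15 onward."""
    if 5 <= release.month <= 8:
        return True
    return (release.month, release.day) >= (11, 15)

def design_row(film: Film) -> list:
    """One row of regression input: a number plus two 0/1 indicators."""
    return [
        film.trailer_searches,
        float(film.is_franchise),
        float(is_peak_season(film.release)),
    ]

# Two invented films, purely for illustration.
sequel = Film("Hypothetical Sequel", date(2012, 7, 4), 7.5, True)
drama = Film("Hypothetical Drama", date(2012, 3, 9), 2.0, False)
print(design_row(sequel))
print(design_row(drama))
```

Shift either cutoff by a few weeks and some films change category, which changes the fitted model. Without Google’s actual definitions there is no way to reproduce its 94% figure.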

As previously mentioned, none of these models is necessarily wrong, but without the actual numbers behind some of the variables their formulas are completely opaque. I don’t doubt that examining moviegoers’ online activity, combined with additional analytics, could prove more accurate and useful in tracking the marketing of film releases than current practices. (It should be noted that Google never describes how tracking is presently conducted.) Hopefully Google will provide access to its own metrics (if it doesn’t already), even if it’s only to authenticated entities, to better arm the industry in spreading the word about new releases. Who knows, based on the final sentence in the white paper, the company just might be considering such a move:

Ultimately, it is this online engagement that gives us tangible insight into intent, arming movie marketers with actionable data in their never-ending quest to quantify ‘movie magic.’

J. Sperling Reich