Causal Impact of a TV documentary on the demand for KeepCup

A few months ago the ABC aired War on Waste, a documentary about how much waste we produce. The episode revealed that Australians use about 28 disposable takeaway coffee cups per second, prompting many viewers to consider alternatives such as reusable cups.

The story did not mention any specific brand of reusable cup, but it had a marked effect on Google searches for the most popular brand: KeepCup. Historical data from Google Trends shows that searches for reusable coffee cups skyrocketed after the episode, as did searches for KeepCup specifically.

I will use the R package CausalImpact, developed by Google, to quantify the uplift in demand for KeepCups after the episode went to air. The package uses Bayesian structural time-series models to estimate what the search trends would have looked like if the documentary had never aired, and then compares that estimate with what actually happened. I will use weekly search index data from Google Trends as a proxy for demand, since search trends are usually highly correlated with sales of a product.
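To make the counterfactual idea concrete, here is a minimal sketch on simulated data. It deliberately swaps the Bayesian structural time-series model for a plain linear regression on a control series, so it shows the logic of the comparison rather than what CausalImpact does internally. Every name and number in it (control, target, true_uplift and so on) is made up for illustration and has nothing to do with the Google Trends data.

```r
# Simulated data only: none of these values come from Google Trends.
set.seed(1)
n_pre  <- 100   # hypothetical number of pre-intervention weeks
n_post <- 20    # hypothetical number of post-intervention weeks
n      <- n_pre + n_post

# A control series and a target series that depends on it
control <- 20 + rnorm(n, sd = 1)
target  <- 10 + 1.5 * control + rnorm(n, sd = 1)

# Inject a known uplift into the post-intervention weeks
true_uplift <- 15
post <- (n_pre + 1):n
target[post] <- target[post] + true_uplift

# Learn the target/control relationship on the pre-intervention weeks only
pre_data <- data.frame(target = target[1:n_pre], control = control[1:n_pre])
fit <- lm(target ~ control, data = pre_data)

# Counterfactual: what the target would have been without the intervention
counterfactual <- predict(fit, newdata = data.frame(control = control[post]))

# Estimated effect: average gap between what happened and the counterfactual
estimated_uplift <- mean(target[post] - counterfactual)
estimated_uplift
```

The estimate should land close to the injected uplift of 15. CausalImpact follows the same fit-on-the-past, predict-the-future, compare pattern, but with a time-series model that also handles trend and seasonality and returns a full posterior distribution rather than a single number.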


# Load the packages used for reading and plotting the data
library(readr)
library(ggplot2)

# Download last 5 years of KeepCup weekly search trends data in Australia from Google Trends
# Use the term "reusable coffee cup" as control
searchTrends <- read_csv("data/keepcup_Google_trends.csv", 
                         col_types = cols(week = col_date(format = "%Y-%m-%d")))

# Plot data
ggplot() +
  geom_line(data = searchTrends, aes(x = week, y = keep_cup_index, color = "Keep Cup")) +
  geom_line(data = searchTrends, aes(x = week, y = reusable_coffee_cup_index, color = "Reusable Coffee Cup")) +
  scale_x_date(date_labels = "%b-%Y", date_breaks = "1 year") + 
  xlab("Week") + ylab("Search Index") +
  labs(title="Keep Cup Search Trends in Australia",
       caption="Source: Google Trends") +
  theme(legend.position = "bottom")

Since 2012, the term “KeepCup” has been considerably more popular than “Reusable Coffee Cup”, with a small upward trend since January 2016. The data prior to the end of May 2017 serves as the baseline for the analysis, and the generic term “Reusable Coffee Cup” is used to control for overall interest in and demand for reusable cups.

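# Transform data into a weekly time series (260 weeks starting Sunday, 2 September 2012)
# CausalImpact treats the first column as the response and the remaining columns as controls
library(zoo)
time.points <- seq.Date(as.Date("2012-09-02"), by = 7, length.out = 260)
data <- zoo(searchTrends[, 2:3], time.points)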

# To estimate a causal effect, we begin by specifying which period in the data should be used for training the model 
# (pre-intervention period) and which period for computing a counterfactual prediction (post-intervention period).
# Episode 3 of War on Waste aired at 8.30pm on Tuesday 30 May, so our post period starts from Sunday 28 May: the data
# is weekly, and we want to capture all search activity that occurred right after the episode went to air.
pre.period <- as.Date(c("2012-09-02", "2017-05-21"))
post.period <- as.Date(c("2017-05-28", "2017-08-20"))

# Build the model using KeepCup and control search index, with 5000 iterations and yearly seasonality of 52 weeks
library(CausalImpact)
impact <- CausalImpact(data, pre.period, post.period, model.args = list(niter = 5000, nseasons = 52))
impact  # print the summary of the posterior inference

Posterior inference {CausalImpact}

                         Average      Cumulative
Actual                   68           883       
Prediction (s.d.)        39 (3.3)     508 (42.5)
95% CI                   [33, 46]     [425, 592]
Absolute effect (s.d.)   29 (3.3)     375 (42.5)
95% CI                   [22, 35]     [291, 458]
Relative effect (s.d.)   74% (8.4%)   74% (8.4%)
95% CI                   [57%, 90%]   [57%, 90%]

Posterior tail-area probability p:   2e-04
Posterior prob. of a causal effect:  99.97998%

For more details, type: summary(impact, "report")

The results point to a significant uplift in search queries for “KeepCup”, above and beyond what the volume of search queries for “Reusable Coffee Cup” would predict.
Put into numbers, the expected average search index for “KeepCup” in the absence of the documentary would have been 39 (95% interval [33, 46]). However, the average index actually measured over the post-episode period was 68, a whopping 74% increase (95% interval [57%, 90%])! KeepCup achieved a 74% increase in demand (as measured by search queries) with no spend on media and advertising.
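As a rough check, the headline figures can be reproduced from the rounded values in the summary table with a few lines of base R. This is only a sketch of the arithmetic: the variable names are my own, and because it starts from rounded averages it will not match the unrounded posterior estimates exactly.

```r
# Rounded averages taken from the summary table above
actual_avg    <- 68   # average observed index in the post period
predicted_avg <- 39   # average counterfactual prediction

# Absolute and relative effect
absolute_effect <- actual_avg - predicted_avg        # 29 index points
relative_effect <- absolute_effect / predicted_avg   # about 0.74
round(100 * relative_effect)

# Number of weekly observations in the post period (28 May to 20 August 2017)
post_weeks <- seq(as.Date("2017-05-28"), as.Date("2017-08-20"), by = "week")
length(post_weeks)

# The cumulative column is the weekly average summed over those weeks,
# so dividing the cumulative actual (883) by the week count recovers the average
round(883 / length(post_weeks))
```

The post period contains 13 weekly data points, which is why the cumulative column in the table is roughly 13 times the average column.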

Note: since the metric used here is an index, it doesn’t make sense to report cumulative results. If the metric analysed was unit sales or dollars we could get a total cumulative benefit.

Code and data can be found in this GitHub repo.
