
Is Facebook’s “Prophet” the Time-Series Messiah, or Just a Very Naughty Boy?

  • PeterCotton 


A debate rages on page one of Hacker News about the merits of the world’s most downloaded time-series library. Facebook’s Prophet package aims to provide a simple, automated approach to the prediction of a large number of different time series. The package employs an easily interpreted, three-component additive model whose Bayesian posterior is sampled using Stan. In contrast to some other approaches, the user of Prophet might hope for good performance without tweaking a lot of parameters. Instead, hyper-parameters control how likely those parameters are a priori, and the Bayesian sampling tries to sort things out when data arrives.
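For reference, the three components are a trend, a seasonal term, and a holiday term. The symbols below follow the usual presentation of Prophet's model and are not used elsewhere in this post:

```latex
y(t) = g(t) + s(t) + h(t) + \epsilon_t
```

Here g(t) is the trend, s(t) captures periodic seasonality, h(t) captures the effect of holidays, and \epsilon_t is the noise left over.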

Prophet’s Claims, and Lukewarm Reviews

The funny thing, though, is that if you poke around a little you’ll quickly conclude that few people who have taken the trouble to assess Prophet’s accuracy are gushing about its performance. The article by Hideaki Hayashi is somewhat typical, insofar as it tries to say nice things but struggles. Hayashi notes that out-of-the-box, “Prophet is showing a reasonable seasonal trend unlike auto.arima, even though the absolute values are kind of off from the actual 2007 data.” However, in the same breath, the author observes that telling ARIMA to include a yearly cycle turns the tables. With that hint, ARIMA easily beats Prophet in accuracy, at least on the one example he looked at.


Taking Prophet for a Spin

I began writing this post because I was working on integrating Prophet into a Python package I call time machines, which is my attempt to remove some ceremony from the use of forecasting packages. These power some bots on the prediction network (explained at www.microprediction.com if you are interested). How could I not include the most popular time series package?

  • We call m.fit(df) after each and every data point arrives, where m is a previously instantiated Prophet model. There is no alternative, as there is no notion of “advancing” a Prophet model without refit.
  • We make a “future dataframe”, called forecast say, that has k extra rows holding the times when we want predictions to be made, along with any known-in-advance exogenous variables.
  • We call m.predict(forecast) to populate the term structure of predictions and confidence intervals.
  • We call m.plot(forecast) and voila!
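The steps above can be sketched roughly as follows. This is a minimal sketch rather than the code used for this post: the helper names `history_rows` and `refit_and_forecast`, the daily spacing, and the start date are all assumptions made for illustration. It assumes the `pandas` and `prophet` packages are installed, it leaves out exogenous variables, and it creates a fresh model on every call because a Prophet object can only be fit once.

```python
from datetime import datetime, timedelta


def history_rows(ys, start=datetime(2021, 1, 1), step=timedelta(days=1)):
    """Pair each observation with a timestamp, using Prophet's 'ds' and 'y' column names."""
    return [{"ds": start + i * step, "y": float(y)} for i, y in enumerate(ys)]


def refit_and_forecast(ys, k=3):
    """Refit on the full history, then predict k daily steps beyond it."""
    # Third-party imports are kept local so history_rows works without them.
    import pandas as pd
    from prophet import Prophet

    df = pd.DataFrame(history_rows(ys))
    m = Prophet()  # a new model on every call: there is no incremental update
    m.fit(df)
    # The "future dataframe": the history plus k extra daily rows.
    forecast = m.make_future_dataframe(periods=k, freq="D")
    # predict() adds columns such as yhat, yhat_lower and yhat_upper.
    forecast = m.predict(forecast)
    return m, forecast
```

Calling `m, forecast = refit_and_forecast(ys, k=5)` and then `m.plot(forecast)` would draw the fit together with the five-step-ahead predictions and their intervals.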


What’s Going On?

Perhaps we should start by looking at some of Prophet’s bolder predictions.


Reining in Prophet for Better Accuracy

Now, having shown you in-sample data, let’s look at some examples with the truth revealed. You’ll see that some of those wagers made by Prophet do pay out. For example, here’s Prophet predicting the daily cycle of activity in bike sharing stations close to New York City hospitals. It does a nice job of anticipating the dropoff, don’t you think?


  1. Construct an upper bound by adding m standard deviations to the highest data point, plus a constant. Similarly for a lower bound.
  2. If Prophet’s prediction is outside these bounds, use an average of the last three data points instead.
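The two steps above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `rein_in`, the default values of `num_sd` and `constant`, and the use of the population standard deviation of the history are my own choices for the sketch, not necessarily those behind the results shown here.

```python
from statistics import mean, pstdev


def rein_in(history, prediction, num_sd=3.0, constant=0.0):
    """Return the prediction if it is plausible, else the mean of the last three points."""
    sd = pstdev(history)  # assumption: spread measured over the whole history
    upper = max(history) + num_sd * sd + constant
    lower = min(history) - num_sd * sd - constant
    if lower <= prediction <= upper:
        return prediction
    # Out of bounds: fall back to a short moving average.
    return mean(history[-3:])
```

With `history = [1, 2, 3, 4, 5]` the bounds are roughly -3.2 and 9.2, so a prediction of 6 passes through unchanged while a prediction of 100 is replaced by 4, the average of the last three points.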


An Ongoing Assessment, and Elo Ratings

I have begun a more systematic assessment of Prophet, as well as tweaks to the same. As with this post, I’m using a number of different real world time series and analyzing different forecast horizons. The Elo ratings seem to be indicative of Prophet’s poor performance, though I’ll give them more time to bake. However, unless things change, my conclusions are:

  • In keeping with some of the cited work, I find that Prophet is beaten by exponential moving averages at every horizon thus far (ranging from 1 step ahead to 34 steps ahead when trained on 400 historical data points). More worrying, the moving average models aren’t even calibrated: I simply hard-wired two choices of parameter.
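For readers unfamiliar with the baseline, an exponential moving average forecast with a hard-wired parameter looks something like the sketch below. The function names and the example value of `alpha` are hypothetical; the two parameter values actually used in the comparison are not given here.

```python
def ema_forecast(ys, alpha):
    """Exponentially weighted average of ys, with the most recent points weighted most."""
    if not ys:
        raise ValueError("need at least one observation")
    level = ys[0]
    for y in ys[1:]:
        level = alpha * y + (1 - alpha) * level
    return level


def ema_term_structure(ys, alpha, k):
    """Flat forecast: the same smoothed level at each of the next k steps."""
    return [ema_forecast(ys, alpha)] * k
```

For example, `ema_forecast([1, 2, 3], alpha=0.5)` gives 2.25, and that single number is the prediction at every horizon. Nothing is fitted, which is what makes this such a cheap benchmark.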