Facebook Prophet - Forecasting library

It all started yesterday morning when, on my way to work, I saw multiple tweets mentioning a newly published forecasting library:

[Image: tweet_03.jpeg]

Sounds interesting, I thought. I bookmarked the link for “weekend fun with code” and moved on. The minute I stepped into the office, Amazon S3 had an outage (coincidence?) which impacted half of the internet, KBC included. OK, what can I do now then?

I opened the link to the Facebook engineering page and started reading about the forecasting module. The instructions they supplied were quite simple, and I was tempted to test it out. Wouldn't it be great to use it in some KBC projects?

Since the code needed for forecasting is pretty simple, I mocked up a script suitable for KBC use before lunch, and when Amazon (US-East) got back up, I could implement the code as a custom science app.

The algorithm requires two columns: a date column and a value column. The current script reads the source and result table information from the input and output mapping, along with the parameters specified by the user. Those parameters define:

  • Date column name

  • Value column name

  • Required prediction length (period)

This is how it looks in Keboola:

[Image: custom_science.png]
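For readers who want a feel for what such a script does, here is a minimal sketch. It is not the actual app code. The config key names (`date_column`, `value_column`, `period`) and the helper functions are hypothetical, made up for illustration. What Prophet itself does require is an input frame with the columns `ds` (date) and `y` (value), so the sketch renames the user's columns accordingly before fitting.

```python
import csv
import io


def read_parameters(config):
    """Pull the three user parameters out of a config dict (key names are hypothetical)."""
    params = config["parameters"]
    return params["date_column"], params["value_column"], int(params["period"])


def to_prophet_rows(csv_text, date_column, value_column):
    """Rename the user's date/value columns to the `ds` and `y` names Prophet expects."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [{"ds": row[date_column], "y": float(row[value_column])} for row in reader]


def forecast(rows, period):
    """Fit Prophet on the history and predict `period` steps past its end."""
    # Imported lazily so the helpers above work without pandas/Prophet installed.
    import pandas as pd
    from prophet import Prophet  # older releases: `from fbprophet import Prophet`

    history = pd.DataFrame(rows)
    history["ds"] = pd.to_datetime(history["ds"])
    model = Prophet()
    model.fit(history)
    future = model.make_future_dataframe(periods=period)  # daily frequency by default
    prediction = model.predict(future)
    # yhat is the point forecast; yhat_lower/yhat_upper bound the uncertainty interval.
    return prediction[["ds", "yhat", "yhat_lower", "yhat_upper"]]
```

Calling `forecast(to_prophet_rows(csv_text, "day", "sales"), 30)` would return the fitted values for the history plus 30 forecasted days, assuming a table with `day` and `sales` columns.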

To see the output in a visual form, I used Jupyter, which was recently integrated into KBC. Not bad for a day’s work, what do you say?

[Image: chart.png]


Just imagine how easy it would be for our users to orchestrate the forecasting process:

  1. Extract sales data

  2. Run forecasting

  3. Enrich data with forecasted values

  4. Publish them to sales and marketing teams

[Image: orchestration.png]
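The enrichment step (3) could be as simple as joining the forecast back onto the sales data by date. Here is a rough sketch, assuming both tables have already been loaded as lists of dicts with Prophet-style keys (`ds`, `y`, `yhat`). The function name and the output field names are hypothetical.

```python
def enrich_with_forecast(sales_rows, forecast_rows):
    """Attach actual sales to each forecast row by date; future dates get actual=None."""
    actual_by_date = {row["ds"]: row["y"] for row in sales_rows}
    return [
        {"ds": f["ds"], "actual": actual_by_date.get(f["ds"]), "forecast": f["yhat"]}
        for f in forecast_rows
    ]
```

Dates that exist only in the forecast come out with `actual` set to `None`, which makes it easy for a downstream report to tell history from prediction.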

Notes:

  • The sample data I used sucks. I bet yours will be better!
  • Here is the link for the Jupyter notebook.
  • Feel free to check some other custom science apps I did: https://bitbucket.org/VFisa/

Where Prophet shines (from Facebook page)

Not all forecasting problems can be solved by the same procedure. Prophet is optimized for the business forecast tasks we have encountered at Facebook, which typically have any of the following characteristics:
  • hourly, daily, or weekly observations with at least a few months (preferably a year) of history
  • strong multiple “human-scale” seasonalities: day of week and time of year
  • important holidays that occur at irregular intervals that are known in advance (e.g. the Super Bowl)
  • a reasonable number of missing observations or large outliers
  • historical trend changes, for instance due to product launches or logging changes
  • trends that are non-linear growth curves, where a trend hits a natural limit or saturates

Martin Fiser (Fisa)

Keboola, Vancouver, Canada

Twitter: @VFisa