It all started yesterday morning when, on my way to work, I saw multiple tweets mentioning a newly published forecasting library.
Sounds interesting, I thought. I bookmarked the link for “weekend fun with code” and moved on. The minute I stepped into the office, Amazon S3 had an outage (coincidence?) which impacted half of the internet, and KBC as well. OK, what can I do now then?
I opened the link to the Facebook engineering page and started reading about the forecasting module. They supplied quite simple instructions, and I was tempted to test it out. Wouldn't it be great to use it in some KBC projects?
Since the code needed for forecasting is pretty simple, I mocked up a script suitable for KBC use before lunch, and when Amazon (us-east) got back up, I could implement the code as a custom science app.
The algorithm requires two columns: a date column and a value column. The current script gets the source and result tables’ information from the input and output mapping, plus the parameters specified by the user. Those parameters define:
- Date column name
- Value column name
- Required prediction length (period)
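To make the three parameters concrete, here is a minimal sketch of how they could feed Prophet. It is not the actual app code: the parameter keys (`date_column`, `value_column`, `period`), the column names `order_date` and `revenue`, and the sample rows are all made up for illustration. What is not made up is that Prophet expects its input columns to be named `ds` and `y`, so the user-supplied names have to be mapped onto those first.

```python
import csv
import io


def to_prophet_rows(rows, date_col, value_col):
    """Map the user-named columns onto the `ds`/`y` names Prophet expects."""
    return [{"ds": r[date_col], "y": float(r[value_col])} for r in rows]


def forecast(rows, periods):
    """Fit Prophet on the prepared rows and predict `periods` future days."""
    # Imported lazily so the column-mapping helper works without these packages.
    import pandas as pd
    from prophet import Prophet  # older releases were published as `fbprophet`

    history = pd.DataFrame(rows)
    history["ds"] = pd.to_datetime(history["ds"])
    model = Prophet()
    model.fit(history)
    # make_future_dataframe extends the history by `periods` rows (daily by default).
    future = model.make_future_dataframe(periods=periods)
    return model.predict(future)[["ds", "yhat", "yhat_lower", "yhat_upper"]]


# Hypothetical parameters, standing in for what the user would configure.
params = {"date_column": "order_date", "value_column": "revenue", "period": 30}

# Made-up sample input; a real run needs months of history, not two rows.
sample_csv = "order_date,revenue\n2017-01-01,120.5\n2017-01-02,98\n"
rows = to_prophet_rows(
    csv.DictReader(io.StringIO(sample_csv)),
    params["date_column"],
    params["value_column"],
)
```

With enough history in `rows`, calling `forecast(rows, params["period"])` would return the predicted values together with their uncertainty bounds, ready to be written to the output table.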
This is how it looks in Keboola:
To see the output in a visual form, I used Jupyter, which was recently integrated into KBC. Not bad for a day’s work, what do you say?
Just imagine how easy it would be for our users to orchestrate the forecasting process:
- Extract sales data
- Enrich the data with forecasted values
- Publish the results to the sales and marketing teams
- The sample data I used sucks. I bet yours will be better!
- Here is the link to the Jupyter notebook.
- Feel free to check some other custom science apps I did: https://bitbucket.org/VFisa/
Where Prophet shines (from the Facebook page)
- hourly, daily, or weekly observations with at least a few months (preferably a year) of history
- strong multiple “human-scale” seasonalities: day of week and time of year
- important holidays that occur at irregular intervals that are known in advance (e.g. the Super Bowl)
- a reasonable number of missing observations or large outliers
- historical trend changes, for instance due to product launches or logging changes
- trends that are non-linear growth curves, where a trend hits a natural limit or saturates
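Two of these points, known-in-advance holidays and saturating trends, map onto explicit Prophet options. The following is a rough sketch, not something the app above does: the helper names and the capacity value are hypothetical. It assumes rows already shaped as `ds`/`y` records. Prophet takes holidays as a table with `holiday` and `ds` columns, and logistic growth needs a `cap` column holding the natural limit.

```python
def add_capacity(rows, cap):
    """Attach the saturation limit that logistic growth needs as a `cap` field."""
    return [dict(r, cap=cap) for r in rows]


# Irregular events known in advance, in the `holiday`/`ds` layout Prophet accepts.
SUPER_BOWLS = [
    {"holiday": "superbowl", "ds": "2016-02-07"},
    {"holiday": "superbowl", "ds": "2017-02-05"},
]


def fit_saturating_model(rows, cap, holidays=SUPER_BOWLS):
    """Fit a Prophet model with a logistic trend and custom holidays."""
    # Imported lazily so the helper above works without these packages.
    import pandas as pd
    from prophet import Prophet  # older releases were published as `fbprophet`

    history = pd.DataFrame(add_capacity(rows, cap))
    history["ds"] = pd.to_datetime(history["ds"])
    holiday_frame = pd.DataFrame(holidays)
    holiday_frame["ds"] = pd.to_datetime(holiday_frame["ds"])
    model = Prophet(growth="logistic", holidays=holiday_frame)
    model.fit(history)
    return model


# Hypothetical capacity of 500 units per day.
capped = add_capacity([{"ds": "2017-01-01", "y": 120.5}], 500)
```

When predicting with a logistic trend, the future dataframe needs a `cap` column too, so the limit has to be set on it before calling `predict`.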
Martin Fiser (Fisa)
Keboola, Vancouver, Canada