Yet we all spend 95% of our analytics time trying to go from simple to completely overcomplicated, without stopping to ask the question…
Is it fucking worth it?
This blog is about one of the dead simple models we use at foody, which is actually awesome. I am writing it today because… well… it’s just today that I realized (or better calculated) that the results of this model are much better than we thought.
One of the things we would love to do is build a prediction model of how many orders foody will have each day. Once that is built, you can boil the model down per city, or even per restaurant.
This report allows you to spot any potential significant changes early and try to understand what caused them.
Now, understanding what caused them is very complex, and probably a fool's errand, as there are so many different variables interacting at each point:
Foody-related variables: Facebook ads, Google ads, TV ads, radio ads, magazine ads, online articles, PR of our team, presentations, new restaurants on foody, offers of restaurants, changes in menu.
Non-foody-related variables: day of the week, week of the month, month of the year, weather, football matches (each team has a different effect on foody, and that effect is different if it plays home or away), special events like Eurovision, new restaurants launching that are not on foody (dine-in or takeaway included), changes in marketing budget and special offers for restaurants not on foody, whether it's wedding season, whether it's festival season, whether it's exam season for students, whether students from abroad are in Cyprus, and many, many, many more.
So building a model that incorporates the above factors (and many more) is a truly mouthwatering prospect for an analytics nut. For the last year we have been flirting with the idea of giving the data to a university student and have them build this model as their dissertation.
Given that this is a really attractive project if you are into statistics, we might go ahead and do it at some point in the future, but the truth of the matter is that it will be a whole lot of work to achieve a small result.
Why is that? Because the current simple model we have rocks.
How much does it rock? Well, 92% :)
The Current Model
We currently use a very simple model that takes in only the following figures:
This morning, I examined the results of this model for the last 394 days. From these 394 days I removed 16 instances which are clear outliers (like Christmas Day, the 15th of August, etc.) and calculated the average absolute percentage difference between expected and actual orders.
It turns out that it’s 7.98%.
This simple model, based on elementary school math, can predict the orders we will have each day with an average error of only 7.98%.
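For the curious, here is a rough sketch of how an error figure like this can be calculated. It is not our actual code: the function name, the data layout and the numbers are all made up for illustration, and it assumes the difference is measured as a percentage of the actual orders (measuring it against the expected orders would work just as well).

```python
from datetime import date


def mean_abs_pct_error(records, outlier_days=frozenset()):
    """Average absolute difference of expected vs actual, in percent.

    records: iterable of (day, expected_orders, actual_orders).
    outlier_days: days to leave out (e.g. public holidays).
    Assumption: the error is taken relative to the actual orders.
    """
    errors = [
        abs(expected - actual) / actual
        for day, expected, actual in records
        if day not in outlier_days and actual > 0
    ]
    if not errors:
        raise ValueError("no usable days left after removing outliers")
    return 100.0 * sum(errors) / len(errors)


# Invented numbers, purely for illustration (not foody data).
records = [
    (date(2000, 12, 23), 90, 100),   # 10% off
    (date(2000, 12, 24), 220, 200),  # 10% off
    (date(2000, 12, 25), 100, 20),   # clear outlier, removed below
    (date(2000, 12, 26), 50, 50),    # spot on
]
outliers = {date(2000, 12, 25)}

error = mean_abs_pct_error(records, outliers)
print(f"Average error: {error:.2f}%")
```

Taking the absolute value before averaging matters: otherwise days where the model guessed too high and days where it guessed too low would cancel each other out and make the model look better than it is.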
Even though the model completely ignores the 50+ significant variables that affect the actual figure, it does an amazing job predicting the total number of orders we will get at foody every day.
So let's all say “Hail the simple model”. Hail!
If you are Netflix and are trying to improve your algorithm then even a small change can make a big difference. But if you are a startup not yet at the size of Netflix (bummer) you should be looking for awesome simple models instead of trying to build a 200-variable model.
In one of our first presentations with Michael, we said that as a startup you should keep your eyes open for elephants.
That means that you should try and find big value opportunities and ride them, which is another way of saying that you should use the 80-20 rule as much as you can :)