Machine Learning Models

This website uses machine learning models to predict the probability of certain outcomes in the matches listed.

All models are built in-house; none are outsourced. The historical data and features used to calculate the predictions are shared in the various data tables on this site.

Fundamentally, the models work by collating a large set of relevant historical data, using it to train different machine learning techniques, and then testing their predictive performance on later matches that were not part of the training data.
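As a rough illustration of the train-then-test idea, here is a minimal Python sketch that splits match records by date so a model is only ever tested on games played after the ones it learned from. The field names, sample rows and cutoff date are hypothetical and are not the site's actual data or code.

```python
from datetime import date


def chronological_split(rows, cutoff):
    """Return (train, test): train is strictly before cutoff, test is on or after it."""
    train = [r for r in rows if r["match_date"] < cutoff]
    test = [r for r in rows if r["match_date"] >= cutoff]
    return train, test


# Made-up sample rows (hypothetical schema)
rows = [
    {"match_date": date(2023, 4, 1), "disposals": 27},
    {"match_date": date(2023, 4, 8), "disposals": 22},
    {"match_date": date(2023, 4, 15), "disposals": 31},
    {"match_date": date(2023, 4, 22), "disposals": 24},
]

train, test = chronological_split(rows, cutoff=date(2023, 4, 15))
```

Splitting by date rather than at random keeps information from later matches out of the training set, which would otherwise make the measured performance look better than it really is.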

For example, to calculate the probability that Bailey Smith will amass 25 or more disposals in his next AFL match against Richmond at the MCG, we can build a statistical representation of his past performance by calculating "features". These include his average disposals for the current season, his average time on ground, his average disposals when playing home or away, his performance in wet and dry weather, his average and maximum disposals over the last 3, 5 and 10 games, and other data features that help complete the picture. We do this for many players across many games and feed all of this data into the machine learning model to train it. Once the model has a good enough idea of how these features predict performance, we can ask for a specific prediction, such as Bailey Smith, MCG, daytime start, vs Richmond, and receive back a percentage probability that he will get 25 or more touches.
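To make the idea of "features" more concrete, here is a small Python sketch that turns a list of past disposal counts into season and recent-form features. The function name, feature names and numbers are made up for illustration. The real models use many more features than this.

```python
from statistics import mean


def disposal_features(history, windows=(3, 5, 10), target=25):
    """Build a feature dict from a player's past disposal counts (oldest game first)."""
    features = {"season_avg": mean(history)}
    for w in windows:
        recent = history[-w:]  # last w games, or fewer if the history is short
        features[f"avg_last_{w}"] = mean(recent)
        features[f"max_last_{w}"] = max(recent)
    # Share of past games in which the player reached the target
    features[f"hit_rate_{target}"] = sum(d >= target for d in history) / len(history)
    return features


history = [22, 28, 25, 19, 31, 27, 24, 30, 26, 23]  # made-up disposal counts
features = disposal_features(history)
```

Each player-match combination becomes one row of features like this, paired with the known outcome (did he get 25 or more?), and those rows are what the model is trained on.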

The models are retrained and new predictions are calculated before each round of matches.

The models used for Punters Toolkit predictions are all Python-based ensemble models using gradient boosting and logistic regression. The current performance of each model is also listed so users can decide for themselves how much trust to place in it.
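As a sketch of what an ensemble can look like, the Python below averages the probabilities from two models (often called soft voting) and scores the result with the Brier score, a common measure of how accurate probability predictions are. Both choices are assumptions made for illustration. This page does not say how the models are actually combined or which metrics are published, and the numbers are made up.

```python
def soft_vote(probabilities, weights=None):
    """Weighted average of several models' probabilities for a single prediction."""
    if weights is None:
        weights = [1.0] * len(probabilities)
    return sum(p * w for p, w in zip(probabilities, weights)) / sum(weights)


def brier_score(predicted, outcomes):
    """Mean squared gap between predicted probabilities and 0/1 outcomes (lower is better)."""
    return sum((p - o) ** 2 for p, o in zip(predicted, outcomes)) / len(outcomes)


# Made-up outputs from two hypothetical models for three player-match predictions
gb_probs = [0.70, 0.40, 0.20]  # e.g. a gradient boosting model
lr_probs = [0.60, 0.50, 0.30]  # e.g. a logistic regression model

ensemble = [soft_vote([g, l]) for g, l in zip(gb_probs, lr_probs)]
outcomes = [1, 0, 0]  # 1 = the player reached the target, 0 = he did not
score = brier_score(ensemble, outcomes)
```

A model that always predicted 50% would score 0.25 on the Brier score, so a lower number than that suggests the predictions carry some real information.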

All that said, you should also use your own common sense before deciding whether or not to punt on certain bets. The models are trained on a lot of relevant statistical data and, as the published performance metrics show, they do return reliable predictions. However, if Jack Ginnivan was spotted crunching beers at the London Tavern before a final, that is context the models are not aware of, and you should consider it when deciding whether or not to punt on any aspect of his upcoming performance.

Examples of data points the machine learning models do not capture that may affect on-field performance