Machine Learning in Excel with xl8ml! Part 1: Collaborative filtering
Introduction
Almost every day we hear about some new amazing thing developed by Tesla/Apple/Google/Facebook. And very often they say, we’ve used Machine Learning or Deep Learning to develop this innovation.
And you think to yourself, I want to do the same. I have some data, I can use it to improve/optimize my business, and I think Machine Learning can help me. But how do I start, do I need to learn a programming language? Do I learn Python or R maybe?
Well, that was maybe the case before, but you don’t need to be a programmer to use Machine Learning. You can use the tool most of us already use-Microsoft Excel! And it is a lot easier to do ML in Excel with an xl8ml product.
Through a series of articles, I’m going to show you how to use Excel powered with xl8ml to perform different machine learning tasks. In this article, we are going to create a food recommendation system with a collaborative filtering method.
Food recommendations
Let’s imagine you have an app about food recipes. Users can rate each recipe, and you collect rating information in your database. This information can be used to recommend recipes that users might like. This could improve user engagement in your app. But how exactly to utilize rating information for recommendations? Well, the answer is — collaborative filtering!
Collaborative filtering is an algorithm of making recommendations for a user by collecting preferences from other users (collaborators). Simply put, if user A and B have the same taste, user A is more likely to have B’s opinion on some recipe he has not seen yet than that of a randomly chosen user. But let’s see how it works in an example.
In the image below you can see some typical structure of data collecting information about food preferences:
Intuitively we want: sim(John, Mary) > sim(John, Steve)
To do it properly we need to:
- replace missing values with zeros — so we don’t have missing
- subtract each value with a mean value of their row — so missing values are not treated as negatives
- use cosine similarity function
So we’ll end up with something like this:
Now, we need to calculate cosine similarity. And with xl8ml this is really easy, we just use xl_ml_cosine_similarity(v1, v2):
And results show our intuition is correct, Mary’s taste is more similar to John’s than Steve’s taste:
And now we can create predictions for certain user-recipe combinations. Let’s say we want to predict if Mary is going to like chicken. We do that by either:
- taking average rating of other N most similar users that already rated chicken — for this, you can use Excel’s AVERAGE formula
- or, we can use a better solution: weighted average! Which takes into account calculated cosine similarities between users. And this is very simple with xl8ml — we just use following function xl_v_mul(ratings_chicken, similarities_users) / sum(similarities_users)
Conclusion
So we’ve built a recommendation system without writing one line of the code. And that would be a much more tedious task if not for xl8ml. Please check it out at https://www.xl8ml.com/! But this is only a small demonstration of what you can with xl8ml. Besides collaborative filtering, you can use it for other machine learning task, e.g. classification, regression etc. Please check https://www.xl8ml.com/case-studies/ for case studies that show in more detail what can be done with it. With xl8ml everybody can be a Machine Learning Engineer!