We are a team of experts from ShopUp (AI in Retail and eCommerce): Sergi Sergiev and Desislava Nikolova. Our goal was to experiment, play with the data, and share our experience. We hope you enjoy the read, and please vote +.
The main objective of the Article Recommender case is to optimize the suggestions shown to readers of online articles. The final goal is to engage each user with the topics closest to their interests.
The main idea of the case is to predict the next best article for the visitor.
The model is evaluated on the same users, using data from the following time period. The training dataset covers 30 days and contains almost 60,000 articles and over 2,300,000 visitors. The evaluation dataset covers the following single day.
Figure: The chart of the tasks
We looked at different models and shortlisted the most promising ones.
Solution and approach
We decided to create a hybrid recommender focusing on content and user preferences.
For the content side, we wanted to use deep learning to produce unsupervised embeddings based on the text of each article.
The first step is to gather some additional data, so we decided to scrape information from the provided articles. The scraping is done with Selenium and produces an additional file in which we store the important fields of each article: website_link, title, subtitle, text, date_of_posting and hashtags. It is not yet clear which field describes an article best; we can compare them and choose one.
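A minimal sketch of how such a Selenium scraper might look, assuming the article links come with the case data; the CSS selectors and output file name below are placeholders, not the actual site markup:

```python
import csv
from selenium import webdriver
from selenium.webdriver.common.by import By

# Placeholder list; the real article links are provided with the case data.
article_urls = ["https://example.com/article-1"]

driver = webdriver.Chrome()

def grab(selector):
    """Return the first matching element's text, or '' if it is missing."""
    elems = driver.find_elements(By.CSS_SELECTOR, selector)
    return elems[0].text if elems else ""

with open("articles_meta.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=["website_link", "title", "subtitle",
                                           "text", "date_of_posting", "hashtags"])
    writer.writeheader()
    for url in article_urls:
        driver.get(url)
        writer.writerow({
            "website_link": url,
            "title": grab("h1"),              # selectors are hypothetical
            "subtitle": grab("h2"),
            "text": grab("div.article-body"),
            "date_of_posting": grab("time"),
            "hashtags": grab("div.tags"),
        })

driver.quit()
```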
- For the BERT model we use a pre-trained Russian model because of the time limitations
BERT is designed to help computers understand the meaning of ambiguous language in text by using surrounding text to establish context. https://searchenterpriseai.techtarget.com/definition/BERT-language-model
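As an illustration, article embeddings can be extracted from a pre-trained BERT model with the transformers library. The checkpoint name below (DeepPavlov/rubert-base-cased) is our assumption of a suitable Russian model; the write-up does not name the exact one used:

```python
import torch
from transformers import AutoTokenizer, AutoModel

MODEL_NAME = "DeepPavlov/rubert-base-cased"  # assumed Russian BERT checkpoint
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModel.from_pretrained(MODEL_NAME)
model.eval()

def embed(texts):
    """Mean-pool the last hidden state into one vector per text."""
    batch = tokenizer(texts, padding=True, truncation=True,
                      max_length=512, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**batch).last_hidden_state  # (batch, seq, dim)
    mask = batch["attention_mask"].unsqueeze(-1)   # (batch, seq, 1)
    return (hidden * mask).sum(1) / mask.sum(1)    # (batch, dim)

vectors = embed(["Текст статьи ...", "Another article text ..."])
print(vectors.shape)  # torch.Size([2, 768])
```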
- Recurrent neural network (RNN)
A recurrent neural network (RNN) is a type of artificial neural network commonly used in natural language processing (NLP). RNNs are designed to recognize the sequential characteristics of data and use the learned patterns to predict the next likely element. We use this method to predict the next cluster given the history we have for each user, so we can predict that person's future cluster of interest. We reused old code from an article we wrote at a 2019 summer school: https://shopup.me/blog/recommenders_systems/ (a sketch of such a predictor is shown after the figure below).
Figure: Basic RNN
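A minimal Keras sketch of such a next-cluster predictor, assuming each user's history has already been mapped to a sequence of cluster IDs; the cluster count and layer sizes are illustrative assumptions, not the values from our 2019 code:

```python
import numpy as np
from tensorflow.keras import layers, models

N_CLUSTERS = 50   # number of topic clusters (an assumption, tune on the data)
SEQ_LEN = 20      # length of the cluster history fed to the network

model = models.Sequential([
    layers.Input(shape=(SEQ_LEN,)),
    layers.Embedding(input_dim=N_CLUSTERS, output_dim=32),
    layers.SimpleRNN(64),                           # the "basic" RNN cell
    layers.Dense(N_CLUSTERS, activation="softmax"), # distribution over next cluster
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Toy data: each row is a user's sequence of visited clusters; the target
# is the cluster of the next article the user read.
X = np.random.randint(0, N_CLUSTERS, size=(1000, SEQ_LEN))
y = np.random.randint(0, N_CLUSTERS, size=(1000,))
model.fit(X, y, epochs=3, batch_size=64)
```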
The clusters represent the different topics and the users' interests. For the final decision, we use the idea that people with similar interests are likely to enjoy the articles that similar users liked. These clusters make it possible to group people, use some of them for verification, and then improve the suggestions for the others.
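One plausible way to derive such topic clusters (our assumption; the write-up does not specify the clustering method) is k-means over the article embeddings from the BERT step:

```python
import numpy as np
from sklearn.cluster import KMeans

# Stand-in for the BERT article embeddings (n_articles x 768).
embeddings = np.random.rand(1000, 768).astype("float32")

kmeans = KMeans(n_clusters=50, random_state=0, n_init=10).fit(embeddings)

# Every article gets a topic cluster; a user's reading history then becomes
# a history of cluster IDs, which is exactly what the RNN above consumes.
article_cluster = kmeans.labels_
print(article_cluster[:10])
```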
In order to evaluate the performance of each model and thus choose the highest-performing one, a testing functionality is available.
The function makes use of the split into training and test data. We suggest converting the data into a sequence of chosen articles per user, taking the first 20 and predicting the next 10. For ranking we decided to use three metrics that are relevant for measuring ranked results:
- Mean Reciprocal Rank (MRR)
- Mean Average Precision (MAP), considered a classic 'go-to' metric for measuring ranking order
- Distribution coefficient, which we define as the share of correctly selected clusters; for example, 6 out of 10 = 60%
The three functions in the code are mean_av_pres(), distribution_coef() and compute_mrr().
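The function names come from our code, but the bodies below are our illustrative sketch of the three metrics, assuming each prediction is a ranked list scored against the user's actual next items:

```python
def compute_mrr(predicted, actual):
    """Reciprocal rank of the first relevant item in the ranked predictions."""
    for rank, item in enumerate(predicted, start=1):
        if item in actual:
            return 1.0 / rank
    return 0.0

def mean_av_pres(predicted, actual):
    """Average precision of the ranked predictions against the actual items."""
    hits, score = 0, 0.0
    for rank, item in enumerate(predicted, start=1):
        if item in actual:
            hits += 1
            score += hits / rank  # precision at this cutoff
    return score / min(len(actual), len(predicted)) if actual else 0.0

def distribution_coef(predicted_clusters, actual_clusters):
    """Share of correctly selected clusters, e.g. 6 correct out of 10 = 60%."""
    correct = len(set(predicted_clusters) & set(actual_clusters))
    return correct / len(actual_clusters) if actual_clusters else 0.0

# Example: a hit at rank 3 gives MRR 1/3; 2 of 4 clusters found gives 0.5.
print(compute_mrr([5, 3, 9], [9, 1]))
print(distribution_coef([1, 2, 3], [2, 3, 4, 5]))
```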
The following diagram shows what our model looks like. The blue parts are ready; the red ones are not yet.
Figure: Block scheme of the final solution
The code can be viewed and examined here:
This case is a great example of what our future thinking should be: the best personalized options for people who want to learn more about the things they care about.
In the process we faced some problems: the 48-hour time interval is motivating but not quite enough; finding people to work with is also a great challenge; the dataset is big, yet still possible to process on a personal computer; and collaborating with teammates at a distance was also something we had to cope with. However, it was all worth it and a valuable experience.
There were many problems along the way, but overall the chosen solution gives good results. The whole training process does not take long, although that depends on the given time interval.