Popular comments by caseyp
ShopUp Datathon2020 – Article recommender case
Hi guys, I wasn’t able to find the output.csv prediction file. Do you know where I can find it? I tried to understand the code (I don’t have much experience with Python), and based on my limited understanding, the predictions are in the array called “vector”. But that array sometimes seems to hold multiple predictions. I am likely not understanding something.
Case NetInfo/Vesti.bg article recommendation — Team Army of Ones and Zeroes — Datathon 2020
We didn’t have time to test that. So at this point it is just a hypothesis (guess) that needs to be tested. To be clear, we are somewhat confident that it will NOT help FOR THIS PARTICULAR DATASET. (There are clearly many scenarios where it will.)
The way to test it would be to erase the last day of data (or the last 2 days, or 3, etc.) and use the erased data for evaluation.
The only other thing needed is some model for what “types of articles users like”. For that we can use words from the title of the article, or the natural categories that vesti.bg has (like coronavirus, bulgaria, sviat, etc), or some other way to model user preference.
With the above two, one can measure the difference in the objective (accurately predicted user article visits for the next 24 hours) and see which approach is better and by how much (and whether the difference is statistically significant).
We didn’t have time to do all of that 🙂
But we did estimate/guess/wager that it won’t significantly improve our prediction score, so we prioritized it lower and ultimately didn’t do it.
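For anyone who wants to try the test described above, here is a minimal sketch in Python (not the team's Perl code). It holds out the last day of visits, builds a plain popularity recommender and a hypothetical category-preference recommender from the remaining days, and counts how many held-out visits each one predicts. The visit log, the record layout, and all function names are assumptions made up for illustration; the resulting numbers say nothing about the real Vesti.bg data.

```python
from collections import Counter, defaultdict

# Made-up visit log: (user_id, article_id, category, day). Purely illustrative.
VISITS = [
    ("u1", "a1", "coronavirus", 1),
    ("u2", "a1", "coronavirus", 1),
    ("u3", "a1", "coronavirus", 1),
    ("u1", "a2", "coronavirus", 2),
    ("u2", "a3", "sviat", 1),
    ("u2", "a4", "sviat", 2),
    ("u3", "a3", "sviat", 2),
    ("u3", "a5", "bulgaria", 2),
    ("u1", "a2", "coronavirus", 3),
    ("u2", "a4", "sviat", 3),
    ("u3", "a1", "coronavirus", 3),
    ("u2", "a1", "coronavirus", 3),
]


def split_by_last_days(visits, holdout_days=1):
    """Erase the last `holdout_days` days and return (train, held_out)."""
    last_day = max(day for _, _, _, day in visits)
    cutoff = last_day - holdout_days
    train = [v for v in visits if v[3] <= cutoff]
    held_out = [v for v in visits if v[3] > cutoff]
    return train, held_out


def popularity_recommender(train, k):
    """Recommend the same k most-visited training articles to every user."""
    counts = Counter(article for _, article, _, _ in train)
    # Ties are broken by article id so the ranking is deterministic.
    top = sorted(counts, key=lambda a: (-counts[a], a))[:k]
    return lambda user: top


def category_recommender(train, k):
    """Rank articles by the user's category counts, then by popularity."""
    counts = Counter(article for _, article, _, _ in train)
    category_of = {article: cat for _, article, cat, _ in train}
    prefs = defaultdict(Counter)  # user -> category -> number of visits
    for user, _, cat, _ in train:
        prefs[user][cat] += 1

    def recommend(user):
        liked = prefs[user]  # empty Counter for users unseen in training
        ranked = sorted(
            counts, key=lambda a: (-liked[category_of[a]], -counts[a], a)
        )
        return ranked[:k]

    return recommend


def count_hits(recommend, held_out):
    """Count held-out (user, article) visits that were recommended."""
    actual = defaultdict(set)
    for user, article, _, _ in held_out:
        actual[user].add(article)
    return sum(len(seen & set(recommend(user))) for user, seen in actual.items())


train, held_out = split_by_last_days(VISITS, holdout_days=1)
for name, build in [("popularity", popularity_recommender),
                    ("category", category_recommender)]:
    print(name, count_hits(build(train, k=2), held_out))
```

Both recommenders here only rank articles that already appear in the training days, so articles first published during the held-out day can never be hit; a real test would need a candidate list for that day. The sketch also stops at raw hit counts. Checking statistical significance would require repeating the comparison over several held-out periods or users.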
Case NetInfo/Vesti.bg article recommendation — Team Army of Ones and Zeroes — Datathon 2020
Hi zenpanik,
The copy/paste lost the formatting and pictures. We wrote everything on a Google Doc (see below) which has pictures and fixed-font formatting. But even with the formatting you are right – Perl is hard to read. We tried to add comments, but the main goal is to make it work, and Perl is very fast and easy for prototyping (at least for old-timers like me :-).
Here are the links to the formatted Google Doc and the PDF:
https://docs.google.com/document/d/186Bcv4DbrYLY7m3ZCeGji9QM7DAjm0arY2doV5TTNAs/edit?usp=sharing
Also in PDF format:
https://drive.google.com/file/d/1eFiw4lqNtMXAbQZLwhLPBCOyODPQY4oU/view?usp=sharing
Case NetInfo/Vesti.bg article recommendation — Team Army of Ones and Zeroes — Datathon 2020
I couldn’t figure out how to include the graphs in this document. Here is a link to the Google Drive version that has the graphs: https://docs.google.com/document/d/186Bcv4DbrYLY7m3ZCeGji9QM7DAjm0arY2doV5TTNAs/edit?usp=sharing
Also in PDF format:
https://drive.google.com/file/d/1eFiw4lqNtMXAbQZLwhLPBCOyODPQY4oU/view?usp=sharing