Edit: the GitHub repository is here: https://github.com/mboyanov/propaganda-deteciton
A social network is characterized by the links between its nodes. We employed six different link analysis algorithms to rank the nodes in the network by their importance. For the task of leader detection, the best link analysis algorithm proved to be vanilla PageRank. Of its top 2000 ranked nodes, 1725 are leaders, which gives a precision of 0.8625 (1725/2000) and a recall of 0.8792.
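To make the ranking and scoring steps concrete, here is a minimal sketch of PageRank by power iteration, followed by precision and recall over the top-k nodes. It is not the code from the repository: the directed edge list, the damping factor of 0.85, the iteration count, the toy graph, and the function names `pagerank` and `precision_recall_at_k` are all assumptions made for illustration.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank over a list of directed (source, target) edges."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = {node: [] for node in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing links is spread evenly over all nodes.
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        new_rank = {node: (1.0 - damping) / n + damping * dangling / n for node in nodes}
        for src in nodes:
            for dst in out_links[src]:
                new_rank[dst] += damping * rank[src] / len(out_links[src])
        rank = new_rank
    return rank


def precision_recall_at_k(rank, leaders, k):
    """Treat the k highest-ranked nodes as the predicted leaders."""
    top_k = sorted(rank, key=rank.get, reverse=True)[:k]
    hits = sum(1 for node in top_k if node in leaders)
    return hits / k, hits / len(leaders)


# Toy graph (hypothetical data): five followers link to one of two hubs.
edges = [("a", "hub1"), ("b", "hub1"), ("c", "hub1"),
         ("d", "hub2"), ("e", "hub2"), ("hub2", "hub1")]
scores = pagerank(edges)
precision, recall = precision_recall_at_k(scores, leaders={"hub1", "hub2"}, k=2)
```

On the real graph the same scoring would be called with k=2000 and the set of known leaders. For a large network, a library implementation of PageRank is likely a better choice than this pure-Python loop.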
We also explored an alternative solution based on embeddings. We trained a skip-gram model in which the context of a given node is its neighbourhood. Because the model assumes a meaningful word order, which a neighbourhood does not have, we repeated and reshuffled longer neighbourhoods. These embeddings were then used to train a softmax classifier, which achieved a macro-F1 score of 0.6667.
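The repeat-and-reshuffle idea can be sketched as follows. Each node's neighbourhood becomes one or more "sentences" whose order is randomized, so that no single ordering determines which neighbours share a context window. The adjacency-dict input, the `window` threshold, the rule of one shuffled copy per `window` neighbours, and the name `neighbourhood_sentences` are assumptions for illustration; the post does not state the exact settings used.

```python
import random


def neighbourhood_sentences(adjacency, window=5, seed=0):
    """Turn each node's neighbourhood into shuffled 'sentences' for skip-gram training."""
    rng = random.Random(seed)
    sentences = []
    for node, neighbours in adjacency.items():
        # Assumed heuristic: one shuffled copy per `window` neighbours, at least one.
        repeats = max(1, len(neighbours) // window)
        for _ in range(repeats):
            shuffled = list(neighbours)
            rng.shuffle(shuffled)
            # The node itself leads the sentence, followed by its reshuffled neighbours.
            sentences.append([node] + shuffled)
    return sentences


# Toy adjacency lists (hypothetical data).
adjacency = {
    "u1": ["u2", "u3"],
    "u2": ["u1"],
    "u3": ["u1", "u2", "u4", "u5", "u6", "u7", "u8", "u9", "u10", "u11"],
}
sentences = neighbourhood_sentences(adjacency)
```

The resulting lists could then be passed to any skip-gram implementation, for example gensim's `Word2Vec` with `sg=1`, and the learned vectors fed to a softmax (multinomial logistic regression) classifier.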
We further provide a visualization tool that can be used to explore the graph manually.