Business Understanding

Telenor wants to identify social network leaders from a list of A and B nodes, their connection counts, and their connection strengths. A second goal is to characterize a node's likelihood of turning "bad" from its relationship to a list of truly bad nodes.

Data Understanding

We have 118 690 unique nodes with 1 […]
Social networks are characterized by the links between their nodes. We applied six link analysis algorithms to rank the nodes in the network by importance. For leader detection, vanilla PageRank performed best: of its top 2000 nodes, 1725 are leaders, giving a precision of 0.8625 and a recall of 0.8792.
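To make the ranking and the reported metrics concrete, here is a minimal sketch of PageRank by power iteration, followed by precision and recall over the top-k nodes. It is not our production code. The edge direction (A points to B), the damping factor of 0.85, the iteration count, and the function names `pagerank` and `precision_recall_at_k` are all assumptions made for illustration, and connection counts and strengths are ignored here.

```python
from collections import defaultdict


def pagerank(edges, damping=0.85, iterations=50):
    """Unweighted PageRank by power iteration over (source, target) edges."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = defaultdict(list)
    for src, dst in edges:
        out_links[src].append(dst)

    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes with no outgoing links is spread uniformly.
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {node: base for node in nodes}
        for src in nodes:
            targets = out_links[src]
            if targets:
                share = damping * rank[src] / len(targets)
                for dst in targets:
                    new_rank[dst] += share
        rank = new_rank
    return rank


def precision_recall_at_k(rank, leaders, k):
    """Precision and recall of the k highest-ranked nodes against known leaders."""
    top_k = sorted(rank, key=rank.get, reverse=True)[:k]
    hits = sum(1 for node in top_k if node in leaders)
    return hits / k, hits / len(leaders)


# Toy graph (assumed data): three nodes point at "a", which points back at "b".
toy_edges = [("b", "a"), ("c", "a"), ("d", "a"), ("a", "b")]
toy_rank = pagerank(toy_edges)
toy_precision, toy_recall = precision_recall_at_k(toy_rank, {"a"}, k=1)
```

With the figures above, precision is simply 1725 / 2000 = 0.8625; recall divides the same 1725 hits by the total number of labelled leaders instead.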
We also explored an alternative solution based on embeddings. We trained a skip-gram model in which the context of a given node is its neighbourhood. Because the model assumes a word order that a neighbourhood does not have, we repeated and reshuffled the longer neighbourhoods. The resulting embeddings were then used to train a softmax classifier, which achieved a macro-F1 score of 0.6667.
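One way the repeat-and-reshuffle step could look is sketched below. This is an illustration under assumptions rather than the exact procedure we used: the function name `neighbourhood_sentences`, the window size of 5, the rule of one shuffled copy per window-length of neighbours, and placing the node at the start of each pseudo-sentence are all hypothetical choices.

```python
import random


def neighbourhood_sentences(adjacency, window=5, seed=0):
    """Build shuffled pseudo-sentences (node followed by its neighbours) for skip-gram."""
    rng = random.Random(seed)
    sentences = []
    for node, neighbours in adjacency.items():
        neighbours = list(neighbours)
        if not neighbours:
            continue
        # A neighbourhood that fits in one window is emitted once; a longer one
        # is repeated ceil(len / window) times, each copy in a fresh random order,
        # so no single ordering decides which neighbours fall inside the window.
        repeats = -(-len(neighbours) // window)
        for _ in range(repeats):
            rng.shuffle(neighbours)
            sentences.append([node] + neighbours)
    return sentences


# Toy adjacency lists (assumed data).
toy_adjacency = {
    "a": ["b", "c"],
    "x": [f"n{i}" for i in range(12)],
}
toy_sentences = neighbourhood_sentences(toy_adjacency)
```

The resulting lists can be passed as the training corpus to any skip-gram implementation; the shuffling only changes the order of neighbours within a copy, never which neighbours a node has.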
We further provide a visualization tool that can be used to explore the graph manually.