Critical Outliers – VMware Case

LDA would have been an ideal fit for this problem if the number of topics had been known in advance, since the number of topics is a required input to the model. In my opinion, the best way to estimate it would have been to run some form of segmentation or agglomerative clustering first. Feeding that estimate into the LDA model would likely have given better results. Simply limiting yourself to 5 topics doesn’t give the desired result. While I give you a thumbs-up on the chosen algorithm, I think more thought was needed on finding the optimal input parameters (number of topics).
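
To illustrate the kind of pre-step I have in mind, here is a minimal sketch in plain Python. It is not the submission's code. It assumes whitespace tokenization, cosine distance on term-frequency vectors, average linkage, and a "largest jump in merge distance" rule for choosing where to cut the hierarchy. The function names and the sample documents are hypothetical, and other linkage or cut-off rules may suit the real data better.

```python
import math
from collections import Counter

def tf_vector(text, vocab):
    """L2-normalised term-frequency vector over a fixed vocabulary."""
    counts = Counter(text.lower().split())
    vec = [counts[w] for w in vocab]
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def cosine_distance(a, b):
    """1 - cosine similarity; inputs are assumed to be unit length."""
    return 1.0 - sum(x * y for x, y in zip(a, b))

def average_linkage(c1, c2, vectors):
    """Mean pairwise distance between the members of two clusters."""
    total = sum(cosine_distance(vectors[i], vectors[j]) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))

def merge_distances(vectors):
    """Naive agglomerative clustering; returns the distance of each merge."""
    clusters = [[i] for i in range(len(vectors))]
    heights = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = average_linkage(clusters[a], clusters[b], vectors)
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        heights.append(d)
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return heights

def estimate_num_topics(docs):
    """Cut the hierarchy just before the largest jump in merge distance."""
    n = len(docs)
    if n < 3:
        return n  # too few documents to observe a jump
    vocab = sorted({w for d in docs for w in d.lower().split()})
    vectors = [tf_vector(d, vocab) for d in docs]
    heights = merge_distances(vectors)
    gaps = [heights[i + 1] - heights[i] for i in range(len(heights) - 1)]
    i = max(range(len(gaps)), key=gaps.__getitem__)
    # After merge i (0-based) there are n - 1 - i clusters left.
    return n - 1 - i

# Hypothetical toy tickets: three themes, two documents each.
SAMPLE_DOCS = [
    "server crash reboot kernel",
    "kernel crash server panic",
    "license key renewal invoice",
    "invoice renewal license billing",
    "network latency packet loss",
    "packet loss network switch",
]

k = estimate_num_topics(SAMPLE_DOCS)
# k would then be passed as the topic count to the LDA implementation,
# e.g. n_components=k in scikit-learn's LatentDirichletAllocation.
```

On real data the jump is rarely this clean, so I would treat the estimate as a starting point and compare a few neighbouring values of the topic count rather than trust a single number.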