preslav

Popular comments by preslav

Ontotext case – Team _A

You have done a terrific job of analyzing the data in various ways and of designing a reasonable, directed neural network model for the task. The model uses deep learning and state-of-the-art tools and techniques (and TF.IDF-based SVM solutions were also tried for comparison).

What is the baseline F1? Also, what is the accuracy?
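
For reference, one simple baseline is to always predict the most frequent training label. The sketch below is only an illustration under stated assumptions: binary labels with 1 as the positive class, and made-up label lists that have nothing to do with the actual dataset. The function name `majority_baseline` is hypothetical.

```python
from collections import Counter

def majority_baseline(train_labels, test_labels, positive=1):
    """Score a classifier that always predicts the most frequent training label."""
    majority = Counter(train_labels).most_common(1)[0][0]
    predictions = [majority] * len(test_labels)

    pairs = list(zip(predictions, test_labels))
    accuracy = sum(p == y for p, y in pairs) / len(pairs)

    tp = sum(p == positive and y == positive for p, y in pairs)
    fp = sum(p == positive and y != positive for p, y in pairs)
    fn = sum(p != positive and y == positive for p, y in pairs)
    # F1 = 2TP / (2TP + FP + FN); the guard avoids dividing by zero.
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return majority, accuracy, f1

# Made-up labels for illustration only (1 = positive class, 0 = negative class).
train = [0, 0, 0, 0, 0, 0, 0, 1, 1, 1]
test = [0, 0, 0, 1, 1]
majority, accuracy, f1 = majority_baseline(train, test)
print(majority, accuracy, f1)
```

On skewed data such a baseline can reach a high accuracy while its F1 on the minority class is zero, which is why it helps to report both numbers next to the model's scores.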

Any results on cross-validation based on the training dataset for different choices of the hyperparameters of the network architecture?
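
To make the question concrete, here is a minimal sketch of k-fold cross-validation over a small hyperparameter grid. Everything in it is an assumption for illustration: the toy 1-D data, the grid, and `knn_accuracy`, which is a hypothetical stand-in for "train the network with these settings and score it on the held-out fold".

```python
import itertools
import random

def k_fold_indices(n, n_folds, seed=0):
    """Shuffle the indices 0..n-1 and deal them into n_folds disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::n_folds] for i in range(n_folds)]

def cross_validate(train_and_score, X, y, params, n_folds=5):
    """Average the validation score of train_and_score over all folds."""
    scores = []
    for held_out in k_fold_indices(len(X), n_folds):
        held = set(held_out)
        train_idx = [i for i in range(len(X)) if i not in held]
        scores.append(train_and_score(
            [X[i] for i in train_idx], [y[i] for i in train_idx],
            [X[i] for i in held_out], [y[i] for i in held_out],
            **params))
    return sum(scores) / len(scores)

def knn_accuracy(X_tr, y_tr, X_va, y_va, k):
    """Toy model: k-nearest-neighbour vote on 1-D inputs, scored by accuracy."""
    correct = 0
    for x, label in zip(X_va, y_va):
        nearest = sorted(range(len(X_tr)), key=lambda i: abs(X_tr[i] - x))[:k]
        votes = sum(y_tr[i] for i in nearest)
        pred = 1 if 2 * votes > len(nearest) else 0
        correct += pred == label
    return correct / len(X_va)

# Made-up data: label is 1 when the input is at least 2.0.
X = [i / 10 for i in range(40)]
y = [1 if x >= 2.0 else 0 for x in X]

grid = {"k": [1, 3, 5]}  # hypothetical hyperparameter grid
results = {}
for values in itertools.product(*grid.values()):
    params = dict(zip(grid.keys(), values))
    results[tuple(params.items())] = cross_validate(knn_accuracy, X, y, params)

best = max(results, key=results.get)
print(results, best)
```

A table of such per-setting averages, computed on the training data only, would show how sensitive the model is to each choice without touching the test set.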

Any thoughts on what could be done next to further improve the model? Maybe combine TF.IDF with deep learning? Or perform system combination? Did the different systems perform similarly on the training set (e.g., using cross-validation)?

Critical Outliers – VMware Case

This is a reasonable solution: it is based on LDA, which is a state-of-the-art clustering algorithm. The preprocessing makes sense, as do the implementation and the planned future work. There is also visualization and some analysis, which is nice to see.

My worry is about the number of clusters: why 5 clusters? How was this number selected? Is it too few, given that there are so many articles?

Ontotext case – Team _A

Also, your confusion matrix is non-standard: it should show the raw counts. I wanted to calculate accuracy, but I cannot do it from this matrix.
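
To illustrate why the raw counts matter, here is a minimal sketch that derives accuracy, precision, recall, and F1 from a count-based confusion matrix. The counts are made up, and the layout (rows are true classes, columns are predicted classes, class 1 is positive) is an assumption; `metrics_from_counts` is a hypothetical helper name.

```python
def metrics_from_counts(matrix, positive=1):
    """matrix[i][j] = number of examples with true class i predicted as class j."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    accuracy = correct / total

    tp = matrix[positive][positive]
    fn = sum(matrix[positive]) - tp                   # positives predicted as something else
    fp = sum(row[positive] for row in matrix) - tp    # other classes predicted as positive
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return accuracy, precision, recall, f1

# Made-up raw counts for illustration only.
counts = [[80, 10],
          [5, 5]]
accuracy, precision, recall, f1 = metrics_from_counts(counts)
print(accuracy, precision, recall, f1)
```

If each row is normalized to sum to 1, the per-class totals are lost, so neither accuracy nor precision can be recovered from the matrix alone.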

BTW, it is nice that the network can give an explanation about what triggered the decision.

CASE Ontotext, Team CENTROIDA

Nice work! You found a very relevant paper by a world-leading research group and extended it with ideas from another paper, improvements to the network architecture, and an exploration of the hyper-parameter values. The model uses deep learning and state-of-the-art tools and techniques.

How were the company names normalized exactly?

Do you do anything special to handle the asymmetry of the relation?

The accuracy is very high, but what is the baseline? Also, what is F1?

Any results on cross-validation based on the training dataset for different choices of the hyperparameters of the network architecture?

Any thoughts on what could be done next to further improve the model?