Datathon – HackNews – Solution – LAMAs

LAMA:

* Summary:

– The source code is publicly available on GitHub.

– The article is somewhat short, but gives sufficient detail.

– The approach is standard but effective (for tasks 1 and 2).

– This is the best-ranked team overall:
– DEV: 3rd-4th, 1st, and 5th on tasks 1, 2, and 3, respectively
– TEST: 2nd, 1st, and 5th on tasks 1, 2, and 3, respectively
– Remarkably, on task 2, the team wins by a large margin.

* Detailed comments:

This is an exercise in using BERT (for tasks 1 and 2):
– paper: https://arxiv.org/abs/1810.04805
– code: https://github.com/google-research/bert
– other code: https://github.com/hanxiao/bert-as-service
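
For reference, bert-as-service (the last link above) exposes a pre-trained BERT model as an encoding service, which is one way to obtain BERT features without any fine-tuning (relevant to question 4 below). A minimal sketch, assuming a locally running server and toy inputs:

```python
# Minimal sketch: obtaining fixed-size sentence vectors from bert-as-service.
# Assumes the service has been started separately, e.g.:
#   bert-serving-start -model_dir uncased_L-12_H-768_A-12/ -num_worker 1
from bert_serving.client import BertClient

bc = BertClient()  # connects to a server on localhost by default

# Each input text is mapped to a 768-dimensional vector (BERT-base).
vecs = bc.encode(['first news title', 'second news title'])
print(vecs.shape)  # (2, 768)
```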

BERT is a state-of-the-art model for Natural Language Processing (NLP), and it outperforms earlier models such as ELMo. See more here:

https://medium.com/syncedreview/best-nlp-model-ever-google-bert-sets-new-standards-in-11-language-tasks-4a2a189bc155

The authors used fine-tuning, reusing hyperparameter values they had found in earlier experiments on other tasks. Fine-tuning BERT takes a lot of time…
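
For concreteness, fine-tuning BERT for sentence classification typically looks like the sketch below. This assumes the Hugging Face transformers library (recent API) and typical hyperparameter values, not the team's actual code:

```python
# Minimal fine-tuning sketch (Hugging Face transformers, recent API;
# the hyperparameter values are typical BERT settings, not the team's).
import torch
from transformers import BertTokenizer, BertForSequenceClassification

tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
model = BertForSequenceClassification.from_pretrained('bert-base-uncased',
                                                      num_labels=2)

texts = ['a hyperpartisan news title', 'a mainstream news title']  # toy stand-ins
labels = torch.tensor([1, 0])

# Tokenize, pad, and truncate to a fixed maximum length.
batch = tokenizer(texts, padding=True, truncation=True,
                  max_length=256, return_tensors='pt')

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)  # a common fine-tuning LR

model.train()
for _ in range(3):  # a few epochs usually suffice for fine-tuning
    optimizer.zero_grad()
    loss = model(**batch, labels=labels).loss
    loss.backward()
    optimizer.step()
```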

* Questions:

1. Which model did you use for tasks 1 and 2? Is it model (b) from Figure 3? https://arxiv.org/pdf/1810.04805.pdf

2. Why did you use the uncased version of BERT?

3. Do you think that the large BERT model would help?

4. Did you try BERT without fine-tuning? If so, how much did you gain from fine-tuning?

5. Do you think you could be losing something by truncating the input to 256 tokens for task 1?
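
To make the concern in question 5 concrete: a hard cap at tokenization time silently drops everything past the limit. A small sketch (assuming the Hugging Face tokenizer, not the team's exact pipeline):

```python
# Everything past the cap is simply discarded at tokenization time.
from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')

long_article = ' '.join(['word'] * 1000)  # a stand-in for a long news article

ids = tokenizer.encode(long_article, truncation=True, max_length=256)
print(len(ids))  # 256, incl. [CLS] and [SEP]; the remaining ~746 tokens are lost
```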

Ontotext case – Team _A

You have done a terrific job at analyzing the data in various ways and at designing a reasonable neural network model targeted at the task. The model uses deep learning and state-of-the-art tools and techniques (TF.IDF-based SVM solutions have also been tried for comparison).

What is the baseline F1? Also, what is the accuracy?
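
For reference, a baseline of the kind mentioned above can report both metrics in a few lines; a minimal scikit-learn sketch on toy data (not the team's actual baseline):

```python
# Minimal TF.IDF + linear SVM baseline, reporting both F1 and accuracy.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC
from sklearn.pipeline import make_pipeline
from sklearn.metrics import f1_score, accuracy_score

train_texts = ['clearly biased text', 'neutral report text',
               'more biased text', 'another neutral text']  # toy stand-ins
train_labels = [1, 0, 1, 0]
test_texts, test_labels = ['yet more biased text'], [1]

baseline = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LinearSVC())
baseline.fit(train_texts, train_labels)
pred = baseline.predict(test_texts)

print('F1:      ', f1_score(test_labels, pred))
print('Accuracy:', accuracy_score(test_labels, pred))
```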

Any cross-validation results on the training dataset for different choices of the hyperparameters of the network architecture?
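
By this I mean something like the following, sketched with scikit-learn's GridSearchCV on the SVM baseline (toy data, hypothetical grid); the network's own hyperparameters, such as layer sizes, dropout, and learning rate, could be searched analogously:

```python
# Sketch of hyperparameter selection via cross-validation on the training set.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC
from sklearn.pipeline import Pipeline
from sklearn.model_selection import GridSearchCV

train_texts = ['clearly biased text', 'neutral report text',
               'more biased text', 'another neutral text']  # toy stand-ins
train_labels = [1, 0, 1, 0]

pipe = Pipeline([('tfidf', TfidfVectorizer()), ('svm', LinearSVC())])
grid = {
    'tfidf__ngram_range': [(1, 1), (1, 2)],
    'svm__C': [0.1, 1.0, 10.0],
}

search = GridSearchCV(pipe, grid, scoring='f1', cv=2)  # cv=2 only for the toy data
search.fit(train_texts, train_labels)
print(search.best_params_, search.best_score_)
```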

Any thoughts on what can be done next to further improve the model? Maybe combine TF.IDF with deep learning? Or perform system combination? Did the different systems perform similarly on the training set (e.g., using cross-validation)?
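
By system combination I have in mind something as simple as averaging the posteriors of the two systems; a toy sketch with hypothetical scores, not the team's outputs:

```python
# Toy system combination: average the two systems' posteriors, then threshold.
import numpy as np

svm_probs = np.array([0.40, 0.80, 0.55])   # hypothetical TF.IDF + SVM posteriors
bert_probs = np.array([0.45, 0.60, 0.75])  # hypothetical deep-model posteriors

combined = (svm_probs + bert_probs) / 2.0
print((combined >= 0.5).astype(int))  # [0 1 1]
```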