
Ontotext case – The LSTM way

In this article I will describe my approach using bi-directional LSTMs and, eventually, stacking them to create a deeper network that yields better results.


What is this?

In the following article I will describe my second attempt at the Ontotext case. The models described below are the result of adding more layers and an additional feature: the organisation positions. You can check the notebook with these experiments at https://github.com/radpet/stuff/blob/master/datathon/ontotext/BI_LSTM_POS.ipynb

Data Preparation

There is not much change from the preprocessing done in https://www.datasciencesociety.net/datathon/2018/02/09/ontotext-case-team-_a/; however, some small tweaks were introduced, such as replacing \n and \xa0 with a blank space.
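To illustrate that tweak, here is a minimal sketch of such a cleanup step; the `clean_snippet` name and the final whitespace collapsing are my own additions, not taken from the notebook:

```python
import re

def clean_snippet(text):
    # Replace newline and non-breaking-space characters with a blank space
    text = text.replace('\n', ' ').replace('\xa0', ' ')
    # Collapse any resulting whitespace runs into single spaces
    # (an extra step, not necessarily in the original notebook)
    return re.sub(r'\s+', ' ', text).strip()

print(clean_snippet('Acme\xa0Corp\nacquired Widget Co.'))
# -> 'Acme Corp acquired Widget Co.'
```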

Model 1

This is the first attempt at making the model more complex. The main difference from the one I ran during the datathon is that the positions of the company pairs are also passed to the network explicitly.
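The notebook holds the exact architecture; the sketch below only illustrates the idea of feeding the organisation positions as a second explicit input alongside the text snippet. It assumes a Keras-style setup, and all layer sizes, input names, and the binary output head are placeholders of mine, not the actual values used:

```python
from tensorflow.keras.layers import (Input, Embedding, Bidirectional,
                                     LSTM, Dense, concatenate)
from tensorflow.keras.models import Model

MAX_LEN, VOCAB_SIZE, EMBED_DIM = 100, 20000, 100  # placeholder sizes

# Text snippet as a padded sequence of word indices
snippet_in = Input(shape=(MAX_LEN,), name='snippet')
x = Embedding(VOCAB_SIZE, EMBED_DIM)(snippet_in)
x = Bidirectional(LSTM(64))(x)

# Positions of the two organisations in the snippet, fed in explicitly
positions_in = Input(shape=(2,), name='org_positions')

merged = concatenate([x, positions_in])
output = Dense(1, activation='sigmoid')(merged)  # placeholder output head

model = Model(inputs=[snippet_in, positions_in], outputs=output)
model.compile(optimizer='adam', loss='binary_crossentropy',
              metrics=['accuracy'])
```

Concatenating the position features after the recurrent layer lets the classifier use them directly, instead of forcing the LSTM to infer where the organisations sit in the snippet.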

I trained the model on the train+dev split and validated against the test split; below is the result on the test split.

[Figure: results of Model 1 on the test split]

The model also scores better on the test set provided by Ontotext.

Model 2

The model which gave the best result so far is described below.

The main difference from Model 1 is that it has one additional LSTM layer, which intuitively makes the network deeper and able to detect more complex features from the text snippet.
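In Keras-style code, stacking boils down to making the first recurrent layer return its full output sequence; again, this is a sketch with placeholder sizes, not the notebook's exact values:

```python
from tensorflow.keras.layers import Input, Embedding, Bidirectional, LSTM

MAX_LEN, VOCAB_SIZE, EMBED_DIM = 100, 20000, 100  # placeholder sizes

snippet_in = Input(shape=(MAX_LEN,), name='snippet')
x = Embedding(VOCAB_SIZE, EMBED_DIM)(snippet_in)
# return_sequences=True keeps the per-timestep outputs, so the second
# bi-directional LSTM receives a sequence instead of a single vector
x = Bidirectional(LSTM(64, return_sequences=True))(x)
x = Bidirectional(LSTM(64))(x)
# ...the rest (position features, dense output) stays as in Model 1
```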

Results

The following confusion matrix shows the model's predictions on the test split after training on the train data alone, without any hyperparameter tuning on the dev set.

[Figure: confusion matrix of Model 2 on the test split]

Training the network on the train+dev split and validating against the test split gives even better results.

Lastly, I scored the network against the test set provided by Ontotext. The results from the model above are the best so far.
