The Ontotext Case: Global Datathon 2018

Data Case Introduction

Classifying companies into industry sectors is a fundamental task for unlocking advanced business intelligence capabilities. However, different data sources rarely use the same classification system, if they use one at all. This is a major obstacle to taking advantage of the details available in Open Data and in niche commercial data sources that either lack industry classifications or use inconsistent ones.

High-quality, commercially available company data may be unaffordable for many data analytics and business analytics projects. Many niche but highly valuable data sources fall short on details about industry sector. At the same time, the amount of Open Data (official or crowdsourced) is growing, but it often lacks a standardized yet practical approach to industry classification.




The task is to improve the consistency and coverage of industry classification through an automated, standardized classification model that can be applied to any source to enrich the originally available data with industry sector information.
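
To make the idea of enriching a record with a predicted sector concrete, here is a minimal sketch. It is not the method proposed in the case. It assumes each company record carries a free-text description and uses a small naive Bayes text classifier as a stand-in for whatever model a team might build. The sector labels, training examples, and field names are all hypothetical.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def train(examples):
    """Count words per sector from (description, sector) pairs."""
    word_counts = defaultdict(Counter)
    sector_counts = Counter()
    for description, sector in examples:
        sector_counts[sector] += 1
        word_counts[sector].update(tokenize(description))
    vocabulary = {word for counts in word_counts.values() for word in counts}
    return word_counts, sector_counts, vocabulary


def classify(description, model):
    """Return the sector with the highest naive Bayes log-score."""
    word_counts, sector_counts, vocabulary = model
    total_examples = sum(sector_counts.values())
    best_sector, best_score = None, -math.inf
    for sector, count in sector_counts.items():
        score = math.log(count / total_examples)  # log prior
        # Add-one (Laplace) smoothing over the shared vocabulary.
        denominator = sum(word_counts[sector].values()) + len(vocabulary)
        for token in tokenize(description):
            if token in vocabulary:  # ignore words never seen in training
                score += math.log((word_counts[sector][token] + 1) / denominator)
        if score > best_score:
            best_sector, best_score = sector, score
    return best_sector


# Hypothetical training data; a real model would use a standard taxonomy
# and far more labeled examples.
TRAINING = [
    ("retail bank offering loans and mortgages", "Finance"),
    ("investment fund and asset management", "Finance"),
    ("develops cloud software and mobile apps", "Technology"),
    ("semiconductor and computer hardware maker", "Technology"),
]

model = train(TRAINING)

# Enrich a record that arrived without any industry sector.
record = {"name": "Example Ltd", "description": "online bank for consumer loans"}
record["sector"] = classify(record["description"], model)
```

Because the same `classify` function is applied to every source, all enriched records share one set of sector labels, which is the kind of consistency the task asks for.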


More information about the Ontotext case can be found at