Business problem
Clients of Ontotext often have large text collections that need to be searched efficiently. They are particularly interested in key concepts such as Organizations, People and Locations, and also in the relations between them. Many of these key actors and their relations are described in open knowledge bases such as Wikipedia/DBpedia. Ontotext analytics takes care of annotating documents with these concepts.
But being a crowd-sourced resource, Wikipedia may be:
incomplete
wrong
... therefore the quality of Ontotext’s annotations is bounded above by the quality of Wikipedia.
Can this limitation be overcome?
Yes, if we teach the computer to “read and understand” text, so that it can learn about concepts and relations that we don’t know from Open Data.
Paradigm:
TRAIN ML on text that is annotated with concepts and relations that WE KNOW ABOUT from Open Data (here, we can also find ways to limit this to curated subsets that we know are true).
INFER relations that ARE MISSING or WRONG in Open Data.
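The two steps above can be illustrated with a toy sketch. It is not Ontotext's implementation: the triples, sentences, relation names and the helper functions (pattern_between, train, infer) are all hypothetical, and the "ML" is reduced to counting which textual pattern between two annotated mentions goes with which known relation. It only covers the MISSING case, not the WRONG one.

```python
from collections import Counter, defaultdict

# Hypothetical curated triples (subject, relation, object) that we trust from Open Data.
KNOWN_TRIPLES = {
    ("Acme", "headquartered_in", "Sofia"),
    ("Globex", "headquartered_in", "Berlin"),
    ("Alice", "works_for", "Acme"),
}

# Hypothetical annotated sentences: (text, subject mention, object mention).
SENTENCES = [
    ("Acme is based in Sofia", "Acme", "Sofia"),
    ("Globex is based in Berlin", "Globex", "Berlin"),
    ("Alice is employed by Acme", "Alice", "Acme"),
    ("Initech is based in Austin", "Initech", "Austin"),
    ("Bob is employed by Globex", "Bob", "Globex"),
]


def pattern_between(text, subj, obj):
    """Return the words between the two mentions, or None if obj does not follow subj."""
    start = text.find(subj)
    if start < 0:
        return None
    end = text.find(obj, start + len(subj))
    if end < 0:
        return None
    return text[start + len(subj):end].strip()


def train(sentences, triples):
    """Step 1: count pattern/relation co-occurrences for pairs WE KNOW ABOUT."""
    # Simplifying assumption: at most one relation per (subject, object) pair.
    known = {(s, o): r for s, r, o in triples}
    model = defaultdict(Counter)
    for text, subj, obj in sentences:
        relation = known.get((subj, obj))
        pattern = pattern_between(text, subj, obj)
        if relation is not None and pattern:
            model[pattern][relation] += 1
    return model


def infer(sentences, triples, model):
    """Step 2: propose relations for pairs that are MISSING from the known triples."""
    known_pairs = {(s, o) for s, _, o in triples}
    inferred = set()
    for text, subj, obj in sentences:
        if (subj, obj) in known_pairs:
            continue
        pattern = pattern_between(text, subj, obj)
        if pattern in model:
            # Pick the relation seen most often with this pattern during training.
            relation = model[pattern].most_common(1)[0][0]
            inferred.add((subj, relation, obj))
    return inferred


if __name__ == "__main__":
    model = train(SENTENCES, KNOWN_TRIPLES)
    for triple in sorted(infer(SENTENCES, KNOWN_TRIPLES, model)):
        print(triple)
```

A real system would replace the exact-pattern lookup with a trained classifier over richer features, but the division of labour is the same: known relations label the text, and the labelled text is used to propose relations that are not in Open Data.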
Case videos