DNA sequence identification

Case provided by IGEM and sponsored by Telelink

Business problem

The goal of this section is to give the participants in our Datathon a sufficient understanding of the problem under investigation from a business perspective. More precisely, the following questions are answered:
• What is the business problem?
Obtain the complete set of genome traces found in a single food sample, e.g. find out ALL sources of meat that have been used to produce a certain sausage and ALL organisms (or traces of them) that should not be found in the sausage (pathogens, bugs, human).
• Why does the business need to solve the problem?
To improve quality control in supply chain supervision and in health care and protection.
• What are the important problem specifics (from a business point of view) that the solution has to account for?
Most of the above-mentioned organisms are closely related, so substantial parts of their genomes are almost or even completely identical.
• Are there hypotheses that have to be investigated and possibly incorporated into the solution?
The sample taken was statistically representative, i.e. the ratios between the numbers of genome sequence reads from different organisms are the same as the ratios between the amounts of the respective meat types used to produce the sausage.
• What business requirements have to be satisfied for the final result to be acceptable?
The assignment of sequence reads to specific genomes needs to be supported with a probability higher than 90%.
Practical examples can shed additional light on the problem description.
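As one illustration, the representativeness hypothesis and the 90% requirement can be combined in a small Python sketch. Everything in it is an assumption made for illustration: the read identifiers, the organism labels, the probability values, and the function name `estimate_composition` are hypothetical and are not part of the case data. The sketch keeps only reads whose assignment probability is higher than 90% and, assuming the sample is representative, reads the resulting shares as the shares of the respective meat types.

```python
from collections import Counter

# Hypothetical read assignments: (read_id, organism, assignment_probability).
# These values are invented for illustration and do not come from the case data.
assignments = [
    ("r1", "pork", 0.99),
    ("r2", "pork", 0.95),
    ("r3", "beef", 0.97),
    ("r4", "pork", 0.60),     # ambiguous, e.g. a region shared by related genomes
    ("r5", "chicken", 0.92),
    ("r6", "beef", 0.85),     # below the required probability, so discarded
]

MIN_PROBABILITY = 0.90  # business requirement: higher than 90%


def estimate_composition(assignments, min_probability=MIN_PROBABILITY):
    """Return {organism: share of the confidently assigned reads}."""
    counts = Counter(
        organism
        for _, organism, probability in assignments
        if probability > min_probability
    )
    total = sum(counts.values())
    if total == 0:
        return {}
    return {organism: n / total for organism, n in counts.items()}


composition = estimate_composition(assignments)
for organism, share in sorted(composition.items()):
    print(f"{organism}: {share:.0%}")
# prints:
# beef: 25%
# chicken: 25%
# pork: 50%
```

Discarding the ambiguous reads is a simplification: it implicitly assumes that such reads are lost in equal proportion for every organism, which may not hold when the genomes involved are closely related.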