Team name: Cheetahs Case: Telelink (iGEM) Provider: IBM Business Understanding The task for the Telelink case is to obtain the complete set of genome traces found in a single food sample and ALL organisms that should not be found in the food sample. The business needs a solution to this DNA Sequence identification case for improved […]
In this article the mentors give some preliminary guidelines, advice and suggestions to the participants for the case. Every mentor should write their name and chat name at the beginning of their texts, so that there are no mix-ups with the other mentors. By the rules, it is essential to follow the CRISP-DM methodology (http://www.sv-europe.com/crisp-dm-methodology/). The DSS […]
Authors The ACES team that worked on the solution is listed in alphabetical order: Atanas Blagoev ([email protected]) Atanas Panayotov ([email protected]) Emil Gyorev ([email protected]) Georgi Buyukliev ([email protected]) Iliana Voynichka ([email protected]) Slav-Konstantin Ivanov ([email protected]) Ventsislav Yordanov ([email protected]) Business Understanding Even though the news is perceived as one of the most important sources of information to people in […]
The food industry is governed by strict laws and regulations, which provide certainty that each product meets health and safety standards. In addition to existing biochemical food product analysis, we propose a metagenomic approach. The main benefit of this approach is the ability to perform next-generation sequencing as a standard first step and then align the sampled data to the reference genomes of the many organisms suspected to be present in the sample. Additionally, if another organism is suspected at a later date, it is easy to reuse the sampled data set for another analysis, whereas biochemical analysis would require expensive sample storage and further laboratory tests. We examined three approaches to metagenomic analysis: BLAST, Centrifuge and BWA-MEM.
We developed a workflow using the BLAST and Centrifuge toolkits that provides precise metagenomic information about food composition by comparing DNA reads with reference genomes of various species. Our workflow is optimized to run on a Google Cloud Compute Engine instance with 24 CPUs and 200 GB of RAM.
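To illustrate how comparing reads with reference genomes can lead to a composition estimate, here is a minimal Python sketch. It is not the team's actual workflow. It assumes the alignment step has already produced BLAST tabular output (`-outfmt 6`, whose standard twelve columns are listed in the code), keeps the highest-bitscore hit for each read, and reports the share of reads assigned to each organism. The function names, the 95% identity threshold, the sequence identifiers and the identifier-to-organism mapping are all hypothetical and chosen only for illustration.

```python
from collections import Counter

# Standard column order of BLAST tabular output (-outfmt 6).
BLAST_COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def best_hit_per_read(lines, min_identity=95.0):
    """Map each read ID to the subject ID of its highest-bitscore hit.

    Hits below min_identity (percent identity) are ignored. The 95% default
    is an illustrative assumption, not a recommended setting.
    """
    best = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        row = dict(zip(BLAST_COLUMNS, line.split("\t")))
        if float(row["pident"]) < min_identity:
            continue
        score = float(row["bitscore"])
        read = row["qseqid"]
        if read not in best or score > best[read][1]:
            best[read] = (row["sseqid"], score)
    return {read: subject for read, (subject, _) in best.items()}


def composition(assignments, subject_to_organism):
    """Return the fraction of assigned reads per organism."""
    counts = Counter(
        subject_to_organism.get(subject, "unknown")
        for subject in assignments.values()
    )
    total = sum(counts.values())
    return {org: n / total for org, n in counts.items()} if total else {}


# Hypothetical hits: sequence IDs and values are made up for illustration.
hits = [
    "read1\tchr_bos\t99.0\t100\t1\t0\t1\t100\t500\t599\t1e-50\t180",
    "read1\tchr_sus\t96.0\t100\t4\t0\t1\t100\t700\t799\t1e-40\t150",
    "read2\tchr_sus\t98.0\t100\t2\t0\t1\t100\t10\t109\t1e-48\t175",
    "read3\tchr_bos\t90.0\t100\t10\t0\t1\t100\t10\t109\t1e-20\t90",
    "read4\tchr_bos\t97.0\t100\t3\t0\t1\t100\t900\t999\t1e-45\t165",
]

# Hypothetical mapping from reference sequence ID to organism name.
subject_to_organism = {"chr_bos": "Bos taurus", "chr_sus": "Sus scrofa"}

assignments = best_hit_per_read(hits)
shares = composition(assignments, subject_to_organism)
```

Because the hits are kept separately from the organism mapping, the same alignment results can be re-summarized when the list of suspected organisms changes. An organism whose genome was never included in the reference database would still require a new alignment run against the stored reads.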