A simple approach using neural networks.
I’m learning data science, and this is my first experiment with Neural Networks.
The accuracy is only about 2.78%, since there are over 50 categories in the training dataset.
Thank you for your support. This is an amazing competition.
Many big retailers offer a rapidly growing variety of fresh produce in store from local and global sources, such as fruit and vegetables, which need to be weighed quickly and effortlessly to determine their quantity and price. Smart scales that use image recognition to optimise the user experience and enable additional features, such as cooking recipes, can provide a new solution to this problem. Our solution to the Kaufland case involves training a Convolutional Neural Network (CNN) with the GoogLeNet architecture on the original Kaufland data set and fine-tuning it with a Custom Training Set we have created. This achieved the following results (Kaufland Case Model #13): training accuracy: Top-1: 91%, Top-5: 100%; validation accuracy: Top-1: 86.1%, Top-5: 99%; TEST dataset accuracy: Top-1: 86.1%, Top-5: 99.2%. We have also created another model (Kaufland Case Model #14) by combining similar categories, achieving: training accuracy: Top-1: 96%, Top-5: 100%; validation accuracy: Top-1: 92.5%, Top-5: 100%; TEST dataset accuracy: Top-1: 91.3%, Top-5: 100%. All training runs were done on our NVIDIA DGX Station training machine using BVLC Caffe and the NVIDIA DIGITS framework. In our article we show visualisations of our numerous training runs and provide an online demo with the best classifiers, which can be tested further. During the final DSS Datathon event we plan to show a live food recognition demo with one of our best models running on a mobile phone. Demo URL: http://norris.imagga.com/demos/kaufland-case/
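For readers unfamiliar with the Top-1 and Top-5 figures quoted above, the metric can be sketched in a few lines of plain Python. This is only an illustration of the general definition, not the evaluation code used with Caffe or DIGITS; the function name `top_k_accuracy`, the toy scores, and the three-class setup are all assumptions made for the example.

```python
def top_k_accuracy(scores, labels, k):
    """Fraction of samples whose true label is among the k highest-scoring classes."""
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices sorted by descending score, truncated to the best k.
        top_k = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        if label in top_k:
            hits += 1
    return hits / len(labels)


# Toy classifier scores for 4 images over 3 hypothetical classes
# (illustrative values only, not Kaufland data).
scores = [
    [0.7, 0.2, 0.1],
    [0.1, 0.3, 0.6],
    [0.4, 0.5, 0.1],
    [0.2, 0.3, 0.5],
]
labels = [0, 1, 0, 2]

top1 = top_k_accuracy(scores, labels, k=1)
top2 = top_k_accuracy(scores, labels, k=2)
print(f"top-1: {top1:.2f}, top-2: {top2:.2f}")
```

A top-k score can never be lower than the top-1 score on the same data, which is why the Top-5 numbers sit so much closer to 100% than the Top-1 numbers.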
In this article we present our solution for helping customers and making their shopping experience easier by identifying products from images. We bring forward our idea and discuss the results of our CV experiment.
Our best model (derived from VGG) achieved 99.46% top-3 accuracy (90.18% top-1), with a processing time during training of 0.006 s per image on a single Titan X GPU (200 s per epoch with 37,000 images).
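The per-image training time can be cross-checked from the epoch figures with simple arithmetic. The sketch below uses only the two numbers quoted above; the variable names are illustrative.

```python
# Figures quoted in the text: 200 s per epoch over 37,000 images.
seconds_per_epoch = 200
images_per_epoch = 37_000

seconds_per_image = seconds_per_epoch / images_per_epoch
images_per_second = images_per_epoch / seconds_per_epoch

print(f"{seconds_per_image:.4f} s per image, {images_per_second:.0f} images per second")
```

This works out to roughly 0.0054 s per image, in line with the approximate 0.006 s figure reported.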
The team's vision is for its members to see where they stand compared to others in terms of ideas and approaches to computer vision, and to learn new ones from their teammates and the mentors.
Therefore, the team is pursuing a pure computer vision approach to solving the Kaufland and/or the ReceiptBank cases.