Datathon 2019 – Kaufland Case – Solution – Phoenix

5 thoughts on “Datathon 2019 – Kaufland Case – Solution – Phoenix”

  1. That is a job well done in such a short time. I wish you had had time to handle the misplaced items as well. Thank you for choosing this case once again, and really good “Other comments” section you have there 😉 One quick question to keep the ideas floating around:
    Do you think this approach can be used in reality, where the number of items is ~20 000 and continuously changing?

  2. Lol! It seems you’ve done a lot with this data. What are your evaluation criteria (how do you count a correctly detected and localized object)? I don’t see any stats on the performance of your approach (ROC curves, maybe).

  3. @penchodobrev
    Thanks for the feedback! Yes, I believe the approach can be applied in reality, provided that similar images and XML files are supplied for all products. The data augmentation code seems to work well enough that even with a smaller set of images the neural network can still be trained to differentiate between objects and background. However, an important thing to remember is that all images are taken from a similar perspective and the background of the shelves is relatively consistent. This will also have to be the case in any future implementation.

    I don’t think the perspective is a huge issue, as there are freely available tools that can generate images simulating such variation, and the products are not always placed in the same way even in the current sample (a sketch of this kind of augmentation is shown below). However, I think any change in the background should be paired with fine-tuning of the model before implementation.
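
    For illustration only, here is a minimal sketch of that kind of perspective augmentation using OpenCV; the function name, jitter range, and parameters are assumptions, not the actual augmentation code from the repository:

```python
# Hypothetical perspective augmentation sketch (not the team's actual code).
import cv2
import numpy as np

def random_perspective(image, max_shift=0.05, seed=None):
    """Warp an image by randomly jittering its four corners,
    simulating a slightly different camera angle on the shelf."""
    rng = np.random.default_rng(seed)
    h, w = image.shape[:2]
    # Original corners of the image: top-left, top-right, bottom-right, bottom-left.
    src = np.float32([[0, 0], [w, 0], [w, h], [0, h]])
    # Shift each corner by up to max_shift of the image width/height.
    jitter = rng.uniform(-max_shift, max_shift, size=(4, 2)) * [w, h]
    dst = (src + jitter).astype(np.float32)
    matrix = cv2.getPerspectiveTransform(src, dst)
    return cv2.warpPerspective(image, matrix, (w, h))

# Usage: generate five augmented copies of one shelf photo.
shelf = cv2.imread("shelf.jpg")  # hypothetical file name
augmented = [random_perspective(shelf, seed=i) for i in range(5)]
```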

  4. @pepe
    Thanks for the feedback!
    1) You can refer to the notebook 05. Calculate FInal Prediction.ipynb (https://github.com/datasciencesociety/Phoenix/blob/master/!Clean%20Project%20Files/05.%20Calculate%20FInal%20Prediction.ipynb) to see the final assignment of the flags supplied to the leaderboard. We are simply searching the space outlined in the XML files for a product (a rough sketch of this idea follows below). I don’t think the evaluation approach used in the leaderboard is very good: the coordinates in the XML files are not standardized, and for missing products they cover the full available space, while for available products they cover only the space of the specified product. 🙂 However, as seen in BoxPlot_Data.zip (https://github.com/datasciencesociety/Phoenix/blob/master/!Clean%20Project%20Files/BoxPlot_Data.zip), we are correctly identifying almost all of the objects and 100% of the labels.
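
    As a rough illustration of “searching the space outlined in the XML files for a product”, here is a hedged sketch; the box format, IoU threshold, and flag convention are assumptions, so see the linked notebook for the actual logic:

```python
# Hypothetical sketch: flag a product as present if any detected box
# overlaps the XML-annotated region enough. Threshold is an assumption.

def iou(box_a, box_b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0

def flag_product(xml_region, detections, threshold=0.3):
    """Return 1 (available) if any detection overlaps the XML region
    above the threshold, otherwise 0 (missing)."""
    return int(any(iou(xml_region, det) >= threshold for det in detections))
```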

    2) For performance we only used the leaderboard, as in computer vision the appropriate metric really depends on what you are interested in.
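
    For anyone who wants a concrete number beyond the leaderboard, one common choice for detection is precision and recall at a fixed IoU threshold; below is a minimal sketch reusing the iou helper from the previous snippet (the 0.5 threshold and greedy matching are assumptions, not the team's method):

```python
def precision_recall(gt_boxes, pred_boxes, threshold=0.5):
    """Greedy one-to-one matching of predicted boxes to ground-truth
    boxes at a fixed IoU threshold; returns (precision, recall).
    Assumes the iou() helper defined in the previous snippet."""
    matched = set()
    tp = 0
    for pred in pred_boxes:
        best, best_iou = None, threshold
        for i, gt in enumerate(gt_boxes):
            score = iou(pred, gt)
            if i not in matched and score >= best_iou:
                best, best_iou = i, score
        if best is not None:
            matched.add(best)
            tp += 1
    precision = tp / len(pred_boxes) if pred_boxes else 0.0
    recall = tp / len(gt_boxes) if gt_boxes else 0.0
    return precision, recall
```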
