# Datathon 2018 - Receipt Bank Solution

* Why are you dropping specific words like 'insertsomenumber', 'ce', 'cid', 'skype', 'www', 'com'? If these are very common words that you want removed, I think a better approach would be to tune the max_df parameter of TfidfVectorizer, which ignores terms that appear in more than a given share of the documents.
* I guess this wasn't part of the submission template, but I would have liked to see a Future Ideas section or something similar.
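
To illustrate the max_df suggestion in the first point, here is a minimal sketch using scikit-learn. The toy corpus and the 0.5 threshold are assumptions made up for illustration; they are not taken from the submission, and the right threshold for the real receipts data would need to be tuned.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical toy corpus (not the competition data): "www" and "com"
# occur in every document, the remaining words occur in exactly one.
docs = [
    "www shop com invoice total",
    "www cafe com receipt coffee",
    "www store com payment card",
    "www hotel com booking night",
]

# max_df=0.5 ignores terms whose document frequency is strictly higher
# than 50% of the documents, so no hand-written drop list is needed.
vectorizer = TfidfVectorizer(max_df=0.5)
tfidf = vectorizer.fit_transform(docs)

# vocabulary_ maps each kept term to its column index in the matrix.
kept_terms = sorted(vectorizer.vocabulary_)
print(kept_terms)
print(tfidf.shape)
```

With this approach the set of removed words follows from the data rather than from a fixed list, although rare noise tokens would still need min_df or an explicit stop list.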