Build a VLOOKUP alternative using a natural language processing (NLP) technique from scikit-learn.
Excel’s VLOOKUP function is a well-known way of finding data in a table based on a key lookup, but an exact string lookup is often inconvenient, and a ‘fuzzy’ match is a better way of joining two data sets together. For example, when comparing company names it is very common for the same name to appear slightly differently in different data sets (e.g. ‘Apple’ vs ‘Apple Inc’).
This video demonstrates a straightforward but effective string matching technique using natural language processing. Using PyXLL, the resulting function can be called from Excel as an alternative to VLOOKUP.
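To give a feel for the kind of matching involved, here is a minimal sketch of one common approach: TF-IDF over character n-grams, compared with cosine similarity. The source does not confirm that this is the exact technique used in the video or the linked repository. The function name `fuzzy_lookup`, the `threshold` parameter and the sample company names are hypothetical, chosen only for illustration.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity


def fuzzy_lookup(lookup_values, candidates, threshold=0.3):
    """Return a (best_match, score) pair for each lookup value.

    best_match is None when the best score falls below `threshold`.
    Hypothetical helper for illustration, not the code from the video.
    """
    # Character n-grams inside word boundaries tolerate small differences
    # such as a trailing "Inc" or an abbreviated "Corp".
    vectorizer = TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 3))
    candidate_vectors = vectorizer.fit_transform(candidates)
    lookup_vectors = vectorizer.transform(lookup_values)

    # One row per lookup value, one column per candidate.
    scores = cosine_similarity(lookup_vectors, candidate_vectors)

    results = []
    for row in scores:
        best = int(row.argmax())
        score = float(row[best])
        match = candidates[best] if score >= threshold else None
        results.append((match, score))
    return results


# Assumed sample data: the lookup table and the keys to search for.
table = ["Apple Inc", "Microsoft Corporation", "Alphabet Inc"]
matches = fuzzy_lookup(["Apple", "Microsoft Corp", "Zzzz"], table)

for key, (match, score) in zip(["Apple", "Microsoft Corp", "Zzzz"], matches):
    print(key, "->", match, round(score, 2))
```

Here ‘Apple’ is paired with ‘Apple Inc’, while a string that shares no character n-grams with the table returns no match. With PyXLL, a function along these lines could be exposed to Excel using the `xl_func` decorator; see the GitHub repository for the actual implementation.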
During this video you will learn about the following topics:
The code is available on GitHub here: https://github.com/pyxll/pyxll-examples/tree/master/scikit-learn/nlp_vlookup
Inspired by the article by Mauro Di Pietro: https://towardsdatascience.com/surpass-excel-vlookup-with-python-and-nlp-ab20d56c4a1a