# Name Matching

## **Articles**

1. [Analytics Vidhya](https://medium.com/analytics-vidhya/fuzzy-name-matching-datasets-1ae28884f226) on fuzzy name matching datasets, by Zaki Jefferson
2. [Fuzzy matching people names](https://towardsdatascience.com/fuzzy-matching-people-names-6e738d6b8fe) by Vadim Markovtsev
3. [Name matching across datasets](https://ai.nic.in/AI/NameMatchingCaseML) - a proof of concept by the Centre of Excellence in AI, National Informatics Centre
4. [Fuzzy name matching algorithms](https://towardsdatascience.com/python-tutorial-fuzzy-name-matching-algorithms-7a6f43322cc5) by Felix Kuestahler

## **Datasets**

1. [First and last name dataset](https://github.com/philipperemy/name-dataset), built from 533M Facebook records, by Philippe Remy
2. [data.world name datasets](https://data.world/datasets/names)
3. [Kaggle](https://www.kaggle.com/fivethirtyeight/fivethirtyeight-most-common-name-dataset/version/108) - most common name datasets, by FiveThirtyEight\
   ![](https://83674056-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-Mgd48oS5_duTKOVE_Et%2Fuploads%2FkGtm8BxEYVCo0jf0CocQ%2Fimage.png?alt=media\&token=6eddc8fe-2657-4e13-861d-4375e7a3caf1)
4. [Gender by Name dataset](https://archive.ics.uci.edu/ml/datasets/Gender+by+Name) from the UCI Machine Learning Repository
5. [Paper](http://www.lrec-conf.org/proceedings/lrec2008/pdf/291_paper.pdf) - a ground truth dataset for matching culturally diverse Romanized person names

## **Tools**

1. [Dedupe](https://www.reddit.com/r/datasets/comments/4zrozk/request_name_matching_dataset/) - a Python library for accurate and scalable fuzzy matching, record deduplication, and entity resolution
2. [nama](https://github.com/bradhackinen/nama) - fast, flexible name matching for large datasets
3. [names-matcher](https://github.com/athenianco/names-matcher) by athenianco\
   ![](https://83674056-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-Mgd48oS5_duTKOVE_Et%2Fuploads%2FfoupFKJPJzb6ZFzMONYP%2Fimage.png?alt=media\&token=9c98af0c-b01d-435e-ac01-fdbbc6bd2f58)
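
For orientation, the core idea behind fuzzy name matching can be sketched with the Python standard library alone. This is a minimal illustration, not how any of the tools listed above work internally: the helper names `normalize`, `similarity`, and `best_match` and the `0.8` threshold are assumptions made for this example.

```python
import unicodedata
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    """Lowercase, strip accents, and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(no_accents.lower().split())


def similarity(a: str, b: str) -> float:
    """Return a ratio in [0, 1]; tokens are sorted so word order is ignored."""
    a_key = " ".join(sorted(normalize(a).split()))
    b_key = " ".join(sorted(normalize(b).split()))
    return SequenceMatcher(None, a_key, b_key).ratio()


def best_match(query, candidates, threshold=0.8):
    """Return the most similar candidate, or None if nothing reaches the threshold."""
    if not candidates:
        return None
    # The threshold is an arbitrary choice for this sketch; tune it on real data.
    score, name = max((similarity(query, c), c) for c in candidates)
    return name if score >= threshold else None


if __name__ == "__main__":
    print(similarity("José García", "Garcia Jose"))
    print(best_match("Jon Smith", ["John Smith", "Jane Doe"]))
```

Sorting the tokens makes the comparison insensitive to "first last" versus "last first" ordering, and accent stripping handles common Romanization differences. Pairwise comparison like this scales poorly, which is one reason dedicated libraries exist for large datasets.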
