Pipal Email Checker

Fri 13th June 14

A nice little dump of Netflix data containing both email addresses and passwords has just appeared, so I thought it was time to add an email address checker to Pipal. The checker compares each password against both the whole email address and just the name part of it (the part before the @). I also considered looking at the domain but figured that the hit rate would be fairly low, so it isn't worth the extra cycles. For both the whole address and the name it looks for exact matches and for matches within a short Levenshtein distance.
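To make the comparison concrete, here is a minimal sketch of the idea in Python. It is not Pipal's actual code: the function names `levenshtein` and `check_pair` and the default tolerance of 3 are assumptions made for illustration only.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting single-character inserts, deletes and substitutions."""
    previous = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty prefix of b
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute (or keep if equal)
            ))
        previous = current
    return previous[-1]


def check_pair(email: str, password: str, tolerance: int = 3) -> dict:
    """Compare a password with the whole address and with the name part.

    The tolerance default is an assumed value, not one taken from Pipal.
    """
    name = email.split("@", 1)[0]  # everything before the first @
    results = {}
    for label, candidate in (("email", email), ("name", name)):
        distance = levenshtein(candidate, password)
        results[label] = {
            "exact": distance == 0,
            "close": 0 < distance <= tolerance,
            "distance": distance,
        }
    return results
```

Calling `check_pair("stuart@moabretreat.com", "stuart53")` reports a close match on the name with a distance of 2 and no match on the whole address.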

Here is a sample of the output from the Netflix data. I think the output is quite good. The close matches at the end are particularly interesting, as they show a number of people basing their passwords on their name, which is part of their email address.

Email Checker
=============

Exact Matches
-------------
Whole Email Address
No Exact Matches

Just Name
gog1873 from gog1873@hotmail.com
ziggy1962 from ziggy1962@sympatico.ca

Levenshtein Results
-------------------
Average distance (email) 17.87
Average distance (name) 8.83

Close Matches
-------------
Whole Email Address
No matches within supplied tolerance

Just Name
D: 1 U: yashinl (yashinl@discovery.co.za) P: yashin1
D: 2 U: unni79 (unni79@gmail.com) P: unni12
D: 2 U: stuart (stuart@moabretreat.com) P: stuart53
D: 3 U: jason_215 (jason_215@hotmail.com) P: jasonf14
D: 3 U: tutug60 (tutug60@hotmail.com) P: tutuye2
D: 3 U: xraychen73 (xraychen73@gmail.com) P: Xraychen2
D: 3 U: Rick (Rick@Havu.us) P: rick59
D: 3 U: zuzujar (zuzujar@msn.com) P: zuzu02

Matching is case sensitive, so an address of robin@test.com and a password of Robin would not be an exact match. Instead they would show up as a close match on the name with a Levenshtein distance of one.
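The effect of case sensitivity can be checked with a small, self-contained Python sketch. The `edit_distance` helper is a hypothetical stand-in for illustration, not the routine Pipal uses, and lowercasing both sides is only shown to contrast with the case-sensitive behaviour described above.

```python
from functools import lru_cache


def edit_distance(a: str, b: str) -> int:
    """Recursive Levenshtein distance, memoised; fine for short strings."""
    @lru_cache(maxsize=None)
    def d(i: int, j: int) -> int:
        if i == 0:
            return j  # insert the remaining j characters of b
        if j == 0:
            return i  # delete the remaining i characters of a
        cost = 0 if a[i - 1] == b[j - 1] else 1
        return min(d(i - 1, j) + 1, d(i, j - 1) + 1, d(i - 1, j - 1) + cost)

    return d(len(a), len(b))


name = "robin@test.com".split("@", 1)[0]

# Case-sensitive comparison: "r" and "R" count as different characters.
print(edit_distance(name, "Robin"))

# Folding case first would turn the same pair into an exact match.
print(edit_distance(name.lower(), "Robin".lower()))
```

The same effect is visible in the sample output, where Rick against rick59 scores 3 rather than 2.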

You can get this new checker simply by checking out the latest code from GitHub.

If you have any comments, or would like to see other checkers created, please let me know.