It's Impossible to Validate an Email Address | Elliot Chance

(?:(?:\r\n)?[ \t])*(?:(?:(?:[^()@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t]

)+|\Z|(?=[\["()@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:

\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()@,;:\\".\[\] \000-\031]+(?:(?:(

?:\r\n)?[ \t])+|\Z|(?=[\["()@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[

\t]))*"(?:(?:\r\n)?[ \t])*))*@(?:(?:\r\n)?[ \t])*(?:[^()@,;:\\".\[\] \000-\0

31]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\

](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()@,;:\\".\[\] \000-\031]+

(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:

(?:\r\n)?[ \t])*))*|(?:[^()@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z

|(?=[\["()@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)

?[ \t])*)*\@,;:\\".\[\] \000-\031]+(?:(?:(?:\

r\n)?[ \t])+|\Z|(?=[\["()@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[

\t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)

?[ \t])+|\Z|(?=[\["()@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t]

)*))*(?:,@(?:(?:\r\n)?[ \t])*(?:[^()@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[

\t])+|\Z|(?=[\["()@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*

)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t]

)+|\Z|(?=[\["()@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*)

*:(?:(?:\r\n)?[ \t])*)?(?:[^()@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+

|\Z|(?=[\["()@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r

\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()@,;:\\".\[\] \000-\031]+(?:(?:(?:

\r\n)?[ \t])+|\Z|(?=[\["()@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t

]))*"(?:(?:\r\n)?[ \t])*))*@(?:(?:\r\n)?[ \t])*(?:[^()@,;:\\".\[\] \000-\031

]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](

?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()@,;:\\".\[\] \000-\031]+(?

:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?

:\r\n)?[ \t])*))*\>(?:(?:\r\n)?[ \t])*)|(?:[^()@,;:\\".\[\] \000-\031]+(?:(?

:(?:\r\n)?[ \t])+|\Z|(?=[\["()@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?

[ \t]))*"(?:(?:\r\n)?[ \t])*)*:(?:(?:\r\n)?[ \t])*(?:(?:(?:[^()@,;:\\".\[\]

\000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()@,;:\\".\[\]]))|"(?:[^\"\r\\]|

\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()

@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()@,;:\\".\[\]]))|"

(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*))*@(?:(?:\r\n)?[ \t]

)*(?:[^()@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()@,;:\\

".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?

:[^()@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()@,;:\\".\[

\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*|(?:[^()@,;:\\".\[\] \000-

\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(

?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)*\@,;

:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()@,;:\\".\[\]]))|\[([

^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()@,;:\\"

.\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()@,;:\\".\[\]]))|\[([^\[\

]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*(?:,@(?:(?:\r\n)?[ \t])*(?:[^()@,;:\\".\

[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()@,;:\\".\[\]]))|\[([^\[\]\

r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()@,;:\\".\[\]

\000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()@,;:\\".\[\]]))|\[([^\[\]\r\\]

|\\.)*\](?:(?:\r\n)?[ \t])*))*)*:(?:(?:\r\n)?[ \t])*)?(?:[^()@,;:\\".\[\] \0

00-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\

.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()@,

;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()@,;:\\".\[\]]))|"(?

:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*))*@(?:(?:\r\n)?[ \t])*

(?:[^()@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()@,;:\\".

\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[

^()@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()@,;:\\".\[\]

]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*\>(?:(?:\r\n)?[ \t])*)(?:,\s*(

?:(?:[^()@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()@,;:\\

".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)(?:\.(?:(

?:\r\n)?[ \t])*(?:[^()@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[

\["()@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t

])*))*@(?:(?:\r\n)?[ \t])*(?:[^()@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t

])+|\Z|(?=[\["()@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?

:\.(?:(?:\r\n)?[ \t])*(?:[^()@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|

\Z|(?=[\["()@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*|(?:

[^()@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()@,;:\\".\[\

]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)*\

?[ \t])*(?:@(?:[^()@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["

()@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)

?[ \t])*(?:[^()@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()

@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*(?:,@(?:(?:\r\n)?[

\t])*(?:[^()@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()@,

;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t]

)*(?:[^()@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()@,;:\\

".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*)*:(?:(?:\r\n)?[ \t])*)?

(?:[^()@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()@,;:\\".

\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)(?:\.(?:(?:

\r\n)?[ \t])*(?:[^()@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\[

"()@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])

*))*@(?:(?:\r\n)?[ \t])*(?:[^()@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])

+|\Z|(?=[\["()@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\

.(?:(?:\r\n)?[ \t])*(?:[^()@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z

|(?=[\["()@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*\>(?:(

?:\r\n)?[ \t])*))*)?;\s*)

Even this monster cannot truly validate an email address. How can this be? It turns out there is a lot more to the humble email address. Some parts of RFC 822 are actually quite useful; some are just insane. Either way it's interesting, so let's dive in...

Sub-addresses

One thing particularly worth noting is sub-addressing, because it is extremely useful and supported almost everywhere. A sub-address lets you create different email addresses that all deliver to the same physical mailbox.

Let's say Bob's email address is bob@smith.com. A sub-address uses a + to add a label, as in bob+spam@smith.com. If Bob signs up to a site with the latter, he still receives the messages as normal at bob@smith.com, but now he can create filters or simply switch off one of the sub-addresses altogether.

One more interesting tidbit: if you use a unique sub-address for each site you sign up to, you will be able to see who sold your email address to someone else... Busted!
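As a rough illustration of the idea, here is a minimal sketch that splits an address into its base mailbox, label, and domain. The function names are hypothetical, and it assumes a plain, unquoted local-part with `+` as the separator; the receiving mail server ultimately decides whether a sub-address is honoured.

```python
from typing import Optional, Tuple


def split_subaddress(address: str, separator: str = "+") -> Tuple[str, Optional[str], str]:
    """Split 'bob+spam@smith.com' into ('bob', 'spam', 'smith.com').

    Assumption: the local-part is unquoted and has no comments.
    """
    # Take the domain from the right-hand side of the last '@'.
    local, _, domain = address.rpartition("@")
    # Everything after the first separator is treated as the label.
    base, sep, label = local.partition(separator)
    return base, (label if sep else None), domain


def canonical_mailbox(address: str) -> str:
    """Return the address with any sub-address label removed."""
    base, _label, domain = split_subaddress(address)
    return f"{base}@{domain}"


print(split_subaddress("bob+spam@smith.com"))
print(canonical_mailbox("bob+spam@smith.com"))
```

Grouping incoming mail by the label is then enough to tell which sign-up an unexpected message traces back to.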

Where the Regexp Starts to Break Down

Unbeknownst to most people, this is actually a valid email address because all of the characters you see are perfectly acceptable in the local-part (that's the bit before the @):

Furthermore, the local-part can contain any characters, including an @ sign, if they are enclosed within double quotes. These are also perfectly valid:

You will notice that the emails above have been partially converted into links by the Markdown parser for Silvrback because they can get so difficult to parse in text as well.
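To make the failure concrete, the sketch below runs a typical "good enough" pattern against addresses of the kinds described here. Both the pattern and the sample addresses are illustrative assumptions, not taken from this article: one address uses only punctuation that is permitted in an unquoted local-part, and the other places an @ inside a quoted local-part.

```python
import re

# A commonly seen simplified pattern (an assumption for illustration only).
NAIVE_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def naive_is_valid(address: str) -> bool:
    """Return True if the simplified pattern accepts the whole address."""
    return NAIVE_PATTERN.fullmatch(address) is not None


samples = [
    "bob+spam@smith.com",                 # ordinary sub-address
    "!#$%&'*+-/=?^_`{|}~@example.org",    # punctuation-only local-part
    '"john@doe"@example.org',             # quoted local-part containing '@'
]

for address in samples:
    # Splitting on the last '@' keeps a quoted '@' inside the local-part.
    local, _, domain = address.rpartition("@")
    print(naive_is_valid(address), repr(local), repr(domain))
```

The simplified pattern accepts only the first sample, even though the other two follow the rules described above. Splitting on the last @ is itself only a heuristic and assumes the domain contains no comments.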

To Insanity and Beyond!

I would be surprised if you're not at least a little bit impressed at how crazy you can get with an email address. However, before you feel the wash of guilt over all the inadequate regular expressions you've implemented or borrowed in your past software, it's about to get even more intense...

Up until now we have still been able to put these rules into a regular expression; in fact, it would look like the monster shown above. But we must continue. It's time to talk about comments.

Comments are arbitrary text enclosed in parentheses that can appear in four possible places in an email address:

All of these have the same semantic meaning. They work in a similar way to sub-addressing in that they are just cosmetic and the email will actually arrive in the a@b.com mailbox.

"If it's worth doing, it's worth overdoing." - Ayn Rand

Once again taking it one step further, comments can be nested:

If you've ever had to write recursive regular expressions, you know it can be very difficult even in the simplest scenarios. Now try mixing that with the monster regular expression above and let your brain explode.
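Nesting is exactly what a classical regular expression cannot count, but a small hand-written scanner can. The sketch below is a simplified assumption rather than a full RFC 822 parser: it removes comments by tracking parenthesis depth, and it leaves parentheses inside double-quoted strings alone. The function name and the sample addresses built around a@b.com are illustrative.

```python
def strip_comments(address: str) -> str:
    """Remove (possibly nested) parenthesised comments from an address."""
    result = []
    depth = 0          # current comment nesting level
    in_quotes = False  # inside a "quoted string", parentheses are literal
    i = 0
    while i < len(address):
        ch = address[i]
        if ch == "\\" and i + 1 < len(address):
            # A backslash escapes the next character; keep or drop the pair.
            if depth == 0:
                result.append(address[i:i + 2])
            i += 2
            continue
        if depth == 0 and ch == '"':
            in_quotes = not in_quotes
            result.append(ch)
        elif in_quotes:
            result.append(ch)
        elif ch == "(":
            depth += 1
        elif ch == ")":
            if depth == 0:
                raise ValueError("unbalanced ')'")
            depth -= 1
        elif depth == 0:
            result.append(ch)
        i += 1
    if depth != 0:
        raise ValueError("unclosed comment")
    return "".join(result)


for sample in [
    "(comment)a@b.com",
    "a(comment)@b.com",
    "a@(comment)b.com",
    "a@b.com(comment)",
    "a@b.com(outer (inner) comment)",
]:
    print(strip_comments(sample))
```

Python's standard `re` module has no recursive patterns, which is why the depth counter does the work here instead of a regular expression.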

Despite the RFC822 spec, we have all agreed that using simple, memorable email addresses seems to be the way to go. Maybe we will find a better use for these features in the future, but until then you can contact me on:

P.S. I'm hoping the spam bots trawling for email addresses on pages like these aren't smart enough to pick up that email in their regular expressions...

EDIT: One thing that may not have been clear is that I was talking about validating an email address through a regular expression. There are of course many other ways to properly validate an email address.

Thank you for reading. I'd really appreciate any and all feedback, so please leave your comments below or consider subscribing. Happy coding.

Tagged: email
