For simplicity's sake, I sanitize the submission by removing all text within double quotes and those associated surrounding double quotes before validation, putting the kibosh on email address submissions based on what is disallowed. Just because someone can have the John.."The*$hizzle*Bizzle"[email protected] address doesn't mean I have to allow it in my system. We are living in the future where it maybe takes less time to get a free email address than to do a good job wiping your butt. And it isn't as if the email criteria are not plastered right next to the input saying what is and isn't allowed.
I also sanitize what is specifically not allowed by various RFCs after the quoted material is removed. The list of specifically disallowed characters and patterns seems to be a much shorter list to test for.
Disallowed:
local part starts with a period ( [email protected] )
local part ends with a period ( [email protected] )
two or more periods in series ( [email protected] )
&’`*|/ ( some&thing`[email protected] )
more than one @ ( which@[email protected] )
:% ( mo:characters%mo:[email protected] )
In the example given:
John.."The*$hizzle*Bizzle"[email protected] --> [email protected]
[email protected] --> [email protected]
Sending a confirm email message to the leftover result upon an attempt to add or change the email address is a good way to see if your code can handle the email address submitted. If the email passes validation after as many rounds of sanitization as needed, then fire off that confirmation. If a request comes back from the confirmation link, then the new email can be moved from the holding||temporary||purgatory status or storage to become a real, bonafide first-class stored email.
A notification of email address change failure or success can be sent to the old email address if you want to be considerate. Unconfirmed account setups might fall out of the system as failed attempts entirely after a reasonable amount of time.
I don't allow stinkhole emails on my system, maybe that is just throwing away money. But, 99.9% of the time people just do the right thing and have an email that doesn't push conformity limits to the brink utilizing edge case compatibility scenarios. Be careful of regex DDoS, this is a place where you can get into trouble. And this is related to the third thing I do, I put a limit on how long I am willing to process any one email. If it needs to slow down my machine to get validated-- it isn't getting past the my incoming data API endpoint logic.
Edit: This answer kept on getting dinged for being "bad", and maybe it deserved it. Maybe it is still bad, maybe not.