We believe that Stack Overflow should not just be a resource for very specific technical questions, but also for general guidelines on how to solve variations on common problems. "Form based authentication for websites" should be a fine topic for such an experiment.
I just thought I'd share this solution that I found to be working just fine.
I call it the Dummy Field (though I haven't invented this so don't credit me).
In short: you just have to insert this into your
<form> and check for it to be empty at when validating:
<input type="text" name="email" style="display:none" />
The trick is to fool a bot into thinking it has to insert data into a required field, that's why I named the input "email". If you already have a field called email that you're using you should try naming the dummy field something else like "company", "phone" or "emailaddress". Just pick something you know you don't need and what sounds like something people would normally find logical to fill in into a web form. Now hide the
hidden or else the bot won't fall for it.
When you are validating the form (either client or server side) check if your dummy field has been filled to determine if it was sent by a human or a bot.
In case of a human: The user will not see the dummy field (in my case named "email") and will not attempt to fill it. So the value of the dummy field should still be empty when the form has been sent.
In case of a bot: The bot will see a field whose type is
text and a name
I used this method on a guestbook in combination with CAPTCHA, and I haven't seen a single spam post since. I had used a CAPTCHA-only solution before, but eventually, it resulted in about five spam posts every hour. Adding the dummy field in the form has stopped (at least until now) all the spam from appearing.
I believe this can also be used just fine with a login/authentication form.
Warning: Of course this method is not 100% foolproof. Bots can be programmed to ignore input fields with the style
display:none applied to it. You also have to think about people who use some form of auto-completion (like most browsers have built-in!) to auto-fill all form fields for them. They might just as well pick up a dummy field.
You can also vary this up a little by leaving the dummy field visible but outside the boundaries of the screen, but this is totally up to you.
A good article about realistic password strength estimation is:
I dont't know whether it was best to answer this as an answer or as a comment. I opted for the first option.
Regarding the poing PART IV: Forgotten Password Functionality in the first answer, I would make a point about Timing Attacks.
In the Remember your password forms, an attacker could potentially check a full list of emails and detect which are registered to the system (see link below).
Regarding the Forgotten Password Form, I would add that it is a good idea to equal times between successful and unsucessful queries with some delay function.
I do not think the above answer is "wrong" but there are large areas of authentication that are not touched upon (or rather the emphasis is on "how to implement cookie sessions", not on "what options are available and what are the trade-offs".
My suggested edits/answers are
Do NOT try to implement your own login form or database storage of passwords, unless the data being stored is valueless at account creation and self-generated (that is, web 2.0 style like Facebook, Flickr, etc.)
This avoids any need to have "sessions" or cookies as the browser itself will re-encrypt the communication each time. It is the most "lightweight" development approach.
First, we are confusing the initial creation of an account (with a password) with the re-checking of the password subsequently. If I am Flickr and creating your site for the first time, the new user has access to zero value (blank web space). I truly do not care if the person creating the account is lying about their name. If I am creating an account of the hospital intranet/extranet, the value lies in all the medical records, and so I do care about the identity (*) of the account creator.
This is the very very hard part. The only decent solution is a web of trust. For example, you join the hospital as a doctor. You create a web page hosted somewhere with your photo, your passport number, and a public key, and hash them all with the private key. You then visit the hospital and the system administrator looks at your passport, sees if the photo matches you, and then hashes the web page/photo hash with the hospital private key. From now on we can securely exchange keys and tokens. As can anyone who trusts the hospital (there is the secret sauce BTW). The system administrator can also give you an RSA dongle or other two-factor authentication.
But this is a lot of a hassle, and not very web 2.0. However, it is the only secure way to create new accounts that have access to valuable information that is not self-created.
Kerberos and SPNEGO - single sign-on mechanisms with a trusted third party - basically the user verifies against a trusted third party. (NB this is not in any way the not to be trusted OAuth)
SRP - sort of clever password authentication without a trusted third party. But here we are getting into the realms of "it's safer to use two-factor authentication, even if that's costlier"
SSL client side - give the clients a public key certificate (support in all major browsers - but raises questions over client machine security).
In the end, it's a tradeoff - what is the cost of a security breach vs the cost of implementing more secure approaches. One day, we may see a proper PKI widely accepted and so no more own rolled authentication forms and databases. One day...
First, a strong caveat that this answer is not the best fit for this exact question. It should definitely not be the top answer!
I will go ahead and mention Mozilla’s proposed BrowserID (or perhaps more precisely, the Verified Email Protocol) in the spirit of finding an upgrade path to better approaches to authentication in the future.
I’ll summarize it this way:
@domain” is concise and supported by a wide range of protocols and URI schemes. Such an identifier is, of course, most universally recognized as an email address.
This is not strictly “form-based authentication for websites”. But it is an effort to transition from the current norm of form-based authentication to something more secure: browser-supported authentication.
My favourite rule in regards to authentication systems: use passphrases, not passwords. Easy to remember, hard to crack. More info: Coding Horror: Passwords vs. Pass Phrases
When hashing, don't use fast hash algorithms such as MD5 (many hardware implementations exist). Use something like SHA-512. For passwords, slower hashes are better.
The faster you can create hashes, the faster any brute force checker can work. Slower hashes will therefore slow down brute forcing. A slow hash algorithm will make brute forcing impractical for longer passwords (8 digits +)
I'd like to add one suggestion I've used, based on defense in depth. You don't need to have the same auth&auth system for admins as regular users. You can have a separate login form on a separate url executing separate code for requests that will grant high privileges. This one can make choices that would be a total pain to regular users. One such that I've used is to actually scramble the login URL for admin access and email the admin the new URL. Stops any brute force attack right away as your new URL can be arbitrarily difficult (very long random string) but your admin user's only inconvenience is following a link in their email. The attacker no longer knows where to even POST to.
I would like to add one very important comment: -
Many corporations deploy "internal use only" websites which are, effectively, "corporate applications" that happen to have been implemented through URLs. These URLs can (supposedly ...) only be resolved within "the company's internal network." (Which network magically includes all VPN-connected 'road warriors.')
When a user is dutifully-connected to the aforesaid network, their identity ("authentication") is [already ...] "conclusively known," as is their permission ("authorization") to do certain things ... such as ... "to access this website."
This "authentication + authorization" service can be provided by several different technologies, such as LDAP (Microsoft OpenDirectory), or Kerberos.
From your point-of-view, you simply know this: that anyone who legitimately winds-up at your website must be accompanied by [an environment-variable magically containing ...] a "token." (i.e. The absence of such a token must be immediate grounds for
404 Not Found.)
The token's value makes no sense to you, but, should the need arise, "appropriate means exist" by which your website can "[authoritatively] ask someone who knows (LDAP... etc.)" about any and every(!) question that you may have. In other words, you do not avail yourself of any "home-grown logic." Instead, you inquire of The Authority and implicitly trust its verdict.
Uh huh ... it's quite a mental-switch from the "wild-and-wooly Internet."
There's another secure method called SRP, but it's patented (although it is freely licensed) and there are few good implementations available.
Don't ever store passwords as plaintext in the database. Not even if you don't care about the security of your own site. Assume that some of your users will reuse the password of their online bank account. So, store the hashed password, and throw away the original. And make sure the password doesn't show up in access logs or application logs. OWASP recommends the use of Argon2 as your first choice for new applications. If this is not available, PBKDF2 or scrypt should be used instead. And finally if none of the above are available, use bcrypt.
Hashes by themselves are also insecure. For instance, identical passwords mean identical hashes--this makes hash lookup tables an effective way of cracking lots of passwords at once. Instead, store the salted hash. A salt is a string appended to the password prior to hashing - use a different (random) salt per user. The salt is a public value, so you can store them with the hash in the database. See here for more on this.
This means that you can't send the user their forgotten passwords (because you only have the hash). Don't reset the user's password unless you have authenticated the user (users must prove that they are able to read emails sent to the stored (and validated) email address.)
Security questions are insecure - avoid using them. Why? Anything a security question does, a password does better. Read PART III: Using Secret Questions in @Jens Roland answer here in this wiki.
After the user logs in, the server sends the user a session cookie. The server can retrieve the username or id from the cookie, but nobody else can generate such a cookie (TODO explain mechanisms).
Cookies can be hijacked: they are only as secure as the rest of the client's machine and other communications. They can be read from disk, sniffed in network traffic, lifted by a cross-site scripting attack, phished from a poisoned DNS so the client sends their cookies to the wrong servers. Don't send persistent cookies. Cookies should expire at the end of the client session (browser close or leaving your domain).
If you want to autologin your users, you can set a persistent cookie, but it should be distinct from a full-session cookie. You can set an additional flag that the user has auto-logged in, and needs to log in for real for sensitive operations. This is popular with shopping sites that want to provide you with a seamless, personalized shopping experience but still protect your financial details. For example, when you return to visit Amazon, they show you a page that looks like you're logged in, but when you go to place an order (or change your shipping address, credit card etc.), they ask you to confirm your password.
Financial websites such as banks and credit cards, on the other hand, only have sensitive data and should not allow auto-login or a low-security mode.