2015-04-03

Passwords

I posted on this a while ago and was quite impressed with some of the replies.

We are taking on board many of the comments, and have this week been working on the some changes. The status posts go in to some detail, but I thought I would explain more.

Firstly, there is more than can, and will be done over time, but as with all security it is all about having the right level of inconvenience for the risk.

So what is happening?

Well, for a start, we have used hashes for storing passwords in various places for some time. This means we do not know the password, but we can check it. We make the same hash from the attempted password and check against the hash we have stored.

One of the improvements that has been made is salted hashes where, in addition to the password, a salt is used (random data) when making the hash. This means two people with the same password do not have the same hash, and means you cannot make a huge dictionary of common passwords and combinations of dictionary words with pre-calculated hashes (rainbow tables).

However, sadly, some systems work in such a way that we have to know the "plain text" password. This applies to CHAP (Challenge Handshake Authentication Protocol) and similar systems used for VoIP. To check the password we have to know it. However, the systems that allow control and management and even visibility of such passwords can be locked down under systems that can hash the password.

This is the main change we are making... Our "control pages" (called "clueless") were originally for DSL lines, and that was it. We needed plain text passwords for the CHAP login, and that was the only password.

However, over time, things have become more complex. We now allow control of VoIP logins which can be used to make lots of calls and run up bills, using the control pages login. We also allow visibility of call logs. So we need to beef up security a bit.

Salted hashes where we can.

The first step is to ensure salted hashes are used, not only on our accounts system but also on the control pages. We are changing them so the control pages login is a salted hash, and the line login is a separate line password. The latter is just for DSL/L2TP/SIM login and is often ignored if we have a valid login and circuit ID anyway.

These changes will soon be visible, allowing DSL passwords on lines to be seen/changed, but passwords on the login pages to be set using a password change process with https and no plain text passwords in emails and no way for staff, or anyone else, to see the password.

Making a system that allows improvement.

Doing this has meant we have the opportunity to make new password generation and checking code in common libraries used over many systems. These functions will check a salted hash, but they also have the ability to report that the correct login used an old hash of some sort and provide a new hash. The systems allowing logins then update the hash on the next login.

This allows us to improve things in the future.

For today we are using a salted SHA1. This is not ideal as SHA1 is considered a bit quick and efficient for a password hash, but it is a good starting point. There is a competition for password hashing algorithms running now, and the plan is to move to a new best practice password hash in due course. As it happens, FireBricks are moving to heavily salted SHA256 for now.

Having a system library to handle this means that when we upgrade, it will happen automatically. Using the new hash for all new passwords, and updating the hash on existing logins when used.

Of course, ultimately, these libraries may allow OTP and two factor logic to be deployed where sensible.

Creating passwords

Part of the library is a single place to manage the creation of passwords. We now have a function allowing a password to be generated with specified min/max lengths and entropy. This tries to use XKCD/936 style passwords where possible, or words and numbers, or letters and numbers or just and letters/numbers/symbols, as needed to meet the specified size and entropy.

We also try to encourage people to use easy to remember passwords that are system generated and random. This avoids many of the common mistakes of passwords, including: re-using the same password; picking easy passwords; and having to write down passwords.

This does mean a few changes. For many passwords you will see the familiar four word passwords that are long, easy to remember and of the order of 48 bits of entropy, but for some things like WiFi passwords you will see a couple of words and several digits to meet space and entropy requirements.

However, as ever, comments welcome...

17 comments:

  1. Salted hashes aren't a good password storage mechanism - GPUs can now do many MHash/s, so can bruteforce "reasonable" length passwords with alarming speed.

    The recommendations for password storage are bcrypt, PBKDF2 or scrypt. PBKDF2 is kind of the "de jure" standard, bcrypt kind of the "defacto standard", and scrypt is an improvement on bcrypt. All of them are good. Importantly, all of them are *slow*. Hashes are built to be fast; this makes them a poor choice of tool to build a password storage system on top of.

    (PBKDF2 is built upon a hash, yes; in this case, it achieves its' security by iterating the hash many times to slow it down, among some other things)

    ReplyDelete
    Replies
    1. As I explain above, one of the key points here is ensuring we have a means to change to a new key. Salted SHA1 will suffice for now as even at millions of hashes a second it would take years to get one password, and you have to have got the hashes first. However, we expect to change to another password specific hashing function once we have the new system deployed and working.

      We are also changing so that staff cannot set a password any more - staff will never see/hear a user password. They will have options to lock out (reset) a password, and to email a link to allow the end user to set a password, but not to actually set the password themselves where hashing is used.

      Delete
    2. It's not obvious to me that 'Because the shop will take them back' is a great answer to "Why did you just go and buy a pair of shoes which you knew didn't fit?"

      At some point in the last few weeks you must have had to make a decision between "known to be not very good homebrew SHA-1-based algorithm" and "known to be a lot better bcrypt/scrypt/whatever proper password crypto/hash algorithm". What made you go with the former?

      Delete
    3. Not quite sure I follow - there is a lot to be said for changing one thing at a time and we know and use SHA1 hashes. The fact that even at millions of hashes a second one of the passwords would take years to break means the solution is adequate (the shoes fit). The work is a step forward, regardless. There is not really "one best practice password hashing solution" at present, as seen by the fact that there is a competition to find one. And finally, a really good password hash will have downsides that need careful consideration when we move on to it. Building a mechanism that means all of the password hashing we do over several systems can move to our chosen hash easily in future, is sensible. Delaying that while we consider and choose a new hashing algorithm seemed somewhat pointless.

      Delete
    4. Well, fair enough, I misunderstood the article to mean that 'salted SHA1' was actually a new thing you'd introduced, not some legacy thing you were sticking with for the time being. But you should definitely be thinking in terms of thousands of Mh/s for SHA1 nowadays, not just Mh/s - and that's just for single GPU systems, not special rigs built for password cracking.

      Delete
    5. An AMD R9 295X2 graphics card (~£500 new) is capable of 5 _billion_ SHA-1 hashes per second.

      In other words, its' capable of brute force cracking any 10 character password matching [A-Za-z0-9] password in just ~1/2th a second. A&A customers probably have stronger than average passwords, but still, the scale of the problem is clear

      Salted SHA-1 isn't good enough. Salted SHA-2 isn't good enough. Secure password storage *needs* to be done using a purpose designed *slow* algorithm.

      Delete
    6. Thanks for the details, and as I say we are working towards that. You do have to get the hashes first, obviously.

      Delete
    7. The word you want by the way is "pessimisation". The opposite of optimisation. The idea of PBKDF or bcrypt() is to deliberately make it more CPU intensive to implement the algorithm. It's not enough to make the sample implementation slow, any possible approach to the problem must be slow too or the bad guys will just use a better one than you have.
      More recently the "resources" the bad guys have are increasingly massively parallel graphics cards. Because they do many operations in parallel a good approach is to run them out of memory. If your login server handles no more than 10 logins at once, and each needs 1MB of RAM to perform the needed calculations quickly, it's no problem, but for a graphics chip running a million copies of the algorithm that's 1TB of RAM total, and so the bad guys are crippled. scrypt() applies this approach, the scrypt() algorithm can be performed with low memory, but only much more slowly, forcing bad guys to trade while normal users needn't care unless verifying passwords is a substantial fraction of their business.

      Delete
    8. Owen, how do you get (62^10)/(5*10^9) to come out to half a second? The correct value is about five years.

      Alternatively, building a lookup table requires about 7 exabytes of storage.

      Delete
  2. Won't passwords be old hat soon with the advent of Steve Gibson's SQRL system?
    https://www.grc.com/sqrl/sqrl.htm

    ReplyDelete
  3. SHA1 is considered to be an at risk algorithm, it has been at least partially compromised and should not be used for any new systems.

    ReplyDelete
    Replies
    1. SHA1's potential weaknesses are inconsequential for password hashing.

      Delete
  4. Sorry to once again be pointing at others content, but Troy Hunt has a good blog on how password resets should be handled here - http://www.troyhunt.com/2012/05/everything-you-ever-wanted-to-know.html

    I really rate Troy Hunt as a security blogger, (partly as I've not found much in the way of valid criticism of him about) and the person behind https://haveibeenpwned.com

    Again, hoping that this is helpful!

    ReplyDelete
    Replies
    1. Good article, and we have most of those points already for the main accounts and control pages password reset process.

      Delete
  5. The issue with auto-generating XKCD style passwords is that the entropy is much lower than if you are an end-user creating them freely in a sea of passwords which are not generated this way.

    If an attacker knows that AAISP uses four words concatenated, brute forcing becomes a variation of a dictionary attack, combining words in a dictionary. There's no need for an attacker to brute force the individual letters.

    Of course it all depends on the size of your dictionary and whether you use other mitigating factors, but I'm assuming that your dictionary isn't too large as it will contain only words that most people know. Otherwise the passwords wouldn't be memorable.

    ReplyDelete
    Replies
    1. The xkcd.com/936 cartoon doesn't go into too much detail regarding the information theoretical arguments about why this isn't the case but it does show the entropy content of the passwords generated which is the main thing.

      The 4 "symbols" in the xkcd.com/936 style passwords contain a certain amount of entropy. The 11 symbols in the traditional password that they cite contain a different amount of entropy. You can calculate these numbers for yourself by estimating the number of words in a dictionary and the number of glyphs in the (approximate) set [A-za-z0-9!"£$%^&*()<>/;:'@#~\[{\\\|`¬]}].


      It turns out that there's more entropy in those 4 symbols than in the 11 randomly chosen glyphs. Therefore there's less likelyhood of randomly selecting the right combination for the xkcd.com/936 passwords as there is for regular passwords.

      This argument in the cartoon is not entirely independent of assumptions about the style of passwords you are a trying to crack but he mentions that in the footnote of the first frame. Still, the numbers he comes up with are several orders of magnitude apart.

      Delete
    2. Importantly, his assumptions on entropy are based on knowing the scheme used and even the word lists used. We picks some 4000+ word lists, and two of them, so 12 bits per symbol when knowning the word lists used!

      If the scheme and word lists are not known you add many more bits to each symbol or consider it to be a sequence of letters which would be massive!

      His logic is sound and the details depend on implementation.

      Delete

Comments are moderated purely to filter out obvious spam, but it means they may not show immediately.

Trying Tindie

So some good news, it is worked. I tried Tindie for the "coasters", listed 5 of them, and by the end of the day all sold and shipp...