Designing Vault’s generator algorithm

A few weeks ago now, I released Vault, a stateless password generator. It took about six months to develop, which sounds ridiculous for something that’s only a couple hundred lines of JavaScript, but during that time I learned a fair bit about cryptography that changed the generator’s design, and I’d like to discuss that here.

Vault’s password generation algorithm went through about four distinct iterations between the 0.1 release and the for-public-consumption 0.2 version. The strategy throughout could be summarized as:

  • Take two values from the user and make a hash (A) out of them
  • Construct a character set (B) acceptable by the target site
  • Encode the hash A using the character set B

However the details of how you do each of these things affect the security of the system in important ways, as we’ll see.

The inputs to the process are:

  • The phrase: a master passphrase the person uses to generate all her passwords
  • The service: the name of the site the person is logging into
  • Optionally, a set of character constraints on the generated password, e.g. ‘must not contain spaces’, or ‘must have at least two uppercase letters’

The initial design worked like this: you start by calculating the SHA-256 hash of a combination of the phrase, service and a UUID that’s fixed in Vault’s codebase. (Vault uses either crypto-js or Node’s crypto module, depending on where it’s running.)

var hash = CryptoJS.SHA256(phrase + Vault.UUID + service);

Let’s take a concrete example:

var phrase  = 'you look nice today',
    service = 'gmail';

var hash = CryptoJS.SHA256(phrase + Vault.UUID + service);
// -> 'af7382c35e42ae2a5599a497d69ee5484559925265e609d69dc5026a4a214d90'

Then you turn this hash into binary form, like so:

Vault.toBits = function(digit) {
  var string = parseInt(digit, 16).toString(2);
  while (string.length < 4) string = '0' + string;
  return string;
};

var binary = hash.split('').map(Vault.toBits).join('');
// -> '1010111101110011100000101100001101011110010000101010111000101010 ...'

So now you have a big blob of random-looking data based on the phrase and service, and you need to encode this in a format acceptable to the site you’re logging into. This means picking an appropriate character set. Vault has a collection of different types of characters (these are from the stable 0.2 release):

Vault.LOWER     = 'abcdefghijklmnopqrstuvwxyz'.split('');
Vault.UPPER     = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'.split('');
Vault.ALPHA     = Vault.LOWER.concat(Vault.UPPER);
Vault.NUMBER    = '0123456789'.split('');
Vault.ALPHANUM  = Vault.ALPHA.concat(Vault.NUMBER);
Vault.SPACE     = [' '];
Vault.DASH      = ['-', '_'];
Vault.SYMBOL    = '!"#$%&\'()*+,./:;<=>?@[\\]^{|}~'.split('').concat(Vault.DASH);
Vault.ALL       = Vault.ALPHANUM.concat(Vault.SPACE).concat(Vault.SYMBOL);

Vault.ALL ends up having 94 unique characters in it: all the printable ASCII characters, except for the backtick which I couldn’t find on my Android keyboard, so I thought it might cause problems. In the simplest case, you just encode the bits of the hash using Vault.ALL, which means selecting characters from a set of 94.

To select a number, we first figure out how many bits we need.

Math.ceil(Math.log(94)/Math.log(2))
// -> 7

So we shift 7 bits off the front of our hash, and see what number it gives us. In this case, the first 7 bits are 1010111, which is 87. Vault.ALL[87] is ^, so ^ is the first character of our password. The next 7 bits are 1011100 which is 92, so our password is now ^-. We continue this until we have a password of the desired length.

Obviously, some strings of 7 bits give values outside the range 0–93. I’ll come to this later.

But this system does not allow for any constraints. Say we want the password to contain no spaces and at least 2 uppercase letters, which we represent like this:

var constraints = {space: 0, upper: 2};

We need to force the generator to obey these constraints in a secure, deterministic way. This means the ordering needs to appear random when sampled over many phrase/service pairs (i.e. the uppercase letters must not always appear in the same place), but must always be the same for a given pair. We’ve already got a deterministic random number: our binary hash. We can use it both to select characters, and select which set to draw each character from.

Let’s say we want an 8-character password with the above constraints. We start by making an empty list and a copy of Vault.ALL:

 var required = [],
     allowed  = Vault.ALL.slice();

required will be a list of character sets as long as the required password. In order to make sure our password gets the required sets, we begin by pushing each required set onto the list the required number of times. We want 2 uppercase characters, so we add Vault.UPPER to the list twice. Then we remove any disallowed characters from the allowed list, and push copies of the allowed list onto required until it’s long enough – in this case, 8 characters long. In code this looks like:

for (var type in constraints) {
  var n   = constraints[type],
      set = Vault[type.toUpperCase()];
  
  if (n === 0) {
    removeCharacters(allowed, set);
  } else {
    while (n--) required.push(set);
  }
}
while (required.length < length)
  required.push(allowed);

In our example, this means required ends up looking like this, where allowed is Vault.ALL with Vault.SPACE removed:

required = [Vault.UPPER, Vault.UPPER, allowed, allowed,
            allowed, allowed, allowed, allowed];

Wait! Did you spot the deliberate mistake? That’s right, for in loops on objects do not guarantee order, so the constraints {digit: 3, upper: 2} and {upper: 2, digit: 3} might result in different orderings of required and hence different passwords. This is clearly wrong, let’s fix it:

var TYPES = 'LOWER UPPER DIGIT SPACE DASH SYMBOL'.split(' ');

for (var i = 0, m = TYPES.length; i < m; i++) {
  var n   = constraints[TYPES[i].toLowerCase()],
      set = Vault[TYPES[i]];
  
  // etc.
}

Now we have everything we need. Again, we take our hash and take some bits off the front to select items from lists. First thing we need to know is which set from required to use. required is 8 items long, so we need a 3-bit number. The first 3 bits are 101 which gives 5, and required[5] is the allowed set. We pop this out of the list using required.splice(5,1), and use the next few bits to select a value from allowed as before. We go through our list of bits, using them to pick a character set and than a character from that set, until we’ve used all the items in the required list. This gives an apparently random ordering of character sets, and random selection from each set. Easy.

Well, not quite. There are several implementation details that ended up making this design insecure that needed fixing to get to 0.2. See if you can spot them before continuing.

Okay, the most glaring one first:

var hash = CryptoJS.SHA256(phrase + Vault.UUID + service);

As I’ve previously explained, this is vulnerable to an extension attack on the service name. If someone discovered your gmail password, say, they could figure out your password for a service whose name begins with the letters gmail. A small risk, and possibly confounded by character constraints, but worth fixing. I switched to using HMAC-SHA256 to make the hash, and suffixed the service name with the UUID.

var hash = CryptoJS.HmacSHA256(secret + Vault.UUID, phrase);

This effectively means the user is signing the service name using their passphrase, a more sensible model for what Vault does, and it gets rid of the extension attack risk.

The second problem is not a threat to the security of the passwords Vault is capable of generating, but a practical problem: because SHA-256 produces a finite output, we can only generate passwords up to a certain length, depending on how many bits you spend on selecting character sets. Users didn’t want this artificial-seeming limit, so we needed a way to get as many bits as we like. Fortunately there’s a standard crypto function for that: the key-expanding function PBKDF2. This is based on a hashing function (the default is HMAC-SHA1, so we’re still using a MAC rather than a hash), but can safely generate arbitrarily large output. We just calculate how many bits we need for a password of a certain length, and generate them:

var hash = CryptoJS.PBKDF2(phrase,
                           service + Vault.UUID,
                           {keySize: k, iterations: i});

PBKDF2 takes an iterations argument that tells it how many rounds of hashing to do; this is basically used to make the function more expensive. This is important on the server-side where passwords must be stored using expensive hashes, but in Vault it’s only used because it can generate arbitrary amounts of output. (I arbitrarily set it to use 8 iterations, though unless this number’s very large it doesn’t make much practical difference what you use. However expensive you make it for the user to generate the password, they’re still sending it over a wire and asking someone to store it, and the burden of protection needs to lie with the server at that point. Making Vault as slow as server-side hashing ought to be would just be annoying.)

Great, so now we’re combining phrase and service to make a secure hash as large as we like, job done, right? Nope. Remember I said I’d come back to how the bits are turned into numbers to select items from lists? Well it turns out that was badly broken for a long time.

Say you need to select an item from a 94-ary list, which Vault does very frequently. 94 sits somewhere between 64 and 128, so to make sure you cover all possible values you need to use 7 bits to select a number. But this could give you a number you can’t use: say you get 1110001 (113), what do you do with it? Well, Vault’s original plan was: if you take 7 bits and get a number you can’t use, only keep the first 6 bits, and leave the 7th in the queue. This gets you numbers you can always use, but has the horrible side effect of producing bias: it makes some numbers more likely than others.

We can show this with an experiment. Here’s a small script that models Vault’s old encoding process, for selecting digits between 0 and 9:

var stream  = [],
    size    = 65536,
    results = {};

while (size--) stream.push(Math.floor(Math.random() * 2));

while (stream.length > 4) {
  var bits = stream.splice(0,4),
      num  = parseInt(bits.join(''), 2);
  
  if (num > 9) {
    stream.unshift(bits.pop());
    num = parseInt(bits.join(''), 2);
  }
  results[num] = (results[num] || 0) + 1;
}

console.log(results);

When we run this, we see that some numbers are substantially more likely than others:

$ node test.js 
{ '0': 1137,
  '1': 1120,
  '2': 1132,
  '3': 1121,
  '4': 1117,
  '5': 3473,
  '6': 3339,
  '7': 3369,
  '8': 1116,
  '9': 1144 }

This same systematic bias was present in Vault too. It meant that some orderings of character sets were more likely than others, and within each set some characters were more likely than others, by a factor of 3. If given a collection of passwords, this bias was enough to easily identify them as being generated by Vault rather than being truly random noise. For the generated passwords to be safe, they had to be evenly distributed.

The easiest way to implement this is to simply throw a number away and try the next group of bits if the number if too large. This is safe but inefficient. After asking around on Twitter, Christian Perfect showed me a better way that attempts to ‘recycle’ digits that are too large but pushing them into higher-order streams. This works great, and is exactly what Vault implements. It means Vault passwords now have the same character distribution as genuine pseudo-random noise.

Those are the major changes Vault went through before I decided it was safe enough to release. An important factor all these changes have in common is that they’re all irreversible: every one of them changes the output of the generator such that a set of inputs gives one password today and another password tomorrow. This is not acceptable so all these bugs needed stamping out before going live. There are other potential problems that can be fixed without breaking things though. For example, if it turns out there are cases where we’re not generating enough bits, we can just bump up the PBKDF2 key size and it won’t change existing outputs. Asking PBKDF2 to double its output size replaces ABC with ABCDEF: the bits you were already generating don’t change. Similarly, we can introduce new character sets without breaking old passwords as long as we add them without changing the order the existing sets appear in Vault.TYPES, which affects set ordering before encoding.

Backwards compatibility is a deal-breaker on this project: if you mess it up, everyone loses their passwords. The milestone the let me ship was realizing that all the remaining potential problems were reversibly fixable. Fortunately they’ve not come up yet.