The password problem [1] can be stated in the following way:

  • Passwords must be strong enough to be secure, but

  • Passwords must be remembered by users.

These two requirements present users with conflicting constraints. As password policies require users to include numbers, uppercase letters, and special characters in their passwords, the resulting strings become less meaningful and more difficult to remember. In short, more secure passwords are more difficult to remember because they are more random.
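To make the link between character classes and randomness concrete, here is a minimal sketch that computes the entropy of a password whose characters are drawn uniformly at random. The function name and the choice of Python's `string` constants as the alphabet are illustrative assumptions, not part of any particular password policy.

```python
import math
import string

def random_password_bits(length: int, alphabet_size: int) -> float:
    """Entropy in bits of a password whose characters are chosen
    uniformly and independently from an alphabet of the given size."""
    return length * math.log2(alphabet_size)

# Assumed alphabets, for illustration only.
lowercase_only = len(string.ascii_lowercase)
all_classes = len(string.ascii_letters + string.digits + string.punctuation)

print(round(random_password_bits(8, lowercase_only), 1))  # 8 lowercase letters
print(round(random_password_bits(8, all_classes), 1))     # 8 printable ASCII symbols
```

These figures are an upper bound: they hold only when every character is chosen at random, which is exactly what makes such passwords hard to remember.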

However, randomness is not the whole story. A set of passwords is only truly strong if it is unpredictable to an attacker: even if a password appears random, an attacker could already know it from a data breach. To explore this, my dissertation estimated password strength using machine-learning models built from the passwords leaked in data breaches.
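As a rough illustration of why leaked passwords matter, the sketch below ranks passwords by how often they appear in a leaked list and reports how many guesses a frequency-ordered attacker would need. This is a deliberately simple stand-in, not the machine-learning model from the dissertation; the function names and the tiny `leaked` list are hypothetical.

```python
from collections import Counter

def build_guess_order(leaked):
    """Map each leaked password to its rank when guessing
    from most frequent to least frequent (1 = first guess)."""
    counts = Counter(leaked)
    return {pw: rank
            for rank, (pw, _) in enumerate(counts.most_common(), start=1)}

def guess_number(password, guess_order):
    """Return the number of guesses needed, or None if the
    password never appears in the leaked data."""
    return guess_order.get(password)

# Hypothetical leaked data, for illustration only.
leaked = ["123456", "password", "123456", "Xq7!vRz2", "password", "123456"]
order = build_guess_order(leaked)

print(guess_number("123456", order))
print(guess_number("Xq7!vRz2", order))
print(guess_number("correct horse", order))
```

Note that `Xq7!vRz2` looks random but is still guessed quickly here, because it appears in the leaked list. A password absent from the list returns `None`, which only means this simple attacker would not guess it, not that it is strong.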

Billions of passwords have been leaked from over 600 websites (as of this writing), rendering many human-chosen passwords insecure. In addition, machine learning can mimic how humans generate passwords, making passwords we will create in the future insecure as well.

The chart below shows how the model I developed performed on a random sample of 1,000 passwords created under an 8-character password policy. Passwords created under this policy are easy to predict, which makes it a poor choice compared to other policies. Much of my work focused on finding better password policies. Our research group also used this framework to evaluate the passwords of over 25,000 students, faculty, and staff at Carnegie Mellon University (CMU), providing further evidence of the strength (or weakness) of human-chosen passwords.

[Chart: model performance on a random sample of 1,000 passwords created under an 8-character password policy]

My dissertation is available here and the code for the framework I developed can be found on GitHub.


  1. The phrase "password problem" is attributed to Susan Wiedenbeck, Jim Waters, Jean-Camille Birget, Alex Brodskiy, and Nasir Memon. 2005. Authentication using graphical passwords: effects of tolerance and image choice. In Proceedings of the 2005 symposium on Usable privacy and security (SOUPS '05). Association for Computing Machinery, New York, NY, USA, 1–12. https://doi.org/10.1145/1073001.1073002.