In current web-based login protocols, a person logs in to a service provider by sending his user identity and password to the server in question, which then looks up the corresponding record in its database and performs a comparison to determine whether the password is valid. The password is typically not stored in plaintext; rather, a “salted one-way function” of the password is stored. This means that if somebody gains access to the database of the service provider, they will not be able to read plaintext passwords directly from it. However, the password itself is generally sent to the server before the salted one-way function is applied.
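The storage scheme described above can be sketched as follows. This is a minimal illustration only: the choice of PBKDF2 with SHA-256, the iteration count, and the function names make_record and check_password are assumptions made for this example, not details taken from any particular service provider.

```python
import hashlib
import hmac
import os
from typing import Tuple

ITERATIONS = 100_000  # assumed work factor, chosen only for illustration


def make_record(password: str) -> Tuple[bytes, bytes]:
    """Return (salt, digest), which the server stores instead of the password."""
    salt = os.urandom(16)  # fresh random salt for every user record
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def check_password(candidate: str, salt: bytes, digest: bytes) -> bool:
    """Recompute the salted one-way function and compare in constant time."""
    recomputed = hashlib.pbkdf2_hmac("sha256", candidate.encode("utf-8"), salt, ITERATIONS)
    return hmac.compare_digest(recomputed, digest)
```

Note that check_password receives the candidate password in the clear, which mirrors the weakness pointed out above: the one-way function protects the database, not the password in transit.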
In order to protect the session against an eavesdropper, it is common to encrypt the transmission of the password. However, passwords are only used in situations where the two communicating machines do not store any prior cryptographic key – if they did, then passwords would be an inferior alternative to standard cryptographic authentication mechanisms such as digital signatures (such as RSA) or message authentication codes (such as HMAC), for example.
Known vulnerabilities of current password authentication practices.
There are many potential attacks that can be mounted on the common password authentication method. To begin with, one should notice that the method described above does not offer any protection against an attack in which an attacker claims to be a service provider and convinces a user to attempt to log in – clearly, if there is no encryption, then the attacker will simply obtain the password of his victim. The same holds if encryption is used, but the attacker sends the user a public key for which he knows the corresponding secret key, instead of the bank’s public key. This may occur even if certificates are employed; see, e.g., www.shmoo.com/idn/. In fact, in many, if not most, scenarios, average users are not capable of distinguishing authentic from illegitimate certificates.
Recently, this type of attack, referred to as phishing, has become very common, and is used by attackers wanting to perform identity theft. Its popularity can be seen from the now daily examples of so-called phishers trying to harvest passwords by sending out emails that appear to originate from a bank. Even though the phishers’ success rate is relatively low, this is a profitable attack, as evidenced by how common it is. This is due to the ease with which attackers can spam large populations at a negligible cost, and the straightforwardness of spoofing emails. There are indications that the problem may become worse as attackers become more sophisticated; see, e.g., www.markus-jakobsson.com/papers/phishing_jakobsson.pdf
Known defenses against attacks.
Some researchers have argued that the above-described attack could be prevented if a mutual authentication protocol were used. Well-known examples of such protocols include EKE, SPEKE, and PAK. In these protocols, neither party learns the password from the other; the two parties simply determine whether they both know the same password. This does not leak any information to an impostor, other than the confirmation that a guessed password was incorrect (which it will be with an overwhelming probability, of course). Once a mismatch has been determined, the session is ended.
The user would not have to do anything unusual in order for a mutual authentication protocol to be executed. He would enter his user name and password, and the software performing the mutual authentication would connect to the other party, send the user name to it, and then perform a comparison of the password the user entered and the password the other party has stored. If the comparison succeeds, then the user gains access to the service; if not, then the log-in is aborted on both ends. Thus, if an attacker tries to impersonate a bank to a user, then the user will – by running the correct software – not give out his password to the attacker. Similarly, if the attacker were to contact the bank and try to impersonate the user, then he would fail for similar reasons. Finally, if the attacker contacts both user and bank – a so-called “man-in-the-middle attack” – then one of two things will occur: (1) the user and the bank both agree that the same password is used, but the attacker does not learn this password, or any part thereof; or (2) the attempt fails and both user and bank abort – again, the attacker does not learn any part of the secret.
Problems with known mutual authentication methods.
All known mutual authentication methods suffer from a particular vulnerability that is likely to be taken advantage of by phishers. Before we describe this vulnerability, we must note that phishing is not a matter of technology alone, but in equal part a matter of social engineering. Namely, a phisher may not necessarily try to circumvent security measures by breaking cryptographic protocols, but may instead attempt to make his victims use these incorrectly, or not at all. This is the crux of the vulnerability: if a phisher can create a window that looks like the window shown during a mutual authentication session, but which is not one, then he can dupe the user into entering his password into a form that will cause it to be sent over to the attacker. We refer to this as the doppelganger window attack. This attack works because the user does not know whether he is actually using the mutual authentication program or not. Just as today’s phishers create websites that look like banking websites, it is possible to customize the appearance of windows to make them look exactly like those used for mutual authentication – at least with all browsers currently available. (Given that mutual authentication is not deployed on a large scale, the latter is merely an existential observation, of course.)
If browsers were to display some particular information when a mutual authentication session is to be started, then this indicator could be attacked in the same way as https sessions can be attacked – whether by hoping that the user does not notice its absence, or by a method similar to that described in www.shmoo.com/idn/.
Delayed Password Disclosure.
Delayed Password Disclosure (DPD) is a method that has been developed to address the above problem. It is a mutual authentication technique, just like the techniques mentioned above, and may in fact even be based on one of those, in order to inherit their security features. In addition, it achieves protection against the doppelganger-window attack.
The technique is based on augmenting each user password with a sequence of images that are specific to the user and service provider, and specific to the particular password. Each user would learn to recognize his sequence of images. At the onset, and before he has learnt to recognize it, our solution achieves the same level of security as traditional protocols for mutual authentication. However, as he learns to recognize the images, our technique becomes more secure.
DPD works according to the following principle: instead of the user entering his entire password and the comparison then being initiated, a comparison is done after each character. However, neither side learns the actual outcome of a comparison – except for the comparison performed after the last character. Instead, the result of each intermediate comparison is used to select an image to be displayed in the user’s window. One such image is therefore displayed for each character of the password. If the user does not recognize a given image, then he will halt the password authentication session immediately. Even though the set of available images may not be specific to the user (only the choice among them is), we obtain probabilistic protection against an attacker, based on the low likelihood of being able to guess the correct image to send over to the user. This probability can easily be made smaller than the probability of simply guessing the password. The images do not need to be stored on the user’s computer, but can be downloaded from the bank server when needed, using a method referred to as “oblivious transfer”. This well-known cryptographic method prevents the bank from detecting which image is being accessed and downloaded – the same holds for impersonators of the bank, of course.
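To make the per-character principle concrete, the following sketch maps each password prefix to an image number. It is only an illustration of the idea that one image corresponds to each character entered so far; it computes everything in the clear on one side and leaves out the oblivious transfer entirely. The use of HMAC-SHA256, the names image_index and image_sequence, and the size of the image set are assumptions made for this example, not the actual DPD construction.

```python
import hashlib
import hmac
from typing import List

NUM_IMAGES = 1_000_000  # assumed size of the image set


def image_index(bank_secret: bytes, username: str, prefix: str) -> int:
    """Image number for the password characters entered so far (hypothetical mapping)."""
    message = username.encode("utf-8") + b"\x00" + prefix.encode("utf-8")
    mac = hmac.new(bank_secret, message, hashlib.sha256).digest()
    return int.from_bytes(mac[:8], "big") % NUM_IMAGES


def image_sequence(bank_secret: bytes, username: str, password: str) -> List[int]:
    """One image number per password character, in the order the user would see them."""
    return [
        image_index(bank_secret, username, password[:length])
        for length in range(1, len(password) + 1)
    ]
```

Two passwords that share their first characters yield the same first images in this sketch, so a user who mistypes a later character would still recognize the earlier images and only then see an unfamiliar one.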
A look under the hood.
Let us now briefly look at how the technique works. For ease of exposition, we will describe this using a metaphor for the real solution. In our example, we assume that Alice’s password is broken up into pieces, each one of which is five bits long. Each one of these portions would correspond to a character Alice enters. (In reality, we need a few more bits, but that does not change the principle of the solution.) In particular, we say that the first portion of the password is “01001” when written in binary. We assume that the Bank knows Alice’s password as well, and for simplicity, that this is stored in plaintext.
In our metaphor of the solution, Alice (on the left) puts carbon papers over positions in a grid corresponding to the first character of her password. Here, if the first bit of this password portion is a zero, then the first row of the first column is selected. If it is a one, then the second row is selected. Similarly, the second bit of the password portion is represented in the second column, and so on. Thus, the password portion “01001” corresponds to carbon papers over the positions shown in the figure below:
Finally, Alice takes the piece of paper with the pieces of carbon paper attached to selected positions, and puts this in a “magic envelope” – an envelope only Alice can open! She sends this to the party she thinks is the bank. The bank selects a sequence of random numbers (indicated in red) that add up to the number of the image that Alice should see after correctly entering the first character of her password. In our example, this is the number zero. (This number is simply chosen at random the first time Alice uses the protocol with this particular bank, and then stored by the bank.) Then, the bank selects a sequence of random numbers (in green) that do not add up to anything in particular (but which add up to the same random number each time the protocol is run, for a technical reason we will not go into here). The bank writes all of these numbers on the envelope using invisible ink. The “red” invisible digits are written in positions that correspond to the password (in the same way as Alice selected the positions in which to place carbon paper). The “green” invisible digits are written in the other positions.
Then the envelope is sent back to Alice. Alice cannot read any of the digits until she opens the envelope. Once it is open, she can read only those digits that were written in positions with carbon paper. She adds these up, and displays the image with that number. If she recognizes this image, then with high likelihood she is speaking to the bank. If not, then she knows she is under attack and stops the login process. Otherwise, she continues by entering her second password character. This will cause the above protocol to be performed again, but with a new image being produced. Again, Alice stops if she does not recognize it, but continues otherwise. After she has entered the last character and hit enter, a traditional protocol for mutual authentication is run. This allows the bank to verify that Alice knows the correct password, but without enabling any man-in-the-middle attack.
We note that a Bank-impersonator will not be able to obtain the password from Alice, since it will select the wrong positions with high probability. (The set of images can be made very large, thus making the probability of correctly guessing the image very small.) Similarly, an Alice-impersonator speaking to the bank will not learn anything, even if she can open the envelope. This is so, since if she selects the wrong positions for the carbon paper, she will simply receive a set of random numbers. She cannot tell a “red” random number from a “green” random number: everything copied by the carbon paper looks the same. (We use color only to make the figure easier to interpret.) One could argue that an attacker could place carbon paper over all positions, but this is not possible in a real implementation of this solution.
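The arithmetic behind the envelope metaphor can be simulated in a few lines. The sketch below is a toy model under stated assumptions: sums are taken modulo an assumed image-set size, and Alice's reading of the grid is a plain lookup, whereas the real solution would use oblivious transfer so that she can read only one position per column and the bank cannot see which. All names (shares_summing_to, bank_fill_grid, alice_read) are hypothetical.

```python
import secrets
from typing import List

NUM_IMAGES = 1_000_000  # assumed size of the image set; all sums are taken modulo this
BITS = 5                # one password portion is five bits, as in the example


def shares_summing_to(total: int, count: int) -> List[int]:
    """Random numbers whose sum is `total` modulo NUM_IMAGES."""
    parts = [secrets.randbelow(NUM_IMAGES) for _ in range(count - 1)]
    parts.append((total - sum(parts)) % NUM_IMAGES)
    return parts


def bank_fill_grid(password_bits: List[int], target_image: int, decoy_sum: int) -> List[List[int]]:
    """Write 'red' numbers in the password positions and 'green' numbers elsewhere."""
    red = shares_summing_to(target_image, BITS)
    green = shares_summing_to(decoy_sum, BITS)
    grid = [[0] * BITS for _ in range(2)]  # grid[row][column], row 0 for bit 0, row 1 for bit 1
    for column, bit in enumerate(password_bits):
        grid[bit][column] = red[column]
        grid[1 - bit][column] = green[column]
    return grid


def alice_read(grid: List[List[int]], entered_bits: List[int]) -> int:
    """Add up the numbers under the carbon paper; the result is the image number."""
    return sum(grid[bit][column] for column, bit in enumerate(entered_bits)) % NUM_IMAGES
```

With the portion “01001” and target image zero, reading the positions [0, 1, 0, 0, 1] yields zero, while any other choice of positions mixes red and green numbers and yields a value that looks random to the reader.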
Attack scenarios and security guarantees.
Let us now consider a few attack scenarios for the purpose of illustration. In a first scenario, the attacker impersonates a bank to the user, and manages to make the user connect to the attacker using the proper DPD software. Then, given that this is a mutual authentication method, we can see that the attacker does not learn any information about the password. The same holds if the attacker were to try to impersonate a user to the bank, and if the attacker mounts a traditional man-in-the-middle attack on a user and a bank. Note that the attacker learns neither password nor images by performing these attacks. While it may not seem obvious at first that he learns no information about the images, this is due to the fact that the actual image is not sent over by the bank, but rather computed by the user machine as a function of the password characters entered so far, and the transcript received from the bank. This means that each input character will result in a valid-looking image – but the attacker cannot tell whether it is the right one or not!
In a second scenario, the attacker manages to display a doppelganger window on the user’s screen, and the user is tricked into performing a password authentication. Since the doppelganger window outwardly looks like a valid DPD window but is not one, the attacker manages to have the user establish a potentially unencrypted session directly with the attacker. The user enters the first character of the password. Now the attacker has to guess what image this corresponds to – note that the set of candidate images may be substantially larger than the number of alphanumeric symbols, given that the correct image is also a function of the user name and of a secret value known only to the bank. (Note further that we do not in any sense rely on the user or his machine knowing this value!) If there are a million images to choose from, and these are selected in a way that makes them all distinguishable from each other, then the attacker has a probability of success of one in a million. However, success does not mean that he learns the password – it means that the user goes on to enter the second character of his password. After that occurs, the attacker has to guess the next image, again with a success probability of one in a million. Therefore, we can see that DPD achieves a higher degree of protection against this attack than other methods for mutual authentication, as other such methods do not protect at all against this type of attack.
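The compounding effect in this scenario can be worked out with exact arithmetic. The sketch below assumes that the attacker guesses each image uniformly at random and independently, and that the user always halts on an unrecognized image; the function name is hypothetical.

```python
from fractions import Fraction


def prob_user_enters_character(position: int, num_images: int) -> Fraction:
    """Probability that the user gets as far as typing character `position` (1-based).

    The first character is always typed; each later character is typed only if
    the attacker guessed every preceding image correctly.
    """
    if position < 1:
        raise ValueError("position must be at least 1")
    return Fraction(1, num_images) ** (position - 1)
```

For a million images, prob_user_enters_character(2, 1_000_000) is one in a million and prob_user_enters_character(3, 1_000_000) is one in a million million, so under these assumptions the attacker is very unlikely to see more than the first character.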
In a third scenario, the attacker performs a man-in-the-middle attack in which he opens a doppelganger window for the user, and then performs a valid DPD connection to the bank in which he claims to be the user. He forwards all information received from the user to the bank. He obtains information back, and computes the valid image from this. The image is sent to the user, and displayed in the doppelganger window. The user recognizes it, and enters the next password character, and so on. As a result of this attack, the attacker manages to learn the entire password sequence and the entire image sequence. While one might argue that this achieves the same degree of security as would traditional methods for mutual authentication, this is actually not the case. The reason is that as a result of having to interact with the bank, the bank will learn the IP address of the attacker. If the attacker launches multiple attacks over a short period of time, then this will be noticeable by the bank, since many users will log in from the same IP address in this interval of time. Moreover, if the attacker is located in a geographic area quite different from the victim user, then this will also be evident from the IP address, and special actions can be taken by the bank. Further, because of the interactive nature of this attack, it is inherently more difficult for attackers to perform. Therefore, we automatically exclude more naïve attackers.
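The kind of pattern-based check alluded to here could look roughly as follows. This is a minimal sketch: the event format, the ten-minute window, the threshold of three distinct users, and the function name suspicious_ips are all assumptions made for illustration, and a real deployment would need to account for shared addresses such as proxies.

```python
from collections import defaultdict
from typing import Iterable, Set, Tuple

LoginEvent = Tuple[float, str, str]  # (timestamp in seconds, IP address, user name)


def suspicious_ips(events: Iterable[LoginEvent],
                   window_seconds: float = 600.0,
                   max_users: int = 3) -> Set[str]:
    """Flag IP addresses from which more than `max_users` distinct users log in
    within any span of `window_seconds`."""
    by_ip = defaultdict(list)
    for timestamp, ip, user in events:
        by_ip[ip].append((timestamp, user))

    flagged = set()
    for ip, logins in by_ip.items():
        logins.sort()
        start = 0
        for end in range(len(logins)):
            # Shrink the window until it spans at most window_seconds.
            while logins[end][0] - logins[start][0] > window_seconds:
                start += 1
            users_in_window = {user for _, user in logins[start:end + 1]}
            if len(users_in_window) > max_users:
                flagged.add(ip)
                break
    return flagged
```

A geographic check, as mentioned above, would follow the same pattern but compare the location derived from the IP address against the user's usual location.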
These security measures are therefore heuristic (from the perspective of cryptographers) and pattern-based, and are closely related to the fraud-detection techniques employed by credit card companies and telecoms. Therefore, while not achieving perfect security, our technique provides better security against this attack than both traditional password authentication techniques, and previously proposed mutual authentication techniques.
About DPD and its inventors.
DPD is currently being implemented, and a beta version will be made available within a few months. The technology behind DPD is owned by its inventors; a detailed description of the techniques can be obtained under disclosure.
Markus Jakobsson (www.markus-jakobsson.com) is Associate Professor of Informatics at Indiana University at Bloomington (IUB), Adjunct Associate Professor of Computer Science at IUB, and Associate Director of the Center for Applied Cybersecurity (http://cacr.iu.edu/). Previous to working at IUB, he was Principal Research Scientist at RSA Security, and a Member of the Technical Staff at Bell Labs, Lucent Technologies. He holds a PhD in computer science from University of California at San Diego, and is an inventor or co-inventor of over 50 patents and patents pending.
Steven Myers is an Assistant Professor in the School of Informatics and an Adjunct Assistant Professor for the Department of Computer Science at IUB. He will be receiving his PhD in the spring of 2005 from the University of Toronto. In industry, he has interned at the Mathematical Research Division of Telcordia Technologies and developed and implemented cryptographic technology for Echoworx Corp., a web-security based start-up company. He has four patents pending.