End-to-end encryption of web services is increasingly popular: Mailvelope aims to bolt a PGP client onto webmail and both Yahoo and Google are working to add support directly. However, the fundamental nature of the web and the limits of human cognition make web-based E2E encryption susceptible to MITM attacks. While still potentially useful, such systems should not be used by high-risk populations such as journalists and human rights workers.
The dynamic nature of the web gives service providers the ability to target individual users with a backdoored version of their web client every time the site is loaded, an attack that Hushmail validated back in 2007. Mailvelope and similar browser addons can move message decryption to iFrames or new windows and rely on the same-origin policy to restrict the reading of content from the service provider. Unfortunately, as long as a service provider can spoof the UI, they can copy the plain text version of new messages and send an encrypted version to the recipient.
Mailvelope attempts to mitigate spoofing attacks through the use of security iconography, or “watermarks” as Mailvelope calls them. Mailvelope randomly generates a security icon during installation which is incorporated into Mailvelope UI elements. If the icon is different, users are not supposed to proceed, akin to site-authentication images used in some bank logins. However, security icons cannot effectively mitigate a UI spoofing attack because security icons do not work.
Researchers have been testing the efficacy of security iconography for over a decade, and the results are dismal. The most dramatic “experiment” was performed by Moxie Marlinspike in 2009. Marlinspike removed encryption from connections using a malicious Tor exit node, which also removed the browser encryption icons. Despite drawing his sample from a population with above average technical acumen and paranoia, he achieved a 100% “success” rate, meaning that every user who visited a login page logged in to their account. Marlinspike collected over 400 logins and 16 credit card numbers in 24 hours.
Of course, the encryption icons for browsers are smaller and somewhat different from what Mailvelope uses. The closest thing to Mailvelope’s “watermark” is the personalized site-authentication image displayed by many banks during the login process. In The Emperor’s New Security Indicators, researchers asked users to log in to their bank and surreptitiously removed the site-authentication images; 23 of the 25 participants still sent their login information, a 92% failure rate. Some will question the validity of this statistic due to the small sample size. However, the results are in line with a decade of research, and the lab setting, if anything, boosts user awareness.
Increasing the size and prominence of the security indicator will not decrease the failure rate to acceptable levels. One study devoted the entire browser skin to conveying encryption information and saw only modest improvements to user behavior. It shows that most users don’t understand the purpose of the information and that the software must determine if something is safe.
The fundamental issue is that human cognition has limits: we cannot process unlimited amounts of information. The assumptions made by the security model underpinning security iconography ignore a decade of behavioral studies and run counter to 50 years of cognitive psychology research. Just try to accurately count the number of times a player passes a basketball in the following video:
The 50% failure rate for the above video is artificially low, as the laboratory environment heightens user awareness. The task is also very different from that of checking a security icon. The tricks employed by pickpockets are a better real-world analogy for spoofing a security icon. Watch as Apollo Robbins carefully manages the mark’s “cognitive spotlight” using misdirection to control the information that the mark is consciously aware of:
At 3:10 you can see how Robbins applies pressure to the wrist near the watch clasp and then draws the mark’s attention elsewhere. The initial stimulus is registered, deemed non-relevant, and then Robbins uses misdirection to remove the stimulus from the cognitive spotlight. The stimulus is still present, but the nervous system and the brain must filter the signal down to relevant stimuli.
A user composing messages is in an even more precarious situation, as habituation conditions us to preemptively ignore information. Unless something is integral to the task itself, we will filter it from our cognition. Even if the user is asked to confirm that the icon is valid, they will habituate to the task and complete it automatically [1]. While a pickpocket must create a new stimulus to control the cognitive spotlight, habituation will keep the stimulus from ever reaching the cognitive spotlight.
I don’t want to get too deep into the neurological details, but the inevitability of filtering stimuli that are irrelevant to the task workflow is fairly obvious if you think about the amount of information your body processes: temperature and pressure from your skin, taste and smell from your tongue and nose, your complete field of vision, and all of the sounds present in your local environment. There are even filters for information that has been processed abstractly, which is why you can pick up on someone saying your name at a cocktail party but ignore ambient conversations after you start talking with your friend. Without these filters, we would be overwhelmed with irrelevant information.
The inability to effectively mitigate user interface spoofing attacks cripples the usability of these bolted-on E2E interfaces. They must lift the new message and reply UI elements out of the browser chrome. They must also create a distinct contact manager to handle public keys. The only thing left is detecting encrypted messages, which Mailvelope decrypts and displays in an iFrame. I’m not sure that even this is safe, since the service provider could display a “Reply Securely” button over the decrypted message.
A website is a very hostile environment to be operating in. The URL bar is a remote function call interface which retrieves a Turing complete programming environment in the form of a website. The service provider can target individual users, deliver new exploits at any time, and has total control over the messaging system. I’m just not sure that we should be relying on the same-origin security policy of browsers to protect our encrypted communications.
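To make concrete what the same-origin policy actually compares, here is a minimal sketch in Python. It assumes the usual definition of an origin as the tuple of scheme, host, and port; the helper names `origin` and `same_origin` are mine and are not taken from any browser or from Mailvelope. It is an illustration of the comparison, not a model of a real browser.

```python
from urllib.parse import urlsplit

# Default ports used when a URL does not spell one out.
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url):
    """Return the (scheme, host, port) tuple for a URL (hypothetical helper)."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(scheme)
    return (scheme, host, port)


def same_origin(a, b):
    """Two URLs share an origin only if scheme, host, and port all match."""
    return origin(a) == origin(b)


# Same scheme, host, and (default) port: one origin.
print(same_origin("https://mail.google.com/mail/",
                  "https://mail.google.com:443/inbox"))  # True

# One extra label in the host name makes it a different origin.
print(same_origin("https://mail.google.com/",
                  "https://www.mail.google.com/"))  # False
```

The second comparison is the point: to the machine the two hosts are entirely separate origins, while to a person glancing at the URL bar they look like the same site. The policy protects only the boundary it can see, and the service provider chooses which origin the user lands on.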
On the web, the best we can do is ensure a secure connection and valid DNS information; trust in the service provider should be assumed. With traditional software systems, we can use reproducible build systems to distribute trust and security audits to increase the cost of backdooring software. But without a clear separation between the messaging system and the software used to retrieve messages, we cannot build usable messaging systems that deploy end-to-end encryption. Any user interface that is secure against UI spoofing will only be a step above manually copying and pasting in the ciphertext.
Mailvelope and service-provider-based end-to-end encryption are still potentially useful. They raise the cost of an attack, force service providers to participate in serving backdoored versions of their sites, and may add additional legal hurdles. It *may* be possible to bolt a usable PGP client onto a website in a way that can defend against malicious service providers.
But one of the many lessons Snowden has taught us is that the only thing worse than bad security is the illusion of good security. Such solutions should not be used by high-risk groups until they can prove that they can reliably defend against malicious service providers. Until then, vendors of such software have a moral duty to try to prevent users from high-risk groups from using their software.
Update: Some comments on my blog assert that the security situation is very similar to what can be performed through basic software updates. I’m aware of this, and I have a few thoughts.
First of all, journalists and human rights workers really should be using TAILS, which is at least publicly auditable and in a position to refuse to comply with US court orders. Furthermore, I believe that we should treat development of software for these users as if lives are on the line. In that regard, it is at least possible to make attacks against the operating system and applications more expensive, which isn’t true of the attacks against E2E web crypto.
For example, we can create an operating system that uses reproducible builds for everything. We can also define a software subset that is required for journalists or human rights workers (an email client, a chat client, a basic document editor, and a web browser) and use security audits, sandboxing, and (eventually) formal verification to drastically increase the cost of an attack.
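As a rough illustration of why reproducible builds distribute trust, here is a minimal sketch in Python. It assumes that several independent parties each build the same source and publish the resulting binary, and that a release is accepted only when every build hashes to the same SHA-256 digest. The function name `agreed_digest`, the builder names, and the `min_builders` threshold are all hypothetical; real projects use their own tooling for this.

```python
import hashlib


def sha256_hex(data):
    """Hex-encoded SHA-256 digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


def agreed_digest(builds, min_builders=2):
    """Return the common digest if enough independent builds are identical.

    `builds` maps a (hypothetical) builder name to the bytes of the artifact
    that builder produced. Returns None if too few builders took part or if
    any two builds differ.
    """
    digests = {name: sha256_hex(blob) for name, blob in builds.items()}
    unique = set(digests.values())
    if len(digests) >= min_builders and len(unique) == 1:
        return unique.pop()
    return None


# Three builders produce byte-identical output: the release is accepted.
honest = {"alice": b"binary-v1", "bob": b"binary-v1", "carol": b"binary-v1"}
print(agreed_digest(honest) is not None)

# One builder ships a modified binary: the mismatch is detected.
tampered = {"alice": b"binary-v1", "bob": b"binary-v1+backdoor"}
print(agreed_digest(tampered) is None)
```

An attacker now has to compromise every builder, or the shared source itself, rather than a single distribution point. A website has no equivalent check, because each page load can deliver a different client to a different user.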
I’m also concerned about projects like okTurtles, which want to make it easy to build a Mailvelope-like E2E client for any software platform, such as Facebook and Twitter. okTurtles claims to be “MITM-proof” and mentions journalists and human rights workers in its marketing. I’m afraid that someone will take them at their word and build an okTurtles front end for VKontakte or Weibo. I hate the NSA and the US has a shitty human rights record, but this would make it easy for Russia and China to precisely target users.
[1] Browser add-ons cannot switch to blocking the user’s task flow, as they are dependent upon the service provider delivering accurate hooks into their interface. For example, no one would notice a silent forward from mail.google.com to www.mail.google.com.