Mozilla Internship: Usability Testing & Triangulation

I interned for the Ubiquity project at Mozilla Labs in 2008 and 2009, performing usability testing and usability bug triangulation. I worked with a team of developers sprinkled across the globe on an experimental language/command line interface for Firefox.

This portfolio is split into two parts. The first part consists of videos of the usability testing. The second part demonstrates how triangulation integrates this testing with the rest of the development process.

Usability Testing

Knowledge of how a computer program works is inherently ruinous to one’s ability to model novice user behavior. It is difficult for a programmer to understand how a novice user perceives and reasons about an interface because the very knowledge of the interface sets a programmer’s mind to the “right” format. Usability testing is a way for us to test our understanding of how users think.

Good usability engineers use the cheapest prototype possible, the favorite being paper prototypes. The engineer draws rough sketches of the UI and then acts as the computer, swapping out new pages as the user navigates through the prototype.  Paper prototypes are favored for two reasons:

  1. There is zero cost to change something. If a user clicks on something that wasn’t meant to be a button, the human “computer” can ask “What do you think will happen next?” and then fulfill the user’s request.
  2. Usability tests can give users the impression that they are the ones being tested, so they feel stupid when an interface confuses them. When they see a bunch of crude sketches, they have no problem criticizing adult children pretending to be a computer.

Unfortunately, the text-based interactions with Ubiquity could not be replicated with enough fidelity using paper prototypes, and the experience made me miss the ability to rapidly iterate with a pencil and some paper*. Below are samples of later-stage usability testing; the roughness of the interface shows just how much time and effort can be saved by testing early and often.


Usability engineering must integrate with the development process to be of use to anyone.  We do this through triangulation: filing bug tickets and supporting them with evidence gathered by usability tests as well as data from customer support queries, website metrics, etc.

Below is a snippet from the full report I made for Mozilla.


Begin probing how to make Ubiquity accessible and useful in the context of including Ubiquity with the mainline Firefox distribution. As this is an early alpha I will be focusing on the core Ubiquity interface, while logging bugs on separate commands.


See main article: Methodology

This is an exploratory, qualitative test of what users do when presented with Ubiquity for the first time. It is based on the interview format, in which the participants decide which tasks they perform next.

Analyzing Data

These tables are being pushed out before I have had a chance to have someone verify them and link them to the Trac DB. They are subject to changes, revisions, and mistakes. This is a wiki, so you can help out ; )

Usability issue trends
| Tag | Freq. | Sev. | Videos | Motivation | Notes | Trac # |
|---|---|---|---|---|---|---|
| Gives up (Discovery) | 5 | 3 | 4, 6, 9, 10, 11 | Discovery | Users who gave up on finding the hotkey combo. | 402, 440 |
| Finds Hotkey | 3 | 0 | 3, 5, 8 | Discovery | Anywhere from 1-9 min. | |
| Notices Hotkey Setting Field | 2 | 1 | 5, 8 | Learnability | #5 Didn’t know what it was (and his two guesses were wrong) and 8 found it by accident, and changed it by accident. #7 accidentally changed the setting. | |
| Changes hot key accidentally. | 2 | 3 | 7, 8 | | | 545 |
| Problems with command structure | 1 | 2 | 10 | Learnability | I.e. the need for modifiers etc | |
| Confused by Jargon | 5 | 1.5 | 6, 7, 11 | Learnability | Wiki, hotkey, mashup, etc. See Demonstrating Value | 547 |
| Confused by mozilla wiki | 2 | 2 | 6, 8 | Learnability | We should move the help to something other than this Wiki. | 402 |
| Tries Wikipedia for help | 2 | 1 | 9, 5 | Learnability | The “Help system” includes Wikipedia and Google. | |
| URL instead of email command | 3 | 2 | 11, 4, 9 | Learnability | Lead for stat analysis, synonym common email URL’s w/ email command. | 572 |
| Tries Right Click | 1 | 1 | 9 | Discovery | Contextual menus have discovery issues, the one tester used it only because he read it in the tutorial. He never did open Ubiquity either. (link to other studies on contextual menus) | |
| Tries watching video | 6 | | 6, 7, 8, 9, 10, 11 | Learnability, Value | | |
| Video won’t load | 3 | 2 | 6, 11, 10 | Learnability | External validity of this data is poor because all users were using the same wifi connection (ie not a random sample) | 547 |
| Other video problems | 1 | 1.5 | 7 | | Video is buzzword laden | 547 |
| Video volume | 2 | 2 | 7, 10 | | Sound | 547 |
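
To make the triangulation step concrete, here is a minimal Python sketch that groups a few rows of the table above by Trac ticket, so each ticket lists the observations and session videos that support it. The row values are copied from the table; the tuple layout and the `evidence_by_ticket` helper are my own illustrative assumptions, not part of the original report or of Trac.

```python
from collections import defaultdict

# A few rows copied from the table above: (tag, frequency, videos, Trac tickets).
# The tuple layout is a hypothetical choice made for this sketch.
ISSUES = [
    ("Gives up (Discovery)", 5, [4, 6, 9, 10, 11], [402, 440]),
    ("Confused by Jargon", 5, [6, 7, 11], [547]),
    ("Confused by mozilla wiki", 2, [6, 8], [402]),
    ("Video won't load", 3, [6, 11, 10], [547]),
    ("Video volume", 2, [7, 10], [547]),
]


def evidence_by_ticket(issues):
    """Group issue tags and supporting videos under each Trac ticket number."""
    grouped = defaultdict(lambda: {"tags": [], "videos": set()})
    for tag, _freq, videos, tickets in issues:
        for ticket in tickets:
            grouped[ticket]["tags"].append(tag)
            grouped[ticket]["videos"].update(videos)
    return dict(grouped)


if __name__ == "__main__":
    # Print each ticket with the observations and videos that back it up.
    for ticket, evidence in sorted(evidence_by_ticket(ISSUES).items()):
        print(ticket, evidence["tags"], sorted(evidence["videos"]))
```

Grouping this way shows at a glance which tickets rest on several independent observations and which rest on a single one.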


commandname-: total number of unsuccessful attempts

commandname+: total number of testers who succeeded

If a command was executed successfully, subsequent successful attempts don’t tell us anything, so each tester’s success is counted only once. If someone screws up, the total number of attempts and each separate issue are counted. Think of it this way: there are an infinite number of ways to crash a plane but only one right way to land one.

Remember, these numbers (as with all usability stats) are meant to bring a level of objectivity to what are inherently subjective observations.
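
The counting rule above can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the `tally` function, the `(tester, command, succeeded)` log format, and the sample log are hypothetical and are not the data or tooling used in the study.

```python
from collections import Counter


def tally(attempts):
    """Tally attempts given as (tester_id, command, succeeded) tuples.

    "command-" counts every failed attempt.
    "command+" counts each tester who succeeded at least once, only once.
    """
    failures = Counter()
    succeeded = {}  # command -> set of testers who succeeded
    for tester, command, ok in attempts:
        if ok:
            succeeded.setdefault(command, set()).add(tester)
        else:
            failures[command] += 1

    result = {}
    for command in set(failures) | set(succeeded):
        result[command + "-"] = failures[command]
        result[command + "+"] = len(succeeded.get(command, ()))
    return result


# Hypothetical session log, invented purely to exercise the rule.
log = [
    (5, "email", False),
    (5, "email", False),
    (5, "email", True),
    (5, "email", True),   # repeat success by the same tester: not counted again
    (6, "email", True),
    (7, "map", False),
]

counts = tally(log)
```

In this made-up log, tester 5 fails twice and then succeeds twice, so the email command ends up with two failures and two successful testers (5 and 6).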

Command completion/error rates
Tag Freq Videos Notes Trac#
Email- 5 9, 10, 11 Multiple issues, one being that users don’t understand the need for modifiers (“to janedoe@gmail.com” or “email this“). Another being that users try typing in the URL of their service provider (mail.yahoo.com)- when that failed they assumed it didn’t work with their email service provider. Finally, the email command is just very buggy. 572 574
Email+ 3 5, 6, 7
Map- 3 8, 9, 10 Somewhat invalid as the errors are due to discovery problems with Ub itself. All participants cruised to Google Maps instead of using the command, provide contextual reminders/clue on Google Maps itself?
Map+ 4 7, 8, 10, 11
Wiki- 0
Wiki+ 4 6, 7, 11, 10
Weather- 0
Weather+ 3 5, 10, 11
Define- 1 5 Sudo did not show up in Define.com. Fallback dictionaries (urban, wiktionary, Google’s define:, etc) would be a smart idea. 404
Define+ 3 5, 7, 10
Translate- 2 10 This caused more confusion than is reflected here. Executing it was not the problem, users didn’t expect it to change the text on the page. 54
Translate+ 2 5, 8
Help- 1 10 Couldn’t guess the command correctly.
Map-insert- 6 5, 6, 7, 8 Requiring the user to click on the map is “counter intuitive.” 542
Map-insert+ 2 5
Yelp+ 1 8
Google+ 2 6, 7

Synonyms & new command suggestions

The email command caused the most trouble, especially when chained with inserting a map. Some testers tried hotkeys such as “control E” or typed a URL into the command line, i.e. email.yahoo.com. Testers 8 and 10 were prolific guessers.

Tester 08

  • Lyrics 14:09
  • Yelp insert- 21:14
  • Find 18:44
  • Locate 19:10
  • Help add command 23:37
  • New command 23:39

Tester 10

  • Gadgets-Command list
  • Find gadgets-Command list
  • Update-Command list
  • What’s new-Command list
  • Add command-Command list
  • Stockprices
  • -Stock Prices
  • Ticker–Stock Prices
  • Zipcode -Map
  • Directions-Map
  • Closest, i.e. “closest japanese restaurant”
  • “map home” modifier of map command for location.

What really jumps out at me here are “yelp insert” and “help add command.” Should insert be a universal modifier? And “help add command” suggests that the help structure for commands would be centered around a command entry, like in Enso.

Thanks again to Dana Chisnel, Aza Raskin, Jono, Andy, and everyone else who helped during this internship!

*I honestly believe that the inability to test early was one of the reasons that Ubiquity failed: the architecture was based on a deterministic model of interaction which required more training than the average user cared to put in. I spent the next few years digging into linguistics to figure out the right model, which is a probabilistic model resembling Siri, Google Now, etc. Note that this project came out when the iPhone was new.