Archive for the ‘Design’ category
Actually I do have one big issue with the article:
If there’s one behavior of your application that you should focus on eliminating, it’s the behavior of crashing. Above all other “nuisances,” this is the one that is absolutely unacceptable.
But preserving someone's data is more important than avoiding a crash. Having to rewrite your paper because your PC devoured it is worse than a crash. Crashing may be the worst "nuisance", but there are more important bugs to squash first. However, that is a topic for another time — we all agree that crashes are a problem that should be fixed.
Although Daniel shows how to synthesize debugging symbols from hex addresses, I think it's worth considering leaving debugging symbols in your shipping app.
The reasons for [building applications without debugging symbols] are mostly to keep the shipping application as small as possible, but there are also some good reasons to hide the symbols if they reveal too much about your business strategy.
I can’t say anything about your business strategies, but removing information that can help you diagnose problems “in the field” seems like a very bad trade-off for slightly smaller files.
Hard-drives cost about $0.30 per gigabyte (GB), and the price is still falling fast*. Because the GB is the unit hard-disks are sold by, I am going to use it instead of MB or KB; I think it puts file-sizes into the right perspective.
Today's applications are "big" for a very good reason. That article says it better than I can, but the gist of it is that megabytes are cheaper than air, and bigger programs can do more, making them more useful (the cost of a GB of storage space has fallen more than 20-fold since the article was written, by the way).
The largest program I use every day that was built with debugging symbols on is EyeTV. It weighs in under 0.11 GB, and I don't consider that "bloated", because I get good value for my 3 cents of disk space. Stripping debugging symbols with strip makes it 0.0076 GB smaller. That translates into $0.002 worth of hard disk, which could store 13.7 seconds of TV. And that is insignificant. A few thousandths of a GB make little difference, and that's all stripping debugging symbols will get you.
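The arithmetic is easy to check. A minimal sketch, where the $0.30/GB price is quoted above and EyeTV's ~2 GB per hour of recorded TV is my assumption:

```python
# Back-of-the-envelope cost of shipping debugging symbols, using the figures above.
# Assumptions: $0.30 per GB of disk, and EyeTV recording ~2 GB per hour of TV.
PRICE_PER_GB = 0.30            # dollars per gigabyte
SYMBOL_SIZE_GB = 0.0076        # space saved by running `strip` on EyeTV
TV_GB_PER_SECOND = 2.0 / 3600  # ~2 GB per hour of recorded TV

cost = SYMBOL_SIZE_GB * PRICE_PER_GB            # dollars of disk the symbols occupy
tv_seconds = SYMBOL_SIZE_GB / TV_GB_PER_SECOND  # equivalent seconds of recorded TV

print(f"${cost:.4f} of disk, or {tv_seconds:.1f} seconds of TV")
```

Which comes out to roughly $0.002 and 13.7 seconds, matching the figures in the text.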
Of course, this is all academic if no one ever sees the crash logs. Unfortunately, as developers know, that's the current crappy state of affairs. Crash reports are sent to Apple, and only Apple. The developers who wrote the program — the ones who could best fix the problem, and who desperately want to know about it — are completely out of the loop.
If Apple passed crash logs on to developers, everyone would win. Developers would be able to squash more bugs in less time. Users would have a better, more productive, less buggy experience. Apple could sell those improvements. Microsoft already does this, and it seems to work well for them. Most people are unaware of this SNAFU, and probably think that reporting a crash to Apple gets the information to the right people. I don't know if educating people about the issue would light a fire under Apple, but it might.
*If enough people start using flash memory over current magnetic-platter hard drives, then the price-per-GB ratio could change, because flash is currently about 100x more expensive per GB. But the trend of the current storage medium's price falling exponentially will continue. And by the time flash-based computers become popular, their cost-per-GB will probably be as good as, or better than, today's full-sized hard drives. Tiny hard drives using conventional magnetic platters, like the ones in the iPod, are also a compelling alternative to flash.
Less than half the population of the world has the manual dexterity to wiggle their fingers at the speed of 50 words per minute or better.
–Dr. Alan Lloyd, seminal typing instructor.
Computer professionals often seem to have unrealistically high expectations of what the “average” typist can do. For example, according to this Wikipedia article (as of 2007-12-04)
An average typist reaches 50 to 70 wpm, while some positions can require 80 to 95 (usually the minimum required for dispatch positions and other typing jobs), and some advanced typists work at speeds above 120.
But as we shall see, 70 WPM is an absurdly high "average". 120 WPM means 12 strokes a second, or a split-time of 83 msec between keypresses. That borders on the physically impossible.
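The split-time figure follows directly from the definitions. A small sketch, assuming 6 keystrokes per "word" (the convention Ostrach's numbers below imply, since 40 WPM = 240 characters/minute):

```python
# Convert words-per-minute into the time between individual keypresses.
# Assumption: one "word" is 6 keystrokes (consistent with 40 WPM = 240 chars/min).
def split_time_ms(wpm, strokes_per_word=6):
    strokes_per_second = wpm * strokes_per_word / 60.0
    return 1000.0 / strokes_per_second  # milliseconds between keypresses

print(split_time_ms(120))  # 120 WPM -> 12 strokes/sec -> ~83 ms per keypress
print(split_time_ms(40))   # the average typist: a far more leisurely 250 ms
```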
As Teresia R. Ostrach, President of Five Star Staffing, Inc. says,
After 27 years in the Staffing Industry, I’ve encountered just about every misconception regarding the performance of office workers. The most frustrating of these is the belief in what constitutes “average” typing scores.
For years I tried to explain that 65 WPM is a lot faster than average, but I had no proof. After all, everybody knows what an average typist is, right? Somebody who types between 50 and 60 WPM? Well, isn't it? Well, NO, it's not!
Here are her findings:
Mean = 40 WPM = 240 characters/minute
Median = 38 WPM = 228 characters/minute
Standard Deviation = 16.7 WPM = 100 characters/minute
Notice that out of the 3,475 applicants, not a single one could manage 120 WPM. And only the top 5% of applicants could manage 70 WPM or higher.
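Those findings pass a quick sanity check. If the scores were roughly normally distributed (an assumption on my part) with Ostrach's mean and standard deviation, the tail above 70 WPM works out to a few percent, and the expected number of 120 WPM typists in a pool of 3,475 applicants is essentially zero:

```python
import math

# Fraction of a normal distribution above a cutoff, using Ostrach's figures.
# Assumption: scores are approximately normal (mean 40 WPM, SD 16.7 WPM).
MEAN, SD = 40.0, 16.7

def fraction_above(wpm):
    z = (wpm - MEAN) / SD
    return 0.5 * math.erfc(z / math.sqrt(2))  # upper tail of the normal CDF

print(fraction_above(70))          # ~0.036 -- roughly the top 4%
print(3475 * fraction_above(120))  # expected 120+ WPM typists: far below 1
```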
Typing Speed: How Fast is Average
4,000 typing scores statistically analyzed and interpreted
It's an excellent paper. Short and accessible, yet relevant, authoritative, and eye-opening. Well worth the read. (Unfortunately it's laid out poorly in the linked PDF. If someone has a more readable source I'd love to link to it.)
But what’s more interesting to me is this chart:
It shows an average error-rate of about 6% per word. Put another way, more than 1 out of every 17 words has a typo in it, which is kind of a big deal.
The error-rate is probably artificially high, because subjects were taking the test under a lot of pressure — it determined whether they got a job or not! But even the best group of over-qualified typists still had a 4% error rate, or a fumble on 1 out of every 25 words. And that's significant.
The implications of a 4%-6% error-rate are enormous. If people are making that many errors, then good spellcheckers and auto-correctors are essential. If one out of every 17-25 words is mistyped, then long command-lines seem like a very bad idea, because something like one out of every 20 commands would be in error. Systems should be able to gracefully recover from bad input, because they will be inundated with it.
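To see why long command-lines compound the problem: if each word independently has a 4%-6% chance of containing a typo (the independence assumption is mine), the chance of getting a whole command right falls off geometrically with its length:

```python
# Probability that an n-word command contains at least one typo,
# assuming each word independently has error-rate p (4%-6% per the data above).
def p_command_error(n_words, p_word=0.05):
    return 1.0 - (1.0 - p_word) ** n_words

print(p_command_error(1))  # a one-word command: 5%, i.e. ~1 in 20
print(p_command_error(5))  # a five-word command line: ~23%
```

So even short multi-word commands fail far more often than the per-word rate suggests.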
It looks like the average typist is much slower, and makes more mistakes, than "folk wisdom" leads us to believe.
I do not put much faith in Hick’s Law. I’ve seen it misapplied and drastically misinterpreted. Its limits, and edge-cases, are not widely known. I am convinced that it is generally not a dominant factor, even when it is relevant. I don’t agree with many design choices it is used to justify. In the past 50 years, exceptions to Hick’s Law have been found.
Hick's "Law" is simply the observation that the time it takes a person to make a decision is proportional to the information-theoretic entropy of the decision. Put another way, reaction-time = a-constant-delay + O(entropy of possible responses) ≤ a-constant-delay + O(log(number of choices)). So it takes longer to decide between more options. But adding an option increases the time sub-linearly (at least with a "few" options) — and adding a likely choice slows down the decision time more than adding a few unlikely choices.
Write it right
Unfortunately, most people do not have a good understanding of what entropy is in information theory. Interaction designers and programmers should at least understand the concept, but they don't always.
When every option has the same probability of being chosen, entropy is maximized. Recall that lg(N) is the entropy when every one of N options is equally probable. So lg(N) is the maximum possible entropy involved in selecting one of N options. (The minimum possible entropy, 0, occurs if one item is always chosen 100% of the time, or no item is ever chosen.) Owing to its simplicity, and attractive (but misleading) similarity to Fitts's Law,
reaction_time = a + b*lg(N), where a and b are empirically determined positive constants, has become the most common formulation of Hick's Law.
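The entropy formulation and the lg(N) special case are easy to compare side by side. A sketch (the constants a and b here are made-up placeholders; real values must be fit empirically):

```python
import math

# Shannon entropy, in bits, of a discrete probability distribution.
def entropy(probs):
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Hick's Law, entropy formulation: reaction_time = a + b * H(choices).
# a and b are hypothetical constants (in seconds), not empirical values.
def reaction_time(probs, a=0.2, b=0.15):
    return a + b * entropy(probs)

n = 8
uniform = [1.0 / n] * n
print(entropy(uniform))  # lg(8) = 3.0 bits: the a + b*lg(N) special case

skewed = [0.9] + [0.1 / 7] * 7
print(entropy(skewed))   # well under 3 bits: one likely favorite speeds the decision
```

The same N = 8 options carry far less entropy (and so predict a faster decision) when one of them is almost always chosen, which is exactly what the lg(N) formulation hides.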
I am not fond of that formulation.
It implies a connection to Fitts's Law, when it's pretty clear to me that none exists. Hick's Law deals with the cognitive processes of decision-making; Fitts's Law deals with the iterative physical action of pointing to an object. The two equations are not related, except that they appear together in HCI literature, and both model a human completing some task. Logarithms also appear in equations modeling radioactive decay — but have no connection to either Hick's or Fitts's law.
Stating Hick's Law in terms of entropy gives better intuition about the decision process. It shows that the time to make a decision depends as much on the qualities of the alternatives as on how many of them there are. For example, imagine you've just won a new sports car on a game show — now you have to pick one of several different paint jobs, and drive it off the set. Your choices are: a classic red, safety-green, neon-pink, or chartreuse-and-violet tiger stripes. Like most people, you will probably choose red, and quickly. Now imagine that the choices are: this elegant silver-blue, or classic red. Even though there are only half as many options, it's clearly a much harder decision that will take more time. This contradicts the "reaction-time ~ lg(N)" model, but is clearly explained by the entropy model, because two equally likely options have a higher associated entropy than one popular option and several very unpopular options.
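Plugging made-up (but plausible) probabilities into the entropy formula bears this out; two genuinely competitive colors carry more entropy than one runaway favorite among four:

```python
import math

def entropy(probs):
    """Shannon entropy, in bits, of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Four paint jobs, but nearly everyone picks classic red.
# The 90% / 3.3% / 3.3% / 3.3% split is my guess, for illustration only.
four_options = entropy([0.9, 1/30, 1/30, 1/30])  # ~0.63 bits

# Two genuinely competitive colors: red vs. silver-blue.
two_options = entropy([0.5, 0.5])                # exactly 1 bit

print(four_options < two_options)  # True: fewer options, but a harder decision
```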
A bad justification for bad ideas
Hick's law has been used to argue that, "giving a user many choices simultaneously is usually faster than organizing the same choices into hierarchical groups. Making choices from one menu of eight items is faster than is making choices from two menus of four items each." (The Humane Interface, page 96). Sometimes this is called the Rule of Large Menus. I strongly disagree with this rule of thumb.
The decision that Hick's Law models is only made after the user has identified enough relevant options. Hierarchically organizing options makes it easier, and faster, for the user to find relevant options. And this makes the whole process faster. Even when Hick's Law is applicable, it's not necessarily dominant. Other factors, such as whether the user has to scroll, have a far greater impact on how fast, and how ergonomically friendly, completing a task is. But we can have our cake and eat it too.
A hierarchically organized presentation does not mean people will build a hierarchical mental model. For example, the word processor I am typing this in has hierarchically organized menus. The Edit menu has top-level commands, including cut/copy/paste, and a sub-menu called Find that has 6 different commands to search for strings in a document. Each command has a keyboard shortcut: ⌘C for copy, ⌘F to enter a string to search, ⌘G to select the next occurrence of the string, and so on. Any of these shortcuts can be used at any time to initiate any of the commands. When I decide what shortcut to use, I am selecting one shortcut out of all possible shortcuts that I know.
People will string together multiple commands, making them one action in their head. For example, if a "delete" command is always followed by a confirmation dialog, users will learn to automatically hit enter after hitting delete. So the two actions, "delete" and "confirm delete", become one action, "delete and confirm". (This is why confirmation dialogs are a bad idea.) So as long as commands exist to navigate a hierarchy, they can be strung together to make a "flat" command that directly selects an option. A user can then consider all "flattened" commands at the same time.
I am not aware of research into the limits of Hick's Law — i.e., what happens if there are a lot of choices? People simply can't hold 4 billion choices in their head, yet Hick's Law tells us that choosing between 4 billion equally likely options should only be about 16 times slower than choosing between 4. And I just can't accept that as true. At some point, the number of options exceeds a person's mental capacity — and I would expect that to affect reaction time. But exactly what this limit is, or if it even matters, is not commonly known.
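The ratio comes straight out of the lg(N) model (ignoring the constant delay a, which would only shrink it further):

```python
import math

# The a + b*lg(N) model's prediction, ignoring the constant delay a.
# 4 equally likely options carry lg(4) = 2 bits; 4 billion carry ~31.9 bits.
choices_small = 4
choices_huge = 4_000_000_000

ratio = math.log2(choices_huge) / math.log2(choices_small)
print(ratio)  # ~16: the model's implausible prediction for 4 billion options
```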
Whisky. Tango. Foxtrot.
I've come across some amazingly … incorrect … takes on Hick's Law. And that makes me even more skeptical of its utility.
If I add more choices, I slow down response time. And if I add more stimuli, I slow down response time. Exponentially.
Exponential growth is of course the exact opposite of what happens, which is logarithmic growth. Yet according to Hock Hochheim, "Many modern instructors just associate a doubling ratio to Hick's — that is, for every two choices, selection time doubles per added choice." His rebuttal of that exponentially-wrong take on Hick's Law is interesting reading, if for no other reason than that it shows just how prevalent a bit of bad science can become in a field. It also touches on the notion that the brain has a "fast track" for dealing with sudden "fight or flight" situations.
I don't know enough about research into the amygdala and the brain to give any hard facts. But it is my understanding that current research suggests instinctual responses to danger can occur much faster than deliberate thought. Deliberately tapping into this stress response seems difficult, though…
Another “I don’t know for sure, but it’s worth keeping an eye on” is muscle memory and sports. Athletes seem to be able to respond to a stimulus (a flying ball, a punch, etc.) with blinding speed and without conscious thought.
A phenomenon that Hick's Law does not account for is habituation. If there is one option, A, in a menu that is chosen many times in a row, the user cannot help but develop an automatic response to select A after clicking on the menu.
Hick’s law is best stated as: “Reaction-time = a-constant-delay + O(entropy of possible responses)”.
Hick’s law has been totally misunderstood, and used to draw some very strange conclusions.
This is a bit of the design process behind one line of one settings panel inside IMLocation.
The "Locations" panel controls everything having to do with locations. The pane's "headline", outlined in red, shows what is assumed to be the current location.
It reads like "Your current location is home". It does not say "You are: Home", or "You are at: Home", even though that's shorter, and closer to the familiar "you are here" stickers. "You are at:" is out, because people need to be free to choose whatever names work best for them. Naming a job location "working" should not turn the headline into nonsense like "You are at: working". I chose not to go with "You are: …", because it felt too imperative — like it was dictating what the user was doing. I wanted the copy to say "This is where the program assumes you are". I'm still not 100% sure that this was the right phrase to use, but it is clear, and it works well enough.
Immediately to the right of the headline is a button, "That's not where I am…", which lets the user fix things if the presumed location isn't where they are. The button is on the same line as the headline, because I think this makes it a little more clear that the button corrects the current location. Putting it under the headline would separate it from the current location.
I wanted clicking on the headline to select the current location, so it could be edited. This seemed like a very intuitive action to me, but affording it turned out to be surprisingly tricky.
This showed that it was clickable, but looked kind of ugly, and testing showed it wasn’t clear to all users what clicking it did (“Why is that a button?”).
To clarify, I made only phrases that meant “the current location” clickable.
This was a big improvement, but still not good enough. The buttons' borders broke up the text, making it choppy and slow to read. The "current location" button looked ugly and wrong, because buttons in OS X normally start with a capital letter. But capitalizing words in the middle of a phrase would be even more dissonant.
At this point I realized traditional buttons just weren't a good fit. Every other button in the interface modified a location, but the buttons in the headline just select something. They don't change anything. Every button I've ever seen in a good interface makes something happen — it changes data, or how data is presented, or searches billions of web pages. I needed something a little less "heavy duty" than a button, that still afforded clicking, but didn't break the flow of text.
Hyperlinks were a great fit. Clicking them means “show me that” — which is exactly what clicking the location headline was supposed to do — show something. They afford clicking, without breaking the flow of text.
So starting with v0.27 I made the key phrases links inside the headline.
I also put the "That's not where I am…" button and the headline together in a box, to help reinforce their relationship, and to give the headline some emphasis by giving it a border.
Leopard introduced a new button style, called "Recessed Button" in Interface Builder, that is a perfect fit. It has no border, and highlights on mouse-over, just like a hyperlink. (Basically, it's the style used in the Safari Bookmarks bar.)
Right now I’m leaning towards dropping support for Tiger, so that I can take advantage of the UI improvements in Leopard. I just wish I had a better understanding of how many users that move would alienate.
Earlier this year, a major scandal erupted in France when it was discovered that between 1989 and 2006, two radiotherapy units had accidentally given hundreds of cancer patients too high a dose of radiation. Five patients have since died and many others have been left in crippling pain.
My first thought was how eerily similar this is to Therac-25. But this incident could be worse once all the facts are out. Five are already dead, and hundreds affected, according to the BBC.
A major investigation is now under way to try to establish how so many mistakes could have been made…. Incredibly, one of the lines of inquiry will be why the instruction booklets that accompanied the equipment were in English when the hospital staff of course were French.
This investigation is very much worth following. A lot can be learned about designing safe and usable systems from this disaster. Cynically, I worry that the massive liability involved will lead to politics and cover-ups, instead of a thorough investigation. Be prepared to read between the lines.
… staff then explained to newcomers how to operate the programmes, who later explained to subsequent trainees, and so on. To add to the confusion, the procedures were all in English.
Eventually, an incorrect default setting was made that resulted in a number of patients being given overdoses of between 20% and 40%.
Poor training is an issue, sure. But the real question I have here is: how could the software be designed so that it could possibly be rendered lethal by a default setting?
According to the AP “In both the Epinal and Lyon incidents, hospitals blamed the problems on human error.” I agree, but I think the humans at fault were the designers, not the operators. “Human error” is usually a euphemism for “operator error”, or “customer error”, or “blame them”. Disasters are a chain of failures; operators are only one link in that chain. The system as implemented in the hospitals included hardware, software, training, and standard operating procedures. From all accounts, it looks like there were systematic errors, over a period of years — about the strongest indicator you can have that the system was deeply flawed.
What Therac-25 was to engineering, this could be to interaction design. I think there were probably engineering mistakes made, but if the instructions weren’t even in the right language, chances are usability was a bigger factor. Actually, the similarities to Therac-25 still bother me. It’s a bit of history that should not be repeated.
I've said it before and I'll say it again: these incidents are worth following. I just wish more hard facts were public (and in English as well; I can't read French).