The 5 User Sample Size myth: How many users should you really test your UX with?

March 13, 2019
Frank Spillers

Summary: Testing with 5 users has become a commandment in UX research. Worse yet, a sample of 5 has become the rule for any type of user research, including field studies and diary studies. Sample size is a big deal in UX because it impacts your learning and decision-making. Skimp on sample size and you will likely cut yourself short of valuable insights.

For Usability Testing: If you follow Jakob Nielsen’s advice, test with 5 users or fewer (2 users for low-fidelity prototypes) and run many iterations (at least 3 rounds of testing). Our advice: if you have multiple user segments and you don’t have time for 3-5 iterations, test with 8-10 users for prototypes and 15-20 users for finished products. If you can iterate, or need to, a second round of testing should suffice.

For Field Studies: When you interview users at home or at work, or run diary studies, you should be talking to 15-40 users. Note: We have found that deeply understanding user needs in a field study reduces your need to do many testing iterations. It also helps you get as close to “right first time” as possible. This factor is rarely mentioned when the question ‘How Many Test Users in a Usability Study?’ is asked.

How many users should you really test your UX with?

This is a common question for product managers. The standard answer you’ll hear is “5 users”. But we’ve found this to be out of touch with the reality of modern Agile development. If you go to the source of UX sample size evangelism, you’ll discover it comes from Jakob Nielsen, a “guru of Webpage usability”. Nielsen always backs up his advice with statistics (see below) from studies his consulting firm has run. He is a big voice in the UX field and has influenced a lot of teams with his “5 users” mantra. This was part of his ‘discount usability engineering’ evangelism of the early 1990s, when engineering groups were largely hostile to usability.

After following his advice and practicing UX with hundreds of teams over two decades at Experience Dynamics, we have concluded that the 5 users myth has officially spun out of control and is hurting more teams than it is helping.

The 5 User Sample Size myth is repeated so many times by agencies, UX designers and people in start-ups and enterprises that few people question it. It’s considered gospel. It is also being used as the standard sample size for any UX research (field studies, diary studies, etc.), making it not only an urban myth but also an example of ‘Lazy UX’.

Jakob Nielsen’s sample size evangelism is out of touch: here’s why

In the early 1990s there was a dramatic need to evangelize usability: UX had to be fought for button by button. In response, Jakob Nielsen promoted “discount usability engineering” methods: quick, cheap user testing and expert reviews. The iteration model and the supposed “cheap, quick” attraction of the 5 users myth seem to have their roots in hostile engineering environments. Usability pioneers like Nielsen and his co-author Landauer were working in such environments at the time. UX has moved well beyond that: today it is a strategic advantage, and getting UX right is valued much more than it was back then.

Here’s the bottom line: UX maturity can be seriously stunted by cutting corners. We’ve seen teams that don’t learn enough from their users, or that generalize too quickly, make poor design decisions. Many valued UX efforts are based on user testing, and that user testing follows the 5 user myth, aka tiny sample sizes.

In this video clip, Jakob Nielsen recommends “smaller studies, but more of them”. He gives an example: “if you have 100 users, you should go through 33 iterations of your design”. This is ridiculous, but even if you were to bring it down to 25 users (considered by all measures a large sample), you would be iterating your design 8x. With 12 users, you would have 4 rounds of design changes and 4 user tests. Show me an Agile development team (start-up or enterprise) anywhere in the world that would permit that level of disruption and protracted testing.
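To make the arithmetic behind these numbers explicit, here is a minimal Python sketch. It assumes 3 users per round, which is inferred from the “100 users, 33 iterations” example rather than stated outright, and the function name rounds_of_testing is hypothetical.

```python
# Assumption: each round uses 3 users, inferred from "100 users -> 33 iterations".
USERS_PER_ROUND = 3


def rounds_of_testing(total_users: int, users_per_round: int = USERS_PER_ROUND) -> int:
    """Return how many complete test-and-redesign rounds a user budget allows."""
    if total_users < 0 or users_per_round <= 0:
        raise ValueError("total_users must be >= 0 and users_per_round must be > 0")
    # Integer division: leftover users that cannot fill a round are ignored.
    return total_users // users_per_round


for budget in (100, 25, 12):
    print(f"{budget} users -> {rounds_of_testing(budget)} rounds")
```

Every round in that count is a full cycle of testing, redesigning and retesting, which is the cost a sprint-driven team has to absorb.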

In his article on the topic, “Why You Only Need to Test with 5 Users”, Jakob Nielsen says: “Let us say that you do have the funding to recruit 15 representative customers and have them test your design. Great. Spend this budget on 3 studies with 5 users each!”

Even 3 smaller studies are unlikely to scale to the realities of most projects, unless you have an established, embedded UX team doing Agile user testing as part of an ongoing UX management effort.

So what do Agile teams pressed by sprint deadlines do instead?

Either they don’t test at all or their UX teams test with 5 users. Don’t get me wrong: I’m a fan of light and lean, and I preach iteration like the next UX’er. But the lingering question is: Does iteration of user testing happen in the real world? No, it does not.

The appeal of testing with fewer users has made it a universally adopted mantra. It’s like the teacher telling you to do 5 minutes of homework instead of 10 minutes. Who would not love to hear that message?

Where the 5 user sample size for UX studies breaks down

In User Testing, the following factors influence sample size choices:

1. Varied segments: The 5 user sample is predicated on having one audience segment. Most organizations have numerous customer segments or user types. Note: It is extremely rare to have only one user type on a site or app; 2-15 is very common.

Nielsen stated: “You need to test additional users when a website has several highly distinct groups of users. The formula only holds for comparable users who will be using the site in fairly similar ways”. Yet this issue rarely gets mentioned by Nielsen when he promotes the 5 user myth in his evangelism.

2. Style of test: Low-fi or high-fi? If you are testing initial ideas in low fidelity (non-working code), you can test with fewer users. Nielsen recommends 2; we recommend 8-10. If your prototype is higher fidelity, Nielsen recommends 5-8; we recommend 10-15 users.

3. Type of test: Formative (directional insights) or Summative (statistics or quantitative). If your design is a concept, the test is usually formative, so you can often test with fewer users, e.g. 8-10 (Agile User Testing). If your design is published and/or you are looking for solid metrics to quantify the user experience, more users are needed, e.g. 15-20.
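The “formula” in Nielsen’s quote under point 1 appears to be the Nielsen/Landauer problem-discovery model from his “Why You Only Need to Test with 5 Users” article: the share of usability problems found with n users is 1 - (1 - L)^n, where L is the share a single user uncovers (Nielsen cites a typical value of 31%). Here is a minimal Python sketch, assuming one homogeneous segment and that every user independently uncovers the same share of problems. The function name is hypothetical.

```python
def share_of_problems_found(n_users: int, single_user_rate: float = 0.31) -> float:
    """Nielsen/Landauer model: share of problems found = 1 - (1 - L) ** n.

    Assumptions: a single comparable user segment, and each user
    independently uncovers the same share L of all problems
    (0.31 is the typical value Nielsen cites).
    """
    if n_users < 0:
        raise ValueError("n_users must be >= 0")
    if not 0.0 <= single_user_rate <= 1.0:
        raise ValueError("single_user_rate must be between 0 and 1")
    return 1.0 - (1.0 - single_user_rate) ** n_users


for n in (1, 2, 5, 10, 15):
    print(f"{n:>2} users: {share_of_problems_found(n):.0%}")
```

With L = 0.31 the model gives roughly 84% for five users, which is the figure usually quoted in support of the 5 users rule. The model covers one segment only and says nothing about recruiting or moderation quality, so treat it as a best case rather than a guarantee.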

Jakob Nielsen uses this scatter plot to defend the 5 user sample. It seems most engineers and designers (and agencies) interpret this as “Why doing extra homework is a waste of time”.

Look at the above plot closely. In particular the highlight I’ve added: users 6-10. Here’s how I interpret this graph: (humor intended)

User 1: No insights, hmm, that’s weird. Researcher on a popcorn break 😉
User 2: Lots of insights, wow.
User 3: Insights drop, not a lot to see.
User 4: Insights trailing off but steady.
User 5: All the insights pile in with the fifth user. Just in time to end the testing!

What about all that data happening on users 6-10? Holy cow, that’s some serious insights stacking up. Would you want to miss those? I wouldn’t.

What else can go wrong with small sample sizes?

Sample size is a big deal in UX because it impacts your learning and decision-making. Small sample sizes can be jeopardized by the following:

1. Recruiting issues: poorly recruited (read: sloppy) users can yield inaccurate insights. Grabbing random people from down the hallway? Pretty sure that’s not your customer! Instead you want actual customers or end-users who live the ‘pain’ you are focusing on, not those pretending to feel the pain.

2. Moderation: poor moderation can yield skewed data. For example, repeatedly asking users what they expect is not best-practice moderation. Listening is the most important skill in user research.

3. Observation: poor pattern recognition, such as making massive generalizations from 1 or 2 users, can lead to bad design decision-making.

4. Low insight users: if you have to “throw out” 1 or 2 ‘low data’ users, your insights will be diminished. Some users just don’t have a lot to say, which is usually the result of imprecise or poor recruiting technique.

Testing with 5 users can be a gamble. If your sample size does not yield insights or sustain patterns, you get thin on actionable data pretty quickly. Worse, try convincing your team that those 2 users in Nielsen’s low-fidelity testing recommendation are solid enough to act upon.

So, how many users does it really take to get the insight you need?

It depends on your approach. If you follow Nielsen’s advice completely and test 5 users (per customer segment) 3 times, with enough time to iterate in between tests, then you should be fine. For example, if you have 3 customer segments, that’s 15 users per round. Now repeat that for 3 rounds.
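The example above works out as follows. This is a minimal Python sketch; the function name total_sessions is hypothetical, and the defaults (5 users per segment, 3 rounds) simply mirror the numbers in the example.

```python
def total_sessions(segments: int, users_per_segment: int = 5, rounds: int = 3) -> int:
    """Total test sessions when every segment is tested in every round."""
    if min(segments, users_per_segment, rounds) < 0:
        raise ValueError("all arguments must be >= 0")
    return segments * users_per_segment * rounds


per_round = total_sessions(segments=3, rounds=1)  # 3 segments x 5 users
overall = total_sessions(segments=3)              # the same, repeated for 3 rounds
print(f"{per_round} users per round, {overall} sessions overall")
```

So following the “5 users” advice to the letter with 3 segments means 45 sessions, not 5.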

The practical answer: base your tests on 5 users per segment (per the original Nielsen/Landauer study). Decide on the number of iterations after you run the first test with a healthy enough sample size, e.g. 8-10 users, not the 2 users Nielsen suggests for low-fi prototyping.

Also remember: for optimal UX, do a field study first so you understand who and what you are designing for, and then test later.

Summary: User testing sample size (how many users do you need to test with?)

For Field Studies: When you interview users at home or at work, or run diary studies, you should be talking to 15-40 users. Note: We have found that deeply understanding user needs in a field study reduces your need to do many testing iterations. It also helps you get as close to “right first time” as possible. This factor is rarely mentioned when the question ‘How Many Test Users in a Usability Study?’ is asked.