We have previously learned a few methods for evaluating the quality of a user interface design. Inspection methods like heuristic evaluation and cognitive walkthrough let you understand usability issues through UI/UX experts' evaluation. We learned theories like Fitts' law. We also learned about ethnography, in which you observe how people interact with a system to identify the usability problems they face.

But so far, we haven't learned to evaluate usability with actual users. We will cover testing with actual users in the next few modules.

Usability

We have been casually using the term "usability." Usability refers to the quality of a software system being easy to learn, effective to use, and enjoyable to interact with. The notion of "good" usability depends on the context; for example, what is considered "good" for the UI of a real-time, first-person shooter game would differ from "good usability" for a word processor application. The most important aspects of usability for the former would be responsiveness, how exciting the game is, and aesthetics. On the other hand, users of the latter would value qualities like ease of learning and ease of putting together a document.

Having said that, knowing commonly used usability criteria would be helpful. Here is a list of commonly accepted usability goals (criteria) (Preece et al., pp.19-22):

Effective to use (effectiveness): Effectiveness refers to how good a product is at doing what it is supposed to do. You can ask "can the user successfully perform a task using this system?"
Efficient to use (efficiency): Efficiency refers to the way a product supports users in carrying out their tasks. You would ask "how many steps do users have to take to achieve their goal using this interface?" and "can they sustain a high level of productivity?"
Safe to use (safety): Safety involves protecting the user from dangerous conditions and undesirable situations. It refers not only to physical safety, but also to other aspects like "does the system help users protect their privacy?" and "does it allow people to cancel unwanted operations easily?"
Having good utility (utility): Utility refers to the extent to which the product provides the right kind of functionality so that users can do what they need or want to do.
Easy to learn (learnability): Learnability refers to how easy a system is to learn to use. It is well known that people don't like spending a long time learning how to use a system.
Easy to remember how to use (memorability): Memorability refers to how easy a product is to remember how to use, once learned. This is especially important for interactive products that are used infrequently. If users haven't used an operation for a few months or longer, they should be able to remember or at least rapidly be reminded how to use it.
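
Some of these goals can be turned into simple numbers once you have records of users attempting a task. The sketch below is one possible way to do that, not a method prescribed by Preece et al.: it treats task completion rate as a rough proxy for effectiveness, and average step count and time on successful attempts as rough proxies for efficiency. The `Session` record, its field names, and the sample data are all hypothetical and made up for illustration.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Session:
    """One participant's attempt at one task (hypothetical record format)."""
    participant: str
    completed: bool   # did the participant finish the task?
    steps: int        # number of actions (clicks, key presses) taken
    seconds: float    # time spent on the task


def completion_rate(sessions):
    """Effectiveness proxy: fraction of attempts that ended in success."""
    return sum(s.completed for s in sessions) / len(sessions)


def mean_steps_when_completed(sessions):
    """Efficiency proxy: average step count over successful attempts only."""
    return mean(s.steps for s in sessions if s.completed)


def mean_seconds_when_completed(sessions):
    """Efficiency proxy: average time on task over successful attempts only."""
    return mean(s.seconds for s in sessions if s.completed)


# Made-up data for illustration only; not from a real study.
sessions = [
    Session("P1", True, 6, 41.0),
    Session("P2", True, 8, 55.5),
    Session("P3", False, 15, 120.0),
    Session("P4", True, 7, 47.5),
]

print("completion rate:", completion_rate(sessions))
print("mean steps (successful):", mean_steps_when_completed(sessions))
print("mean seconds (successful):", mean_seconds_when_completed(sessions))
```

Goals such as safety, utility, learnability, and memorability are harder to reduce to a single number like this and usually call for other kinds of data, for example observation notes or repeated sessions over time.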

Methods to Evaluate Usability

There are three broad categories of evaluation by which we can assess the usability of a system (Preece et al., pp.456-462):

Evaluation methods not involving users: Using models like Fitts' law and employing UI experts for heuristic evaluation fall into this category. In this class of evaluation methods, you do not directly interact with the users.
Testing system's design in natural settings involving users (field study): There is little or no control over users' activities, so that you can determine how the product would be used in the real world. An example of this type of evaluation is observation and interviews in the field. What you have previously done (engaging with the target audience to interview or observe them) falls into this category. At the beginning of the design process, we were interested in using interviews and observation to empathize with users. You could repurpose these methods to investigate how people use the system that you have created.
Evaluating system's design in controlled settings involving users (lab study): The most typical form of this kind of evaluation is a usability test conducted in a lab. Users' activities are controlled in order to test hypotheses. You set up ways to measure or observe specific behaviors of the users. We will discuss this further below, as well as in 13. Qualitative Analysis and 14. Laboratory Study.

Different study methods (interview, observation, heuristic evaluation, usability testing, etc.), and whether they are conducted in the lab or in the field, have their own advantages and disadvantages. They allow us to collect different types of data. It is your role as a designer to decide which method is suitable for evaluating the usability of your system. Luckily, you do not have to pick a single method; the best evaluation strategy is to use multiple methods that complement each other. Using multiple study methods to evaluate a design from different angles is called triangulation.

https://youtu.be/v8JJrDvQDF4

Before we talk about usability testing, let's take a step back and ask, "why would you test the system with actual users anyway?" For example, why can't we use an inspection method and be done with evaluating usability? Though inspection is a good first step in teasing out obvious usability problems, evaluators may know too much about typically accepted "good practice" and not enough about the user's tasks, the user's environment, and the user's background. These gaps can introduce a discrepancy between what the UI experts think is "good" and what actual users perceive as usable.