Proceedings of HF 2002, Nov. 25-27, 2002, Melbourne, Australia
Towards an Activity Scenario Based Methodology for Usability Testing of Websites
Lejla Vrazalic1 and Peter Hyland2
University of Wollongong
1lejla@uow.edu.au, 2peter_hyland@uow.edu.au
Keywords: Usability testing, methodology, activity theory, scenarios, website, laboratory.
Abstract
Traditional laboratory based usability testing methodologies do not take into account the context in which users engage in socially-driven activities when using a website along with other types of mediating tools. A preliminary study with 34 users indicates a series of shortcomings with this traditional approach. The activity scenario based usability testing methodology, currently being developed, aims to overcome these shortcomings by utilising a combination of Activity theory principles and scenario development. This paper describes the initial theoretical investigation and empirical evidence which will be used as the basis for further development and refinement of the methodology.
1. Introduction
Current laboratory based usability testing methodologies rely on investigating how individual users interact with WWW interfaces. This type of testing enables evaluators to obtain detailed data about the cognitive processes involved in a direct interaction between a human and an interface. However, it also fails to take into account the context in which the interaction occurs, as well as the reason for its occurrence. According to Kuutti (1996, p.19) “the Cartesian ideal of cognitive science […] has been seen as unable to penetrate the human side of the interface”. No provisions are made for the study of users’ real activities and the way in which users employ a website as one of the various tools which support their activities. Whiteside and Wixon (1987) had already called for studying real users and systems in rich contexts, while Bannon (1991) pointed out that actual system use is a long-term process and, as such, it is inappropriate to research inexperienced users during short periods of time.
This paper provides a general description and overview of an Activity theory (AT) and scenario based (Carroll, 2000) website usability testing methodology that is currently being developed. The methodology aims to overcome the problems associated with traditional laboratory based testing by taking contextual factors into account and still retaining the level of control afforded by a laboratory. The paper begins with a discussion of the rationale behind developing the methodology, which includes empirical evidence of the problematic nature of traditional laboratory based usability testing from usability tests conducted recently with 34 users, along with the propositions on which the methodology is based. The paper then aims to describe in some detail the phases of the methodology, which are currently being developed, from a theoretical perspective, in order to present the theoretical foundations on which the methodology is based. Finally the paper concludes with a statement of potential benefits and pitfalls, along with future refinements and developments.
2. Rationale and Initial Exploratory Study
This section aims to explain the rationale and motivation behind developing an activity scenario based usability testing methodology, in view of the need for contextual evaluation at all stages of the design process. It begins with a closer examination of some of the issues associated with traditional usability testing in a laboratory setting and then describes the results of a series of 34 usability tests conducted to derive an initial list of shortcomings of traditional usability testing of websites. Spinuzzi’s (1999) view of distributed usability is then presented as the underlying proposition and starting point for the development of the methodology.
2.1. Traditional laboratory-based usability testing
Traditional usability testing methods employed in a laboratory setting are constrained by the lack of contextual factors inherent to real user activities. These factors include the work, time, motivational and social contexts (Whiteside et al., 1988) that encase human activities. Experiments carried out in a laboratory are radically different to the natural, everyday practices that humans engage in through interaction with various websites, objects and other humans.
“Experiments involve an experimenter and a subject. A subject is someone who is under the power, control, influence, observation, or direction of another person. Indeed, experimenters wield considerable power over subjects – they remove them from their accustomed environment, remove them from their usual social context, prescribe what work they are to do during the experiment, prescribe the time available for completing the work, minimize ‘external variables’ such as interruptions […], and often conceal the details of why the experiment is being run (for fear of contaminating the data).” (ibid., p.806)
The testing methods used in a laboratory setting, such as the ones described by Rubin (1994), tend to focus on how one individual interacts directly with a computer in an isolated setting. The cognitive processes and abilities of the individual, including memory, perception and motor skills, are scrutinised and measured using performance based metrics such as time taken to complete a task, number of errors made and perceived ease of use. However, this micro-level of analysis does not take into account the social setting in which the human-computer interaction takes place in the real world, nor any of the other contextual factors identified by Whiteside et al. (1988). In fact, usability testing done in a typical laboratory environment tends to be technology driven (Sweeney et al., 1993) rather than focused on users’ activities, motives and goals.
Furthermore, a laboratory necessitates the use of cameras and video-recording equipment for observation purposes. One of the main disadvantages of this technique is that the user involved in the testing may feel self-conscious and alter his/her behaviour and performance as a result. This introduces a significant bias into the testing process and thus contaminates the data collected. The following section will illustrate some of the shortcomings of traditional usability tests derived from an empirical study.
2.2. Initial exploratory study to determine shortcomings of traditional approach
A series of usability tests was carried out with 34 participants (who were also typical website users) at an Australian university in April 2002. The participants included 19 mature-age students and 15 first year students who had completed the HSC in 2001. A pre-test survey was prepared to collect data about the users’ background, computer and Internet experience and previous usage of the website being tested. The participants were then asked to evaluate a specific part of the university’s website by completing two typical task scenarios which required participants to use the website to find specific information about courses, fees and entry requirements. The scenarios were developed in consultation with the designers of the website to reflect typical uses of the website. Participants were asked to think aloud while doing the scenarios. A facilitator was also present in the room during the testing to prompt the participants and deal with any technical issues. Following the scenarios, users were asked to complete a post-test survey which consisted of 32 statements about the perceived usefulness and ease of use, as well as the navigation, content and appearance of the website. Users were required to rate these statements across a standard Likert scale. Finally, the users were briefly interviewed about their prior personal usage of the website. A pilot test was also used to verify the surveys and scenarios and minor adjustments were made where required. Since the purpose of the usability testing was to compile an initial list of problems and shortcomings of the actual testing process, the results of the tests will be reported only to the extent that they are relevant to the discussion of the list of shortcomings drawn from the tests. This list was compiled based on observing the participants, noting comments and questions by the participants and analyzing the responses provided by the participants during the interview. 
The shortcomings have been categorised into two types: user related and scenario related. They are shown in Table 1 below. Facilitator related issues were also observed; however, these require further study and will be presented elsewhere.
Table 1. Initial list of shortcomings of traditional laboratory based usability testing
| Type | Shortcomings | Evidence |
|---|---|---|
| User related | User motives: users are not engaged in tasks that are directly relevant to them and therefore the users’ motives are not real. | Users observed as being uninterested in completing a scenario, adopting a “get it over and done with” attitude to the test. Interviews also reveal that motives for using the site differ and are not reflected by the scenarios developed. |
| | Previous experience with website: users’ impressions of website are based on previous experiences and usage of the site (including the learning process). | Users observed experiencing difficulties using the website and expressing frustration, however results of the post-test survey indicate a positive attitude towards the website. An explanation for this may be found in the interview responses which generally show that users were satisfied with the website in previous usage and these impressions took precedence over the usage during the usability test. |
| Scenario related | Scenarios of isolated activities: users given two unrelated and distinct scenarios to complete using only the website. In real life, users’ activities are often related and dependent on other factors, without a well-defined boundary. | In the interview users were asked what they had previously used the website for. They indicated that, as prospective students, they had used the website for exploratory purposes, rather than finding specific information, one piece of information often leading to another activity. |
| | Reliance on other sources of information: users do not rely on the website exclusively for information. They use other sources, such as other websites, books or people. | Users were observed following links external to the website being tested to find information. The pre-test survey indicated that the majority of users did not use the website as an exclusive source of information when applying for university, while the interview revealed that users also used the UAC guidebook, and contacted the university directly either by telephone or e-mail. |
| | Length of time spent on a scenario task: in a lab setting the amount of time allocated to a task is limited by test objectives and the resources available. In real life, users can spend hours browsing websites and switching between various activities as they find information or get distracted by other activities/information. Also users may initiate a new search based on information they find by accident. | While completing the second scenario tasks, users would accidentally find the information for the first task scenario, which they could not find previously. Comments made by users indicate that this is often the case when browsing the Internet for personal purposes. |
Despite the problems associated with conducting usability testing in a usability laboratory, this type of testing environment is practical, affords the highest degree of control and allows evaluators to manipulate the testing process by making necessary adjustments as the testing proceeds. Furthermore, the advantages of video-recording users’ interactions include the possibility of obtaining comprehensive recordings which can later be replayed and analysed in detail, the reliability provided by having several evaluators analyse the same recording, and the opportunity to edit a compilation tape for presenting to clients as an illustrative accompaniment to the report (Sweeney et al., 1993).
Considering the above mentioned factors, the key issue then becomes how to overcome the shortcomings identified and still retain all the benefits of using a laboratory. This is particularly relevant since the current shift is increasingly towards the study of human-human interaction mediated by computer technology (Aboulafia, 2001). By adopting this perspective, the cognitive model to which traditional usability laboratories subscribe is made redundant. There is a need not only to incorporate social interaction and group activity into the usability evaluation process in order to gain an authentic insight into how users actually use the technology in a social context, but also to reveal the mediating role of technology in the network of human-human and human-computer interactions. There is a need to develop an understanding of the different ways in which users, as members of a communal domain, use the technology and other mediating tools. In other words, there is a requirement to re-examine the way we think of usability.
3. Re-defining Usability
The notion of usability has conventionally been viewed as the extent to which an intended user can meet his or her goals by using a particular technology, in this case a website. According to Spinuzzi (1999) this implies that usability is located within the interface itself and as such it is inadequate for understanding how users carry out activities which involve the interaction of various users with several different tools, other than the interface. Instead, Spinuzzi (1999) argues that usability is distributed across the activity network which is comprised of assorted genres, practices, uses and goals. Nardi and O’Day (1999) view this arrangement of tools, which jointly mediate activities, as belonging to an information ecology. They define an ecology as a “system of people, practices, values, and technologies in a particular local environment” (p.49) which focuses on human activities served by technology, rather than technology itself. Through this idea, we see further movement away from the cognitive viewpoint utilised in traditional laboratory testing methods.
The re-defined concept of usability and the notion of an information ecology form the starting point of our interest in developing an activity scenario based methodology for usability testing. In addition to studying the direct interaction between a user and a computer, it is necessary to gather an in-depth understanding of users’ activities in the context in which they occur by investigating the ways in which users interact with each other and use other tools such as manuals, documentation, pens and paper, to support their activities. As Sweeney et al. (1993) correctly point out, usability laboratories often invest heavily into technology and equipment at the expense of developing an appropriate, user-driven and user-based evaluation methodology. The theory on which we base our methodology will be described next.
4. Cultural Historical Activity Theory
Cultural Historical Activity Theory, or simply Activity Theory (AT) as it is widely known, provides a broad conceptual framework that can be applied to the human-computer interface in such a way as to empower the computer user with the necessary tools to work through the interface in order to achieve desired outcomes without the need for them to embark on lengthy periods of training. Historically, AT draws on the Vygotskian theory of tool mediation, that is, the mediation of human activities by the use of tools. This approach deviates from the cognitive approach in that the computer is seen as distinctly different in both character and composition to its human user. From an AT perspective, people are embedded in a socio-cultural context and their behaviour cannot be understood independently of it. Furthermore they are not just surrounded by the context but actively interact with it and change it. Humans are continually changing activities and creating new tools. This complex interaction of individuals with their surroundings has been called an activity and is regarded theoretically as the fundamental unit of analysis, a system that has structure, its own internal transitions and transformations, its own development (Leont’ev, 1981). AT is becoming more widely known by human computer interaction researchers in the west (Kuutti, 1996; Engeström, 1995; Kaptelinin, 1994; Bødker, 1991, 1996; Nardi, 1996) since it was introduced from Russia in the eighties and early nineties. Its most current and widely-adopted form is Engeström’s (1987) systemic model shown in Figure 1 below.

Figure 1. Human Activity System (Engeström, 1987)
Kuutti (1996) describes the key principles of AT as follows:
· Activity as the basic unit of analysis
Instead of analysing only human actions, AT proposes that a minimal meaningful context for these actions should be included in the analysis and this unit comprising actions in a context is an activity.
· History and development
Activities are in a constant state of evolution and therefore, it is necessary to historically analyse an activity in order to gain an understanding of the current situation.
· Artifacts and mediation
Activities are mediated by artifacts and artifacts themselves are created during the development of an activity. This dual relationship further implies the developmental nature of activities.
· Structure of an activity
An activity is directed towards an object and the object is what distinguishes one activity from another. The transformation of the object into the outcome motivates the existence of the activity. Furthermore, the object and motive could undergo changes during the development of an activity.
· Levels of an activity
An activity is realised through conscious actions which have defined goals. Those actions, in turn, consist of operations which are dependent on the available conditions. The relationship between the elements of this hierarchy, depicted in Figure 2, is dynamic so that initially operations are actually conscious actions. Through practice, these actions will collapse to the level of operations. However, if conditions change, the operation can return to the level of a conscious action.

Figure 2. Structure of an Activity (Leont’ev, 1981)
· Zone of Proximal Development (ZPD)
A person has two levels of performance: the level he/she can achieve alone and unaided, and the level that can be achieved with help of a more experienced individual. The latter performance ability is referred to by Vygotsky (1978) as the zone of proximal development (Bellamy, 1996).
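The "levels of an activity" principle listed above can be illustrated with a minimal sketch. It assumes a simple two-state model in which each step of an activity is either a conscious action or a routinised operation; the names `Level`, `Step` and `Activity` are hypothetical and are not part of AT terminology or of the methodology itself.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Level(Enum):
    ACTION = "action"        # conscious and directed towards a defined goal
    OPERATION = "operation"  # routinised, dependent on available conditions


@dataclass
class Step:
    """One unit of behaviour within an activity (hypothetical name)."""
    name: str
    goal: str
    level: Level = Level.ACTION  # initially, operations are conscious actions

    def practise(self) -> None:
        # Through practice, an action collapses to the level of an operation.
        self.level = Level.OPERATION

    def conditions_changed(self) -> None:
        # If conditions change, the operation returns to a conscious action.
        self.level = Level.ACTION


@dataclass
class Activity:
    """An activity, realised through a sequence of steps."""
    motive: str
    steps: List[Step] = field(default_factory=list)

    def conscious_steps(self) -> List[str]:
        # Names of the steps that currently require conscious attention.
        return [s.name for s in self.steps if s.level is Level.ACTION]
```

In such a model, a user who has practised a search step on a familiar website would perform it as an operation, and a change in conditions (a redesigned page, for example) would return the step to the list given by `conscious_steps()`.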
The principles of Activity Theory described above are of direct relevance to overcoming the shortcomings described in Table 1. Table 2 indicates which principles can be applied to specific shortcomings. The proposed usability testing methodology, which incorporates these principles, will be described in the following section.
Table 2. Mapping of Activity Theory principles to shortcomings identified in Table 1
| Type | Shortcomings | Activity Theory Principles |
|---|---|---|
| User related | User motives | Levels of an activity; Structure of an activity |
| | Previous experience with website | Activity as the basic unit of analysis; History and development; Zone of proximal development |
| Scenario related | Scenarios of isolated activities | Activity as the basic unit of analysis |
| | Reliance on other sources of information | Artifacts and mediation; Activity as the basic unit of analysis |
| | Length of time spent on a scenario task | Structure of an activity |
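For evaluators who wish to work with the mapping in Table 2 programmatically, it can be written as a simple lookup. The sketch below only restates the table; the names `PRINCIPLES_BY_SHORTCOMING` and `shortcomings_addressed_by` are hypothetical conveniences, not part of the methodology.

```python
# Table 2 restated as a lookup: shortcoming -> AT principles that apply to it.
PRINCIPLES_BY_SHORTCOMING = {
    "User motives": [
        "Levels of an activity",
        "Structure of an activity",
    ],
    "Previous experience with website": [
        "Activity as the basic unit of analysis",
        "History and development",
        "Zone of proximal development",
    ],
    "Scenarios of isolated activities": [
        "Activity as the basic unit of analysis",
    ],
    "Reliance on other sources of information": [
        "Artifacts and mediation",
        "Activity as the basic unit of analysis",
    ],
    "Length of time spent on a scenario task": [
        "Structure of an activity",
    ],
}


def shortcomings_addressed_by(principle):
    """Return the shortcomings (in table order) that a principle is mapped to."""
    return [
        shortcoming
        for shortcoming, principles in PRINCIPLES_BY_SHORTCOMING.items()
        if principle in principles
    ]
```

Reading the table in this inverse direction shows, for instance, that "Activity as the basic unit of analysis" is mapped to three of the five shortcomings.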
5. Activity scenario based usability testing methodology
The activity scenario based usability testing methodology finds its theoretical foundations in the marriage of activity theory and Carroll’s (2000) scenarios. By combining several aspects of AT and scenarios, we arrive at a methodology that offers evaluators an insight into the natural context of use in an artificial laboratory setting which proffers a high degree of control. In this methodology, the computer is reduced to a support role as one of the many mediating tools in user activities. The focus, instead, is on identifying usability issues and problems across the entire activity network of humans and tools. No specific usability attributes are examined in an attempt to create a holistic, rich, qualitative representation of the usability of a website. A preliminary working model diagram of the methodology is shown in Figure 3 below.

Figure 3. Working model of the activity scenario based usability testing methodology
5.1. Defining user activities
The initial phase involves defining real user activities by observing and interviewing users who actually use the website being tested in their everyday activities. Where appropriate, field interviews and observations can be carried out in order to understand users’ needs, desires and their approach to the work they do (Beyer and Holtzblatt, 1999). The interviews can be carried out on a one-to-one basis or in focus groups involving teams that carry out the same activity. This provides a forum for discussing and observing the social interactions between users, and for developing an understanding of the social context by gathering information, stories and anecdotes. Due to the problematic nature of gathering this type of ad hoc information, the AT principles described previously can be used to make sense of the information gathered and also provide evaluators with a common vocabulary (Nardi, 1996) as AT terminology is a close reflection of users’ activities and, as such, easily understood by users. The information collected from the interviews or focus groups would provide an integrated, holistic view of the main activity and intersecting activities and a description of the various mediating tools used in performing the activity, as well as an explanation of how they are used. In other words, the evaluator would have a rich view of the users’ activities which can then be simulated in the laboratory through the use of scenarios. Notes about the layout of the user’s environment should also be taken to allow for a more authentic physical layout in the laboratory.
In order to be able to fully simulate the users’ environment, the evaluators also need to note carefully all the different types of interruptions and disruptions which occur in the environment, such as phone calls, various queries, etc. These can then also be transferred to the laboratory setting and used to effectively study the different types of breakdowns (Bødker, 1991) and contradictions (Engeström, 1987) which result from the disruptions. This aspect of the users’ context will be incorporated into the proposed methodology as part of future refinements.
The key objective of this phase is to explore the users’ work practice (Borgholm & Madsen, 1999) and gain an understanding of real user activities. It is important to allow the evaluators to immerse themselves in the users’ practice and, by applying AT principles, gain a shared understanding and interpretation of what transpires during a typical activity which is supported by the website being evaluated, before moving to the usability laboratory. Once a common interpretation has been developed, the evaluators can proceed with phase two, which involves developing activity scenarios to be used during the actual usability testing process. The scenarios developed for the purpose of usability testing need to be an accurate reflection of the information gathered from the first phase about user activities. This is where the methodology borrows from Carroll’s (2000) concept of scenarios.
5.2. Scenario development
Having gained a rich and detailed understanding of the users’ context in terms of activities and mediating tools, the methodology proceeds by designing a set of task scenarios for use in the actual usability testing phase. Carroll (2000) has advocated the use of scenarios for understanding human activities and designing tools to support these activities. Scenarios have been widely used at all stages of the systems development process, and in particular for designing evaluation tasks, both for summative and formative purposes (ibid.). According to Kahn (1962), using scenarios offers several important advantages, including providing an emphasis of various circumstances that may arise, and, as such, a focus on the contextual issues. The use of scenarios also makes the evaluators’ understanding of the context more material, and it helps them reflect on the knowledge they have gained from the interviews, focus groups and observation.
When developing scenarios, the evaluators need to consider the notion of distributed usability as defined by Spinuzzi (1999). This means considering the usability issues in terms of the support afforded for the whole activity and its ecology of tools, and not only in terms of the website itself. This will affect the scenario design because the scenarios need to reflect this type of distributed usability and should allow such usability issues to emerge during the testing process. The development of scenarios is in itself an iterative, prototyping process. Once the final versions have been developed, the actual usability testing in the laboratory can proceed.
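One way to keep distributed usability in view while drafting scenarios is to record, for each scenario, the mediating tools and other people it involves. The following is a minimal sketch under that assumption; the `ActivityScenario` record, its fields and the `is_website_only` check are hypothetical illustrations and are not prescribed by the methodology.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ActivityScenario:
    """Hypothetical record of one activity scenario drafted for testing."""
    activity: str                 # the real user activity being simulated
    website: str                  # the website under evaluation
    mediating_tools: List[str] = field(default_factory=list)     # e.g. guidebook, telephone
    other_participants: List[str] = field(default_factory=list)  # people involved
    interruptions: List[str] = field(default_factory=list)       # e.g. phone calls

    def is_website_only(self) -> bool:
        # A scenario that involves no tool other than the website and no
        # other people cannot surface distributed usability issues.
        other_tools = [t for t in self.mediating_tools if t != self.website]
        return not other_tools and not self.other_participants


draft = ActivityScenario(
    activity="choosing a course as a prospective student",
    website="university website",
    mediating_tools=["university website", "UAC guidebook", "telephone"],
)
```

A check such as `draft.is_website_only()` could be run on each draft during the iterative development of scenarios to flag those that still treat the website in isolation.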
5.3. Usability testing
During this phase the key user or users are invited to the laboratory where they are asked to test the website based on the developed scenarios. If the nature of the activity is such that it involves interaction with other users, they are also invited to be present and part of the usability test. The laboratory should be set up in such a way that enables the monitoring of social interactions in the room and allows evaluators to design a realistic setting closely resembling the users’ actual environment. Cameras placed strategically around the room should enable the observers to view the interaction between the user and the website, as well as the interaction between all the users in the room.
A typical usability laboratory is often a sterile, empty room with one desk and a computer. It is usually quiet and a far cry from the typical user environments which may be noisy and sometimes crowded. To mimic this environment the laboratory should be set up to include typical artifacts used in the user’s setting, including shelves with books, a noticeboard, filing cabinets, various chairs and desks, etc. A telephone on the desk would afford interruptions while users are doing the testing. Other interruptions may be in the form of intermittent queries and questions from other users in the room. There should be no one-way observation mirror because this is not natural to the typical user environment. The cameras should be placed inconspicuously behind plants and on top of high shelves in order to get a wide-angle view of the events taking place in the room. This setting enables evaluators to study both the ecology and social context of an activity supported by the website being evaluated. When the usability testing is complete, the evaluators proceed with the analysis and discussion of the data collected, referring once again to the AT framework, in focus group discussions. This phase of the methodology is yet to be refined.
6. Potential benefits and pitfalls
The activity scenario based usability testing methodology described in this paper will offer several advantages to both researchers and practitioners once it has been fully developed. These include: providing a rich and comprehensive view of the actual use of the websites being tested in the context of real user activities; providing a profile of the intended users, the mediating tools in conjunction with which the website is used, the various activities it supports and the different ways in which it does so; and providing a common AT-based vocabulary for conducting qualitative usability testing.
However, even at this early stage, several problems have already emerged. The potential pitfalls of the methodology may include one or more of the following: it is time consuming due to the extensive nature of initial interviews, focus groups and observation, and consequently it may be expensive; it requires trained evaluators; and it relies on intended users to be available at the testing site. With further refinements of the methodology, some of these pitfalls may be overcome eventually.
7. Further developments
As is evident from the previous discussion, the development of the activity scenario based usability testing methodology is still in its infancy. Preliminary investigations clearly point to the need to overcome the shortcomings associated with traditional usability testing in a laboratory and to take real user activities into account, while still retaining the degree of control afforded by the laboratory. The development of scenarios based on information gathered using AT principles may offer a solution. The development of the methodology is an iterative process and will proceed in the next stage with further theoretical refinements followed by a series of tests to enhance the methodology and expose potential problems and other relevant issues, such as incorporating learning through facilitation into the testing process.
8. References
· Aboulafia, A.L. (2001) The cognitive and social aspect of computer-mediated work, exemplified by the research traditions of HCI and CSCW, in H. Hasan, E. Gould, P. Larkin and L. Vrazalic (eds.) Information Systems and Activity Theory: Volume 2 Theory and Practice (University of Wollongong Press: Wollongong, Australia).
· Bannon, L.J. (1991) From Human Factors to Human Actors: The role of psychology and human-computer interaction studies in system design, in J. Greenbaum and M. Kyng (eds.) Design at Work: Cooperative Design of Computer Systems (Lawrence Erlbaum: Hillsdale, NJ).
· Bellamy, R.K.E. (1996) Designing Educational Technology: Computer-Mediated Change, in B. Nardi (ed.) Context and Consciousness: Activity Theory and Human Computer Interaction (MIT Press: Cambridge, MA).
· Beyer, H. and Holtzblatt, K. (1999) Contextual Design, interactions, 6, 32-42.
· Bødker, S. (1991) Through the Interface: A Human Activity Approach to User Interface Design (Lawrence Erlbaum: Hillsdale, NJ).
· Bødker, S. (1996) Applying Activity Theory to Video Analysis: How To Make Sense of Video Data in HCI, in B. Nardi (ed.) Context and Consciousness: Activity Theory and Human Computer Interaction (MIT Press: Cambridge, MA).
· Borgholm, T. and Madsen, K.H. (1999) Cooperative Usability Practices, Communications of the ACM, 42, 91-97.
· Carroll, J.M. (2000) Making Use: Scenarios and Scenario-Based Design, OZCHI 2000 Conference Proceedings, December 4-8, 2000, Sydney, 36-48.
· Engeström, Y. (1987) Learning by Expanding: An activity-theoretical approach to developmental research (Orienta-Konsultit: Helsinki).
· Engeström, Y. (1995) Polycontextuality and Boundary Crossing in Expert Cognition: Learning and Problem Solving in Complex Work Activities, Learning and Instruction, 5, 319-336.
· Kahn, H. (1962) Thinking about the unthinkable (Horizon Press: New York).
· Kaptelinin, V. (1994) Activity Theory: Implications For Human Computer Interaction, in M.D. Brouwer-Janse and T.L. Harrington (eds.) Human-Machine Communication For Educational Systems Design (Springer-Verlag: Berlin).
· Kuutti, K. (1996) Activity Theory as a Potential Framework for Human-Computer Interaction, in B. Nardi (ed.) Context and Consciousness: Activity Theory and Human Computer Interaction (MIT Press: Cambridge, MA).
· Leont’ev, A.N. (1981) Problems of The Development of The Mind (Progress Publishers: Moscow).
· Nardi, B. (1996) Activity Theory and Human-Computer Interaction, in B. Nardi (ed.) Context and Consciousness: Activity Theory and Human-Computer Interaction (MIT Press: Cambridge, MA).
· Nardi, B.A. and O’Day, V.L. (1999) Information Ecologies: Using Technology with Heart (MIT Press: Cambridge, MA).
· Rubin, J. (1994) Handbook of usability testing: How to plan, design, and conduct effective tests (Wiley: New York).
· Spinuzzi, C. (1999) Grappling with distributed usability: A cultural-historical examination of documentation genres over four decades, Proceedings of the 17th annual international conference on Computer documentation, September 12-14, 1999, New Orleans, 16-21.
· Sweeney, M., Maguire, M. and Shackel, B. (1993) Evaluating user-computer interaction: A framework, International Journal of Man-Machine Studies 38, 689-711.
· Vygotsky, L.S. (1978) Mind in Society (Harvard University Press).
· Whiteside, J., Bennett, J. and Holtzblatt, K. (1988) Usability engineering: our experience and evolution, in M. Helander (ed.) Handbook of Human-Computer Interaction (North-Holland: Amsterdam).
· Whiteside, J. and Wixon, D. (1987) Discussion: Improving human-computer interaction – a quest for cognitive science, in J. Carroll (ed.) Interfacing Thought: Cognitive Aspects of Human-Computer Interaction (MIT Press: Cambridge, MA).