Proceedings of HF 2002, Nov. 25-27, 2002, Melbourne, Australia
Learning and Transfer in an Applied Visual Spatial Task
Shayne Loft, Andrew Neal and Michael Humphreys
Key Centre for Human Factors and Applied Cognitive Psychology
University of Queensland
s.loft@humanfactors.uq.edu.au
Keywords: Learning, transfer, air traffic control, training, context
Abstract
This paper presents a new dynamic visual spatial task for use in applied cognition research. The aim of the experiment reported is to illustrate a major limitation of learning from individualized examples: the inability to transfer across different contexts. Instance-based models of learning emphasize the role that memory for previous examples plays in subsequent task performance, and the predictions were based on this framework. The task required participants to decide as quickly and as accurately as possible whether pairs of aircraft moving on the screen would come within 1 cm of each other (conflict). During training the surface features of the items were held constant, and during transfer they were changed. Changing the spatial configuration of the aircraft had the largest negative impact on performance, followed by orientation and then position. The results illustrate some key ways in which episodic memory influences performance in a dynamic visual spatial environment and raise some avenues for further enquiry. Design implications are discussed.
1. Introduction
This paper presents a new visual spatial task, based on air traffic control, for use in applied cognition research. The experiment reported in this paper is the first in a series of studies concerning how people learn from individualized examples.
1.1. Learning from examples
Performance improvements usually occur after repeated performance on the same task. Practice involves the presentation of new and old examples of problems that need to be classified or solved. There are two general classes of theories concerning how individuals learn from examples: abstraction and instance models of learning. These two distinct approaches carry different assumptions regarding the causal mechanisms responsible for improvements in performance associated with task practice (Estes, 1986; Logan, 1988; Anderson, 1983). Abstraction models of learning claim that performance improvements result from abstract representations of statistical regularities within presented practice examples (Anderson, 1983; Rumelhart & Ortony, 1977). That is, individuals develop rules based on information abstracted from each successive learning example.
Instance models of learning emphasize knowledge derived from individualized learning episodes (Estes, 1986). A basic premise present in all instance models of learning is that individuals retrieve previously encountered examples, and use this information for subsequent decisions (e.g., Hintzman, 1986; Logan, 1988). That is, individuals use memories for past examples to solve new examples, rather than abstracting and applying general rules to new examples. A wide variety of models share the assumption that the retrieval of previous examples is essential for learning, and these models have been applied to cognitive task domains such as judgment (Kahneman & Miller, 1986), problem solving (Ross, 1984), and categorization (Allen & Brooks, 1991).
1.2. Transfer of cognitive skill
Kimball and Holyoak’s (2000) taxonomy of skill transfer and expertise draws clear distinctions between structural and surface features of practice examples. To illustrate, consider the following air traffic control example. In this example, a pair of converging aircraft is flying at the same altitude toward a 105-degree common angle of intersection. One of the aircraft is traveling at 1000km per hour and the other at 500km per hour. Controllers would use the speeds and relative positions of the aircraft to make a conflict status decision, also taking into account the angle at the point of intersection. These are structural features of the item, because they are functional and goal relevant.
The item also contains non-functional and goal irrelevant features considered surface in nature. Surface features are commonly referred to in the learning and transfer literature as context. Surface features include the position of the aircraft event on the radar screen, the orientation of the angle of intersection and the configuration of the aircraft. The key point regarding surface features is that they make absolutely no difference to conflict status, constituting only the context in which the aircraft are presented.
Instance models of learning emphasize the role of memory for previously presented examples. According to such models (e.g., Hintzman, 1986; Logan, 1988), each time a training example is presented, the entire processing episode is stored as a separate memory trace. This trace contains information concerning the presenting structural conditions of the example (speeds, relative position of the aircraft, and angle of intersection) and associated conflict status (conflict/non-conflict). When presented with a new example, individuals can retrieve traces of previously seen examples. The speeds, relative position and angle of intersection of the current pair of aircraft can be compared to the retrieved traces to make an accurate conflict status decision.
An assumption present in all episodic learning models is that each trace includes the particular context in which the processing operations were carried out. Instance learning is inherently dependent on memory, and context changes can have a powerful effect on memory. When presented with a new example, the retrieval of previous examples is dependent on the match between the current context and previous contexts (Tulving & Thomson, 1973). Task performance should be sensitive to the surface features of items, even if these features are irrelevant to the outcome of items. While transfer should be very good to new items presented in the same or similar context as training, learning will most likely not transfer well beyond the original context. With sufficient contextual variation, individuals might not use the knowledge derived from training because they think they are looking at a completely new type of item, based on superficially dissimilar features.
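The context dependence described above can be sketched in a few lines of Python. This is a minimal illustration only, not the model used by the authors or by any of the cited instance theories: the exponential similarity rule, the feature names and the `MISMATCH_PENALTY` values are all assumptions chosen to mirror the ordering predicted in this paper. Every stored trace carries its surface context, and a probe that mismatches the stored context retrieves those traces less strongly, even though its structural features are unchanged.

```python
import math

# Hypothetical penalties for a surface-feature mismatch between probe and trace.
# These values are assumptions for illustration; they are not fitted to any data.
MISMATCH_PENALTY = {"position": 0.2, "orientation": 0.5, "configuration": 1.0}


def similarity(probe, trace):
    """Similarity decays exponentially with the summed mismatch penalty."""
    penalty = sum(
        weight
        for feature, weight in MISMATCH_PENALTY.items()
        if probe[feature] != trace[feature]
    )
    return math.exp(-penalty)


def retrieval_strength(probe, traces):
    """Summed similarity of the probe to every stored trace."""
    return sum(similarity(probe, trace) for trace in traces)


# One trained item: structural features plus the surface context it was seen in.
trained_item = {
    "speeds": (1000, 500),
    "angle": 75,
    "position": "upper left",
    "orientation": 0,
    "configuration": "fast aircraft on lower path",
}

# Fourteen training trials, each stored as a separate trace with the same context.
traces = [dict(trained_item) for _ in range(14)]

# Transfer probes: structure is identical, only one surface feature changes.
probes = {
    "trained": trained_item,
    "position": {**trained_item, "position": "lower right"},
    "orientation": {**trained_item, "orientation": 180},
    "configuration": {**trained_item, "configuration": "fast aircraft on upper path"},
}

strengths = {name: retrieval_strength(probe, traces) for name, probe in probes.items()}
for name, strength in strengths.items():
    print(f"{name:13s} retrieval strength = {strength:.2f}")
```

Under these assumed penalties, retrieval strength falls from trained to position to orientation to configuration probes. The ordering is a consequence of the chosen weights and is not an independent result.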
1.3. The air traffic control task
The task required participants to predict the conflict status of pairs of aircraft, based on their speeds, relative position, and angle of intersection. The training conditions in this experiment were optimal for memory retrieval and instance learning. That is, we used low variability examples, in terms of both structural and surface features, during training. Participants were expected to benefit from the repeated presentation and quickly establish strong memory representations for training examples. Training conditions were not considered diverse enough to allow generalization mechanisms to abstract rules from the examples presented.
During training, we presented participants with two pairs of aircraft at set speeds and angles of intersection. Participants were required to decide whether each pair of aircraft would conflict, as quickly and as accurately as possible. Figure 1 presents an example of a training trial. On each of the 14 training trials, items with the same structural (speeds, angle of intersection) and surface features (position of the item as a whole on the screen, orientation, and configuration) were presented. The relative positions of the aircraft differed on every training trial. This determined conflict status, making the outcome of the item either a conflict or a non-conflict. On the transfer trial, we presented participants with a series of items containing different surface features from the trained items. The transfer items were structurally identical to the training examples (speeds, angle of intersection). Surface features in the task were superficial in nature and had no impact on the conflict status of items; they simply provided the context in which the structural and goal-relevant features of the item took place. There were four types of transfer items presented on the transfer trials.

Figure 1: The training task
Trained items were exactly the same as the item presented during training in terms of both structural and surface features. For position items, the item was simply moved to a new location on the screen; the aircraft speeds and angle of intersection were the same as in training. For orientation items, the angle was rotated by 180 degrees. For configuration items, the configuration of the aircraft was altered. For example, for item B in Figure 1, the 1000 km/hr aircraft was moved to the upper flight path and the 500 km/hr aircraft was moved to the lower flight path.
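To make concrete why these three manipulations count as surface changes, the sketch below applies each one to a pair of screen positions and checks that the distance between the aircraft is unchanged. It is an illustration under stated assumptions: the coordinates are invented, and the configuration change is modeled as a mirror reflection of the pair, which is one plausible reading of the aircraft swapping flight paths and not a description of the actual task software.

```python
import math


def separation(a, b):
    """Euclidean distance between two screen positions (cm)."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def move(point, dx, dy):
    """Position change: shift the item to a new screen location."""
    return (point[0] + dx, point[1] + dy)


def rotate_180(point, centre):
    """Orientation change: rotate the item 180 degrees about a centre point."""
    return (2 * centre[0] - point[0], 2 * centre[1] - point[1])


def mirror(point, axis_y):
    """Configuration change, modeled here as a reflection about a horizontal axis."""
    return (point[0], 2 * axis_y - point[1])


# Invented screen coordinates (cm) for one pair of aircraft at one moment in time.
pair = ((2.0, 5.0), (6.0, 3.0))

variants = {
    "trained": pair,
    "position": tuple(move(p, 4.0, -1.5) for p in pair),
    "orientation": tuple(rotate_180(p, (10.0, 8.0)) for p in pair),
    "configuration": tuple(mirror(p, 4.0) for p in pair),
}

separations = {name: separation(*points) for name, points in variants.items()}
for name, value in separations.items():
    print(f"{name:13s} separation = {value:.3f} cm")
```

Translation, rotation and reflection all preserve distance, so applying the same transformation at every moment leaves the whole separation history of the pair, and therefore its conflict status, unchanged.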
It was predicted that the participants would become both faster and more accurate at making conflict status decisions over training. For the transfer trials, it was predicted that the configuration surface change would have the largest negative impact on accuracy and reaction time of conflict status decisions, followed by the orientation change and then the position change. These transfer trial predictions were based on the reasoning that changing the configuration of the item represents the largest surface level change, followed by changing the orientation, followed by changing the position.
2. Method
Figure 1 illustrates the task interface that was used during training. Small green circles symbolize aircraft. Each aircraft has a letter and a speed shown on a green flight strip. Aircraft fly on set flight paths. Many of the flight paths cross at some point on the screen. These intersection points provide a set of angles. Angles are replicated at different points on the screen at different degrees of rotation. Aircraft are in conflict when a five nautical mile separation standard (1 cm on the computer screen) is violated. When in conflict, aircraft symbols and flight strips turn yellow, and they turn green again once the five nautical mile separation standard is re-established.
All items consist of a pair of aircraft traveling along different flight paths that intersect. Conflicts occur if the two aircraft reach the intersection at approximately the same time and violate the five nautical mile (1 cm on screen) separation rule. The development of conflicts and non-conflicts was standardized across all speed and angle of intersection combinations using an automated application tool.
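The conflict rule can be illustrated with a closest-point-of-approach calculation. The sketch below is not the automated tool mentioned above; it assumes straight flight paths, constant speeds, and that the angle of intersection is the angle between the two inbound tracks. The function names and example distances are hypothetical. The only fixed constant is the conversion of one nautical mile to 1.852 km.

```python
import math

NM_IN_KM = 1.852                  # one international nautical mile in kilometres
SEPARATION_KM = 5 * NM_IN_KM      # five nautical mile separation standard


def min_separation_km(d1_km, v1_kmh, d2_km, v2_kmh, angle_deg):
    """Smallest future distance between two aircraft converging on one point.

    d1_km, d2_km: current distances from the intersection along each path.
    v1_kmh, v2_kmh: constant ground speeds.
    angle_deg: angle between the two inbound tracks (an assumption about how
    the angle of intersection is defined).
    """
    theta = math.radians(angle_deg)
    u1 = (1.0, 0.0)                                # unit vector, intersection -> aircraft 1
    u2 = (math.cos(theta), math.sin(theta))        # unit vector, intersection -> aircraft 2

    # Relative position now, and the rate at which it shrinks per hour.
    r0 = (d1_km * u1[0] - d2_km * u2[0], d1_km * u1[1] - d2_km * u2[1])
    w = (v1_kmh * u1[0] - v2_kmh * u2[0], v1_kmh * u1[1] - v2_kmh * u2[1])

    ww = w[0] ** 2 + w[1] ** 2
    # Time of closest approach, never earlier than the present moment.
    t_star = max(0.0, (r0[0] * w[0] + r0[1] * w[1]) / ww) if ww > 0 else 0.0

    return math.hypot(r0[0] - t_star * w[0], r0[1] - t_star * w[1])


def is_conflict(d1_km, v1_kmh, d2_km, v2_kmh, angle_deg):
    """True if the pair will come closer than the separation standard."""
    return min_separation_km(d1_km, v1_kmh, d2_km, v2_kmh, angle_deg) < SEPARATION_KM


# Speeds and angle follow Item B; the starting distances are invented.
# Both aircraft are 90 seconds from the intersection, so they arrive together.
print(is_conflict(25.0, 1000, 12.5, 500, 75))
# The slower aircraft starts much further out and arrives well after the other.
print(is_conflict(25.0, 1000, 40.0, 500, 75))
```

Because the calculation depends only on the speeds, the distances to the intersection and the angle, it is unaffected by where the item sits on the screen or how it is rotated.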
Participants were presented with two sets of training trials. There were 14 trials in each training set and the duration of each trial was one minute and fifty seconds. The two training sets are labeled AB and CD. Participants were also presented with two sets of analogous transfer trials. There were 16 trials in each transfer set and the duration of each trial was 30 seconds. The two transfer sets are labeled AB and CD and correspond to the training sets. Half the participants were presented with training set AB, transfer set AB, training set CD and then transfer set CD. The other half of participants were presented with training set CD, transfer set CD, training set AB and then transfer set AB.
Training sets AB and CD consisted of two pairs of aircraft. Each pair of aircraft was heading toward a common point of intersection. The structural and surface features of the pairs of aircraft remained constant on each of the 14 trials in the training set. The details of the structural features of the four training items (A, B, C & D) are presented below in Table 1. The relative starting positions of the pairs of aircraft in each training trial were altered to produce the conflicts and non-conflicts. The term range refers to how long, in seconds, the pair of aircraft takes to conflict or pass each other safely. The ranges used during training were 85, 90, 95 and 100 seconds. This ensured that participants could not search for patterns in starting positions of aircraft to determine conflict status.
| Item Type | Speed of Pair (km/hr) | Angle of Intersection | Range (seconds) |
|---|---|---|---|
| Item A | 1200 and 300 | 120 degrees | 85, 90, 95, 100 |
| Item B | 1000 and 500 | 75 degrees | 85, 90, 95, 100 |
| Item C | 1100 and 400 | 105 degrees | 85, 90, 95, 100 |
| Item D | 900 and 600 | 60 degrees | 85, 90, 95, 100 |
Table 1: Details of the four training items
For the two transfer sets AB and CD, single pairs of aircraft were presented to participants. Participants had 30 seconds to determine the conflict status of each item presented before the aircraft disappeared and the next item was presented. Each transfer item had a range of 85 seconds and was run for 30 seconds, from a range of 85 seconds to 55 seconds. The 16 items contained in transfer sets AB and CD were presented in blocks of 4, with a 20 second break between each block. In transfer set AB, for example, the 16 items consisted of trained items A and B, position items A and B, orientation items A and B and configuration items A and B.
On both the training and transfer tasks, the scoring system encouraged both fast and accurate performance. Participants who responded early were rewarded with more points if correct, but lost the same number of points if incorrect.
Thirty-five first year psychology students volunteered to participate in return for course credit. Firstly, participants were given instructions on how to complete the training trials. They then completed the training set. Participants were then given instructions for how to complete the transfer set and completed it. This was followed by a 10-minute break. After the break participants completed the second set of training and test materials.
3. Results
3.1. Training
Figure 2 presents accuracy over training, collapsed over the 4 different types of training items (A, B, C and D). All training data analyses were carried out within a repeated measures analysis of variance. The effect of training was significant, [F (1,34) = 4.74, MSE = .134, p < .001], indicating that the accuracy of conflict status decisions increased significantly with training. Figure 3 presents reaction time improvements over training. The effect of training was significant, [F (1,34) = 12.92, MSE = 298.1, p < .001], indicating that reaction times decreased significantly with training.
Figure 2: Accuracy proportions over training
Figure 3: Reaction time over training
3.2. Transfer
Figure 4 presents accuracy on the transfer trial. Analyses of the accuracy data were carried out within a repeated measures analysis of variance. Five people were excluded from the analysis because they did not follow the transfer trial instructions correctly. A significant difference in accuracy was found between the transfer items [F (1,29) = 12.62, MSE = .135, p < .001]. Planned contrasts revealed no significant difference for accuracy on trained items (85%) compared to position items (77%). There was a significant difference between position items (77%) and orientation items (68%), [F (1,29) = 5.58, MSE = .181, p < .05], and between orientation items (68%) and configuration items (56%), [F (1,29) = 5.06, MSE = .323, p < .05].
Figure 5 presents reaction times on the transfer trial. A significant difference in reaction time was found between the transfer items [F (1,29) = 7.49, MSE = 41.92, p < .001]. Planned contrasts revealed a significant difference between reaction times for trained items (13.63 sec) and position items (15.1 sec), [F (1,29) = 4.7, MSE = 62.63, p < .05], and between position items (15.1 sec) and orientation items (16.67 sec), [F (1,29) = 4.86, MSE = 68.87, p < .05]. There was no significant difference in the reaction times of orientation items (16.67 sec) and configuration items (16.89 sec).
Figure 4: Accuracy on transfer items
Figure 5: Reaction time on transfer items
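For readers unfamiliar with planned contrasts in a repeated measures design, the sketch below shows how a contrast between two item types can be computed from per-participant scores. It is an illustration only: the accuracy values are invented, the function name is hypothetical, and this is not the authors' analysis script. It relies on the standard result that a contrast between two within-subject conditions is equivalent to a paired t-test, with F(1, n - 1) equal to the squared t statistic.

```python
import math
from statistics import mean, stdev


def paired_contrast_f(condition_a, condition_b):
    """F statistic and degrees of freedom for a two-condition within-subject contrast.

    Each list holds one score per participant, in the same participant order.
    """
    diffs = [a - b for a, b in zip(condition_a, condition_b)]
    n = len(diffs)
    standard_error = stdev(diffs) / math.sqrt(n)   # sample SD of the differences
    t_value = mean(diffs) / standard_error
    return t_value ** 2, (1, n - 1)


# Invented accuracy proportions for five hypothetical participants.
position_accuracy = [0.90, 0.80, 0.85, 0.70, 0.95]
orientation_accuracy = [0.80, 0.75, 0.70, 0.65, 0.80]

f_value, df = paired_contrast_f(position_accuracy, orientation_accuracy)
print(f"F({df[0]},{df[1]}) = {f_value:.2f}")
```

With 30 participants retained in the transfer analysis, the same calculation gives the 1 and 29 degrees of freedom reported for the contrasts above.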
4. Discussion
The current paper used instance-based models of learning and memory (e.g., Hintzman, 1986; Logan, 1988) as a theoretical framework for predicting how people would perform on the transfer trial items. Instance-based models emphasize the role of memories for examples that the controller has previously experienced during training. The majority of the predictions made were supported by the data. Both the accuracy and speed of conflict status decisions improved over the training trials. This was not surprising given the magnitude of structural and surface feature repetition in items present during the training trials. During training, only the relative positions of the two aircraft and the range were changed.
As predicted, changing the configuration of transfer items had the largest impact on the accuracy of decisions, followed by the orientation surface change, and then the position surface change. Individuals were significantly less accurate when the configuration was changed than when the orientation was changed, and in turn less accurate when the orientation was changed than when the position of the item on the screen was moved. There was no significant difference between the trained items (85%) and the position items (77%). This result indicates that changing the position of the item has a marginal but non-significant influence on conflict status decision accuracy. However, participants were significantly slower at making their conflict status decisions on position items compared to trained items. Consistent with the accuracy data, participants were slower at making conflict status decisions on orientation items than position items. There was no difference in reaction time between orientation and configuration items.
The transfer trial findings are interesting considering that surface feature changes have no impact on conflict status. Structurally, the training and transfer items are exactly the same. The only element that is changed is the context in which the item is presented. Task performance was sensitive to the surface features of the trained items because of the lack of contextual variation during training, consistent with research that has shown that the retrieval of previous examples is dependent on a match between the current context and the previous contexts (e.g., Tulving and Thomson, 1973).
The findings in this experiment are thought to be highly dependent on the type of task being performed and the type of training on the task. Tasks with high item repetition are optimal for memory retrieval and present ideal conditions for instance learning. The task conditions in the present experiment deliberately contained large amounts of item regularity in order to tap into the instance-based processes that have been shown to influence performance in applied work domains. Consequently, the findings from this experiment need to be qualified according to the task conditions present. Most real-world tasks contain much less item repetition than present here.
There is little doubt that cognitive skill reflects more than just sets of analytical rules that can be applied to different problems. In most task environments, new situations are often similar to old ones, allowing individuals to judge typicality. Analyses of expertise in air traffic control have suggested that controllers recognize specific types of conflicts that occur routinely at specific parts of their sector. Such recognition is thought to rely on a progressively developing ‘conflict library’ that stores information concerning previous air traffic scenarios experienced and their outcome (Neal, Griffin, Paterson & Bordia, 2000). Studies in other complex work domains, such as dermatology and firefighting, have provided evidence suggesting that, in addition to learning procedures and rules, experts do make use of prior instances (e.g., Brooks, Norman & Allen, 1991). Individuals are clearly influenced by memories for previous experiences.
At the time of submission of this paper, a second condition was being conducted in which participants are presented with the same repeated structural features during training as in the present experiment, but with variation in surface features. The surface features are manipulated during training such that participants are presented with multiple examples of each surface feature component (position, orientation, configuration) that is eventually varied on the transfer trial. Two surface features are varied together at a time during training so that the participants are not presented with any of the test items themselves during training. Participants in this new condition will then be presented with the same transfer trial as in the present experiment. It is predicted that exposure to a wider range of surface features in training will lead to greater generalization and greater transfer on the transfer trial. Variations in surface features of items should have little impact on transfer for this condition because participants will have been presented with contextual variation during training.
4.1. Implications for Design
The ability to recognize conflicts between aircraft is a key component of expertise in air traffic control (Neal, Griffin, Paterson & Bordia, 2000). A large proportion of a controller’s time is spent scanning the air situation display and flight strips in order to recognize potential conflicts between aircraft. This task is made difficult by the fact that controllers can have up to 40 aircraft “on frequency” at any point in time, and they often have a number of different conflicts to attend to that are at varying stages of development. The consequences of failing to identify a conflict in sufficient time to prevent the aircraft from violating the minimum separation standard can be disastrous. Surprisingly, no formal models have yet been developed that can explain how humans perform this task.
This paper reported the first in a series of experiments that aims to develop techniques for modeling and predicting human performance in complex tasks, such as air traffic control and fire fighting. The ultimate goal of this research program is to develop tools that industry can use to assess the risks associated with operator error, and to evaluate alternative mitigation strategies (e.g., the redesign of human-computer interfaces).
5. References
· Neal, A., Griffin, M.A., Paterson, J. & Bordia, P. (2000). Development of measures of situation awareness, task performance, and contextual performance in Air Traffic Control. In A.R. Lowe & B.J. Hayward (Eds.), Aviation Resource Management, Volume Two (pp. 305-314). Aldershot, UK: Ashgate.