E-Z Reader vs. SWIFT Models: Exploring Eye-Movement Control in Reading


Models of eye-movement control in reading

Computational models have become an increasingly important tool for understanding eye-movement control in reading. These models fall into two broad categories: oculomotor models and cognitive-control models (Rayner, 1998). Oculomotor models assume that eye movements in reading are controlled mainly by low-level oculomotor processes and are only indirectly related to cognitive processing. Cognitive-control models, by contrast, hold that ongoing lexical processing drives eye movements in reading; this view has gained wide acceptance among cognitive psychologists.

Cognitive models can be further distinguished by how they conceptualize visual attention. In sequential attention shift (SAS) models, attention is focused on one word at a time, in a strictly serial manner, and lexical processing occurs only on the attended word (Reichle, 2006). E-Z Reader is the most prominent SAS model. In contrast, guidance by attentional gradient (GAG) models contend that attention is distributed, so that more than one word can be processed at a time. Specifically, all words that fall within the perceptual span can be processed to some degree. SWIFT (saccade-generation with inhibition by foveal targets) is the leading GAG model.

In many fields of cognitive psychology, the extent to which we can process more than one item at a time has been discussed extensively. In reading research, this question becomes whether lexical access in silent reading is serial or parallel. The two leading computational models of eye-movement control in reading, E-Z Reader and SWIFT, take opposing positions on this issue, and distinct assumptions and predictions follow from each.

The following sections discuss the core assumptions and predictions of E-Z Reader and SWIFT in detail and conclude that SWIFT is more consistent with existing experimental data and better predicts reading behavior.

Core assumptions of the E-Z Reader and SWIFT models

E-Z Reader Model:

One of the first computational models of eye-movement control in reading, E-Z Reader (Reichle et al., 1998) is a family of models developed over successive versions to explain how perceptual, cognitive, and motor processes guide readers’ eye movements. Its fundamental claim is that lexical processing is the ‘engine’ that determines when the eyes move from one word to the next (Reichle et al., 1998), which is consistent with the assumption of cognitive-control models.

E-Z Reader is based on the ‘attention-shift’ theory (e.g., Morrison, 1984). According to Morrison, attention moves serially, and a saccade to a new word is always preceded by a shift of attention, which occurs once a certain amount of lexical processing on the current word has been completed. At the same time, word recognition can be initiated using parafoveal information. These assumptions account for both frequency effects and parafoveal preview benefits: short, frequent words are easily identified in parafoveal vision, so they tend to be fixated for less time and skipped more often (Morrison, 1984). However, Morrison’s (1984) theory fails to explain some reading phenomena, such as “spillover” effects of word frequency (e.g., Rayner & Duffy, 1986). E-Z Reader (Reichle et al., 2003) overcomes this limitation, as described under assumption 2 below.

The core assumptions of the E-Z Reader model of eye-movement control during reading are as follows:

1. Attention is allocated serially, to strictly one word at a time.

According to Reichle et al. (1998), the attention required for lexical processing (e.g., for binding together the features that make up a word) is allocated in a strictly serial manner, to exactly one word at a time. Attention is intrinsically linked to lexical processing: it must be focused on the word being processed for lexical information to be accessed. Identifying one word causes the focus of attention to shift so that the word-identification system can begin identifying the next word.

The benefit of allocating attention serially is that it provides a simple mechanism for encoding the order of the words being read (Pollatsek & Rayner, 1999). Readers need to keep word order intact because it conveys a large amount of syntactic information, and by shifting the focus of attention from one word to the next they identify and process each word in its correct order. On this view, a forward saccade to the next word can be considered the default.

2. Sequential shifts of attention are decoupled from saccade programming.

E-Z Reader posits that lexical processing is completed in two stages. The first stage, identification of the orthographic form of the word, is referred to as the “familiarity check” (Reichle et al., 1998), or L1 in E-Z Reader 7. The second stage involves identification of the word’s phonological and/or semantic information and is called “lexical access”, or L2.

The two stages of lexical processing play distinct roles. Completion of the first stage (L1) triggers the oculomotor system to begin programming the next saccade, while completion of the second stage (L2) causes attention to shift to the next word. In E-Z Reader, therefore, saccade programming is decoupled from the shifting of attention.

Decoupling eye movements from attention has two main advantages.

First, it explains the interaction between foveal processing difficulty and preview benefit: a word that is difficult to process takes longer to identify, so less time is left for parafoveal processing of the next word before the eyes move, and the preview benefit for that word is reduced (Henderson & Ferreira, 1990).

Second, it explains spillover effects: because the word following a difficult-to-process word receives less parafoveal processing, the fixation on that following word is longer (Rayner et al., 1989).
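
The timing logic behind this decoupling can be illustrated with a minimal Python sketch. It is not the published E-Z Reader implementation: the stage durations, the single fixed saccade-programming latency, and the names `WordTiming` and `fixation_events` are assumptions chosen purely for illustration. The sketch shows how a longer L2 stage leaves less time for attention to rest on word n+1 before the eyes move.

```python
from dataclasses import dataclass


@dataclass
class WordTiming:
    """Hypothetical stage durations (ms) for the fixated word n."""
    l1_ms: float  # familiarity check (L1)
    l2_ms: float  # lexical access (L2), assumed to start once L1 is complete


def fixation_events(word: WordTiming, saccade_program_ms: float = 125.0) -> dict:
    """Return event times in ms from fixation onset for one fixated word.

    Assumption: saccade programming takes one fixed latency. This is a
    simplification made for the sketch, not a claim about the model.
    """
    # Completing L1 starts the saccade program; the eyes move when it ends.
    saccade_executed = word.l1_ms + saccade_program_ms
    # Completing L2 shifts attention to word n+1.
    attention_shift = word.l1_ms + word.l2_ms
    # Time attention spends on word n+1 while the eyes are still on word n.
    preview_ms = max(0.0, saccade_executed - attention_shift)
    return {
        "saccade_executed_ms": saccade_executed,
        "attention_shift_ms": attention_shift,
        "preview_ms": preview_ms,
    }


# Illustrative values only: an easy word and a hard word.
easy = fixation_events(WordTiming(l1_ms=100.0, l2_ms=50.0))
hard = fixation_events(WordTiming(l1_ms=150.0, l2_ms=100.0))
print(easy["preview_ms"], hard["preview_ms"])
```

With these illustrative values the easy word leaves 75 ms of parafoveal preview and the hard word only 25 ms, which is the pattern behind both the reduced preview benefit and the spillover effect described above.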

SWIFT Model:

As a representative GAG model, SWIFT (Saccade-Generation with Inhibition by Foveal Targets) was developed as a theoretical alternative to SAS models, in particular E-Z Reader, because of their insufficiency in explaining a number of phenomena. GAG models propose spatially distributed processing according to an attentional gradient (Engbert et al., 2002) and relax the assumption that attention is confined to exactly one word at a time.

The key distinctive principles of the SWIFT model of eye-movement control during reading are as follows:

1. Spatially distributed processing of an activation field

SWIFT (Engbert et al., 2002) is motivated by the concept of interactions between local excitation and global inhibition in dynamic field theory (Erlhagen & Schöner, 2002). In dynamic field theory, spatially distributed activations determine the probabilities of target selection during movement planning, and the parallel build-up of activations enables parallel processing.

In SWIFT, therefore, the words of a sentence are treated as units of the activation field, and target selection results from competition among words with different activations (Engbert et al., 2005). The parallel build-up of activation enables distributed processing of several words at a time, so that all types of saccades derive from one common underlying principle.

According to Engbert & Kliegl (2011), one advantage of this architecture is that it can account for the similarity between eye movements in reading and other visuomotor behavior, such as z-string scanning (Nuthmann & Engbert, 2009).
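
Competition among words in an activation field can be sketched as follows. This is a deliberately simplified illustration, not SWIFT's published selection rule: it assumes that the probability of selecting a word is directly proportional to its share of the total activation, and the function names and activation values are hypothetical.

```python
import random


def selection_probabilities(activations: list[float]) -> list[float]:
    """Convert non-negative word activations into selection probabilities.

    Assumption for this sketch: probability is proportional to each
    word's share of the total activation.
    """
    total = sum(activations)
    if total <= 0:
        raise ValueError("at least one word must have positive activation")
    return [a / total for a in activations]


def select_target(activations: list[float], rng: random.Random) -> int:
    """Sample the index of the next saccade target word."""
    probs = selection_probabilities(activations)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Hypothetical activations for three neighbouring words in a sentence.
field = [1.0, 6.0, 3.0]
probs = selection_probabilities(field)
target = select_target(field, random.Random(0))
print(probs, target)
```

Because every word in the field has some chance of being selected, the same rule can yield a refixation, a forward saccade, a skip, or a regression, depending only on which position wins the competition.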

2. Separate pathways for saccade timing and saccade target selection

Based on findings about the oculomotor system (e.g., Wurtz, 1996) indicating that the temporal (when) and spatial (where) pathways of saccade programming are separate, SWIFT proposes that the temporal decision (when to start the next saccade) and the spatial decision (where to go next) are decoupled.

As discussed above, in the spatial pathway the activation field determines the probability that a word is selected as the next saccade target, according to the relative activations of the words. In the temporal pathway, a random timer, modulated by processing difficulty, determines when a new saccade program is initiated (Engbert & Kliegl, 2011).

More importantly, as Engbert & Kliegl (2011) explain, the temporal and spatial pathways are synchronized by foveal inhibition, which modulates fixation durations according to processing difficulty. Foveal inhibition thus explains why the additional fixation time spent on difficult words is obtained mainly through multiple fixations. In this way, SWIFT offers an account of refixations.

Building on this discussion, the following section evaluates the unique predictions derived from each model and argues that SWIFT is more consistent with existing data on reading behavior, based on an analysis of parafoveal-on-foveal effects and parafoveal preview benefits.
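
The idea of a random timer delayed by foveal inhibition can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the model's actual equations: the gamma-distributed base interval, the additive form of the inhibition, and all parameter values and names are hypothetical.

```python
import random


def saccade_interval_ms(
    foveal_activation: float,
    rng: random.Random,
    mean_ms: float = 200.0,
    shape: float = 9.0,
    inhibition_ms: float = 50.0,
) -> float:
    """Draw one timer interval and lengthen it by foveal inhibition.

    Assumptions for this sketch: the base interval is gamma distributed
    with mean `mean_ms`, and inhibition adds a delay proportional to the
    activation of the fixated word (a stand-in for its difficulty).
    """
    base = rng.gammavariate(shape, mean_ms / shape)  # mean = shape * scale
    return base + inhibition_ms * foveal_activation


# Same seed, so both calls draw the same base interval and differ only
# in the amount of foveal inhibition.
easy = saccade_interval_ms(0.2, random.Random(1))
hard = saccade_interval_ms(0.9, random.Random(1))
print(round(hard - easy, 6))
```

In this toy version the timer never consults the target-selection pathway: difficulty of the fixated word only postpones the next saccade program, while where that saccade lands is decided separately by the activation field.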

Predictions of models

Parafoveal information can influence reading in different ways, and upcoming words can guide saccade programming. The processing of parafoveal information influences reading behavior in two ways: parafoveal-on-foveal effects and parafoveal preview benefit (PB) effects.

E-Z Reader and SWIFT make distinct predictions about parafoveal processing, and testing those predictions is essential for evaluating which model is more consistent with reading phenomena. The following paragraphs focus on the issue of seriality versus parallelism in the two models, from the perspective of parafoveal-on-foveal effects and preview benefit effects in reading.

1. Parafoveal-on-foveal effects

A parafoveal-on-foveal effect occurs when the characteristics of word n+1 influence the fixation duration on word n. The two models make different predictions on this issue. According to E-Z Reader, information from a parafoveal word becomes available only after attention has shifted away from the foveal word, so there should be no lexical parafoveal-on-foveal effects (Rayner, 2009).

In contrast, based on the principle of distributed attention allocation, SWIFT predicts that such effects exist. For instance, Kliegl et al. (2006) reported a large data set demonstrating the influence of past, present, and future words on the current fixation duration, supporting the notion that more than one word is processed in parallel.

The first counterargument is that such parafoveal-on-foveal effects can be explained by mislocated fixations (Drieghe, Rayner, & Pollatsek, 2007) arising from calibration and saccadic errors. This argument draws on the explanation that the inverted optimal viewing position effect might be due to saccadic errors (Nuthmann et al., 2005). However, by restricting the analysis to fixations in which both eyes were on the same word, Kliegl et al. ruled out saccadic errors as the source of the effects, and the observed parafoveal-on-foveal effects cannot plausibly all be explained by calibration errors.

In addition, Risse and Kliegl (2012) conducted two display-change experiments in which the preview and target difficulty of word n+2 (with word n defined as the fixated word) were manipulated to examine the role of mislocated fixations on word n+1. Their results show that fixations on the short word n+1 (which are more likely to be mislocated) were not influenced by the difficulty of word n+2 (the hypothesized target of the mislocated fixation); instead, word n+1 was influenced by the preview difficulty of the next word, indicating a delayed parafoveal-on-foveal effect.

The second objection is that parafoveal-on-foveal effects might be due to orthographic factors, for example low-frequency words that contain irregular letter sequences (Drieghe, Rayner, & Pollatsek, 2007). Nevertheless, there is growing evidence for the existence of lexical parafoveal-on-foveal effects. Studies (e.g., Angele, Tran, & Rayner, 2013; Dare & Shillcock, 2013) have reported such effects using non-word stimuli at word n+1, created either by transposing two letters of the target word (e.g., cheap–cehap) or by changing a single letter of the target word (e.g., news–niws). Moreover, there is growing interest in investigating parafoveal processing with electrophysiological measures, such as event-related potentials (ERPs) and fixation-related potentials (FRPs). For example, Niefind and Dimigen (2016) combined eye-movement recording with EEG and replicated the delayed parafoveal-on-foveal effects in gaze durations.

In conclusion, the two models make opposing predictions about the existence of parafoveal-on-foveal effects. The experimental evidence supports a delayed parafoveal-on-foveal effect, indicating that in this regard parallel lexical processing is more consistent with reading behavior.

2. Parafoveal preview benefits (PB effects)

It is generally acknowledged that readers obtain an advantage from previewing word n+1 in parafoveal vision. Preview benefit effects from word n+2, however, have attracted increasing interest because they allow the competing predictions of E-Z Reader and SWIFT to be tested.

Based on its serial processing structure, E-Z Reader generally assumes that readers should obtain PB effects only from word n+1, not from word n+2 (Rayner, 2009). SWIFT, in contrast, holds that letter information is acquired concurrently across the perceptual span, so preview benefit effects should be observed from both word n+1 and word n+2.

Experimental evidence on this issue is mixed. Rayner et al. (2007) found no preview benefit for word n+2, and a number of studies (e.g., Angele, Slattery, Yang, Kliegl, & Rayner, 2008; McDonald, 2006) are consistent with that result. Nevertheless, using the gaze-contingent boundary paradigm (Rayner, 1975), Kliegl et al. (2007) demonstrated that preview benefit can be obtained from word n+2 before the boundary is crossed, provided that word n+1 is short (three to four letters) and/or of high frequency. Furthermore, Radach et al. (2013) used short, high-frequency words at n+1 together with a contextual manipulation that made word n+2 predictable, and obtained reliable word n+2 PB effects.

This result is also in line with cross-language studies. For example, Yan et al. (2010) studied Chinese reading using a boundary manipulation of word n+2 preview with low- and high-frequency words at n+1, and their results indicate that PB effects for word n+2 are observed when word n+1 is of high frequency.

To sum up, the two models disagree on the existence of preview benefit effects from word n+2, which bears directly on the key issue of sequential versus parallel word processing. The existing data support the predictions of SWIFT, and PB effects from word n+2 strengthen the case for parallel processing.

Conclusion

Computational models have become an increasingly important tool for understanding eye-movement control in reading. As the two leading models of eye-movement control in reading, E-Z Reader and SWIFT make distinct assumptions and predictions, which have sparked intensive discussion in cognitive psychology; no consensus has been reached so far.

This essay has summarized the key assumptions of the two models. E-Z Reader, as the most advanced sequential attention shift (SAS) model, assumes that attention is focused on one word at a time, in a strictly serial manner, with lexical processing occurring only on the attended word (Reichle, 2006). Its two fundamental assumptions are that 1) attention is allocated serially, to strictly one word at a time, and 2) sequential shifts of attention are decoupled from saccade programming.

In contrast, SWIFT, as the most sophisticated guidance by attentional gradient (GAG) model, argues that attention is distributed and that more than one word can be processed at a time during reading. Its key distinctive principles are that 1) spatially distributed lexical processing allows all types of saccades to be generated by one common principle, and 2) saccade timing and saccade target selection follow separate pathways, with the target determined by the relative activations of words and a random timer deciding when the eyes move.

Concerning serial versus parallel processing, the unique predictions of the two models have also been evaluated against experimental evidence on parafoveal-on-foveal effects and parafoveal preview benefits. According to E-Z Reader, information from a parafoveal word becomes available only after attention has shifted away from the foveal word, so there should be no lexical parafoveal-on-foveal effects (Rayner, 2009); on this view, observed parafoveal-on-foveal effects are attributed to mislocated fixations (Drieghe, Rayner, & Pollatsek, 2007) or to orthographic factors. Conversely, based on the principle of distributed attention allocation, SWIFT predicts a delayed parafoveal-on-foveal effect. The existing data appear to rule out mislocated fixations or orthographic factors as the sole explanation.

On the issue of PB effects, E-Z Reader similarly assumes that readers should obtain preview benefit only from word n+1, not from word n+2 (Rayner, 2009), whereas SWIFT predicts PB from both word n+1 and word n+2. The experimental evidence on this issue is more mixed. However, a growing body of data, including cross-language studies, suggests that PB effects for word n+2 can be observed when word n+1 is of high frequency.

In conclusion, this essay has summarized the core assumptions of the E-Z Reader and SWIFT models. Based on the discussion of seriality and parallelism in the two models, and on the analysis of parafoveal-on-foveal and preview benefit effects in reading research, the evidence reviewed here indicates that SWIFT is more consistent with existing experimental data and better predicts reading behavior.
