Eye tracking is the process of measuring either the point of gaze (where one is looking) or the motion of an eye relative to the head. An eye tracker is a device for measuring eye positions and eye movement. Eye trackers are used in research on the visual system, in psychology, in psycholinguistics, marketing, as an input device for human-computer interaction, and in product design. There are a number of methods for measuring eye movement. The most popular variant uses video images from which the eye position is extracted. Other methods use search coils or are based on the electrooculogram.
In the 1800s, studies of eye movement were made using direct observations.
In 1879 in Paris, Louis Émile Javal observed that reading does not involve a smooth sweeping of the eyes along the text, as previously assumed, but a series of short stops (called fixations) and quick saccades. This observation raised important questions about reading, questions which were explored during the 1900s: On which words do the eyes stop? For how long? When do they regress to already seen words?
Edmund Huey built an early eye tracker, using a sort of contact lens with a hole for the pupil. The lens was connected to an aluminum pointer that moved in response to the movement of the eye. Huey studied and quantified regressions (only a small proportion of saccades are regressions), and he showed that some words in a sentence are not fixated.
The first non-intrusive eye-trackers were built by Guy Thomas Buswell in Chicago, using beams of light that were reflected on the eye and then recording them on film. Buswell made systematic studies into reading and picture viewing.
In the 1950s, Alfred L. Yarbus did important eye tracking research and his 1967 book is often quoted. He showed that the task given to a subject has a very large influence on the subject's eye movement. He also wrote about the relation between fixations and interest:
In 1980, Just and Carpenter formulated the influential Strong eye-mind hypothesis, that "there is no appreciable lag between what is fixated and what is processed". If this hypothesis is correct, then when a subject looks at a word or object, he or she also thinks about it (process cognitively), and for exactly as long as the recorded fixation. The hypothesis is often taken for granted by researchers using eye-tracking. However, gaze-contingent techniques offer an interesting option in order to disentangle overt and covert attentions, to differentiate what is fixated and what is processed.
During the 1980s, the eye-mind hypothesis was often questioned in light of covert attention, the attention to something that one is not looking at, which people often do. If covert attention is common during eye-tracking recordings, the resulting scan-path and fixation patterns would often show not where our attention has been, but only where the eye has been looking, failing to indicate cognitive processing.
The 1980s also saw the birth of using eye-tracking to answer questions related to human-computer interaction. Specifically, researchers investigated how users search for commands in computer menus. Additionally, computers allowed researchers to use eye-tracking results in real time, primarily to help disabled users.
More recently, there has been growth in using eye tracking to study how users interact with different computer interfaces. Specific questions researchers ask are related to how easy different interfaces are for users. The results of the eye tracking research can lead to changes in design of the interface. Yet another recent area of research focuses on Web development. This can include how users react to drop-down menus or where they focus their attention on a website so the developer knows where to place an advertisement.
According to Hoffman, current consensus is that visual attention is always slightly (100 to 250 ms) ahead of the eye. But as soon as attention moves to a new position, the eyes will want to follow.
We still cannot infer specific cognitive processes directly from a fixation on a particular object in a scene. For instance, a fixation on a face in a picture may indicate recognition, liking, dislike, puzzlement etc. Therefore, eye tracking is often coupled with other methodologies, such as introspective verbal protocols.
Eye-trackers measure rotations of the eye in one of several ways, but principally they fall into three categories: (i) measurement of the movement of an object (normally, a special contact lens) attached to the eye; (ii) optical tracking without direct contact to the eye; and (iii) measurement of electric potentials using electrodes placed around the eyes.
The first type uses an attachment to the eye, such as a special contact lens with an embedded mirror or magnetic field sensor, and the movement of the attachment is measured with the assumption that it does not slip significantly as the eye rotates. Measurements with tight-fitting contact lenses have provided extremely sensitive recordings of eye movement, and magnetic search coils are the method of choice for researchers studying the dynamics and underlying physiology of eye movement. This method allows the measurement of eye movement in horizontal, vertical and torsion directions.
The second broad category uses some non-contact, optical method for measuring eye motion. Light, typically infrared, is reflected from the eye and sensed by a video camera or some other specially designed optical sensor. The information is then analyzed to extract eye rotation from changes in reflections. Video-based eye trackers typically use the corneal reflection (the first Purkinje image) and the center of the pupil as features to track over time. A more sensitive type of eye-tracker, the dual-Purkinje eye tracker, uses reflections from the front of the cornea (first Purkinje image) and the back of the lens (fourth Purkinje image) as features to track. A still more sensitive method of tracking is to image features from inside the eye, such as the retinal blood vessels, and follow these features as the eye rotates. Optical methods, particularly those based on video recording, are widely used for gaze-tracking and are favored for being non-invasive and inexpensive.
The third category uses electric potentials measured with electrodes placed around the eyes. The eyes are the origin of a steady electric potential field which can also be detected in total darkness and if the eyes are closed. It can be modelled to be generated by a dipole with its positive pole at the cornea and its negative pole at the retina. The electric signal that can be derived using two pairs of contact electrodes placed on the skin around one eye is called Electrooculogram (EOG). If the eyes move from the centre position towards the periphery, the retina approaches one electrode while the cornea approaches the opposing one. This change in the orientation of the dipole and consequently the electric potential field results in a change in the measured EOG signal. Inversely, by analysing these changes in eye movement can be tracked. Due to the discretisation given by the common electrode setup, two separate movement components - a horizontal and a vertical - can be identified. A third EOG component is the radial EOG channel, which is the average of the EOG channels referenced to some posterior scalp electrode. This radial EOG channel is sensitive to the saccadic spike potentials stemming from the extra-ocular muscles at the onset of saccades, and allows reliable detection of even miniature saccades.
Due to potential drifts and variable relations between the EOG signal amplitudes and the saccade sizes, it is challenging to use EOG for measuring slow eye movement and detecting gaze direction. EOG is, however, a very robust technique for measuring saccadic eye movement associated with gaze shifts and detecting blinks. Contrary to video-based eye-trackers, EOG allows recording of eye movements even with eyes closed, and can thus be used in sleep research. It is a very light-weight approach that, in contrast to current video-based eye-trackers, requires only very low computational power; works under different lighting conditions; and can be implemented as an embedded, self-contained wearable system. It is thus the method of choice for measuring eye movement in mobile daily-life situations and REM phases during sleep. The major disadvantage of EOG is its relatively poor gaze-direction accuracy compared to a video tracker. That is, it is difficult to determine with good accuracy exactly where a subject is looking, though the time of eye movements can be determined.
The most widely used current designs are video-based eye-trackers. A camera focuses on one or both eyes and records eye movement as the viewer looks at some kind of stimulus. Most modern eye-trackers use the center of the pupil and infrared / near-infrared non-collimated light to create corneal reflections (CR). The vector between the pupil center and the corneal reflections can be used to compute the point of regard on surface or the gaze direction. A simple calibration procedure of the individual is usually needed before using the eye tracker.
Two general types of infrared / near-infrared (also known as active light) eye-tracking techniques are used: bright-pupil and dark-pupil. Their difference is based on the location of the illumination source with respect to the optics. If the illumination is coaxial with the optical path, then the eye acts as a retroreflector as the light reflects off the retina creating a bright pupil effect similar to red eye. If the illumination source is offset from the optical path, then the pupil appears dark because the retroreflection from the retina is directed away from the camera.
Bright-pupil tracking creates greater iris/pupil contrast, allowing more robust eye-tracking with all iris pigmentation, and greatly reduces interference caused by eyelashes and other obscuring features. It also allows tracking in lighting conditions ranging from total darkness to very bright. Bright-pupil techniques are however not effective for tracking outdoors, as extraneous IR sources interfere with monitoring.
Another, less used, method is known as passive light. It uses visible light to illuminate, something which may cause some distractions to users. Another challenge with this method is that the contrast of the pupil is less than in the active light methods, therefore, the center of iris is used for calculating the vector instead. This calculation needs to detect the boundary of the iris and the white sclera (limbus tracking). It presents another challenge for vertical eye movements due to obstruction of eyelids.
Eye-tracking setups vary greatly: some are head-mounted, some require the head to be stable (for example, with a chin rest), and some function remotely and automatically track the head during motion. Most use a sampling rate of at least 30 Hz. Although 50/60 Hz is more common, today many video-based eye trackers run at 240, 350 or even 1000/1250 Hz, speeds needed in order to capture fixational eye movements or correctly measure saccade dynamics.
Eye movements are typically divided into fixations and saccades - when the eye gaze pauses in a certain position, and when it moves to another position, respectively. The resulting series of fixations and saccades is called a scanpath. Smooth pursuit describes the eye following a moving object. Fixational eye movements include micro saccades: small, involuntary saccades that occur during attempted fixation. Most information from the eye is made available during a fixation or smooth pursuit, but not during a saccade. The central one or two degrees of the visual angle (that area of the visual field which falls on the fovea) provide the bulk of visual information; the input from larger eccentricities (the periphery) has less resolution and little to no colour, although contrast and movement is detected better in peripheral vision. Hence, the locations of fixations or smooth pursuit along a scanpath show what information loci on the stimulus were processed during an eye-tracking session. On average, fixations last for around 200 ms during the reading of linguistic text, and 350 ms during the viewing of a scene. Preparing a saccade towards a new goal takes around 200 ms.
Scanpaths are useful for analyzing cognitive intent, interest, and salience. Other biological factors (some as simple as gender) may affect the scanpath as well. Eye tracking in human-computer interaction (HCI) typically investigates the scanpath for usability purposes, or as a method of input in gaze-contingent displays, also known as gaze-based interfaces.
Interpretation of the data that is recorded by the various types of eye-trackers employs a variety of software that animates or visually represents it, so that the visual behavior of one or more users can be graphically resumed. The video is generally manually coded to identify the AOIs(Area Of Interests) or recently using artificial intelligence. Graphical presentation is rarely the basis of research results, since they are limited in terms of what can be analysed - research relying on eye-tracking, for example, usually requires quantitative measures of the eye movement events and their parameters, The following visualisations are the most commonly used:
Animated representations of a point on the interface This method is used when the visual behavior is examined individually indicating where the user focused their gaze in each moment, complemented with a small path that indicates the previous saccade movements, as seen in the image.
Static representations of the saccade path This is fairly similar to the one described above, with the difference that this is static method. A higher level of expertise than with the animated ones is required to interpret this.
Heat maps An alternative static representation, used mainly for the agglomerated analysis of the visual exploration patterns in a group of users, differing from both methods explained before. In these representations, the 'hot' zones or zones with higher density designate where the users focused their gaze (not their attention) with a higher frequency. Heat maps are the best known visualization technique for eyetracking studies.
Blind zones maps, or focus maps This method is a simplified version of the Heat maps where the visually less attended zones by the users are displayed clearly, thus allowing for an easier understanding of the most relevant information, that is to say, we are informed about which zones were not seen by the users.
Eye-trackers necessarily measure the rotation of the eye with respect to some frame of reference. This is usually tied to the measuring system. Thus, if the measuring system is head-mounted, as with EOG or a video-based system mounted to a helmet, then eye-in-head angles are measured. To deduce the line of sight in world coordinates, the head must be kept in a constant position or its movements must be tracked as well. In these cases, head direction is added to eye-in-head direction to determine gaze direction.
If the measuring system is table-mounted, as with scleral search coils or table-mounted camera ("remote") systems, then gaze angles are measured directly in world coordinates. Typically, in these situations head movements are prohibited. For example, the head position is fixed using a bite bar or a forehead support. Then a head-centered reference frame is identical to a world-centered reference frame. Or colloquially, the eye-in-head position directly determines the gaze direction.
Some results are available on human eye movements under natural conditions where head movements are allowed as well. The relative position of eye and head, even with constant gaze direction, influences neuronal activity in higher visual areas.
A great deal of research has gone into studies of the mechanisms and dynamics of eye rotation, but the goal of eye- tracking is most often to estimate gaze direction. Users may be interested in what features of an image draw the eye, for example. It is important to realize that the eye-tracker does not provide absolute gaze direction, but rather can measure only changes in gaze direction. In order to know precisely what a subject is looking at, some calibration procedure is required in which the subject looks at a point or series of points, while the eye tracker records the value that corresponds to each gaze position. (Even those techniques that track features of the retina cannot provide exact gaze direction because there is no specific anatomical feature that marks the exact point where the visual axis meets the retina, if indeed there is such a single, stable point.) An accurate and reliable calibration is essential for obtaining valid and repeatable eye movement data, and this can be a significant challenge for non-verbal subjects or those who have unstable gaze.
Each method of eye-tracking has advantages and disadvantages, and the choice of an eye-tracking system depends on considerations of cost and application. There are offline methods and online procedures like AttentionTracking. There is a trade-off between cost and sensitivity, with the most sensitive systems costing many tens of thousands of dollars and requiring considerable expertise to operate properly. Advances in computer and video technology have led to the development of relatively low-cost systems that are useful for many applications and fairly easy to use. Interpretation of the results still requires some level of expertise, however, because a misaligned or poorly calibrated system can produce wildly erroneous data.
The eye movement of two groups of drivers have been filmed with a special head camera by a team of the Swiss Federal Institute of Technology: Novice and experienced drivers had their eye-movement recorded while approaching a bend of a narrow road. The series of images has been condensed from the original film frames to show 2 eye fixations per image for better comprehension.
Each of these stills corresponds to approximately 0.5 seconds in realtime.
The series of images shows an example of eye fixations #9 to #14 of a typical novice and an experienced driver.
Comparison of the top images shows that the experienced driver checks the curve and even has Fixation No. 9 left to look aside while the novice driver needs to check the road and estimate his distance to the parked car.
In the middle images, the experienced driver is now fully concentrating on the location where an oncoming car could be seen. The novice driver concentrates his view on the parked car.
In the bottom image the novice is busy estimating the distance between the left wall and the parked car, while the experienced driver can use his peripheral vision for that and still concentrate his view on the dangerous point of the curve: If a car appears there, he has to give way, i. e. stop to the right instead of passing the parked car.
While walking, elderly subjects depend more on foveal vision than do younger subjects. Their walking speed is decreased by a limited visual field, probably caused by a deteriorated peripheral vision.
Younger subjects make use of both their central and peripheral vision while walking. Their peripheral vision allows faster control over the process of walking.
A wide variety of disciplines use eye-tracking techniques, including cognitive science; psychology (notably psycholinguistics; the visual world paradigm); human-computer interaction (HCI); marketing research and medical research (neurological diagnosis). Specific applications include the tracking eye movement in language reading, music reading, human activity recognition, the perception of advertising, and the playing of sports.
In recent years, the increased sophistication and accessibility of eye-tracking technologies have generated a great deal of interest in the commercial sector. Applications include web usability, advertising, sponsorship, package design and automotive engineering. In general, commercial eye-tracking studies function by presenting a target stimulus to a sample of consumers while an eye tracker is used to record the activity of the eye. Examples of target stimuli may include websites; television programs; sporting events; films and commercials; magazines and newspapers; packages; shelf displays; consumer systems (ATMs, checkout systems, kiosks); and software. The resulting data can be statistically analyzed and graphically rendered to provide evidence of specific visual patterns. By examining fixations, saccades, pupil dilation, blinks and a variety of other behaviors, researchers can determine a great deal about the effectiveness of a given medium or product. While some companies complete this type of research internally, there are many private companies that offer eye-tracking services and analysis.
One of the most prominent fields of commercial eye-tracking research is web usability. While traditional usability techniques are often quite powerful in providing information on clicking and scrolling patterns, eye-tracking offers the ability to analyze user interaction between the clicks and how much time a user spends between clicks, thereby providing valuable insight into which features are the most eye-catching, which features cause confusion and which are ignored altogether. Specifically, eye-tracking can be used to assess search efficiency, branding, online advertisements, navigation usability, overall design and many other site components. Analyses may target a prototype or competitor site in addition to the main client site.
Eye-tracking is commonly used in a variety of different advertising media. Commercials, print ads, online ads and sponsored programs are all conducive to analysis with current eye-tracking technology. For instance in newspapers, eye-tracking studies can be used to find out in what way advertisements should be mixed with the news in order to catch the reader's eyes. Analyses focus on visibility of a target product or logo in the context of a magazine, newspaper, website, or televised event. One example is an analysis of eye movements over advertisements in the Yellow Pages. The study focused on what particular features caused people to notice an ad, whether they viewed ads in a particular order and how viewing times varied. The study revealed that ad size, graphics, color, and copy all influence attention to advertisements. Knowing this allows researchers to assess in great detail how often a sample of consumers fixates on the target logo, product or ad. As a result, an advertiser can quantify the success of a given campaign in terms of actual visual attention. Another example of this is a study that found that in a search engine results page, authorship snippets received more attention than the paid ads or even the first organic result.