How a picture can foster comprehension of text: Evidence for scaffolding

In studies on learning with text and pictures, learners are often required to construct mental models of causal systems, where extracting the system’s spatial structure precedes understanding its functions (Hegarty & Just, 1993). Constructing a mental model of a causal system’s spatial structure from text requires interpreting the text, which leaves room for erroneous inferences about that structure (Schnotz & Bannert, 2003). In contrast, a picture of a causal system is analogous to the required structure of the mental model, and thus adding a picture to text can facilitate mental model construction (Hegarty & Just, 1993). Further, perception of pictures proceeds from global to local (Navon, 1979), meaning that information on a global level (i.e., gist) is extracted at first glance, whereas information on a local level (i.e., visual details) is not. Since the gist of a picture is assumed to represent the picture’s global spatial information (Oliva & Torralba, 2006), we assume that the global spatial information extracted from briefly inspecting a picture of a causal system may act as a mental scaffold that facilitates constructing a mental model of the system’s spatial structure from text and, in turn, comprehension (i.e., the scaffolding assumption).

The scaffolding assumption was tested in two experiments using a pulley system as the to-be-learnt content. In Experiment 1, 85 participants had to draw a pulley system either without having seen a picture of it, after inspecting the pulley-system picture for 600 ms or 2 s, or after self-paced picture inspection. Results revealed that spatial information on a global level (i.e., diagonal pulley orientation), but not on a local level (i.e., pictorial relations), was extracted from inspecting the pulley-system picture for 600 ms and 2 s. In Experiment 2, 84 participants learned about the structure and function of pulley systems either from text alone or from text preceded by presentation of a picture (the same as in Experiment 1) for 600 ms, 2 s, or a self-paced duration. The text did not contain information about the diagonal orientation of the pulleys in the system (i.e., global spatial information); this information could be extracted only from (brief) initial picture inspection. The text was presented auditorily while participants looked at a blank screen (cf. blank screen paradigm; Altmann, 2004). Participants’ eye movements on the blank screen while listening to the text and their comprehension of the pulley system’s functioning were assessed. Eye movements were analyzed by means of the standardized relative frequency of saccades that were in line with the global spatial information (i.e., the diagonal pulley orientation) that could be extracted only from the picture.
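
The eye-movement measure described above can be illustrated with a minimal sketch. It is not the authors' analysis pipeline: the function names, the 45-degree target angle, the 22.5-degree tolerance, and the use of z-scores across participants as the standardization step are all assumptions made for illustration only.

```python
import math
from statistics import mean, pstdev

# Hypothetical sketch: all names and parameter values are illustrative
# assumptions, not taken from the original study.

def saccade_angle(start, end):
    """Orientation of a saccade in degrees, folded into [0, 180).

    Folding treats a saccade and its reverse as the same orientation.
    The sign of the y axis depends on the eye tracker's coordinate system.
    """
    dx = end[0] - start[0]
    dy = end[1] - start[1]
    return math.degrees(math.atan2(dy, dx)) % 180.0

def is_aligned(angle, target_angle, tolerance):
    """True if the orientation lies within `tolerance` degrees of the target."""
    diff = abs(angle - target_angle) % 180.0
    return min(diff, 180.0 - diff) <= tolerance

def relative_frequency(saccades, target_angle=45.0, tolerance=22.5):
    """Share of saccades oriented along the assumed diagonal."""
    if not saccades:
        return 0.0
    hits = sum(
        is_aligned(saccade_angle(start, end), target_angle, tolerance)
        for start, end in saccades
    )
    return hits / len(saccades)

def standardize(values):
    """z-scores across participants (one assumed form of standardization)."""
    m = mean(values)
    sd = pstdev(values)
    if sd == 0:
        return [0.0 for _ in values]
    return [(v - m) / sd for v in values]
```

Under these assumptions, each participant's list of saccades (pairs of start and end coordinates) would be passed to `relative_frequency`, and the resulting per-participant values to `standardize` before comparing conditions.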

Results revealed that in the conditions with initial picture inspection (for 600 ms, 2 s, and self-paced), participants made more eye movements in line with the picture’s global spatial orientation and showed better comprehension than in the text-only condition. Results from both experiments thus suggest that mental model construction from text was positively influenced by global spatial information extracted from brief initial picture inspection, which supports the scaffolding assumption.