Human Perception of Objects: Early Visual Processing of Spatial Form Defined by Luminance, Color, Texture, Motion, and Binocular Disparity

Pelli, Denis G.

Optometry and Vision Science: November 2001 - Volume 78 - Issue 11 - p 779

Psychology and Neural Science, New York University, New York, New York

Human Perception of Objects: Early Visual Processing of Spatial Form Defined by Luminance, Color, Texture, Motion, and Binocular Disparity. David Regan. Sunderland, MA: Sinauer Associates, Inc, 2000. Pages: 577. Price: $39.95. ISBN 0-87893-753-6.

The main title, Human Perception of Objects, and the camouflaged lion peering out from the cover both give the misleading impression that the topic of this book is object recognition. In fact, the topic is given by the subtitle, “Early visual processing of spatial form defined by luminance, color, texture, motion, and binocular disparity.” Instead of recognition, which is a hot but poorly understood topic, Regan talks about detection, which is quite well understood. The book is a serious graduate student-level tutorial of what we know about visual detection of spatial forms. He follows the field in breaking this down into five kinds of cues—luminance, color, texture, motion, and disparity—devoting a long chapter to each. This amounts to a survey of most of what has happened in visual psychophysics in the past 40 years.

The book is well organized: an introductory chapter explaining the approach, one chapter each on the five cues, and a concluding chapter trying to put it all together. It’s a serious book, with a good index, some 1,500 (full) references, and 10 appendices (125 pages) that teach the basics of Fourier analysis, optics, and photometry. “In these appendixes, I have done my utmost to explain the relevant topics in physics and mathematics in such a way as to place them (without misleading oversimplification) within the intuitive grasp of a student with little experience of physics or mathematics.”

The book’s voice is quite personal. Regan seems to be sitting across the table from us, sipping coffee, patiently explaining it all to us, the reader, whom he takes to be a promising student, making up in enthusiasm what perhaps we lack in training. Regan is a charming writer, self-deprecating, and anxious to share his hard-won insights into both how vision works and how to do vision research.

Many will find this book useful as a reference. The clear organization and index make it easy to find topics. But, trying to put myself in the shoes of the intended audience—a new graduate student—I think I’d find it hard to just sit down and read it. Despite the excellent organization and the many practical examples that relate visual thresholds to interesting tasks, like cricket, I find the development fairly unmotivated. It doesn’t excite me. I am, in fact, quite enthusiastic about vision research, but many of the reasons that excite me aren’t presented in this text. Regan says, “My interest in sensory perception was implanted through an accidental exposure during the susceptible midteens to Immanuel Kant’s Critique of Pure Reason. Although the immediate effect was a motivation to study physics, after many years I moved on to human brain electrophysiology and sensory psychophysics as offering more direct approaches to the study of the perceived physical world in which we live.”

Let me touch on two examples that illustrate this point. The Introduction (commendably) specifies the general method: “Psychophysical models are based on a comparison of the information presented to an observer’s eye and the observer’s response to that information.”

Unfortunately, this statement treats the observer as a passive stimulus-response box (like one of Sherrington’s decerebrate dogs), forgetting that human observers must be assigned a task. That is the difference between supposing that vision is a series of autonomous modules that passively respond to stimuli (which seems to explain a lot of visual detection) and supposing that a nontrivial perceptual integration and decision process mediates the observer’s response (which is the emphasis in object recognition). Omitting the word “task” leaves out this vital connection.

“Threshold” is one of the most useful ideas in visual psychophysics. Instead of measuring variations of response (e.g., frequency of seeing) in scales that we don’t know how to interpret, we pick a criterion level of response and find the levels of physical intensity (or other physical parameter) that yield that fixed response. The consequence is that our graphs have physically meaningful scales, allowing us to draw many useful inferences. The explanation of threshold in this book fails to sing its praises. Instead, the reader is told about psychometric functions, fixation marks, two-interval forced choice, etc. From this, I wouldn’t expect the reader to realize that threshold is one of the gems of our field.
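
To make the idea of threshold concrete, here is a minimal sketch in Python. It is not taken from Regan’s book; the function name, the contrast values, the frequencies of seeing, and the 50% criterion are all hypothetical, and the linear interpolation in log intensity is just one simple assumption about how to read a threshold off a measured psychometric function.

```python
import math


def threshold_at_criterion(intensities, responses, criterion):
    """Estimate the intensity at which the response reaches the criterion.

    Interpolates linearly in log10(intensity) between the two measured
    points that bracket the criterion. This is an illustrative assumption,
    not a method prescribed by the book under review.
    """
    points = sorted(zip(intensities, responses))
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if y0 <= criterion <= y1 and y1 > y0:
            fraction = (criterion - y0) / (y1 - y0)
            log_x = math.log10(x0) + fraction * (math.log10(x1) - math.log10(x0))
            return 10 ** log_x
    raise ValueError("criterion is not bracketed by the measured responses")


# Hypothetical data: stimulus contrast and frequency of seeing.
contrasts = [0.01, 0.02, 0.04, 0.08]
proportion_seen = [0.10, 0.30, 0.70, 0.95]

# The threshold is a contrast, a physical quantity, not a response level.
threshold = threshold_at_criterion(contrasts, proportion_seen, 0.50)
print(f"Threshold contrast at 50% seen: {threshold:.4f}")
```

The point of the sketch is the change of axes: the response is held fixed at the criterion, and what gets reported and plotted is the physical intensity needed to reach it.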

In April of this year, Regan received the Proctor Medal from the Association for Research in Vision and Ophthalmology for his broad range of contributions to vision, spanning basic and clinical, electrophysiology and psychophysics, and all five kinds of cue (luminance, color, etc.). He’s first author on more than 100 of the papers cited in his book. Regan knows what he’s talking about.




© 2001 American Academy of Optometry