UC Berkeley News
Berkeleyan


The findings from a new study by Berkeley vision scientists may someday help researchers develop artificial-intelligence programs that can accurately sort out suspicious activity from mundane actions - such as a bank teller photographed handing over money to an armed thief rather than cashing a customer's paycheck.
 

The eye may be a camera . . .
. . . but a lot happens during image processing. The same "best guess" neurology that enables us to identify a figure's gender and emotional state makes us subject to optical illusions.

13 September 2006

A new study by vision scientists at Berkeley finds that the human visual system is better able to discriminate the movements of a single person when his or her actions are coordinated in a meaningful way with a second individual.

When shown a pair of figures in motion, for instance, the brain is better able to pick out an individual if it perceives that one person is throwing a punch while another is making a defensive block, the researchers say. This is especially important when the view is somehow obscured or impaired.

"Our study reveals a greater degree of complexity in human visual processing than previously understood," says Dennis Levi, dean of optometry and principal investigator of the study. "When we watch two people interacting, knowing what one is doing helps us understand the actions of the other. We think of it as, 'It takes two to tango.'"

This research provides insight into how accurately we can interpret what we see from grainy security cameras, particularly when identifying whether a crime is taking place. Research is even under way on artificial-intelligence computer programs that can automatically detect which actions are suspicious.

The human brain's ability to interpret and react to the action it sees within a fraction of a second can be a matter of life and death. But this skill - when not impaired by such disorders as autism - is also essential to our success as a social species, allowing us to determine whether someone is happy, sad, or nervous based upon visual cues both subtle and obvious.

"Our study is the first to provide quantitative evidence that this ability to interpret action enhances the processing of visual discrimination," says Peter Neri, a postdoctoral fellow in Levi's laboratory and lead author of the new study.

To build their dataset of human actions, the researchers outfitted members of the UC Martial Arts Program and the UC Ballroom Dance Team with sports suits that each contained 13 lights marking key body parts, such as the head, hands, knees, and feet. The "actors" were then asked to fight or dance in a dark room while the researchers filmed the movements.

Using motion-capture technology, the lights in the video were converted into dots that the researchers could manipulate. The original samples of each action - fighting or dancing - lasted 22 seconds each. In the synchronized video files, figures' actions corresponded naturally with each other.

The researchers also created desynchronized files in which the movements of one figure filmed in the first 11 seconds were paired with those for the other figure in the second 11 seconds. In these files, the movements of the two figures were no longer coordinated with each other.
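The pairing described above can be sketched in a few lines of Python. This is only an illustration, not the researchers' code: the frame rate, the data layout (a list of frames, each a list of 13 `(x, y)` dot positions), and the names `desynchronize` and `random_figure` are all assumptions, and the random data stands in for real motion-capture recordings.

```python
import random

N_DOTS = 13  # lights per figure, as reported in the article
FPS = 10     # assumed frame rate; the article does not give one


def desynchronize(fig_a, fig_b):
    """Pair the first half of fig_a with the second half of fig_b.

    Each figure is a list of frames; each frame is a list of (x, y) dots.
    The two returned clips have equal length but come from different
    moments of the recording, so their actions no longer correspond.
    """
    if len(fig_a) != len(fig_b):
        raise ValueError("both figures must have the same number of frames")
    half = len(fig_a) // 2
    return fig_a[:half], fig_b[half:2 * half]


def random_figure(n_frames, rng):
    """Placeholder data standing in for a real motion-capture recording."""
    return [[(rng.random(), rng.random()) for _ in range(N_DOTS)]
            for _ in range(n_frames)]


rng = random.Random(0)
fig_a = random_figure(22 * FPS, rng)  # 22-second sample, figure A
fig_b = random_figure(22 * FPS, rng)  # 22-second sample, figure B
a_clip, b_clip = desynchronize(fig_a, fig_b)
```

Here `a_clip` holds figure A's first 11 seconds and `b_clip` holds figure B's last 11 seconds, so playing them together shows two figures whose movements are no longer coordinated.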

Four Berkeley undergraduate students with normal vision who had been recruited for the study were then shown 1.5- to 3-second sequences randomly selected from either the synchronized or desynchronized files. In half of the sequences, the researchers had further scrambled the dots of one figure and added many other dots to the sequence to mask the action. The researchers were able to vary the number of dots to measure participants' tolerance for "noise" or "fog."
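A rough Python sketch of this kind of masking follows. It is a guess at the general idea rather than the study's actual procedure: the article does not say how the dots were scrambled, so the fixed per-dot offset used here is an assumption, as are the function names `scramble` and `add_noise_dots`, the `spread` and `extent` parameters, and the placeholder clip.

```python
import random


def scramble(frames, rng, spread=1.0):
    """Shift each dot by its own fixed random offset in every frame.

    Assumed scrambling scheme: each dot keeps its own motion, but the
    dots no longer line up into the shape of a body.
    """
    n_dots = len(frames[0])
    offsets = [(rng.uniform(-spread, spread), rng.uniform(-spread, spread))
               for _ in range(n_dots)]
    return [[(x + dx, y + dy) for (x, y), (dx, dy) in zip(frame, offsets)]
            for frame in frames]


def add_noise_dots(frames, n_noise, rng, extent=2.0):
    """Append n_noise randomly placed masking dots to every frame."""
    return [list(frame) + [(rng.uniform(-extent, extent),
                            rng.uniform(-extent, extent))
                           for _ in range(n_noise)]
            for frame in frames]


rng = random.Random(1)
# Placeholder clip: 15 frames of a static 13-dot "figure".
clip = [[(0.0, float(i)) for i in range(13)] for _ in range(15)]
masked = add_noise_dots(scramble(clip, rng), n_noise=20, rng=rng)
```

Raising or lowering `n_noise` corresponds to thickening or thinning the "fog" of masking dots.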

The participants had to determine whether the dots they were viewing represented one or two figures.

"We found that people can tolerate more masking dots and answer correctly when the figures' actions are interacting in a meaningful way," says Neri, who also participated in the study. "If the sequence is desynchronized, we can tolerate 10 dots. When viewing synchronized action, we can tolerate a minimum of 20 and up to 100 dots. Having the action coordinated improves the performance by at least a factor of two."

The researchers note that the relationship between two figures held true whether the movements were coordinated but antagonistic, as in the fighting sequences, or coordinated and cooperative, as in the dancing movements.

"This study shows that the cortex encodes complex information about actions, and that there can be a clear evolutionary advantage - improved visual discrimination - to doing so," says Neri.

The researchers point out that the human brain is constantly adapting to limited amounts of visual information, enabling us to imagine a full picture of a tiger, for instance, even if trees and bushes obscure parts of its body.

The brain also allows us to understand when a group of dots or lines represents a human being. When presented with a group of dots in motion, the human visual system can distinguish whether the dots depict a male or a female, and even judge the figure's emotional state, the researchers say.

"We used to talk about how the eye is a camera that passes information intact to the brain, but that's actually not how we see," says Levi. "Things are always changing before our eyes, and our brain is constantly making best guesses about what it's seeing."

When the view is somehow obscured, that task naturally becomes more difficult. This guessing game played by the visual system is also why the brain is susceptible to optical illusions, Levi adds. The same best-guess approach to interpreting what we see is an important factor when considering eyewitness accounts of events.
