Skip to main content

Main navigation

  • About BCS
    • Mission
    • History
    • Building 46
      • Building 46 Room Reservations
    • Leadership
    • Employment
    • Contact
      • BCS Spot Awards
      • Building 46 Email and Slack
    • Directory
  • Faculty + Research
    • Faculty
    • Areas of Research
    • Postdoctoral Research
      • Postdoctoral Association and Committees
    • Core Facilities
    • InBrain
      • InBRAIN Collaboration Data Sharing Policy
  • Academics
    • Course 9: Brain and Cognitive Sciences
    • Course 6-9: Computation and Cognition
      • Course 6-9 MEng
    • Brain and Cognitive Sciences PhD
      • How to Apply
      • Program Details
      • Classes
      • Research
      • Student Life
      • For Current Students
    • Molecular and Cellular Neuroscience Program
      • How to Apply to MCN
      • MCN Faculty and Research Areas
      • MCN Curriculum
      • Model Systems
      • MCN Events
      • MCN FAQ
      • MCN Contacts
    • Computationally-Enabled Integrative Neuroscience Program
    • Research Scholars Program
    • Course Offerings
  • News + Events
    • News
    • Events
    • Recordings
    • Newsletter
  • Community + Culture
    • Community + Culture
    • Community Stories
    • Outreach
      • MIT Summer Research Program (MSRP)
      • Post-Baccalaureate Research Scholars
      • Conferences, Outreach and Networking Opportunities
    • Get Involved (MIT login required)
    • Resources (MIT login Required)
    • Upcoming Events
  • Give to BCS
    • Join the Champions of the Brain Fellows Society
    • Meet Our Donors

Utility Menu

  • Directory
  • Apply to BCS
  • Contact Us

Footer

  • Contact Us
  • Employment
  • Be a Test Subject
  • Login

Footer 2

  • McGovern
  • Picower

Utility Menu

  • Directory
  • Apply to BCS
  • Contact Us
Brain and Cognitive Sciences
Menu
MIT

Main navigation

  • About BCS
    • Mission
    • History
    • Building 46
    • Leadership
    • Employment
    • Contact
    • Directory
  • Faculty + Research
    • Faculty
    • Areas of Research
    • Postdoctoral Research
    • Core Facilities
    • InBrain
  • Academics
    • Course 9: Brain and Cognitive Sciences
    • Course 6-9: Computation and Cognition
    • Brain and Cognitive Sciences PhD
    • Molecular and Cellular Neuroscience Program
    • Computationally-Enabled Integrative Neuroscience Program
    • Research Scholars Program
    • Course Offerings
  • News + Events
    • News
    • Events
    • Recordings
    • Newsletter
  • Community + Culture
    • Community + Culture
    • Community Stories
    • Outreach
    • Get Involved (MIT login required)
    • Resources (MIT login Required)
    • Upcoming Events
  • Give to BCS
    • Join the Champions of the Brain Fellows Society
    • Meet Our Donors

News

News Menu

  • News
  • Events
  • Newsletters

Breadcrumb

  1. Home
  2. News
  3. Researchers enhance peripheral vision in AI models
March 8, 2024

Researchers enhance peripheral vision in AI models

by
Adam Zewe | MIT News
Image
Researchers enhance peripheral vision in AI models
A dataset of transformed images can be used to effectively simulate peripheral vision in a machine-learning model, improving the performance of these models on detecting and recognizing objects that are off to the side or in the corner of a scene.

Peripheral vision enables humans to see shapes that aren’t directly in our line of sight, albeit with less detail. This ability expands our field of vision and can be helpful in many situations, such as detecting a vehicle approaching our car from the side.

Unlike humans, AI does not have peripheral vision. Equipping computer vision models with this ability could help them detect approaching hazards more effectively or predict whether a human driver would notice an oncoming object.

Taking a step in this direction, MIT researchers developed an image dataset that allows them to simulate peripheral vision in machine learning models. They found that training models with this dataset improved the models’ ability to detect objects in the visual periphery, although the models still performed worse than humans.

Their results also revealed that, unlike with humans, neither the size of objects nor the amount of visual clutter in a scene had a strong impact on the AI’s performance.

“There is something fundamental going on here. We tested so many different models, and even when we train them, they get a little bit better but they are not quite like humans. So, the question is: What is missing in these models?” says Vasha DuTell, a postdoc and co-author of a paper detailing this study.

Answering that question may help researchers build machine learning models that can see the world more like humans do. In addition to improving driver safety, such models could be used to develop displays that are easier for people to view.

Plus, a deeper understanding of peripheral vision in AI models could help researchers better predict human behavior, adds lead author Anne Harrington MEng ’23.

“Modeling peripheral vision, if we can really capture the essence of what is represented in the periphery, can help us understand the features in a visual scene that make our eyes move to collect more information,” she explains.

Their co-authors include Mark Hamilton, an electrical engineering and computer science graduate student; Ayush Tewari, a postdoc; Simon Stent, research manager at the Toyota Research Institute; and senior authors William T. Freeman, the Thomas and Gerd Perkins Professor of Electrical Engineering and Computer Science and a member of the Computer Science and Artificial Intelligence Laboratory (CSAIL); and Ruth Rosenholtz, principal research scientist in the Department of Brain and Cognitive Sciences and a member of CSAIL. The research will be presented at the International Conference on Learning Representations.

“Any time you have a human interacting with a machine — a car, a robot, a user interface — it is hugely important to understand what the person can see. Peripheral vision plays a critical role in that understanding,” Rosenholtz says.

Simulating peripheral vision

Extend your arm in front of you and put your thumb up — the small area around your thumbnail is seen by your fovea, the small depression in the middle of your retina that provides the sharpest vision. Everything else you can see is in your visual periphery. Your visual cortex represents a scene with less detail and reliability as it moves farther from that sharp point of focus.

Many existing approaches to model peripheral vision in AI represent this deteriorating detail by blurring the edges of images, but the information loss that occurs in the optic nerve and visual cortex is far more complex.

For a more accurate approach, the MIT researchers started with a technique used to model peripheral vision in humans. Known as the texture tiling model, this method transforms images to represent a human’s visual information loss.  

They modified this model so it could transform images similarly, but in a more flexible way that doesn’t require knowing in advance where the person or AI will point their eyes.

“That let us faithfully model peripheral vision the same way it is being done in human vision research,” says Harrington.

The researchers used this modified technique to generate a huge dataset of transformed images that appear more textural in certain areas, to represent the loss of detail that occurs when a human looks further into the periphery.

Then they used the dataset to train several computer vision models and compared their performance with that of humans on an object detection task.

“We had to be very clever in how we set up the experiment so we could also test it in the machine learning models. We didn’t want to have to retrain the models on a toy task that they weren’t meant to be doing,” she says.

Peculiar performance

Humans and models were shown pairs of transformed images which were identical, except that one image had a target object located in the periphery. Then, each participant was asked to pick the image with the target object.

“One thing that really surprised us was how good people were at detecting objects in their periphery. We went through at least 10 different sets of images that were just too easy. We kept needing to use smaller and smaller objects,” Harrington adds.

The researchers found that training models from scratch with their dataset led to the greatest performance boosts, improving their ability to detect and recognize objects. Fine-tuning a model with their dataset, a process that involves tweaking a pretrained model so it can perform a new task, resulted in smaller performance gains.

But in every case, the machines weren’t as good as humans, and they were especially bad at detecting objects in the far periphery. Their performance also didn’t follow the same patterns as humans.

“That might suggest that the models aren’t using context in the same way as humans are to do these detection tasks. The strategy of the models might be different,” Harrington says.

The researchers plan to continue exploring these differences, with a goal of finding a model that can predict human performance in the visual periphery. This could enable AI systems that alert drivers to hazards they might not see, for instance. They also hope to inspire other researchers to conduct additional computer vision studies with their publicly available dataset.

“This work is important because it contributes to our understanding that human vision in the periphery should not be considered just impoverished vision due to limits in the number of photoreceptors we have, but rather, a representation that is optimized for us to perform tasks of real-world consequence,” says Justin Gardner, an associate professor in the Department of Psychology at Stanford University who was not involved with this work. “Moreover, the work shows that neural network models, despite their advancement in recent years, are unable to match human performance in this regard, which should lead to more AI research to learn from the neuroscience of human vision. This future research will be aided significantly by the database of images provided by the authors to mimic peripheral human vision.”

This work is supported, in part, by the Toyota Research Institute and the MIT CSAIL METEOR Fellowship.

Read the Original Article
Don't miss our next newsletter!
Sign Up

Footer menu

  • Contact Us
  • Employment
  • Be a Test Subject
  • Login

Footer 2

  • McGovern
  • Picower
Brain and Cognitive Sciences

MIT Department of Brain and Cognitive Sciences

Massachusetts Institute of Technology

77 Massachusetts Avenue, Room 46-2005

Cambridge, MA 02139-4307 | (617) 253-5748

For Emergencies | Accessibility

Massachusetts Institute of Technology