Previous Research: Sound and Vision Integration

Principal Investigator(s): 
Stella Yu

Sound carries information complementary to vision and can aid scene understanding and navigation. We train a model to tell individual sounds apart without using any labels; the learned representation accelerates subsequent supervised training on sound event classification and helps explain how songbirds such as the zebra finch can develop communication without external supervision. We also demonstrate a low-cost physical system that learns echolocation and generates depth images from sound alone.
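
The summary does not specify the method for learning to tell sounds apart without labels; one common formulation is instance-level contrastive learning, where two augmented views of the same clip are pulled together in embedding space and different clips are pushed apart. The sketch below illustrates this idea with a small CNN over log-mel spectrograms and an NT-Xent loss; all names, shapes, and augmentations are illustrative assumptions, not the project's actual architecture.

```python
# Illustrative sketch (assumed method): instance-level contrastive learning
# on unlabeled audio. Two augmented "views" of each clip become positives;
# all other clips in the batch are negatives (NT-Xent loss).
import torch
import torch.nn as nn
import torch.nn.functional as F

class AudioEncoder(nn.Module):
    """Small CNN over log-mel spectrograms -> L2-normalized embedding."""
    def __init__(self, emb_dim: int = 128):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.proj = nn.Linear(64, emb_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        z = self.proj(self.conv(x).flatten(1))
        return F.normalize(z, dim=1)

def nt_xent(z1: torch.Tensor, z2: torch.Tensor, tau: float = 0.1) -> torch.Tensor:
    """Contrastive loss: matching views are positives, all others negatives."""
    z = torch.cat([z1, z2], dim=0)            # (2B, D), unit-norm rows
    sim = z @ z.t() / tau                     # cosine similarities
    sim.fill_diagonal_(float("-inf"))         # exclude self-similarity
    B = z1.size(0)
    targets = torch.cat([torch.arange(B, 2 * B), torch.arange(0, B)])
    return F.cross_entropy(sim, targets)

# Usage: spec1/spec2 are two augmentations (e.g., time/frequency masking)
# of the same unlabeled batch, shaped (B, 1, n_mels, n_frames).
enc = AudioEncoder()
spec1, spec2 = torch.randn(8, 1, 64, 128), torch.randn(8, 1, 64, 128)
loss = nt_xent(enc(spec1), enc(spec2))
```

The resulting encoder can then initialize a supervised sound event classifier, which is one way the label-free pretraining could accelerate subsequent training.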
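
For the echolocation system, one plausible formulation (an assumption, not a description of the actual device) is to emit a chirp, record the binaural echoes, and regress a depth map from the echo spectrograms, with a depth camera providing supervision only at training time. A minimal sketch under those assumptions:

```python
# Hypothetical sketch: predict a coarse depth image from two-channel
# (left/right microphone) echo spectrograms. A depth camera supplies
# targets during training; at test time depth comes from sound alone.
import torch
import torch.nn as nn

class EchoToDepth(nn.Module):
    def __init__(self, out_hw=(32, 32)):
        super().__init__()
        self.out_hw = out_hw
        # Encode the binaural echo spectrograms into a single feature vector.
        self.encoder = nn.Sequential(
            nn.Conv2d(2, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        # Decode a coarse depth map from the echo embedding.
        self.decoder = nn.Linear(64, out_hw[0] * out_hw[1])

    def forward(self, echo_spec: torch.Tensor) -> torch.Tensor:
        feat = self.encoder(echo_spec)
        depth = self.decoder(feat).view(-1, 1, *self.out_hw)
        return torch.relu(depth)  # depth values are non-negative

model = EchoToDepth()
echo = torch.randn(4, 2, 64, 128)    # (B, mics, freq, time)
target = torch.rand(4, 1, 32, 32)    # depth-camera ground truth
loss = nn.functional.l1_loss(model(echo), target)
```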