Introduction

I have graduated and am currently a Research Scientist at Google Brain. My new webpage is here.

Hello and welcome to my website! I'm a PhD student in the Electrical Engineering & Computer Science department at the University of California, Berkeley. I currently work on interdisciplinary topics including Speech Recognition and Signal Processing under Nelson Morgan's supervision, TeleImmersive technologies with Ruzena Bajcsy, and Computer Vision and Machine Learning with Trevor Darrell. I am a recipient of the 2011 Microsoft Research PhD Fellowship.

During my years as a researcher I have developed a deep passion for Artificial Intelligence, Speech Recognition, Speaker Diarization, Machine Learning, Computer Vision and Optimization. I'm very excited to be working on these topics as part of my research at Berkeley. I have recently become involved with the Berkeley Overmind, a project to build an AI that plays a popular real-time strategy game.

I graduated in September 2009 with a master's degree (MSc) in Computer Science and Engineering from the University of California, San Diego, and in 2007 I completed a dual degree in Telecommunication Engineering and Mathematics at the Polytechnic University of Catalonia in Barcelona, Spain. I did my undergraduate thesis on Machine Learning and Computer Vision at the Robotics Institute of Carnegie Mellon University. In addition, I got to work with a fabulous group of people during three summer internships at Microsoft Research in Redmond, WA, and at Google Research in Mountain View, CA.

Selected Publications
  • O. Vinyals, N. Morgan.
    Deep vs. Wide: Depth on a Budget for Robust Speech Recognition.
    Proceedings of Interspeech 2013, Lyon, France, August 2013.
    (PDF, Bibtex)
  • Y. Jia, O. Vinyals, T. Darrell.
    On Compact Codes for Spatially Pooled Features.
    International Conference on Machine Learning (ICML 2013), Atlanta, GA, June 2013.
    (PDF, Bibtex) (a subset of this paper was also presented at the ICLR workshop [arXiv])
  • O. Vinyals, Y. Jia, L. Deng, T. Darrell.
    Learning with Recursive Perceptual Representations.
    Neural Information Processing Systems (NIPS 2012), South Lake Tahoe, CA, December 2012.
    (PDF, Bibtex)
  • O. Vinyals, D. Bohus, R. Caruana.
    Learning Speaker, Addressee and Overlap Detection Models from Multimodal Streams.
    In ACM International Conference on Multimodal Interaction (ICMI 2012), Santa Monica, CA, October 2012.
    (PDF, Bibtex)
  • O. Vinyals, L. Deng.
    Are Sparse Representations Rich Enough for Acoustic Modeling?
    Proceedings of Interspeech 2012, Portland, OR, September 2012.
    (PDF, Bibtex)
  • A. Maas, Q. Le, T. O'Neil, O. Vinyals, P. Nguyen, A. Ng.
    Recurrent Neural Networks for Noise Reduction in Robust ASR.
    Proceedings of Interspeech 2012, Portland, OR, September 2012.
    (PDF, Bibtex)
  • O. Vinyals, S. Ravuri, D. Povey.
    Revisiting Recurrent Neural Networks for Robust ASR.
    Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 12), Kyoto, Japan, March 2012.
    (PDF, Bibtex)
  • O. Vinyals, D. Povey.
    Krylov Subspace Descent for Deep Learning.
    In International Conference on Artificial Intelligence and Statistics (AISTATS 2012), Canary Islands, Spain, April 2012; also in Neural Information Processing Systems Optimization Workshop and Hierarchical Learning Workshop (NIPS 2011), Granada, Spain, December 2011.
    (PDF, Bibtex)
  • X. Anguera, S. Bozonnet, N. Evans, C. Fredouille, G. Friedland, O. Vinyals.
    Speaker Diarization: A Review of Recent Research.
    In IEEE Transactions on Audio, Speech, and Language Processing (TASLP), 2011.
    (PDF, Bibtex)
  • O. Vinyals, S. Ravuri.
    Comparing Multilayer Perceptron to Deep Belief Network Tandem Features for Robust ASR.
    Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 11), Prague, Czech Republic, May 2011.
    (PDF, Bibtex)
  • G. Friedland, O. Vinyals, T. Darrell.
    Multimodal Location Estimation.
    Proceedings of ACM International Conference on Multimedia (ACM Multimedia 2010), Florence, Italy, October 2010.
    (PDF, Bibtex)
  • E. Martin, O. Vinyals, G. Friedland, R. Bajcsy.
    Using Android and Indoor Localization for Diaries.
    Proceedings of ACM International Conference on Multimedia (ACM Multimedia 2010), Florence, Italy, October 2010. Finalist of the ACM Multimedia Grand Challenge, Google diaries task.
    (PDF, Bibtex)
  • C. Vaquero, O. Vinyals, G. Friedland.
    A Hybrid Approach to Online Speaker Diarization.
    Proceedings of Interspeech 2010, Makuhari, Japan, September 2010.
    (PDF, Bibtex)
  • O. Vinyals, G. Friedland, N. Morgan.
    Discriminative Training for Hierarchical Clustering in Speaker Diarization.
    Proceedings of Interspeech 2010, Makuhari, Japan, September 2010.
    (PDF, Bibtex)
  • O. Vinyals, L. Deng, D. Yu, A. Acero.
    Discriminative Pronunciation Learning Using Phonetic Decoder and Minimum-Classification-Error Criterion.
    Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 09), Taipei, Taiwan, April 2009. Finalist of the Best Student Paper Award.
    (PDF, Bibtex)
  • G. Friedland, O. Vinyals, Y. Huang, C. Müller.
    Prosodic and other Long-Term Features for Speaker Diarization.
    In IEEE Transactions on Audio, Speech, and Language Processing (TASLP), 2009.
    (PDF, Bibtex)
  • G. Friedland, O. Vinyals.
    Live Speaker Identification in Conversations.
    Proceedings of ACM International Conference on Multimedia (ACM Multimedia 2008), Vancouver, Canada, October 2008.
    (PDF, Bibtex)
  • O. Vinyals, G. Friedland.
    Modulation Spectrogram Features for Speaker Diarization.
    Proceedings of Interspeech 2008, Brisbane, Australia, September 2008.
    (PDF, Bibtex)
  • K. Boakye, B. Trueba-Hornero, O. Vinyals, G. Friedland.
    Overlapped speech detection for improved Speaker Diarization in multiparty meetings.
    Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 08), Las Vegas, NV, April 2008.
    (PDF, Bibtex)
  • Y. Huang, O. Vinyals, G. Friedland, C. Müller, N. Mirghafori, C. Wooters.
    A Fast-Match approach for robust, faster than real-time Speaker Diarization.
    Proceedings of IEEE workshop on Automatic Speech Recognition and Understanding (ASRU 07), Kyoto, Japan, December 2007.
    (PDF, Bibtex)
  • O. Vinyals, G. Friedland, N. Mirghafori.
    Revisiting a basic function on current CPUs: A fast logarithm implementation with adjustable accuracy.
    ICSI TR-07-002, June 2007.
    (PDF, Bibtex)
  • F. de la Torre, O. Vinyals.
    Learning Kernel Expansions for Image Classification.
    Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR 07), Minneapolis, MN, June 2007.
    (PDF, Bibtex)
Coursework (@Berkeley)

CS281A Statistical Learning Theory
CS281B Advanced Topics in Learning & Decision Making
Stat260 Bayesian Modeling and Inference
CS294 Practical Machine Learning
CS280 Computer Vision

Coursework (@UCSD)

CSE256 Statistical Natural Language Processing
ECE273 Convex Optimization and Applications

Coursework (@CMU)

10-701 and 15-781 Machine Learning

Last Updated June 14, 2013