About Me

I’m a PhD student at the Language Technologies Institute of the School of Computer Science at Carnegie Mellon University. I am advised by Dr. Shinji Watanabe as a member of WAVLab.

I’m broadly interested in improving speech and language processing. My current projects involve speech translation and multilingual speech recognition with end-to-end neural networks.

I also completed my master’s degree at CMU SCS, where I was advised by Dr. Michael Shamos. Before that, I was a Technology Strategy Consultant at Accenture. I completed my undergraduate degree in Economics and Computer Science at The University of Chicago.

Updates

May 2024: Interning at Meta FAIR with Dr. Michael Auli
June 2023: Joining the SCALE 2023 Workshop at Johns Hopkins University
June 2022: Joining the JSALT 2022 Workshop at Johns Hopkins University
May 2021: Interning at Dr. Dong Yu’s AI lab at Tencent America

Selected Publications

Improving Multilingual ASR in the Wild Using Simple N-best Re-ranking
_{Brian Yan, Vineel Pratap, Shinji Watanabe, Michael Auli}
_{Pre-print, 2024}
_paper

Improving Massively Multilingual ASR With Auxiliary CTC Objectives
_{William Chen, Brian Yan, Jiatong Shi, Yifan Peng, Soumi Maiti, Shinji Watanabe}
_{2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2023}
_{Best student paper award at IEEE ICASSP 2023}
_paper

Exploration of Efficient End-to-End ASR Using Discretized Input from Self-Supervised Learning
_{Xuankai Chang, Brian Yan, Yuya Fujita, Takashi Maekaku, Shinji Watanabe}
_{24th Annual Conference of the International Speech Communication Association (INTERSPEECH), 2023}
_paper

ESPnet-ST-v2: Multipurpose Spoken Language Translation Toolkit
_{Brian Yan*, Jiatong Shi*, Yun Tang, Hirofumi Inaguma, Yifan Peng, Siddharth Dalmia, Peter Polák, Patrick Fernandes, Dan Berrebbi, Tomoki Hayashi, Xiaohui Zhang, Zhaoheng Ni, Moto Hira, Soumi Maiti, Juan Pino, Shinji Watanabe}
_{Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL), 2023}
_{paper / poster}

CMU’s IWSLT 2023 Simultaneous Speech Translation System
_{Brian Yan*, Jiatong Shi*, Soumi Maiti, William Chen, Xinjian Li, Yifan Peng, Siddhant Arora, Shinji Watanabe}
_{Proceedings of the 20th International Conference on Spoken Language Translation (IWSLT), 2023}
_{Winning submission to the IWSLT 2023 Simultaneous Speech-to-Speech Translation Track (English-to-German)}
_paper

Prompting the Hidden Talent of Web-Scale Speech Models for Zero-Shot Task Generalization
_{Puyuan Peng, Brian Yan, Shinji Watanabe, David Harwath}
_{24th Annual Conference of the International Speech Communication Association (INTERSPEECH), 2023}
_paper

CTC Alignments Improve Autoregressive Translation
_{Brian Yan, Siddharth Dalmia, Yosuke Higuchi, Graham Neubig, Florian Metze, Alan W Black, Shinji Watanabe}
_{Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2023}
_{paper / talk / poster / TLDR}

Towards Zero-Shot Code-Switched Speech Recognition
_{Brian Yan, Matthew Wiesner, Ondrej Klejch, Preethi Jyothi, Shinji Watanabe}
_{2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2023}
_{paper / poster / TLDR}

CMU’s IWSLT 2022 Dialect Speech Translation System
_{Brian Yan, Patrick Fernandes, Siddharth Dalmia, Jiatong Shi, Yifan Peng, Dan Berrebbi, Xinyi Wang, Graham Neubig, Shinji Watanabe}
_{Proceedings of the 19th International Conference on Spoken Language Translation (IWSLT), 2022}
_{Winning submission to the IWSLT 2022 Dialectal Track}
_{paper / talk}

Joint Modeling of Code-Switched and Monolingual ASR via Conditional Factorization
_{Brian Yan, Chunlei Zhang, Meng Yu, Shi-Xiong Zhang, Siddharth Dalmia, Dan Berrebbi, Chao Weng, Shinji Watanabe, Dong Yu}
_{2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2022}
_{paper / talk / poster / TLDR}

My Google Scholar profile is more comprehensive.

Activities

Talks

Controllable and Explainable End-to-End Speech Translation
_{SIG SLT Seminar, 2022}

Code-Switched Modeling
_{JSALT Workshop, Johns Hopkins University, 2022}

Building End-to-End Speech Translation Systems
_{JSALT Workshop, Johns Hopkins University, 2022}

Teaching

CS 11-751: Speech Recognition and Understanding
_{Teaching Assistant}
_{Carnegie Mellon University, Fall 2023}

CS 11-700: Language Technologies Institute Colloquium
_{Teaching Assistant}
_{Carnegie Mellon University, 2021-22 Academic Year}

CS 11-737: Multilingual NLP
_{Teaching Assistant}
_{Carnegie Mellon University, Spring 2021 DSTA Course}

Academic Service

Reviewer
_{ICASSP, Interspeech, ACL, EMNLP, NAACL, SLT, ASRU, APSIPA}

Contact

Email: byan[at]cs.cmu.edu