About the Team
The speech team's mission is to empower content understanding, interaction and creation across a globally popular app and other products using speech and audio technologies. We focus on cutting-edge R&D in areas such as speech and audio, music processing, natural language understanding and multimodal deep learning. We are looking for top talent to work on these exciting technologies, integrate them into this globally popular app and other products, and ultimately bring joy to our global user base!
Responsibilities
- Conduct cutting-edge research and development in speech / audio foundation models.
- Contribute to the advancement of audio understanding, including multilingual speech recognition, speech translation and multimodal understanding.
- Drive the practical application of these technologies in business scenarios, including but not limited to closed captions, voice dubbing and video understanding.
Requirements
Qualifications
- Master's or PhD in computer science, mathematics, engineering or a related field.
- Experience in one or more areas of machine learning and deep learning, including but not limited to:
  - Automatic Speech Recognition
  - Automatic Speech Translation
  - Speech / audio self-supervised learning and foundation models
  - Speaker recognition and verification
  - Speech emotion recognition
  - Multimodal foundation models
  - Large Language Model pretraining and finetuning
Preferred Qualifications
- Publications in top-tier ML / DL venues such as NeurIPS, ICLR, ICML and AAAI, and speech venues such as ICASSP, ASRU and Interspeech.
- Deep understanding of Large Language Models.
- Familiarity with distributed computing and large-scale model training.
- Familiarity with deep learning frameworks such as TensorFlow and PyTorch.
- Familiarity with engineering principles and best practices.
- High competence in algorithms and programming; strong coding skills in C / C++ and Python.
- Ability to work collaboratively in a fast-paced, multi-functional environment.