Key Responsibilities
Research and development in one or more of the following projects:
1. AI Speech
- Full-duplex conversational system – Intelligent interruption handling, waiting, target speaker identification, semantic understanding, and back-channeling, including for end-to-end audio LLMs.
- Speech understanding – Speech recognition (e.g., HuBERT) and dialogue understanding; speech-to-speech translation that preserves speaker intonation and pace across languages; robust ASR for noisy environments, multilingual input, and varied accents.
2. AI Music
- AI music understanding and generation – Develop a next-generation, LLM-friendly music codec that captures expressive characteristics at low bitrates; end-to-end generation from text descriptions, lyrics, and/or singing voice.
- Audio editor – Foundational MIR (music information retrieval) technology; editable music accompaniment generation for singing voice or instrumental performance; music continuation; voice and sound separation; AI mixing/remastering.
3. Others
- Track cutting-edge technologies in the industry and implement innovative algorithms to address speech understanding problems in complex scenarios.
Qualifications
The job rank (Junior / Senior) depends on the candidate's qualifications and experience.
- PhD or Master's degree in computer science, audio engineering, or a related field, with a strong publication record demonstrating innovative research.
- Strong understanding of the latest AI technologies, including Transformers, diffusion models, and LoRA. Strong experience in LLM training, fine-tuning, and RAG.
- Proficient in Python or C++, with a deep understanding of common algorithms and data structures; familiar with PyTorch or another machine learning framework.
Optional Qualifications
- Experience with traditional audio DSP techniques.
- Experience in pop music production and music performance.