publications
publications by category in reverse chronological order.
2025
- Tech Report. Overview of the Amphion Toolkit (v0.2). arXiv preprint arXiv:2501.15442, 2025. TL;DR: This is the technical report for the second version of the Amphion toolkit.
2024
- SLT 2024. Emilia: An Extensive, Multilingual, and Diverse Speech Dataset for Large-Scale Speech Generation. In 2024 IEEE Spoken Language Technology Workshop (SLT), 2024. TL;DR: We collect a 100k-hour in-the-wild speech dataset for speech generation.
- SLT 2024. Amphion: an Open-Source Audio, Music, and Speech Generation Toolkit. In 2024 IEEE Spoken Language Technology Workshop (SLT), 2024. TL;DR: We develop a unified toolkit for audio, music, and speech generation.
- Debatts: Zero-shot debating text-to-speech synthesis. arXiv preprint arXiv:2411.06540, 2024.
2023
- ROME: Testing image captioning systems via recursive object melting. In Proceedings of the 32nd ACM SIGSOFT International Symposium on Software Testing and Analysis, 2023.