On March 13, 2024, an engaging seminar titled “Seminar on a Robust Text-to-Speech System in Bangla with Stochastic Duration Predictor” took place at the auditorium of the Department of Computer Science and Engineering, University of Dhaka. This seminar marked the culmination of the “Speech Synthesis in Bangla” project, which commenced in July 2023 and received funding from the University Grants Commission, Government of Bangladesh. The primary objective of the project was to curate a vast, high-quality dataset of audio recordings from a single speaker for speech synthesis applications, and subsequently develop and train a Bangla speech synthesis model using this dataset.
The seminar provided a comprehensive overview of the project's journey and presented the key research findings to attendees. Professor Dr. Md Rezaul Karim served as the Principal Investigator and Mr. Md Ashraful Islam as the Co-Investigator of the project. They were supported by Abdullah Ibne Masud and Mushahid Intesum, research associates from the 25th batch of students. The event reflected a collaborative effort to advance Bangla language technology.
The audience included current students of the department. The event was honored by the presence of Prof. Dr. Abdur Razzaque, chairman of the Department of Computer Science and Engineering, along with Prof. Dr. Suraiya Pervin and Prof. Dr. Saifuddin Md Tareeq. Dr. Md Rezaul Karim, the project's Principal Investigator, served as the host of the seminar. Unfortunately, Mr. Md Ashraful Islam was unable to attend because he was on study leave for his PhD.
Dr. Md Rezaul Karim opened the seminar with a concise introduction to the project's objectives and overall outcomes. Mushahid Intesum and Abdullah Ibne Masud then presented the project in detail, covering its two main phases: data collection and model training.
The team presented promising results from their model, reporting improvements in perceived audio quality, operational efficiency, audio duration metrics, and convergence rate. The presentation emphasized the progress made toward a robust and effective Bangla TTS system.
Following the presentations on data collection and the proposed TTS model, a lively Q&A session took place. Attendees, particularly students from our department, asked about potential enhancements to the TTS model, the effectiveness of specific modules, and the project's overall methodology. The session reflected the genuine interest of our student community in cutting-edge research and technology, and it fostered a constructive exchange of ideas. The feedback and questions raised during the session are expected to help refine and improve the project's outcomes.
We extend our gratitude to all participants, attendees, and sponsors for their support and engagement. This seminar not only showcased the achievements of the project but also highlighted the collaborative and forward-thinking environment fostered within our department.
Looking ahead, we are excited about the potential impact of this research on advancing Bangla language technology and look forward to future endeavors that will continue to push the boundaries of innovation in our field.
Thank you all for being a part of this enriching journey.