🚀 We're excited to announce our latest work: "Discrete Audio Tokens: More Than a Survey!"
It presents a comprehensive survey and benchmark of audio tokenizers across speech, music, and general audio.
preprint: arxiv.org/pdf/2506.10274
website: poonehmousavi.github.io/dates-website/
Proud of this work, published a few days ago at SLT 2024, on continual learning for end-to-end ASR. Turns out that changing the CL paradigm to parallel training on different tasks and then merging these experts can reduce the forgetting rate to as low as 0.4%! poonehmousavi.github.io/assets/publica…
Join us for the Conversational AI Reading Group! 📚 We meet every Thursday, 11 AM to 12 PM EST, to discuss the latest advancements in conversational AI, multimodal models, and speech processing. Everyone is welcome! More info: poonehmousavi.github.io/rg.html & follow us on Twitter: @convAI2024
SpeechBrain version 1.0.2 is now out! My personal contribution is a clean adapters interface that allows custom adapters or integration with PEFT layers, your choice. You can see the tutorial here:
speechbrain.readthedocs.io/en/latest/tuto…
We just released v1.0.1 of SpeechBrain with some cool updates to Whisper integration: various tasks supported, fine-tuning fixes, performance improvements, and more!
📢 I'll be presenting our paper "How Should We Extract Discrete Audio Tokens from Self-Supervised Models?" at InterSpeech! 🎙️
Meet us at the "Speech Processing Using Discrete Speech Units" oral session on Sep 3 at 16:20.
🔗 Paper: arxiv.org/abs/2406.10735 #INTERSPEECH2024
We will hold XAI-SA: Explainable Machine Learning for Speech and Audio next week at ICASSP 2024, on April 15.
You can sign up here to receive more information:
forms.gle/VPBP3Mojq3Ewqw…
Workshop website for the schedule:
xai-sa-workshop.github.io
For a deep, thoughtful discussion of proliferation, regulation, and why open source is the better—and safer—path to take with AI, I HIGHLY recommend this piece by @jeremyphoward. I learned much from it and encourage others to as well: fast.ai/posts/2023-11-…
Fresh paper out #EMNLP2023
LLMs excel in zero-shot text-to-SQL but still benefit greatly from in-domain demonstrations. This work is driven by two questions: (1) What are the key factors within in-domain examples? (2) Can we harness these benefits without in-domain annotations? https://t.co/r3D2w23rLh
Excited to present at the SpeechBrain online summit on Monday 28th Aug
I'll be joined by @shinjiw_at_cmu, @functiontelechym, Daniel Povey, and Zhaoheng Ni for a panel discussion on open-source speech
It's not too late to register if you haven't already: speechbrain.github.io/sb_summit2023