Patrick Fernandes @psanfernandes
PhD Student @LTIatCMU & @istecnico. Previously research @Google, @Microsoft & @Unbabel.
patricksf.dev · Pittsburgh · Joined February 2011
Tweets: 152 · Followers: 531 · Following: 237 · Likes: 220
Today we release the Tower paper! 🗼 Tower is an open-weight suite of multilingual models — built on top of LLaMA-2 — for translation-related tasks. It supports 10 different languages. Paper: arxiv.org/pdf/2402.17733… Models and data: huggingface.co/collections/Un… 🧵Thread below.
Interested in models of associative memory with *exact* retrieval and supporting memory *structure*? 🚀Check out our new paper on sparse and structured Hopfield networks (arxiv.org/abs/2402.13725)! With @vnfrombucharest, @mcneural_ and @andre_t_martins .
CroissantLLM 🥐 A Truly Bilingual French-English Language Model paper page: huggingface.co/papers/2402.00… introduce CroissantLLM, a 1.3B language model pretrained on a set of 3T English and French tokens, to bring to the research and industrial community a high-performance, fully…
CMU is hiring a system administrator for our GPU cluster (350 GPUs and growing!): cmu.wd5.myworkdayjobs.com/en-US/CMU/deta… Come work with us to help build out compute infrastructure to enable new discoveries in AI, large language models, and beyond!
Excited to receive an Outstanding Paper award for this work at @emnlpmeeting! Thanks to my co-authors George Foster and @markuseful! Updated version available here: aclanthology.org/2023.emnlp-mai…
Part of the recipe for training a SOTA language model these days is RLHF/SFT based on human preferences. But…what are the different ways it can be used and collected? What open questions exist? We wrote a survey on this, and you can find out: Poster session 1 (11:00am Dec 8)
Researchers 🔎 at the Instituto Superior Técnico propose the integration of quality metrics as reward models into the #MT pipeline to enhance #translation quality. @psanfernandes @tozefarinhas @andre_t_martins @istecnico @itnewspt @CarnegieMellon @Unbabel slator.com/research-shows…
I have a post-doc position open at @LTIatCMU, starting Summer or Fall 2024. If you are interested in working with me at CMU on LLMs, agents, machine learning for NLP, or intelligent evaluation please apply (it should take about 5-10 minutes): forms.gle/FbAdCcxBTPWAwe…
If you want to study NLP, LLMs, or broader language technology in grad school, please apply to @LTIatCMU! lti.cs.cmu.edu/apply-lti We have a great group of faculty covering many topics: lti.cs.cmu.edu/directory/all/… I personally will be recruiting students on LLMs/agents/evaluation.
Happy to share that our paper “Epsilon Sampling Rocks: Investigating Sampling Strategies for Minimum Bayes Risk Decoding for Machine Translation” has been accepted at @emnlpmeeting. This work further pushes the SOTA in inference methods for machine translation and NLG in…
What do decoding methods including self-consistency, output ensembling, and range voting have in common? They’re all variants of Minimum Bayes Risk (MBR) decoding! This useful and easy-to-apply method generalizes many modern generation techniques! arxiv.org/abs/2310.01387 👇🧵
It has been shown that MBR decoding significantly outperforms MAP decoding, but its high cost makes it impractical for most applications. Can we harvest the quality gains of MBR at train time w/o sacrificing inference-time efficiency? TLDR: yes! arxiv.org/abs/2309.10966…
I am very excited to officially announce that I joined the Department of Engenharia Informatica of Instituto Superior Tecnico as an assistant professor in AI!
Interested in document-level MT but have been held back by the lack of automatic metrics? If so, you won't want to miss our new paper! We study the quality of sentence-level metrics on long-form text and augment them with paragraph-level training data. arxiv.org/abs/2308.13506
LLM APIs let you build NLP systems in seconds; just write a prompt and use it as you wish. But APIs cost money and have privacy concerns. Our new library Prompt2Model turns a prompt into a small expert model that can match LLM performance but runs locally! github.com/neulab/prompt2…
Language AI research shows that fine-tuning LLMs with fine-grained human judgment data boosts MT evaluation. Team launches prompting technique: AutoMQM. @psanfernandes @gneubig @_danieldeutsch @andre_t_martins @gneubig @JonClarkSeattle @markuseful @orf_bnw slator.com/top-language-a…
The Devil is in the Errors: Leveraging Large Language Models for Fine-grained Machine Translation Evaluation ai.papers.bar/paper/f81a8f42… Automatic evaluation of machine translation (MT) is a critical tool driving the... 🧵 👇
Many of you asked me the question about an automatic metric that can give us similar insights as MQM. We (mostly Patrick -- hands down one of the best student researchers I ever worked with) investigated how well LLMs can do MQM like error annotation. We present ... 🥳AutoMQM🥳
What is the key for good model-graded evaluation of generated text? Don't ask the model for a score, ask it to point out individual errors! Our new metric AutoMQM does this, and achieves SOTA evaluation accuracy and interpretable results for translation evaluation.
Kayo Yin @kayo_yin
8K Followers 556 Following PhD student @berkeley_ai @berkeleynlp working on interpretability and signed languages. Former @msftresearch @deepmind @carnegiemellon @polytechnique. 🇫🇷🇯🇵
Graham Neubig @gneubig
31K Followers 586 Following Associate professor at CMU, studying natural language processing and machine learning.
Andre Martins @andre_t_martins
2K Followers 397 Following NLP/ML researcher in Lisbon ([email protected])
Danish Pruthi @danish037
7K Followers 628 Following Faculty at Indian Institute of Science, Bangalore. PhD from @LTIatCMU.
Shruti Rijhwani @shrutirij
4K Followers 499 Following * Research Scientist @GoogleDeepMind * #NLProc research * PhD from @LTIatCMU * Amateur woodworker, scuba diver, foosball player
Siddharth Dalmia @siddalmia05
1K Followers 445 Following Research Scientist @GoogleDeepmind | #SpeechProc and #NLProc | PhD from @LTIatCMU @SCSatCMU | Ex-intern @GoogleAI, @AWSCloud, @FacebookAI
Shaily @shaily99
5K Followers 2K Following PhD @LTIatCMU Prev: @GoogleAI @MSFTResearch. Working on ethics and evaluation in #NLProc. Usually ranting, often about research & DEI. 📚 @readsndrants
Vivek Gupta @keviv9
2K Followers 5K Following PostDoc @cogcomp UPenn | Ph.D. CS @UUtah | @iitkanpur. @Bloomberg & @MSFTResearch Fellow | ex-@MetaAI, @IBM, @Verisk, @samsungresearch, @Synopsys #nlp #ml
Kelly Marchisio (St. .. @cheeesio
1K Followers 558 Following Member of Technical Staff @cohere. Formerly: PhD @jhuclsp Alexa Fellow @amazon dev @Google MPhil @cambridgenlp EdM @hgse 🔑🔑¬🧀 (@kelvenmar20)
Gabriele Sarti @gsarti_
2K Followers 2K Following PhD Student @GroNLP 🐮, core dev of @InseqLib (https://t.co/tTjrg26ygQ). Interpretability ∩ HCI ∩ #NLProc. Prev: @AmazonScience, @Aindo_AI, @ItaliaNLP_Lab.
Nuno M. Guerreiro @nunonmg
303 Followers 291 Following Research Scientist at @unbabel, PhD Student at @istecnico. From Lisbon, Portugal 🇵🇹.
Pasquale Minervini.. @PMinervini
7K Followers 4K Following Researcher in ML/NLP at the University of Edinburgh (faculty @InfAtEd @EdinburghNLP), @ELLISforEurope, @UCL_NLP, PI for @Clarify2020, https://t.co/WydvfU8ugz he/they
Machel Reid @machelreid
2K Followers 1K Following Research Scientist @GoogleDeepMind Working on LLMs on the Gemini Team; did gemini 1.5 pro
Sara Hooker @sarahookr
39K Followers 7K Following I lead @CohereForAI. Formerly Research @Google Brain @GoogleDeepmind. ML Efficiency at scale, LLMs, @trustworthy_ml. Changing spaces where breakthroughs happen.
Alexis Ross @alexisjross
3K Followers 887 Following phd-ing @MIT_CSAIL, interested in NLP for education | formerly nlp @allen_ai, comp sci & philosophy @harvard ‘20
Antonis Anastasopoulo.. @anas_ant
3K Followers 2K Following Assist. Prof at George Mason CS #nlproc MT, ASR, and documentation of endangered languages.
Mars Bay Studio @bazaerbaike
754 Followers 4K Following Embark on a journey with Mars Bay Studio through stunning 4K AI videos and a symphony of sounds spanning genres. Subscribe for exquisite visuals and tunes.
Vikram Dutt @vd_
819 Followers 7K Following
Miranda Gershmel @MGershmel20010
0 Followers 50 Following
Foughshee @foughshee77453
138 Followers 2K Following
joewood @joewood251
73 Followers 484 Following
Duarte Alves @DuarteMRAlves
7 Followers 33 Following
Ibrahim Ahmad @Ibrahim63433664
85 Followers 3K Following
Tiago Pimentel @tpimentelms
1K Followers 248 Following Postdoc at @ETH_en. Formerly, PhD student at @Cambridge_Uni.
Piyush @CatAstro_Piyush
315 Followers 843 Following Physics Grad student| Computational Physics| Natural Language Processing| Hydrogen Storage
Manuel Faysse @ManuelFaysse
199 Followers 232 Following NLP (LLMs) & ML Privacy - PhD Candidate @CentraleSupelec Prev: @imperialcollege, @epfl, @La_UPM
Caio @UndefBehavior
586 Followers 1K Following
rickerp @rickerp_
50 Followers 41 Following MSc. Computer Science and Engineering Student @istecnico
Carlos Pinheiro @Capsbrr
24 Followers 391 Following
Ben Holfeld @BenHolfeld
89K Followers 32K Following SF AI Studio Lead @Accenture, partnering with @OpenAI @Google @Microsoft. Pianist. German Quantum Physicist. Creator of the Nth Floor. Views are my own. x/acc.
Prince Andrew @PrinceAndr73224
3 Followers 180 Following
Luke Zettlemoyer @LukeZettlemoyer
8K Followers 2K Following
pawann k. @pawaniiit
220 Followers 4K Following Prof., PhD, Inria, France, Postdoc KU Leuven, Fraunhofer ITWM, FU Berlin. I like Machine learning and mathematics.
jluite @jluite2014
299 Followers 4K Following
Vanya BK @VanyaBk
21 Followers 751 Following
nick nassuphis @NNassuphis
119 Followers 5K Following
lucie_nlp @lucie_nlp
4K Followers 4K Following #NLProc researcher, dreamer, traveller, space enthusiast. Comp. sci. prof @CaisaLab - @UniBonn, @LamarrInstitute . Past:Alexa,@CERN,@Google,@CVUTPraha
Rochelle Choenni @ChoenniRochelle
104 Followers 179 Following PhD candidate in NLP at the University of Amsterdam. I am supervised by prof. Ekaterina Shutova (UvA) and dr. Dan Garrette (Google Research).
Faria Huq @FariaHuqOaishi
572 Followers 1K Following PhD Student @SCSatCMU working with @jeffbigham working on Agents 🤖 and Interaction📱. Prev- SGI Fellow'21 @MIT_CSAIL, Tero labs.
María Benavente @_mariabg
1K Followers 894 Following I like data and the stories behind it 🙆🏽✨ Language Engineer @apple – helping build Siri Opinions are my own, etc
Beatriz Borges @obiwit
71 Followers 68 Following #NLProc PhD student at #ICepfl - aiming to better align language models with us!
Shijie Chen @ShijieChen98
92 Followers 142 Following
Baban @imbaban00
41 Followers 449 Following
Pinzhen "Patrick" Che.. @pinzhen_chen
77 Followers 224 Following Working on LLMs and MT @EdinburghNLP @InfAtEd
Daniel Spokoyny @daniel_spokoyny
324 Followers 2K Following Virtual PhD @ CMU LTI. Core Team at Climate Change AI
sheoguo @sheo63009008
87 Followers 1K Following A third-year master student researching in NLP(especially for multilingual MT, low resource MT); A football fun⚽️; A guitar player🎸.
Haggard @Haggard743712
3 Followers 105 Following Studied crypto I'm in https://t.co/FCgPImLSZO last year, earned over $2M, achieved financial freedom, This has enabled me to kick-start my global travel plan!
Pierre Colombo @PierreColombo6
450 Followers 1K Following Associate Professor at Université Paris Saclay - CentraleSupelec - NLP - GenAI
Telmo Felgueira @TSFelg
108 Followers 789 Following Making ML work in the real world. Senior ML Engineer @lokahq
Akkarawat Susiriprapa.. @S27201Akkarawat
57 Followers 161 Following One more question Akkarawat I think 🤔
maviay @maviay21_
0 Followers 16 Following
Harsh Desai @dreamerharsh
1 Followers 3K Following
Mohammed Amine BEN CH.. @AmineLehocine
34 Followers 1K Following
Phung Cheng Fei @salmon_shitake
435 Followers 5K Following
Charly Wargnier @DataChaz
112K Followers 31K Following 🥑 DevRel @Streamlit @SnowflakeDB 🪶 𝕏 about #AI, #LLMs, #DataScience, #WebApps, #SEO 💕 My heart is open source 🌍 Nature Lover 👀 My views!
Matthew Finlayson @mattf1n
798 Followers 868 Following First year PhD at @nlp_usc | Former predoc at @allen_ai on @ai2_aristo | Harvard 2021 CS & Linguistics
Jung Ryoo @jwryoom
83 Followers 310 Following #Software #Cloud, Building a global software team, #DevOps, #MBA, Trinity College #Dublin #TCD #Ireland #Korea
(((ل()(ل() 'yoav))).. @yoavgo
46K Followers 2K Following
Kayo Yin @kayo_yin
8K Followers 556 Following PhD student @berkeley_ai @berkeleynlp working on interpretability and signed languages. Former @msftresearch @deepmind @carnegiemellon @polytechnique. 🇫🇷🇯🇵
Graham Neubig @gneubig
31K Followers 586 Following Associate professor at CMU, studying natural language processing and machine learning.
Andre Martins @andre_t_martins
2K Followers 397 Following NLP/ML researcher in Lisbon ([email protected])
Danish Pruthi @danish037
7K Followers 628 Following Faculty at Indian Institute of Science, Bangalore. PhD from @LTIatCMU.
Andrej Karpathy @karpathy
978K Followers 904 Following 🧑🍳. Previously Director of AI @ Tesla, founding team @ OpenAI, CS231n/PhD @ Stanford. I like to train large deep neural nets 🧠🤖💥
Shruti Rijhwani @shrutirij
4K Followers 499 Following * Research Scientist @GoogleDeepMind * #NLProc research * PhD from @LTIatCMU * Amateur woodworker, scuba diver, foosball player
Yann LeCun @ylecun
711K Followers 718 Following Professor at NYU. Chief AI Scientist at Meta. Researcher in AI, Machine Learning, Robotics, etc. ACM Turing Award Laureate.
Sasha Rush @srush_nlp
52K Followers 464 Following Professor, Programmer in NYC. Cornell Tech, Hugging Face 🤗 https://t.co/cZl0wTfqGz
Siddharth Dalmia @siddalmia05
1K Followers 445 Following Research Scientist @GoogleDeepmind | #SpeechProc and #NLProc | PhD from @LTIatCMU @SCSatCMU | Ex-intern @GoogleAI, @AWSCloud, @FacebookAI
Zachary Lipton @zacharylipton
59K Followers 2K Following Professor: CMU/@acmi_lab, CTO / CSO: @AbridgeHQ, Creator: @d2l_ai & https://t.co/QQt98VNLUp, Relapsing 🎷
Shaily @shaily99
5K Followers 2K Following PhD @LTIatCMU Prev: @GoogleAI @MSFTResearch. Working on ethics and evaluation in #NLProc. Usually ranting, often about research & DEI. 📚 @readsndrants
Kyunghyun Cho @kchonyc
61K Followers 2K Following a combination of a mediocre scientist, a mediocre manager, a mediocre advisor & a mediocre PC at @nyuniversity (@CILVRatNYU) & @genentech (@PrescientDesign).
Sebastian Ruder @seb_ruder
80K Followers 1K Following Multilingual LLMs @cohere • Prev: @GoogleDeepMind • Newsletter: https://t.co/7JGh2qpG98
François Chollet @fchollet
469K Followers 770 Following Deep learning @google. Creator of Keras. Author of 'Deep Learning with Python'. Opinions are my own.
Kelly Marchisio (St. .. @cheeesio
1K Followers 558 Following Member of Technical Staff @cohere. Formerly: PhD @jhuclsp Alexa Fellow @amazon dev @Google MPhil @cambridgenlp EdM @hgse 🔑🔑¬🧀 (@kelvenmar20)
Gabriele Sarti @gsarti_
2K Followers 2K Following PhD Student @GroNLP 🐮, core dev of @InseqLib (https://t.co/tTjrg26ygQ). Interpretability ∩ HCI ∩ #NLProc. Prev: @AmazonScience, @Aindo_AI, @ItaliaNLP_Lab.
Google DeepMind @GoogleDeepMind
943K Followers 275 Following We’re a team of scientists, engineers, ethicists and more, committed to solving intelligence, to advance science and benefit humanity.
Alexis Conneau @alex_conneau
24K Followers 113 Following Audio AGI Research Lead @OpenAI - GPT-Next - Past: XLM, Unsupervised ASR, Unsupervised MT, Wav2vec 2.0/XLSR, MUSE, Unsupervised cross-lingual transfer
Duarte Alves @DuarteMRAlves
7 Followers 33 Following
Manuel Faysse @ManuelFaysse
199 Followers 232 Following NLP (LLMs) & ML Privacy - PhD Candidate @CentraleSupelec Prev: @imperialcollege, @epfl, @La_UPM
Luke Zettlemoyer @LukeZettlemoyer
8K Followers 2K Following
Roger This @RogerThisdell
1K Followers 204 Following "When there is a centre to the knowing there is dukkha" Meditation/Phenomenology tutoring for (de)objectifying the mind.
Andrew Gallimore @alieninsect
29K Followers 662 Following Tokyo. Computational neurobiologist, chemical pharmacologist, writer interested in psychedelics, especially DMT. Reality Switch Technologies OUT NOW
John Schulman @johnschulman2
39K Followers 609 Following Cofounder @openai, lead post-training for ChatGPT and the API. Interested in reinforcement learning, alignment, birds, jazz music
Eli Steele @Hebro_Steele
27K Followers 368 Following Filmmaker. American. Principles cannot be canceled. Substack: https://t.co/zURrTUrmAz & https://t.co/8wLzgRDGOy https://t.co/NQkInYewt9
Teortaxes▶️ @teortaxesTex
7K Followers 1K Following Ours is the age of unaligned utilitarians. Other problems are relatively unimportant, but sometimes I tweet about them anyway. (кто/кого)
Sweta Agrawal @swetaagrawal20
946 Followers 1K Following Postdoc Researcher @itnewspt | Ph.D. @ClipUmd, @umdcs #nlproc
Rochelle Choenni @ChoenniRochelle
104 Followers 179 Following PhD candidate in NLP at the University of Amsterdam. I am supervised by prof. Ekaterina Shutova (UvA) and dr. Dan Garrette (Google Research).
María Benavente @_mariabg
1K Followers 894 Following I like data and the stories behind it 🙆🏽✨ Language Engineer @apple – helping build Siri Opinions are my own, etc
Beatriz Borges @obiwit
71 Followers 68 Following #NLProc PhD student at #ICepfl - aiming to better align language models with us!
Shijie Chen @ShijieChen98
92 Followers 142 Following
Wenda Xu @WendaXu2
675 Followers 319 Following PHD candidate at UCSB’s NLP lab coadvised by William Wang and Lei Li
Pierre Colombo @PierreColombo6
450 Followers 1K Following Associate Professor at Université Paris Saclay - CentraleSupelec - NLP - GenAI
Robin Carhart-Harris @RCarhartHarris
60K Followers 3K Following Ralph Metzner Distinguished Professor of Neurology & Psychiatry, UCSF. Psychedelic neuroscientist and psychologist.
Aakanksha Chowdhery @achowdhery
7K Followers 3K Following LLMs @ Google DeepMind :: PaLM, Gemini // Previously @MSFTResearch, @Stanford, @Princeton // views my own and subject to change
Rick Zeifman @RickZeifman
524 Followers 302 Following Postdoctoral Fellow, @nyulangone Center for Psychedelic Medicine Researching borderline personality disorder, trauma-related sequelae, and psychedelic therapy
Matt Post @mjpost
2K Followers 2K Following Machine translation research for big tech and big academia and director of the @aclanthology. Tweets here are mostly personal.
Dan Deutsch @_danieldeutsch
267 Followers 78 Following Research Scientist at Google Translate working on text generation evaluation
David Mortensen @dmort27
2K Followers 2K Following I make colorless green GPUs sleep brriously @LTIatCMU. phonology • morphology • language change • #NLProc data resources.
Susan Zhang @suchenzang
20K Followers 503 Following @ Google Deepmind. Past: @MetaAI, @OpenAI, @unitygames, @losalamosnatlab, @Princeton etc. Always hungry for compute.
Leena Mathur @lmathur_
618 Followers 906 Following PhD student @SCSatCMU @LTIatCMU. I work on multimodal AI, embodied AI, & human-machine interaction w/ @lpmorency. prev @USC, @Caltech, @EPFL
Andrew McCalip @andrewmccalip
67K Followers 884 Following Building space capsules and robots @vardaspace. Building silly stuff @ https://t.co/UQ3XclTUSF Former: Co-Founder Cosine Additive, acquired by GE
Michael Saxon @m2saxon
2K Followers 1K Following CS PhD cand @ucsbNLP 🌊🌴 @NSF GRFP 🧐analyzing semantics in generative lang/img AI models🤖 Big tech ex-intern. BS/MS @ASU 🌵🏜 🔜 @AMD opensrc GenAI RS intern
Shreya Gupta @Shreyagupta08
5K Followers 2K Following ML @Nvidia | Prev: @LaminiAI, MS @Stanford, Google Translate, @WWCode_Delhi, @MLNerdieDelhi, @ipam_ucla & WTM Scholar'19.
Ohad Rubin @OhadRubin
712 Followers 3K Following P.hD student. Researching Natural Language Processing at Tel Aviv University. Let's have more paperclips? 📎⏩
Ruben Laukkonen @RubenLaukkonen
5K Followers 1K Following The empirical and the ineffable walk into a bar. Ass’t Professor x Zen Fool. https://t.co/xyT5UbvdaU
Asher Trockman @ashertrockman
470 Followers 134 Following CS PhD student at Carnegie Mellon University
Darcey Riley @DarceyNLP
853 Followers 574 Following Computational epistemologist. Grad student @ND_CSE working on #NLProc with Prof. @davidweichiang.
Mengzhou Xia @xiamengzhou
3K Followers 618 Following PhD student @princeton_nlp, MS @CarnegieMellon, Undergrad at Fudan.
Nat McAleese @__nmca__
3K Followers 305 Following Superalignment by models helping humans help models help humans at OpenAI. Previously @DeepMind. Views my own.
Pedro Martins @PedroHenMartins
75 Followers 586 Following Research Scientist at Unbabel | PhD in Machine learning and NLP | Liberal
Amanda Bertsch @abertsch72
1K Followers 673 Following PhD student @LTIatCMU / @SCSatCMU, researching text generation + summarization | she/her | also @ abertsch on bsky or https://t.co/L4HBUh0R9f or by email (https://t.co/bsHqwIMFPL)
Bruno Martins @bgmartins
492 Followers 560 Following Associate Prof. at Técnico (@istecnico) and researcher at INESC-ID (@InescID). Mostly AI, NLP and GIScience. Occasionally music, synths and comics.
Rodrigo Mira @RodrigomiraA
290 Followers 571 Following Researcher @MetaAI, PhD grad @imperialcollege, ex-research intern 2x @MetaAI 1x @sony_jpn. Audio-visual speech + SSL + generative modelling.
Daniel Fried @dan_fried
3K Followers 797 Following Assistant prof. @LTIatCMU @SCSatCMU, working on NLP: language interfaces, applied pragmatics, language-to-code, grounding. 🐘: @[email protected]
UTTER @UTTERProject
63 Followers 279 Following UTTER - Unified Transcription and Translation for Extended Reality - is a collaborative Research and Innovation project funded under Horizon Europe
Sherry Tongshuang Wu @tongshuangwu
5K Followers 1K Following Assist. Prof @SCSatCMU , CS PhD @uwcse. HCI+AI, map general-purpose models to specific use cases! prev. intern @MSFTResearch @GoogleAI @Apple. She/her.
Mel Andrews @bayesianboy
26K Followers 9K Following PhD philosophy of science, AI ethics, mathematical modelling, machine learning, applied maths, science reform. Views expressed here are personal.
Javier Ferrando @javifer_96
275 Followers 477 Following PhD Student @la_UPC. Interpretability in NLP💡
Now that speech discretization is getting good, it's time to revisit **cross-modal multi-tasking** Turns out we can now perform speech-to-text and text-to-text multi-tasking WITHOUT any modality specific parameters or cross-modal regularization Paper: arxiv.org/abs/2309.15826
I used to think of speech and text as totally distinct: Speech = continuous Text = discrete But now **speech discretization** is getting good Does this mean that speech systems will start adopting more text-based methods? SpeechLMs are a hot topic already 🤔 1/N
I'm super proud of the work we have been doing in Tower! Check the paper to see how we've developed the strongest open-weight LLM for translation.
Today we release the Tower paper! 🗼 Tower is an open-weight suite of multilingual models — built on top of LLaMA-2 — for translation-related tasks. It supports 10 different languages. Paper: arxiv.org/pdf/2402.17733… Models and data: huggingface.co/collections/Un… 🧵Thread below.
🌐Exciting News in Machine Translation! 🚀MetricX-23, our SOTA evaluation metric, is now OPEN-SOURCE in PyTorch/Transformers! 🎉There are three model sizes available, all trained on 1m+ human judgments of MT quality! 🔗Code github.com/google-researc… 🔗Paper www2.statmt.org/wmt23/pdf/2023…
How do modern RNNs/SSMs such as Mamba perform on in-context learning tasks? How do they relate to attention-based models like Transformers? We find that modern RNNs can implement attention and that they leverage it to solve ICL tasks in an attention-based manner! (1/6)
How can we check LLM outputs in domains where we are not experts? We find that non-expert humans answer questions better after reading debates between expert LLMs. Moreover, human judges are more accurate as experts get more persuasive. 📈 github.com/ucl-dark/llm_d…
Very excited to share the paper from my last @GoogleAI internship: Scaling Laws for Downstream Task Performance of LLMs. arxiv.org/pdf/2402.04177… w/ Natalia Ponomareva, @hazimeh_h, Dimitris Paparas, Sergei Vassilvitskii, and @sanmikoyejo 1/6
I'm excited to share that our paper "Non-Exchangeable Conformal Risk Control" (with @chryssaZrv @dnnslmr @andre_t_martins) has been accepted at #ICLR2024. Check out the updated version of the paper: arxiv.org/abs/2310.01262. See you in Vienna!
Conformal prediction has recently sparked a lot of interest, but what if coverage is not your main concern? What if the data is not i.i.d.? Check our new preprint on “Non-Exchangeable Conformal Risk Control” with @chryssaZrv @dnnslmr @andre_t_martins: arxiv.org/abs/2310.01262. 1/N
[1/8] Our new work (w/ @AriannaBisazza @gchrupala @MalvinaNissim) is finally out! 🎉 We introduce PECoRe, an interpretability framework using model internals to identify & attribute context dependence in language models. 📄Paper: arxiv.org/abs/2310.01188 #NLProc #neuralempty
[5/8] PECoRe is a contrastive framework building on prev work by @kayo_yin @psanfernandes @j_vamvas to enable the end-to-end extraction of context-sensitive generated words and contextual cues contributing to their predictions using only model internals.
Thrilled to share TowerLLM-7b-v0.1: a 7B multilingual LLM that excels at translation-related tasks. Check the blog post and the 2 models released on Huggingface: TowerBase and TowerInstruct.
Introducing Tower our cutting-edge multilingual #LLM for translation-related tasks! 🚀 With 7B parameters and support for 10 languages, Tower dominates in pre-translation tasks and machine translation. 🌎 Explore the future of #NLP now 👉 hubs.li/Q02g7_9B0
Great to see a paper about MBR Decoding being recognized with a best paper award. In 2021, @BryanEikema's work brought MBR back into business and I am obviously a big fan and my group is pushing on this as well. aclanthology.org/2022.emnlp-mai… MBR Decoding is sooo much better than…
Best short paper is about minimum Bayes risk decoding. #EMNLP2023
*Human feedback* was the necessary secret sauce in making #chatgpt so human-like But what exactly is feedback? And how can we leverage it to improve our models? Check out our new survey on the use of (human) feedback in Natural Language Generation! arxiv.org/abs/2305.00955 1/16
Can we not criticize LLM but pinpoint errors it makes and automatically guide it with fine-grained actionable feedback? Can we formulate iterative refinement into a local search problem, simulated annealing? My cool summer intern work @Google @_danieldeutsch @markuseful @ucsbNLP
Large language models (LLMs) can make small talk with you. But can they navigate more difficult real-life social scenarios? 👋 Meet SOTOPIA sotopia.world - our new multi-agent social environment from CMU that answers this question (collab w/ @nlpxuhui et al.). 🤖
Existing methods for aligning language models with human preferences (prompting, labeling) narrowly restrict how preferences may be specified or place the burden entirely on the human. We use LMs to *interactively ask questions* about a human’s preferences. 🧵👇 1/
My first post on medium explores probability modelling with Markov Chains through an exploration of the game "Shut the Box". If this sounds interesting, feel free to give it a read!
Modeling Games with Markov Chains - In this article, Kairo Morton shows how Markov Chains can be applied to the game “Shut the Box,” in hopes of inspiring you to use probability to answer your own game related questions: buff.ly/3PGbass