AllenNLP @ai2_allennlp
The AllenNLP team works on language-centered AI that equitably serves humanity. We deliver high-impact research and open-source tools to accelerate progress.
allenai.org/allennlp · Allen Institute for Artificial Intelligence · Joined August 2018
Tweets: 196 · Followers: 14K · Following: 30 · Likes: 77
Introducing our best OLMo yet. OLMo 1.7-7B outperforms LLaMa2-7B, approaching LLaMa2-13B at MMLU and GSM8k. High-quality data and staged training are key. I am so proud of our team making such significant improvement in a short period after our first release. https://t.co/9NNwCxAwj6
Excited to share something that we've needed since the early open RLHF days: RewardBench, the first benchmark for reward models. 1. We evaluated 30+ of the currently available RMs (w/ DPO too). 2. We created new datasets covering chat, safety, code, math, etc. We learned a lot.…
Using our Open Instruct and Tulu 2, we adapt OLMo to acquire different capabilities and safety measures through fine-tuning and Direct Preference Optimization (DPO). The adapted models demonstrate quick improvement on popular reasoning tasks such as MMLU and TruthfulQA, and on…
Don't miss the deadline: Just 1 day left to apply to be a Predoctoral Young Investigator with AllenNLP! Submit your application by tomorrow, January 15th! --> boards.greenhouse.io/thealleninstit…
There are only 10 days left to apply to the Predoctoral Young Investigator program with AllenNLP! Don't miss your chance to work with world-class researchers creating AI for the common good. Applications must be submitted by January 15th --> boards.greenhouse.io/thealleninstit…
Reminder: Apply for the AllenNLP Predoctoral Young Investigator program! Exercise your passion for taking AI to the next level to solve problems that require modeling, reasoning, and more & collab. with world-class AI researchers. Apply by Jan. 15th --> boards.greenhouse.io/thealleninstit…
LMs are used to process text from many topics, styles, dialects, etc., but how well do they do? 📈 Evaluating perplexity on just one corpus like C4 doesn't tell the whole story 📉 ✨📃✨ We introduce Paloma, a benchmark of 585 domains from NY Times to r/depression on Reddit.
Are you looking to be part of AI for the common good while preparing for a PhD program? Consider a 1-3 year-long stint as a Predoctoral Young Investigator! Apply by Jan. 15th 2024 --> boards.greenhouse.io/thealleninstit…
We're taking applications for Predoctoral Young Investigators! Prepare for a PhD program and partner with top mentors on cutting edge NLP research. Apply by Jan. 15th, 2024 --> boards.greenhouse.io/thealleninstit…
We just released the MADLAD-400 dataset on @huggingface! Big (7.2T tokens), remarkably multilingual (419 languages), and cleaner than mC4, check it out: huggingface.co/datasets/allen…
Just released v0.9.0 of the Dolma toolkit 🍇 Lots of goodies (dataset tokenization support, new taggers, data analysis, etc), but the one I'm most proud of is that we now have.... ✨ proper documentation 💫 check it out at github.com/allenai/dolma/…, or `pip install dolma` 😊
LAST CALL: Apply by October 15th for a summer 2024 Research Internship with the AllenNLP team. World-class mentors and exciting NLP research await! Click and apply --> boards.greenhouse.io/thealleninstit…
Time is running out: Apply for a Summer 2024 AllenNLP Research Internship before the October 15th deadline and work with world-class mentors on NLP research. Visit this link to apply --> boards.greenhouse.io/thealleninstit… (internships designed for PhD students)
Don't forget to apply for a Summer 2024 Research Internship with the AllenNLP team by October 15th. Collaborate with world-class mentors on NLP research and contribute to AI for the common good! Apply at boards.greenhouse.io/thealleninstit… (internships designed for PhD students)
The deadline for summer 2024 Research Internships on the AllenNLP team is October 15th. If doing NLP research with world-class mentors sounds exciting to you, apply at: boards.greenhouse.io/thealleninstit… (internships designed for PhD students)
The deadline for Spring 2024 Research Internships at AllenNLP is July 15th, in two weeks. If you think 2024 is a great time to do NLP research with top mentors, apply at boards.greenhouse.io/thealleninstit…!
#nlphighlights 139: Kevin Yang (people.eecs.berkeley.edu/~yangk/) tells us about the challenges involved in generating coherent long stories from language models and his recent approach for doing so by recursively prompting these models and revising the outputs. soundcloud.com/nlp-highlights…
There are 12 days left to apply to the Predoctoral Young Investigator program with AllenNLP! Applications must be submitted by February 15th: boards.greenhouse.io/thealleninstit…
We just crossed 100,000 organizations on HF! Some of my favorites: - The MLX community for on-device AI: huggingface.co/mlx-community - The @AiEleuther org with over 150+ datasets: huggingface.co/EleutherAI - The @Bloomberg org to show big financial institutions can use the hub:…
To me, the coolest part of this work is the code that can find the 1000 most common 100-grams in a multi-TB corpus on a single machine in a few hours. We are going to make bloom filters cool again, one project at a time!
What's In My Big Data? A question we've been asking ourselves for a while. Here is our attempt to answer it. 🧵 Paper - arxiv.org/abs/2310.20707 Demo- wimbd.apps.allenai.org
Congratulations to this outstanding set of reviewers including @tanmay2099 and @LucaWeihs from the PRIOR team @allen_ai and @anand_bhattad who recently interned with us. As an AC, good reviewers are such a joy to work with! They are objective, detailed, responsive and timely.
HUGE thank you for your service to our #ICCV2023 outstanding reviewers!
@saurabh_shah2 @jxmnop @briiterbeams jaaaaaack apply!!!! 🥹
Rumors say that @allen_ai has the best snacks ever! Come intern with us!!
@LeonDerczynski @6lackfield @ai2_allennlp @Meta @MetaAI @huggingface Quite the opposite! We released Danish parallel data as part of CCMatrix some time ago: opus.nlpl.eu/CCMatrix.php NLLB-200 data is focused on lower resource languages :)
@6lackfield @ai2_allennlp @Meta @MetaAI Yes it does! It has parallel data for Urdu in and out of 25 languages. You can try this with @huggingface's datasets:
@ai2_allennlp @Meta @MetaAI Excited to share that the dataset we created to train the massively multilingual machine translation model NLLB-200 is now available for download, thanks to @ai2_allennlp!
Curious to see what's in here! #NLProc #lowresourceNLP #NLLB
🎉Dataset Release 🎉 We reproduced and are releasing the mined bitext training data for @Meta @MetaAI's No Language Left Behind NLLB-200 models! 200 languages, bitext for 148 English-centric & 1,465 non-English-centric language pairs! ~450 GB of text! huggingface.co/datasets/allen…
@ai2_allennlp @Meta @MetaAI New to ML/AI ethical issues, but this is great, correct? We should have access to all training sets if we are going to allow algorithms to be proprietary, no? FYI I think that is a HORRIBLE idea, but it is the road we are on for now.
@ai2_allennlp @Meta @MetaAI Does it have Urdu parallel corpus?
Amazing week for open source in the Software 2.0 era. 450 GB of text 🤯
This is BIG
nice, we need this
Wow! This is huge!
Meta and AI2 researchers just released training data for @metaai’s “No Language Left Behind” language models, teaming up to promote #OpenScience and inclusivity in #NLProc.
Learn how to use CheckLists with AllenNLP! Easily construct suites of tests defined by inputs and expected outputs to check for robustness across several general linguistic capabilities. medium.com/ai2-blog/using… #allennlp #nlproc @ai2_allennlp
How did we scale up the AllenNLP library to train 11 billion+ parameter models on a single node? AI2's @epwalsh10 from the @ai2_allennlp team wrote a great deep dive about exactly this, today on the AI2 Blog: medium.com/ai2-blog/scali… #NLProc #AllenNLP
@yoavgo For me, @explosion_ai, @huggingface, @ai2_allennlp
@ericjang11 @zacharylipton There are at least 2-3 papers in just messing around with components we already have in @ai2_allennlp. I am always amazed that not more people pick off the low-hanging fruit. Not glamorous enough?