Nitish Gupta @nitish_gup
Research Scientist at @GoogleAI in Natural Language Processing & Machine Learning | PhD from University of Pennsylvania | Undergrad from IIT Kanpur
nitishgupta.github.io
Bengaluru, India
Joined April 2016
Tweets: 186
Followers: 1K
Following: 481
Likes: 2K
Excited to release IndicGenBench: A suite of evaluation datasets for multiple tasks in 29 Indic languages! Evaluations over many LLMs reveal huge room for improvement. IndicGenBench is multi-way parallel opening doors for interesting research! Great work by @Harman26Singh!
Super excited to be part of this journey with @partha_p_t and other amazing colleagues at Google Research! #GoogleForIndia
#NLPaperAlert: QA Dataset Explosion!🔥 A survey of 200+ QA/RC datasets proposing a taxonomy of formats & reasoning skills. Also in the bag: modalities, conversational QA, domains & beyond-English data. Honored to work on this with @nlpmattg & @IAugenstein arxiv.org/abs/2107.12708
Some life updates: I defended my PhD in April! Special thanks to my advisor @DanRothNLP and @sameer_ & @nlpmattg for being there during the journey! Very excited to have joined @GoogleAI as a research scientist this week! Looking forward to continuing research in NLU 😀
Really excited about this work w/ @sameer_ and @nlpmattg! We show that enforcing consistency between semantic parses of related NL utterances leads to improved learning in weakly supervised setting! Also, it got accepted to ACL 2021! Paper - arxiv.org/abs/2107.05833
Thrilled to receive the NSF CAREER award! Looking forward to exciting research & collaborations. Thanks to my mentors, colleagues, & esp students. Also thanks to @shsriva for his support & to this gentleman for waiting until the proposal submission ddl to arrive into the world.
A reminder that the AKBC deadline (June 17th) is 4 weeks away! Please help RT and submit your work! The submission site is open and all the details can be found at: akbc.ws/2021/cfp/ #AKBC2021
Check out SacreROUGE, from my (former) lab mate @_danieldeutsch I am a big fan of Sacre* metrics libraries and would like to see more. github.com/danieldeutsch/…
It wasn't easy to review #nlproc research for an extremely unusual year, but here's an attempt. Listen to the episode or see below for the main themes I covered, and links to some of the representative papers for each (mostly in order I talked about them). Warning: Long thread...
In case you haven't seen enough papers on arXiv recently a new EMNLP findings paper #emnlp2020 led by Inbar Oren: arxiv.org/abs/2010.05647 We know already seq2seq models generalize badly to new compositions in semantic parsing... 1/3
Come join us in the Q&A session tomorrow on Obtaining Faithful Interpretations from NMNs. We'll be happy to answer questions, discuss, or just chat! (@ben_bogin @sanjayssub @sameer_ @JonathanBerant @nlpmattg) virtual.acl2020.org/paper_main.495…
It was great to meet fellow QA researchers at the #acl2020nlp Birds-of-a-Feather QA Session. Great job organizing @DanielKhashabi and nice to see @JonathanBerant @HannaHajishirzi @nitish_gup @sewon__min. Seems we should take the discussion to a future workshop? 😊
1/ We have a very poor understanding of privilege in India. This has far-reaching effects on the products we design and build. I am writing this in English. You are able to read it. Congrats - we are both privileged.
Happy to host our first speaker @nitish_gup for the summer zoom seminar series at @UUDataScience. He will be speaking on neural module networks for complex reasoning over text, more details here : datascience.utah.edu/club/summer-se…. Stay tuned for upcoming exciting talks. @UtahSoC @UUtah
Reading arxiv.org/pdf/2005.00724… @sanjayssub et al. Evidence of "non-faithfulness" in NMNs: the disentangled neural modules we get from e.g. CLEVR don't arise in models trained on natural data. But what does it mean for an NMN to be faithful? 1/
ACL paper w/ Qiang Ning, @DanielKhashabi and Dan Roth: arxiv.org/abs/2005.04304. We train LMs to better understand time (events' duration, frequency, etc.) and to give quality predictions. Our model also does better on related extrinsic tasks (e.g., sub-event RE) via fine-tuning.
Obtaining Faithful Interpretations from Compositional Neural Networks deepai.org/publication/ob… by Sanjay Subramanian et al. including @nitishgupta4291, @sameer_, @JonathanBerant, @ml_gardner #Computation #Language
Dataset biases can easily overpower inductive biases giving us a false impression of progress. It is critical to have both in-distribution hard splits and out-of-distribution splits to measure generalization. We examine language grounding 1/n x.com/arjunreddy2613…
🥞 Fresh from the press: a new review of #KnowledgeGraph related papers from ongoing ICLR 2020. Among other things we'll review what's happening in complex QA, KG embeddings and entity matching with graph embeddings @iclr_conf medium.com/@mgalkin/knowl…
Another contribution from our Languages team to support further research on AI for Indian languages.
Very pleased IndicGenBench is now out, lots of headroom for PhD students to start cracking 🙂 Particularly happy about the 29 languages coverage, including first-time generative evals (or any evals?) for many languages, e.g., Garhwali, Konkani, Rajasthani, etc. Enjoy! @GoogleAI
This is such an important finding. Performative equity for low resource languages, and by proxy - for underrepresented communities - requires efforts that go beyond just better models. Infrastructural changes (such as sliding scale on API cost for languages) would go a long way.
Tokenization: There is a large gap in tokenizer fertility across Indic languages. As a consequence, higher fertility allows fewer in-context examples to be part of the input. This leads to large inference compute requirements and cost while using API-based LLMs.
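To make the fertility point concrete, here is a minimal sketch, assuming the common definition of fertility as the average number of tokens per whitespace-separated word. The `chunk_tokenizer` helper is a hypothetical stand-in for a real subword tokenizer, and the budget and fertility numbers are illustrative only, not figures from the paper.

```python
from typing import Callable, List


def chunk_tokenizer(size: int) -> Callable[[str], List[str]]:
    """Hypothetical toy tokenizer: splits each word into chunks of `size` characters."""
    def tokenize(text: str) -> List[str]:
        tokens = []
        for word in text.split():
            tokens.extend(word[i:i + size] for i in range(0, len(word), size))
        return tokens
    return tokenize


def fertility(texts: List[str], tokenize: Callable[[str], List[str]]) -> float:
    """Average number of tokens produced per whitespace-separated word."""
    total_tokens = sum(len(tokenize(t)) for t in texts)
    total_words = sum(len(t.split()) for t in texts)
    if total_words == 0:
        raise ValueError("need at least one word to compute fertility")
    return total_tokens / total_words


def examples_that_fit(context_budget: int, words_per_example: int, fert: float) -> int:
    """How many in-context examples fit in a token budget at a given fertility."""
    tokens_per_example = words_per_example * fert
    return int(context_budget // tokens_per_example)


if __name__ == "__main__":
    # Illustrative numbers only: same budget, same example length, different fertility.
    for fert in (1.5, 6.0):
        print(fert, examples_that_fit(4096, 100, fert))
```

With the same token budget and the same example length in words, the higher-fertility setting leaves room for far fewer examples, and each request also costs more tokens.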
My awesome colleague talking about his inspiring work on language inclusivity...
Thank you @bruke_kifle and #ACMByteCast for hosting me and for the opportunity, I very much enjoyed our conversation!
@Harman26Singh @simi_97k Fantastic work. Which open source LLM performs the best as per your benchmark?
New work on evaluating LLMs for generation in Indic Languages: IndicGenBench 👉5 diverse tasks, 29 Indic languages, >100k examples. 👉Curated using human translations ensuring high quality. 👉Multi-way parallel dataset. arxiv.org/abs/2404.16816 github.com/google-researc… (1/n)
@divy93t @Harman26Singh Thanks for your kind words and consistent support! Thanks to @Harman26Singh and @nitish_gup for leading IndicGenBench effort in awesome collaboration with @ShikharSSU and @dinesh_tewari1
IndicGenBench and several works from @partha_p_t 's group are a classical example of establishing a long-term research agenda and then consistently making in-roads through deep work! Great work, @Harman26Singh @partha_p_t and co-authors!
29 languages!
🤔 Want to push the boundaries of multilingual NLP? Excited to share our recent work on an evaluation benchmark for generation capabilities in low-resource Indic languages. What will you build with it? 💻 #NLProc github.com/google-researc… arxiv.org/abs/2404.16816
29 languages! And the first ever eval for many of them. Check this cool work out!
See our paper for more details and results: Arxiv link: arxiv.org/abs/2404.16816 Dataset link: github.com/google-researc… Work done @GoogleAI w/ awesome collaborators and mentors: @partha_p_t @nitish_gup @ShikharSSU @dinesh_tewari1
We comprehensively study the effect of in-context examples, and compare it with fine-tuning. On Indic languages, in-context learning is often better than fine-tuning on thousands of examples. Transfer from a high resource Indic language is much better than transfer from English.
We evaluate multiple open and proprietary LLMs such as GPT-3.5, GPT-4, PaLM-2, mT5, Gemma, BLOOM and LLaMA on our benchmark. Evaluations reveal a significant performance drop going from higher resourced languages to medium and further to lower resourced languages.
👉IndicGenBench includes 5 user-centric tasks: Cross-lingual Summarization (CrossSum-IN), Machine Translation (Flores-In), Multilingual QA (XQuAD-In) and Cross-lingual QA (XorQA-In-Xx, XorQA-In-En). 👉IndicGenBench supports 29 low to relatively higher resource Indic Languages
Was waiting for something like this. These benchmarks will soon be added to the indic_eval (github.com/adithya-s-k/in…) library to easily evaluate any LLM and also integrate it with the Indic LLM leaderboard. Great work @Harman26Singh @nitish_gup
Big Thanks to my managers across my stints in MTV & BLR - Krishna, Sarvjeet, Nitish and Partha