zhenheng tang @ZhenhengT
CS PhD Candidate, Machine learning and MLsys. Homepage: https://t.co/bDJ1U4pOrD Google Scholar: https://t.co/XQRCvTVDvT
wizard1203.github.io
Hong Kong
Joined November 2018
Tweets: 27
Followers: 37
Following: 302
Likes: 227
insightful work
amazing and interesting work!
Why in neural networks the learning rate can transfer from small to large models (both in width and depth)? It turns out that the sharpness dynamics can explain it. Check out our new work! arxiv.org/abs/2402.17457 w/ @alexmeterez (co-first), @orvieto_antonio and T. Hofmann
🚀 Excited to introduce #DistriFusion, our latest innovation to supercharge high-resolution image generation using diffusion models across multiple GPUs! 🌟 Achieve up to 6.1× speedup without sacrificing quality. hanlab.mit.edu/blog/distrifus…, CVPR'24
Video generation will revolutionize decision making in the physical world like how language models have changed the digital world. Interested in the implications of video generation models like UniSim and Sora? Check out our position paper: arxiv.org/abs/2402.17139
A Phase Transition in Diffusion Models Reveals the Hierarchical Nature of Data ift.tt/SLmpbBG
GD with LARGE stepsize induces an oscillatory loss that may sound scary, but the oscillation eventually accelerates optimization, provably Core proof in <= 5 pages, which made me very proud of :) New paper w/ Peter Bartlett, Matus Telgarsky, Bin Yu arxiv.org/abs/2402.15926
The shift from AI model to compound AI system is a super exciting area of innovation for genAI bair.berkeley.edu/blog/2024/02/1…
LLMs will soon become a commodity, there are already dozens of them out there and dozens more are being trained Grok, is the latest one trained on Twitter's data and took less than a few months to train. Soon they will be considered part of the software stack just like…
Interesting takeaways
great analogy
amazing...
Carl Jung: "Until you make the unconscious conscious, it will direct your life and you will call it fate. "
The book Blink is well worth a careful read, especially the first chapter, "The Theory of Thin Slices", which explains how the human brain zeroes in on what matters. x.com/1deepnote/stat…
#WebLLM just completed a major overhaul with typescript rewrite, and modularized packaging. The @JavaScript package is now available in @npmjs . Brings accelerated LLM chats to the browser via @WebGPU . Checkout examples and build your own private chatbot github.com/mlc-ai/web-llm
mark
Recent advances in Hopfield networks of associative memory may be the guiding theoretical principle for designing novel large scale neural architectures. I explain my enthusiasm about these ideas in the article ⬇️⬇️⬇️. Please let me know what you think. nature.com/articles/s4225…
Very interesting work! Log det( ) is a magical function. Realized that when I was doing data clustering via compression back in 2007: people.eecs.berkeley.edu/~yima/psfile/M… -- a paper I am most proud of. Now it seems very likely almost *everything* we do with (deep) learning follows from this.
1/7 🚨Excited to share our #ICML2023 paper w/ @krikamol, @skornblith, @bschoelkopf, @_beenkim. We explore the link between model predictions (Y) & their explanations (E) using the Potential Outcomes framework. arxiv.org/abs/2212.06925 🧵👇🏼
This is perhaps surprising, as norm layers just reweigh every activation output, unlike full weight matrices which "mix" activation outputs. We show that "mixing" does occur even for norm layer tuning, but between pairs of layers.
Guozheng Ma @Guozheng_Ma
11 Followers 139 Following Master student @Tsinghua_Uni, working on Deep Reinforcement Learning.
Itamar Zimerman @ItamarZimerman
254 Followers 333 Following PhD candidate @ Tel Aviv University. AI Research scientist @ IBM Research. Interested in deep learning and algorithms.
Chen Zhang@PKU @chenzhang_zc
63 Followers 198 Following PhD student in PIE Lab (@pielabpku), Peking University (@PKU1898) #NLProc
Xiao Liu @xxxxiaol
156 Followers 232 Following PhD student at PKU #NLProc | Prev Visiting Researcher at UCLA
Allan Zhou @AllanZhou17
1K Followers 447 Following Final-year AI PhD student @Stanford. NN architecture design, learned optimizers, and hparam optimization.
Dong Carlo An @andongverse
31 Followers 140 Following Ph.D. student at CAS, working on Embodied-AI🤖 and Multimodal Learning.
Shangbin Feng @shangbinfeng
1K Followers 1K Following PhD student @uwcse @uwnlp. Understanding and expanding the knowledge abilities of LMs, social NLP, networks and structures. he/him. #水文学家
Andy @AndyNosretep
503 Followers 3K Following The world is much better than it used to be. The world can be much better than it currently is. Dev by day. Dev by night. ex @stripe ex @microsoft
Ningyu Zhang@ZJU @zxlzr
1K Followers 906 Following Associate Professor @ZJU_China. Research interests include NLP, KG.
Yin Fang @YinFang22900365
518 Followers 517 Following Ph.D. student in CS @ZJU_China. Looking for a post-doc position in AI4Science/LLM/KG. Feel free to reach me if you are interested in my research!
Yangqiu Song @yqsong
883 Followers 960 Following Associate Professor at HKUST, working on knowledge graphs, NLP, data mining on texts and graphs
Cunxiang Wang @CunxiangWang
451 Followers 789 Following PhD Candidate @NLPwestlake, advised by Dr Yue Zhang. Interning @AWScloud. Research interest lies at Retrieval and LLMs. Seeking for a RS or PostDoc position.
D @dylan_works_
191 Followers 787 Following
Yupeng Hou @yupenghou97
745 Followers 679 Following PhD student @UCSanDiego. Previously Renmin Univ. of China, Tencent, Alibaba Group, Ant Group. Research on LLMs, RecSys, etc.
urchade @urchadeDS
184 Followers 240 Following PhD student @LipnLab and Research Scientist @FIgroupFR. Working on structured prediction for NLP. Antsatrana 🇲🇬
Uttam Patra @UttamPatra90
85 Followers 1K Following
Sheteaus @Sheteaus187786
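41 Followers 2K Following
跨境物流。 @AlmaField11
100 Followers 3K Following Hello everyone, I am an international freight forwarder from Shenzhen, providing sea, air, rail, and express export shipping services for customers in China and abroad. Main routes: China to Europe, the UK, the US, and Canada. Can ship: toys, food, medicine, cosmetics, e-cigarettes, electronics, daily necessities, medical supplies, brand-name and imitation-brand goods, adult products, local specialties, extra-long and oversized items, etc. Inquiries welcome / WeChat: 158-1529-9914
Mola 相羊 @xiangya94910377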
55 Followers 2K Following a psychology student, curious about neural decoding, cognitive mathematics, emergent communication
Xinyu Yuan @XinyuYuan402
721 Followers 899 Following Transfer learning and generalization problems for representation learning, including different data modalities like knowledge graphs, protein sequences, etc.
Robert Scoble @Scobleizer
504K Followers 68K Following Follow me on my new podcast with AI startups, Unaligned. Tech industry color commentator since 1993. Author/Blogger. Former strategist @Microsoft.
FutureTechInfluencer @FutureTechInfl
1K Followers 5K Following Your go-to source for the latest news, insights, and commentary on high tech and artificial intelligence. 💯 #tech #ai 💯
Qinbin Li @Cubeeli
40 Followers 72 Following Postdoc @UCBerkeley. Machine Learning / Federated Learning / Privacy / Systems
Zhen Fang @Abell_Zhen_Fang
45 Followers 261 Following A Machine Learning Researcher. Making ML reliable for the open world.
inoic bonding @ionic_bondings
1 Followers 30 Following
Gérard Biau @gerardbiau
854 Followers 1K Following Professor at Sorbonne University, Director of Sorbonne Center for Artificial Intelligence #SCAI
Aiden Chaoyang He @ChaoyangHe
732 Followers 1K Following Co-founder at FedML, Inc (https://t.co/NYtWFvGsTK), your generative AI platform at scale
Xin Eric Wang @xwang_lk
7K Followers 1K Following Multimodal and Embodied AI Researcher / Professor @UCSC. Director of https://t.co/Y4swOBag21. AI for Humanity in the long run. he/him
Xinjing Zhou @XinjingC
568 Followers 396 Following PhD Student @MIT_CSAIL, working on database systems.
Zhanke Zhou @zhankezhou
37 Followers 293 Following PhD student at HKBU. Focus on trustworthy machine reasoning for scientific discoveries.
Mengzhou Xia @xiamengzhou
3K Followers 619 Following PhD student @princeton_nlp, MS @CarnegieMellon, Undergrad at Fudan.
TSLA99T @Tsla99T
8K Followers 135 Following TSLA long term bull I own 2 Tesla cars MX/HW4/V12.3.4 MS/HW3/V12.3.6 @tsla99t_eng is my account in English
Qin Ziheng @henryqin1997
3 Followers 3 Following
Albert Gu @_albertgu
9K Followers 90 Following assistant prof @mldcmu. chief scientist @cartesia_ai. leading the ssm revolution.
Kotoba Technologies @kotoba_tech
643 Followers 85 Following Building End-to-End Speech Foundation Models from Japan/USA. Managed by @noriyuki_kojima, CEO and @jungokasai, CTO. Discord: https://t.co/ZA3An4iK0L
Inflection AI @inflectionAI
49K Followers 3 Following We are an AI studio creating a personal AI for everyone. Our first is @pi, a supportive and empathetic conversational AI.
Zhuoran Yang @zhuoran_yang
2K Followers 911 Following Assistant Professor of Statistics and Data Science @Yale
Sherry Yang @mengjiao_yang
2K Followers 342 Following Research Scientist @GoogleDeepMind | PhD Student @UCBerkeley. Previously M.Eng. / B.S. @MIT.
Xiao Liu @xxxxiaol
156 Followers 232 Following PhD student at PKU #NLProc | Prev Visiting Researcher at UCLA
Sean Liu @_seanliu
2K Followers 3K Following Master's student in HCI, AR&VR at NYU. I build interfaces between humans, everyday things, and AI agents.
Tianlong Chen @TianlongChen4
533 Followers 17 Following Incoming Asst. Professor at UNC Chapel Hill (@unccs, @unc). Postdoc, CSAIL@MIT (@MIT_CSAIL) & BMI@Harvard (@Harvard). Ph.D., ECE@UT Austin (@UTAustin). #AI #ML
AK @_akhaliq
310K Followers 3K Following AI research paper tweets, ML @Gradio (acq. by @HuggingFace 🤗) dm for promo follow on Hugging Face: https://t.co/q2Qoey80Gx
Zheng Yuan @GanjinZero
662 Followers 509 Following NLP Researcher. The author of RRHF, RFT and MATH-Qwen. Focus on Medical & Reasoning & Alignment in LLMs. Prev Tsinghua Ph.D.
Eric Jang @ericjang11
69K Followers 3K Following physical AGI at 1X. Author of "AI is Good for You" https://t.co/eFg4WXhg0p
Ziru Chen @RonZiruChen
300 Followers 589 Following Ron | チン シジョ. Ph.D. student @osunlp. Researching #NLProc & #ConvAI. “Cogito, ergo sum.”
Shumin Deng @dsmall2apple1
260 Followers 294 Following Research Fellow at NUS Research Interests: NLP, Structured Prediction, IE, KG, Neuro Symbolic Reasoning, Multi-Agent Collaboration, Knowledge Editing for LLMs
Bill Yuchen Lin 🤖 @billyuchenlin
6K Followers 2K Following Research @allen_ai. I evaluate (multi-modal) LLMs, build agents, and study the science of LLMs. Previously: @GoogleAI & @MetaAI FAIR @nlp_usc
Allan Zhou @AllanZhou17
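1K Followers 447 Following Final-year AI PhD student @Stanford. NN architecture design, learned optimizers, and hparam optimization.
知识分享官 @knowledgefxg
68K Followers 882 Following Passionate about knowledge. In my spare time I share fun, hardcore stuff, including English learning, AI programming, tech software, resource websites, and more. Since you're already here, hit follow 😘.
Saining Xie @sainingxie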
14K Followers 1K Following researcher in #deeplearning #computervision | assistant professor at @NYU_Courant @nyuniversity | previous: research scientist @metaai (FAIR) @UCSanDiego
Dong Carlo An @andongverse
31 Followers 140 Following Ph.D. student at CAS, working on Embodied-AI🤖 and Multimodal Learning.
Future Intelligence @LeverhulmeCFI
12K Followers 766 Following The Leverhulme Centre for the Future of Intelligence. Exploring the nature and impact of AI (Uni of Cambridge, with spokes at Imperial and Berkeley).
Shangbin Feng @shangbinfeng
1K Followers 1K Following PhD student @uwcse @uwnlp. Understanding and expanding the knowledge abilities of LMs, social NLP, networks and structures. he/him. #水文学家
hazyresearch @HazyResearch
7K Followers 1K Following A research group in @StanfordAILab working on the foundations of machine learning & systems. https://t.co/JHK58TDorG Ostensibly supervised by Chris Ré
Chong Liu @ChongLiuCS
509 Followers 355 Following DSI Postdoc @UChicago, Incoming CS Faculty @UAlbany @SUNY, Pilot. Previously @UCSBCS @Amazon. Machine Learning, Optimization, AI for Drug Discovery.
Zhuo Chen @ZhuoCs
411 Followers 676 Following Ph.D. student in Computer Science @ZJU_China | #KG | #MultiModal | #NLProc | #LLM |
Banghua Zhu @BanghuaZ
2K Followers 804 Following PhD @Berkeley_EECS, statistics, info theory, LLM, RL, Human-AI Interactions.
Ningyu Zhang@ZJU @zxlzr
1K Followers 906 Following Associate Professor @ZJU_China. Research interests include NLP, KG.
Xiaohui Chen @XiaohuiChen18
511 Followers 216 Following Associate Professor of Mathematics @USC. I work on statistics and machine learning.
Kaiqing Zhang @KaiqingZhang
761 Followers 399 Following Assistant Professor @UofMaryland; Previously {@MIT, @SimonsInstitute, @ECEILLINOIS, @Tsinghua_Uni}; Control + Game Theory + Reinforcement Learning
Jingfeng Wu @uuujingfeng
748 Followers 910 Following Postdoc @SimonsInstitute @UCBerkeley; alumnus of @JohnsHopkins @PKU1898; deep learning theory, optimization, and statistical learning.
Zixiang Chen @_zxchen_
988 Followers 2K Following Ph.D. student in CS @UCLA. 📚 B.S. from Tsinghua Univ. 🔍 Interested in Representation Learning, Generative Model & Reinforcement Learning.
Yin Fang @YinFang22900365
518 Followers 517 Following Ph.D. student in CS @ZJU_China. Looking for a post-doc position in AI4Science/LLM/KG. Feel free to reach me if you are interested in my research!
Yangqiu Song @yqsong
883 Followers 960 Following Associate Professor at HKUST, working on knowledge graphs, NLP, data mining on texts and graphs
Yupeng Hou @yupenghou97
745 Followers 679 Following PhD student @UCSanDiego. Previously Renmin Univ. of China, Tencent, Alibaba Group, Ant Group. Research on LLMs, RecSys, etc.
Cunxiang Wang @CunxiangWang
451 Followers 789 Following PhD Candidate @NLPwestlake, advised by Dr Yue Zhang. Interning @AWScloud. Research interest lies at Retrieval and LLMs. Seeking for a RS or PostDoc position.
Chujie Zheng @ChujieZheng
507 Followers 494 Following LLM alignment and safety #LLMs | Visiting Scholar @CS_UCLA | PhD student @TsinghuaCoAI | he/him/his
Yi-01.AI @01AI_Yi
5K Followers 8 Following A global company building AI 2.0 platform and applications
Siyan Zhao @siyan_zhao
780 Followers 486 Following CS PhD student @UCLA | Interested in decision making, LLMs, generative models | Bachelors @UofT EngSci
Haoyi Qiu @HaoyiQiu
443 Followers 559 Following First year PhD student @UCLA 💙 BS in CS&Math @UMich〽️ #NLP 🌷
Xian Li @xl_nlp
2K Followers 242 Following Research Scientist @MetaAI. NLP, ML. Opinions are my own.
What a crazy week. And why I am (still) waiting for the Llama 3 paper, a little write-up on using and finetuning pretrained transformers! magazine.sebastianraschka.com/p/using-and-fi…
Say hello to Grok-1's new PyTorch+HuggingFace edition! 🚀 314 billion parameters, 3.8x faster inference. Easy to use, open-source, and optimized by Colossal-AI. 🤖 Dive in: #Grok1 #ColossalAI🌟 github.com/hpcaitech/Colo… Download Now: huggingface.co/hpcai-tech/gro…
One year ago, we first introduced BEHAVIOR-1K, which we hope will be an important step towards human-centered robotics. After our year-long beta, we’re thrilled to announce its full release, which our team just presented at NVIDIA #GTC2024. 1/n
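Don't go for a master's in AI. Even at the best schools, the coursework is out of touch and behind the times. Apply to every generative AI position you see, whatever the size of the company. Right now is actually the easiest time to get into the field, because very few people have experience. By the time you actually finish school, the competition will be fiercer, and the bubble may well have burst.
A friend was chatting with me recently. His child graduated with a CS bachelor's degree from a top university and now works at a big tech company in Austin, and he asked me how to catch this AI wave. I said: go to Silicon Valley, where the best AI talent in the world gathers and people influence one another. An AI team as outstanding as Tesla's would not be possible anywhere else: not in Seattle, not in Boston, not in Austin, and even less so in other cities.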
⚡️Thrilled to share our new #ICLR2024 paper on #DataInf! We present an efficient algo to attribute outputs by #LLMs + text-to-image #diffusion #AI to training samples. It works particularly well for LoRA tuned models Paper arxiv.org/abs/2310.00902 Code github.com/ykwon0407/Data…
On the Last-Iterate Convergence of Shuffling Gradient Methods ift.tt/iBo6SX0
Open-sourcing Kotomamba, our distributed training library for Mamba (a state space model that outperforms transformers). We'll release a tech blog post and HuggingFace models soon. Stay tuned! Project led by @okoge_kaz and @hiroto_kurita github.com/kotoba-tech/ko…
Exciting edge AI innovations across the full stack: meet 🌟VILA (CVPR'24), a multi-image visual language model, 🔥AWQ (MLSys'24), a 4-bit LLM quantization algorithm revolutionizing model efficiency, and 🚀TinyChat, powering visual language model inference on edge devices:
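More and more, I think the feeling that time passes faster as you get older is just an illusion of office workers: as the share of passive time keeps growing and active time keeps shrinking, life experience becomes less rich, and that creates the illusion.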
@bryan_johnson Soon almost everyone realizes capital is not the constraint, time is the constraint.
New!🚨📰 Mamba is a cool, efficient, and effective DL architecture, but what do we know about Mamba? How does it capture interactions between tokens? Can it be the attention-killer? In our work, "The Hidden Attention of Mamba Models" we provide answers to these questions! [1/4]
**Training dynamics of attention** 1/📜Introducing our latest paper: "Training Dynamics of Multi-Head Softmax Attention for In-Context Learning: Emergence, Convergence, and Optimality." Link: [arxiv.org/abs/2402.19442] Joint work with @siyuc3141, @HeejuneSheen, and @0920wth
⚠️ Jailbreaking attacks for LLMs are crazy. How should we efficiently defend them? Check out 🛡️𝕊𝕒𝕗𝕖𝔻𝕖𝕔𝕠𝕕𝕚𝕟𝕘, a simple inference-time defense method. We found that fine-tuning with more safety data may not work well in defending wild jailbreaking attacks, while make…
As #Sora shows us, the future of diffusion model will be HIGH resolution with intense compute. Can we break the speed barrier with distributed inference like LLMs? Meet DistriFusion, a distributed inference framework speeding up SDXL by up to 6.1×. It’s accepted by CVPR24!(1/6)