Teortaxes▶️ @teortaxesTex
Ours is the age of unaligned utilitarians. Other problems are relatively unimportant, but sometimes I tweet about them anyway. (кто/кого: "who/whom") Joined September 2010
Tweets: 26K · Followers: 7K · Following: 1K · Likes: 33K
Tired but true lesson: small teams are more agile because they *don't need to* carry the weight of corporate politics. This can more than compensate for having fewer heads to bash at real problems. Case in point: BLOOM. I think Yi doesn't even have a separate "scaling team". https://t.co/Ck2ImZpDpL
@unsorsodicorda @TheXeophon @Teknium1 We should check this with other models at different quantizations and vocab sizes, but the idea that quant'd L3 gets hurt more due to bigger vocab (at same model dimension) reminds me of this observation by @kalomaze regarding Chinese models randomly outputting Mandarin tokens.
The midwit meme is technically wrong (above-avg as the mean) but spiritually on point; here's a case. Midwits are certain about superiority – even rarity! – of their fashionable beliefs, which merely deny simple wisdoms of «the herd». More People Le Bad! I Am Very Intelligent. https://t.co/2aliVRRFBZ
Btw RULER was updated with new models including Phi-3 and L3-70B (RoPE theta = 16M). Up to 32K, it does slightly better than Mistral 8x7B and ≈on par with 8x22B and CR+. Imo Meta has every reason to expect GPT-4 quality 128K in the long context update. github.com/hsiehjackson/R… https://t.co/NCRPWq3cAZ
@jd_pressman is building mistral-philosopher-king while y'all benchmarking sama bread crumbs for no pay
This is actually very bullish for L3's. Their "Searching Needle Function" is a more semantically challenging task than default needle retrievals. I'm more willing to buy that L3 generalizes to long ICL now https://t.co/zuRf3lpzpG
> entirely self-aligned code LLM Btw "self-alignment" is >50% of the edge of RLHF'd LLMs. You need the model to refine skills grounded in its innate knowledge – not memorize specific inexplicable insights and learn to hallucinate generally. Reminder: youtu.be/hhiLw5Q_UFg https://t.co/Sql2SBDJpQ
Fun test: In the same context as the puzzle interaction, L3-70B and L3-8B both believe that the other one is 70B and they're 8B. In a fresh context, both identify their power level correctly and provide non-terrible reasoning. I think an 8B can be trained into a very decent RM.
Lower and lower and yet lower, with no end in sight Then at 80 you're offered MAID by a humanoid maidbot Oh, but "you" go to a digital Heaven as an LLM finetune over your social media poasting. Mind is simply information, amirite? This is the plausible bad end for humanity.
@teortaxesTex Made the exact comment, it's nl2code which is very different and really useful for semantic retrieval
@teortaxesTex Bloom is the worst model normalised by swe/researcher hours perhaps. Can't go lower than that. It's the absolute lower bound.
It's not just the 70b instruct model that's still top, but the base too wtf. This likely means everyone has their embedding layer crapping out on them or something during training. All the full finetunes feel dumber.
15 people? That's cute. 😬 At reka our pretraining team is 3-5 people at max, who were all also working >50% time in other projects. 🫠
Our latest model Inflection-2.5 (inflection.ai/inflection-2-5) is not bad. In fact, it was the ~4th best publicly "known" model when it was released in early March. And it was created by our pretraining team of < 15 people! 2/
My friend Kevin Dolan - featured in this video shared by Elon - has delivered a complete masterclass in how to capture the attention, imagination, and support of sympathetic elites. Essential for the rest of us to learn from this. Take dissident but healthy ideas, formalize…
If birth rates continue to plummet, human civilization will end youtu.be/Pb8fX30QuR0?si…
@kalomaze @teortaxesTex @TheXeophon @Teknium1 Very interesting! This matches my hypothesis. Please let us know if you decide to check! 🙏
@teortaxesTex @unsorsodicorda @TheXeophon @Teknium1 What I gathered from my tests: the magnitude of logit scores is directly proportional to the vocabulary size, assuming you train at Temperature 1.0 and don't change the scale at which logits are graded. tho, Cohere's CommandR 35b was pretrained at 16 Temp... I should check it
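A minimal sketch of the temperature point in the reply above, assuming the usual convention that logits are divided by the temperature before the softmax. The four logit values are made up for illustration. It shows only the arithmetic: to produce the same token distribution at temperature 16, the raw logits have to be 16 times larger. It does not test the claimed link between logit magnitude and vocabulary size.

```python
import math

def softmax(logits, temperature=1.0):
    """Softmax over raw logits after dividing them by the temperature."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw logits for a 4-token vocabulary (illustrative values only).
logits_t1 = [2.0, 1.0, 0.0, -1.0]

# Same distribution at temperature 16 requires raw logits 16x larger.
logits_t16 = [16.0 * x for x in logits_t1]

p1 = softmax(logits_t1, temperature=1.0)
p16 = softmax(logits_t16, temperature=16.0)

# Largest per-token difference between the two distributions.
max_diff = max(abs(a - b) for a, b in zip(p1, p16))
print(p1)
print(max_diff)
```

If a model really were trained at a temperature of 16, its raw logits would sit on a scale 16 times wider than a temperature-1 model's for the same output distribution, which is why the scale has to be accounted for when comparing logit magnitudes across models.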
I don't think the issue is the dataset size, but the tokenizer vocab. As a matter of fact, Gemma, a model with an even bigger vocab, suffers even more from quantization. @TheXeophon @Teknium1 @teortaxesTex thoughts?
Llama 3 degrades more than Llama 2 when quantized. Probably because Llama 3, trained on a record 15T tokens, captures extremely nuanced data relationships, utilizing even the minutest decimals in BF16 precision fully. Making it more sensitive to quantization degradation.…
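For readers unfamiliar with what "quantization degradation" measures, here is a rough sketch of round-trip error under symmetric absmax quantization. Everything in it is an assumption for illustration: the weights are random Gaussian stand-ins rather than real model weights, and per-tensor absmax is a textbook scheme, not necessarily what any specific quantized Llama 3 release uses. It shows only that fewer bits mean larger reconstruction error; it says nothing about whether vocabulary size or token count drives the sensitivity discussed in this thread.

```python
import random

def quantize_roundtrip(values, bits):
    """Symmetric absmax quantization to signed `bits`-bit integers, then dequantize."""
    qmax = 2 ** (bits - 1) - 1  # 127 for 8 bits, 7 for 4 bits
    scale = max(abs(v) for v in values) / qmax
    return [round(v / scale) * scale for v in values]

def mean_abs_error(values, bits):
    """Average absolute difference between original and round-tripped values."""
    restored = quantize_roundtrip(values, bits)
    return sum(abs(a - b) for a, b in zip(values, restored)) / len(values)

random.seed(0)
# Hypothetical stand-in for one row of a weight matrix (not real model weights).
weights = [random.gauss(0.0, 0.02) for _ in range(4096)]

err8 = mean_abs_error(weights, 8)
err4 = mean_abs_error(weights, 4)
print(f"8-bit mean abs error: {err8:.6f}")
print(f"4-bit mean abs error: {err4:.6f}")
```

Checking either hypothesis in the thread would mean running this kind of comparison on actual checkpoints and measuring downstream quality, not just weight reconstruction error.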
it's only able to solve this one, but not other IMO problems of the same year, which makes me think it was not heavily contaminated/trained on the solutions... if it memorized this, it should have memorized the other ones.
@teortaxesTex @mimi10v3 If you scroll through my profile there's at least 4 or 5 different posts I retweeted that seem to suggest it certainly is, two from OpenAI (including Sama), another good one from Teknium
@teortaxesTex Just a physical representation on the interior of the machine or brain (away from the sensory surface).
I don't actually know what "provably safe" even means. Aircraft can be stolen & flown into towers - does that mean the designers get charged? Apparently if a model can be retrained at all it's not "provably safe", and therefore only closed source models are allowed in CA
We are already seeing an explosion of AI regulation that is designed to ban open source while claiming to be neutral. SB 1047 designates a "hazardous capability" to include what a third party can show with infinite fine-tuning and re-training. Meanwhile, closed models get points…
so far the gpt-2 chatbot seems to have really good instruction following, multi-turn comprehension. but its performance is similar to GPT4/Opus class models. i agree with the common judgement that it could be the new turbo and replace 3.5 altogether.
@georgejrjrjr @teortaxesTex I'm actually waiting for OpenBMB's Eurus series to drop a version on llama3 8B. High expectations.
@jd_pressman have you seen this? a variation on this may unironically help for what you're trying to do x.com/doomslide/stat…
@iammaestro04 @teortaxesTex @QuintinPope5 very roughly: informal math (the kind appearing in papers) is a peculiar variant of english with a rich semantic space which is only very loosely connected to vanilla english. there exist automatically verifiable languages (e.g. lean) which are ~mechanistically translatable
@jd_pressman is building mistral-philosopher-king while y'all benchmarking sama bread crumbs for no pay
@doomslide {"subject":"Genetically Modified Organisms", "position":"against", "salient-features":["GMOs are created through genetic engineering", "GMOs can increase crop yield and reduce pesticide use", "GMOs can introduce new allergens or toxins into food", "GMOs can have unintended…
> never be a newton of intelligence Karl Friston. > discovered magic 'Intelligence is prediction' doesn't seem like magic to me, it seems mechanistic and functional.
@teortaxesTex @perrymetzger there'll never be a newton of intelligence, never be simple equations that it can be boiled down to (except insofar as gradient descent itself is stupidly simple) i shouldn't complain that we discovered magic, i just didn't grow up thinking the genre of reality was fantasy