(109) If 'slightly weaker AI' isn't really a thing and there are large unpredictable discontinuities in AI capabilities for any reason, then I think we are probably going to fail AI alignment and all die; all the plans I've heard that might work assume that's not the case.
(110) If 'slightly weaker AI' is a thing, then I think some of the plans I've heard in the broad category 'use slightly weaker AIs and idk some formalization of heuristic arguments Paul came up with to align slightly stronger AIs' seem kind of promising and might work fine.
(111) I find it kind of tempting, given this state of affairs, to go "okay, assuming no discontinuities..." but Eliezer Yudkowsky will be so disappointed in me, so I don't do that.
(112) The arguments for 'definitely discontinuities' seem pretty tenuous to me, though.
(113) I really don't think anyone's going to solve interpretability thoroughly enough that it just solves alignment by itself, though I tentatively think that, if you're not pushing the state of the art in capabilities, it's worth someone spending five years trying.
(114) imo working on AI capabilities right now is an understandable thing to do, but a very bad one. I could imagine someone working on capabilities having a justification that I found persuasive, but the people actually doing it seem to have much worse justifications.
(115) I love some things about Silicon Valley tech culture, but I think it's pretty destructive as the default culture for AI companies to operate from.
(116) It kind of seems like there's a weird, sort of stupid situation where the important people making AI-related strategic decisions don't even understand what other important people think about AI strategy.
(117) This is never trivial to resolve, because it, again, tends to bottom out in some incredibly detailed technical debate, but it's definitely a very obvious way in which we're doing much worse than it intuitively feels like we could be.
(118) I think it's quite bad how lots of people opine on AI while being deeply confused about who is doing what for what reasons, and I really wish I could make them all read something about who is doing what and why.
@KelseyTuoc I love reading lists. Give me a reading list!
@ID_AA_Carmack I can't believe I missed this tweet!! I think the person to read on takeover risk in RL AGI paradigms is Ajeya Cotra; maybe start here: lesswrong.com/posts/pRkFkzwK…
@ID_AA_Carmack @KelseyTuoc If you’re gonna read one thing on why “AGI safety” is a legit & important thing to work on, I’d suggest this one: 80000hours.org/problem-profil…. But one-size-fits-all discussions are tricky. So if you have thoughts, I can point you to resources that address them more specifically.
@ID_AA_Carmack @KelseyTuoc I like the suggestions others made here. My own favorite place to start is: 1. econlib.org/archives/2016/… 2. lesswrong.com/posts/uMQ3cqWD… Though if you're new to this topic, some parts of "AGI Ruin" (especially Section C) are probably going to be too in-the-weeds to be useful.
@ID_AA_Carmack @KelseyTuoc Or you could talk to @robertskmiles, @NPCollapse or @vaelgates?