@sir_deenicus @lacker @fchollet Possible. Math can be bootstrapped, IMO. For unspecified analogies, it might be harder to design a curriculum to learn from.
Should check out ARC-2 in detail to make a more educated guess.
JAX Global Meetup is back!
Join us this Friday, Oct 7: @_arohan_ will be talking about second-order optimizers, deep learning, and JAX. @borisdayma and I will be hosting the event.
Event link: meetup.com/jax-global-mee…
Join the JAX Meetup to be notified of all future events!
@erik_nijkamp @MarkusNRabe @Yuhu_ai_ It is true that this code base can accommodate only models of limited size, but there are simple patches to fix that.
We have not tried few-shot prompting on the memory.
@ylecun I don't think it's a particularly weak form. E.g., arxiv.org/abs/1907.05242 shows that fully connected layers can be replaced by attention into a fixed memory. Since the embeddings of the "config" are trained E2E by signals from the (later) "inputs", it is pretty much a hypernetwork.
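To make the cited construction concrete, here is a minimal JAX sketch of the idea in arxiv.org/abs/1907.05242: a feed-forward sublayer viewed as attention into a fixed, learned memory. All names and shapes are illustrative, not the paper's code.

```python
# Sketch: a feed-forward sublayer as attention over a fixed, learned
# memory (after arXiv:1907.05242). Illustrative names and shapes.
import jax
import jax.numpy as jnp

def ffn_as_memory_attention(x, mem_k, mem_v):
    """x: (d,) token representation; mem_k, mem_v: (N, d) learned memory.

    With softmax in place of ReLU, this is a two-layer network whose
    first-layer weights are the memory keys and whose second-layer
    weights are the memory values -- trained E2E like any other weights.
    """
    scores = mem_k @ x / jnp.sqrt(x.shape[-1])  # (N,) slot relevances
    weights = jax.nn.softmax(scores)            # attention over memory slots
    return weights @ mem_v                      # (d,) output

key = jax.random.PRNGKey(0)
k1, k2, k3 = jax.random.split(key, 3)
d, n_mem = 16, 64
x = jax.random.normal(k1, (d,))
mem_k = jax.random.normal(k2, (n_mem, d))  # trained end-to-end in practice
mem_v = jax.random.normal(k3, (n_mem, d))
y = ffn_as_memory_attention(x, mem_k, mem_v)  # (16,)
```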
@MarisaVeryMoe @ylecun Training-wise?
Most training methodologies are not E2E, but under certain circumstances it is possible.
@Plinz @DukeDarkside @pavel23 @elonmusk It is still upsetting in a situation in which the ongoing ethnic cleansing is quite well-documented.
However, the real issue is not just that of the exact outcome, but that of the security guarantees.
@ylecun Alternatively, it is a hypernetwork, since the latent representations of the prompt act as data-dependent weight matrices.
An alternative view is that the prompt just selects an expert from a very large mixture of experts (again, the token embeddings act as expert parameters).
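For intuition, a cartoon of that mixture-of-experts reading in JAX, with explicit expert matrices purely for illustration; in a real transformer the "experts" are implicit, their parameters coming from the prompt's token embeddings. All names here are hypothetical.

```python
# Cartoon of the MoE reading: the input softly selects among experts
# whose parameters play the role the prompt embeddings play in a
# transformer. Explicit expert matrices are for intuition only.
import jax
import jax.numpy as jnp

def moe_select(x, expert_keys, expert_mats):
    """x: (d,); expert_keys: (E, d); expert_mats: (E, d, d)."""
    gate = jax.nn.softmax(expert_keys @ x)          # soft expert selection
    W = jnp.einsum('e,eij->ij', gate, expert_mats)  # data-dependent weights
    return W @ x  # hypernetwork view: the effective W depends on the data
```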
Calibrating Sequence Likelihood Improves Conditional Language Generation
With SLiC, decoding candidates’ quality significantly improves regardless of the decoding method, without showing any sign of diminishing returns with model scale.
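For a sense of what "calibrating" means here, a minimal sketch of a margin rank loss in that spirit: push the model's sequence log-likelihood of a better decoded candidate above that of a worse one. The pairing-by-similarity step and the margin name `beta` are assumptions for illustration, not the paper's exact formulation.

```python
# Sketch of a rank-style sequence-likelihood calibration loss.
# logp_pos / logp_neg: sequence log-likelihoods of the candidate closer
# to / further from the reference (e.g. ranked by a similarity metric).
import jax.numpy as jnp

def rank_calibration_loss(logp_pos, logp_neg, beta=1.0):
    # Hinge: zero once the better candidate leads by at least the margin.
    return jnp.maximum(0.0, beta - logp_pos + logp_neg)
```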
@ilyasut A special operation here, a special operation there, and pretty soon you're talking about nuclear winter.
@JulesJacobs5 @freekwiedijk @el_keogh @fchollet My point is just that in order to bootstrap a strong reasoning AI, we can create curricula based on thinning out proofs, creating increasingly large gaps, larger than those in textbooks, and do RL this way. This is why this continuum is relevant for AI math.
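A toy sketch of such a proof-thinning curriculum, with a hypothetical proof representation and gap schedule; rewards would come from a proof checker accepting the reconstructed steps:

```python
# Toy curriculum: delete progressively more intermediate steps from
# detailed formal proofs, so the model must bridge ever larger gaps.
# The list-of-steps representation and the schedule are hypothetical.
from typing import Iterator, List, Tuple

def thin_proof(steps: List[str], gap: int) -> List[str]:
    """Keep every (gap+1)-th step; the model must reconstruct the rest."""
    return steps[:: gap + 1]

def curriculum(proofs: List[List[str]],
               max_gap: int) -> Iterator[Tuple[List[str], List[str]]]:
    """Yield (thinned proof, full proof) pairs with growing gaps,
    usable as RL tasks where the reward is checker acceptance."""
    for gap in range(1, max_gap + 1):
        for steps in proofs:
            yield thin_proof(steps, gap), steps
```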
@JulesJacobs5 @freekwiedijk @el_keogh @fchollet It is.
OTOH, a single "easy to see" step in a high-level textbook can be more complex than a whole statement the proof of which is given in an introductory textbook.
And there are multiple levels of "higher level".
@JulesJacobs5 @freekwiedijk @el_keogh @fchollet Sure. That's my goal too. I just wanted to highlight that current notions of correctness do not trivially match that of formalizability. There is a lot of ambiguity etc. that we need to accept if we want to create a fully automated system that can formalize natural language math.
@JulesJacobs5 @freekwiedijk @el_keogh @fchollet I don't think there is much difference. It all depends on the system's sophistication. The more the system "knows", the better it can argue, the bigger gaps it will be able to bridge.
@JulesJacobs5 @freekwiedijk @el_keogh @fchollet That's why I say that there is a continuum between formalization and reasoning.
Early versions of such a system will be able to "formalize" only very detailed proofs, but over time the system will improve automatically, w/o any extra engineering.
@freekwiedijk @JulesJacobs5 @el_keogh @fchollet That's subsumed too. It might or might not succeed, but the capability of the system will improve rapidly over time.
In the early stages, I assume it will require more detailed proofs, while after a period of training and feedback, it will cope with a mere "easy to see".
@JulesJacobs5 @freekwiedijk @el_keogh @fchollet In theory, the reader always has all the information to prove anything, even w/o the paper.
What I mean here is that informal proofs can exploit uncertainty in a non-trivial way, without constructing the proof in any obviously reproducible manner.
@JulesJacobs5 @freekwiedijk @el_keogh @fchollet You give it a whole proof text, it will mull it over, maybe for minutes, and then it comes back either with a formalization candidate for you to revise (if you really care), or with some natural language questions and partial formalizations for the successful parts.
@JulesJacobs5 @freekwiedijk @el_keogh @fchollet The difference between referring to an analogy and writing a script is that a human proof is considered correct if it convinces the reader; a mere hint is enough. A script, by contrast, contains *all* the information to generate the proof.
@freekwiedijk @JulesJacobs5 @el_keogh @fchollet What I envision is that you write an informal natural language text, then after a significant wait (maybe minutes), you get a full formalization w/o intervention, or some natural language questions and partial formalizations if it was not successful.
@JulesJacobs5 @freekwiedijk @el_keogh @fchollet Why invalid?
The only criterion for a paper to be correct is to convince the reader that a formal proof is *possible*.
Analogies, etc. are perfectly fine if they help the reader just enough.