Here is a meta-review we got for the third submission of a paper that aims to study the text-understanding capabilities of LLMs, focusing on very simple, if not trivial, cases where they systematically fail. We see every stated weakness as a strength, and all of them are by design.
"the new benchmark is so simple that it could be learn from a handful (one?) example. An old-fashion logic approach will probably solve the new benchmark with no problems." YES! and yet the models did not learn it in their massive pre-training. Isn't this noteworthy?
@yoavgo Maybe it's so unusual that you need to spell it out, like: "The capabilities of LLMs are bounded from above by the most complex problems they can solve, and from below by the simplest problems they cannot. While the former limits are well studied..."