Ethan Mollick @emollick, Twitter Profile

Ethan Mollick @emollick

a month ago

AIs have a bad reputation for truth, so three important findings in this paper: 1) "LLM agents can achieve superhuman rating performance" on fact checking when given access to Google! 2) Bigger models are more factual 3) LLMs are 20x cheaper than humans arxiv.org/pdf/2403.18802…

17 101 505 111K 350

Gary Marcus @GaryMarcus

a month ago

On a quick read I can’t figure out much about the human subjects, but it looks like superhuman means better than an underpaid crowd worker, rather than a true human fact checker? That makes the characterization misleading. (Like saying that 1985 chess software was superhuman.) @JerryWeiAI please clarify who the humans were, how found, compensated, etc.

5 3 41 4K 8

Y_Contributor @Y_Contributor

a month ago

@emollick @Dominic2306 I think developers call this AI failure 'hallucination'.

1 0 3 1K 0

Dr Patrick M. - AI Builder @patmcguinness

a month ago

@emollick x.com/patmcguinness/…

Dr Patrick M. - AI Builder @patmcguinness

a month ago

@emollick x.com/patmcguinness/…

0 0 0 193 0

0 0 0 87 0

JJudge @machineciv

a month ago

@emollick But why can't larger models accurately assess articles, short stories, etc. fed to them?

3 0 0 2K 0

Datapoint 2200 @datapoint2200

a month ago

@emollick Already testing an implementation, this could be a game changer

0 0 0 688 0

Mbongeni Ndlovu @Mbounge_

a month ago

@emollick thanks for sharing

0 0 0 688 0

David Crouch @dcrouchca

a month ago

@emollick Gary’s comments need responding to, or is this just a “broadcast channel”?

0 0 0 88 0

Alex Truelove @alexctruelove

a month ago

@emollick Do we trust Google as a proxy? Seems like a bad precedent, especially as search results continue to get watered down with less-than-truthful content

0 0 2 365 0

Andreas Ramos @Andreas_Ramos

a month ago

@emollick An expanded version of SAFE would be useful for peer review of academic articles, especially in STEM. As I pointed out, the vast majority of peer reviewers for academic journal articles are not paid for their work.

0 0 0 218 0

hair burnt @mmmm_gold

a month ago

@emollick I checked... and I'm always wrong

0 0 0 390 0