@ylecun @leonpalafox @wightmanr @EMostaque @paperswithcode @metaai But as @Michael_J_Black points out, Galactica may open the door to a new set of fake science "attacks" on the publication system. I think it might also support a great semantic auto-correct for authors (esp ones with less skill in English)
@tdietterich @leonpalafox @wightmanr @EMostaque @paperswithcode @metaai @Michael_J_Black I think @Michael_J_Black 's fears are unwarranted. The incentives for flooding publication venues with generated fake science simply do not exist. It's a career-ending act. It could exist as DDoS-style gratuitous acts of vandalism, but those are generally isolated incidents.
@ylecun @tdietterich @leonpalafox @wightmanr @EMostaque @paperswithcode @metaai @Michael_J_Black I agree with you that reputable researchers would not submit generated fake science. But I can see people making new accounts (maybe using their undergrad EDU address) and shamelessly submitting. This already happened with COVID-related fake science 2 yrs ago blog.arxiv.org/2020/10/22/ope…
@rasbt @ylecun @tdietterich @leonpalafox @wightmanr @paperswithcode @metaai @Michael_J_Black These fears are all overblown; folks could do a lot of things, but ultimately we live in a society with verifiable reputation metrics and things like peer review and curation. The reproducibility crisis in science is indicative of the perverse incentives across the space already.
@EMostaque @rasbt @ylecun @leonpalafox @wightmanr @paperswithcode @metaai @Michael_J_Black Suppose 10% of future papers are synthesized fakes. All of us will need to review 10% more papers. We are already straining under the current reviewing load. "Things like peer review and curation" are not free. To be continued after I process today's 201 cs.LG submissions
@tdietterich @EMostaque @rasbt @leonpalafox @wightmanr @paperswithcode @metaai @Michael_J_Black There would need to be an infinite supply of authors for those 10% of papers, because they would all become quickly blacklisted. The incentive to submit garbage does not exist and the cost is high. Why would anyone want to do it?
@ylecun @EMostaque @rasbt @leonpalafox @wightmanr @paperswithcode @metaai @Michael_J_Black This clarifies for me that the main risk is undetected fake papers (i.e., faked results).
@tdietterich @ylecun @EMostaque @rasbt @leonpalafox @paperswithcode @metaai @Michael_J_Black A model cannot be both laughably bad + able to create a convincing and consistent fake result. If an LLM actually tricked a panel of reviewers with fake results, who cares about fake papers! What else can it do? Next stop, creating a paper with a valid and novel result...
@wightmanr @ylecun @EMostaque @rasbt @leonpalafox @paperswithcode @metaai @Michael_J_Black Fake science is dangerous and wasteful. Danger: wrong therapies kill people; Waste: research time and money invested pursuing nonsense. How much will LLMs contribute to this problem? I don’t know
@tdietterich @wightmanr @ylecun @EMostaque @rasbt @leonpalafox @paperswithcode @metaai @Michael_J_Black Agree with the "waste" part. But the danger seems overstated. I don't think there is a plausible path by which a treatment based on fake papers would make it through the (many) intermediate checkpoints and start killing people
@ljbuturovic @wightmanr @ylecun @EMostaque @rasbt @leonpalafox @paperswithcode @metaai @Michael_J_Black One word: Hydroxychloroquine
@tdietterich @wightmanr @ylecun @EMostaque @rasbt @leonpalafox @paperswithcode @metaai @Michael_J_Black Interesting example. I was thinking of new drugs in development, where this might be less likely to happen, but you have a point