New paper alert! We dive into the world of LLMs and cognitive biases, focusing on how models tackle arithmetic word problems—do they show the same biases as humans? Here’s a summary 🤖📚 #LLMs #MachineLearning #AI arxiv.org/abs/2401.18070
🔍🧠 We study three biases, related to text comprehension (#1), solution planning (#2), and solution execution (#3).
Bias #1: Consistency. Like kids, LLMs prefer problems with consistent wording. Form (2) below is significantly more difficult than (1).
Bias #2: Concept variation. LLMs are better at transfer problems (like giving or taking away) than problems with static comparisons. Again, like children.
Bias #3: Carry effect. Unlike children, however, LLMs turn out not to be sensitive to whether an arithmetic computation involves a carry (like 16+7) or not (like 16+3).
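For anyone unfamiliar with the term: an addition "involves a carry" when some digit column sums to 10 or more. A minimal sketch (not from the paper, just illustrating the definition):

```python
def has_carry(a: int, b: int) -> bool:
    """Check whether adding two non-negative integers involves a carry,
    i.e., whether any digit column sums to 10 or more."""
    while a > 0 or b > 0:
        if a % 10 + b % 10 >= 10:
            return True  # this column overflows into the next one
        a //= 10
        b //= 10
    return False

print(has_carry(16, 7))  # True: 6+7=13 carries into the tens
print(has_carry(16, 3))  # False: 6+3=9, no carry
```

(Checking only the raw digit sums suffices: a carry can propagate into a later column only if some column already sums to ≥10 on its own.)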
Oh, and btw: we wouldn’t want to evaluate the models on problems they may have seen during training. So we generate our own test problems for these experiments, using a new neuro-symbolic method.
This is joint work with my great colleagues @OpedalAndreas, Haruki Shirakami, Ying Jiao, @ryandcotterell, @bschoelkopf, Abulhair Saparov, and @mrinmayasachan. 📜 Paper: arxiv.org/abs/2401.18070