ChatGPT is Getting Worse at Maths — Here’s Why

AI Amplified 🚀
2 min read · Jan 31, 2024


Welcome back to our little corner of the internet!

ChatGPT, a versatile language model that has recently grown in popularity (especially among students), is celebrated for its text-related capabilities but has trouble with complex maths problems, particularly those involving large numbers and unknown variables such as x, y and z. While it excels at a wide range of language tasks, it struggles with calculations beyond roughly a high school level.

Recent research from Stanford University and UC Berkeley reveals that Large Language Models (LLMs) such as ChatGPT lack fundamental training in performing calculations, which leads to confusion and errors even on simpler maths problems. But why has ChatGPT seemingly gotten worse at maths over time?

The phenomenon of “AI Drift” has further exacerbated ChatGPT’s maths-related issues as its language skills developed. “AI Drift” is when AI models degrade and “act in unexpected ways”, straying from their intended behaviour. For example, a substantial difference between the data a model was trained on and the data it sees in production may “surprise” the AI and change its accuracy.
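One simple way to notice this kind of change is to ask the same fixed set of questions of each version of a model and compare the scores. The Python sketch below does that for yes/no "is this number prime?" questions. The function names and the two sets of answers are made up purely for illustration and are not results from any real model.

```python
def is_prime(n: int) -> bool:
    """Ground truth: trial division, fine for small test numbers."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def accuracy(answers: dict[int, bool]) -> float:
    """Share of questions where the model's yes/no answer was correct."""
    correct = sum(1 for n, said in answers.items() if said == is_prime(n))
    return correct / len(answers)


def drift(old: dict[int, bool], new: dict[int, bool]) -> float:
    """Change in accuracy between two snapshots (negative means worse)."""
    return accuracy(new) - accuracy(old)


# Hypothetical answers from two snapshots of the same model (invented data).
old_snapshot = {7: True, 9: False, 11: True, 15: False}
new_snapshot = {7: True, 9: True, 11: False, 15: False}

print(f"accuracy change: {drift(old_snapshot, new_snapshot):+.2f}")
```

Keeping the question set fixed matters here: if the questions change between runs, a drop in the score could come from harder questions rather than from the model itself.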

However, Google has stepped in with a possible solution: an approach built on in-context learning, in which worked examples are placed directly in the prompt, to help LLMs reason algorithmically with numbers. The examples walk the model through each procedure step by step, improving its proficiency in number-based reasoning.
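To give a feel for what "step by step" can mean in practice, here is a rough Python sketch that writes out column addition with carries and stacks a few worked examples in front of a new question. This is only an illustration of the general idea, not Google's actual prompt format. The function names are hypothetical and the code assumes non-negative whole numbers.

```python
def addition_steps(a: int, b: int) -> list[str]:
    """Spell out column addition from the rightmost digit, including carries."""
    xs = [int(d) for d in str(a)][::-1]  # digits of a, least significant first
    ys = [int(d) for d in str(b)][::-1]
    steps, digits, carry = [], [], 0
    for i in range(max(len(xs), len(ys))):
        x = xs[i] if i < len(xs) else 0
        y = ys[i] if i < len(ys) else 0
        total = x + y + carry
        steps.append(
            f"{x} + {y} + carry {carry} = {total}, "
            f"write {total % 10}, carry {total // 10}"
        )
        digits.append(total % 10)
        carry = total // 10
    if carry:
        digits.append(carry)
        steps.append(f"final carry {carry}, write {carry}")
    result = "".join(str(d) for d in reversed(digits))
    steps.append(f"answer: {result}")
    return steps


def build_prompt(examples: list[tuple[int, int]], question: tuple[int, int]) -> str:
    """Put fully worked examples before the new problem the model should solve."""
    parts = []
    for a, b in examples:
        parts.append(f"Problem: {a} + {b}")
        parts.extend(addition_steps(a, b))
        parts.append("")
    parts.append(f"Problem: {question[0]} + {question[1]}")
    return "\n".join(parts)


print(build_prompt([(128, 367), (999, 1)], (45, 78)))
```

The resulting text would be sent to the model as its prompt, so that it can imitate the same written-out procedure for the final, unsolved problem.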

Additionally, Wolfram Research, in collaboration with OpenAI, has introduced a plugin named Wolfram+ChatGPT that enhances ChatGPT’s maths skills. The plugin converts text queries into equations and visual representations, such as graphs, using the Wolfram Language, which specialises in presenting data computationally.
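The underlying idea is to hand the actual calculation to a tool that computes exactly, rather than letting the language model guess the digits. The Python sketch below is a tiny stand-in for such a tool, and it is not the Wolfram plugin or its API. It assumes the expression has already been extracted from the user's text and only supports whole numbers with +, - and *.

```python
import ast
import operator

# Operators this toy calculator is willing to evaluate.
_OPS = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul}


def evaluate(expr: str) -> int:
    """Exactly evaluate an integer expression without using eval()."""

    def walk(node: ast.AST) -> int:
        # Plain integer literal (bool is excluded because it subclasses int).
        if isinstance(node, ast.Constant) and type(node.value) is int:
            return node.value
        # Binary operation with a supported operator.
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        # Unary minus, e.g. "-5".
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -walk(node.operand)
        raise ValueError("unsupported expression")

    return walk(ast.parse(expr, mode="eval").body)


print(evaluate("123456789 * 987654321"))
```

Because Python integers have arbitrary precision, large products like this one come out exact, which is precisely the kind of task where a language model on its own tends to slip.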

As developers seek to enhance the model’s capabilities, collaborations with industry players like Google and Wolfram Research highlight ongoing efforts to bridge the gap between language models and proficiency in mathematical reasoning.

Thank you for reading this!

