At a conference at New York University in March, philosopher Raphaël Millière of Columbia University offered yet another jaw-dropping example of what LLMs can do. The models had already demonstrated the ability to write computer code, which is impressive but not too surprising because there is so much code out there on the Internet to mimic. Millière went a step further and showed that GPT can execute code, too. The philosopher typed in a program to calculate the 83rd number in the Fibonacci sequence. "It's multistep reasoning of a very high degree," he says. And the bot nailed it. When Millière asked directly for the 83rd Fibonacci number, however, GPT got it wrong, which suggests the system wasn't just parroting the Internet; rather it was performing its own calculations to reach the correct answer.
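The source does not reproduce the program Millière typed in, but code for this task is only a few lines; a minimal Python sketch of the general form might look like this (the iterative approach and variable names are assumptions):

```python
def fib(n):
    """Return the nth Fibonacci number, counting fib(1) = fib(2) = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b  # each step updates two values of running state
    return a

print(fib(83))  # 99194853094755497
```

Tracing this loop by hand means updating two running values 83 times, which is why getting it right looks like multistep reasoning rather than recall.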
Although an LLM runs on a computer, it is not itself a computer. It lacks essential computational elements, such as working memory. In a tacit acknowledgement that GPT on its own should not be able to run code, its inventor, the tech company OpenAI, has since introduced a specialized plug-in (a tool ChatGPT can use when answering a query) that allows it to do so. But that plug-in was not used in Millière's demonstration. Instead he hypothesizes that the machine improvised a memory by harnessing its mechanisms for interpreting words according to their context, a situation similar to how nature repurposes existing capacities for new functions.
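Millière's hypothesis concerns the model's internals, but the basic idea, that previously generated text can stand in for working memory, can be illustrated mechanically. In this Python sketch (an illustration of the concept, not a claim about how GPT actually did it), the only state that survives from one step to the next is a line of text, which must be re-parsed each time, much as a model can only "remember" what sits in its context window:

```python
import re

def next_state(line):
    # Recover the working values purely from the previous line of text;
    # nothing else persists between steps, mimicking a context window.
    a, b = map(int, re.findall(r"\d+", line))
    return f"a={b}, b={a + b}"

state = "a=0, b=1"
for _ in range(83):
    state = next_state(state)
print(state)  # begins with a=99194853094755497, the 83rd Fibonacci number
```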
This impromptu ability demonstrates that LLMs develop an internal complexity that goes well beyond a shallow statistical analysis. Researchers are finding that these systems seem to achieve genuine understanding of what they have learned. In one study presented last week at the International Conference on Learning Representations (ICLR), doctoral student Kenneth Li of Harvard University and his AI researcher colleagues (Aspen K. Hopkins of the Massachusetts Institute of Technology, David Bau of Northeastern University, and Fernanda Viégas, Hanspeter Pfister and Martin Wattenberg, all at Harvard) spun up their own smaller copy of the GPT neural network so they could study its inner workings. They trained it on millions of matches of the board game Othello by feeding in long sequences of moves in text form. Their model became a nearly perfect player.
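The shape of that training setup fits in a few dozen lines. The PyTorch sketch below is not the authors' code, and its details (the vocabulary size, layer counts and the TinyOthelloGPT name) are assumptions; it only shows the approach: treat each of Othello's 60 playable squares as a token and train a small GPT-style network to predict the next move in a game transcript.

```python
import torch
import torch.nn as nn

# Assumed sizes: 60 playable squares plus one pad/start token.
VOCAB = 61
D_MODEL, N_LAYERS, N_HEADS, MAX_LEN = 128, 4, 4, 60

class TinyOthelloGPT(nn.Module):
    def __init__(self):
        super().__init__()
        self.tok = nn.Embedding(VOCAB, D_MODEL)
        self.pos = nn.Embedding(MAX_LEN, D_MODEL)
        layer = nn.TransformerEncoderLayer(
            D_MODEL, N_HEADS, dim_feedforward=4 * D_MODEL, batch_first=True)
        self.blocks = nn.TransformerEncoder(layer, N_LAYERS)
        self.head = nn.Linear(D_MODEL, VOCAB)

    def forward(self, moves):                  # moves: (batch, seq) of ints
        seq = moves.size(1)
        x = self.tok(moves) + self.pos(torch.arange(seq, device=moves.device))
        # Causal mask: each position may attend only to earlier moves.
        mask = torch.triu(torch.full((seq, seq), float("-inf")), diagonal=1)
        return self.head(self.blocks(x, mask=mask))  # next-move logits

# Next-token training step; random tokens stand in for real game records.
model = TinyOthelloGPT()
opt = torch.optim.AdamW(model.parameters(), lr=3e-4)
games = torch.randint(1, VOCAB, (8, 59))
logits = model(games[:, :-1])
loss = nn.CrossEntropyLoss()(logits.reshape(-1, VOCAB),
                             games[:, 1:].reshape(-1))
opt.zero_grad(); loss.backward(); opt.step()
```

Because the input is nothing but move tokens, the network is never told the rules or shown a board; whatever lets it play nearly perfectly must be built up internally.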