When you train an AI model on code, it gets better at reasoning. For example, Mark Zuckerberg said that training Meta's Llama model on code significantly improved its reasoning abilities. This enabled the smaller Llama 3 models to outperform larger Llama 2 models in logical and mathematical reasoning. This is no coincidence. Users think in terms of data, using tools like calculators and spreadsheets to manipulate it directly. Developers, however, think in terms of metadata, writing code that manipulates variables, which in turn manipulate data at runtime. This ability to abstract data into metadata is a step change in intelligence between users and developers. AI improves its intelligence in the same way. One approach, called program induction, has the AI generate code to solve complex problems like frontier math. If understanding and generating code can make AI more intelligent, could understanding meta-metadata such as compilers make it even smarter? That's what we will explore in this post.


Intelligence in Math

Imagine you are a maintenance worker asked to hang a painting in an art gallery at 12 feet from the floor. You have a ladder whose length is 13 feet. How would you place the ladder to reach 12 feet? You might first get the help of another person to hold one end of a measuring tape at the bottom of the wall while you climb the ladder with the other end of the tape to mark 12 feet. Then you would place the ladder parallel to the wall and drag it until it hits the 12-foot mark. This approach of working directly with data represents the basic level of intelligence.

If you are an engineer, you would apply the Pythagorean theorem to find the solution in one step: placing the base of the ladder 5 feet away from the wall ensures it hits the wall at exactly 12 feet, because √(13² − 12²) = √25 = 5. Knowing an algebraic equation (i.e., b = √(c² − a²), where c is the ladder length and a is the height on the wall) and applying it to the data is a higher level of intelligence.
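The engineer's one-step calculation can be sketched in a few lines of Python. This is a minimal illustration; the function name `base_distance` is a hypothetical choice, not part of any library.

```python
import math

def base_distance(ladder_length: float, target_height: float) -> float:
    """Distance from the wall to the ladder's base, from b = sqrt(c^2 - a^2)."""
    if target_height > ladder_length:
        raise ValueError("the ladder is too short to reach the target height")
    return math.sqrt(ladder_length**2 - target_height**2)

print(base_distance(13, 12))  # 5.0
```

The same function answers the question for any ladder and any height, which is the point of moving from data to metadata.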

Now, imagine you are Pythagoras, and the theorem has not been discovered yet. Observing patterns in right-angled triangles, perhaps noticing the 3-4-5 triangle used in Egyptian architecture, you could hypothesize that a² + b² = c². By generalizing and verifying this pattern across different cases, you formalize it as a theorem. This progression, from data to metadata to the invention of metadata, illustrates how reasoning evolves from worker to engineer to inventor. Intelligence is the ability to abstract and reason at multiple levels.
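The pattern-hunting step can be sketched in Python as a search for whole-number side lengths that fit the hypothesized relation. This is only an empirical exploration, not a proof, and the helper name `pythagorean_triples` is an illustrative assumption.

```python
import math

def pythagorean_triples(limit: int) -> list[tuple[int, int, int]]:
    """All (a, b, c) with a <= b < c <= limit and a^2 + b^2 == c^2."""
    triples = []
    for a in range(1, limit + 1):
        for b in range(a, limit + 1):
            c_squared = a * a + b * b
            c = math.isqrt(c_squared)  # integer square root, rounded down
            if c <= limit and c * c == c_squared:
                triples.append((a, b, c))
    return triples

print(pythagorean_triples(13))  # [(3, 4, 5), (5, 12, 13), (6, 8, 10)]
```

The familiar 3-4-5 triangle shows up alongside 5-12-13, the triangle formed by the 13-foot ladder reaching 12 feet. Turning such observations into a theorem still requires a proof that holds for every right-angled triangle.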


Intelligence in AI

The first version of ChatGPT was comically bad at math. Even the more recent GPT-4 version is only as good as the maintenance worker. It answers that the ladder has to be placed 3 feet from the wall, applying the 4:1 safety guideline (1 foot out for every 4 feet of height) that minimizes ladder slippage. This is transduction: recalling memorized data without deeper understanding.
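A short Python sketch shows why the 3-foot answer misses the target. It assumes the 4:1 guideline is applied to the 12-foot target height; the function names are hypothetical.

```python
import math

def four_to_one_base(height: float) -> float:
    """Base distance suggested by the 4:1 guideline: 1 foot out per 4 feet up."""
    return height / 4

def reach_on_wall(ladder_length: float, base: float) -> float:
    """Height at which a ladder of the given length touches the wall."""
    return math.sqrt(ladder_length**2 - base**2)

base = four_to_one_base(12)      # 3.0 feet from the wall
reach = reach_on_wall(13, base)  # sqrt(169 - 9) = sqrt(160)
print(base, round(reach, 2))     # 3.0 12.65
```

With the base 3 feet out, the top of the 13-foot ladder rests about 12.65 feet up the wall rather than at the 12-foot mark, so the recalled guideline does not solve the problem as posed.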

Next-generation models like o1 can memorize and recall metadata, such as the Pythagorean theorem, and apply it to data. Using the theorem, o1 correctly calculates that the ladder needs to be exactly 5 feet away, much like an engineer would. This ability aligns with program induction, where the AI generates code to solve complex problems.

Future models beyond o3 could potentially discover fundamentally new geometric relationships. Consider the circle division problem: when we draw n straight lines through a circle, what is the maximum number of regions created? Counting specific cases is easy (1 line gives 2 regions, 2 lines give 4, 3 give 7, and 4 give 11), but counting is still data-level work. The inventor's step is to notice the pattern and prove the general formula, n(n + 1)/2 + 1, known as the lazy caterer's sequence. This particular formula is already known, but it illustrates the kind of leap required: like Pythagoras discovering the relationship between triangle sides, a model at this level would have to find and verify such formulas for problems that are still open.


Intelligence in Coding

The idea of moving up the metadata stack to become more intelligent isn’t new to developers. Workers use calculators (data tools) to solve specific problems like finding ladder distances. Developers abstract these tasks into code (metadata), using programming languages like Python to automate such problems for millions of workers around the world. At the highest level, language creators build meta-tools that compile and execute the code written by millions of developers around the world. They use meta-metadata tools, like Python's PEG (Parsing Expression Grammar) parser generator or ANTLR (another parser generator), to create new programming languages and compilers.
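The three levels can be sketched side by side in Python. This is a toy illustration, not how Python's PEG parser or ANTLR work internally. The data level is a single calculation and the metadata level is a reusable function. The meta level is a tiny hand-written parser and evaluator for a made-up arithmetic grammar, the kind of code a parser generator would derive from a grammar file. All names here (`ladder_base`, `tokenize`, `evaluate`, and the grammar itself) are hypothetical.

```python
import math
import re

# Level 1, data: one calculation for one ladder, as on a calculator.
print(math.sqrt(13**2 - 12**2))  # 5.0

# Level 2, metadata: code that works for any ladder.
def ladder_base(length: float, height: float) -> float:
    return math.sqrt(length**2 - height**2)

# Level 3, meta-metadata: a tiny language implementation. Toy grammar:
#   expr   <- term (('+' / '-') term)*
#   term   <- factor (('*' / '/') factor)*
#   factor <- NUMBER / 'sqrt' '(' expr ')' / '(' expr ')'
TOKEN = re.compile(r"\s*(\d+\.?\d*|sqrt|[-+*/()])")

def tokenize(source: str) -> list[str]:
    tokens, pos = [], 0
    source = source.rstrip()
    while pos < len(source):
        match = TOKEN.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens

def evaluate(source: str) -> float:
    tokens = tokenize(source)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance(expected=None):
        nonlocal pos
        token = peek()
        if token is None or (expected is not None and token != expected):
            raise SyntaxError(f"expected {expected!r}, got {token!r}")
        pos += 1
        return token

    def expr():
        value = term()
        while peek() in ("+", "-"):
            if advance() == "+":
                value += term()
            else:
                value -= term()
        return value

    def term():
        value = factor()
        while peek() in ("*", "/"):
            if advance() == "*":
                value *= factor()
            else:
                value /= factor()
        return value

    def factor():
        token = advance()
        if token == "sqrt":
            advance("(")
            value = math.sqrt(expr())
            advance(")")
            return value
        if token == "(":
            value = expr()
            advance(")")
            return value
        try:
            return float(token)
        except ValueError:
            raise SyntaxError(f"unexpected token {token!r}") from None

    result = expr()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return result

print(evaluate("sqrt(13*13 - 12*12)"))  # 5.0
```

Each level produces the same answer for the ladder, but each is more general than the last: the first solves one problem, the second a family of problems, and the third defines a small language in which such problems can be written.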

This progression explains why, as we master abstracting simple logic, such tasks become intellectually unchallenging, leading us to dismiss them as CRUD apps. Instead, we aspire to build more complex tools like compilers or app builders. Writing our own compiler with PEG or ANTLR grammars is a natural step for developers to move up the intelligence stack.


Summary

OpenAI recently introduced a levels framework describing how AI evolves from chatbot to reasoning to invention. While it is a good measure of capability, it doesn't explain the underlying reason for the evolution of intelligence. The Llama example and the program induction approach show that training AI on code and asking it to generate code improves its intelligence. This mirrors how engineers become more intelligent than workers through understanding mathematical patterns, and how developers surpass users by mastering code patterns. If our developer instincts (aspiring to build compilers to move up the intellectual stack) are correct, then training AI on meta-code such as language parsers and compilers should produce the next step change in intelligence. After all, the logical trajectory of reasoning models shows that the nature of AI is meta.