
Prediction isn’t understanding: AI’s evolution and the soul of science

  • Writer: FirstPrinciples
  • Jun 11
  • 6 min read

Updated: 6 days ago

From rule-based ‘expert systems’ to neural networks, AI has long chased the dream of scientific reasoning. But while today’s models are good at pattern matching and can generate code or summarize academic papers, they struggle with the heart of scientific discovery: structured reasoning.


The symbolic summer: Dartmouth and formalizing the field of artificial intelligence

Cover page of the proposal for the Dartmouth Summer Research Project on Artificial Intelligence. (Credit: Dartmouth)

In the summer of 1956, during a now-legendary workshop at Dartmouth College, four researchers (John McCarthy, Marvin Minsky, Nathaniel Rochester, and Claude Shannon) proposed a radical thesis: that “every aspect of learning or any other feature of intelligence can in principle be so precisely described that a machine can be made to simulate it.”


This was one of the founding moments of AI. And its early years were dominated by the belief that the right cocktail of logical rules and structured symbols (now referred to as symbolic reasoning) could reproduce reasoning itself and ultimately lead to the development of mechanized intelligence. 


By the 1960s and 70s, this idea became reality in systems like DENDRAL, which identified chemical structures from mass-spectrometry data, and MYCIN, which diagnosed blood infections from symptoms and lab data. These ‘expert systems’ encoded human knowledge as elaborate trees of if-then statements.


These early systems were powerful, yet narrow. In ideal cases, they matched or outperformed human experts. But there was a catch: they couldn’t adapt. Feed them unfamiliar data and they froze or failed. They could not learn; they could only look up. True intelligence, it turned out, is more than meticulous bookkeeping.
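To make the “look up, don’t learn” limitation concrete, here is a minimal, purely illustrative sketch of a rule-based system in Python. It is a toy analogy, not the actual DENDRAL or MYCIN code: knowledge lives in a hand-written table of if-then rules, and any input outside that table simply has no answer.

```python
# A toy, illustrative rule-based "expert system" (an analogy for the
# MYCIN era, not its actual code): knowledge is a hand-written table of
# if-then rules, and anything outside that table simply has no answer.

RULES = [
    # (set of required findings) -> conclusion supplied by a human expert
    ({"fever", "stiff_neck"}, "possible meningitis"),
    ({"fever", "cough", "chest_pain"}, "possible pneumonia"),
]

def diagnose(findings: set) -> str:
    for conditions, conclusion in RULES:
        if conditions <= findings:      # every condition of the rule is present
            return conclusion
    return "no rule applies"            # unfamiliar input: the system cannot adapt

print(diagnose({"fever", "stiff_neck"}))   # -> possible meningitis
print(diagnose({"rash", "joint_pain"}))    # -> no rule applies
```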


The AI winter: a pause between paradigms

By the late 1970s, funding had started to dry up, triggering an “AI winter”. The world, it seemed, was too messy and too noisy for perfectly hand-crafted rules, and the limitations of symbolic AI were laid bare. As early optimism collapsed, many researchers left the field altogether; the funding drought became a brain drain to other areas.


The problem wasn’t just technical; it was conceptual. Symbolic systems assumed the world could be exhaustively described in logic. Intelligence, it turned out, might be less like programming and more like perception. What machines needed was not more syllogisms but a talent humans excel at: pattern recognition. That shift opened the door to a new paradigm.


Gartner's Hype Cycle, a methodology for describing how new technologies, and the perceptions of them, change as they emerge. (Credit: Jeremykemp via Wikimedia)

The statistical turn

By the late 1990s and early 2000s, machine learning began to eclipse symbolic AI. Instead of encoding rules by hand, new systems learned them statistically. Techniques like Bayesian inference, support vector machines, and eventually deep neural networks allowed researchers to train algorithms on massive datasets, uncovering correlations and predictive patterns that often eluded traditional analysis. 


Geoffrey Hinton, 2024 Nobel Prize Laureate in Physics (Credit: Arthur Petron via Wikimedia)

A particularly influential shift came through the work of Geoffrey Hinton, who championed architectures inspired by the brain's structure in an effort to capture its adaptability. His breakthroughs in deep learning—especially in backpropagation and deep belief networks—paved the way for modern neural networks and earned him a share of the 2024 Nobel Prize in Physics for advancing our understanding of how machines might approximate human-like perception and learning. These advances brought about a new paradigm, where systems could generalize from data rather than rely on brittle logic trees.
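For readers who want a feel for the mechanics, the sketch below (a toy example, not Hinton's own code) shows the core loop of backpropagation on the classic XOR problem: run the network forward, measure the error, and push a small correction backward through every weight.

```python
# Toy backpropagation on XOR: forward pass, error, backward pass, weight update.
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)   # the four XOR inputs
y = np.array([[0], [1], [1], [0]], dtype=float)               # the XOR targets

W1 = rng.normal(size=(2, 4)); b1 = np.zeros((1, 4))           # hidden-layer weights
W2 = rng.normal(size=(4, 1)); b2 = np.zeros((1, 1))           # output-layer weights
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
lr = 1.0                                                      # learning rate

for step in range(10000):
    # forward pass: compute the network's current prediction
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)

    # backward pass: propagate the prediction error back through each layer
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)

    # gradient descent: nudge every weight against its error gradient
    W2 -= lr * (h.T @ d_out); b2 -= lr * d_out.sum(axis=0, keepdims=True)
    W1 -= lr * (X.T @ d_h);   b1 -= lr * d_h.sum(axis=0, keepdims=True)

print(out.round(2))   # typically close to the XOR targets: [[0], [1], [1], [0]]
```

The point of the toy is the shape of the loop, not the task: no rule about XOR is ever written down, yet the behavior is learned from examples.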


These techniques, though promising, are data-hungry, and they remained largely impractical until the early 2010s. The parallel rapid rise of smartphones and big data ushered in an age of abundant personal data, which provided the fuel needed to power neural networks and other AI techniques.


With this momentum, AI began to extend beyond computer science and into data-rich scientific fields like genomics, astronomy, particle physics, and climate modeling. These disciplines, long constrained by the complexity and scale of their datasets, now had access to tools capable of making sense of it all. AI didn’t just accelerate analysis; it became indispensable.


Models could now uncover patterns that would otherwise escape human notice—like the faint signals of distant exoplanets—or even propose new directions for investigation. These early wins demonstrated how AI could complement human insight. Since then, examples have only multiplied.


But these new tools introduced a trade-off of their own: opacity. The models could predict, yet could rarely explain. They mapped correlations, not causes, and struggled to navigate science’s hard constraints. These black boxes were powerful, but they couldn’t say why a pattern existed, and they couldn’t reason from first principles.


And in science, principles matter.


AlphaFold and the line between prediction and understanding

In 2020, DeepMind’s AlphaFold changed the conversation and stunned the scientific world. The model didn’t merely correlate amino acid sequences with known protein structures; it made accurate predictions of three-dimensional folds with a level of precision unmatched by any previous system. It learned the statistical geometry of folding, effectively internalizing the patterns that emerge from the physics and chemistry of molecular interactions.


This was a turning point. AlphaFold didn’t just identify similarities—it modeled constraints and operated within the structured space of protein biology. It felt, at times, like the system understood something fundamental about the folding process.


Digital rendering of a protein structure prediction from AlphaFold (Credit: Google DeepMind)

But this appearance of understanding is exactly where the line must be drawn. AlphaFold does not explain why proteins fold as they do; it does not reveal the causal chemistry or energetic logic underlying the outcome. It predicts exceptionally well, but it cannot account for the mechanism—the deep structure of physical explanation that science demands.


Still, AlphaFold marked a shift. For the first time, an AI system moved beyond tool-like assistance and approached the role of a collaborator. It worked within the formal grammar of a scientific domain, not replacing biologists, but participating in their reasoning process. It demonstrated that a specialized, domain-sensitive model could reach beyond surface-level pattern recognition, even if it still could not explain the laws it had learned to navigate.


The rise (and limits) of the LLM generalists

Today’s public conversation about AI is dominated by large language models—systems like ChatGPT, Claude 3, and Gemini—trained on web-scale corpora and optimized to predict the next token in a sequence. This predictive engine, while remarkably versatile, is fundamentally designed for linguistic fluency, not logical precision.


LLMs can now summarize scientific papers, draft grant proposals, generate useful code, and answer technical questions with impressive surface-level competence. They are already reshaping aspects of the research pipeline. But here, the pattern recurs: broad capability masks a lack of conceptual depth.


Scientific reasoning demands rigor—formal, structured, and internally coherent. It requires adherence to the logic of models, the consistency of equations, the careful scaffolding of inference and constraint. Today’s LLMs do not reason with this kind of discipline. Instead, they are trained to predict the next most likely word (technically, token) in a sequence. Their ability to capture correlations across long stretches of text makes them remarkably good at this—but fundamentally, they remain next-word predictors. As a result, their knowledge is statistical, not structural; they simulate, and often hallucinate.
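A drastically simplified analogy makes the point. The sketch below is a bigram counter, nowhere near a real transformer, but it shows what purely statistical “knowledge” looks like: the system learns which token tends to follow which, with no model of the underlying physics.

```python
# A toy next-word predictor: a bigram counter, not a transformer, but it
# illustrates knowledge that is statistical rather than structural.
from collections import Counter, defaultdict

corpus = "energy is conserved . momentum is conserved . energy is quantized .".split()

# count how often each word follows each other word in the training text
followers = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    followers[current][nxt] += 1

def predict_next(word: str) -> str:
    # return the most frequent continuation; there is no model of physics here
    return followers[word].most_common(1)[0][0]

print(predict_next("is"))       # -> 'conserved' (it appeared most often after 'is')
print(predict_next("energy"))   # -> 'is'
```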


And this distinction is critical, because science is not a corpus to be scraped; it is a system to be understood. It operates on symmetries, conservation laws, and mathematical logic, not on token frequency or textual proximity.

Until AI systems can match the rigor that science demands, their role will remain assistive, not explanatory.



What’s at stake and the future of scientific artificial intelligence

There’s a growing risk in the current AI hype cycle: that we prioritize speed over substance, automating science for productivity rather than understanding. Papers get written faster, results summarized instantly, but the deeper work of reasoning risks being lost.


AI has made remarkable strides in prediction. AlphaFold, large language models, and other tools have extended what’s possible, often outperforming human-crafted systems at specific tasks. But prediction is not explanation, and these systems still struggle to engage with the structure, causality, and abstraction that scientific reasoning requires.


The next breakthrough in AI won’t come from bigger models or faster GPUs. It will come from systems that reason, not just simulate; that seek principles, not just patterns.


That’s how we keep the soul of science intact in an automated age.


This article was created with the assistance of artificial intelligence and thoroughly edited by FirstPrinciples staff and scientific advisors.

