what is artificial intelligence?

AI means that a computer has learned by itself how to do something.

Traditionally, when we use software for a task, it relies on an algorithm. That is, humans have written a set of instructions for the computer to follow: if this then do that; else do whatever; while at it continue; or break; etc. In contrast, AI used to be called machine learning (ML) because the computer is instead given an objective and has to learn a solution by itself.
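To make this concrete, here is what hand-written instructions can look like (a toy Python sketch with made-up rules; real software is of course far more elaborate):

```python
# Traditional software: a human spells out every rule explicitly.
# A toy hand-written spam filter (hypothetical rules, for illustration only).
def is_spam(message: str) -> bool:
    if "free money" in message.lower():  # if this, then do that
        return True
    elif message.isupper():              # else: all-caps shouting is suspicious
        return True
    else:
        return False

print(is_spam("FREE MONEY NOW"))   # True
print(is_spam("Lunch at noon?"))   # False
```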

How does this learning work? Through trial and error. The computer uses a set of training examples and adjusts many "parameters". These parameters, usually called weights (and biases), control how to go from an input to the expected output.
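Here is a minimal sketch of that loop in Python: two parameters, a weight and a bias, learn to reproduce the line y = 2x + 1 from examples. (A toy illustration of the principle, nothing like a real LLM's training.)

```python
# "Learning by trial and error": nudge the parameters after every mistake.
examples = [(x, 2 * x + 1) for x in range(10)]  # training set: input -> expected output

w, b = 0.0, 0.0        # the "parameters": one weight and one bias
learning_rate = 0.01

for step in range(1000):
    for x, target in examples:
        prediction = w * x + b           # trial: go from input to output
        error = prediction - target      # how wrong were we?
        w -= learning_rate * error * x   # adjust the weight to shrink the error
        b -= learning_rate * error       # adjust the bias too

print(f"learned w={w:.2f}, b={b:.2f}")   # close to w=2, b=1
```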

[Image: poster of the A.I. movie from 2001 (an eternity ago).]

AI is very good at very complex things, where writing the rules would be far too cumbersome. But it's also not an algorithm, so humans don't know what it does internally. AI is a mini computer brain; and like our own brains, you can't just look at it to know what it does.

large language models

Because of this "black box" effect, I used to be skeptical of AI in physics. But in early 2023, shortly after ChatGPT came out, I was intrigued. I kept hearing: "no one understands how it works".

ChatGPT is what we call a Large Language Model, or LLM. It means that it's a model (the term for a trained little computer brain), that it deals with language, and that it is large: it contains billions of parameters, whose values have been acquired through training.
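To get a feel for the scale, here is how one can count the parameters of GPT-2, an early and, by today's standards, small LLM, using the Hugging Face transformers library (assuming it is installed):

```python
# Count the parameters of GPT-2 to get a sense of "large".
from transformers import GPT2LMHeadModel

model = GPT2LMHeadModel.from_pretrained("gpt2")
n_params = sum(p.numel() for p in model.parameters())
print(f"GPT-2 has {n_params / 1e6:.0f}M parameters")  # ~124M; modern LLMs have billions
```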

[Image: a machine made of Lego bricks. The insides of an LLM are like a giant machine made of billions of Lego bricks.]

In fact, LLMs are so large that they become human-level smart. But also unpredictable, because they are so complex. Rather than "black boxes", I prefer to refer to them as "Lego boxes": we can look inside, but all we see is a giant machine made of billions of bricks, far beyond human comprehension. But can we still learn something?

interpretability

Interpretability is the nascent science of understanding how LLMs think. There is evidence that the model parameters assemble into mini algorithms — in cryptic ways.

There are many different approaches. In the summer of 2023, Chris Earls convinced me that LLMs can be seen as complex systems, familiar to physicists. I joined his team at Cornell to study LLMs as natural systems and probe them for insights.

We have looked at how words travel inside the computer brain to create new text (we call these trajectories lines of thought); at how LLMs solve equations; and we have many other ideas at the intersection of computer science, physics, and even literature & linguistics. These questions also have broader philosophical underpinnings and implications.
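For a flavor of what such a trajectory looks like in practice, here is a sketch that follows one token's internal representation layer by layer through GPT-2 (a small, publicly available stand-in; the models and methods in our actual work may differ):

```python
# Trace a "line of thought": follow the last token's hidden state
# layer by layer as GPT-2 processes a prompt.
import torch
from transformers import GPT2Tokenizer, GPT2LMHeadModel

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

inputs = tokenizer("The cat sat on the", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs, output_hidden_states=True)

# outputs.hidden_states: one tensor per layer (embeddings + 12 blocks for GPT-2),
# each of shape (batch, sequence_length, hidden_size)
trajectory = [h[0, -1] for h in outputs.hidden_states]  # last token, every layer

# How far does the representation move between consecutive layers?
for i in range(1, len(trajectory)):
    step = torch.norm(trajectory[i] - trajectory[i - 1]).item()
    print(f"layer {i:2d}: moved {step:.1f}")
```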

prospects

LLMs bring a new perspective on the human condition. What is intelligence? What is art? Why is language so powerful at describing the world?

I also believe that this new, invented alien "intelligence" forces us to examine another kind of overlooked cognition: that of animals. Maybe we could even train LLMs to communicate with them? More details in Animals.