on Toshareproject.it - curated by Bruce Sterling
*This is very dense, but quite interesting. The parts where he says he’s given up on certain ideas, those are especially remarkable.
ZDNet: You will also manage to tick off the Transformer people, the language-first people, at the same time. How can you build this without language first? You may manage to tick off a lot of people.
YL: Yeah, I’m used to that. So, yeah, there’s the language-first people, who say, you know, intelligence is about language, the substrate of intelligence is language, blah, blah, blah. But that, kind-of, dismisses animal intelligence. You know, we’re not to the point where our intelligent machines have as much common sense as a cat. So, why don’t we start there? What is it that allows a cat to apprehend the surrounding world, do pretty smart things, and plan and stuff like that, and dogs even better?
Then there are all the people who say, Oh, intelligence is a social thing, right? We’re intelligent because we talk to each other and we exchange information, and blah, blah, blah. There’s all kinds of nonsocial species that never meet their parents that are very smart, like octopus or orangutans. I mean, they [orangutans] certainly are educated by their mother, but they’re not social animals.
But the other category of people that I might tick off is people who say scaling is enough. So, basically, we just use gigantic Transformers, we train them on multimodal data that involves, you know, video, text, blah, blah, blah. We, kind-of, petrify everything, and tokenize everything, and then train gigantic models to make discrete predictions, basically, and somehow AI will emerge out of this. They’re not wrong, in the sense that that may be a component of a future intelligent system. But I think it’s missing essential pieces.
There’s another category of people I’m going to tick off with this paper. And it’s the probabilists, the religious probabilists. So, the people who think probability theory is the only framework that you can use to explain machine learning. And as I tried to explain in the piece, it’s basically too much to ask for a world model to be completely probabilistic. We don’t know how to do it. There’s the computational intractability. So I’m proposing to drop this entire idea. And of course, you know, this is an enormous pillar of not only machine learning, but all of statistics, which claims to be the normal formalism for machine learning.
The other thing —
ZDNet: You’re on a roll…