AI Unpacked with Nobel Laureate Geoffrey Hinton

9.27k views · 4,309 words
Valence
Even the “godfather of AI” Geoffrey Hinton has been surprised by the speed and scale at which AI has...
Video Transcript:
Geoff, welcome. We're so excited to have you here today. We are gathered with CHROs and heads of talent from some of the largest companies in the world, and what we're trying to do is make sense of AI. We're really wondering what it's going to be like in the future, but to understand that, I'd like to go back to the past. If we look back to, let's say, around 2010, almost 15 years ago: thinking of Geoffrey Hinton in 2010 and the predictions you made then, were you too optimistic or too pessimistic about the speed of progress? How has the field progressed since then?

So, ask me about 2016 later. I think if you had asked people in 2010, even fairly enthusiastic people who believed in neural nets, where we would be now, they wouldn't have believed we would have something like GPT-4. They would have said, "no, in the next 14 years you're not going to develop something that's an expert at everything. Not a very good expert, but an expert at everything. You're not going to be able to have a system where you can just ask any question you like, some obscure question about British tax law or some weird question about how you solve equations, and have it give you a pretty good answer, an answer better than 99% of the population could give you." That's extraordinary, and we wouldn't have predicted it.

So progress is happening faster than you anticipated?

Yes.

Can you share more? What's it like to experience that as one of the leading researchers in the space, watching it accelerate?
It's amazing, because back in the '80s, when Rumelhart reinvented backpropagation (he rediscovered it, and he and I worked together to use it for things), we thought to begin with that this was going to solve everything. We had something that could just learn, and there didn't seem to be any limit to it. And then it was very disappointing, and we didn't understand why it didn't work better. It was partly architectural things: for about 30 years we used an input-output function that looked like this, when we should have used one that looked like that. Just crazy. But it was mainly scale, and we just didn't understand that this whole idea would only really come into its own when you had a lot of connections and a lot of training data and a huge amount of compute. So we couldn't have done it back then, and if we'd said back then, "yes, but if we made one a million times bigger and had a million times more data, it'll really work," that would have just sounded like a pathetic excuse. But it turned out that was the truth.

That's fascinating.
So one of the things that you and I talked about earlier is the underselling of what large language models do. If we use the term "next-word prediction," the experience that we have is that they could be reasoning, they could have a degree of intelligence. Can you share more about how that comes about?

There are many people who say these things are just using statistical tricks, that they don't really understand what they're saying, they're just using correlations. But if you ask those people, "well, what's your model of how people understand?", if they're symbolic AI people, their model is that we have symbolic expressions and we manipulate them with symbolic rules. That never worked that well; it didn't work nearly as well as the large language models. If you ask cognitive scientists, they'll come up with a variety of explanations. But my initial tiny language model wasn't designed to do NLP, natural language processing; it was designed to show how people could learn the meanings of words. So it's a model of people, a very simplistic model, but the best model we have of how people understand sentences is these large language models. It's not like we have a different model of how people work and these work differently; the only good model of how people work that we have is like this. So I think they really do understand, and they understand in the same way as we do.

And these large language models might have that kind of embedded creativity already in them?

Yes. Many people say these language models will do routine things, but people are creative. Well, if you take a standard test of creativity, I think the large language models now do better than 90% of people. So the idea that they're not creative is crazy.
This is very relevant to the debate among artists and Silicon Valley about whether these AI models are just stealing the creations of artists. Obviously, to produce a work in a genre you have to listen to a lot of music in that genre, but it's the same with a person: whenever a person produces new music in a genre, they are "stealing" the works of previous people in just the same way the AI system is. So the AI system is not stealing them any more than another musician does.

It's fascinating: if you read analyses of the work of Picasso, he is clearly borrowing from artistic traditions, I think from Benin masks and many other areas, and he's merging them into a new approach. But he is building off things that he has seen. If AI has seen everything, there's no reason why it can't do the same thing.
Yes. So AI can be creative, and of course to be creative in a particular way you look at works of art that are done in that way. But it's hard to say that it's stealing, because what it's not doing is pasting together bits of other things. It's understanding the underlying structure, the same way a person does, and then generating new stuff with the same kind of underlying structure. So it's just very like a person creating something.

Now, you also studied psychology and the human brain in your undergrad. How does that compare to what we have in our brains?

So we have about 100 million synapses, and even though many of them are used for other things, like breathing, the cortex, the neocortex, has most of those. So we've got many more adaptable parameters than these big language models, which makes it very strange that GPT-4 knows thousands of times more than we do.

You said 100 million; I think you meant 100 trillion.

Did I say 100 million? I could be a politician: I can't tell millions from trillions. 100 trillion, yes, 100 trillion synapses.

So it's fascinating: we have large language models that are two orders of magnitude smaller than the connections in the human brain, and yet they know an enormous amount of information.

Yes. They're a not-very-good expert at everything, so they know thousands of times more than any one person. And one of the reasons they can do that is that you can have many different copies of exactly the same neural net running on different hardware. So you can get one copy to look at this bit of the internet and another copy to look at that bit of the internet. They can both figure out how they'd like to change their own weights, and if you just average those changes, then both copies have learned from the experience that each of them had. So take a thousand of those. Imagine if we could take a thousand people, and they could all go off and do a different course, and at the end everyone knew what everyone else had learned.
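To make that weight-sharing scheme concrete, here is a minimal sketch in Python with NumPy; the tiny linear model, the data shards, and the learning rate are all illustrative assumptions, not any real training setup. Two identical copies each work out the change they would like to make to the shared weights from their own shard of data, and the averaged change is applied to both:

```python
import numpy as np

rng = np.random.default_rng(0)

# One tiny linear model y = x @ w; w is the set of weights that every
# copy shares (three parameters, purely illustrative).
w = rng.normal(size=3)

def desired_change(w, x, y, lr=0.01):
    """How one copy would like to change the weights, computed from its
    own shard of data (a gradient step on mean squared error)."""
    grad = 2 * x.T @ (x @ w - y) / len(y)
    return -lr * grad

# Two copies of the same net look at different "bits of the internet".
shard_a = (rng.normal(size=(64, 3)), rng.normal(size=64))
shard_b = (rng.normal(size=(64, 3)), rng.normal(size=64))

for step in range(200):
    delta_a = desired_change(w, *shard_a)  # copy A's proposed change
    delta_b = desired_change(w, *shard_b)  # copy B's proposed change
    # Averaging the changes means both copies learn from the
    # experience each of them had.
    w = w + (delta_a + delta_b) / 2
```

Scaled up to thousands of copies, this is essentially data-parallel training: the copies share what they learned by exchanging weight changes rather than words.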
We've talked a little bit about memory and how memory is stored in the human brain, and we've talked about fast weights and how those can adjust. Is there anything missing in an LLM architecture, anything the human brain still does exceptionally better?

I think we still learn better from limited data, and we don't quite know how we do that. We know the human brain has changes in connection strengths at many different time scales. The first time I met Terry Sejnowski, in 1979, that was basically the first thing we talked about: how these neural net models have just two time scales. They have the time scale of the activities of the neurons changing, so each time you put in a different sentence the neural activities change, and then they have the values of the weights, the connection strengths, and those change very slowly; that's where all the knowledge is. And they just have those two time scales. Now, you could have many more time scales. Just suppose you have one more, where you have the weights that change slowly, but you have an overlay of weights that change much faster and decay quickly. That gives you all sorts of extra nice properties. For example, if I say an unexpected word to you, like "cucumber," and a couple of minutes later I put headphones on you and play words through lots of noise, so you can only just hear them and can't quite make most of them out, you'll be considerably better at making out the word "cucumber," because you heard it two minutes ago. So the question is: where is that stored? It's not stored in neural activities; you can't afford to do that, it would use up too many neurons. And it's not stored in the long-term weights, because in a few days' time it will be gone. It's stored in short-term changes to the synapse strengths, and we don't have that in the models at present.
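Here is a minimal sketch of that extra time scale in Python; the class name, learning rates, and decay factor are illustrative assumptions rather than any published fast-weights model. The effective connection strength is a slow, persistent weight plus a fast overlay that takes large steps but decays away:

```python
import numpy as np

class TwoTimescaleWeights:
    """Toy connection strengths with two learning time scales."""

    def __init__(self, shape, slow_lr=0.001, fast_lr=0.1, fast_decay=0.5):
        self.slow = np.zeros(shape)  # long-term knowledge: small, lasting changes
        self.fast = np.zeros(shape)  # short-term overlay: big changes, quick decay
        self.slow_lr, self.fast_lr, self.fast_decay = slow_lr, fast_lr, fast_decay

    def effective(self):
        # What the network actually computes with at this moment.
        return self.slow + self.fast

    def update(self, change):
        # A small, permanent adjustment to the long-term weights...
        self.slow += self.slow_lr * change
        # ...plus a large but decaying adjustment to the overlay, so a
        # recent input ("cucumber") stays easy to retrieve for a while
        # without being held in neural activity, and fades soon after.
        self.fast = self.fast_decay * self.fast + self.fast_lr * change
```

With only the two standard time scales (neural activities and slow weights), there is nowhere to keep that minutes-to-days kind of memory, which is the gap being pointed at here.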
My undergraduate research was actually looking at something very similar, except it was pre-perceptual. You would flash the word "cucumber" so quickly that you didn't notice you'd seen it, subliminally, and then you would be more likely to pick it up if you either saw it in a collection of words or listened for it. So there was a question of how you processed the word "cucumber" without realizing it, in such a way that your brain stored it and was able to recognize it more quickly.

I think there's also a phenomenon where you flash the word "cucumber" and you'll be better at recognizing the word "lettuce."

Yes, and that was unconscious; it was the association of similar words.

Yes, so it's not just that you got the word: you got the semantics of the word, without any consciousness.
Can you share some examples of how, when you introduce new information to an LLM that it might not have had in its training data, it can reason over that and come up with an answer, similar to how a human might reason by analogy?

Well, I can give a nice example of it doing an analogy that most people can't do.

I would love to hear that.

So I asked GPT-4 some time ago, when it wasn't hooked up to the web: why is a compost heap like an atom bomb? I would not be able to answer that question.

Excellent.

It said the time scales are very different and the energy scales are very different, and then it went on about chain reactions: how in a compost heap, the hotter it gets, the faster it generates heat, and in an atom bomb, the more neutrons it's producing, the faster it generates neutrons. So GPT-4 had seen the underlying physical similarity. Now, it probably didn't see it when I asked the question; it had probably seen it during training. We see a lot of analogies, and we actually store things in the weights, and it's much easier to store things in weights if they're analogous structures, because you can share the weights. These large language models are just the same: in order to store huge amounts of information, they have to see analogies between the different facts they're learning. And they will have seen many analogies that no person has ever seen.

So this is fascinating: in order to compress that amount of information into that few parameters, they have to implicitly understand and codify analogies in their weights, and many of those are analogies at a deep level, like between a compost heap and an atom bomb. They might have embedded in the weights, right now, analogies that we as humans have not actually thought of ourselves.

Yes, because GPT-4 is a not-very-good expert at physics, but it's also a not-very-good expert at ancient Greek literature, and it may well be that something in ancient Greek literature is rather like some weird thing in quantum mechanics, but no one person has ever seen those two things.
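The link drawn here between analogy and compression can be illustrated with a deliberately loose toy in Python: ordinary text compression, which is not a claim about how transformers store facts, but shows the same principle that two facts sharing an underlying structure are cheaper to store together, because the shared pattern only has to be represented once:

```python
import zlib

# Two facts with the same underlying chain-reaction structure.
fact_a = b"in a compost heap, the hotter it gets, the faster it generates heat"
fact_b = b"in an atom bomb, the more neutrons it has, the faster it generates neutrons"

separately = len(zlib.compress(fact_a)) + len(zlib.compress(fact_b))
together = len(zlib.compress(fact_a + b"\n" + fact_b))

# The shared ", the faster it generates " pattern is encoded once in the
# joint stream, so it is smaller than the two separate streams combined.
print(separately, together)
```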
So in 2010 you started understanding what was possible. You and Ilya won ImageNet; Alex, I think, was Alex Krizhevsky, and it's called AlexNet.

AlexNet, that's right. He was an amazing coder, and he managed to code convolutional nets on NVIDIA GPUs much more efficiently than anybody else.

And at that point you had started to see that scale matters. Over the past ten years, then: 2016. Why is that moment an important moment for you?

Oh, the reason I mentioned 2016 is that I made a prediction in 2016 that was wrong in the opposite direction. I predicted that in five years' time we wouldn't need radiologists anymore. This upset some radiologists, and it turned out to be wrong: I was off by a factor of two, possibly even a factor of three. The time is going to come, and I meant for reading scans; I actually think I said at the time "five years, maybe ten." In maybe ten years from now, I'm very confident that the way almost all medical scans will be read is that an AI will read them and the doctor will check it. The AI is just going to get much better than doctors; AI can see much more in scans than doctors can. My wife had cancer, and she'd get CAT scans every so often, and they'd say the tumor is 2 cm, and then a month later they'd say the tumor is 3 cm. Well, this thing was shaped like an octopus, and one number is not a very good measure of the size of an octopus, right? You'd like to know much more about what's going on, and with AI we can do that. Doctors can't, because they don't know what the outcomes are. But I think with AI we're going to be able to see things about cancers that will tell you whether they're going to metastasize soon, and stuff like that. We know there's lots more information in the images that isn't being used.

Well, as you said earlier: if you've got 500 doctors who can each spend a lifetime looking at 500 images and seeing their progression, and you then compress their brains, that's vastly more information than any one doctor has.

Yes. No radiologist can train on enough data to compete with these things once they're really good at vision.
But, for example, in tuition we're going to get very good AI tutors, and there's a lot of research showing that if you take a school kid and put them in a classroom, they'll learn at a certain rate, and if you give them a private tutor, they learn twice as fast. We know that AI is approaching being good enough to understand what people are misunderstanding, and as soon as you get private tuition from an entity that knows what you don't understand, that's going to be a much more efficient way of learning than just sitting in a classroom listening to a broadcast. So I think in healthcare and in education there are going to be huge advantages.

I want to spend a moment on that education example, because we've been inspired by it: the idea of a tutor for everyone learning in traditional education, and a leadership coach for everyone at work. For us, this idea of personalization matters. Do you think AI can understand you and your context, almost like a librarian for the world's information, but just for you?

Absolutely. A few weeks ago I won a Nobel Prize, and I've never had a personal assistant before, and the university gave me a personal assistant. She now understands quite a lot about me, and it's wonderful. And everybody could have that, if we can do it with AI.

That's fascinating. And you had to bring her up to speed, give her your context; if she had infinite access to your information, she'd be even more helpful.

Yeah. I think that's the good scenario, where we all get these really intelligent personal assistants that know everything about us and help.
When we think about building AI products, something that gets tossed around a lot is human-machine, or human-model, empathy: helping users understand what they should expect from a model so they know how to channel it properly. How do you think about that for software?

Well, there's one experiment where you have AI doctors and real doctors, and they interact with patients, and then you ask the patients how they would rate them for empathy. The AI ones do much better; the AI ones actually listen to the patients. So already they can exhibit empathy. It may be that we think of empathy as: you think, "how would that be for me?", and then you say, "oh my God, that would be awful for me, I'm so sorry," and maybe they don't do that. But nevertheless, behaviorally, they seem to exhibit empathy pretty well. And we would like that: if you had an AI tutor, you'd like it to have empathy about the fact that the pupil misunderstood something, and I'm sure they're going to be able to do that.

And I think you would say, correct me if I'm wrong, that if it exhibits empathy, it might be doing it in the same way that we exhibit empathy, and therefore it's not just performative empathy; it's going to come across as genuine empathy. Is that right?

It might be genuine empathy. I think for us to call it genuine empathy, the AI would have to be similar enough to us that it could imagine what it would be like for it. We tend to think of empathy as the ability to imagine what it would be like for you, and then to understand how it is for the other person. If you're not doing that, if you're just saying, "oh, that's terrible, I'm so sorry about that," without thinking of how it would be for you, that seems like less genuine empathy. And AIs can certainly do the former.

I definitely agree with that, but I think part of the beauty of literature is that it puts you in other people's positions, and you can experience things through that: you can say, "well, I've never been in that position, but I've now lived that experience." And if you have the world's literature compressed into that model, it might be able to understand what a whole range of humans would be going through, even more than I would, and exhibit empathy to that.

They might, yes.

That's really interesting. I want to zoom out to the societal side of things. We've seen an enormous amount of hype, an enormous amount of coverage of LLMs in the past couple of years. One of the things you and I talked about is the analogy of how difficult it is to see the future when things are growing exponentially. Can you share a little bit more about how you're experiencing that?
Yeah, we're not used to exponential growth. A good analogy: if you're driving at night on a winding road that you don't know, you often drive on the tail lights of the car in front of you, and as the car gets further away from you, the tail lights get dimmer. They get dimmer quadratically: if you triple the distance, they get dimmer by a factor of nine. That's why you try to stay close. With fog, it's not like that at all; it's totally different. With fog, if you can see clearly at, say, 100 yards, you just assume you'll be able to see something at 200 yards. But actually you can see clearly at 100 yards and then nothing at 200 yards, because fog is exponential: per unit distance, it removes a certain fraction of the light. It's very different from the linear or quadratic things we're used to. People don't really understand the word "exponential," because it's misused so much; people misuse "exponential" to mean "a lot." In fact, I think the rate at which they're misusing "exponential" is growing quadratically.

That reminds me of a riddle I used to love as a child. You have a pond that starts with one lily in it, and the lilies double every day, until on the 30th day the lilies cover the pond and obliterate the sunlight, so the pond dies. On which day is the pond half filled with lilies? The answer is the 29th day, but the intuition people have is, "oh, maybe it's around the 15th." So it's sometimes hard to understand, because we don't live in that experience, what exponential growth could be like.
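The riddle is a one-step exponential calculation; a quick check in Python, with the day numbers as given in the riddle above:

```python
# Lily coverage doubles daily and fills the pond on day 30, so walk
# back one doubling at a time until only half the pond is covered.
coverage, day = 1.0, 30
while coverage > 0.5:
    coverage, day = coverage / 2, day - 1
print(day, coverage)  # -> 29 0.5: half full just one day before the end
```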
As you think about the future of work (we talked a little bit about the workforce): a world where everyone has an assistant is obviously wonderful; a world of jobs being replaced is obviously going to cause a lot of social stress. How should people who are leading large companies think about navigating the next two to three years? There's obviously joblessness.

So, we just don't know whether AI is going to get rid of a lot of jobs. I suspect it is; Yann LeCun, my friend, thinks it isn't. In the past, things like automatic teller machines didn't cause massive unemployment among tellers; the tellers just ended up doing more interesting, complicated things, and taking longer about them, so you have to queue for a long time. So maybe it'll produce joblessness, maybe it won't. I suspect there are some kinds of jobs where you could use a lot more of what's produced. For example, if AI made doctors more efficient, we could all, and especially old people, use a lot more doctor's time. If you got doctors who were ten times as efficient, we'd just get ten times as much healthcare. Great. There are other things, though, that aren't like that, and what will happen there is that one person with an AI assistant will be doing the job that ten people used to do, and the other nine people will be unemployed. The problem with that is that you get an increase in productivity, which should help people, but you also get nine people unemployed and one rich person who gets a bit richer, and that's very bad for society. Obviously we can't see very far into the future: if you take the fog analogy, I think the wall comes down at three to five years. We're fairly confident we've got some idea of what's going to happen in the next few years; in ten years' time, we have no idea what's going to happen. And you can see that by looking ten years back: we had no idea this was going to happen. I think companies should navigate it by going in the direction of everybody having an intelligent AI assistant, so that people feel they're going to get improved working conditions from this smart assistant, and you're going to get increases in productivity that will be great for everybody.
The next five years are going to be extraordinarily eventful, for lack of a better word. You've played an enormous role in helping us get here: getting through the AI winter, getting through those moments when it might not have felt quite as clear as it does now. I just wanted to say what an honor it's been to have this conversation, and thank you.

Well, thanks very much for inviting me. It's been fun.

I really enjoyed it. Thank you so much.

You're welcome.