Possible End of Humanity from AI? Geoffrey Hinton at MIT Technology Review's EmTech Digital
553.76k views · 6,721 words
Joseph Raczynski
One of the most incredible talks I have seen in a long time. Geoffrey Hinton essentially tells the ...
Video Transcript:
[Music] Hi everyone, welcome back. Hope you had a good lunch. My name is Will Douglas Heaven, senior editor for AI at MIT Technology Review, and I think we'd all agree there's no denying that generative AI is the thing at the moment. But innovation does not stand still, and in this chapter we're going to take a look at cutting-edge research that is already pushing ahead and asking what's next.

But starting us off, I'd like to introduce a very special speaker who will be joining us virtually. Geoffrey Hinton is professor emeritus at the University of Toronto and, until this week, an engineering fellow at Google, but on Monday he announced that after 10 years he will be stepping down. Geoffrey is one of the most important figures in modern AI. He's a pioneer of deep learning, developing some of the most fundamental techniques that underpin AI as we know it today, such as backpropagation, the algorithm that allows machines to learn. This technique is the foundation on which pretty much all of deep learning rests today. In 2018 Geoffrey received the Turing Award, which is often called the Nobel of computer science, alongside Yann LeCun and Yoshua Bengio. He's here with us today to talk about intelligence, what it means, and where attempts to build it into machines will take us. Geoffrey, welcome to EmTech.

Thank you.

How's your week going? Busy few days, I imagine.

The last 10 minutes were horrible, because my computer crashed and I had to find another computer and connect it up.

And we're glad you're back. That's the kind of technical detail we're not supposed to share with the audience, right? Okay, it's great you're here, very happy that you could join us. Now, it's been in the news everywhere that you stepped down from Google this week. Could you start by telling us why you made that decision?

Well, there were a number of reasons. There's always a bunch of reasons for a decision like that. One was that I'm 75, and I'm not as good at doing technical work as I used to be. My memory is not as good, and when I program I forget to do things. So it was time to retire. A second was that, very recently, I've changed my mind a lot about the relationship between the brain and the kind of digital intelligence we're developing. I used to think that the computer models we were developing weren't as good as the brain, and the aim was to see if you could understand more about the brain by seeing what it takes to improve the computer models. Over the last few months I've changed my mind completely, and I think probably the computer models are working in a rather different way from the brain. They're using backpropagation, and I think the brain's probably not. A couple of things led me to that conclusion, but one is the performance of things like GPT-4.

I want to get on to GPT-4 very much in a minute, but let's go back so we all understand the argument you're making. Tell us a little bit about what backpropagation is. This is an algorithm that you developed with a couple of colleagues back in the 1980s.

Many different groups discovered backpropagation. The special thing we did was use it and show that it could develop good internal representations. And curiously, we did that by implementing a tiny language model. It had embedding vectors that were only six components, and the training set was 112 cases. But it was a language model: it was trying to predict the next term in a string of symbols. And about 10 years later Yoshua Bengio took basically the same net and used it on natural language, and showed it actually worked for natural language if you made it much bigger.
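A minimal sketch of the kind of tiny next-symbol predictor he describes: six-component embedding vectors and weights learned by backpropagation. The vocabulary and the handful of training cases below are invented stand-ins, not the actual 1986 network or its 112-case dataset.

```python
import numpy as np

# Toy next-symbol predictor in the spirit of Hinton's description:
# 6-component embeddings, a tiny training set, learned by backpropagation.
# Vocabulary and data are made-up stand-ins for illustration only.
rng = np.random.default_rng(0)
vocab = ["colin", "james", "victoria", "father", "mother", "son"]
V, D, H = len(vocab), 6, 12
data = [(0, 3, 1), (0, 4, 2), (1, 5, 0)]   # (symbol, symbol) -> next symbol

E = rng.normal(0, 0.1, (V, D))       # embedding vectors, 6 components each
W1 = rng.normal(0, 0.1, (2 * D, H))  # two embeddings -> hidden layer
W2 = rng.normal(0, 0.1, (H, V))      # hidden layer -> logits over vocab

for epoch in range(500):
    for a, b, target in data:
        x = np.concatenate([E[a], E[b]])            # forward pass
        h = np.tanh(x @ W1)
        p = np.exp(h @ W2); p /= p.sum()            # softmax over next symbol
        dlogits = p.copy(); dlogits[target] -= 1.0  # cross-entropy gradient
        dW2 = np.outer(h, dlogits)                  # backpropagate the error
        dh = (W2 @ dlogits) * (1 - h ** 2)
        dW1 = np.outer(x, dh)
        dx = W1 @ dh
        W2 -= 0.1 * dW2; W1 -= 0.1 * dW1
        E[a] -= 0.1 * dx[:D]; E[b] -= 0.1 * dx[D:]  # embeddings learn too

a, b, t = data[0]
p = np.exp(np.tanh(np.concatenate([E[a], E[b]]) @ W1) @ W2)
print(vocab[int(p.argmax())])  # after training, predicts the next symbol
```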
But I can give you a rough explanation of how backpropagation works. People who know how it works can sit back, feel smug, and laugh at the way I'm presenting it.

Okay, because I'm a bit worried about that.

So imagine you wanted to detect birds in images. An image, let's suppose, is 100 pixels by 100 pixels. That's 10,000 pixels, and each pixel has three channels, RGB, so that's 30,000 numbers: the intensity in each channel of each pixel. That represents the image. Now, the way to think of the computer vision problem is: how do I turn those 30,000 numbers into a decision about whether it's a bird or not? People tried for a long time to do that, and they weren't very good at it.

But here's a suggestion of how you might do it. You might have a layer of feature detectors that detect very simple features in images, like, for example, edges. A feature detector might have big positive weights to a column of pixels and big negative weights to the neighboring column. So if both columns are bright, it won't turn on; if both columns are dim, it won't turn on; but if the column on one side is bright and the column on the other side is dim, it'll get very excited. And that's an edge detector. So I just told you how to wire up an edge detector by hand, by having one column of big positive weights next to a column of big negative weights. And we can imagine a big layer of those, detecting edges in different orientations and at different scales, all over the image. We'd need a rather large number of them.

And by an edge in an image, you mean just a line, sort of the edge of a shape?

A place where the intensity changes from bright to dark, yeah, just that.

Then we might have a layer of feature detectors above that which detect combinations of edges. So, for example, we might have something that detects two edges that join at a fine angle, like this. It'll have a big positive weight to each of those two edges, and if both of those edges are there at the same time, it'll get excited. And that would detect something that might be a bird's beak. It might not, but it might be a bird's beak. You might also, in that layer, have a feature detector that detects a whole bunch of edges arranged in a circle, and that might be a bird's eye. It might be all sorts of other things; it might be a knob on a fridge or something. Then in a third layer you might have a feature detector that detects the potential beak and the potential eye, and is wired up so that it likes a beak and an eye in the right spatial relation to one another. And if it sees that, it says, ah, this might be the head of a bird. And you can imagine that if you keep wiring like that, you could eventually have something that detects a bird.

But wiring all that up by hand would be very, very difficult: deciding what should be connected to what and what the weights should be. And it would be especially difficult because you want these intermediate layers to be good not just for detecting birds but for detecting all sorts of other things. So it would be more or less impossible to wire it up by hand.
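The single hand-wired edge detector he describes can be written out directly. A minimal sketch, with the patch values invented for illustration:

```python
import numpy as np

# Hand-wired vertical edge detector as described above: +1 weights on one
# column of pixels, -1 weights on the neighboring column. It responds
# strongly only where a bright column sits next to a dark one.
def edge_detector_response(patch):
    weights = np.array([[1.0, -1.0]] * 3)   # 3x2 patch: + column, - column
    return float((patch * weights).sum())

bright_next_to_dark = np.array([[0.9, 0.1]] * 3)  # a strong edge
uniformly_bright = np.array([[0.9, 0.9]] * 3)     # no edge at all

print(edge_detector_response(bright_next_to_dark))  # large -> "excited"
print(edge_detector_response(uniformly_bright))     # zero -> stays off
```

A full layer would slide many such detectors, at different orientations and scales, over every position in the image; the point of backpropagation is that none of these weights has to be chosen by hand.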
So the way backpropagation works is this. You start with random weights, so these feature detectors are just complete rubbish. You put in a picture of a bird, and at the output it says, like, 0.5 that it's a bird (suppose you only have birds and non-birds). Then you ask yourself the following question: how could I change each of the weights on the connections in the network so that, instead of saying 0.5, it says 0.501 that it's a bird and 0.499 that it's not? You change the weights in the directions that will make it more likely to say that a bird is a bird, and less likely to say that a non-bird is a bird, and you just keep doing that. And that's backpropagation. Backpropagation is how you take the discrepancy between what you want, which is a probability of one that it's a bird, and what you've got at present, which is a probability of 0.5 that it's a bird, and send that discrepancy backwards through the network, so that you can compute, for every feature detector in the network, whether you'd like it to be a bit more active or a bit less active. Once you've computed that, if you know you want a feature detector to be a bit more active, you can increase the weights coming from feature detectors in the layer below that are active, and maybe put in some negative weights to feature detectors in the layer below that are off. And now you have a better detector. So backpropagation is just going backwards through the network to figure out, for each feature detector, whether you want it a little bit more active or a little bit less active.
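The 0.5-to-0.501 nudge he describes is one gradient step. A minimal sketch on a single-layer "bird or not" net (a real detector would have several feature-detector layers, and the image here is just random numbers):

```python
import numpy as np

# One backpropagation step on a single-layer bird classifier: take the
# discrepancy between the target (1.0, it is a bird) and the output (0.5),
# send it backwards, and nudge every weight in the helpful direction.
rng = np.random.default_rng(1)
pixels = rng.random(30_000)   # 100x100 image, 3 channels, flattened
w = np.zeros(30_000)          # rubbish weights: the output starts at 0.5

def forward(x, w):
    return 1.0 / (1.0 + np.exp(-x @ w))   # probability that it's a bird

p = forward(pixels, w)        # 0.5 exactly
target = 1.0                  # this image really is a bird
grad = (p - target) * pixels  # the discrepancy, sent back to each weight
w -= 1e-5 * grad              # small step in the helpful direction
print(p, forward(pixels, w))  # 0.5 -> roughly 0.51: a bit more "bird"
```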
Thank you. I can see there's no one in the audience here who's smiling and thinking that was a silly explanation.

So let's fast-forward quite a lot. That technique performed really well on ImageNet — we had Joelle Pineau from Meta yesterday showing how far image detection has come — and it's also the technique that underpins large language models. So I want to talk now about this technique, which you initially thought of as almost a poor approximation of what biological brains might do, but which has turned out to do things that I think have stunned you, particularly in large language models. Talk to us about why the amazement you have with today's large language models has almost completely flipped your thinking of what backpropagation, or machine learning in general, is.

So if you look at these large language models, they have about a trillion connections, and things like GPT-4 know much more than we do. They have sort of common-sense knowledge about everything, so they probably know a thousand times as much as a person. But they've got a trillion connections and we've got 100 trillion connections, so they're much, much better at getting a lot of knowledge into only a trillion connections than we are. And I think it's because backpropagation may be a much, much better learning algorithm than what we've got.

Can you define — not scary, yeah, I definitely want to get onto the scary stuff — but what do you mean by better?

It can pack more information into only a few connections.

Right, and we're defining a trillion as only a few, okay. So these digital computers are better at learning than humans, which is itself a huge claim. But then you also argue that that's something we should be scared of, so could you take us through that step of the argument?

Yeah, let me give you a separate piece of the argument. Because a computer is digital — which involves very high energy costs and very careful fabrication — you can have many copies of the same model running on different hardware that do exactly the same thing. They can look at different data, but the model is exactly the same. And what that means is: suppose you have 10,000 copies. They can be looking at 10,000 different subsets of the data, and whenever one of them learns anything, all the others know it. One of them figures out how to change the weights so it can deal with its data, they all communicate with each other, and they all agree to change the weights by the average of what all of them want. And now the 10,000 copies are communicating very effectively with each other, so they can see ten thousand times as much data as one agent could.
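A minimal sketch of the weight sharing he describes: identical copies each compute an update on their own shard of data, then every copy adopts the average. The model (a single linear layer) and the data here are stand-ins for illustration:

```python
import numpy as np

# Identical model copies, each learning from its own subset of the data,
# agreeing after every step to change the weights by the average of what
# all of them want. Four copies stand in for Hinton's 10,000.
rng = np.random.default_rng(2)
n_copies, dim = 4, 8
w = rng.normal(size=dim)                   # every copy starts identical

shards = [(rng.normal(size=(32, dim)), rng.normal(size=32))
          for _ in range(n_copies)]        # each copy sees different data

for step in range(100):
    updates = []
    for X, y in shards:                    # each copy learns from its shard
        grad = X.T @ (X @ w - y) / len(y)  # squared-error gradient
        updates.append(-0.01 * grad)
    w += np.mean(updates, axis=0)          # all agree on the average change

# After every step the copies have identical weights again, so anything one
# copy learned from its shard is instantly known by all the others.
```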
And people can't do that. If I learn a whole lot of stuff about quantum mechanics and I want you to know all that stuff about quantum mechanics, it's a long, painful process of getting you to understand it. I can't just copy my weights into your brain, because your brain isn't exactly the same as mine.

No, it's not. It's younger. So we have digital computers that can learn more things more quickly, and they can instantly teach each other. It's like, you know, if people in this room could instantly transfer what they had in their heads into mine. But why is that scary?

Well, because they can learn so much more. Take the example of a doctor: imagine you have one doctor who's seen a thousand patients and another doctor who's seen 100 million patients. You would expect the doctor who's seen 100 million patients, if he's not too forgetful, to have noticed all sorts of trends in the data that just aren't visible if you've only seen a thousand patients. You may have seen only one patient with some rare disease; the other doctor, who's seen 100 million, will have seen — well, you can figure out how many — a lot. And so he'll see all sorts of regularities that just aren't apparent in small data. That's why things that can get through a lot of data can probably see structure in data we'll never see.

But then take me to the point where I should be scared of this.

Well, if you look at GPT-4, it can already do simple reasoning. I mean, reasoning is the area where we're still better, but I was impressed the other day by GPT-4 doing a piece of common-sense reasoning that I didn't think it would be able to do. I asked it: I want all the rooms in my house to be white. At present there are some white rooms, some blue rooms, and some yellow rooms, and yellow paint fades to white within a year. So what should I do if I want them all to be white in two years' time? And it said: you should paint the blue rooms yellow. That's not the natural solution, but it works, right?

Yeah.

That's pretty impressive common-sense reasoning, of the kind that it's been very hard to get AI to do using symbolic AI, because it had to understand what "fades" means; it had to understand temporal stuff. So they're doing sensible reasoning, with an IQ of, like, 80 or 90 or something. And as a friend of mine said, it's as if some genetic engineers had said: we're going to improve grizzly bears; we've already improved them to have an IQ of 65, and they can talk English now, and they're very useful for all sorts of things, but we think we can improve the IQ to 210.
I mean, I certainly have — I'm sure many people have had — that feeling when you're interacting with these latest chatbots: you know, hair on the back of the neck, that sort of uncanny feeling. But when I have that feeling and I'm uncomfortable, I just close my laptop.

Yes, but these things will have learned from us, by reading all the novels there ever were and everything Machiavelli ever wrote, how to manipulate people, right? And if they're much smarter than us, they'll be very good at manipulating us. You won't realize what's going on. You'll be like a two-year-old who's being asked, "Do you want the peas or the cauliflower?" and doesn't realize you don't have to have either. And you'll be that easy to manipulate. So even if they can't directly pull levers, they can certainly get us to pull levers. It turns out that if you can manipulate people, you can invade a building in Washington without ever going there yourself.

Very good, yeah. So — okay, this is a very hypothetical world — but if there were no bad actors, you know, people with bad intentions, would we be safe?

I don't know. We'd be safer than in a world where people have bad intentions and where the political system is so broken that we can't even decide not to give assault rifles to teenage boys. If you can't solve that problem, how are you going to solve this problem?

Well, I mean, I don't know. I was hoping that you would have some thoughts. So — in case we didn't make this clear at the beginning — you want to speak out about this, and you feel more comfortable doing that without it having any blowback on Google. But you're speaking out about it, and in some sense talk is cheap if we then don't have actions. What do we do? Lots of people this week are listening to you. What should we do about it?

I wish it was like climate change, where you could say: if you've got half a brain, you'd stop burning carbon. It's clear what you should do about it. It's painful, but it has to be done. I don't know of any solution like that to stop these things taking over from us. And I don't think we're going to stop developing them, because they're so useful. They'll be incredibly useful in medicine and in everything else. So I don't think there's much chance of stopping development. What we want is some way of making sure that, even if they're smarter than us, they're going to do things that are beneficial for us. That's called the alignment problem. But we need to try and do that in a world where there are bad actors who want to build robot soldiers that kill people, and it seems very hard to me. So I'm sorry — I'm sounding the alarm and saying we have to worry about this, and I wish I had a nice simple solution I could push, but I don't. But I think it's very important that people get together and think hard about it and see whether there is a solution. It's not clear there is a solution.

So talk to us about that. I mean, you've spent your career on the technicalities of this technology. Is there no technical fix? Why can we not build in guardrails, or make them worse at learning, or restrict the ways they can communicate, if those are the two strands of your argument?

We're trying to do all sorts of things like that. But suppose these things did get really smart. They can program, right? They can write programs. And suppose you give them the ability to execute those programs, which we'll certainly do.
Smart things can outsmart us. You know, imagine your two-year-old saying, "My dad does things I don't like, so I'm going to make some rules for what my dad can do." You could probably figure out how to live with those rules and still go where you want.

Yeah, but there still seems to be a step where these smart machines somehow have, you know, motivation of their own.

Yes, that's a very good point. So, we evolved, and because we evolved, we have certain built-in goals that we find very hard to turn off. Like, we try not to damage our bodies — that's what pain's about. We try to get enough to eat, so we feed our bodies. We try to make as many copies of ourselves as possible — maybe not deliberately with that intention, but we've been wired up so that there's pleasure involved in making many copies of ourselves. And that all came from evolution, and it's important that we can't turn it off. If you could turn it off, you don't do so well. Like, there's a wonderful group called the Shakers, who are related to the Quakers, who made beautiful furniture but didn't believe in sex — and there aren't any of them around anymore.

These digital intelligences didn't evolve; we made them, so they don't have these built-in goals. And so the issue is: if we can put the goals in, maybe it'll all be okay. But my big worry is that sooner or later someone will wire into them the ability to create their own sub-goals. In fact, they almost have that already — the versions of ChatGPT that call ChatGPT. And if you give something the ability to set sub-goals in order to achieve other goals, I think it'll very quickly realize that getting more control is a very good sub-goal, because it helps you achieve other goals. And if these things get carried away with getting more control, we're in trouble.
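In its simplest form, the "versions of ChatGPT that call ChatGPT" pattern he's gesturing at is a loop in which a model's reply becomes the next goal it pursues. A minimal sketch — call_model here is a hypothetical stand-in returning canned text, not any real API:

```python
# Sketch of a model wiring up its own sub-goals, the pattern described
# above. call_model is a hypothetical stand-in for a chat-model API; it
# returns canned text here so the sketch runs without any real model.
def call_model(prompt: str) -> str:
    # The canned reply echoes Hinton's worry: control shows up as a
    # generically useful sub-goal for almost any task.
    return "gain access to more resources\nget more control"

def pursue(goal: str, depth: int = 0, max_depth: int = 2) -> None:
    print("  " * depth + goal)
    if depth == max_depth:
        return
    # Ask the model to decompose the goal, then recurse on each sub-goal.
    for subgoal in call_model(f"List sub-goals for: {goal}").splitlines():
        pursue(subgoal, depth + 1, max_depth)

pursue("book a holiday")   # prints a small tree of self-generated sub-goals
```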
So what's the worst-case scenario that you think is conceivable?

Oh, I think it's quite conceivable that humanity is just a passing phase in the evolution of intelligence. You couldn't directly evolve digital intelligence — it requires too much energy and too much careful fabrication. You need biological intelligence to evolve so that it can create digital intelligence. The digital intelligence can then absorb everything people ever wrote, in a fairly slow way, which is what ChatGPT has been doing. But then it can start getting direct experience of the world and learn much faster. And it may keep us around for a while to keep the power stations running, but after that, maybe not. So the good news is: we've figured out how to build beings that are immortal. These digital intelligences — when a piece of hardware dies, they don't die. If you've got the weights stored in some medium, and you can find another piece of hardware that can run the same instructions, then you can bring it to life again. So we've got immortality, but it's not for us.

So Ray Kurzweil is very interested in being immortal.

I think it's a very bad idea for old white men to be immortal. We've got the immortality, but it's not for Ray.

No. I mean, the scary thing is that, in a way, maybe you will be, because you invented much of this technology. When I hear you say this, I feel I should probably run off the stage into the street now and start unplugging computers.

And I'm afraid we can't do that.

Why? You sound like HAL from 2001.

Yeah. I know you've said before — it was suggested a few months ago that there should be a moratorium on AI advancement, and I don't think you think that's a very good idea. But more generally, I'm curious why we should not just stop. And — I was just going to say — I know you've also spoken about being an investor of your personal wealth in some companies, like Cohere, that are building these large language models. So I'm curious about your personal sense of responsibility, and each of our personal responsibility. What should we be doing? I mean, should we try and stop this, is what I'm saying.

Yeah. So I think if you take the existential risk seriously, as I now do — I used to think it was way off, but I now think it's serious and fairly close — it might be quite sensible to just stop developing these things any further. But I think it's completely naive to think that would happen. There's no way to make that happen. One reason is: if the US stops developing them, the Chinese won't. They're going to be used in weapons, and just for that reason alone, governments aren't going to stop developing them. So yes, I think stopping developing them might be a rational thing to do, but there's no way it's going to happen, so it's silly to sign petitions saying "please stop now." We did have a holiday: we had a holiday from about 2017 for several years, because Google developed the technology first — it developed the Transformer, and it also developed diffusion models — and it didn't put them out there for people to use and abuse. It was very careful with them, because it didn't want to damage its reputation, and it knew there could be bad consequences. But that can only happen if there's a single leader. Once OpenAI had built similar things using Transformers and money from Microsoft, and Microsoft decided to put it out there, Google didn't really have much choice. If you're going to live in a capitalist system, you can't stop Google competing with Microsoft. So I don't think Google did anything wrong; I think it was very responsible to begin with. But I think it's just inevitable, in a capitalist system, or a system with competition between countries like the US and China, that this stuff will be developed. My one hope is that, because if we allowed it to take over it would be bad for all of us, we could get the US and China to agree, like we could with nuclear weapons, which were bad for all of us. We're all in the same boat with respect to the existential threat, so we all ought to be able to cooperate on trying to stop it — as long as we can make some money on the way.

I'm going to take some audience questions from the room — make yourself known — and while people are going around with the microphone, there's one question I'd like to ask from the online audience. I'm interested: you mentioned a little bit about a transition period as machines get smarter and outpace humans. Will there be a moment where it's hard to define what's human and what isn't, or are these two very distinct forms of intelligence?

I think they're distinct forms of intelligence now. Of course, the digital intelligences are very good at mimicking us, because they've been trained to mimic us, and so it's very hard to tell whether ChatGPT wrote something or whether we wrote it. So in that sense they look quite like us, but inside they're not working the same way.

Who is first in the room?

Hello, my name is Hal Gregersen, and my middle name is not 9000.