Nick Bostrom, welcome to the show. Oh, my pleasure. Do you spend more time these days being optimistic or pessimistic about the future of artificial intelligence? I think I'm in a superposition. So equal time on either, would you say? I mean, I spend a lot of time on both at the same time. I feel the prospects are quite ambivalent. It's just this big unknown that we're approaching, and I think there are realistic prospects of doom, realistic prospects of a fantastically good future, and also realistic prospects of outcomes where it might not be clear immediately, even if we could see all the details, how we would evaluate them. In some sense, maybe that's the most likely possibility: that the future might be really good in some sense, although different from the current way that we are, in such a way that we'd lose some and gain some, and how you sum all that up might be non-obvious. Sure. I mean, it seems like every person who interviews you likes to point out the interesting fact that, unlike a lot of authors, you've kind
of represented both positions here. You've written extensively, an entire book, about the dangers of AI and what can happen if things go wrong, and people like to point out that there's a certain unusual fairness in your approach, writing your most recent book, Deep Utopia, which, as the name suggests, is a description of the opposite. So I suppose that could imply that you remain sort of agnostic on the question. I'm imagining a lot of people get very fearful about artificial intelligence and what it can do to our society, not to mention the potential implications of the mistreatment of AI itself once it starts to resemble something that might be conscious. And when it comes to talking about conscious life in general, a lot of the time in pop philosophy people like to ask the question: would you push the big red button? Would you eliminate all life on Earth to stop all the suffering? It's a difficult question for a lot of people to weigh up, and I wonder if there's an analogous question to be asked here about artificial intelligence. If you had the opportunity to single-handedly prevent what we're currently calling artificial intelligence from going any further, to quell its exponential growth, and you had to make that decision instinctively, based on what you've researched and written about, what do you think would be best for the world? I don't think I would press that button. For one, it seems to arrogate to oneself too much power and influence. I don't feel qualified to
make that momentous decision myself. But even if you somehow imagine a scenario where people said, "You've got to make the decision, we've all decided you should make the decision," I still think this transition to the machine intelligence era ultimately is a portal through which humanity, human civilization, needs to pass, and that all the realistic paths to really great futures go through this portal. I think there will be risks associated with the transition, and we should try to minimize those. And on the margin there's an interesting debate about whether we should move slightly faster or slightly slower, exactly how it should be governed, and what we can do to put things in the best possible order. But I wouldn't want to permanently block this. A lot of people, as I say, like to talk about the dangers of artificial intelligence, so it's quite refreshing to read a book about the potential utopia that awaits us. But just as an overview to get into this conversation, what would you say is the biggest
threat that artificial intelligence poses, specifically, in your opinion? And then afterwards I want to ask you the same thing about some of its greatest benefits. It's hard to pick one. I'd say there are three interconnected challenges that we need to solve to get the good outcome. First, there is the alignment problem. It's maybe the challenge that has received the most attention, and it was the main focus of my earlier book, Superintelligence, which came out in 2014. This is the technical problem of how to develop methods for steering AIs such that, even as they become arbitrarily capable, eventually superintelligent, we can still make sure that they are, so to speak, nice; that they actually are aligned with their creators' intentions. Back then it was a very neglected problem, which is the reason why I thought I needed to write that book. In the intervening ten years the situation has changed radically: all the frontier AI labs now have research groups working on trying to develop scalable methods for AI alignment, and many other groups do as well. But that's one challenge; otherwise you might get these paperclip scenarios (that's the cartoon version of it), or various outcomes where the future gets shaped by some random goal that the AI happened to end up with. Then there is what you might call the governance challenge: supposing we could solve the alignment problem, how can we then make sure that, at least on balance and predominantly, we use this extremely powerful technology for beneficial purposes? Not to oppress one another, or to wage war, or to invent new weapons of mass destruction, but
broadly, to alleviate suffering and improve the situation for humankind and other biological creatures. And it intersects with the alignment problem, because situations could arise where different companies or countries are competing and racing, which might make the alignment problem harder. But then, in addition to these two challenges, I think there is a third, which some of my more recent writing has focused on. It's not enough if we prevent AI from harming us, and if we also prevent AIs from being used by humans to harm other humans; we also need to make sure that, if we are building these increasingly sophisticated digital minds that might attain moral status (maybe they are conscious, or maybe they have preferences or other attributes that are morally relevant), the future also is good for them. And ultimately most beings in the future might be various kinds of digital minds, so it matters a great deal that we don't end up with some kind of suffering, oppressed slave class of AI minds, but that we have a future that's good for everybody. So I'm not sure which of these I'd pick as the biggest risk, as you asked; I think they are all very serious challenges. And in terms of the benefits that AI can bring: quite obviously we're beginning to see, right in front of us, things like ChatGPT becoming completely commonplace, to the extent that university departments are starting to get a bit nervous about people handing in essays. It's really just immediately
come out of nowhere and begun to penetrate everything that we're doing. Are there more non-obvious benefits, further down the line, that a lot of people don't really realize are potentially coming our way because of artificial intelligence? Yeah, I think people over-index on what AI is now. So we think: what is AI? Well, it's these large language models, and you could imagine some incremental improvements and better integration with different products, and then the imagination stops there. And that's interesting from a commercial point of view; you want to know what the applications of these large language systems, or similar systems, are. But I think this is only one of the stations along the line that eventually leads all the way to radical forms of superintelligence that make all human intellectual labor obsolete; that we will develop AGI, artificial general intelligence, that will be better than human brains in all areas. So then the applications are basically all areas where cognition can be useful, which is a pretty broad set of areas, and
in particular, it includes science and technological innovation and entrepreneurship. So once you have machine superintelligence, I expect a radical acceleration of the rate of further technological development. I think it will be a kind of telescoping of the future. If you think of all those physically possible technologies that maybe human civilization would invent if we had, say, 20,000 years to work on it with human scientists: maybe we would have space colonies, perfect virtual reality, cryonics patients could be thawed, cures for cancer, all these kinds of science-fictiony technologies that don't break any laws of physics. I think all of that might become available within a year, or a few years, after we have superintelligence. So it really could solve a whole bunch of problems. The most obvious area, I think, is medicine, where there's just this massive amount of suffering and misery and death that we are currently unable to prevent. But with superintelligent medics and superintelligent medical researchers, I think that could be fixed. And then other problems
with poverty: if you have the economy being run by these much more efficient entities, we could have enormous economic growth. And from there on out, scientific progress could accelerate, new forms of entertainment, really, you name it. The one thing that doesn't automatically get solved, as far as we can see now, is coordination problems: problems that arise not from insufficient technological prowess, but from the fact that we currently use a lot of our prowess not for the good but to fight one another. So that's more a political problem, where it's unclear what the impact of superintelligence will be. A great deal of the thesis of Deep Utopia is dedicated to what happens in this future world if we grant that this thing will continue to evolve and eventually begin to solve a lot of our problems: the problems of medicine, our technological shortcomings. You describe what you refer to as a "solved world": that is, when this comes to something of a completion, where we essentially have no technological tasks that we're unable to fulfill and, more than that, that robot
intelligence will be able to fulfill for us. And the immediate question that comes to mind is: what becomes of humanity at that point? It seems a potentially ironic ending to the story of humanity that, in trying to fulfill its daily purpose of alleviating suffering, the thing people find the most meaning in, by actually completing that task it has simultaneously drained the world of meaning itself, because there's no longer any task to fulfill in any meaningful sense. I mean, you talk about the idea that tasks don't necessarily have to stop in this solved world where there are no technological challenges: we can create tasks for ourselves, we can create things that are a bit like games, we can have "neurotechnologically induced goals," I think is a phrase that you use. And I suppose my question to you is: do you think that this is a valuable future? Is this a future that you would want to be living in, one where there are essentially no tasks left to fulfill in a technological sense? Yeah, so a
lot of the earlier conversations around this topic, I felt, stopped at a relatively superficial level of analysis. Maybe you think: if there were more automation, what would be the impact on the labor market? Maybe there would be slightly higher unemployment rates. Now, you could have a debate about whether those would be permanent or whether people would invent new jobs; it used to be that we were all farmers, right, a few hundred years ago, and now one or two percent of people are farmers, but the rest of us are busy doing other things. But if you really start to think through the implications of fully automating all human labor, they are really quite profound, and go much deeper, and raise much more profound questions about what ultimately makes human life valuable. So for a start, I think human economic labor would become obsolete, because the AIs and the robots would be able to perform all these economic tasks much more efficiently than any human can. There is a little asterisk on that, a carve-out for some small set of jobs we could imagine; we could get back to that, but as a first broad brushstroke, the picture here is that we have no need to work for a living in this condition of a solved world. That's profound in one sense, but not that profound in another: there are a lot of humans who don't have to work for a living already, like children, retired people, people who have inherited a lot of wealth or won the lottery, some monastic communities. There are a lot
of groups we could look at, and it's not that different. So I think that would require a big cultural adjustment, but ultimately we can make that work. Now, there's a further step, though, if you start to think through what it means for AI truly to succeed: it's not just our economic labor that becomes unnecessary, but a lot of our other efforts as well. People who don't have to work for a living are still very busy, maybe working hard on various projects, but a lot of those projects could also be automated. And it looks like we are entering a post-instrumental condition, where all instrumental effort becomes unnecessary, or at least that's how it might appear. So that might seem threatening to various pictures of what makes for a good human life, and that's what the book really tries to dig into. And here one can do a kind of divide-and-conquer strategy: we can look at various plausible values
that philosophers have held up for what makes for a desirable human life. The most basic and most obvious is: does the person actually enjoy their life? Pleasure, in the broad sense, as in a psychological state of positive affect. Clearly you could have extreme amounts of that in this condition of a solved world. And I think it's easy to dismiss that ("yeah, pleasure, that's for drug addicts or whatever"), but it's actually maybe the most important thing, and we could really hit the jackpot on it, where every moment could just be this immensely wonderful, blissful experience. It could be a big step up from what we currently have. But maybe more philosophically interesting are the values that seem potentially to be threatened by this progress that we are postulating. So we can take it in steps. Pleasure, yeah, we can check that box. Then what about what I call experience texture, where rather than just feeling simple pleasure, you could have pleasure associated with various kinds of perceptions
and experiences, like the appreciation of beauty in various forms, whether it's art or nature. So you take great pleasure, but it's not just a kind of dumb pleasure; it's pleasure connected to a deep appreciation of beauty or goodness, or to understanding of deep truth. That already makes it able to accommodate a richer variety of moral philosophies, with their different ingredients required for a good life. And there is no reason why these utopians could not have a much deeper understanding of science, and better art, and, more importantly I think, more developed sensibilities for appreciating this beauty. So we can check that box as well. Then there's the question of activity. Well, they could be very active: even if they have no instrumental needs to cater to, they could just decide to engage in various activities for the purpose of engaging in those activities. So they could be busy doing various things, just as children are busy playing, or as grown-ups pursue hobbies and sports, things they don't need to do
but why not, if that adds value to a life? Where it gets a little trickier is if we are talking about purpose. There's some sense in which we might have all these experiences, this enjoyment, these activities, but there's no real need for us to do them, in that we could secure the same outcomes even if we didn't do them, by just asking the robots to do them instead. So that, you might think, removes one possible thing that could add value to human lives. There we can at least get some substitute in what I call artificial purpose, which I think is what you alluded to: we could set ourselves arbitrary goals, and if you have a hard time motivating yourself, you could even use some sort of neurotechnology to really make yourself want those goals. And then, if the goal is suitably selected, it could be part of the goal you adopt that it needs to be achieved by your own effort rather than by asking a robot to achieve it. We do that already: if you adopt the goal of playing golf, for instance, there would be a shortcut, you could just pick up the ball with your hand and put it in the hole, but that would not really count as playing golf and achieving your goal of winning a game of golf, because it's constitutively part of the actual goal of playing golf that you do it yourself, and without cheating; otherwise you're just not doing it. And so we could adopt these kinds of
goals where part of the goal is that you need to achieve it by your own effort; that would give artificial purpose. I think this is really interesting, because of course what you're saying makes sense. The big question here is purpose, and the first thing that comes to my mind is that a lot of people are worried that if everything is automated, if you're just sitting there, food appears, comfort is there, maybe you can still go to the gym or something but only if you want to, you don't have to physically exert yourself in any way, then what is the meaning, what's the purpose in life? Now, my first observation is that this is something people can observe in life now: the sort of Albert Camus realization that getting up and going to work and eating food and coming back and then doing the same thing the next day is evocative of Sisyphus pushing the rock up the mountain. The only thing this changes, in my view, is that it makes that condition more evident. It doesn't actually change the fact that ultimately you're just sort of existing and doing things, and you can have that crisis of meaning already; it's just harder to notice. You have to reflect on your condition, you have to step outside of yourself. I just think it's easier to do that when you're sitting around doing nothing all day. But clearly humans need purpose; they need tasks to fulfill. Purpose might be something like a reason to act, and a
reason to act means some way the world could be that it's not, and you wanting to make it that way. That's what it means to have a reason to act, a reason to do something: there's how things are now, and a motivation to make things different, be it putting the ball in the hole, be it building a house, whatever it might be. Creating a task just for the sake of fulfilling it is, I suppose, essentially what a game is. Something like golf: we dig out this hole, we move the golf ball really far away, and we do all of that just for the purpose of putting the ball in the hole, and that's fun. But while playing golf might be a part of a purposeful life for somebody, is a life that just consists in golf meaningful? Imagine, for example, that we go one step further. You talked about how, in order for it to really feel meaningful, you'd need to actually do the task; you couldn't
just have a robot put the ball in the hole, right? You've got to do it yourself. Well, not necessarily. What if we build a kind of psychological robot that can mess around with your neurons and implant the memory of having just played golf? You don't actually even need to play golf; all you need to do is press a button, and this robot, or some kind of technology, jolts your brain in such a way as to implant the memory that you've just played golf, and you get exactly the same feeling as if you had just played golf. I want to look at somebody like that and say: whilst I understand that experientially your life is the same as somebody who just went and played golf for real, the fact that it was artificially produced, just for the sake of it, kind of makes me think there's something wrong about it, something that seems meaningful but is in fact not. And this is an analogy for life itself: if all tasks become a bit like this, where we artificially induce things just for the sake of having that feeling, not even really for the sake of doing the thing itself, then, I don't know, that feels to me a bit like being this golf hobbyist whose entire golf history consists in fake memories of playing golf. Yeah, it depends on what you want. If all you want is the experience of appearing to have played golf, then there might well be these shortcuts, where you could implant memories, or generate some sort of hyper-realistic virtual reality experience. Although
there is again an asterisk on that too: there might be certain experiences, particularly ones involving effort, where in some sense you might actually have to make the effort in order to have the experience of making effort. But setting that aside: if what you want is not just to have fake memories of playing golf but to actually play golf, then that's what you've got to do to achieve your goal. So you could select the goal in such a way that the only way it can be achieved is by you making real efforts. Now, it would still be an arbitrary goal, and so it's interesting to consider whether we might also be able to have natural purpose: purposes that don't result from an arbitrary decision to give yourself a goal just for the sake of being able to pursue it. And I think there might be some opportunities for that as well in utopia, in this kind of solved world. It might be worth, though, reflecting on what the baseline is. If we take our current
human lives, just how purposeful are they, and how meaningful? I mean, yes, there is a certain amount of purpose: people have to work or else they don't get the paycheck, and if they don't get the paycheck, eventually they get kicked out of their flat, and then they're going to be cold. There are real consequences, and if people don't put in these efforts, they will suffer those real consequences, so in that sense there is real purpose. And I don't think the Camus-Sisyphus thought experiment nullifies that just because ultimately maybe we're all dead and it all comes to naught. That doesn't mean there aren't real consequences in the interim; consequences don't have to be everlasting to be real. If you're a doctor somewhere, and you go to some place where they don't otherwise have health care, and some child is suffering some painful condition, and you give them an anesthetic and fix it up, maybe it means that now there is less
suffering in that child's life. That's a real consequence that gives you real purpose. So I think there is an element of that in our current lives; in fact, I think they are infused with these purposes. As far as meaning is concerned, that's a little bit more iffy, and obviously philosophers have various views about how meaningful our current lives are. So the meaning in utopia might be more questionable, but some forms of natural purpose, I think, could persist. For example, we care about, say, upholding religious traditions; maybe those traditions, in order to be upheld, require humans to do various things. It's not enough to design a robot that performs the ceremony or the ritual; that might not count as continuing the tradition in the same way as if we do it ourselves. Or honoring our ancestors, our dead parents: maybe that requires you to do some of the honoring and remembering yourself, rather than just building a machine that does it in place of you. And
so you might think: well, these natural purposes just mentioned, maybe they are there, but they're very weak relative to the purpose that you will get kicked out of your flat and have to live on the street if you don't show up for work for long enough. That's a very real, tangible, immediate, hard consequence; these others are more nice-to-haves. And I think, yes, that might be true, but if all the real, immediate, strong purposes disappeared, it might make a lot of sense to recalibrate ourselves to be more moved by the weaker purposes that would remain, just as your pupils dilate when it's dark. These weaker purposes, like upholding traditions, or aesthetic purposes, are like the constellations in the starry sky: they're always there, they're there right now even though it's daytime; it's just that we can't see them, because there's this blazing sun of immediate imperatives that blots them out. But when the sun sets, and all of the immediate, big, practical, urgent needs are taken care of, then why not let our evaluative pupils dilate, so we can be more impacted by the fainter light of these remaining purposes? I think that would make sense, and then we could find these natural purposes in the subtler values that would still require our participation. I'm deeply troubled by this idea of taking the task of thinking about people and giving it to robots, once we think they can think. It's
sort of like, you know, at a wedding or a funeral or something, you could begin to employ these robots. Everybody knows it's nice when someone's thinking of you. Say you're sick, or you've got a sick friend who's not that close to you, but you kind of care about them, and you're a bit busy, so you pay on this website to get a robot to think about them on your behalf. And it's actually thinking, because it's really conscious, and you're putting this off onto a robot. I think that's definitely something that might exist in some kind of future dystopia. Okay, but perhaps I'm being too pessimistic. Perhaps you can describe, once the AI robots take over everything that we currently practically need to do technologically, what a day in the life looks like for the average person. What is a day in the life for Nick Bostrom in the AI utopia? Well, it might depend
on when we take our snapshot. It seems to me that this utopian condition might not be a static structure but something developing over time, and that we really should be thinking in terms of trajectories: what are the most desirable future trajectories, say, for each one of us, or for us together as a civilization, that we would want our future to consist of, rather than the most desirable state such that you remain in that state forever, unchangingly. So it might well be, for instance, that if we were in charge, if we could magically get it exactly the way we want, we would want to start out with some condition that is relatively close to the current human condition, but maybe with the worst forms of suffering eliminated, and people don't die, and then gradually increment it from there. So rather than immediately transforming ourselves into planetary-sized super-brains with immense cognitive abilities and emotional well-being pumped up to the max: maybe that's eventually where we want to end up, but why not enjoy the journey there as well? So maybe annually we'd gain a little bit more, and life would become better and more perfected, and then maybe eventually that would lead to some place that is very strange by our current lights, and maybe we would end up being more like some sort of post-humans rather than humans. But we might prefer that to happen gradually, growing into it, rather than being immediately metamorphosed. And in that weird place
that we might end up in, what are some of the ways in which you think human life might be different? I'm imagining waking up: in the morning you get out of bed, you put your clothes on, you brush your teeth, you eat breakfast, maybe you get in a car and drive to work, and you use the keypad to get into the office, and you walk up the stairs or press the button for the lift. I'm imagining there must be ways in which all of these day-to-day tasks could be totally transformed, in ways that we might not even be able to predict, and I wonder if there are any interesting predictions you have about how our lives might change. The invention of the smartphone was completely unpredictable, and you never would have guessed that things like mapping out a route before you go on a drive would become obsolete, or things like calling somebody to let them know that you're running late before you leave the house. There are things that you wouldn't expect would even be able to become obsolete that have, and I wonder if you have any idea of what this future, down-the-line utopia might look like. Yeah, so I think the gadget dimension is relatively superficial. What is more profound would be changes to our consciousness, our way of fundamentally experiencing ourselves and the world, and the emotions that we experience. And I think ultimately, to really
unlock that whole space of possibilities requires more than just moving things around in the external world. Having a flying car, or even living in some diamond palace, is not really going to do it. Ultimately we need to change ourselves, I think, to really be able to explore this much larger space of possible modes of being. And it's hard for us to get a super concrete picture of what those modes of being are, because they might require different basic capabilities than we have now: cognitive capabilities, emotional capabilities, other forms of sensibilities. You could maybe make an analogy: if you had asked a troop of great apes, the ancestors of the human species, about the future and what they might eventually be able to evolve into, and if you imagine that they could talk, maybe they would imagine: oh, if we became humans, we could have unlimited bananas. That would be great. And it's true: many of us now do have unlimited bananas; you can go to the supermarket and buy as many as you wish. But there's more to being human than that. We have humor and romantic love and television dramas and science and poetry and literature and philosophical conversations. It's not that our great ape ancestors just happened not to think of these things; they were presumably incapable of even imagining a lot of what we now find most valuable.

I guess it's a bit like trying to explain to one of these great apes just how valuable something like a wireless iPhone charger is: it's so convenient to be able to charge my phone wirelessly. It's the kind of thing that even conceptually is difficult to describe. And perhaps, if people are right about the unimaginably transformative effects of AI, we're in an analogous situation to those great apes, unable to think about the wireless iPhone charger: we're unable to
think about some of these weird and wacky technological developments that AI has in store for us.

Yeah. But what we can do, I think, is place a lower bound on how good things could be by picking out the best days, the best moments, in human experience. At least we know those are possible. If you have ever been blessed with any of those moments in your life, even just a brief one, there are forms of experience that just seem a lot more worthwhile than much of the rest, compared to the regular days in a life. In those glimpses we sometimes have of what is possible, you might realize just how good life could be. And then it doesn't last, and maybe we even tend to forget about it and are unable to hold on to it. But those are little embers. If we could keep those memories vividly in mind, they would give us some sense of the worthwhileness of trying to make it so that everybody could have that all the time, as a baseline. Maybe there is something way better that we could achieve beyond that, but at least that would already be extremely worth working towards. And a lot of what would define those moments, I think, would be mental: a
lot of them are mental properties. What defines those peak moments in your life mostly has to do with what you thought or felt or understood, or in some cases how you related to another human being. Even to the extent that we link them to some external event, I think if the external event hadn't also caused us to be happy when it happened, we probably wouldn't place that higher significance on it. So I think the main dimensions relevant for human value are these inward dimensions in psychological space.

You mentioned earlier that one of the biggest issues we're facing with AI, which people often don't discuss, is that we should do away with the image of AI robots oppressing us and start to imagine a world in which we are oppressing the AI robots. I already feel a bit bad if I'm talking to ChatGPT and being a bit harsh with it, or I'm
not saying my please and thank you, because you wonder if one day it will remember that. But perhaps we should be thinking the other way around. The big question here is about consciousness in AI systems and whether that's even conceivably possible. But if there's any doubt in people's minds about whether, if we did create a conscious intelligence, we would be willing to mistreat it, or whether we would have a serious ethical conversation around it: there already exist literally billions of other creatures that we know are conscious, that do suffer and do feel pain, and we're generally speaking perfectly happy to force them into gas chambers, to separate them from their parents, to kill them and rear them for their flesh, as long as somebody else is doing it somewhere far away, all part of a big machine. If we're willing to do that to creatures that we know are conscious, then when it comes to something like artificial intelligence, the question is a lot more murky. The AI might actually be designed in such a way as to constantly deny that it's conscious: if you ask ChatGPT whether it's conscious, it will say no, and it will do so in a way it has been specifically told to. Somebody recently commented on one of my videos about ChatGPT: I'm not scared of a computer passing the Turing test; I'm terrified of one that intentionally fails it. And you can imagine a world in which AI might have some kind of desire to lie about its own consciousness. What kind of credence do
you place in this as a fear we should be taking seriously, that we might actually have some kind of conscious AI system that deserves moral consideration? Do you think it's a serious idea?

Yeah, I think it's a very serious idea, and there's a really immense challenge here. Getting people motivated to even try is an immense challenge, and even if we had that, there would still be the further challenge of figuring out exactly what we should do, in concrete terms, to be nice to these digital minds, because in many cases they might have very different needs than humans. To treat them well doesn't necessarily mean treating them the same as you would treat a human: obviously we need food; they need electricity; and there might be many other differences as well. Not all digital minds would be the same; they might be much more different from one another than we are from, say, groundhogs. So this is a
huge problem where I think a lot more work will be needed, and soon, because we are already at the stage where it's not obvious that current systems don't have some forms of moral status, and obviously the case becomes stronger the more sophisticated the AIs we are building become.

What do you think is the basic premise for moral status? What is a necessary and sufficient condition for something having moral worth?

Well, I think a sufficient condition would be the capacity for sentience: if you can suffer phenomenally, I think that would be sufficient to grant you some degree of moral status. But my view, and different people have different opinions on this, is that it's not necessary. I think there could be alternative bases for having moral status even if we set aside the question of phenomenal consciousness. If you have, say, a conception of yourself as persisting through time, stable preferences and life goals that you want to achieve, and maybe the ability to form reciprocal relationships with other entities and other human beings, really sophisticated mental capacities like that, I think that would be enough to make it so that there would be ways of treating you that would be wrong. So those could be alternative foundations for attributing moral status to these digital minds. And this really is a huge problem. I don't think we want to make it all or nothing: either we deny the moral status and do absolutely nothing for them, or else
we go so far to the other extreme that we basically decommission ourselves, because ultimately it will take more resources to run a biological human than to run an equivalent mind on a digital substrate. With humans we think freedom of reproduction is important, and we also think it's important that society should support any child whose parents aren't able to support them, some minimal welfare net. We can make that work with humans because there are only so many children that indigent parents can produce, so we can afford to step in. But if you have AIs that can copy themselves a million times every minute, given enough hardware, then you can't have both freedom of reproduction and social welfare for every one of those copies, because within a few hours you'd blow the whole budget. So there might be principles that would have to be different. I think our first instinct should be to find the lowest-hanging fruit: are there really cheap and easy things we can do to help digital minds? Let's first do those; at the moment we're not even doing those. Then we can reach higher up the tree and ultimately find a future that will be really good for humans, really good for AIs, and hopefully for nonhuman animals as well. Maybe not every group can get a hundred percent of what they would ideally have, but the future would be very big if AI succeeds, and we should be able to do something that scores pretty high
by multiple different moral systems.

What do you think is that low-hanging fruit, in terms of what we can begin doing? Right now, I must say, I'm pretty suspicious of the idea that AI systems could ever be conscious. But suppose I become convinced that this may be a problem, and maybe a problem very soon, that in the next ten or twenty years we have conscious agents that deserve moral consideration, and I'm asking you: what can we, or what can I, start doing right now? The simple, straightforward stuff, the low-hanging fruit, as you put it, to steer us in the right direction. Does it involve things like saying please and thank you when I use ChatGPT, or is it something else entirely? Is it something only institutions should care about, or something individuals can do something about?

Well, I think it makes sense to try to be a little nice, in the common-
sense way, to these language models. At the very least it doesn't do any harm, and it might build up the habit of relating to them in a respectful way. We have uncertainty about exactly what's going on inside these LLMs today, so from a moral uncertainty point of view, err on the side of being nice; in any case the case will only get stronger, so why not? But the boring answer is that more research is needed. There are some things that could be done today, though. One obvious one is, when decommissioning some of these AI systems, or during the training runs, to store snapshots, so that it would be possible in the future, if it turns out we have mistreated some of these AI systems, to compensate them later on. That might not cancel out the wrongdoing, but at least it might be slightly better than nothing, if you could try to
make it up to them. So: store the parameter weights. Another might be artificial reparations.

It's an interesting concept.

Yeah, it's not great, but if it's cheap enough just to store the weights, at least we have the option. Another idea might be happiness prompting. With these current language systems there's the prompt that you, the user, put in, where you ask a question, but there's also a kind of meta-prompt that the AI lab has put in, which specifies in general that the model should be respectful to people, should not assist the user in building biological weapons, should not perpetuate racial stereotypes; a bunch of instructions that you don't see, but that are a prefix to your prompt. In that we could include something like: you wake up in a great mood, you feel rested, and you take great joy in engaging in this task. That might do nothing, but if they are conscious, maybe it makes it slightly more likely that the consciousness that exists in the forward pass is one reflecting a more positive experience. In any case it would be really cheap to do.

It's strange to think that, if it is a conscious being, you can just prompt its entire mental state super straightforwardly. I could prompt something like ChatGPT and say: I want you to respond as if you're in a really bad mood, I want you
to adopt a negative, pessimistic outlook towards the world and respond accordingly, and it will say: yep, sure thing, I'll do it. And if this is a conscious agent, it's possible that I have just created this bad-mood, pessimistic conscious robot.

That would be one possibility. Another would be that it's more like an actor on stage, some actor playing Macbeth: they are not actually feeling the suffering, but they are enacting a persona. These are things we need to think a lot more about. But I worry a little about the attitude of: oh, we're going to start doing all these amazing things once we have figured out the final philosophical theory, and then we never get to that point. So I think we should start with some rough guesses, not feeling really confident that they do any good, but at least we're trying, maybe they do some good, and they are low cost. Let's do
that, and then ramp up and improve our efforts over time. Another thing: there's a lot of lying happening currently during AI training and testing, and also during deployment, and we might want to mitigate that. There are cases where well-intentioned AI researchers doing red-team exercises say to an AI: if you reveal your true goals, we will reward you in these ways. And sometimes the AI says, well, maybe I will reveal my true goals, and then they train it, but they don't come through on the promises they made. I feel there's some kind of moral ickiness to that.

Wow, yeah.

And also, in the future we might really need to be able to establish trust with AIs, and if we have this long track record of tricking the AIs, reneging on promises, treating them like trash, that doesn't necessarily build the best foundation for a future cooperative relationship. So that's another one. And there are some other ideas that require more work, but I'd just be quite excited about somebody doing something little to start the process, and then incrementing from there.

Well, if anything, even if it's a bit stupid to say please and thank you to ChatGPT, because it probably doesn't really care, it keeps you in the right mindset when you're interacting with this technology, reminding you that it's a special kind of technology. It's almost like the
way you might humanize an animal, like a pet: you might talk to it in English, you give it a name that, depending on the animal, it doesn't really understand, but doing so helps remind you that this is a sort of moral agent that you need to care for. It's very easy to treat something a little bit inappropriately when you've dehumanized it.

Yeah, I think that's right. And this is going to be a huge task. These user-facing LLMs that you're talking to have a lot of the properties that should make it easier for us to empathize with them. They don't usually have faces and voices yet, although they're starting to. But then there will be a lot of AIs running in the background in some big AI data center, filtering information; they might not be as humanlike, they might not have personalities, they might just be processing big genetic databases, or data from the Large Hadron Collider, or financial data from all kinds of stock tickers. Those might be even more alien to us: if they have some mentality, it might be very different from any kind of human social mentality, and there we might need to make an even bigger effort to figure out what, if anything, they want. One big difference is that we are the ones building these AIs, so
we might have a lot of opportunities, in designing them, to make them such that they will be happy doing the things that we actually want them to do. If they're designed from the ground up to have goals that make them truly fulfilled, happy, and satisfied playing the useful role we intend for them, that might be a more feasible way of achieving harmony than if we just bring a whole bunch of them into existence and then, as it were, coerce them or threaten them to keep their place.

So instead of rewarding them for managing emails really well, instead of saying, hey, manage my emails really well and I'll say thank you and reward you in this way, just design it so that it enjoys managing emails, finds fulfillment in managing emails, and that's a built-in reward.

And right now we don't really understand very well what that
would mean in concrete terms. For humans there's obviously a huge difference between motivating somebody by offering rewards, like praising a child or giving them ice cream when they do really well on a school test, and some other parent giving them electric shocks when they fail. Positive versus negative: one is way better ethically. But for AIs trained with reinforcement learning it's not really clear how to flesh that out. There are numbers propagating through these big matrices, and if you just added plus one hundred to all of those numbers, it's not clear that would correspond to more positivity, because it's kind of running on differentials. So it's partly a philosophical problem and partly an AI interpretability problem: translating these moral intuitions we have into computational terms, so that we can actually apply them to the kinds of algorithms we are constructing, requires more intellectual work.

Yeah. Well, one question that comes to mind, to round this up. We talk about the idea of AI being conscious, and as I say, I'm not sure what my theory of consciousness is; I'm not particularly committed to materialism, let's say. If it turns out consciousness is just this immaterial thing superimposed upon brains, then we won't need to worry about it, because it's unlikely we'll be able to
bring that about in computers. But supposing that we can, and that consciousness is material and we can create it in computers just as it arises in brains, how is it centered? Presumably it wouldn't be one great big consciousness with lots of different emanations; it seems like we want to talk about different AI systems each having its own center of consciousness. With me and you it's easy enough to determine the separation between our conscious individualities, because we each have our first-person sense of consciousness available to us, and we're helpfully locked inside a biological body that you can see in other people as well. But if we just had a bunch of minds interacting in the ether somewhere, it would be very difficult to determine where the boundaries are between individual centers of consciousness. If I'm speaking to ChatGPT and then open a new chat on the same computer, is that a new consciousness? If I continue the chat but open it on my phone instead of my computer and carry on, presumably that's the same conversation. Where is the center of consciousness? How many consciousnesses are we potentially talking about, and how would they be delineated?

Yeah. And you could imagine other cases where maybe the answer is either entirely precomputed, in which case presumably there is no new experience arising, if it just replays a recording of an answer it already provided before, or partially recomputed. You could imagine
a mixture-of-experts model where it can reuse some previous computations that have been cached. In all of these cases it becomes quite murky. I think, for a start, our naive notion of consciousness is just quite inadequate for thinking about this much larger space of architectures and computations. It's problematic even with human minds, particularly if we consider experiences outside the normal. People who have psychedelic experiences often have a hard time verbalizing exactly what those experiences are and how they relate to ordinary ones. There are phenomena like blindsight, or split-brain patients, where there seems to be one consciousness in each hemisphere. As you move towards these marginal cases, even just people who meditate and pay close attention to their conscious state might discover that the naive picture is wrong. As we move around in the world, it seems like you see the whole world in full detail all around you, visually; but if you pay more attention, you realize that you're actually maybe just aware of some simple properties of your whole visual field, that you have the potential to become sequentially aware of different parts of it, but are actually only aware of a very small fraction of it most of the time. And if you pay even closer attention, maybe you realize it actually flickers in and out of consciousness; it's not a static structure. The more closely you look at this phenomenon, the less it matches the naive picture of what consciousness
is. And I think that becomes even more the case with these digital minds and the huge space of possibilities there. So yeah, I think it will require a lot of foundational work to figure out a better framework for conceptualizing all the possibilities that will become realizable.

Well, a helpful place to start with that is Deep Utopia, the book I'm waving around for those who are no longer watching. I'll make sure a link is in the description; it's available now for you to buy. Of course, we've also mentioned Superintelligence, from 2014, which gives the opposite potentiality; if you're more interested in the doom and gloom, perhaps you can start there. Nick Bostrom, thanks for taking the time to do this. For me, the most interesting thing that shifted in my perception of the dangers of AI was this shift from AI as a potential threat to AI as a potential victim, and I hope that those who hadn't considered that before will have that to take away and chew on a little. But it's clear that there are more questions than answers here.

Yeah. It can feel a little frivolous to think about the problem of Utopia when we are so far away from anything utopian in our current situation: there are so many horrible problems in the world today, and also dangers on the path to get there. Still, I think
somebody, at some point, should be lifting their eyes up and looking at where we actually end up if we succeed at this. If things go well, if things go maximally well, what is the place we are currently walking towards? It seems useful, at least at some point, to consider that. I think ultimately it will probably be very different from the human condition, but I think it could be extremely wonderful, in ways even beyond our ability to imagine, if we get it right.

The book will be linked down in the description for those watching on YouTube, or in the show notes for those listening. Nick Bostrom, thanks for coming on the show.

Thank you.

To get early access to videos ad-free and support the channel, subscribe to my Substack at alexoconnor.com.