Model Thinking Full Course

Nerd's lesson
About this Course We live in a complex world with diverse people, firms, and governments whose behav...
Video Transcript:
hi my name is scott e page i'm a professor of complex systems political science and economics at the university of michigan in ann arbor and i'd like to welcome you to this free online course called model thinking in this opening lecture i just want to do four things first thing i want to do is i want to sort of explain to you why you know i personally think it's just so important and fun to take a course in models second what i'd like to do is i want to give you a sense of the outline of
the course like what we're going to cover a little bit and how it's going to be structured third thing is i'll talk a little bit about what an online course is i've never taught an online course before you probably haven't taken one so i'll talk a little bit about how it's structured how it's set up you know what's out there on the web that sort of thing and then the last thing i'll do is i'll talk about sort of how i'm going to structure particular units each unit will focus on a single model
or a class of models i want to give you some sense of exactly how we're going to unpack those things analyze them and think about them all right okay so let's get started why model first reason i think is this in order to be an intelligent citizen of the world i think you have to understand models why do i say that when you think about like a liberal arts education i think some of us classically think of sort of the great books like this long shelf of books that everyone should know and when the great
book curriculum was formed right and i'll talk about this some in the next lecture you know most of human knowledge didn't have models in it models are relatively new phenomena right so if you take you know when you go from anthropology to zoology anywhere in between when you go to college you'll sort of find like oh my gosh i'm learning models in this course and we'll talk in a minute about some of the reasons why we're using models right but models are everywhere and so in order to just be involved in the conversation it's important
these days that you can use and understand models right reason number two the reason models are everywhere the reason they're everywhere from anthropology to zoology is they're better right they sort of make us clearer better thinkers anytime anybody's ever run a horse race between you know people using models to make decisions and people not using models to make decisions the people with models do better so models just make you a better thinker the reason why is that they sort of weed out the logical inconsistencies they're a crutch right they just you know we
sort of are crazy you know think silly things can't think through all the logical consequences models sort of tie us to the mast a little bit and in doing so right we think better we're just better at what we do all right reason number three to use and understand data so there's just so much data out there right when i first became a social scientist i mean it would be a real effort for someone to go grab a data set now there's just a ton of it i like to think of it as a fire
hose of data but my friends who are computer scientists they call it a hairball of data right because it's just sort of all mangled and messed up so models let us take that data right and sort of structure it into information and then turn that information into knowledge and so without models all we've got is a whole bunch of numbers out there with models we actually get information and knowledge and eventually maybe even some wisdom at least we can hope right okay reason number four the last sort of main category of reason and by the
way the next four lectures i'm going to sort of work through and unpack each of these four reasons in more depth but i just want to sort of lay them out there in this first lecture so reason number four to decide strategize and design so when you've got to make a decision whether you're you know the president of the united states or whether you're running your local pto organization it's helpful to be able to structure that information in a way to make better decisions so we'll learn about things like decision trees and game theory models
and stuff like that they just help us make better decisions and to strategize better and also at the very end of the class we'll talk about design issues right you can use models to design things like institutions and policies and stuff like that so models just make us better at making choices better at taking actions okay so those are the big four now let's talk a little bit about um the outline of the course what it's like so this isn't going to be a typical course not just because it's online but because the structure of
the course is very different so most courses like if you take a math course it sort of starts here and sort of moves along right with each thing building on the thing before it now the difficulty with a course like that is if you ever like fall off the train right fall behind that's it you're just lost because you know everything what i'm doing in lecture six you need to know lecture five and lecture five you need to know lecture four well this course is going to be very different this course is gonna be a
little bit more like a trip to the zoo all right so we're going to learn about giraffes and then we'll go learn about rhinos then we'll go to the lion cage so if you didn't quite understand the rhinos it's not going to hurt you too much when we go over to the lion cage right so it's more like you know just moving from one topic to the next they're somewhat related in that they're all sort of animals right but you don't need to fully know the giraffe to move to the rhino all right so but
obviously like we're not going to study giraffes and rhinos we're going to study models so what kind of models well we're going to study models like collective action models these are models where individuals have to decide how much to contribute to something that's for the public good we'll study things like the wisdom of crowds like how is it that groups of people can be really smart we'll study models that have like fancy names like lyapunov models and markov models these are models of sort of processes right so they sound scary but they're
actually sort of fun and interesting we'll study game theory models we'll study something called the colonel blotto game which is a game where you have to decide how many resources to allocate across different fronts so this can be thought of as a really interesting model of war it can also be an interesting model of you know firm competition or sports or legal defenses all sorts of stuff so we're gonna just you know play with a whole bunch of models everything from economic growth to tipping points you know a whole bunch in between so it should
be lots and lots of fun okay what's the format for this how's this going to work what does an online course even look like okay well let's think about it so the first thing there's these videos right you're watching one right now i'm going to try and keep them between 8 and 15 minutes in length right sometimes i might sneak to 16 but mostly i'll be doing 15 minutes in length and inside the videos there'll be questions so i may all of a sudden the video may stop and it'll say what's the capital of delaware
why should we say that but something germane hopefully you know to the lecture so there'll be these eight to fifteen minute lectures each module each section will have you know somewhere between like three and six of those right okay in addition there'll be readings so on the wiki you'll find links to the readings now a bunch of these readings will come out of some books that i've written and one that i'm about to write about actually this course and it'll all be free so you'll be able to princeton
university press has been very generous in letting a lot of that content from my previous books be out there so you'll be able to download whatever you need to look at all right there's um some assignments so on the web page you'll see little assignment things so we'll also throw out some assignments so you can just make sure you're following what's going on with the course and then finally there'll be some quizzes right so there's some quizzes out there just to make sure you know hey am i really getting
this i mean you'll watch me you'll think yes scott gets these models but that's not what this is about right this is about you understanding the models so there'll be some quizzes right but all in good fun okay and finally there's a discussion forum i mean there's 40 to 50 000 people in this class right so office hours would get sort of crowded so we're gonna have a discussion forum where people can ask questions i'll answer some i've got some graduate students who answer some other students can answer things but it'll be a place for
people to sort of share ideas share thoughts give feedback and should be hopefully you know really useful it's structured in a way i think it'll work for everybody okay so how does it work what's one of these sections going to look like well each section of the course right of which there's going to be 21 is going to be focused on you know a particular model right and so when we talk about the model we'll say okay what is the model what are the assumptions what are the parts how does it work you know what are the
main results what are the applications so we just sort of you know talk through how the thing sort of plays out then i'll go into some technical details sometimes in the same lecture that i present the model sometimes in later lectures this will be you know more technical stuff a little bit more mathematics and i'll try and be pretty clear about whether or not the math is you know easy medium or hard you know i'll let you know up front like okay this may require a lot of algebra or this is just you know sort of
simple logical thinking right so i'll be pretty clear about how much effort it's going to take to trudge through some of the examples and then there'll be practice problems that you can work on as well the other thing i'm going to do in every one of these sections is talk about the fertility of the models all right so remember before i talked about this colonel blotto model that could be used to model war or sports or legal defenses right most of these models were developed for one purpose but we can apply them
to other purposes so we're going to talk a lot about how okay now that we've just learned this model where can we apply it where does it you know where else does it work right okay so that's it that's sort of how it's going to work right learning models is really important makes you a more intelligent citizen right just you know sort of just a more engaged person out there in the world because so much of how people think and what people do is now based on models makes you a clearer thinker that's why so many
people are using models it helps you use and understand data and it's gonna help you make better decisions strategize better and even you know design things better so course should be really really useful we're gonna cover a lot of topics they don't necessarily build on the one before right there'll be some quizzes and videos and that sort of stuff and this should be just a great time all right welcome and let's get started thank you hi welcome back in this lecture i want to talk a little bit more about how using models can help make
you a more intelligent citizen of the world and so i'm going to break this down into a bunch of sort of sub-reasons about why models just make you better able to engage in all the things that are going on in this modern complex world in which we live okay so when we think about models they're simplifications they're abstractions so there's a sense in which they're wrong there's a famous quote by george box where he says all models are wrong and that's true right they are but some are useful and that's going to
be a mantra that comes up throughout this course these models are going to be abstractions they can be simplifications but they're going to be useful to us they're going to help us do things in better ways okay so in in a sense right and this is a big thing in this course models are the new lingua franca they're the language of not only the academy you know which i talked about some in the last lecture but they're the language of business the language of politics they're the language of the nonprofit world wherever you go whether
it's people trying to do good make money cure disease whatever it is that they want to do right you're going to find that people are using models to enable them to be better at what their purpose is okay that's why they've really become the new lingua franca so if you think back remember i talked about this in the first lecture the whole idea behind the great books movement was that there was this set of ideas that any person should know so within the hundred or so great books there were thousands of ideas and mortimer
adler and robert hutchins the president of the university of chicago they had this thing that they wrote called the syntopicon which was a list right adler put this together it was this giant list of sort of all the ideas that someone should know right an intelligent person should know so what are those ideas well one of those ideas was to tie yourself to the mast and this comes from the odyssey when odysseus's ship is going past the sirens and he wants to hear the sirens' beautiful love song so what he does is he has his
crew tie him to the mast he ties himself to the mast so that he can listen to them but pre-commit to not driving his boat over to hear the sirens at the same time he puts wax in the ears of his crew so they also won't be you know encouraged to sort of drive the boat over there well this is an idea that recurs in history right we think about cortes burning his ships right so his men won't you know retreat they'll continue to advance so this idea to tie yourself to the mast is a really
worthwhile thing but here's the problem one of my favorite websites is a website called opposite proverbs so on this website it says things like he who hesitates is lost a stitch in time saves nine or two heads are better than one too many cooks spoil the broth so you've got this really good advice something that probably made it into the syntopicon but then you get something that says the exact opposite well how do we adjudicate between those two things the way we adjudicate between those two things is by constructing models
because models give us the conditions under which he who hesitates is lost and they give us the conditions under which a stitch in time saves nine so when we talk about the value of diversity in prediction we'll see why it's the case that two heads are better than one and we'll also see why it's the case that too many cooks spoil the broth so ironically what models do is they tie us to a mast they tie us to a mast of logic and by tying us to the mast of logic we figure out which ways of thinking which ideas
in the syntopicon are useful to us okay so if you look at almost any discipline whether it's economics and here what you see in this diagram is you see a description of sort of this is a utility function for an agent and what that agent is doing is trying to maximize their payoff right so economists use models all the time biologists use models as well they you know have models of the brain where they have little axons and dendrites going between the neurons they have models of gene regulatory networks they have models of
species right things like that in sociology we have models as well right so there's models of sort of how your identity affects your actions and your behaviors and things like that right in political science we have models this is a picture of a spatial voting model so they might say candidates are liberal or conservative on certain dimensions and voters are a little more conservative and you say well you're more likely to vote for a candidate who takes positions similar to your own like i work at the university of michigan we have
something called the national election studies that's run out of there where we sort of gather all this data about where politicians are and where voters are and that allows us to make sense of who votes for whom and why okay so models help us understand the decisions people make linguistics right here's another area right so you might think how can you use models in linguistics well in this little model here you see these s's and v's and n's and p's in here if you look closely well v stands for verb n stands for
noun and s stands for you know subject let's say right so what you can do is you can ask what is the structure of a language you can ask formally mathematically what the structure of a language is and whether some languages are more like other languages or not depending on how people you know set up their sentences so in german where they may put all the adjectives at the end of the sentence that looks very different than let's say english all right even the law this is a graph from one
of my graduate students former graduate students now he's a law professor dan katz where he's got sort of a network model of which supreme court justices you know who they appoint so who if someone appoints judges from some other judge by putting that data that's out there in this sort of model based form we can begin to understand how conservative and how liberal certain judges are all right so there's lots of ways to use models and there's even whole disciplines now that have evolved that are based entirely on models so game theory which is what
i was really trained in as a graduate student is all about strategic behavior it's the study of strategic interactions between you know individuals companies nations right and game theory can also be applied to biology right so there's all sorts of stuff right when you go to you know college you go to grad school you'll find that there's game theory models of just about anything right so it's actually a field based entirely just on models right why right why all these models why does everything from linguistics to economics to you know political
science use models well because they're better right they're just better than we are so let me show you a graph here this is a graph from a book by phil tetlock it's a fabulous book and in this graph what he's showing is the accuracy of um some different let me pull up a pen here different ways of predicting so what you see on this axis here this calibration axis right here this is showing you how accurate a model is and this axis is saying how discriminating it is in
terms of how particular how fine of predictions is it making so instead of saying hot or cold it might be saying it's going to be 90 degrees or 80 degrees or 70 degrees so this axis here this up and down axis is discrimination and this axis is how accurate so what you see here down here are hedgehogs so these are people who use a single model hedgehogs are not very good at predicting right they're terrible at predicting up here are people he calls foxes now foxes are people who use lots of models have sort
of lots of loose models in their head and they do much better you know sort of at calibration a little bit better discrimination than individuals but way up here better than anybody are formal models formal models just do better than either foxes or hedgehogs you might ask how much data is this well tetlock actually had tens of thousands of predictions so over a 20-year period he gathered predictions by people and compared how those people did to models and the answer is models do much much better okay all right so what about people who actually make
predictions for a living so this is a picture of bruce bueno de mesquita who makes predictions about what's going to happen in international relations and he's very good at it he's so good at it that they put his picture on the cover of magazines right he's at stanford and nyu chair of the department at nyu used to be anyway so bruce um uses models he's got a very elaborate model that helps him figure out based on sort of bargaining position and interest what different countries are going to do but just like george box said at the beginning
he doesn't base his decision entirely on that model what the model does is give him guidance as to what he then thinks so it's a blending of what the formal model tells him and what his experience tells him so smart people use models but the models don't tell them what to do okay another reason models have taken off yeah they're better but they're also really fertile so once you learn a model that's you know for one domain you can apply it to a whole bunch of other domains which is fascinating so we're gonna learn something called
markov processes which are models of dynamic processes so they can be used to model things like disease spread and stuff like that right we're gonna find when we learn those you can also use them this is sort of surprising to figure out who wrote a book you might say how does that happen well that happens because you can think of writing a sentence as a dynamic process so different authors right use different sequences of words different patterns so therefore we can use this mathematical model that wasn't developed in any way for this purpose to
figure out who wrote what book okay totally cool all right another big reason models really make us humble the reason they make us humble is we just have to lay out sort of all the logic and then we realize holy cow i had no idea that this was going to happen right so oftentimes when we construct the model we're going to get very different predictions than what we thought before right so if you look at things here's a picture of the tulip graph right from back in the 17th century
when there's a you know this big spike in tulip prices you can imagine that people thought the price was going to continue to go up and up and up well if you had a simple linear model you might have invested heavily in tulips and lost a lot of money so one reason that models make us humble is we go back to the george box quote all models are wrong right so a model is going to be wrong but the models are humbling to us because they make us sort of see the full dimensionality of our problems
once we try and write down a model of any sort of system it's a very humbling exercise because we realize how much we've got to leave out to try and understand what's going on all right here's another example right this is the case-shiller home price index and what you see is you see prices going up and up and up right and then you see this precipitous crash right here right a lot of people had models that just said look things are going to continue this way there were a few people
that had models that said things would go down these people the ones whose models went down they made a lot of money these people who thought it was going to go up didn't so we're always going to see a lot of diversity in models and we're not going to know really often until after the fact which one is right and so one thing that's going to be really important is to have many models so let's go back to that fox hedgehog graph that i showed you before the foxes the people with lots of models did much better than
the hedgehogs the people with a single model and formal models did better than the foxes but what would do better than formal models well people with lots of formal models right so if we really want to make sense of the world what we want to do is have lots of formal models at our disposal so what we're going to do in this class is almost like you know remember the old like 16 32 box of crayolas that's sort of what we're doing here right we're just going to pick up a whole bunch of models and we're
going to have them right they're fertile we can apply them across a bunch of settings so when we're confronted with something what we can do is pull out our models ask which ones are appropriate and in doing so right be better at what we do so the essence of tetlock's book right that's where that graph came from of the foxes and hedgehogs is that the only people who are really even better than he has a way of classifying what a random choice would be the only people who are better than random at predicting
what's going to happen are people who use multiple models and that's the kind of people that we want to be okay so that's sort of the big intelligent citizen of the world logic right models are incredibly fertile they make us humble they help you know really clarify the logic and they're just better right so if you want to be out there you know helping change the world in useful ways it's really really helpful to have some understanding of models thank you very much hi welcome back okay so last lecture we talked about how using
models could make us more intelligent citizens of the world how we could just sort of understand how the world works a lot better that was sort of the new lingua franca all right in this lecture we're going to talk about how models make us clear thinkers and this is one of the big reasons why people use models is because they just help us think more logically about how the world works okay so this is sort of a multi-step process so let's let's see exactly how it plays out so the first thing you do when you
write down the model is you name the parts so let me take a simple example suppose i just want to write a model of where people go for lunch and let's suppose it's a small town and there's really just sort of you know four restaurants so there's restaurant one restaurant two restaurant three restaurant four so those are parts right but what else are parts another part is people right so there's these individuals and they've got to decide where to go right which restaurant do i go to well now we have to ask what are the
relevant parts of the people well you know this guy's wearing shoes but the thing is his shoes probably aren't a relevant part so when we name the parts we're gonna think about what really matters and the shoes probably don't matter nor does it matter if he's wearing mittens or even if we put a hat on him his hat isn't going to have much of an effect on which restaurant he goes to okay so what does matter well one thing that matters is how much money he's got right and how expensive these restaurants are so this restaurant may
be cheap and this restaurant may be expensive but this may be somebody who's got a lot of money so how much money you have is going to be one determinant of which restaurant you choose another thing that's going to matter is how much time he has does he have only 15 minutes right or does he have a whole half hour to go have lunch and different restaurants may take different amounts of time a third thing maybe i'll write this fancy little symbol here for preferences this is how economists and social scientists write preferences down these are
just you know what he likes maybe one of these is a mexican restaurant maybe one of these is an italian restaurant so he's gonna have different preferences over the different restaurants so these are all things these are all sort of relevant parts that go into the model all right once we've laid down the parts then we've got to think about the relationships between those parts so models help us identify the specific relationships so what we see on the left is a simple game theory model this is called an extensive form game where one player
here's player one makes some sort of decision and then another player player two makes some sort of decision and then the players get payoffs so once you sort of name the parts then the next thing you need to do is identify the relationships between those parts and how things play out so now that you've got the parts and the relationships what you can do is you can think through the logic so let me show you how complicated this is and how models are so useful let's do sort of a simple thing that i sometimes play with
my uh undergraduates suppose i want to build a rim for the earth to be shot through like a big basketball rim i'm going to shoot the earth through the rim but i want to give a little bit of space so you can make it so i'm going to put one meter of space all the way around right so there's a little bit of a gap of one meter and then the earth can go through with just that little bit of spacing one meter all the way around well now the question is what should the
circumference of that rim be how big around does that rim have to be assuming that the earth is let's simplify and say it's 25 000 miles around the equator of the earth how big around does that rim have to be if i want one meter of clearance all the way around think about it okay well now let's do a little math so we know the formula for circumference of a circle right circumference is equal to pi times d right now what i want is i want to find the circumference of that rim and that's going to
be pi but my diameter is going to be the diameter of the earth and if i think about it remember i've got this rim here and i want the earth to go through but i want one meter on this side and i want one meter on that side so it's going to be the diameter of the earth plus 2 meters so the circumference of my rim is going to be just pi times the diameter of the earth plus 2 meters well that's pretty easy to solve right because that's just going to be pi times the
diameter of the earth plus pi times 2 meters well pi times the diameter of the earth we already said was 25 000 miles and pi times 2 meters is just going to be 6.28 meters so the circumference is going to be 25 000 miles plus 6.28 meters so that's probably not what most of you guessed right so by writing down a very simple model just a model for the circumference of a circle we're able to figure out exactly how big that rim has to be and it's often very different right from what our intuition would have suggested
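The rim calculation above can be checked with a short sketch. This is only an illustration of the arithmetic in the lecture, not something from the course materials: the function name `rim_circumference`, the mile-to-meter conversion, and the 0.75 m ball are assumptions added here to show that the extra length depends only on the clearance, not on the size of the object.

```python
import math

METERS_PER_MILE = 1609.344  # international mile, used only for unit conversion


def rim_circumference(body_circumference_m: float, clearance_m: float) -> float:
    """Circumference of a circular rim leaving clearance_m of gap all the way around."""
    body_diameter_m = body_circumference_m / math.pi
    rim_diameter_m = body_diameter_m + 2 * clearance_m  # one gap on each side
    return math.pi * rim_diameter_m


# The lecture's rounded figure: 25,000 miles around the equator.
earth_m = 25_000 * METERS_PER_MILE
extra_earth_m = rim_circumference(earth_m, 1.0) - earth_m

# Hypothetical second object, to show the extra length does not depend on size.
ball_m = 0.75
extra_ball_m = rim_circumference(ball_m, 1.0) - ball_m

print(f"extra rim length around the earth: {extra_earth_m:.2f} m")
print(f"extra rim length around the ball:  {extra_ball_m:.2f} m")
```

Both lines report 6.28 m, which is 2π times the one meter of clearance. The size of the object cancels out of the difference, which is why intuition about a planet-sized rim tends to be so far off.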
okay so working through the logic is one of the big reasons why models make us clear thinkers now the next thing models do is they allow us to inductively explore so let me give sort of a fun example of this suppose you have a room right and we've got this room here and there's a door there's one little door right here that people can come out of and the problem is people get jammed in the door right so you know when people are trying to exit this room everybody gets jammed in the doorway so it's a question of
like how do you figure out um a better way to prevent people getting jammed well one thing you might do is you might put a post right here and this post might prevent people from uh you know bumping into each other as they come out because they come here they sort of bump into the post they have to go around and that prevents people from sort of getting bunched up near the door so what's interesting here is once you construct a model of the room and you put a bunch of people in
the room right so here's people and then you have them run out you can ask what's the effect going to be of putting the post in then you can inductively explore better ways to sort of position things in the room to prevent people from getting piled up okay now once we sort of work through the logic and explore things we can ask what what exactly happens in our model remember we talked about types of outcomes earlier on i said there's really four things that can happen in a model one is it can go to some sort
of nice equilibrium like if i drop my pen it comes to rest on the floor in equilibrium it can go to some sort of cycle right like the planets orbiting the sun it can be completely random it can just be you know totally unpredictable and random or it can be complex and so one thing that models let us do is figure out which of those things is going to happen so let me throw something out there suppose we're looking at oil which is a commodity right we
want to ask what about the price of oil what about the demand for oil what can we say about those things well let's think about it the demand for oil is you know probably going to depend on the size of the economy and so since the economy tends to grow at a fairly constant rate you'd expect you know the demand for and the total supply of oil to probably slope up right what about the price of oil well the price of oil depends on a whole bunch of people who sort of have oil
so you know they might have some in reserve and they're bargaining and they're buying and selling and all sorts of crazy stuff can go on so if i were going to make a guess i would say you know that the supply of oil follows some sort of nice pattern right the total world demand and supply is probably a nice pattern but if you look at the price of oil it's probably crazy okay and so in fact if you look at it that's exactly what you see so here's oil production right here and that satisfies sort of
this nice upward slope but if you look at the price of oil which is down here that's just crazy it's completely unpredictable it's what we'd call complex it's not random right but it's complex so if you construct models of these two things you can see why you know total production of oil goes up and why the price of oil is so hard to understand okay next identify logical boundaries this is one of my favorites so there's a website called opposite proverbs and on this website you see statements like these
two two heads are better than one and that's certainly true often it's the case that two heads are better than one and too many cooks spoil the broth and that's often true as well right it is true that too many cooks do spoil the broth well here's the problem these are opposites right the same is true with a stitch in time saves nine and he who hesitates is lost so if you just have these sort of proverbs or mantras that you sort of follow they're not going to do you any good because there's always going to be an
opposite proverb that you know says do the opposite thing so which one do you follow what models enable us to do is find the conditions under which one thing holds and one thing doesn't so later in this course we're going to see when is it exactly the case that two heads are better than one and when is it exactly the case that too many cooks spoil the broth so even though these proverbs are opposites there's conditions under which each one holds all right okay last thing communicate one of the real beauties of
models is that they allow us to communicate our ideas and what we know really simply so let's take politics for instance and suppose you want to ask me scott how do people vote exactly now i could say well you know it's pretty complicated i think that people you know they like candidates or they don't like candidates and then there's issues there's these things called issues and there's a question like you know does the candidate take positions on issues that you like or take positions that you don't like and
they balance these things and they watch debates and i could go on and on and on and you might have really no idea when i'm done how i think people vote well let's suppose instead i've written a really simple model and i said okay so here's how it works there's going to be a voter and there's going to be two candidates so here's my voter and here's candidate one and here's candidate two now for each of these candidates the voter has some sort of likability in mind so there's
likability of candidate one and there's likability of candidate two and this is just sort of like you know how friendly do they seem do they seem trustworthy do they seem honest that sort of stuff so this is you know we'll put likability here now the second thing i'm gonna say is that people care about policy now for policy what they care about is sort of this set of issues so on policies what i'm going to do is i'm going to put a little left right
continuum and say the voter over here is maybe a little bit conservative right and then for these candidates i'm going to say well candidate 1 over here is kind of liberal and candidate 2 is really conservative right over here so this is where candidate 2 is so now here's my model of how people vote what they do is they sort of say okay well how likable is each candidate right so they look at the likability of candidate one and the likability of candidate two and then they ask how far away are they here's my sort of
policies you know maybe a little bit to the right how close are these candidates to me well candidate one is pretty far away candidate two is a little bit closer so then how people vote depends on the combination of these two things likability and how close somebody is in policy space notice how that's a much clearer way of explaining exactly how i think and it enables me to communicate much more clearly to other people how it is that i think people vote okay all right so that's how models make us clearer thinkers now what we're going to
do next once we've sort of you know got this understanding of models helping us think logically is take those logical models and bring them to data thank you
welcome back in the previous lecture we talked about how we could use models to become clearer thinkers in this lecture what we're going to do is we're going to talk about how we can use models with data and this is an important reason why people use models in fact if you talk to scientists about why they use models whether they're social scientists or natural scientists what they'll
typically say is well we use models to take them to data to use and understand data in better ways but what i want to do is unpack that in several directions i want to give some specific reasons or ways in which people use models with data all right so the first one the first real reason is just to understand some basic patterns in the data so what do i mean well you know you could look at data it could just be a straight line and nothing could change so for example if you
look at a system and ask how much energy is in the system we know the energy is neither lost nor gained so energy is a constant and we can have a model that explains why we see you know energy being a constant alternatively you can see something that's just a straight increasing line and have a model that explains that remember we also talked about how we can see patterns in data right so we could see things that go up and down slowly like this like business cycles and we could have models
that tell us why we see these sort of cyclic curves we could have something that's much more spiky and have a model that explains that so again we talked about this sort of hairball or this firehose of data there's tons of data out there that data is going to have patterns to it and what we can do is use models to understand why we see those particular patterns okay in addition to the patterns there's also just the use of models to predict specific points so suppose you're out looking for a house and you see this
house that's for sale and you're wondering how much that house is going to cost well you could have a model that says okay the price of a house depends on its size so here's sort of the size of the house in square feet and here's the price we'll just put dollars in there for price and maybe you've got a linear model and your linear model says basically for every you know additional square foot the price of the house goes up 100 or 200 dollars or something like that well then if this is your
model and you've got a house that's got this many square feet that's 2 000 square feet right and you go up here and find the point if it's a hundred dollars per square foot then your model would predict that the house is 200 000 dollars so we can use a simple model to make some sort of prediction about just in the ballpark how much the particular house would cost so this is again a common use of models you construct a model and from that model you predict a point value okay
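The linear house price model described above can be written as a minimal sketch. The function name, the default slope of 100 dollars per square foot, and the zero intercept are illustrative assumptions taken from the spoken example, not a fitted model.

```python
def predict_price(square_feet, dollars_per_sqft=100.0, base_price=0.0):
    """Point prediction from a linear model: price = base + slope * size.

    dollars_per_sqft and base_price are assumed values for illustration;
    in practice they would be estimated from sales data.
    """
    return base_price + dollars_per_sqft * square_feet

# The lecture's example: a 2 000 square foot house at 100 dollars per square foot.
print(predict_price(2000))         # 200000.0
# The same house if every extra square foot adds 200 dollars instead.
print(predict_price(2000, 200.0))  # 400000.0
```

The model returns a single ballpark number, which is exactly what a point prediction means here.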
third reason why we use models is not so much to predict points but to produce bounds so suppose you're the economic advisor to the president not a job you necessarily want but suppose you are and the president comes to you and says what's inflation going to be next year or next month well you know inflation doesn't move that quickly and you might be able to say to the president well you know i think it's going to be 1.2 percent and you might be pretty confident that it's 1.2 percent but suppose the president says you know
what i'm just doing some long-range forecasts so what's inflation going to be 10 years from now well who knows what inflation's gonna be 10 years from now so you may have some fairly sophisticated models but they're not going to give you a point estimate so instead what they might say is well you know i can tell you with you know pretty high probability it's going to be between zero and three percent so it gives you a range right so your model won't tell you necessarily exactly what's going to happen because there's too many
contingencies out there there's too much complexity too much uncertainty you can't say for sure but your model might give you some bounds on what's going to happen and that can be really useful for making policy decisions okay reason four retrodiction what do i mean by that well you can use models with data to predict the past now there's a couple reasons you might do this one reason is you might not have data from the past you might want to sort of use models to try and figure out what do we think the past was like and
this is a thing you know geologists do this you know biologists do this anthropologists do this archaeologists do this they use models and data to try and figure out what do we think you know temperature was like how many animals do we think there were what were these civilizations like those sorts of things the other reason is if you do have the data then you can use it to see how good your models are so you can actually retrodict data to see if in fact your model would have worked let me explain what i mean so suppose we're looking at
some data stream let's stick with unemployment suppose the unemployment data looks like this for some period of time right and now what you're doing is you're saying okay we've got a model i'm going to ask how well that model would do so what you do is you sort of fit that model you give it data up to here so it's fitting pretty well and then at this point right here you say let's see how our model would predict from there on out well maybe if you run your model it sort of goes
like this if it goes like that you can say you know if we'd used the same model in the past it wouldn't have worked and so that makes you fairly dubious about whether the model is going to work now so retrodiction going back and testing on past data is a good way to test how well your model really works fifth reason predicting other stuff so you might construct the model for one reason let's suppose you're really interested in the unemployment rate you know you construct a model to predict the unemployment rate
but out of that pops the inflation rate so you get something else this is a good way to tell you know how strong your model is because typically you construct a model for one reason and it gives you other stuff now there's another type of predicting other stuff that's really cool about models so when they developed the first models of the solar system right the heliocentric model with the sun in the center right so you've got the sun sitting in the center and the planets orbiting the math didn't quite work out right and they figured out there
must be some big planet out here that's causing the orbits of the other planets to be skewed a little bit and that big planet was neptune they couldn't see it but their model predicted it so the model predicted something else something other than what was evident in the data so models can predict stuff other than what you expect them to predict which is pretty cool all right sixth reason to inform data collection so let's suppose that you're interested in educational reform which is something i'm really interested in you want to think
okay how do we make better schools well remember in our last lecture about being a clearer thinker one thing models force us to do is name the parts so if i want to think how do you make better schools well there's a lot of data out there on school performance so what i want is i want some sort of model that explains why some students do poorly and some students do well so i think what are the parts of that model well there might be things like teacher quality we'll call
that tq right there might be parental status we'll call that ps whether your parents went to college whether they've got high school degrees whether they're doctors lawyers that sort of thing there might be total spending in the school district that might matter right things like class size we'll just put cs for class size class size probably matters a lot right you might argue that you know technology matters is there technology in the classroom you might even argue you know is general health a big consideration and you could even ask you
know what are the other students like in the school right are there sort of peer effects does that affect how well students do so if you don't have a model you don't even know what data to go get and so what models help you do is figure out okay what data should we get and what data should we include in our model and you know what data should we go out there and try and find in order to figure out how the world is going to work so the use
of models to name the parts is incredibly helpful for gathering data because it tells you which data to go get our last two reasons for using models are a little bit different but they're similar to one another and that is that we can use data right to sort of tell us more about the model and then we can use the model to tell us more about the world so let me explain what i mean a little bit it's kind of confusing so one thing we can do is estimate hidden parameters in
the model so here's a sort of a classic model from epidemiology used to study disease it's called the sir model so there's three types of people there's susceptible people there's infected people and there's recovered people so if there's a disease you could be susceptible to it you can be infected or you can be recovered and when you recover then you're immune you're not going to get it again so let's suppose that you know you work for the centers for disease control and suddenly you see oh my gosh people are getting sick but
you don't know why there's some sort of flu going on but you're not quite sure how it's spreading is it airborne right is this virus spreading you know through mucus or something you're not sure and you're also not sure how virulent it is so you're not sure how many people are going to get the disease so let's draw a little graph where you've got time on this axis and you've got the number of people who have the disease and what you can do is you can sort of see over time
exactly how many people are getting the disease well if you can see over time how many are getting it from that data you can estimate how virulent the disease is like how likely it is to pass from one person to the other and that's gonna allow you to figure out is the disease gonna go like this or is it gonna go like that and so from that data you can estimate hidden parameters right namely how virulent the disease is you can't tell just by looking at people how likely one person is to get it from
another you can't observe that directly in the world but by looking at how many people get it you can go back and estimate that parameter you can figure it out that's what's really cool all right last reason calibration so calibration refers to sort of constructing a model and then calibrating it as close as possible to the real world let me give an example here so suppose i want to write a model of forest fires so i'm going to draw some really bad trees here here's a tree here's another tree right and i
want to know like what's the probability these are horrible trees okay what's the probability that the fire moves from this tree to this tree how fast does it move all that sort of stuff well what i could do is i could gather and this data exists tons and tons of data about past forest fires and with that past data i could calibrate a really accurate model of forest fires how likely they are to spread how you know their speed of spread depends on how dry the trees are how much precipitation there's been what the wind
speed is all that sort of stuff once i've got all that data that would allow me then to figure out you know how dangerous particular forests are right because i could say oh my gosh northern new mexico hasn't had rain in two years here's how dry the soil is how dry the trees are here's you know how many acres of forest we have here's what the wind speed is and you could know exactly how dangerous a particular forest happens to be at a particular moment in time so you use all sorts of past data to
calibrate a particular model and then you can use that model to construct policy and that's what we're going to talk about in the next lecture right how do we use models to make decisions to strategize right and to design things thank you
hi in this lecture we're going to look at our fourth category of reasons why you'd want to take a course in modeling why modeling is so important and that is helping you make better decisions strategize better and design things better so let's get started this should be a
lot of fun all right so first reason why models are so useful they're good decision aids they help you make better decisions let me give an example to get us going here so what you see here is a picture of a whole bunch of different financial institutions these are companies like bear stearns aig citigroup morgan stanley and this represents the relationship between these companies in terms of how the economic success of one depends on another now imagine you're the federal government and you've got a financial crisis so a lot of
these companies or some of these companies are starting to fail and you've got to decide okay do i bail them out do i save one of these companies well let's use this very simple model to help us make that decision all right so to do that we need a little bit more of an understanding of what these numbers represent so let's look at aig which is right here and jp morgan which is right here so now we see a number 466 between the two of those what that number represents is how correlated jp morgan's
success is with aig's success in particular how correlated their failures are so if aig has a bad day how likely is it that jp morgan has a bad day and what we see there is a really big number now if we look up here at this 94 this represents the link between wells fargo and lehman brothers what that tells us is that if lehman brothers has a bad day well it only has a slight effect on wells fargo and vice versa so now you're the government you've got to decide okay who do i
bail out nobody or somebody well let's look at lehman brothers there's only four lines going in and out of lehman brothers one is a 94 one is a 103 one is a 158 and one is a 155 those are relatively small numbers so if you're the government you say look even though lehman brothers has been around for a long time and it's an important you know company these numbers are pretty small if they fail it doesn't look like these other companies will fail but now let's look at
aig we've got a 466 we've got a 441 we've got a 456 we've got a 390 we've got a 490 so there's huge numbers associated with aig and because those numbers are huge you basically have to figure you know what we probably have to prop aig back up even if you don't want to because if you don't there's the possibility that this whole system will fail so what we see here is the incredible power of models right to help us make a better decision and the government did let lehman brothers fail and you know terrible for
lehman brothers but the economy sort of soldiered on they didn't let aig fail and we don't know for sure that the whole financial apparatus of the united states would have fallen apart but they propped up aig and you know we made it the country made it so it looks like they made a reasonable decision all right so that's you know sort of big important financial decisions let's look at something a little more fun this is just a simple sort of logic puzzle that'll help us see
how models can be useful now this is a game called the monty hall problem and it's named after monty hall who was the host of a game show called let's make a deal that aired during the 1970s now the problem i'm going to describe to you is a characterization of an event that could happen on the show it's one of you know several scenarios that could come up on the show here's basically how it works there's three doors behind one of these doors is a prize behind the other two doors there's some you know silly
thing like a goat right or you know a woman dressed up in a ballerina's outfit or something but one of them has something fantastic like a new car or a washing machine now what you get to do is you pick one door so maybe you pick door number one you might say i pick door number one now monty knows where the prize is so of the two doors you didn't pick one of those always has a goat behind it or you know silly prize behind it so because one of those always has a silly prize
behind it he can always show you one of those other two doors so you pick door number one right and what monty does is he then opens up door number three and says look here's a goat then he says hey do you want to switch to door number two well do you all right that's a hard problem so let's first try and get the logic right and then we'll write down a formal model so it's easier to see the logic for this problem by increasing the number of doors so
let's suppose there's five doors and now that there's five doors let's suppose you pick this blue door this light blue door the probability you're correct is one-fifth right one of the doors has a prize the probability you're correct is one-fifth so the probability you're not correct is four-fifths so there's a one-fifth chance you're correct there's a four-fifths chance you're not correct now let's suppose that monty hall is also playing this game with you and again he knows the answer so monty's thinking okay well you know what i'm going to show you that it's not behind
this yellow door and then he says you know what else i'm going to show you that it's not behind the pink door and he says i'm going to be nice i'm going to show you it's not behind the green door now he says do you want to switch from the light blue door to the dark blue door well in this case you should start thinking you know initially the probability i was right was only one-fifth and he revealed all those other doors that don't have the prize it seems much more likely that this is the correct door than that mine's
the correct door in fact it is much more likely the probability is four-fifths it's behind that dark blue door and only one-fifth it's behind your door so you should switch and you should also switch in the case of three doors now let's formalize this we'll use a simple decision theory model to show why in fact you should switch all right so let's start out we'll just do some basic probability there's three doors you pick door number one the probability you're right is a third and the probability that
it's door number two is a third and the probability it's door number three is a third now what we want to do is break this into two sets there's a one-third chance that you're right and there's a two-thirds chance that you're wrong right now after you pick door number one the prize can't be moved so it's either behind door number two or number three or if you got it right it's behind door number one so now let's think about what monty can do if it's behind door number one or
door number two he can show you door number three he can say look there's the goat well if he does that because he could always show you one of these doors nothing happens to your probability of one third there's a one-third chance you were right before and since he can always show you a door there's still only a one-third chance you're right right alternatively suppose it was behind door number three well then he could show you door number two he'd say the goat's here so it's still the case
that nothing happens to your probability and the reason why is when you think about it in terms of these two sets you didn't learn anything about whether it's in this other set right here the two-thirds chance you're wrong because he can always show you a goat so your initial probability of being correct was one third and your final chance of being correct is still one third which means there's a two-thirds chance the prize is behind the other door so just this idea of drawing circles and writing probabilities allows us to see that the correct decision in the monty hall problem is to switch right
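The switching argument above can be checked with a small simulation. This is a minimal sketch under the rules as described in the lecture (monty knows where the prize is and opens every door except your pick and one other, never revealing the prize); the function names, the trial count, and the seed are assumptions chosen for illustration.

```python
import random

def play(switch, n_doors, rng):
    """Play one round and return True if the contestant wins the prize."""
    doors = list(range(n_doors))
    prize = rng.choice(doors)
    pick = rng.choice(doors)
    # Monty leaves exactly one other door closed and never opens the prize door.
    if pick == prize:
        # Every other door hides a goat, so he leaves one of them closed at random.
        other = rng.choice([d for d in doors if d != pick])
    else:
        # The prize is behind an unpicked door, so that is the door he leaves closed.
        other = prize
    final = other if switch else pick
    return final == prize

def win_rate(switch, n_doors=3, trials=100_000, seed=0):
    """Fraction of simulated rounds won when always switching or always staying."""
    rng = random.Random(seed)
    wins = sum(play(switch, n_doors, rng) for _ in range(trials))
    return wins / trials

print("stay, 3 doors:  ", win_rate(switch=False))
print("switch, 3 doors:", win_rate(switch=True))
print("switch, 5 doors:", win_rate(switch=True, n_doors=5))
```

Staying should win roughly one third of the time, switching roughly two thirds with three doors and roughly four fifths with five doors, matching the circles-and-probabilities reasoning.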
just like when we looked at that financial decision that the federal government had to make with the circles and the arrows you draw that out and you see the best decision is to let lehman brothers fail and bail out aig all right so let's move on and look at sort of the next reason that models can be helpful and that is comparative statics what do i mean by that well here's a standard model from economics comparative statics means you know you see how you move from one equilibrium to another so what you see here is
s is a supply curve that's the supply for some good and d1 and d2 are demand curves so what you see is demand shifting out so when this demand shifts out in this way what we get is that more goods are sold the quantity goes up and the price goes up so if people want more of something more is going to get sold and the price will go up so this is sort of saying how the equilibrium moves so this is again a simple example of how models help us understand how the world will change at least
in equilibrium just by drawing some simple figures all right reason number three counterfactuals what do i mean by that well you can think we only get to run the world once we get to run the tape one time but if we write models of the world we can sort of rerun the tape using those models so here's an example in april of 201 in 2009 the spring of 2010 the federal government decided to implement a recovery plan well what you see here is sort of the effect this line right here shows the effect with
the recovery plan and this line here shows what a model says would have happened without the recovery plan now we can't be sure that would have happened but you know at least you have some understanding perhaps of what the effect of the recovery plan was which is great so these counterfactuals aren't going to be exact they're going to be approximate but still they help us figure out after the fact whether a policy was a good policy or not reason number four to identify and rank levers so what we're going to do is we're going
to look at a simple model of contagion of failure and so this is a model where one country might fail so in this case that country is going to be england and we can ask what happens over time so initially after england fails we see ireland and belgium fail and then after that we see france fail and after that we see germany fail so what this tells us is in terms of its effect on the world's financial system london's a big lever and so london's something we care about a great deal now let's
take another policy issue climate change one of the big things in climate change is the carbon cycle it's one of the models that's used all the time simple carbon models we know the total amount of carbon is fixed it can be up in the atmosphere or it can be down in the earth if it's down in the earth it's better because it doesn't contribute to global warming so if you want to think about where to intervene you want to ask where in this cycle are there big numbers right so you look here in terms of surface radiation
that's a big number or if you think of solar radiation coming in that's a big number coming in so when you think about where you want to have a policy impact you want to think about it in terms of where those numbers are large so if you look at the amount of carbon reflected by the surface that's only a 30 that's not a very big lever okay reason five experimental design now what do i mean by experiments well suppose you want to come up with some new policy so for example the federal
government when they were trying to decide how do we auction off the federal airwaves right for cell phones they wanted to raise as much money as possible well to test what auction design would work best they ran some experiments so here's an example of an experiment and what you see is this is a round from some auction and these are different bidders and you know the cost that they paid and you want
to think how do i run the best possible experiment the most informative possible experiment and one way to do that right is to construct some simple models all right reason six institutional design now this is a big one this is one that actually means a lot to me the person you see at the top here this is stan reiter he was one of my advisors in graduate school and the man at the bottom is leo hurwicz he was one of my mentors in graduate school and leo won the nobel prize in economics and
what leo won the nobel prize for is a field known as mechanism design now this diagram is called the mount reiter diagram named after stan reiter in the previous picture and then ken mount one of his co-authors and let me explain this diagram because it's very important what you see here is this theta here what this is supposed to represent is the environment the set of technologies people's preferences those sorts of things x over here represents the outcomes what we want to have happen so how we want to sort of use our technologies and
use our labor and use you know whatever we've got at our disposal to create good outcomes now this arrow here is sort of what we desire it's like if we could sit around and decide collectively what kind of outcomes we'd like to have given the technology this is what we'd collectively decide this is something that's called the social choice correspondence or social choice function sort of what would be the ideal outcome for society the thing is society doesn't always get the ideal outcome it wants because the thing is to
get those outcomes you have to use mechanisms and that's what this m stands for mechanisms so that mechanism might be something like a market it might be a political institution it might be a bureaucracy and what we want to ask is is the outcome we get through the mechanism right which goes like this is that equal to the outcome that we would get right ideally and the better the mechanism is the closer it is to what we ideally want example so with my undergraduate students for a homework assignment one time i said suppose we allocated classes
by a market so you know you had to bid for classes would that be a good thing or a bad thing well currently the way we do it is there's a hierarchy so seniors you know fourth year students register first and then juniors and sophomores and freshmen and when the students were asked should we have a market their first reaction was yes because markets work right they have this view that you know if you have a market what you get here is sort of what you'd expect to get right what you'd like to get so
it's sort of equal but when they thought about choosing classes they said wait a minute markets may not work very well and the reason why is that you need to graduate and so seniors need specific courses and that's why we let seniors register first and if people could bid for courses then freshmen who had a lot of money might bid away the courses of seniors and people might never graduate from college so markets may be a good institution in some settings they may not be in others the way we figure that out is by
using models reason seven to help choose among policies and institutions simple example suppose we're thinking about a market for pollution permits or a cap and trade system we can write down a simple model and it can tell us which one is going to work better or here's another example this is a picture of the city of ann arbor and if you look here you see some green areas right what these green things are is they're green spaces now there's a question of should the city of ann arbor create more green spaces now you might think
of course green space is a good thing but the problem is if you buy up a bunch of green space like this area here right that is all green what can happen is people can say you know let's move next to that let's put little houses all around here because it's always going to be green and that can actually lead to more sprawl so what can seem like really good simple ideas may not be good ideas if you actually construct a model to think through them okay we've covered a lot so let's
give a quick summary here how can models help us well the first thing they can do is they can be real-time decision aids they can help us figure out when we intervene when we don't intervene what choices we make second they can help us with comparative statics we can figure out you know what's likely to happen right if we make this choice third they can help us with counterfactuals they can you know after we've done a policy we can sort of run a model and think about what would have happened if we hadn't
chosen that policy fourth we can use them to identify and rank levers oftentimes we've got lots of choices to make models can figure out you know which choice might be the best one the one with the most influence fifth they can help us with experimental design they can help us design experiments in order to develop better policies and better strategies sixth they can help us design institutions themselves figuring out should we have a market here should we have a democracy should we use a bureaucracy and seven finally they can help us choose among policies and institutions so for
thinking about one policy or another policy we can use models to decide between the two all right thank you hi in this next set of lectures we're going to look at a series of models that help us try and understand an empirical phenomenon and that empirical phenomenon is this if you look out there in the world you'll see that groups of people who hang out together tend to look alike think alike act alike so what we want to do is try and make sense of why that's the case first let me sort of explain what
i mean a little bit if you look at a city like detroit this is a census map of the city of detroit what you see is each blue dot represents a city block that's majority african american where a majority of people living in that census block are african-american the red dots are blocks that are majority caucasian so what you see if you look at the city of detroit is incredible segregation so this shows that people literally choose to live with people who look a lot like they do but it's not just this sort
of sorting effect another reason that people who sort of are hanging out together look and act similarly is that well we change our behavior we moderate our behavior to match that of people around us so for example take smoking it could be that you don't smoke and you start hanging out with a bunch of people who do so you may have an occasional cigarette or alternatively you may have spent your whole life smoking but now you hang out with a group of people who don't smoke and so you say you know that's it i'm gonna
stop smoking so it's these two forces one of them is sorting or what sociologists call homophily you know we sort of choose to hang out with people who are like us and there's this other effect which is we choose to start acting like and believing like the people we're hanging out with and so both of these things are going to create groups of people who look similar to one another in any case what we want to do is we want to try and get some understanding of how those processes work through some simple models now
these may seem like sort of fairly obvious things to look at but what we're going to find is when we construct models we get some pretty interesting unexpected results okay so let's get started what's it going to look like what are the lectures going to look like you know in this unit in this module we're going to start out with a famous model constructed by a man named thomas schelling and this is a model of racial segregation and it's sometimes called schelling's tipping model in that model we're going to see how it's a little bit more
subtle than we might think about what causes those segregation patterns like we see in detroit after that we're going to look at a model by mark granovetter who looked at sort of people's willingness to participate in some sort of collective behavior this could be a riot this could be a political uprising it could be a social movement it's a very very simple model after that we're going to extend granovetter's model to something i call the standing ovation model this is a model that my friend john miller and i developed kind of for fun
but it gets at a really serious point it's a model of peer effects where you change your behavior to match that of others around you and what's interesting about it is the standing ovation you know gives us a really nice framework within which to think about this question and again there we're going to get some sort of surprising results all right and then last after we've sort of done all this we're going to talk about something called the identification problem and that is suppose it's the case that i look at a group of people and
they all seem similar well the question is did they sort did they choose to hang out with one another because they're similar or was there some sort of peer effect were they hanging out and did they all start to sort of look and talk and act the same so that's an interesting question to try and figure out and these models will give us some understanding of how we can tell those two things apart okay so let's get started now before we do that one quick comment about the type of model we're going to
be using here now a lot of times when people think about models we think about equation based models so when i teach this course let's suppose i teach this course to my undergraduates one of the things that some of them care about not all but a lot care about is their grade and so i could say to them well you know here's an equation based model for your grade it looks something like this your score on the final exam is going to be 50 plus five times the number of hours you study right so if
you don't study at all you put in no hours of study you'll probably get a 50 on the exam and fail the course if you put in 10 hours studying for the exam then you'll probably score 100 and ace the test right and get an a for the course so this would be a linear model it sort of explains how the world works with an equation now the models we're going to do in this unit aren't linear models they're what are called agent-based models so what's an agent-based model an agent-based model
works as follows you have a bunch of agents these could be people these could be firms they could be countries they could be organizations but they're the agents the objects of the model now these agents have behaviors they have things that they do rules they follow now in some cases these rules might be optimal rules they might be optimizing in that case it becomes what we call a game theory model a rational choice model where individuals are doing the optimal thing in the context of the model in the models we study here they're not
going to be optimizing they're going to be following just pretty simple rules and then the third part of an agent-based model is once you've got all these agents following all these rules that creates something at the macro level create some sort of outcome and so what we can do is then ask sort of what kind of outcome do we get what we'll find in these models is the outcomes are sort of surprising and that's again why models are so useful because we may logically think that if we assume agents that follow this behavior we're going
to get outcome a but when we work through the model and actually you know work through all the logic we'll find in fact that maybe the opposite's true or something you know sort of surprising is true okay so that's it that's the plan we're gonna look at some models of sorting and we'll look at some models of peer effects and we'll learn how the two differ and how the two are also a little bit the same all right thank you hi in this lecture we're going to talk about a famous model from social science and
this model is known as schelling's spatial segregation model schelling's model was developed by a man named thomas schelling who's an economist at the university of maryland what schelling was trying to do is he was trying to sort of understand an empirical phenomenon and that empirical phenomenon that he was interested in was segregation and two types of segregation actually the primary one he was interested in was racial segregation the other was segregation by income so let's look at some pictures of each so what you see is a picture of new york city each red dot in this graph
represents a city block that's majority caucasian or white each blue dot represents a city block that's majority african american each yellow dot represents a city block that's majority latino and each green dot represents a city block that's majority asian of some sort it could be korean could be japanese could be chinese american but it's classified as asian by the census so if you look at this picture what you see is incredible racial segregation right now the same is true if you look by income so here's again a picture of new york city each red dot
represents someone who's very wealthy each light blue dot represents someone who's poor and the moderately blue dots represent people in the middle class so if you look at this picture what you see is segregation by income and it's also you know fairly stark not as stark as the racial segregation but it's pretty stark so this is what schelling wanted to understand he wanted to construct a model to make sense of this now you might say we don't need a model why do we need a model it's obvious look people maybe they're racist people don't like
to live with people who don't look like them and that's why we get segregation well that's what schelling set out to explore and he set out to explore that using a model so what kind of model does he construct he constructs what we call remember an agent-based model so remember in an agent-based model you've got three things you've got these agents which in this case will be people right that's part one then you have their behaviors you have to say okay what rules do they follow that's part two and then the third part is you just
add them up you just aggregate it and you see what happens when all these people are following these rules what do we get at the aggregate level okay so what's schelling's model about schelling's model is about people choosing where to live so when you think about people choosing where to live well you think about okay i'm going to buy a house what kind of house do i want to buy do i want to buy this beautiful craftsman house do i want to buy a spanish style house do i want to live
in an apartment those sorts of questions well schelling abstracted away from all that and he said okay let's think about this in a different way i want to think about people living in a city living in some place and deciding should i stay here or should i move so here's how he did it he thought of each person as being located on a checkerboard so what he did is he's got the whole city whether it's new york or detroit or houston it's a giant checkerboard and each square on the checkerboard can have a person living there
or it can be blank so in this picture that we see here right what we've got is we've got a person living at x so this is our person right here and there's eight neighboring squares one through eight and one of those squares is blank right if you look at number three here right there's no one living there so she's got a total of seven neighbors now let's let red represent rich people and gray represent poor people so this is a rich person and if we look at her she's got three neighbors who are rich like
her but then she's got four neighbors who aren't so in total three out of her seven neighbors are the same as her and she's got to decide okay is three out of seven enough if only three out of seven of my neighbors are like me should i stay or should i go well this is where schelling then writes down the rules and he calls this a threshold-based rule so each person has a threshold and they decide based on this threshold do i stay where i'm at or do i move so one rule would be well 3 out of
7 is good enough so maybe your rule is 33 percent so if 33 percent of my neighbors are like me i'll stay but if fewer than 33 percent are like me then i'll move so this woman here she looks and she counts three of her seven neighbors are like her so she stays but if one of her neighbors moved out and now there were only two out of seven neighbors like her then she'd move so this is the model that's all there is to it so there's people they've got neighborhoods and they have to decide whether to stay
or to move and then we ask what happens now when schelling ran his model he did it using paper and pencil on an airplane actually and he wrote out a big checkerboard and just used i think nickels and pennies to represent the different income groups we've got some advantages schelling didn't have we're going to use a computer program called netlogo and this is free software we've used it before right so let's go to our netlogo model okay here it is our model now remember the three parts of an agent-based model are the agents their rule and
then the aggregate behavior so if you look at this netlogo model the first thing we see here is this number up on the top and that tells us the number of agents so we can set this up and we're gonna have blue agents and yellow agents with the blue agents being rich and the yellow agents being poor and they're just randomly set up on this grid we've got a behavior right and the behavior is the percent similar wanted so it's 30 percent right now people want 30 percent of their neighbors to look like them and then we've got
the aggregation which is going to be covered in these two graphs so the aggregation is going to tell us what's the percentage similar so how many people are like you in your neighborhood of eight and then the percentage unhappy how many people aren't having their threshold met all right so we're starting at 30 percent and notice we start out 50 percent similar and only 16 percent are unhappy now 50 percent similar makes sense because people are randomly set out there so half should be like you half should not okay so here we go if we let
this run what happens is we end up with 72 percent similar and nobody's unhappy so the system goes to an equilibrium but what's interesting about this is if you look at it 72 percent of a person's neighbors are like them even though people are incredibly tolerant they only need 30 percent of the people in their neighborhood to be like them and you end up with 70 percent of the people in your neighborhood like you so here's the deep insight from schelling's model what you see at the macro level segregation like this may not in fact represent
what's really going on at the micro level because these are pretty tolerant people right these are very tolerant people all they want is a third of the people to look like them and they'll be okay but if that's their rule you end up with 70 percent of people looking like you but what if we make them just slightly less tolerant and so we move this up to let's say 40 percent and we set this up now again we start with 30 percent of people unhappy and 49.5 percent of people are similar
to you and if we let this go what we end up with now is 80 percent of the people end up being similar to their neighbors right so a person's neighbors 80 percent of them are similar to them so we get even more segregation well what's interesting here is if we ramp this up even more let's say to 52 percent let's just make it over 50 now over 60 percent of people are unhappy that's because over 60 percent of people have 50 percent or fewer neighbors like them and if we run this we get unbelievable
segregation right now what's incredible about this is 52 percent isn't that intolerant if you think about it you're sort of saying look i just want to be in the majority i actually might prefer a racially mixed neighborhood or an income mixed neighborhood but if i do that what i end up with is 94 percent of my neighbors will be like me and if you look closely at this picture what you see is that there's sort of like little islands of empty space the black regions are empty space between the blue and yellow regions so these
people are really segregating now you could say so this is sort of surprising right again we get this amazing result from schelling that at the macro level we get segregation even though at the micro level people are pretty tolerant now remember how we thought about why we get segregation because we think that well rich people don't want to hang out with poor people so what if we assume that rich people really don't want to hang out with poor people and poor people don't want to hang out with rich people so
let's crank this way up to 80 percent so now people want 80 percent of their neighbors to be like them and if they're not they're going to move well we should get massive segregation here right even worse than before okay we don't we don't even get an equilibrium we get this sort of completely random process right everybody's still hanging out in neighborhoods where 50 percent of people are similar the reason why is if you don't want anybody in your neighborhood to be unlike you well it's hard to find a place to live you know because you move someplace else
there's going to be someone who's not like you and then you're going to want to move again so if people really were incredibly racist or incredibly biased based on income then we might not see the segregation we might see people moving all the time churning churning churning churning to avoid being around anyone unlike them so what schelling's model tells us right again in this really sort of very simple way is that what happens at the macro level segregation by race by income by all sorts of other things may not be right because of the fact that
at the micro level people are that intolerant so the micro and the macro may not align okay so that's the big lesson right micro motives need not be equal to macro behavior in fact schelling has a book the book that you know he's famous for is called micromotives and macrobehavior and it's reminding us that the macro level outcomes we see out there in the world need not imply what we think about the micro level behaviors of individuals okay so let's flesh out schelling's model a little bit
more because one of the things that's interesting about it is what people sometimes call the tipping phenomena in schelling's model so remember when we set it up only 15 percent of people wanted to move so let's suppose i've got some person you know they're sitting in some neighborhood there's only two sevenths of her neighbors are like her so she moves but when she moves she can cause other people to move so let's look at this person here who's sitting in a neighborhood where two of her seven neighbors are like her now let's suppose that she's happy
with that she's cool with that she's gonna stay with two of her seven neighbors being um just like her but what happens is this person leaves person number five leaves the system right moves to someplace else because that person didn't have enough neighbors like her to feel comfortable but when that person leaves that's going to cause her to leave all right so that's an exodus tip one person leaves causing another person to leave okay there's also a genesis tip let's suppose she's living in this neighborhood and she's got in this case right she's got two
out of seven neighbors who look like her and what happens is someone moves in to the neighborhood who's not like her and so now she's only got two out of eight and so now there's too many poor people living in the neighborhood she says you know what because of that i'm out of here this would be a genesis tip somebody moves in and causes her to move so those are the two things that cause tips in schelling's model genesis tips and exodus tips so people moving out cause other people to move out and people moving in
cause some people who are currently there to want to move out all right so when you look at a city then like new york or detroit or houston or l.a or chicago or philadelphia if you put up maps for any one of these cities they're going to look exactly like this well not exactly but you have the same sort of patterns of racial segregation in some cases even more pronounced now what we could infer from that would be that at the micro level people are very racist people don't feel comfortable living in neighborhoods with people unlike them
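The threshold rule and the moves described in this lecture can be sketched in a few lines of Python. This is only an illustrative sketch, not the NetLogo model shown in the lecture: the function names, the grid size, the share of empty squares, and the choice to send every unhappy agent to a random empty square once per round are all assumptions made here for the example.

```python
import random

EMPTY = None  # marker for a blank square on the checkerboard


def occupied_neighbors(grid, r, c):
    """Types of the agents on the up-to-eight squares around (r, c); blanks are skipped."""
    n = len(grid)
    found = []
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            rr, cc = r + dr, c + dc
            if (dr, dc) != (0, 0) and 0 <= rr < n and 0 <= cc < n:
                if grid[rr][cc] is not EMPTY:
                    found.append(grid[rr][cc])
    return found


def fraction_similar(kind, neighbor_kinds):
    """Share of neighbors of the same type, or None when there are no neighbors."""
    if not neighbor_kinds:
        return None
    same = sum(1 for k in neighbor_kinds if k == kind)
    return same / len(neighbor_kinds)


def is_happy(kind, neighbor_kinds, threshold):
    """Threshold rule: stay if at least `threshold` of the neighbors are alike.
    Assumption: an agent with no neighbors at all is treated as happy."""
    frac = fraction_similar(kind, neighbor_kinds)
    return frac is None or frac >= threshold


def make_grid(n, empty_share, rng):
    """Random n-by-n board of 'blue', 'yellow', and blank squares."""
    def cell():
        return EMPTY if rng.random() < empty_share else rng.choice(("blue", "yellow"))
    return [[cell() for _ in range(n)] for _ in range(n)]


def step(grid, threshold, rng):
    """Move each agent that was unhappy at the start of the round to a random
    blank square. Returns how many agents were unhappy."""
    n = len(grid)
    cells = [(r, c) for r in range(n) for c in range(n)]
    unhappy = [(r, c) for r, c in cells
               if grid[r][c] is not EMPTY
               and not is_happy(grid[r][c], occupied_neighbors(grid, r, c), threshold)]
    empties = [(r, c) for r, c in cells if grid[r][c] is EMPTY]
    rng.shuffle(unhappy)
    for r, c in unhappy:
        if not empties:
            break
        i = rng.randrange(len(empties))
        er, ec = empties[i]
        grid[er][ec], grid[r][c] = grid[r][c], EMPTY
        empties[i] = (r, c)  # the square just vacated becomes available
    return len(unhappy)


def mean_similar(grid):
    """Average share of like neighbors over all agents that have neighbors."""
    n = len(grid)
    fracs = []
    for r in range(n):
        for c in range(n):
            if grid[r][c] is not EMPTY:
                f = fraction_similar(grid[r][c], occupied_neighbors(grid, r, c))
                if f is not None:
                    fracs.append(f)
    return sum(fracs) / len(fracs) if fracs else 0.0


def run(grid, threshold, rng, max_steps=200):
    """Repeat rounds until nobody is unhappy or max_steps is reached."""
    for steps in range(max_steps):
        if step(grid, threshold, rng) == 0:
            return steps
    return max_steps


if __name__ == "__main__":
    rng = random.Random(0)
    grid = make_grid(30, 0.2, rng)
    print("similar before:", round(mean_similar(grid), 3))
    steps = run(grid, 0.30, rng)
    print("steps taken:", steps, "similar after:", round(mean_similar(grid), 3))
```

Changing the threshold passed to `run` (for example 0.30, 0.40, 0.52, 0.80) is one way to explore the cases discussed in the lecture. The exact percentages will depend on the random seed and on the assumptions above, so they need not match the NetLogo numbers.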
well what's interesting is if you poll people if you ask them people actually say no i'd like to live in a mixed income mixed race neighborhood yeah they sort of may want it to be a little bit more like maybe 30 40 percent like them well that's what people want at the micro level you know to be in sort of mixed neighborhoods it's not what they get what they get is pictures like this and that's what's so surprising about schelling's model the micro behavior doesn't produce sort of macro level behavior that's
consistent with what they want all right so that's schelling's model and in the next lecture we're going to go into more detail about sort of how do we measure segregation and then we'll talk about some of the fertility of schelling's model how we can apply it to other settings as well thank you hi in this lecture we're going to introduce something called a measure of dissimilarity an index of dissimilarity we're going to use that to look at different cities in different regions and ask how segregated they are so remember in the last lecture we looked
at schelling's segregation model and what we saw is that people who had fairly tolerant thresholds for you know living with people of different income groups or different races still ended up being segregated and so the result is when we look at cities across the united states we see substantial segregation by income we see substantial segregation by race and what we want to do here is sort of figure out can we construct some sort of measure some way of you know categorizing numerically how segregated a particular city is along a particular dimension because
when you have those measures right that allows us to make you know better sense of data to use and understand data better and that's one reason why we model so to get started let's remind ourselves again of just what these patterns look like so this is the city of new york and remember that regions that are depicted in red are predominantly caucasian regions that are depicted in blue are predominantly african-american yellow predominantly latino and green predominantly asian so new york is interesting because it's just like these big chunks of different racial groups spread out all
over the city not all cities look the same way here's los angeles right los angeles has this area called the valley which is mostly white south central which is mostly african-american and then over by monterey park it's mostly asian if you look at houston again you see all these sort of interesting patterns and how people are racially distributed across the city and if you look at dc it's almost like there's a dividing line to the east most people are african-american and to the west most people are caucasian so different cities look different ways what we'd
like to do is have some sort of number for representing this racial disparity okay now remember the same is true for income so we can use the same for income disparity and if you look at a city like chicago what you see is that there's red represents wealthy people here so there's wealthy people along this area known as the gold coast in the center of the city it's mostly poor people and then to the north and to the west in the suburbs right sometimes called collar counties makes it looks like a collar these again are
wealthy people again new york remember the red dots represent rich people people who make more than 200 000 a year all around central park here you see wealthy people and then as you move further out from the city you see poor people so it's interesting new york is sort of a little bit different than chicago in that right in the center of new york there's a lot of wealth and then as you move out it gets poorer chicago sort of looks the other way so what we want is we want some measure for
how segregated a city is so we're going to construct a very simple measure called the index of dissimilarity and we're going to do it with just two types of people rich people and poor people so i'm going to represent rich people with blue dots and poor people with yellow dots now i'm going to place these people on a grid so i'm just going to have a 24 city block area here and in each block i'm going to put 10 people all right so let's start out and let's suppose that in 12 of these blocks
right here i put all rich people and in six of these blocks i put all poor people and in six of these blocks i put half poor half rich so what does that give me total well remember so i've got 12 blocks here and i've got 10 people per block so that's 120 rich people here and i've got six blocks here but there's only five rich people per block and five poor so that's 30 so 120 plus 30 equals 150 so i've got 150 rich people and then i've got 90 poor people so 240 people
total 150 rich 90 poor and i want some way of representing how segregated are these city blocks now the interesting thing is these districts here these green ones are less segregated than these blue ones and these yellow ones so i want some measure that will capture that fact all right so how do i do it well what i'm going to do is this i'm going to let little b be the number of blue people in a block and let big b be the number of blue people total so then if i
take little b over big b that's going to tell me the percentage of the total number of blue people that are in that block right so it's just giving me some sense of the proportion of the total number of blue people that are in that block and similarly little y over big y is giving me the proportion of the total number of yellow people that are in that block now why do i want to do that why do i want to look at these two numbers because if i take the difference between little b over big b
and little y over big y that's going to tell me how distorted the distribution is in that particular block let me be more precise suppose i have a district that has five blue and three yellow and i want to argue this would be a perfectly representative district what that would mean is that there's 150 blue people and five of those 150 people live in this particular block so 5 over 150 equals 1 over 30. so one out of every 30 blue people lives inside that block now there's 90 yellow people and
three out of the 90 yellow people live in that block so one out of 30 yellow people live inside that block or poor people live inside that block so 1 over 30 minus 1 over 30 equals zero so what we get is that if you had a perfectly representative block between rich and poor what i'm calling blue and yellow we'd have a difference of zero but if we've got relatively more blue or relatively more yellow since i'm taking the absolute value that's what these two lines mean right here the absolute value it means that i'm
going to get a positive number so i'm just going to represent more segregation so let's look at our particular example so this block right here is all blue right so there's 10 blue people in there now there's 150 blue people total so 10 out of 150 blue people live in that block there's no yellow people no poor people in that block so i've got 10 over 150 minus 0 over 90. so that equals 10 over 150 i can get rid of the zeros it equals 1 over 15. so
in every one of these blocks my index is going to be 1 over 15. now these yellow blocks right here there's no blue people there's no rich people so that's zero over 150 but there's 10 yellow people or poor people so that's 10 over 90. so there's way more yellow people than there should be proportionally and so if i take 0 minus 10 over 90 i get 1 over 9 right because i've got these absolute value signs here so everything becomes positive so these districts these blocks are 1 over 9. then finally i've got
these green districts remember these have five blue so that's five over 150 and they've got five yellow so that's five over ninety right and i take the absolute value now what do i get there well that's one over thirty minus 1 over 18 so this is complicated but we're going to find out that this is equal to 1 over 45 okay so this is 1 over 45. what we get then is for every one of those 12 blue districts the index of dissimilarity is 1 over 15. for every one of the yellow districts the index of dissimilarity
is 1 over 9 and for every one of the districts that's 5 blue and 5 yellow it's 1 over 45. okay so how do we figure out how segregated this whole region is what we do is we say we've got six districts or blocks here that have a dissimilarity of one over forty five so we get six times one over forty five and we've got six here that have a dissimilarity of one ninth so we're gonna add six times one ninth and then we've got 12 that have a dissimilarity of 1 over 15 so we get 12 times 1 over
15. and if we add all that up we get 72 over 45. so 72 over 45 is our tentative measure we're gonna have to change this a little bit because what does that mean what does 72 over 45 mean is that bad is that good so let's go through and let's sort of put our measure through the paces so whenever you construct a measure what you want to do is try some extreme cases to see how well it works so let's start out with a simpler case to see if this measure sort of makes sense
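The arithmetic for the 24-block example can be checked with a short Python sketch. The function name `dissimilarity_sum` and the list-of-pairs representation of blocks are assumptions made for this illustration; the sketch computes only the tentative measure described so far, the sum over blocks of |b/B - y/Y|, using exact fractions.

```python
from fractions import Fraction


def dissimilarity_sum(blocks):
    """Sum over blocks of |b/B - y/Y|.

    blocks: list of (blue_count, yellow_count) pairs, one pair per block.
    B and Y are the total numbers of blue and yellow people.
    Assumption: both groups have at least one member, otherwise the
    division below raises ZeroDivisionError.
    """
    total_blue = sum(b for b, _ in blocks)
    total_yellow = sum(y for _, y in blocks)
    return sum(abs(Fraction(b, total_blue) - Fraction(y, total_yellow))
               for b, y in blocks)


# The lecture's example: 12 all-blue blocks, 6 all-yellow blocks,
# and 6 blocks with five blue and five yellow, 10 people per block.
lecture_blocks = [(10, 0)] * 12 + [(0, 10)] * 6 + [(5, 5)] * 6

per_block_blue = abs(Fraction(10, 150) - Fraction(0, 90))   # 1/15
per_block_yellow = abs(Fraction(0, 150) - Fraction(10, 90))  # 1/9
per_block_mixed = abs(Fraction(5, 150) - Fraction(5, 90))    # 1/45

tentative = dissimilarity_sum(lecture_blocks)
print(tentative)  # 8/5, which is 72/45 in lowest terms
```

Using `Fraction` rather than floating point keeps the result exact, so it can be compared directly with the 72 over 45 worked out by hand.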
and i've got four blocks that are all blue and four that are all yellow and here's another case where all eight of them are 50 50 and let's compute our index of dissimilarity in each of these cases so let's start with this one well in each one of these blocks there's going to be five blue right and five yellow the total number of blue and yellow right since i've got 8 blocks i've got 80 people so that means there's going to be 40 blue and 40 yellow so for each one of these blocks i get
5 over 40 minus 5 over 40 which is zero so every single block contributes zero and my total index of dissimilarity is zero so that's great right because that means that if everyone was perfectly mixed my index would be zero so it seems like it's a pretty good index but let's go back and look at this other case so now i've got this case where i've got you know four that are all yellow and four that are all blue so once again i've got 40 yellow and 40 blue but now i've got
to think for each one of these yellow districts what do i have i've got 0 over 40 blue minus 5 over 40 um yellows right i'm sorry 10 so i've got 10 yellows in each one so 10 over 40 yellows so what that means is that's going to be equal to 1 over 4 in absolute value and since all these are the same i'm going to get a fourth a fourth a fourth a fourth and so on and also for the blues right by the same logic so every single one of these is going to give me a value
of one-fourth when i add all those up i get two right i don't get one i get two so now i've got a bit of an issue so if people are perfectly segregated i get two and if they're perfectly mixed i get zero so this suggests i've got a pretty good measure here but what i probably want to do is i want to divide it by two right so if i divide it by two then if you're perfectly mixed you get a score of zero and if you're perfectly segregated you get
a score of one all right so if i go to this case where there's 40 rich and 40 poor and they're perfectly mixed my score is going to be zero because i get five blue and five yellow in each district oops this should be a 5 right and so i get a score of 0 and when i do the other one i get a score of 1 now i look at my thing here where i got 72 over 45 which didn't make any sense now that's 72 over 90 and if i divide top and bottom by nine that's
going to be 8 over 10 which is 0.8 so it's 80 percent so this is 80 percent segregation which seems pretty segregated now we can go back and we can look at our cities so now we can go back and we can look at our census data and we can look at a city like philadelphia and ask how segregated is it and notice look at what we get 0.8 exactly the same as our example so this tells us the score in philadelphia is point eight now we look at this map and say well how segregated is it now
we have a score and now we can do things like compare philadelphia to detroit so remember detroit also looks segregated when we go to detroit even though it looks sort of segregated the score is only 0.6 so detroit is actually substantially less segregated than philadelphia even though look if you look at these two pictures here's philadelphia and here's detroit it's very hard to tell the difference between the two okay so what have we learned in this lecture we've learned that we can construct a very simple measure called the index of dissimilarity and by using
that measure we can compare how segregated different cities are and now once we've got this measure in our pocket right we can use it to measure segregation by race segregation by income and all sorts of segregation it's a really useful tool to help us sort of take data and understand the world all right thank you hi in this lecture we're going to look at our first model of peer effects this is going to be a very short lecture but i just want to get the idea of peer effects out there and introduce what i think
is an interesting point and that is namely that when we think about these sort of contagion phenomena that happen through things like peer effects that sometimes the tail wags the dog what do i mean by that what i mean is that sometimes the people at the end of the distribution the extremists are the ones that really drive what happens and as a result that means it can be incredibly difficult to predict what's going to go on so that's the basic lesson of this model that the tail is going to wag the dog so this
model was developed by mark granovetter who's a sociologist at stanford university and it's really fun it's one of my favorite models just because of its simplicity and elegance so um before i present the model i want to again get to this point about predictability so recently we've had some events we've had uprisings in libya and in egypt and these caught everybody off guard there weren't experts lined up saying egypt's about to topple libya is about to topple in fact no one predicted it at all you know you don't have to go far back in
time to the orange revolution and see that nobody predicted that either when ukraine suddenly had this giant uprising that too was unexpected ditto for the berlin wall you didn't have anybody seeing this coming now it's easy to say oh my gosh you know these experts aren't really experts and there's a sense in which well you know we can be critical of how smart the talking heads on tv are but it wasn't just the talking heads no one anywhere saw this stuff coming so we're going to use granovetter's model to
explain you know why that's the case why it can be very very hard to anticipate sort of a whole bunch of people moving or joining some cause all right so here's the model very simple super simple so there's n individuals n people each person has a threshold this threshold is like how many other people would have to join the movement in order for them to join the movement so if your threshold is zero if your threshold is absolute zero that means nobody else has to join you're going to you know grab your stick
and run out there in the streets right if your threshold is 50 then you need to see 50 people out there before you join in the sort of collective movement so each person has a different sort of threshold for whether they're going to sort of join in and participate in some collective activity so that's the model what we want to do is we want to analyze how the outcome varies depending on the distribution of these thresholds right what causes the collective action to occur collective movement to occur and when doesn't it
okay to sort of make this a little more fun and a little less heavy instead of thinking about you know people rising up and overthrowing some dictator let's think about people wearing some silly thing like a purple hat so suppose you have a group of friends just like you know five of you and you know any one of you at any given moment in time could decide to start wearing a purple hat and there's a question of like do you wear a purple hat well let's suppose here's how it works we've got five people right and
here's the thresholds so the thresholds are zero one two two and two so this person who's got a threshold of zero you know what he likes how he looks in purple hats so he's just gonna get one these three people have thresholds of two they're really not that keen on purple hats but if two of their friends wear purple hats then they're gonna get one so let's see what happens so the person who's got the threshold of zero well they buy the hat they put it on
because their threshold was zero once they buy the hat the person who's got a threshold of one buys the hat right because one of their friends has the hat so they think well you know that's fine i'll wear the hat but once the two of them wear the hat right then these three people down here these three people at a threshold of two they buy the hat and everybody ends up buying the hat right so you get this collective purple hat movement within your group of five people okay that's example one okay let's do another
example this time there's five people three of them have thresholds of one and two of them have thresholds of two so what happens in this case that's right nothing nobody ever buys the hat because nobody's got a threshold of zero this is what i mean by the tail wagging the dog there's no one who's got that threshold of zero there's no one who really really wants to wear the purple hat so it never takes place okay let's do one more so in this one we've got five people and now the thresholds are zero one two
three and four so we've got someone at zero and someone at four so zero through four are the thresholds all right well what happens in this case well in this case the person who's got a threshold of zero buys the hat right because they want the hat once he buys the hat the person who's got a threshold of one buys the hat once those two have the hat the person who's got the threshold of two buys the hat right once those three have the hat the person with the threshold of three buys the hat and once those four
have the hat this last person who really didn't want the purple hat right gets the purple hat because all of his friends have it so you end up with all five people having the hat even though the people really weren't that keen on having purple hats so what do i mean that they're not that keen let's compare the last two examples in a little bit more detail remember the one example where nobody gets the hat the thresholds there are three with one and two with two so the average number of people who have to get a
hat for someone to get a hat is 1.4 so people are pretty willing to get purple hats but there's just no one to get it started in the second case where you've got 0 1 2 3 and 4 the average value is 2 right and so here you've got people who are on average less willing to get hats but what happens is they do get the hats because you've got this person at zero and this person at one so the tail is able to wag the dog all right so what do we
learn from this we learned that collective action collective participation is more likely to happen if right there's lower thresholds people are really angry really upset or really want to wear the purple hat but we also learned and this is the surprising part that it's more likely to happen if there's more variation in the thresholds if there's more people who sort of at the low end want to wear the hats or participate in the collective action that can cause the whole system to ripple can cause a cascading effect in which you get collective action
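The three purple hat examples can be replayed with a few lines of code. This is only a sketch of the threshold rule as described in the lecture: a person joins once the number of people already wearing the hat reaches their threshold. The function name `cascade` is an illustrative choice, and it assumes everyone can see everyone else.

```python
def cascade(thresholds):
    """Return how many people end up joining under the threshold rule."""
    joined = 0
    while True:
        # Everyone whose threshold is met by the current count is in.
        now = sum(1 for t in thresholds if t <= joined)
        if now == joined:
            return joined  # nobody new joined, so the process has settled
        joined = now


examples = {
    "example one": [0, 1, 2, 2, 2],
    "example two": [1, 1, 1, 2, 2],
    "example three": [0, 1, 2, 3, 4],
}

for name, thresholds in examples.items():
    average = sum(thresholds) / len(thresholds)
    print(name, "average threshold", average, "hats", cascade(thresholds))
```

Running it shows the point about the tail: the group with the lower average threshold (1.4) ends with no hats, while the group with the higher average (2.0) ends with all five wearing one, because only the second group has someone at zero to start things off.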
that's why it can be very difficult to figure out if there's going to be some sort of uprising because you not only need to know the average level of discontent right you need to know the distribution of discontent you've got to know are there a handful of people a group of people who are really willing to sort of rise up and in addition to knowing that you've got to know are those people connected to one another in interesting ways and that's what we'll start looking at in the next lecture we're going to push this a little
bit further and we're going to do something called the standing ovation model which allows us to look at this sort of peer effect phenomenon in a little bit more detail but for now right this simple granovetter model gives us a lot it tells us if you want to ask do you see collective action yes it matters how much people are willing to participate right how low their thresholds are but it also matters what that distribution of thresholds is and if you've got more heterogeneity more diversity more people in that tail really willing to
riot you're more likely to get collective action thank you hi in this lecture we're going to look at a model that's a favorite of mine known as the standing ovation model now this is a model that builds off the granovetter model it's really just an extension but it can allow us to sort of think about threshold-based models of participation and peer effects in a little more subtle ways why standing ovations those seem like a kind of funny thing to study well here's why think about a standing ovation when the performance ends you don't have a
lot of time to decide whether you're going to stand up or not you've got to make sort of a fairly quick judgment you're going to clap of course right then you've got to decide do i stand or do i not stand and then after the standing ovation either starts or doesn't start you've got to make another decision you know do i stand up do i follow these people or do i you know stay sitting so when we think about human behavior there's going to be different models that we play with throughout the course about how humans
act one model will be that people are optimizing that they make rational choices in all settings when the context is a standing ovation that's probably a difficult thing to do because it's all happening so fast so instead what people probably do is they follow rules so standing ovations are a nice domain to get people thinking about rule-based behavior about people following simple rules and then asking how those rules aggregate so it's kind of like the schelling model that way right people follow rules and those aggregate and we're going
to see what's going to happen now this is a model of what you can think of as a peer effect now why so when you think of a standing ovation one way to think about it is as a peer effect right other people stand and then you stand what's nice about standing ovations is that it could also be information so what do i mean by that well suppose you you know you're sitting in the theater and you notice that the woman sitting in front of you to the left you know
is talking with her neighbor she seems to know a lot about the theater about the performance you're going to see and she seems like someone who is a little more sophisticated than you are so the play ends and you're kind of thinking do i stand or not stand and you see her just pop up she pops up she's applauding like crazy well you figure she's conveying some information that could be useful to you about the quality of the show and so you decide to stand up right so it's not just that you're copying her for the sake
of copying her like in the granovetter model like with the purple hats here you're copying her because you think she's telling you something about how good the show is okay so that's an information effect so what we want is a model that's going to capture these sort of peer effects and information effects so just like in the granovetter model we're gonna have a threshold there's gonna be a threshold at which i stand up but it's gonna be somewhat different so in the granovetter model the thresholds were the number of
other people doing the action participating in the collective action rioting wearing the purple hats there's some number of other people that i needed for me to do it here the threshold is going to be related to the quality of the show so let's suppose shows have qualities between 0 and 100 and your threshold may be 70 or 60 say it's 70 then any show above 70 you stand any show below 70 you don't stand so if the quality is above the threshold you stand if the quality is below your threshold you don't stand but let's make this
a little bit more sophisticated so let's suppose instead of there just being you know you see the quality let's suppose that you get something called a signal now the signal is going to be the quality plus some other term some term e which we can think of as an error right so you don't see the true quality you see the quality plus some noise and now your decision to stand doesn't depend on the quality and your threshold it depends on your signal and the threshold if your signal's above the threshold you stand if your signal's
below the threshold you don't stand okay so that's the model but one more thing because that was the initial decision to stand now you've got to decide okay after this sort of initial decision do you stand once you see other people standing so what you can assume is that you've got some sort of threshold x right that says if 10 percent of people are standing i'll stand or maybe if 30 percent of people are standing i'll stand right and that also affects your behavior so the rule really depends on just two things one
your initial threshold for quality and then two the second threshold for how many other people have to be standing for you to stand okay so that's the model so now let's think through some of the results we can get from this model so the first claim is the higher the quality of the show the more likely there is to be a standing ovation why is that true well think about it you stand if the quality plus the error which is your signal is above the threshold so if the quality is higher your quality
plus your error term is higher it's more likely to be above the threshold so you're more likely to have a standing ovation again not surprising one thing we want from models is we want them to give us results that make sense right basic things that make sense and then the other stuff right may be more surprising but we want some stuff to really accord with our logic right off the bat okay what's another result another result is lower the threshold make t lower and you're more likely to get a standing ovation why is that true again same logic right we
need our signal the quality plus the error to be bigger than our threshold so if my threshold falls from 70 to 50 to 30 to 20 i'm more likely to stand up right makes perfect sense result three the lower x is remember x was the percentage of people who have to stand for me to stand subsequently then there's going to be more ovations right okay that makes sense as well right and the logic is you stand if more than x percent stand so if x goes from you know sixty percent if you need sixty percent of people standing
down to ten percent you only need ten percent of people standing then you're much more likely to get a standing ovation because if twelve percent of people stand then everybody stands up all right now what would cause x to be big and what would cause x to be small so this is a cool thing once you write down these little variables you then get to ask what does that mean in the real world so what would it mean for x to be big or x to be small that's sort of
how willing you are to stand based on other people standing so big x would mean if you didn't stand initially you need to see a ton of people standing before you stand so this would represent people who are really secure in who they are a small x would mean people are really ready to jump on any bandwagon right so if five percent of people are standing then they stand so this x tells us something about the people in the audience right which is an interesting thing to think about okay let's go back to this signal remember
i said you get the signal s which is q plus this term e what is e well remember i said e is error right so i said there's a quality of the show that may be like 55 and you don't see 55 maybe you see 58 or 52 so it's variation in what we perceive well another way to think of that variation in what we perceive would be to think of it as diversity right if you and i care about different things we've had different life experiences we may interpret the theater the performance differently than one another
and as a result what's a 50 to you may be a 60 to me or what's a 70 to me may be an 85 to you so because you and i are different we're actually going to perceive different qualities so this e term right this q plus e we can think of this as either error or we can think of it as diversity so why does that matter it matters because when you think about interpreting the model like what the model tells us in one case it'll tell us well what happens if people have
more noise in terms of how they interpret the show or alternatively we can ask what does it mean if the audience is more diverse right okay so well what does it mean well to do that now we've got to think a little bit now we've got to actually sort of do some math so let's do an example suppose we've got a thousand people the threshold for all these people to make it simple is 60 and the quality of the show is 50 so 50 is less than 60 right so that means people aren't going
to stand for the most part if there's no error no variation nobody would stand up okay so let's do a case first where the error term is really small so it's 15 now the mean here is 50 right and the error term is between minus 15 and plus 15 so someone who's got an error of minus 15 gives it a 35 and someone who's got an error of plus 15 gives it a 65 so let's write 65 there so all these people below 60 they're going to sit they're going to clap they may clap nicely but they're going
to sit because it's below their threshold so only this small set of people here stands up so unless x is really small what's going to happen here is a few people stand up no standing ovation but now let's increase the variation let's suppose the error goes from minus 50 to 50 well now the people who get an error of minus 50 right they're gonna give it a score of zero and the people who get an error of plus 50 they're going to give it a score of 100 now once again the average score is still 50 right
so that's less than 60 so the average person doesn't stand up but here all these people stand up so if you assume people are sort of uniformly distributed between 0 and 100 so they're equally distributed across there that means that 40 percent of people will stand up what that means is unless x is really big as long as x is below 40 percent you're going to get a standing ovation and think about it if you go to the theater and you see 30 percent of people stand up you're likely to stand up so what
we get here is when you have this large diversity or large error term you're more likely to get a standing ovation okay so that's claim four if q is less than t so if it's a bad show well not really bad it's just not a great show you're more likely to get a standing ovation if the variance in e is larger right and again that's because you stand if q plus e is bigger than the threshold if there's more people with big e's you're going to have more people stand and that's more likely to
cascade into a full-blown standing ovation okay so let's think about what would cause e to be big well one thing would be the audience right so an unsophisticated audience you know maybe they don't get it i mean some people would think it's great some people think it's bad you know you're just getting more variation or if you have a diverse audience right if you have people from different backgrounds you're going to get more variation the performance itself could matter if it's really complex and hard to figure out that could cause variation or if it's just
multi-dimensional there's lots going on that could cause more variation so different attributes of the audience and the performance could result in more variation in that e term which could lead to more standing ovations so let's think about what we know what did we learn higher quality show more likely to be a standing ovation not a surprise lower threshold for standing right more likely to be a standing ovation again not a huge surprise sort of people more willing to stand up jump on the bandwagon lower x more likely to be a standing ovation larger you know larger peer
effects is what i call that and the last thing is more variation if there's more variation more diversity in that error term then it's more likely to be a standing ovation so here we've got these four sort of nice results that help us understand when we're going to get standing ovations and when we're not now you could ask how much did the model help us well my guess is that if you just thought about standing ovations without constructing a model you might figure out yeah better shows get standing ovations you know people more willing to give standing ovations are going to
have standing ovations and if people are more susceptible to influence you're going to get standing ovations but you probably wouldn't have gotten this one you probably wouldn't have gotten more variation now you might have if you learned the granovetter model right but most likely you wouldn't have had it now that's the simple standing ovation model let's ramp it up let's have some fun right so let's do an advanced standing ovation model now i used to give the standing ovation model as an assignment for students and they'd construct models a lot like the ones we just did
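The numeric example from the simple model (a thousand people, a threshold of 60, a show of quality 50) can be sketched in code. Several things here are assumptions made for illustration and are not stated in the lecture: the errors are spaced evenly across the range as a deterministic stand-in for a uniform distribution, everyone shares the same second threshold `x`, and the function names are invented for this sketch.

```python
def initial_standers(quality, threshold, spread, n=1000):
    """Count people whose signal (quality + error) exceeds the threshold.

    Errors are spaced evenly over [-spread, spread], a deterministic
    stand-in for a uniform error term (an assumption of this sketch).
    """
    count = 0
    for i in range(n):
        error = -spread + 2 * spread * (i + 0.5) / n
        if quality + error > threshold:
            count += 1
    return count


def final_standers(quality, threshold, spread, x, n=1000):
    """Second rule: everyone else stands once a fraction x is already standing."""
    first = initial_standers(quality, threshold, spread, n)
    return n if first / n >= x else first


print(initial_standers(50, 60, 15))     # 167 people stand at first
print(initial_standers(50, 60, 50))     # 400 people stand at first
print(final_standers(50, 60, 15, 0.3))  # 167, no ovation
print(final_standers(50, 60, 50, 0.3))  # 1000, a full standing ovation
```

With the same mediocre show and the same thresholds, only the spread of the error term changes, and that alone decides whether the audience tips into an ovation when x is 30 percent.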
now what's funny is if you don't give it to a student if you just have someone on the street start describing a standing ovation or just going to the theater they'll include two things that weren't in the model now modeling requires leaving things out but here's two things that almost everybody leaves out of a standing ovation model the first one is the theater itself right i mean you're actually in a theater in our model we just had these people who sort of saw everybody but that's not true right the other
thing is this most people when they go to the theater they go with a date or a group i mean sometimes you go alone right if i'm traveling to new york or something without my family i may go to the theater alone pick up one cheap ticket but for the most part when i'm sitting in the theater i notice that most people are there in groups okay so now we can ask do these things matter i mean these are features of the real world not in the model do they matter well let's see let's look
at um the auditorium itself first okay so here's the front right here's the front of the auditorium and here's this you know performer with a funny hat conducting the orchestra and we want to think does it matter well let's think about it so if you're sitting here sitting in this spot you don't see everybody after the standing ovation starts you may only have some sight lines right so you get some sort of cone in which you can see who's standing and not standing let's think about why that might matter well suppose you know i'm
sitting here i have this cone i'm sitting here i get this cone here i have that cone here i have that cone so you see that there's small cones and there's big cones let's think about the difference between those two so what's a small cone person someone in the front they don't see anybody right but notice that almost everybody sees them so they influence everybody but they're influenced by hardly anybody they're like celebrities right they're somebody who like we all care what celebrities do but they don't care much what we do
right now what about people in the back well people in the back they can see almost everybody right but almost nobody can see them right anybody sitting in here nobody can see these people right they can't unless they look through the back of their heads they turn around or something well these are more like academics these are people like me who spend a lot of time studying how the world works but not that many people pay attention to what we say more people care what oprah or ashton kutcher says than what i say so as a
result even though we have a better sense of what's going on right nobody listens to us well let's think about the you know ramifications of that what that means what we'd like is we'd like to get a standing ovation if the performance was good and we'd like to not get it when the performance was bad but what's happening is everybody's cueing off these celebrities who really don't know what other people think and nobody's paying attention to the people in the back who really do know what everyone thought so this means that maybe the system isn't going to
aggregate as well as we'd like it to right so maybe we're not going to get the right answer as often as we expect to okay what about dates what about the fact that you know hey look you know got a date going to the theater how does that matter well think of this with respect to the x term before i said i'm going to stand and look around i'm going to stand if you know x percent of people are standing up but once i've got a date if she stands up then i'm probably going to stand
up so what this means is if you think about people in groups or in pairs if one person in that pair stands up then everybody might be more likely to stand up which means that adding groups or adding dates or pairs means that you're gonna get more standing ovations because all it takes is one person in the group to get them both to stand up that's gonna increase the percentage of people standing up which in turn is going to create more of a standing ovation okay so what do we got how do we increase the
probability of a standing ovation right we had the four we had before higher quality show right lower thresholds larger peer effects those are the obvious ones right more variation the unobvious one and now we have two more use celebrities put people in the front right put people in the front who are going to stand up for sure so pay somebody in the front to start an ovation and then big groups create groups of people and then if one of those people in that group stands up the whole group stands up and then that causes more people to
stand up so now we've got six ways you can cause standing ovations and i'm going to guess that like these three might not be things you thought of right people in the front matter more and that you want to create groups so the models actually helped us figure things out okay so wait you might say okay this is all fun and good but this is standing ovations not that important true absolutely true but again we're using this model right to help us sort of make sense of rule-based behavior in environments where there's peer effects
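The date effect described above can be illustrated with a toy row of ten seats. This is a sketch under strong assumptions that go beyond the lecture: consecutive seats form a group, a group always stands as soon as any one member stands (the lecture only says "probably"), and the seating pattern and the value of `x` are made up for the example.

```python
def standing_fraction(initial, group_size):
    """Fraction standing after each group copies any member who stands.

    `initial` is a list of booleans in seating order; consecutive seats
    form groups of `group_size` (an assumption of this sketch).
    """
    standing = list(initial)
    for start in range(0, len(standing), group_size):
        group = standing[start:start + group_size]
        if any(group):
            standing[start:start + group_size] = [True] * len(group)
    return sum(standing) / len(standing)


# Three of ten people stand on their own judgment of the show.
initial = [True, False, False, True, False, False, True, False, False, False]
x = 0.5  # hypothetical share that must be standing before everyone else joins

alone = standing_fraction(initial, 1)   # nobody came with a date
paired = standing_fraction(initial, 2)  # everybody came with a date

print(alone, alone >= x)    # 0.3 False
print(paired, paired >= x)  # 0.6 True
```

In this toy case the same three initial standers fall short of the 50 percent mark when everyone sits alone, but pairing people up doubles the share standing and pushes it past x.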
one of the reasons we construct models is for fertility so once we construct a standing ovation model we can use it someplace else but where else can we use it well we just studied right collective action problems or you know participation problems and political uprisings riots and things like that and that looks a lot like the standing ovation model and now you can think of who are the celebrities right and what does it mean to have big groups in that sort of context well the celebrity would be someone who's maybe got a lot of influence on
other people so if you want to start a political uprising you need those sorts of people so this tells us something about that it also works for things like academic performance suppose you've got a school that's underperforming and you want to sort of have people lift themselves up right to work harder how do you do it well you can think we need to sort of raise quality lower you know sort of lower these sort of thresholds the barriers for people to engage in this productive behavior you can do things like use celebrities
right and you can even create groups get groups of students who work together to all do well right and that may cause other people to do well urban renewal right you can think of each one of these people instead of each one of these boxes instead of being a person standing up you could think of it as someone fixing up their house now what's interesting here is let's go to that sort of variation idea well you could think we want to improve some city let's give everybody a thousand dollars if they'll fix up their house
it could be that nobody does it because it doesn't push anybody above their threshold to fix up their house but the variation idea says if we give a handful of people a lot of money to fix up their house they may do it and then that could cascade in fact a lot of people who study urban renewal push that sort of logic that you should target particular areas where people are likely to do it get things started and help the process along all right fitness and health if you want to get a group of
people of society even to become more healthy that's a lot like a standing ovation you need some people to do it and then you hope other people do it now that could be pure effects right i see other people being healthy i want to just copy them or it could be information i see them acting healthfully engaged in fitness and then i think you know what they're doing better than i'm doing so maybe you know they seem happier and healthier so i'll copy what they're doing even like write this online card so i won't be
able to sign up for this online course it's like a standing ovation model each person's deciding do i stand up or not is the quality about my threshold right so if i want people to take the online course i can draw lessons from the standing ovation model and think about how do i do it you know by using celebrities and that's some sense what i did was i tried to get a lot of my friends who were celebrities within the academic world to say hey this is a cool course all right so that's the standing
ovation model it's a pure effect model but it's also the sort of information model and it explains again why we see groups people all doing you know similar stuff right because we see other people doing it and then we copy them and this tells us sort of when we're likely to see it right and when we're not likely to see it okay thank you hi in this last lecture in this module i want to talk about something known as the identification problem and that is basically the question of how do you tell whether something occurred
you know whether people hanging out with each other look alike happened because of sorting because of the shelling or because of the standing ovation pure effect right or these the fancy economic academic terms is this because of homophobia right this idea that you want to you know be with people who are like you or is it because of pure effects you want to start act that you act like people who you hang around with well in some cases that's easy to figure out right so if you look at segregation patterns right by race it's very
clear that like this happens because of sorting right it's also true look at this is a picture of a middle school this is um some data from james moody the yellow dots here are caucasian students and the green dots are african-americans and what you see is you see these four clusters right here's cluster one two three and four the pink dots are mixed race students and what you see is that these kids have sort of sorted into groups based on race now you might also say wait there's also this break this
way right because there's four groups what's that well this is middle school so that's male female right so basically white girls white boys black girls black boys creating four different social groups and this again it's very clear that's just sorting now there's other things that aren't sorting so this is one of my favorite graphs you can find this on the web these are generic names for soft drinks so if you go up here in the northeast right or if you go out here in the west people tend to say soda right if you live anywhere
here in the great midwest where i live people will say pop like if i go to the restaurant i'll say oh i'll have a pop you know give me a coke which if you go down into the south is what they call almost anything so you can walk into a restaurant say i'll have a coke and they'll say would you like a dr pepper or an orange or a coke so coke refers to any soft drink so if you look around the united states there's two soda regions there's a pop region and there's a coke region
now there's no way this is sorting there's no way someone sort of grows up in the south really wants to say pop and so they move up here to the midwest in order to live with the pop people this is not going to happen right so this is clearly a peer effect now there's two books that recently came out one is called the big sort by bill bishop the other is called connected by nicholas christakis and james fowler and both of these books sort of make the case for one of these effects so the
big sort talks about sorting obviously and connected talks about peer effects and in those two books you see pictures of things where they sort of argue that sorting is the cause or peer effects are the cause now what's an example from the big sort well here's political opinion so what you see in 1976 each dark county is colored in where the democrats won by more than 20 percent and each gray county is one where the republicans won by more than 20 percent and white counties are counties where it was within 20 so
close counties this is what 1976 looked like here's 2004 now if you notice the country's become much more the country's been filled in and the places that aren't filled in are just a few states like minnesota michigan iowa and wisconsin but other states are almost completely filled in and the other thing to recognize here is that a lot of the very dark regions extreme majority democratic are cities and so most voters live in non-competitive districts bishop argues that this happened because of sorting the democrats moved to where democrats are and republicans moved to where republicans are
so people sorted according to their political beliefs now you could also make an argument though that this happened because of peer effects people moved into democratic districts where there were more democrats so they just became democrats well here's some pictures from the book connected which argues for peer effects and this has to do with happiness now people who are blue here are unhappy and people who are yellow are happy so yellow sort of sunny if you notice you see clusters of yellow people and then you see clusters of blue people so what you've got
is you've got unhappy people hang out together and happy people hang out together now what christakis and fowler argue is that this is because of peer effects if you're unhappy but you start hanging out with happy people then you become happy right but one can make the alternative argument that this is because of sorting now if you look at smoking you see a similar thing here's smoking in 2000 now the yellow dots are the people who smoke and what you see here is they tend to be out near the fringe of the social
network and you see big clusters of people who don't smoke so here there's a lot of you know you can look at this and think well boy this seems to be evidence of just this snapshot you could convince yourself that this is evidence of peer effects in smoking here's one more though it gets sort of problematic this is the average number of hospice days that someone spends if they're chronically ill now you'd think this should be pretty much the same across the country yet you see huge differences right like you look just in the state
of tennessee here right there's regions where it's you know between 6 and 13 days and then right adjacent to it right right there it's above 23 days and so what you see is you see these like in here you know up in idaho you also see huge disparities and what's going on here is where these lines are when you see these sharp lines those are probably different hospitals so what that means is some hospitals are keeping chronically ill patients for a long time and others aren't so why is that well that could be sorting it could
be doctors and nurses who'd like to keep people in hospitals for a long time move to one area it could also be peer effects it could be that people sort of do what the other doctors around them do so when you get things like this right you see pictures like this it's an open question which one is it right here's one more this is medicare reimbursements per enrollee so how much medicare money do people get back and if you look at this like you know i put this up because i'm from michigan look
at the state of michigan you have some areas where they get a whole bunch back and other areas where they get very little and if you look in the state of california you see massive disparities like these you have regions where it's less than seven thousand dollars right next to regions where it's more than seventeen thousand dollars so there's massive variation in how much medicare reimbursement you get per enrollee and again you could ask is this because of sorting do the people who like to give a lot of government money move to one area or is it
because of some sort of peer effect people give more money because other people give more money and these are puzzles well let's see why we can't tell why there's an identification problem if we just look at the pictures so suppose you've got two types of people we've got a's and we've got b's all right now let's look at sorting so suppose we started out we have two populations one population has you know two a's two b's and then two more a's so there's four a's and two b's and the other has four b's and two
a's if sorting went on what would happen is these b's would feel unhappy and they would move down here and these a's would feel unhappy they'd move up there right and what we're going to get is all a's and all b's now think about what peer effects would look like with peer effects i've got four a's and two b's and these b's would switch and become a's and down here i've got four b's and two a's and these a's would switch and they'd become b's so what happens is in either case i'm going to get one group
of all a's and one group of all b's and i can't tell did this happen because they sorted or did this happen because it was peer effects if i just have a snapshot like if i just take out my little camera here right take a little picture i can't tell so how do we do it how do we make sense of it how can people like fowler and christakis argue that no this was really peer effects and how can people like bishop argue that this is sorting well in some sense we've already answered that we've answered
that because let's think about the process with sorting we start out like this and we can actually see these people move these b's move here and these a's move here so what bishop does in his book is he gives evidence of people moving into districts where people are like them politically so he literally finds evidence of democrats moving into predominantly democratic districts and republicans moving towards republican districts that people choose their districts based on the political ideology of other people in the district so when you look for a house you don't just care about how
many bedrooms and bathrooms it has you care about whether your neighbors are democrats or whether your neighbors are republicans right what fowler and christakis are trying to do is they say okay to find evidence of peer effects what we need to show is we need to show that these people right switched and became a's because most of their friends were a's now that can be difficult to do right in some cases it's easier to do than in other cases but to distinguish between sorting and peer effects we have to have that sort of micro level data
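The a's and b's example above can be sketched in a few lines of python to show why a snapshot cannot separate the two stories. This is a minimal sketch, not code from the lecture: the function names `sort_people` and `peer_effects` are made up for illustration, and it assumes each group starts with a clear majority type and that the two groups have different majorities.

```python
from collections import Counter

def majority(group):
    """Return the most common type in a group (assumes no tie)."""
    return Counter(group).most_common(1)[0][0]

def sort_people(group1, group2):
    """Sorting: nobody changes type, people move to the group
    whose majority matches their own type."""
    everyone = group1 + group2
    m1, m2 = majority(group1), majority(group2)
    return ([p for p in everyone if p == m1],
            [p for p in everyone if p == m2])

def peer_effects(group1, group2):
    """Peer effects: nobody moves, people switch type to match
    the majority of the group they are already in."""
    return ([majority(group1)] * len(group1),
            [majority(group2)] * len(group2))

# Starting populations from the lecture: four a's and two b's,
# and four b's and two a's.
start1 = list("AABBAA")
start2 = list("BBAABB")

sorted_snapshot = sort_people(start1, start2)
peer_snapshot = peer_effects(start1, start2)

print(sorted_snapshot)
print(peer_snapshot)
print(sorted_snapshot == peer_snapshot)
```

Both processes end with one group of all a's and one group of all b's, so the final picture is identical. Only a record of who moved or who switched, taken over time, would tell them apart.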
and it's got to be dynamic data right so bishop is seeing people move fowler and christakis are seeing people change their behavior the point is and this is why identification is so tricky right is if you just have the snapshot you can't tell you can't tell the difference now let's go back to one of the big reasons why you know i think this course is important that is to be a many model thinker right if you have lots of models multiple models at your disposal you're just going to better understand the world well we've learned two models
that look at the same phenomena that people you know you go to some place and the people all seem to be you know looking the same acting the same right they believe the same things what we've learned in this module is that there's two possible causes for that one cause could be that they sorted into it another cause could be that there were peer effects when i just see that phenomenon i can't tell there's an identification problem so if i want to sort out did this happen because of sorting or did this happen because of
peer effects i need dynamic data i need to sort of have more data over time so that was another reason we have models right to help inform data collection so if you want to answer this question so suppose you're concerned about health care costs and you see this massive variation in reimbursement per enrollee or you care about healthcare just in general you see this massive variation in how much time people get in hospice you might not know what's best well you want to figure out why is this happening what's causing this variation is it peer effects or
is it sorting well how do you do that how do you know well you need better data you need dynamic data to see are people moving or are people changing and then once you know what it is you can start affecting policies right in order to get better outcomes all right thank you hi in this next module what we're going to do is we're going to focus on a particular topic known as aggregation now aggregation is really an interesting thing to think about because if we think about just basic mathematics we learned early on that
like one plus one equals two right and we think we can just sort of add things up and the whole is sort of the sum of its parts when we start modeling more interesting phenomena whether it's the physical world the biological world or the social world we find that aggregation is actually really tricky and one of the reasons we model right is to get the logic correct and we find that the logic of aggregation is really sort of incredibly surprising and novel now we already saw that earlier in the previous lecture
when we talked about schelling's segregation model right remember people had these rules that they followed in order to decide where to live based on their tolerance of other people right people who look different than they did and what we found is that reasonably tolerant people sort of you know individuals following rules that were tolerant could lead to macro level segregation like we see in a city like new york or philadelphia or detroit so what we want to do in this next lecture is just construct some very simple models some toy models and when i say
toy models what i mean are models that have very few moving parts that help us try and understand some very basic logic about how the world works and we're going to use these toy models to understand the process of aggregation so it's going to be simple but then also sort of mind-boggling in a way okay so one of the core ideas in aggregation goes back to a famous paper written by the physicist philip anderson and anderson's a nobel prize winner in physics famous physicist from princeton and anderson wrote a paper called more is different and
in this paper what he says is look you can sort of take a reductionist approach you can pull everything back and look at something you know in great detail and say this is a salt crystal or this is a water molecule right or you know this is a neuron but there's something very different when you connect all those things together and so you can't do purely reductionist science and look at just individual parts and understand the whole so more is different and that's really going to be the focus of this module of lectures
is some of the ways in which more can be different so what did anderson mean exactly so the most famous example that people use is this this is a picture of a single water molecule right two hydrogens one oxygen we can understand all the properties of a single molecule but one water molecule can't be wet right wetness the fact that we can like put our hand through water and feel the slipperiness comes about because of the fact that those hydrogen oxygen bonds are fairly weak and so the bonds in our hands
are stronger so we can just push through it and feel that wetness so wetness is a property of a bunch of water molecules not of a single water molecule right but wetness is sort of child's play compared to something like cognition or personality so think of the amazing things our brain can do right but our brain consists of a bunch of little neurons right there's neurons and there's axons and there's dendrites and there's myelination all that sort of stuff right it's very complicated but if we break the brain down to its parts we're never going to
understand where cognition comes from where personality comes from or where consciousness comes from those are all what we're going to call emergent properties of the system now in these lectures we're not going to explain consciousness or cognition but we are going to sort of at least work through how is it that at the macro level at the emergent level we can get stuff that's really far more interesting and surprising than what we saw at the micro level right so how are we going to do it what's our plan how are we going
to proceed in thinking about aggregation well we're going to start out by thinking about aggregation of actions and i'm going to talk about something called the central limit theorem but we're going to talk about how your sort of actions add up and that will just get us thinking about this notion of aggregation in a simple way then we're going to look at a particular game called the game of life and we're going to look at a single rule this is just sort of one set of rules and just see how that rule aggregates just to
give us a sense of mystery and wonder about how amazing simple things can be when they add up right third thing we're going to do is take a whole family of rules we'll look at a class of models called one-dimensional cellular automata models these one-dimensional models are extremely simple you almost can't imagine a simpler model and yet we're going to find that these very simple models can do anything literally anything so we talked about those classes of outcomes they can do anything and then last just to pull this into social science a little bit we're
going to talk about aggregation of preferences so think about aggregation you can think about adding up like one plus one equals two you know two plus four equals six that sort of stuff we're adding single numbers but preferences aren't single numbers they're sort of you know i like bananas more than apples or i like you know fords more than bmws or something like that right and so different people have different preferences and we can ask how do you add up preferences they might say why would we want to add up preferences well
we want to add up preferences because if we have a small group if we have an organization we have an entire society oftentimes we have to make collective choices and so those collective choices have to depend on our aggregate preferences so what does everybody want so the way to do that you've got to add up here's my preferences here's someone else's what do we get right okay so what i want to do in this sort of brief opening lecture just in the next couple of minutes is unpack a little bit more about what we're
going to do when we talk about aggregation so the first thing in terms of aggregation of actions right when we talked about why you model right a bunch of reasons one of them is to sort of predict points and one is to understand data so when we talk about aggregation of actions like you know someone's decision to go to a store someone's decision to get on a plane right we can think of you know the economy think of the u.s economy there's 300 million people each day people get up and make choices and what we
see at the aggregate level is sort of the average of those choices and what we can show with a very simple model is sort of why oftentimes those aggregate choices have a lot of structure to them a lot of predictability and we're going to get things that look like this picture like this is called a normal distribution or a bell curve and this bell curve implies with it a certain amount of predictability and understandability so very simple model let us explain a whole bunch of things that happen in the real world all right next thing
we're going to want to do is use models to understand patterns so a lot of what we see isn't just points but it's distributions of things it's things flowing now this is true in the physical world the biological world it's true inside our heads with neurons it's also true sort of in the social world so we're going to construct a toy model a fun model called the game of life and this game of life is going to be very simple rules and we're going to start out with patterns so here's a pattern right here right
and time moves in this direction right and what we can see is as time moves this weird configuration keeps changing its shape and then eventually down here notice that it's in the exact same configuration that it started out with but it sort of moved one down to the right now this is what we're going to call a glider and this is going to be a recurrent pattern in this model and we're going to see how this thing which looks like it's living hence the game of life really comes from just simple rules
from some one simple rule building on itself right then okay once we've done that sort of simple thing we're going to go to an even simpler model called one-dimensional cellular automata models and in these models we're going to show how a very very simple model where it works as follows you can imagine a long string of lights and each light can be on or off and each light then has a rule whether or not to be on or off based on just two things whether it's on or off and what its two neighbors are doing
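The string of lights just described can be sketched in python. This is a minimal sketch under stated assumptions rather than the lecture's own code: the rule is encoded as a number from 0 to 255 using the standard wolfram numbering for elementary cellular automata, the two ends of the string wrap around to each other, and the names `step` and `run` are made up for illustration.

```python
def step(cells, rule):
    """Apply one update: every light looks at (left, itself, right)
    and reads its next state from the matching bit of the rule number."""
    n = len(cells)
    new_cells = []
    for i in range(n):
        left = cells[(i - 1) % n]      # assumption: ends wrap around
        me = cells[i]
        right = cells[(i + 1) % n]
        index = 4 * left + 2 * me + right   # one of 8 neighborhoods
        new_cells.append((rule >> index) & 1)
    return new_cells

def run(cells, rule, steps):
    """Return the list of configurations, starting state included."""
    history = [cells]
    for _ in range(steps):
        cells = step(cells, rule)
        history.append(cells)
    return history

# One light on in the middle of a string of 31 lights.
start = [0] * 31
start[15] = 1

# Every light uses the same rule; rule 30 is one of the 256 possibilities.
for row in run(start, rule=30, steps=15):
    print("".join("#" if c else "." for c in row))
```

Changing the `rule` argument swaps in a different one of the 256 possible rules while every light still follows the same rule as its neighbors.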
so it could be a rule that says if i'm on and my two neighbors are off i'm gonna switch to off so each light is gonna use the same rule and we're gonna see what sort of behavior we can get what we're going to find is we can get everything what do i mean by everything well remember we talked about what could the world do well we could see equilibria right we could see patterns we could see complete randomness and chaos or we could see complexity what we're going to show is some very very simple rules can
generate all four of those and this is amazing right i mean it's sort of if you're in the mood to be amazed this will be an amazing result so what do i mean exactly i mean look look at this incredibly complex pattern now you might look at something like this and say wow to produce something that complex there must be some really interesting sophisticated complicated underlying dynamics well the answer is going to be no you can compute you can get things this interesting right with very very simple rules all right then the last thing we're
going to do the last lecture in this module is going to be about aggregating preferences so what do i mean by preferences well let's suppose there's apples bananas and coconuts and this might be me right here so it's a little s here and it might be that i like apples better than bananas better than coconuts so these little greater than signs mean which one i like more than the other so apples are greater than bananas which are greater than coconuts for me now someone else like my son cooper he might prefer bananas right to apples
and apples to coconuts so different people can have different preferences and what we want to talk about is how those aggregate now what we'll see is aggregation of preferences introduces all sorts of interesting paradoxes and creates all sorts of problems which is why at least one reason why politics is so interesting because the aggregation of just these simple preferences creates difficulties that don't arise when we think about just adding up numbers okay so big picture here a lot of what interests me as a social scientist is groups of people aggregations of people now how do
we understand that how do we understand how societies work economies work political systems work organizations work well you've got to do two things first you've got to understand how the parts work and then you've got to understand how you add them up so we're going to sort of do that in the opposite order we're first going to talk about some of the complicatedness of adding things up that's this module and then the next module we'll talk about the parts like all these individual people in here right so to understand the world we're going to
have a twofold approach first understand sort of how things can add up second thing understand the parts that do add up and then the models that follow in this course right what we'll do is sort of put all those things together to make sense of things okay so that's the outline for this module we're gonna you know play with some very very simple toy models that help us understand some of the mysterious phenomena we see in the world and also just some of this sort of i think it's got some amazing results in here and
how simple things add up to create very complex wholes thank you hi in this lecture we're going to talk about a really simple model of aggregation so here's the thing i want to model i want to model a situation where we've got a group of people it could be a hundred could be a thousand and each one is independently going to make a decision to do something it could be to you know go to the gym could be to go to the beach it could be to go to the grocery store what i want to try and
understand is if you've got a whole bunch of people each one's making these independent decisions what's the number of people that shows up now to characterize that i'm going to use an idea called a probability distribution so to make this simple let's suppose that there's um a small group of people like my family which has four people in it and i want to know what's the distribution of the number of those four people who go for a walk on a given saturday well if i think about what the numbers could be there could be zero people that go
it could be one there could be two there could be three or it could be that all four of us decide to go for the walk right the dog would prefer that all four of us went but you know there's going to be some number that goes so i could keep track of data i could you know chart this on like my wall somewhere that'd be good right and you can ask what's the likelihood that nobody went for a walk and maybe that's ten percent now what's the likelihood that one person went
for a walk well that might be 15 percent and what about two people that might be 40 percent and what about three people that might also be 15 percent and then what's the likelihood that four of us went for a walk that might be let's say 20 percent now the thing to know about a probability distribution is each one of these probabilities is less than one right and if we sum them up we get 10 plus 15 is 25 plus 40 is 65 plus 15 is 80 plus 20 is 100 so we get a total of 100 percent so what our probability distribution tells us
is what are the different things that could happen 0 1 2 3 and 4 and then it tells us the likelihood of each of those things okay so here's sort of the huge result that we're going to leverage to understand how things add up there's a theorem called the central limit theorem and what the central limit theorem tells us is that if i add up a whole bunch of individual independent events so what does independent mean it means my decision to go to the beach is independent of your decision to go to the beach which
is independent of your cousin mary's decision to go to the beach so and by independent i mean not influenced so i don't care whether you're going to the beach or not i'm going to make my decision on my own it's completely independent of what you decide to do or your cousin mary so what the central limit theorem tells us is if a whole bunch of people make a whole bunch of those independent decisions the distribution that we get has this nice bell-shaped curve and this bell-shaped curve means that like the most likely outcome is the one
right in the middle so there's a lot of structure to what happens and that means we can predict a lot of things and we can tell a lot about what's going on in the world and that's what we'll learn in this lecture it's gonna be a lot of fun to get an understanding of where these distributions come from let's start really simple so suppose i flip a coin twice and i want to know what are the odds of getting a head what's the probability distribution over heads well what could i get i could get tails tails
that would be zero heads i could get tails heads or heads tails and both of these would be one head or i could get heads heads and that would be two heads so what's the probability of each of these the probability of getting tails tails is just a fourth the probability of getting one head is a half and the probability of getting two heads is a fourth so i'm going to get a probability distribution if i draw it out like this zero one two there's a one fourth chance of that a one half chance of that and a one fourth
chance of that you notice it sort of looks like a little bell curve okay let's suppose i flip it four times well this gets harder like i think okay what are the odds of getting no heads i could get tails tails tails tails well how do i figure out the probability of that well it's one-half times one-half times one-half times one-half four one-halves so that's one over two times two times two times two so that's one-sixteenth what are the odds of getting one head well i could get the head first and then three tails i could get
it second i could get it third and it could come last so there's four places that it could show up so that means there's a four sixteenths chance well i could do all sorts of math again i could ask what are the odds of getting two heads and i'd actually get six sixteenths and three heads well that's the same as getting one head really right because tails and heads are interchangeable so what i get is i get this again if i draw this distribution out i'd get a peak at two heads right i'm gonna get a
nice bell curve right so i'm gonna get this nice thing where there's you know very little chance of getting no heads not that much chance of getting four heads but the most likely thing is getting two heads so i can count all this stuff and it's fine but here's the problem remember we talked in this class about big data there's lots of data out there we want to try and understand it well often times we're gonna have more than two or four we're gonna have n we have some huge number so if we're talking
about new york city that could be 10 million people if we're talking about ann arbor where i live that's still like a hundred thousand people so i don't want to be sitting there writing tails tails tails tails tails 100 000 times i want to have a model that'll help me explain it so what you can do is you can think okay first off if there's n things the mean the expected number should be n over two right should be half of n but what we'd like to do is understand sort of what that distribution looks
like well what we know from statistics is that distribution is actually going to be a nice bell curve and the mean right right in the middle of this thing is going to be n over 2 and then it's going to sort of flow out nice and symmetrically from each side now there's a fancy equation a formula right that tells you what this line looks like we're not going to get into that but if you take a statistics class which i'd encourage you to do it's a lot of fun you can learn exactly what this formula
is and how it works okay we just want to use it as a model for understanding how things aggregate so we're going to sort of take some leaps ahead in statistics here's the trick though we got to be a little careful flipping a coin it's always equally likely right it's either heads or tails each one's 50 50 but if i'm worried about people going to the beach right or people going to the supermarket or people showing up for their flight that's not a 50 50 proposition right so maybe 90 percent of people may show
up for their flight but maybe only 10 percent of people or 15 of people go out to the beach so i'd like to change that one half to something else well i can introduce something called the binomial distribution where instead of having one half there's just some probability p of doing the thing so let's suppose going to the beach happens 15 percent of the time well then if i have a thousand people and p equals 15 percent then p times n is 150. so i'd expect of 150 people show up so that makes sense but
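To make the beach example concrete, here is a minimal simulation sketch. The function name `simulate_turnout`, the number of simulated days, and the random seed are assumptions chosen for illustration; only n = 1000 and p = 0.15 come from the lecture.

```python
import random

def simulate_turnout(n, p, trials, seed=0):
    """Simulate `trials` days; each day n people independently go with probability p."""
    rng = random.Random(seed)
    return [sum(rng.random() < p for _ in range(n)) for _ in range(trials)]

n, p = 1000, 0.15            # numbers from the beach example
expected = p * n             # the mean of the binomial distribution, p times n
counts = simulate_turnout(n, p, trials=2000)   # 2000 days is an arbitrary choice
average = sum(counts) / len(counts)

print(f"expected {expected:.0f}, simulated average {average:.1f}")
print(f"smallest day {min(counts)}, largest day {max(counts)}")
```

The simulated average should land close to p times n, while the smallest and largest days give a rough feel for how far individual days can stray from that average.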
then i can ask well what's the distribution though right i mean 150 is the average but you know i could have 200 i could have 74 well again what the central limit theorem tells us is that we're going to get a nice bell curve right so we got this nice shape here but now with a mean here of just p times n now this holds provided n is big enough right but if i get a pretty large n you're going to get this nice bell curve and the mean is going to be right at
p times n okay there's more though here's where it gets a little bit complicated but also interesting there's something called the standard deviation and this is this thing called sigma now when i draw a normal curve there's going to be a mean that's this point right here at the center and then there's going to be a standard deviation which basically tells us how far spread out that curve is and what i mean by that is how far spread out the different outcomes are so it'll turn out there's this
nice structure to any normal distribution if you tell me the mean and then you tell me the standard deviation it's always going to be the case that 68 percent of all outcomes will be between minus one and plus one standard deviation so if it's got a big standard deviation that means that that range could be really wide if it's got a small standard deviation that means that range will be really tight but if you tell me the mean and tell me the standard deviation it's always going to be the case that 68 percent of the time you're in between
minus one and plus one standard deviation now in fact since there's a fixed percentage for one standard deviation there's also going to be a fixed percentage for two and for three and for four right so there's going to be a 95 percent chance i'll be within two standard deviations so wait why do we care about this why do you care about this level of detail here's why now i've got this model that says if i add up a bunch of independent events here's what the mean is right now in a second i'm going to show you a formula for the standard
deviation so i'm going to tell you what sigma is here well if you know the mean and you know sigma then i can give you a range and i can tell you you know that 95 percent of the time i'm going to be between minus two sigma and plus two sigma so if i said the mean number of people that show up is a hundred and that's the mean right and the standard deviation is only two well then you know ninety-five percent of the time you're going to be between 96 and 104 so you know
okay i should prepare for pretty much exactly 100 people if i told you the standard deviation was 15 then you'd know it could be anywhere between 70 and 130 so that's what we want to try and use this model to explain like sort of how wide a range of outcomes we're likely to see in any particular setting so let's go back to our simple binomial distribution where the probability was a half the mean remember is just n over 2 well you can do a little bit of math and show that the standard deviation is the
square root of n divided by 2 so let's suppose i have n equals 100 so if n equals 100 that tells me the mean is going to be 50 so if i flip a coin 100 times guess what the average is 50 no surprise but the standard deviation is the square root of n divided by 2 so what's the square root of 100 that's 10 so this is 10 over 2 so that's just 5 so what that tells me is if i think of my normal distribution right if i draw this thing out i've got a mean of
50 and then i've got a standard deviation of 5 so that means between 45 and 55 we get 68 percent of all outcomes so if you want you can do this at home it'll take a while flip a coin 100 times count how many heads you get flip it again count how many heads you get do that a whole bunch of times you'll find that roughly 68 percent of the time you get between 45 heads and 55 heads so what this model gives us is it gives us a sense of how wide a range of outcomes we'll get so we know that most
of the time 68 percent of the time we'll be between 45 and 55 right so our mean was 50 one standard deviation was 45 and 55 that means two standard deviations is 40 and 60 so what that tells us is 95 percent of the time you're going to be between 40 and 60 heads and more than 99 percent of the time you're going to be between 35 and 65 so basically you're almost never going to throw fewer than 35 heads and almost never throw more than 65 heads and so this is sort of what the power of the central limit theorem is
right it gives us a sense of not only the average but also what the spread will be okay remember this is a simple case right this is the p equals one half case and what we'd like is we want it for the more general case where the probability of something happening can be anything right this is this p times n thing well it turns out here we're okay because the standard deviation is just the square root of p times 1 minus p times n the square root of the whole thing so in the case where p equals one-half right then
we just have the square root of one-half times one-half times n but notice i've got a one half squared here inside so i can just pull that outside so it's just one half times the square root of n so that's where that square root of n over two came from so now for the binomial distribution i've got this clean formula as well and we can use that to model and understand stuff that's a little bit more interesting than just flipping a coin okay let's do a real example let's have some fun so most of us
have been bumped off a plane before right so you show up at the airport and there's like you know too many people showed up for the plane and you think why do they do this well the reason that they have to bump people is they oversell and the reason they oversell tickets is because not everybody shows up so if you're running an airline and you've got 400 seats and you know people only show up you know 90 percent of the time you want to sell more than those 400 seats right so that your plane's pretty
much full so let's do an example let's suppose to make it simple that our plane doesn't have 400 it's got 380 seats so let's suppose we've got a boeing 747 with 380 seats let's suppose that 90 percent of the time people show up so we've gathered you know we run an airline we've got lots of data we pretty much know 90 percent of the time people show up and that it's independent so one person's decision to show up doesn't have anything to do with anybody else's now that might not be true right because if it's
snowy if i'm late you're likely to be late but let's just suppose that these things are independent and let's suppose that we sell 400 tickets now what we want is to get some understanding like okay what does that mean what's the likelihood if we sell 400 that we're going to have more than 380 people show up here's where the model can help us because it'll tell us what the mean is it'll also tell us what the standard deviation is so the mean right if i sell 400 tickets and on average 90 percent of people show
up that means on average 360 people should show up that's less than 380 seats so i should be fine but what i care about right what i care about is more than 380 people show up because if more than 380 show up guess what they're gonna be mad right because they're gonna be like you know i paid for this trip to go to florida i want to go to florida i don't want to be bumped so the 360 doesn't tell us enough we want to know something about the distribution okay well look we've got a formula
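The binomial formulas from earlier in the lecture (mean p times n, standard deviation the square root of p times 1 minus p times n) can be applied to these overbooking numbers in a few lines. This is only a sketch: the helper names `binomial_mean_sd` and `prob_more_than` are made up for illustration, and the exact tail sum is an extra check that the lecture itself does not perform.

```python
import math

def binomial_mean_sd(n, p):
    """Mean and standard deviation of the number of successes in n independent trials."""
    return n * p, math.sqrt(n * p * (1 - p))

def prob_more_than(n, p, k):
    """Exact binomial probability that more than k of the n trials succeed."""
    return sum(math.comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1, n + 1))

tickets, show_rate, seats = 400, 0.9, 380   # numbers from the lecture
mean, sd = binomial_mean_sd(tickets, show_rate)
overbooked = prob_more_than(tickets, show_rate, seats)

print(f"mean {mean:.0f}, standard deviation {sd:.1f}")
print(f"seats are {(seats - mean) / sd:.2f} standard deviations above the mean")
print(f"chance more than {seats} people show up: {overbooked:.5f}")
```

Measuring the seat count in standard deviations above the mean is the same move the lecture makes with the 68 and 95 percent ranges, just expressed as a single number.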
right remember so n was 400 and p was 0.9 so p times n is 360 that's our mean now the standard deviation we can solve for pretty easily that's just the square root of p which is 0.9 times 1 minus p which is 0.1 times n which is 400 so if we multiply that out that's 0.9 times 0.1 times 400 and 0.1 times 400 is 40 times 0.9 is 36 so that's just the square root of 36 which is 6 so 6 is our standard deviation so now i know i've got a bell curve with a mean
of 360 and a standard deviation of 6 okay well that's useful that can help us because let's go back and let's look our mean is 360 our standard deviation is 6 so that means 68 percent of the time we're going to be between 354 and 366 that's great it means that 95 percent of the time we'll be between 348 and 372 also great it means about 99.7 percent of the time we'll be between 342 and 378 well how many seats did we have we had 380 seats so this means that 99.7 percent actually more than that right more
than 99.7 percent of the time we won't have more people than seats so here's the central limit theorem so let's say it formally central limit theorem basically says the following i've got a whole bunch of random variables so those could be decisions to show up to a flight or not so in those cases the random variables are just ones or zeros or they could be you know the weight of your bags each person's weight of their bags is some random variable as long as those things are independent so that means that each person's decision doesn't depend on somebody
else's or how much stuff i jam in my bag doesn't affect how much you jam in your bag and that those things have finite variance what does that mean loosely that means that they're bounded so you know we can't have super huge values like so my bag couldn't weigh billions and billions of pounds as long as they're sort of you know the possible range of values that each one can take is bounded in some way or doesn't with some high probability take huge huge values then when you add those things up when you sum them
up you're going to get a normal distribution right which means a bell curve which means we can predict stuff right we can use that model to make sense of how the world works now let's step back for just a second think about like why this is so cool suppose it weren't true right here's a little thought experiment suppose it were the case that when i added up a bunch of independent events most of the time i got something nice right but then there was some spiky probability of some huge event over here what would this
mean well this would mean like sometimes you'd go to the grocery store and there'd be like a thousand people there or sometimes you'd be like i'm just gonna run to the bathroom there'll be 300 people in line right a lot of the predictability of the world a lot of the predictability of just sort of daily comings and goings stems from the fact that this can't happen and that we get these nice bell curves because if individual people individual firms individual groups of people make decisions that don't depend on what other people decide
if they're sort of independent decisions then what you're going to get is you're going to get sort of nice regular stuff according to a bell curve yeah sure there'll be traffic jams sure there'll be a lot of people at the mall there'll be days when you get a lot going on and yes there'll be days when not much is going on but most of the time you're going to get things in that middle region which is going to be predictable and understandable now is everything normally distributed no it's not so what about stock returns so
if you look at stock returns you'll actually see that there's far too many days where really nothing happens and there's far too many days where there's huge gains and far too many days where there's huge losses and what's going on there is that the actions are no longer independent so for example if prices are going up a lot of people may buy and that's going to cause prices to go up even further and if prices start to fall people may sell and that can cause prices to fall even further so when events fail to be
independent fail to satisfy the independence assumption then we can get more big events than we'd expect and you know more small events than we'd expect so let's put a bow on this what have we got so we've used the central limit theorem as a model to explain how if we add up a whole bunch of independent events then what we get is we get a nice normal distribution right and we can understand the mean we can understand the standard deviation we can use that to predict how likely things are
to occur right we also learned that like it's that independence that gives us that normality right without independence we could get really big events really small events we could get all sorts of strange stuff happening so where we're going to go next there's a brief lecture on something called six sigma that pushes this idea of sort of the predictability of a system a little bit further than we had before but then after that we're going to start you know we're going to take the gloves off we're going to start having systems where
there's interdependent actions and we have those interdependent actions we're no longer going to get these sort of nice bell curves we're going to get all sorts of really interesting strange stuff it's going to be a lot of fun all right thank you hi in this lecture we're going to do a little sort of bonus i want to talk about the normal distribution again and i want to talk about it in the context of a business practice that has to do with quality control that's known as six sigma six sigma was a process developed by motorola
you know quite a while ago a couple of decades ago and it had to do with sort of making production processes more predictable so they had fewer quality errors so to understand how it works let's go back and remind ourselves of what sigma is and then we can understand what six sigma is so number one we had a normal distribution right we had a mean and then we have these standard deviations these sigmas one standard deviation two standard deviations and so on right and then all we had is that 68 percent of the time right that outcome would
lie within one standard deviation and 95 percent of the time it would lie within two standard deviations so how often would we lie within six standard deviations if i went out here way out here to six standard deviations because that's even further how often would i be inside that well the answer is the only time i would fall outside of it would be 3.4 in a million okay so that means that there's almost no way that i'm going to be way over here outside of six you know six sigma too
big or six sigma too small and so that's going to be the core idea let me explain that in the context of an example and then take it to production how it's used in production so here's an example let's go back to the grocery store so suppose i'm in a grocery store and i sell bananas and on average i sell 500 bananas a day you know i've kept track of my data it's a normal distribution and the standard deviation is 10 so what i want to be the case is if i
have any sort of you know day within six sigma i'm not going to run out of bananas well this is easy to solve for right because sigma is just equal to 10 right so that means that six sigma is going to be 60 so if i want to be okay for any event within six sigma all i need to do right is have 560 bananas on hand and then even if i get a four sigma event a five sigma event a 5.8 sigma event i'm going
to be fine i'm not going to run out of bananas so that's the idea right you want it to be that even if you get a six sigma event things are going to be okay okay so let's see how this works for production so suppose i'm making some metal part and this metal part has to be between 500 and 560 millimeters right so this is the range anything in this range is okay but if i'm outside this range then the part's not going to work so i could be making phones i could be making car
doors whatever now suppose it's the case that you know what causes the door to be a little thicker or a little thinner than we want is just a bunch of random things being added up so i've got a normal distribution well i should be able to make my production process so i get the mean right in the center of that right so i've got 530 which is right in the center but now i want it to be the case that if i have a six sigma event i'm still going to be okay well this isn't very
hard to figure out right so we can say okay here's my distribution 530 is the mean and i'm going to have a bell curve it's not a very good bell curve but i want it to be the case that anything within six sigmas is okay so 500 to 560 has got to be my six sigma range so this is going to be plus six and this is going to be minus 6 i should put little sigmas here okay so if 6 sigma is 30 above the mean right this is 560 minus 530 equals
30 that means i just want six sigma to equal 30 so if six sigma equals 30 that means sigma equals five so what does that mean that means if i'm running this company if i'm sort of making these metal parts i want it to be the case that my standard deviation when i you know keep track of the standard deviation of my parts i want to get that all the way down to five and if i get that down to five then if i have any event less than six sigma the part's still going to
work now how do i get it down to 5 that's not easy right you've got to do you know continuous quality improvement so the real management practice was not so much just computing standard deviations and figuring out what the six sigma range is it was doing all that really hard work that makes it so that sigma falls down to five so it could be initially your sigma might have been 30 or 20 or something like that the idea through you know continuous improvement is you drive your sigma down so that sigma gets small enough that even if
something really bad happens the process still works and the part still functions and you don't have to do some sort of massive recall okay so that's six sigma thinking what six sigma basically tells us is that we can use this idea right this model of sort of normal distributions with standard deviations to inform how we you know run our production processes and we can figure out like you know what we're just making too many mistakes and if we make mistakes at this level we're constantly gonna have parts not work whereas if we can reduce our
variation then the process is almost always going to work right our parts will fit in whatever part they've got to fit into all right so that's just another example of how we can use this aggregation thing right these techniques these tools in ways that we might never have expected when we first came up with them okay thank you hi in this lecture we're going to talk about something called the game of life now this is a very simple model of aggregation and before i turn to the game of life
i want to preface this lecture a little bit by placing it in context so remember why are we taking this course well one to be a more intelligent citizen of the world to just understand what's going on around us two to be clearer and better thinkers three to use and understand data and four to better you know design strategize and decide so what are we doing here in the game of life the game of life is a very simple model that shows how things aggregate it gives us a lot of sort of surprising
conclusions now it's a toy model it's very simple it's not really about anything it's not about climate change it's not about the financial system it's not about eradicating poverty it's a model that sort of shows us how complicated aggregation can be so the way you want to think of this model is the way if you learn piano that you think about sort of learning your scales or something like that or if you play basketball like i do you know practicing your dribbling this is a model that helps us practice our thinking to
learn the subtleties of aggregation what's going to be amazing about the game of life as well as the one-dimensional cellular automata models that we study next is we're going to see how really complicated the process of aggregation is and so when we then go out and look at the world which involves lots of aggregation we'll have some deeper appreciation for why is it that it's so hard to infer by looking at the macro level what's going on at the micro level right and that's one of the things we saw in schelling's model and now we're
going to see it sort of in a more extreme form in the game of life okay the game of life was developed by a mathematician actually not that long ago this was by john conway he's a cambridge mathematician he's a brilliant mathematician his work has been in group theory and this was just a game he came up with using just a go board which is a big you know rectangular grid and has little white and black stones that you place on the board so the game of life works a lot like schelling's model each cell
right like this cell right here has eight neighbors and cells can be either alive which we'll call dark or dead or alternatively on or off and so on is going to be dark off will be light now the rules of the game of life are fairly straightforward if you're currently off you can only come on if exactly three of your neighbors are on so you need exactly three of the people around you to be on so this cell that's currently off wouldn't come to life because it only has two neighbors on if you're currently
on if there's fewer than two neighbors on so only zero or one you die of boredom it's just nothing going on you just turn off if you have more than three neighbors on you suffocate there's just too many people around and they're using too many resources you die off but if there's two or three neighbors that are alive then you can stay alive so let's formalize that the rules are quite simple right cells are either on or off if you're currently off you turn on if exactly three neighbors are on that's the rule and
if you're currently on you can stay on if you've got two or three neighbors on okay so if you're off it requires three if you're on two or three okay all right so if you look at this particular cell here right x in the center what you get is it has three neighbors one two three that are on so that means the next period when we look at it it's going to turn on right so we looked at that cell the next time we'll assume these other ones also stayed on it's going to turn on all right now if
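The rules just stated (an off cell turns on with exactly three neighbors on, an on cell stays on with two or three) translate almost word for word into code. The sketch below is one possible way to write them, not the NetLogo implementation used in the lecture; the function names and the choice to store the board as a set of (x, y) pairs on an unbounded grid are assumptions made for illustration.

```python
def next_state(alive, live_neighbors):
    """Apply the rule to one cell given how many of its eight neighbors are on."""
    if alive:
        return live_neighbors in (2, 3)   # stays on with two or three, otherwise turns off
    return live_neighbors == 3            # an off cell turns on only with exactly three

def count_live_neighbors(live_cells, cell):
    """Count the on cells among the eight cells surrounding `cell`."""
    x, y = cell
    return sum((x + dx, y + dy) in live_cells
               for dx in (-1, 0, 1) for dy in (-1, 0, 1)
               if (dx, dy) != (0, 0))

def step(live_cells):
    """Advance the whole board one period; `live_cells` is a set of (x, y) pairs."""
    candidates = set(live_cells)
    for (x, y) in live_cells:
        candidates.update((x + dx, y + dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1))
    return {c for c in candidates
            if next_state(c in live_cells, count_live_neighbors(live_cells, c))}

# An off cell at (1, 1) whose three neighbors (0, 0), (1, 0), (2, 0) are on
# turns on in the next period.
board = {(0, 0), (1, 0), (2, 0)}
print((1, 1) in step(board))   # True
```

Only the on cells and their immediate neighbors are examined in `step`, since a cell with no on neighbors can never turn on.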
you look at the same cell in this picture now it has one two three four neighbors that are alive and so what's gonna happen is it's gonna turn off okay so now what we can do in the game of life is not look at just individual cells we can look at entire configurations of cells so now here's a starting pattern suppose i seed the world with these two cells if i look at the one on the left it has no neighbors on and if we look at the one on the right it has no neighbors on so
what's going to happen is if i seed the world to look like this it's just going to end up dead right nothing's going to happen okay now suppose i seed the world with three in a row well let's look first at this person on the left it has one neighbor on the person in the center has two neighbors on and the person on the right has one neighbor on so what that means is the cell on the left is going to die off it's going to turn off and the one on
the right is going to turn off but the one in the center is going to stay alive but now there's two other cells we got to worry about right look at this one right here this one on the top it has three neighbors that are alive one two three as does this one on the bottom one two three so those two right are going to come to life and so if we let this go what we're going to see in the next period is the original one in the center stayed on right the one above
came on and the one below came on so we now have three in a row that look just like this now let's let it go one more period what's going to happen well again as before this one in the center is gonna stay alive the one on the top and the bottom will die off because they have one live neighbor but now the ones on the left and the right right this one right here and this one right here they'll come to life because they each have three live neighbors okay all right so what's gonna
happen is it's gonna go like this well now let's let time run once we're like this it'll go like that and once we're like that it'll go like this and so what we get is we get a blinker right so the game of life is interesting because we started out with these simple rules right if you're on and two or three of your neighbors are on you stay on and if you're off you only come to life if exactly three neighbors are on and what we see is those micro level rules can create
macro level patterns right like blinkers okay let's start with something else we'll start with something a little bit more complicated let's look at this one if we start here this cell has two on this cell has two on this cell has two on this cell has one on now if we look around we can see this cell has three right and this cell has three and there's no others that have three so if we look at what happens the next period we're going to get a picture that looks like that so the
game of life not only can produce die-offs and blinkers it can also have systems that sort of grow so one of the things we want to do is we want to sort of try to understand okay what can the game of life produce well let's look at some classic examples and we'll do this using a program called netlogo we're going to look at three things we first have a beacon which is two squares of size four and then we look at something i call the figure eight which is two squares of
size nine and then look at something called the f-pentomino which is just a line of three with one on the right and then one below it on the left okay so we can look at these three configurations but instead of doing it by hand because that takes a long time we're going to use that same netlogo program that we used for schelling's model okay so first we're going to do the beacon and what we do there is remember we're going to draw these cells we're going to do one two one
two three four and then next to it we've got one two three four and now we can think okay let's press this go once button and what happens is it goes like that and then it goes like that now if you look at the individual rules and figure out what each cell would do you'd figure out this is what's going to happen we'll let this go forever let me slow this down a little bit right and what we see is we get this nice little beacon flashing back and forth okay now this is a lot
like the little blinker we had before it was going you know up and down sideways vertical horizontal vertical horizontal and you know that's a nice little picture so let's stop it and now let's make this a little more interesting and let's draw some more cells and let's make this thing three by three cells here we go so now these things are three by three blocks and we'll see what happens here okay now again each cell is just following those rules from the game of life so let's let
it go once twice three times four times five times six times seven times eight times okay that's unbelievable right this looks like an eight on its side so if you put it at an angle it looks like a figure eight and if you watch this thing let's do it again one two three four five six seven eight so what's really cool with the game of life is that very simple structures can create these elaborate patterns and again each cell is only following its own simple rule so what we learned from this is that aggregation right
simple things following simple rules can aggregate to form really complex patterns okay so that's the figure eight now let's do that thing i call the f-pentomino remember so that was three things in a row like this and then one in the center off to the left and one off to the right now let's just slowly go through this let's go once twice three four five six seven eight nine it seems to be taking on a life of its own let's let it go forever and what you see
is it's producing things that are like little gliders right so it's producing things that move out through space right so it's almost like it's alive now you can start thinking about some really interesting things like think about the human brain right the human brain has these neurons that follow simple rules and by following these simple rules these things are connected and collectively they can create these really novel patterns that produce things like memory and thought and cognition and personality and all that sort of stuff well the game of life obviously doesn't
explain cognition or anything of the sort but what it does do is it shows how simple things following simple rules can create incredibly elaborate patterns remember because we started out let's just do it one more time we started out with an incredibly simple thing right we had one two three in a row one up on top one to the left and then as we watched this thing unfold right as i click this each time what we see is these incredibly elaborate patterns and i'll just sort of let it go on its own again
right you see just really interesting things including these things that glide across the space that are known as gliders here we've got a picture of a particular state this is a simple glider so here it is at time zero and then if we go through the rules and think okay what happens in the next period what you can do is you can look at sort of each individual cell and you can say okay well what's going to happen well if you look at this cell right here right which is right here in this thing it
has one two three neighbors so it comes to life next time so if we follow that through here's what happens here's where it's at time t equals zero and then at time t equals one it looks like this at time t equals two it looks like this because this cell which came to life is now going to be dead because it only has one live neighbor right and then if you follow it around to t equals three and t equals four you find at t equals four it looks exactly like it did at t equals
zero except it's moved one cell down and to the right so if i start with this configuration it's just going to glide across the space so this again is an example of what we call an emergent or self-organized pattern because this thing looks like it's moving if you watch the movie of this particular starting point you'll see something just glides across the space so you might think this thing is actually flying but it's not what's going on is each one of those individual cells is following a particular rule okay so here's what's really interesting
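The claim that the glider reappears after four periods, moved one cell down and to the right, can be checked directly. This is a hedged sketch: the compact `step` function and the particular starting coordinates are assumptions for illustration (a standard glider orientation, not necessarily the one on the lecture slide), with y increasing downward.

```python
from collections import Counter

def step(live_cells):
    """One period of the game of life on an unbounded grid of (x, y) cells."""
    neighbor_counts = Counter((x + dx, y + dy)
                              for (x, y) in live_cells
                              for dx in (-1, 0, 1) for dy in (-1, 0, 1)
                              if (dx, dy) != (0, 0))
    # A cell is on next period with exactly three on neighbors,
    # or with two on neighbors if it is already on.
    return {cell for cell, n in neighbor_counts.items()
            if n == 3 or (n == 2 and cell in live_cells)}

# A standard glider; y grows downward, so (x + 1, y + 1) is down and to the right.
glider = {(1, 0), (2, 1), (0, 2), (1, 2), (2, 2)}

state = glider
for _ in range(4):
    state = step(state)

shifted = {(x + 1, y + 1) for (x, y) in glider}
print(state == shifted)   # True
```

Nothing in `step` knows about gliders or motion; the apparent movement comes entirely from each cell applying the same local rule.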
then about the game of life remember one of the reasons why we construct models is to understand the class of outcome what do we get right do we get fixed points right does the whole system just sort of go to one thing does it alternate does it blink right remember we saw both those things in the game of life is it completely random right which we see down here or do we get these complex patterns now the interesting thing it's been shown the game of life can give you all four of these now we saw
three of them right we saw the systems that died off the systems that alternated and the systems that were complex you can also the thing can just be almost completely random so you can use it to generate random numbers so what's interesting is very simple rules can aggregate to form all sorts of interesting macro level phenomena now one of the things that people gonna ask is well what what what's the limit of what you can get the answer is almost nothing anything you can do with the computer you could actually do with the game of
life which is sort of amazing here's a particular configuration if you want to you can plug it into netlogo it takes a little bit of work you've got to draw these cells this is a glider gun so what this thing will do is it'll pulse and it'll send out gliders off in this lower right direction so what you'll get is this thing will just pulse almost like a heartbeat sending out gliders so it's really interesting because you've got again each cell just following those same rules those game of life rules and you're
getting this really elaborate pattern okay so what do we learn from the game of life a bunch of things one is we get what we call self-organization so these patterns appear without a designer so you get these gliders you get these things that blink you get these glider guns you get all sorts of things no one designs that from above it's individual cells following individual rules that when they're placed in certain configurations they produce these patterns the patterns appear to self-organize there's also what we call emergence now emergence means when those patterns have some sort
of functionality so a glider a glider gun a counter so you can use the game of life to create something that actually counts things right or you can even use it to compute things if you interpret what those cells mean so when those patterns have a functionality we can think of that as being emergence so you can think of things like consciousness and cognition as being emergent phenomena okay and the game of life produces both these patterns and these patterns that seem to have or can be interpreted as having functions another cool thing about the game of
life right is it helped us get the logic right right without writing down the model without running it on a computer we'd never be able to figure out all the stuff that's going on and we see how we can get really complex things from really simple parts so that's something that logically you might not have anticipated so i'm sure when i started this lecture and showed you these simple rules you might have thought this isn't going to be very interesting right these are just a bunch of simple rules and it's a checkerboard but then when you
see all the amazing stuff that can come out of the game of life you start realizing like wow okay simple rules can produce incredible phenomena that's something that i might not have known had i not constructed a simple model okay so that's the game of life it's a game that belongs to a class of models called cellular automata models now it's just one cellular automata model what we're going to do next in the next lecture is look at a whole class of even simpler cellular automata models to try and get an
understanding of what causes a system remember that question like why does the system go to equilibrium why is it complex we're going to look at a whole class of cellular automata models trying to get at least some understanding of why that might be the case thank you

in the previous lecture we looked at the game of life which was a particular cellular automata model and in it we saw how we could get just amazing phenomena right how simple rules could aggregate to produce really sort of complex novel outcomes what we want to do in this lecture
is look at an even simpler class of cellular automata models and actually these are the original cellular automata models and try to figure out what has to be true about the model in order for it to produce different types of outcomes remember one of our core questions was what kind of outcomes is the system going to produce is it going to go to equilibrium is it going to produce patterns is it going to be complex is it going to be chaotic and what we want to do is we want to try and understand which of
those things is going to happen now we're not going to get a definitive answer but again by using a toy model we're going to get some understanding of what leads to complex outcomes all right so first some history cellular automata were developed by a guy named john von neumann who was just a brilliant man von neumann built one of the first computers known as the johnniac or the eniac he was also one of the founders of game theory and of growth theory in economics so just a brilliant brilliant mathematical mind one of the
things he came up with and this was working with a guy named stanislaw ulam who's a mathematician was really the simplest model he could think of of computation which is what's going to be called a cellular automata model now his idea has since been studied in gory detail including in a recent book by a guy named stephen wolfram who's the developer of mathematica called a new kind of science and in this book wolfram explores to really unbelievable depth this is a thousand page book with hundreds and hundreds of illustrations how the cellular automata model
works and wolfram refers to this as a new kind of science because he's arguing for a computational inductive way of looking at the world okay so what are these models what are cellular automata models well again they're exactly like what we looked at in the game of life except here instead of being on a two-dimensional grid things are on a one-dimensional line so you can imagine as before we've got a bunch of cells and they can either be off which would be clear or they can be on right and so what we can do is we can
just then sort of say okay how do these things evolve over time now the difference between this and what we did before is that now if i have a cell here right sitting in the center we're going to assume that it has only two neighbors so before in the grid world each cell had eight neighbors now it's only got two now the advantage of having only two neighbors is well it's simpler for one thing and it also means we can exhaustively study and that's why wolfram's book is so thick we can study every single one
of these rules so we can write down every single rule and then ask how do the different rules work what behaviors do they produce and that sort of stuff the other big advantage is that it's much easier to display these worlds than the other worlds because we can let time move along this axis so what i can do is i can say here's the cell at this moment in time maybe it's filled in and then i can say what happens the next period maybe it's off and then i can say what happens
to it the next period maybe it's on so i can represent time sort of moving vertically down the page right so that's the model now i've got to decide okay what can the rules look like well here's an example so let's think about what a rule would have to look like so if i think of this cell x right here this is the cell x and it's got two neighbors right so neighbor one neighbor two or we could call these left and right if we want we can ask what are the possible
states those things can be in well it's possible that all of them could be off and it's possible all of them could be on right or it's possible only the one to the right is on or only the cell itself is on right so we can think through there's basically eight different possibilities so what would a rule be a rule just says what do i do in each one of those states so it could say well if i'm in the state where we're all currently off then i'm going to stay off and if we're in
the state where we're all currently on i'm going to go on and it could say well if these two of us are on i'm also going to go on and then what you do is you think about okay here's this cell we start out with some initial configuration we've got a whole bunch of cells and some of them are colored in and some of them are not and then what they do is each cell says well what does my configuration look like if i'm in this cell right here i notice that all
three of us are on so i go to the lookup table see all three are on and say i'm going to be on next period okay so all you do is this for each cell so this cell right here if i pick this one out right here it's on but its two neighbors are off so i'd go up to the lookup table and say okay this is the configuration we're in right here and it might say in that situation go off so that means the next period it would be off so that's it
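
As a rough illustration of the lookup-table idea just described, here is a minimal Python sketch. The names `EXAMPLE_TABLE` and `step` are made up for this example, and most of the table entries are arbitrary choices; only "all off stays off", "all on goes on" and "on with both neighbors off goes off" echo the cases mentioned above. Wrapping the line around at its two ends is also an assumption, since the lecture does not say what happens at the edges.

```python
# Hypothetical example table for a one-dimensional, two-state automaton.
# Keys are (left neighbor, cell itself, right neighbor); 1 = on, 0 = off.
# Values are the cell's state in the next period.
EXAMPLE_TABLE = {
    (0, 0, 0): 0,  # everyone off: stay off
    (0, 0, 1): 1,
    (0, 1, 0): 0,  # on, but both neighbors off: go off
    (0, 1, 1): 1,
    (1, 0, 0): 1,
    (1, 0, 1): 0,
    (1, 1, 0): 1,
    (1, 1, 1): 1,  # everyone on: go on
}


def step(cells, table):
    """Apply the lookup table to every cell at once and return the next row.

    Assumption: the line wraps around, so the first and last cells are
    treated as neighbors of each other.
    """
    n = len(cells)
    return [
        table[(cells[(i - 1) % n], cells[i], cells[(i + 1) % n])]
        for i in range(n)
    ]


if __name__ == "__main__":
    row = [0, 0, 1, 0, 0]
    print(row)
    print(step(row, EXAMPLE_TABLE))
```

Every cell reads the old row and writes into a new one, so all cells update at the same moment rather than one after another.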
time moves vertically and we have these lookup rules right now one of the things that wolfram does in his book is he says okay look if you look across all these different rules you can get all four of these classes of behaviors right so we talked about this before you get these fixed points you get alternation you can get randomness and you can get complexity and so we want to understand this why do you get these things what has to be true about the rules in order to get
these different types of outcomes okay now before we go any further okay there's a lot of rules how do we make sense of them how do we keep track of what the rules are well wolfram has a really ingenious way of numbering them so let's think about it so if i'm in this state here all off well there's two possibilities here right we could be off or we could be on and if i think about this state there's two possibilities as well we could be off we could be on and that's true for every
one of these two two two two two so there's two different things i could put for each of these things so that means that there's two to the eighth possibilities which means there's 256 different rules so now we think holy cow the whole universe of these rules is of size 256 there's 256 things we have to explore that's why wolfram's book runs to a thousand pages right if you give four pages to each rule you've suddenly used up a thousand pages now wolfram also comes up with an ingenious way of
numbering these rules what he does is he says let's just give these the numbers 1 2 4 8 16 32 64 128 and then what he says is if it's on right then so let me do this a different way so suppose that this is our rule right here these three are on so then what he says is we'll take the 2 the 8 and the 128 and we'll just add up those numbers to give us 138 so that will be rule 138 so what he does is he labels this first
one with one the next one with two the next one with four the next one with eight and so on and this enables him to give every rule a unique number between 0 and 255. so the rule where everything's off is rule zero the rule where everything's on we just add up all these numbers and get 255. so this is going to give us a numbering system for the rules okay so let's look now at some rules that create some interesting phenomena this is rule number 30 right because we have 2 plus 4 plus 8 plus 16.
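
The numbering scheme can be sketched in a few lines of Python. This is only an illustration: the function names `rule_table` and `rule_number` are invented here, and the sketch assumes the usual convention that the all-off configuration carries weight 1 and the all-on configuration carries weight 128, with the three cells read as a binary number (left, self, right) in between.

```python
from itertools import product


def weight(left, centre, right):
    """Weight (1, 2, 4, ..., 128) attached to one of the eight configurations."""
    return 1 << (4 * left + 2 * centre + right)


def rule_table(number):
    """Turn a rule number between 0 and 255 into its eight-entry lookup table."""
    if not 0 <= number <= 255:
        raise ValueError("rule number must be between 0 and 255")
    return {
        config: 1 if number & weight(*config) else 0
        for config in product((0, 1), repeat=3)
    }


def rule_number(table):
    """Add up the weights of every configuration that switches the cell on."""
    return sum(weight(*config) for config, output in table.items() if output)


if __name__ == "__main__":
    # The worked example: the configurations weighted 2, 8 and 128 are on.
    example = {(0, 0, 1): 1, (0, 1, 1): 1, (1, 1, 1): 1}
    print(rule_number(example))      # 2 + 8 + 128
    print(rule_table(30))            # on for weights 2, 4, 8 and 16
    print(2 ** 8)                    # size of the whole rule space
```

Because each of the eight configurations is independently on or off, every number from 0 to 255 corresponds to exactly one table and vice versa.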
and this rule says if all three of you are currently off you stay off if only the one to the right is on or only the one on the left is on right these two things you go on if you're currently on you stay on and then here's a little bit of an asymmetry if you're on and the one to the right is on you stay on right but if you're on and the one to your left is on over here you go off so let's think about what happens here this one and this one all are
in this state right where all three are off so they're going to stay off this one has the one to the right on so it's going to come to life right this one right here this next one is currently on with its two neighbors off so it's going to stay on right this one right here has the one to the left on so it looks like that so it's going to go on and the other ones are all going to stay off so what we get is these three cells
are now on what happens the next period well let's just start again these ones to the left are all going to stay dead but this one right here because it's got one neighbor to the right on is going to come to life this one because it's on and has its neighbor to the right on is going to stay on but this one which is in the center has three in a row so it's going to die off so we're going to get something that looks like that so what we get is we get this sort of pattern
spreading out well again we're doing this by hand let's try this in a more serious way using netlogo okay so we're going to set this up so there's one cell that's alive in the center and then we let it go and we see that we get those three right and now what we see is this really interesting pattern evolving as i move down and notice how this is creating now we see these different structures right we see smaller triangles bigger triangles and so on right and one of the things that's been observed about this rule
which is sort of interesting is if i drew a line right down the center like if i picked a particular cell and drew a line right down the center of its path over time it looks like a random sequence of ons and offs so you wouldn't be able to predict what's going to happen even if you knew what happened the period before so rule 30 is an example of a rule that produces what appears to be randomness all right here's the next rule this
is rule 110 so remember how we get the rule the 2's on the 4's on the 8's on the 32's on the 64's on so we add those all up we get 110. so let's think about this one again we've got these three cells over here to the left and these three cells over here to the right that all have no neighbors on so they're all going to stay off now this one has a neighbor to the right on and so it's going to come on this one right here right is currently on with no
neighbors on so it's going to stay on and this cell right here has a neighbor to the left on right but notice how it's going to stay off unlike in the previous case well now if i go along this one is going to stay off this one's going to stay off but this one because it's got a neighbor to the right that's on is in this configuration so it's going to come to life this one is on itself and has its neighbor to the right on so it's in this configuration
so it's going to stay on right but this cell right here the original cell that was on is in this configuration it's on and the one to its left is on so it's going to stay on as well and then finally this cell right here is in the configuration it was in before where its neighbor to the left is on so it stays off and so now we get something that looks like this where we sort of have this increasing triangle now we could ask what happens to rule 110 as we let it run and what we
get is and this is a map from wolfram we get this really interesting pattern and this is going to be sort of complex we see these particles that sort of move through space and this rule 110 is classified by wolfram as class four a complex rule all right so here's a better picture if i start with a random configuration here's rule 110 and again we see all these sort of interesting particles moving through space we see lines moving through we see things like this interacting and then causing bigger things we see all
sorts of really crunchy interesting stuff this is complex right it's very hard to make sense of so what we've seen then which is interesting with this simple one-dimensional cellular automata model is it's easy to make rules where everything just dies it's easy to make rules where everything just blinks there's some rules where things appear to be random and pass standard statistical tests of randomness like rule 30 and then there's rules like rule 110 right that create this complexity so what we can do is we can ask okay here's the interesting question why right
why do some rules go to a steady state some rules blink some rules look random and some rules complex before we get to that question of why what creates complexity what creates chaos what creates order let's just stop for a second and think about how profound these results are these are really simple models much simpler than the game of life and they can give us anything and this has led some physicists and mathematicians to suggest that this may be how the world works in some sense that everything may come from very simple rules
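
The hand-worked first steps of rule 30 and rule 110 can be reproduced with a short Python sketch. This is a stand-in for the netlogo runs, not the model used in the lecture: the function names `evolve` and `show` are invented, the row is assumed to wrap around at its ends, and the start is assumed to be a single on cell in the middle.

```python
def evolve(rule, width, steps):
    """Run an elementary rule (0 to 255) from a single on cell in the center.

    Returns a list of rows, one per period, with time running down the list.
    Assumption: the row wraps around at its ends.
    """
    row = [0] * width
    row[width // 2] = 1
    rows = [row]
    for _ in range(steps):
        row = [
            # Read (left, self, right) as a number 0..7 and look up that bit.
            (rule >> (4 * row[(i - 1) % width] + 2 * row[i] + row[(i + 1) % width])) & 1
            for i in range(width)
        ]
        rows.append(row)
    return rows


def show(rows):
    """Draw each row as text: '#' for an on cell, '.' for an off cell."""
    return "\n".join("".join("#" if cell else "." for cell in row) for row in rows)


if __name__ == "__main__":
    for rule in (30, 110):
        print(f"rule {rule}")
        print(show(evolve(rule, width=63, steps=30)))
        print()
```

With a wide enough row the output shows the nested triangles of rule 30 and the left-growing structure of rule 110, with time running down the page as described earlier.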
so all the complex things we see out there in the real world come from very simple binary interactions so this led to the phrase by the physicist john wheeler it from bit now let me quote wheeler here because it's really sort of profound he says it from bit otherwise put every it every particle every field of force even the space-time continuum itself derives its function its meaning its very existence entirely even if in some contexts indirectly from the apparatus elicited answers to yes or no questions binary choices bits it from bit symbolizes the idea that every
item of the physical world has at bottom at a very deep bottom in most instances an immaterial source and explanation that which we call reality arises in the last analysis from the posing of yes no questions and the registering of equipment evoked responses in short that all things physical are information theoretic in origin and that this is a participatory universe okay that's wheeler 1990. so wheeler is basically saying this it from bit idea is that you can actually explain anything right by just simple yes no questions at the core and so the very very deep
bottom of reality could just be binary switches so it us the universe everything could literally come from bit now that's a big leap from the simple one-dimensional cellular automata model but again the cellular automata model is capable of producing pretty much anything so it's interesting all right so let's get to this question of how does it produce anything what's going on well chris langton who is a researcher at the santa fe institute who got his phd at michigan studying these cellular automata came up with something he
called langton's lambda and what lambda does is it tells us sort of what the outcomes look like so let me explain what i mean so look remember wolfram numbered these rules from 0 to 255 langton takes a much simpler approach he says look how many things go on in this case there's three so you can think of langton's lambda as three or as three eighths either way but it's just the fraction or the number of switches that are on right so this rule would have an alpha or lambda i'm sorry of zero or zero
over eight and this one would have a lambda of one over eight so langton's lambda tells us the fraction of bits that are on and this one remember this was rule 30 right would have a lambda of 4 over 8. well let's go back and look at these again this one has a lambda of zero over eight what's going to happen nothing right everything's just going to die nothing interesting is going to happen what's going to happen to this one that has a one over eight well initially a lot
of stuff is going to die off but then once everything dies off everything's going to go on but then once everything's on it's all going to die so this thing's going to blink right what about rule 30 which has a lambda of 4 or 4 over 8 well remember this thing was chaotic right this looked completely random and what about rule 110 right this was rule 110 this has a lambda of 5 over 8 and this thing was complex now you might think well wait the bigger lambda gets the
more likely we are to get something interesting well that's not quite true because think about when lambda is eight right when lambda is eight then everything automatically goes on so that's not going to be interesting either so what's going to be interesting it would seem is sort of this in between region right this region where you've got sort of either two three four five or six things that go on well let's look at it so here's all the rules in the one-dimensional cellular automata with two neighbors and if i sum this
up i'd get 256 if i want to know how many are class 3 remember this is sort of chaotic or random in that class there's 32 of them right and if we look 20 of them have a lambda equal to 4 and they're all in this region between two and six class four is the complex rules right and of the complex rules there's only six of them and those all happen with lambda between three and five so here's what's really interesting now we want to ask what causes chaos and complexity well it's this region right
here it's intermediate levels of interdependence right so a rule like this which has a lambda of 7 over 8 or 7 right nothing interesting is going to happen it's just pretty much going to go to everything being on and then once everything's on right it's going to stay on so it's going to be stable so it's these intermediate levels where we see the complexity so if you look at something like this this is the nikkei index you know where you see these incredibly complex patterns what you'd expect is that these rules have substantial interdependence right because that
middle level means that whether i'm on or off depends a lot on what other people are doing so if there's a lot of interdependence in the rules you're going to see complex patterns like these things right well what happens in a market well people's rules depend a lot on what other people are doing so there's a lot of interdependence and therefore you get these complex patterns if there weren't any interdependence right then you'd always go on or always go off and nothing interesting would be happening so what do we learn from this very very
simple toy model first we learn again that simple rules can combine to form just about anything incredibly simple rules second we get the sort of profound idea of it from bit and third we get that complexity and randomness require some intermediate level of interdependency right so you can't have it be that like i always go on or always go off you need interdependency in the actions in order to create complex phenomena okay so that's one-dimensional cellular automata models it's a toy model but it gives us a deep insight and
that deep insight is if we see complexity out there in the world it's likely because people's behaviors or the rules that things are following are interdependent all right thank you

hi in this last lecture on aggregation we want to talk about the aggregation of preferences so this is going to differ from what we've looked at before remember when we looked at the central limit theorem we looked at aggregating numbers or actions and then when we looked at the game of life and cellular automata we were talking about aggregating rules now what we want to do is aggregate
preferences so preferences are going to be a different structure a different mathematical structure than what we had with either rules or numbers so to get a handle on this to get at how we aggregate these things we've first got to say well what are preferences how do we represent them well let's think about it so let's suppose i'm just asking what do you prefer do you prefer apples or do you prefer bananas and you may say well you know i prefer apples or someone else may say no i prefer bananas or alternatively i
could say how about bananas and coconuts do you prefer bananas or do you prefer coconuts and you might say well you know i prefer bananas to coconuts so one way we write down preferences or think about preferences is through revealed actions so we can just give people sets of choices and ask them which do you prefer over the other so when you think about overall preferences what we'd like to do is have a complete listing of someone's preferences so we'll oftentimes talk about what are called preference orderings which are just a
ranking of a whole set of alternatives now typically those alternatives will be within a particular class so i'll have a preference ordering over fruit i might have a preference ordering for vegetables i could have a preference ordering over houses over cars right so within a category i can rank different things all right so we can then ask well how many preference orderings are there right so what does this thing look like well let's suppose i've got these three things apples bananas and coconuts and i could say okay well on apples and bananas there's two possibilities
right either i prefer apples to bananas right which i'm going to show here with a greater than sign or i could prefer bananas to apples so there's two possibilities next if i look at bananas and coconuts right i could prefer bananas to coconuts or alternatively i could prefer coconuts to bananas so there's two possibilities there and finally with apples and coconuts i can either prefer coconuts to apples or apples to coconuts and there's two possibilities there so 2 times 2 times 2 right if i multiply these it's going
to be 8. so there's eight different ways eight different types of preferences i could have for these three types of fruit right so that's a lot of different things and each one of them i can just represent by these sort of greater than signs like which one i like first you know which one do i prefer now there's a bit of a problem though with this let me erase all this for a second there's a bit of a problem with this because let's look at these particular preferences these preferences say i prefer
apples right to bananas i prefer bananas to coconuts and i prefer coconuts to apples now that doesn't make any sense because if i prefer apples to bananas and bananas to coconuts then i should prefer apples to coconuts right so this doesn't make any sense and it should go like that these are what we would call transitive preferences so they satisfy a relationship called transitivity so these are transitive preferences and so we typically assume that people have transitive preferences another thing about transitive preferences is that they're rational so it'd be irrational to say
oh i like apples more than bananas bananas more than coconuts but coconuts more than apples that doesn't make any sense so we think of rational preferences as being preferences that are transitive if i like a more than b and b more than c then i also like a more than c okay well then i can ask if this is true right if apples over bananas and bananas over coconuts implies apples over coconuts that puts a restriction on how many preferences i can have i can no longer have anything it rules certain
things out so now we can ask how many preferences can i get that way well this is actually also an easy calculation and we've sort of done some of this math before well it means there's going to be one thing i like best right that's ranked first one thing i like second best and one thing ranked third best well so how many different things could i like first i could like the apple i could like the banana or i could like the coconut so say i chose the apple there were three possibilities once i've chosen the
apple first i've got two things i could choose next the banana or the coconut say i choose the banana but i could have chosen one of two but once i get to the third thing i've only got one thing left so there's three times two times one which is six so there's only six ways to have rational preferences over these three alternatives so when we think about rational preferences what we think of is these preference orderings right where one thing is preferred to the next which is preferred to the next now in more sophisticated models
we could also allow equality right so i could say i'm indifferent between bananas and coconuts but here we're just going to assume that you like one thing more than the next so the first thing we get just from thinking about these preferences is that if we impose some rationality assumption on people's preferences then there's fewer preferences than we get if we just sort of allowed anything to go so here's the game here's sort of what we're going to play with in this particular model we want to think about suppose i've
got a bunch of people who have rational preferences and now suppose i want to ask how do their preferences add up what do i mean by that well think about it each person has preferences and now i can say well what is the society's preference or even like in a family i could say everybody in our family has preferences over these fruits right each member does well can i say anything about the family's preferences well first notice if everybody has the same preferences it's pretty easy if everybody in the family likes apples
and then bananas and then coconuts then we can say well the family likes apples then bananas then coconuts it gets tricky right if different people like different stuff so if one of us likes apples and then bananas and then coconuts but another person likes bananas and then coconuts and then apples so if we differ in our orderings it now becomes somewhat problematic to decide what is our collective ordering what are our collective preferences so this is an aggregation problem right we've got individuals with preferences and now we want to ask what's the collective preference well
here's what's really interesting let's watch so here's some preferences here's person one they like apples and then bananas and then coconuts person two likes bananas and then apples and then coconuts and person three likes apples and then bananas and then coconuts okay so if you think about this and ask okay what are our collective preferences well there's some diversity here but it seems pretty clear that coconuts should be last because everybody has coconuts last so we'll put a little coconut here right that's in last place now it comes
down to sort of apples versus bananas now one thing we could do is we could say well let's treat people equally let's not suppose that person two is somehow more important than person one or person three so we treat people equally and we can say well let's just vote and if we vote two people like apples and one person likes bananas so then we can put the little apple here i'll draw a really bad apple that's going to be better than the banana that's a horrible looking banana and that's going to be
better than the coconut so we can sort of say these are our collective preferences apple banana coconut okay that's pretty easy well now let's go for something where the preferences are a little bit more diverse now person one likes apples bananas coconuts person two likes bananas coconuts apples and person three prefers coconuts to apples to bananas now we've got to think okay well what happens here there doesn't seem to be any clear winner well one thing we could do is we could say well let's just do a pairwise vote so let's just you know vote these
things through so let's first compare coconuts to apples so if you compare coconuts to apples and again let's number these people one two and three if we do coconuts versus apples we see that person two and person three right prefer coconuts so coconuts is going to win two to one if we look at coconuts versus bananas we see that well person one and person two both prefer bananas to coconuts so one and two prefer bananas so bananas are going to win so let's actually circle these so coconut wins versus apples and bananas win
versus coconuts so therefore it would stand to reason that bananas should win with respect to apples right because bananas are better than coconuts coconuts are better than apples so therefore bananas should be better than apples well let's check let's sort of just check to be sure so if we compare apples to bananas we see that person one likes apples more than bananas well that's okay person two likes bananas more than apples but person three likes apples more than bananas so we get that one and three right both prefer apples to bananas so apples wins well
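
The pairwise votes worked through above can be tallied mechanically. The following Python sketch is only an illustration: the function names `pairwise_winner`, `majority_results` and `condorcet_winner` are invented here, each person's preferences are assumed to be a strict ranking listed from most to least preferred, and everyone's vote counts equally.

```python
from itertools import combinations


def pairwise_winner(rankings, x, y):
    """Return whichever of x and y a majority ranks higher, or None on a tie."""
    votes_for_x = sum(1 for r in rankings if r.index(x) < r.index(y))
    votes_for_y = len(rankings) - votes_for_x
    if votes_for_x == votes_for_y:
        return None
    return x if votes_for_x > votes_for_y else y


def majority_results(rankings):
    """Run every head-to-head vote and record the winner of each pair."""
    alternatives = sorted(rankings[0])
    return {
        (x, y): pairwise_winner(rankings, x, y)
        for x, y in combinations(alternatives, 2)
    }


def condorcet_winner(rankings):
    """Return the alternative that beats every other one head to head, if any."""
    alternatives = rankings[0]
    for candidate in alternatives:
        others = [a for a in alternatives if a != candidate]
        if all(pairwise_winner(rankings, candidate, o) == candidate for o in others):
            return candidate
    return None


# First example: mostly agreeing preferences.
agreeing = [
    ["apple", "banana", "coconut"],
    ["banana", "apple", "coconut"],
    ["apple", "banana", "coconut"],
]

# Second example: the more diverse preferences.
diverse = [
    ["apple", "banana", "coconut"],
    ["banana", "coconut", "apple"],
    ["coconut", "apple", "banana"],
]

if __name__ == "__main__":
    print(majority_results(agreeing), condorcet_winner(agreeing))
    print(majority_results(diverse), condorcet_winner(diverse))
```

For the diverse profile no alternative beats both of the others, which is the code-level symptom of the three majority votes chasing each other in a circle.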
look at this here's the collective here's what the collective preferences are so we make this collective preferences the collective likes coconuts more than apples apples more than bananas and bananas more than coconuts that's irrational that's not transitive so here's the really funky thing we've got individuals every single individual is completely rational right they've got nice transitive preferences there's no inconsistencies but then when we vote when we try and aggregate these preferences we get something that's inconsistent so this is a paradox of aggregation you know so before when we talked about aggregation we got
things like you know simple rules could create complex phenomena here what we're doing is saying look aggregation of preferences you know aggregation of some structure can give us something that doesn't have one of the properties of the parts so each part was rational but the collective isn't rational so this is formally called condorcet's paradox so each person is rational each person has rational preferences but then when we vote the collective is not rational the collective says if we go back right the collective says oh yeah coconuts versus apples
coconuts apples versus bananas apples so then you think for sure they must like coconuts more than bananas but in fact if you have people vote bananas will beat coconuts so this is the condorcet paradox each person is rational the collective is irrational so this has some pretty severe implications the implications are going to be when we think about voting that now suddenly we're not necessarily going to get a good outcome we could get almost a random outcome and then it means that people might want to vote strategically so later on in the course we
can start constructing models of how people vote this will be near the end of the course we'll see that the fact that aggregation doesn't work right that there's a problem with aggregation in these over preferences that that's going to create incentives opportunities conditions under which people might want to sort of manipulate agendas lie about their preferences or misrepresent their preferences in order to get outcomes that they want to get so here's a case right aggregation doesn't give us something we want okay so again just to drive this home each person has totally sane rational transitive
preferences exactly what we'd expect but when you look at the collective the collective has this irrationality so aggregation is sort of a funny thing that's why social science right in particular in this case politics is so darn interesting right because what happens at the macro level is sort of logically inconsistent now we've seen aggregation in several forms right we've seen aggregation of numbers in the central limit theorem we've seen aggregation of rules right in both the game of life and the one-dimensional cellular automaton we saw we could get really complicated things from simple parts and now
in looking at preferences we've sort of said let's look at aggregation of some other mathematical structure namely these orderings and we found that orderings that are you know in a mathematical sense transitive sort of the social sense rational don't necessarily aggregate into orderings that are transitive and rational so we get that they're sort of we can lose logical consistency as we go up so these sort of interesting aspects of aggregation are ideas we're going to play with throughout the course now none of these particular models models any real thing per se right but they're building
blocks they're giving us a basis for how to think if nothing else i hope these lectures are giving you some sense of like the mysteries and the intricacies of adding things up and why the social world is very very different than the parts that comprise it thank you hello in this set of lectures what we're going to do is we're going to look at some decision models so these are models of how people do make decisions and how they should make decisions and we're going to do a bunch of types we're going to do
multi-criterion decision making models we'll do some spatial models and we'll do some decision theory models under uncertainty okay so that's sort of the outline now when we think about decision theory models we're really doing these things for two reasons first is normatively remember we talked about how models can make us better thinkers well these particular models are going to help us make better choices remember i talked about how they can serve as a crutch they can help us where we started because of our sort of limited ability to hold information models can help us that's
clearly going to be the case here so normatively these models will be useful because they help us make better choices also there's a positive dimension so social scientists use these sort of models a lot to try and predict the choices that people make whether they're policy makers or business people right or you know even entire governments you can use these sort of models to figure out you know why did some actor make the choice that he or she made okay so that's sort of what this is all about you can bring these models to data
as we'll see as well so what do i mean by normatively using model think about it right so you've got all these choices to make you know whether to go to school what investments to make what job to take even whether you should like drive or fly somewhere it's a whole bunch of choices you have to make and these can be difficult there can be lots of dimensions to them and they can be made under uncertainty so these models will help you make better choices which house should you buy should you have your wedding inside
or outside right all these sorts of choices you can use these models to help you make better ones now positively talking about that you can help predict what's going on again you've got some politician who makes a policy choice maybe they make a nomination to the supreme court you want to understand why did they choose that particular person a political candidate chooses a platform you can use these models to understand why did they choose that platform a business makes an investment these models will explain you know why they made that investment or maybe even
tell you something about what they perceive the value of an investment to be so these models are used in two ways one to help you make choices and two to understand the choices of others now we're going to do two broad classes of models the first are multi-criterion choice models so that's where there's lots of dimensions and you're trying to weigh one alternative versus another the second type are going to be probabilistic where there's some uncertainty out there in the world you're not sure how it's going to unfold and so what you have to do
is decide okay how do i balance off the risks versus the reward let me explain this a little bit and then we'll get started so suppose you're buying a car you could be looking at the new ford fusion right which is a great car or the chevy volt also a great car and you've got to think about okay how do i um measure these two how do i decide which of these two that i want to buy right well one way you could do it is you could say okay well here's a bunch of criteria
maybe there's you know how comfortable the seats are and so ford wins on that one and maybe this one is miles per gallon and maybe the chevy gets better miles per gallon so what you can do you have all these criteria and you can choose the car based on these criteria now alternatively you could have a spatial model now in a spatial model what you could have is you could have an ideal point so there could be a couple dimensions one of them might be sort of how fast the car goes speed of the
car and another might be um how comfortable it is all right so let's write comfort here not very smoothly so i could want a car that's sort of moderately comfortable maybe i don't want to be too comfortable because i sort of fall asleep in the car and maybe i don't want to be too fast because my sons are going to be driving soon so i want a car that isn't quite as speedy so this is sort of my ideal point right here and then i can think about how far are these cars from my ideal
point so this is distance one and distance two and i buy the car that's closer to my ideal point so that's another way we can think about making decisions spatially in terms of distance between you know how close one product or one policy is to my ideal preferences okay those sort of choices though are you know under certainty we know all the attributes of the car oftentimes we have to make choices under uncertainty where we don't know you know what the future is going to hold so for example i teach at
the university of michigan great school a lot of people want to go there but people always ask me you know should my son or daughter apply to michigan and you can think of that as a choice under uncertainty right because you can apply or you can not apply but if you apply there's some probability p that you'll get accepted and some probability one minus p that you won't get accepted so this decision do i apply or do i not apply it comes down to really what you think the value of p is and also
how nice it would be to be accepted how much you really want to go there so you can write down these decision trees and decide is it worth it does it make sense to fill out that application and pay the application fee in addition to doing that once we've got these decision trees we're going to show how we can use them remember one reason we write models is they let us do other stuff once we've written down those decision trees we can actually compute something called the value of information what the value of information tells us
is if the uncertainty went away so someone could you know tell us whether you're going to get into michigan or tell us if it's going to rain on our wedding day then we can figure out exactly how much that information would be worth so it's a little trick we can do once we've written down a decision tree which is a pretty cool thing so that's the outline multi-criterion decision making spatial models we'll do a quick aside on probability and then we'll move on to decision theory and the value of information let's get started
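
The apply-or-don't-apply decision described above can be sketched in a few lines. This is only a rough illustration under stated assumptions: the function names and the numbers (a 25 percent chance of admission, a payoff of 1000, a fee of 75) are hypothetical and not from the lecture, not applying is assumed to be worth zero, and the value-of-information formula is the standard "expected value with perfect information minus expected value without it", which the lecture has only previewed at this point.

```python
def expected_value_of_applying(p, value_if_accepted, application_cost):
    """Expected payoff of the 'apply' branch: the cost is paid either way,
    the value is received only with probability p."""
    return p * value_if_accepted - application_cost


def should_apply(p, value_if_accepted, application_cost):
    """Apply only if the 'apply' branch beats not applying (assumed worth 0)."""
    return expected_value_of_applying(p, value_if_accepted, application_cost) > 0


def value_of_information(p, value_if_accepted, application_cost):
    """How much it would be worth to learn the outcome before deciding.

    With perfect information you apply (and pay the cost) only in the
    world where you would be accepted.
    """
    with_info = p * (value_if_accepted - application_cost)
    without_info = max(
        expected_value_of_applying(p, value_if_accepted, application_cost), 0
    )
    return with_info - without_info


# Hypothetical numbers, chosen only to make the arithmetic easy to follow.
print(expected_value_of_applying(0.25, 1000, 75))
print(should_apply(0.25, 1000, 75))
print(value_of_information(0.25, 1000, 75))
```

With these made-up numbers the decision comes down to exactly the two things the lecture names: how large p is and how much acceptance is worth relative to the fee.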
thanks hi in this lecture we're going to talk about a type of decision making known as multi-criterion decision making and the idea here is that there's lots of different dimensions and you've got a couple choices and you're trying to figure out which choice you know sort of makes you happiest which choice you think is sort of the better of the two or three or four right so how does this work we're going to do this in two ways we're going to first do it qualitatively and then we'll do it quantitatively so let's do a qualitative assessment
first by that i mean just sort of no numbers right it just means sort of like which is better let's suppose you're looking at a house this is a very small house but let's suppose that you're looking at a house and you're trying to decide you know should i buy this house or there's some other house down the street i'm also looking at that now if you think about making that sort of decision a house has all sorts of dimensions to it right so there's square feet there's the number of bedrooms there's a number of
bathrooms there's the lot size there's the location there's the condition of the house there's all these things so if you sit there and you go through one house and go to the other house and you try and keep all these things in your head that can be really really complicated so one way to do multi-criterion decision making is just to say okay look let's just create a chart like this let's create the following model where these are the dimensions right here these are all the dimensions i'm gonna decide on and then for each house right
house one and house two i have a column so now i can write down the square footage of these houses so i could say this house is maybe 2000 square feet and this house is 1800 square feet right and number of bedrooms i can see maybe this house has four bedrooms and this house has three bedrooms so house one is sort of bigger with more bedrooms but house one may only have one and a half bathrooms whereas house two has three bathrooms now look at the lot size and maybe
this one is only a quarter acre and this one is a half acre right and then i look at the location and for location maybe i put miles from work maybe this is 10 miles from work and maybe this is 15 miles from work and then last i can ask okay what's the condition of the house and maybe this one is excellent and this one is only good well now what i can do is i can just go down the criteria and say okay which one wins right so what i can do
is i can say okay well on square footage this one wins on bedrooms this one wins on bathrooms house two wins on lot size house two wins on location because this is distance to work house one wins and on condition of the house house one wins so what i get is that house one gets a total of four and house two gets a total of two so i can say things like house one seems like a better house right because it wins on four of the six criteria and this is a way you
know one way to make decisions you can do this on other issues as well a few years ago the state of michigan had a referendum called the michigan civil rights initiative and this essentially was a referendum saying that state employers including universities could not give preferential treatment to women or underrepresented minorities so what i did with liz suhai who was a grad student at the time at the university of michigan was we put together a decision theory guide for people to help make this decision so what we did is for each of the
different dimensions of this issue how it was going to affect things like equality the quality of education help or harm women social cohesion reward merit and whether it was an appropriate limit or not we wrote a guide that sort of laid out the argument for the mcri and against the mcri and then what a person could do is they could go down each of these dimensions and say i'm in favor of the mcri i'm against it right and so on and then count up the columns and say okay there's four reasons why i should support
it and two reasons why i should be against it or alternatively someone might read through these things and say boy on every one of these dimensions i think we should reject the mcri and vote against it so what this was is this was a guide to help people recognize the dimensions weigh the alternatives along those two dimensions and then make hopefully more informed choices so when you think about how you vote or how you buy a house this can be really a useful way because there's so many dimensions right what you do is you just
lay out each dimension put columns for the alternatives and then decide which alternative is better on which dimension and then add those up and make a choice so let's go back to our house case so i do this i put down all the criteria and i you know list house one and house two and then it says okay you know i should buy after doing this right remember we came down and we had four reasons why i should buy house one and two reasons why i should buy house two and i say okay therefore i'm
buying house one but suppose i do that and uh i don't like house one i've done this analysis and in my gut i say you know what i don't like house one i really like house two well a couple things could be going on one is maybe i'm missing some criteria maybe house two is a spanish style house and house one is a craftsman house and what i really like is spanish style houses well if that's the case right what i should do is i should go back to my criteria and add spanish style house but
if i do that that's still only going to make it four to three well what can i do then well one thing i can do then is i could say well do i really like house two right is it really the case like i just or is there something like irrational in what i'm doing is this is it some sort of romantic attachment to house two when in fact when i look at it somewhat objectively according to the criteria that i care about house one is better well another way we can go more sophisticated multi-criterion
decision making model is to take into account quantitative weights right to not only just take into account qualitatively which is better but actually quantitatively to measure the differences so what do i mean so now i can say okay here's all these things i care about right and let's suppose now that i've added in here that i also care about the basement so now there's house one and house two and maybe again house one wins on square feet number of bedrooms but it doesn't win on bathrooms or lot size and it wins on location and condition
but suppose house two wins on basement right and before i had this total of four to three but it could be that you know square footage doesn't matter to me that much nor does the number of bedrooms so i give each of these a one but number of bathrooms matters to me so that's a two and lot size matters to me so that's a two location and condition don't matter to me very much so those each get a one and having a basement is incredibly important so that's a two well now if i do this house one gets a total of one two three four right and house two gets a total
of two four six and so now it's house two and so what i see is okay what really was going on here is that when i put in how much i care about these dimensions i realized that's the reason why in my gut i sort of felt that house two is better now again let's think about this remember we talked about we don't want to let the models tell us what to do we want the models to help us make better choices so if i'm buying a house and i write down all the criteria and
i compare two houses and i see boy according to my very own criteria house one is better than house two but in my heart of hearts i prefer house two that doesn't mean we shouldn't buy house two but it does mean we should ask ourselves are we crazy are there criteria we're leaving out or are there weights we're leaving out so we can at least understand ourselves and know why we're making the choice and we know we're not making it because you know perhaps we liked the blue curtains in the living room or something right so it
helps us make more rational choices so that's sort of normatively why these things are so useful positively it's useful because when we see somebody else buy house two and we see boy house one was better on every dimension except for lot size we can then infer wow this is someone who cares a lot about lot size okay there we go so that's it multi-criterion decision making what you do is you list all your criteria take your alternatives and you see which one wins on each criterion now you can also add weights and make
it more quantitative so some criteria matter more than others again two ways we can use it one is make better choices ourselves right we just saw how when we're deciding like whether to buy a house or how to vote this can be really useful and second we can use them to understand the choices of others thank you hi in the previous lecture we talked about multi-criterion decision making right we had lots of different dimensions and you weighted alternatives according to those dimensions in this lecture we're going to move in a slightly different direction we're still
going to consider multiple criteria but what we want to do is we want to have a spatial model so the difference here is that instead of just wanting sort of more square footage or a larger lot you're going to have an ideal point so there's going to be sort of a perfect amount that lies sort of between too much and too little so these are known as sort of spatial choice models now spatial choice models originally started by thinking about geographic choice there's a guy named harold hotelling who was an economist who thought about
imagine you're on a beach and there's an ice cream vendor you know 50 feet to your left and there's another ice cream vendor 40 feet to your right you may decide well you know since the one to my right is closer what i'll do is i'll go and you know buy my ice cream from the closer one so as not to walk as far we can take that idea and apply it to attributes of a good so for example i love indian food right and i like my indian food to be reasonably hot we can imagine
one of the dimensions of indian food is whether it's you know cold right cold in terms of how spicy it is or you can imagine this here means it's really really hot so i could take all the indian restaurants i could put you know one indian restaurant here indian restaurant one indian restaurant two and indian restaurant three and this is how hot their food is and so there's me as a consumer and i'm trying to decide okay where do i go do i go to indian restaurant one indian
restaurant two or indian restaurant three well it could be that since i like my food really hot this is my ideal point right here i'm gonna go to indian restaurant one because it's closest so this is the idea that what each person has is a sort of a preference an ideal point and then they buy the thing or they purchase from the firm right that offers the product that's closest to their ideal point so this is hotelling's idea there's a guy named anthony downs who was also an economist that sort of moved into political
science and what anthony downs did is he said you know we can apply this to how people vote so we can do is we can put politicians right somewhere between left and right so maybe you know this might be the democratic candidate for president let's say and this might be the republican so both are you know democrats sort of the left republicans sort of the right but they're not really completely extreme and then you've got a voter who perhaps sits right here and the voters got to decide okay do i vote for the democrat or
vote for the republican and what they do is they look at this distance how far am i from the democrat this would be distance one and how far am i from the republican this is distance two and i talked about this sort of in an earlier lecture about why we construct models and using this model the voter could say well you know what since i'm closer to the democrat i'm going to vote for the democrat so this is a fairly simple spatial model what we want to do is we want to ramp this up a
little bit so let me ramp this up in two ways one is you can take this model to data so this is work by andrew gelman at columbia university and what andrew does is he looks at supreme court justices and so here's a whole list of supreme court justices this is a little blurry but here's people like justice blackmun right and here's justice scalia and here's justice ginsburg and what you can do is you can chart ideologically where they are over time where this here is to the left and this here is to
the right so you notice that scalia is up here on the right and blackmun's down here to the left so what you can do is you can sort of use this model to keep track of where the different supreme court justices are what's interesting about this is then when you think about who does the president appoint well the president may also have an ideal point if it's a liberal president their ideal point may be down here if it's a conservative president their ideal point may be up here and they're going to want
to appoint a judge that has the same sort of ideology that they have so it's a really nice very simple model but you can take it to data and then you can use that data to understand how presidents appoint judges and also how the ideology of the court has changed over time okay we want to use this we want to sort of take this model and show how we can expand it to more dimensions so before that was just one dimension hot cold left right but we can do the same
thing in more dimensions so let's go back right to the the car example remember i'm just trying to decide between the ford and the chevy and i said you know there might be two things there might be sort of speed of the car right and there might be comfort right and once again penmanship could be better so and what i do is i decide which of these two cars is closer okay so let's do this in a you know sort of a fun example in terms of getting a burger so you know i like burgers
and we can think of what's my ideal burger well my ideal burger might have two pieces of cheese and it might have two patties and two tomatoes and also some ketchup let's say four tablespoons of ketchup four tablespoons of mayo and i'm a pickle guy four pickles so this is my ideal burger right and so we can write that down sort of more carefully right just a better font so say this is it this is my ideal point and this would be if i do this out it's not in two-dimensional space like it's
multi-dimensional space i can't draw it because this is six dimensions right but this is where computers are nice i can code this in a computer and it's just a vector of length six well now i've got to decide where do i go for lunch if i go to mcdonald's get a big mac or go to burger king and get a whopper what we can do is we can say okay let's look at my ideal point which is right here and let's look at the big mac the big mac has two pieces of cheese which is
great two patties which is great no tomatoes not so good not enough ketchup the right amount of mayo for me and a few too many pickles you know i like pickles but it seems to have you know a few too many pickles so i could ask how much do i like that big mac what we can do is we can take my ideal point and the big mac and take the difference so here the difference is zero here the difference is zero here the difference is two here the difference is one here the difference is zero and here the difference
is two now notice i've got these little lines on either side of the difference that means i'm taking the absolute value of the difference because otherwise you know it's got two too few tomatoes if we put a minus two here and a plus two here the too few tomatoes and the too many pickles would cancel out right so what i can do is i can add all these things up and say that my total distance from the big mac is five let's suppose i walk across the street and now i decide let's
see how the whopper stacks up well the whopper has um two pieces of cheese that's great it's got one patty so that's off by one two tomatoes that's great not quite enough ketchup but perfect amount of mayo perfect amount of pickles so it's only off by two so my distance from the big mac right if we go back was five and my distance from the whopper was only two so we could you know represent that by here's the whopper and here's the big mac this is only a distance of two for me and
the big mac is a distance of five for me so what i could do is after writing down my ideal burger i can look at the big mac look at the whopper and say you know i'd rather buy the whopper because it's closer to me it's closer to my ideal point now this is really cool because this is the thing i can use to figure out you know what should i choose what burger should i buy also i can use this to figure out who should i vote for
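
The burger comparison above can be sketched as a small distance calculation. This is a minimal sketch, not the lecturer's own code: the ideal point is the one given in the lecture, but the ingredient counts for the two burgers are assumptions reconstructed from the differences the lecture states (for example "no tomatoes" and "off by one" on ketchup), and the function names are made up for illustration. The distance is the sum of absolute differences, which is what the "little lines" around each difference refer to.

```python
# Ideal point from the lecture: a vector of length six.
IDEAL = {"cheese": 2, "patties": 2, "tomatoes": 2,
         "ketchup": 4, "mayo": 4, "pickles": 4}

# Assumed ingredient counts, chosen to match the differences in the lecture.
BIG_MAC = {"cheese": 2, "patties": 2, "tomatoes": 0,
           "ketchup": 3, "mayo": 4, "pickles": 6}
WHOPPER = {"cheese": 2, "patties": 1, "tomatoes": 2,
           "ketchup": 3, "mayo": 4, "pickles": 4}


def distance(ideal, option):
    """Sum of absolute differences, so too few and too many cannot cancel."""
    return sum(abs(ideal[k] - option[k]) for k in ideal)


def closest(ideal, options):
    """Name of the option with the smallest distance to the ideal point."""
    return min(options, key=lambda name: distance(ideal, options[name]))


options = {"big mac": BIG_MAC, "whopper": WHOPPER}
for name, burger in options.items():
    print(name, distance(IDEAL, burger))
print("choose:", closest(IDEAL, options))
```

The same two functions would work unchanged for any other set of dimensions, since nothing in them is specific to burgers.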
instead of thinking of these as big macs and whoppers i could think that maybe there's two dimensions to policy right so one dimension could be some sort of social policy between liberal and conservative and there could be some sort of fiscal policy right between liberal and conservative so this would say i'm sort of socially liberal but on fiscal dimensions i sort of lie in between liberal and conservative and so this would be my ideal point not in sort of big mac whopper space but in political space and then what i could do is i could vote
for the party or the candidate that's closest to me another thing we can do with this model and this is sort of cool is remember we talked about how we can use this positively so suppose i watch one of my friends right what i mean by positively right is to figure out explain why we see what we see so suppose i watch one of my friends and i go in and i see that my friend doesn't go to burger king my friend goes to mcdonald's and gets the big mac but i don't know anything
about my friend's ideal point but i do know about the big mac and the whopper well notice the big mac and the whopper are the same on this dimension this dimension and this dimension same amount of cheese same amount of ketchup same amount of mayo so what i could do is i could say just wipe out those categories right because they're the same and then it comes down to number of patties tomatoes and pickles so now if i see my friend buying the big mac what can i infer i can infer either that they like sort of two
patties or that they don't like tomatoes or that they really like pickles or some combination of those things and so what i can do by looking at choices i can understand what someone's ideal point is and again once we've got this idea in our head we can go to data and we can figure stuff out so for example let's look at political parties so what you can do and this is a map by michael toffius using some data from poole and rosenthal's nominate scores and what this does is it takes every single member of
congress and it looks at all the different votes now based on their votes you can figure out how conservative are they and how liberal are they now nominate breaks things down into two dimensions you think of this being one dimension another dimension so just for our purposes let's suppose this is a social dimension right and this is a fiscal dimension right so this is money up and down here right and this is more policies on this dimension so what you see is you see look all the republicans lie to the right on the social dimension
and the democrats all lie to the left and this is a particular map looking at the tea party which is a movement within the republican party and i think the tea party people are pretty well evenly mixed so what you can do is by taking this model to data right you can figure out by looking at the choices people made this is sometimes called revealed preference you look at the choices people make in this case you know politicians voting and you can map out where they are ideologically so you can notice the democrats
are all to the left of the republicans on social issues but on the fiscal dimension it's a little bit more complicated right okay so that's spatial models they're really cool right we can use spatial models to figure out sort of you know where we should buy indian food who we should vote for which car to buy whether to go to burger king or mcdonald's right by looking at these dimensions and seeing how close something is to our ideal point now we can also take these models to data and do more serious things we can
figure out you know where supreme court justices are and where members of congress are ideologically whether it's on one dimension or two dimensions we can do the same thing for products right so we could do that same sort of mapping and look at different products in space whether it's types of coffee right or types of automobiles and we could look at people's decisions on which to buy and we could figure out sort of where people's preferences really lie based on the cars they buy or based on the coffee they
buy so spatial models really powerful helps explain what people do and helps us make better choices ourselves thank you in the previous two lectures we talked about multi-criterion decision making and then spatial decision making where we want to go next is decision making under uncertainty where there's some probabilities involved so to do that first we want to take a little time out and talk for a moment about probability now if you've already taken a class on probability or if you know a lot of probability you can skip this little unit if you haven't this will
give you enough understanding that you'll know what you need to know to do what we're going to do with respect to decision making under uncertainty so probabilities are just the odds that something happens and so when you write down probabilities they have to satisfy three axioms first axiom is that any probability is between zero and one so if something can't happen its probability is zero if something's definitely going to happen its probability is one now even if you're 100 percent if you're totally sure something's gonna happen the probability can't
be bigger than one so you can't say i think there's a hundred and ten percent chance this is gonna happen no it's gotta be between zero percent and a hundred percent second axiom is a little more complicated we have to make a distinction between outcomes and events so an outcome is just an individual thing that can happen an event is a subset of outcomes so if i write down all possible outcomes then the sum of those probabilities has to equal one so if i think about flipping a coin right there's two outcomes heads or tails the probability
of heads is a half the probability of tails is a half and when i sum those two things together i get one that's the second axiom easy third axiom if i have an event now an event would be a set of outcomes and the event a is contained in the event b then the probability of a is less than or equal to the probability of b so one event might be that i get a heads i can write these as little sets another event might be that i get a head or a tails well the probability of getting
a head is a half the probability of getting a head or a tails is one and since getting a head is a subset right of getting a head or a tails the probability of a head one half is less than the probability of getting a head or tails which is one that's the third axiom so that's it those are the three things the probability of any outcome or event is between zero and one could be zero could be one but it's somewhere in that range then if i add up the probabilities of all the different outcomes those
sum up to one and then if i have one event that's a subset of another event this is the third axiom then that first event can't have a bigger probability than the second event so those are the axioms now there's actually three different types of probabilities first type of probability are classical probabilities so these are the sort of things that mathematicians play with and we think about things like dice and roulette wheels and things like that so for example if i roll a die i can sort of logically or classically assume that the probability of you know
getting a 4 would be just one sixth and the probability of getting an even number would be one-half and the probability of getting an odd number would be one-half so this is classical probability where you can sort of write down mathematically in some pure sense what each probability would be now there's a second type of probability which is frequency so here like with a die we know that it's going to be a sixth because the die is sort of evenly shaped but there may be other things where we don't know but what we
can do is we can count we can sort of do a frequency count so we've got lots of data and we can look at all that data and from that data we can you know make an estimate of what we think the probability is so for example suppose i ask you the following question do more words begin with r or do more words have r as their third letter right so this is the question now what you could do is you could just guess right you could think well i'm guessing that two percent
of words have r as their third letter and eight percent of words begin with r another thing you could do is you could open up the dictionary and you could count right so first of all you could just sort of look at how many pages there are that seem to begin with r and you could maybe get that you know six percent or something begin with r and then you could randomly flip through the dictionary looking at words and see what percentage of words have r as their third letter and you might find
that that's maybe like 11 percent or something and you might find out oh my goodness that this is actually bigger well what you're doing is you're sort of estimating through frequency what the probability of having r as its third letter is and estimating through frequency what the probability of having r as its first letter is so frequency just means you count things right and then you figure out the probability from there so it's not a pure probability like rolling a die that it's one-sixth but it's just how often it seems to happen so if you
look at something like is it going to rain next june 7th what you can do is you could go back and look over the last 100 years and you can say i've got a hundred years of data and on 26 of those days it's rained and on 74 of those days it hasn't rained and so then you could say i think the probability of rain is 26 percent again this isn't like rolling a die this is just counting it up right and this is a frequency estimate of what the probability is
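A minimal sketch of the frequency estimate just described. The list `rain_on_june_7` is made-up illustration data built to match the 26-out-of-100 count in the lecture, and `frequency_estimate` is a hypothetical helper name, not something from the course.

```python
# Hypothetical record: 1 means it rained on June 7th that year, 0 means it did not.
# The 26/74 split mirrors the lecture's numbers; the data itself is invented.
rain_on_june_7 = [1] * 26 + [0] * 74


def frequency_estimate(observations):
    """Estimate a probability as the fraction of observations where the event happened."""
    if not observations:
        raise ValueError("need at least one observation")
    return sum(observations) / len(observations)


p_rain = frequency_estimate(rain_on_june_7)
print(f"estimated probability of rain: {p_rain:.2f}")
```

The estimate is only as good as the assumption that each year's June 7th is drawn from the same unchanging process.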
when you make these frequency estimates you're making some strong assumptions one of which is what we call stationarity that nothing has changed over the last hundred years that the probability of rain has been stationary it hasn't changed and so this is a good predictor so ideally right we know classically the probability of something and if we don't know classically then the next best thing would be to use all that data we've got out there in the world and do a frequentist count sometimes we can't do either one of those things and we're stuck
with subjective probabilities so these are cases where we kind of have to guess or and we'll talk about this actually what we'd want to do is use a model right we want to have some sort of model we could use to figure out what a subjective probability is so for example here's a case that is sometimes given by psychologists so shelley majored in political science and was very involved in college republicans write down probabilities for the following events so let's think now what have i said shelley's a political
scientist right that's sort of interesting she's a republican so that means maybe a conservative political scientist you know maybe she's you know more interested in money what are the probabilities she'd do these things well flight attendant i might think well boy that's not very likely right five percent blogger i could think you know maybe blogging maybe there's a 10 percent chance she blogs because you know she was a political science major and she was a republican so maybe she likes to blog flight attendant while finishing her mba well that seems actually pretty reasonable let's give that
a 10 percent chance and then medical field let's say you know there's a lot of people in the medical field so let's just put a 15 percent chance she's working in the medical field because that's about you know the base rate for people who work in the medical field so these would be my probability estimates just subjectively writing those things down well let's look at these a little more carefully i did something wrong what did i do wrong remember our three axioms what were our three axioms axiom one was right that the probabilities had to be between zero
and one right axiom two was that all probabilities summed up to one so the sum was one and the third one had to do with if event a was contained in event b so let's go back and look at what i did what did i do i assumed event a that she was a flight attendant that this was only five percent of the time and for event c i had the probability that she was a flight attendant while finishing her mba was 10 percent well this can't be right because if she's a flight attendant right that's this
event a right and it contains event c so if she's a flight attendant while finishing her mba then she's also a flight attendant so this number right this 10 percent can't be bigger than the 5 percent so i made a mistake in fact this example remember i said psychologists like to use this this is an example where we see a bias where people make mistakes and later in the course we'll talk about these sorts of biases but so subjective probabilities are dangerous because when you start writing down numbers right we may not satisfy those
axioms and so our probabilities don't make any sense so suppose somebody asked a question like will housing prices go up next year how do you do it well one thing you could do is you could just guess you could say well i think there's a 30 percent chance housing prices go up next year what we want to do in this course right is think about well why would you think that right maybe we should write down a model maybe we should think about what are all the things that could go into affecting housing prices and
then based on those things think about which direction the economy is moving and then from there make some sort of assessment about whether prices are going to go up so when we think about cases where we don't have a classical probability right when you pick up a probability textbook they'll say there's two things you can do one is you can do a frequency method and the other is you can use a subjective method we're actually going to argue for a third way which is even though these probabilities are subjective we want to think
of these as sort of model based probabilities what we're going to do is try and construct a model and based on that model figure out what we think the probability of an event will be so here we have probability in a nutshell right there's three axioms probabilities are between zero and one if you add up the probabilities of all the possible outcomes they sum to one and if one event contains another event it's got to be at least as likely that's it those three axioms and again there's three types of probability one is classical we know sort of mathematically
why a probability is what it is second is frequency based where we've got all sorts of data and based on that data we have some good estimate of what we think the probability is third case is what's often called subjective where we don't have data and we don't have a classical reason and so we sort of gotta guess well rather than guess what we can do is try and construct models and use the model to give us some you know estimate of what we think the probability is going to
be and so these probabilities then are going to come into play in the next lecture when we think about how do we make choices when we don't know something for sure we know there's some probability of it raining or some probability of prices going up so that's where we're moving next decision making where we've got uncertainty in these probabilities thank you hi welcome back in the previous lecture we talked about probabilities and we did so because we're going to use probabilities in this lecture to talk about decision making under uncertainty and to do that we're going to
introduce a new technique a model known as decision tree models now these decision tree models are really going to be useful in terms of making decisions when there's lots of contingencies when there's probabilistic events when we don't know the future state of the world so the big reason we want to learn this model is just to be better thinkers to make better choices make better decisions rather than just sort of throw up our hands and say i can't figure out what to do i think i'm going to choose this now there's going to be two other
reasons as well one is going to be we're going to use them to infer things about the world about other people's choices so we're going to see someone's choice and from that we can get some understanding of how that person thinks about the world so we can again use it to explain what's going on and then the third reason for fun is we use these decision trees to actually maybe learn a little bit about ourselves there's a fun example at the end so let's get started what is a decision tree a decision tree is pretty straightforward
what you do is you think i've got some choice i can make maybe i can you know buy something or not and you know if i don't buy it maybe my benefit is zero and if i buy it maybe my benefit is plus five well if that's the case then i should buy it right because it's got a positive value so a decision tree is this really simple thing where we draw branches representing our choices and then we choose the branch that has the highest payoff well we want to do this though when the choices are
a little bit harder and there's all sorts of contingencies and probabilities so here's an example imagine the following scenario you're planning a trip to a city and you've got a ticket to go to a museum let's say from one to two and suppose the museum's quite a ways from the train station so you look at train ticket prices and you see you can buy a ticket for the three o'clock train for only two hundred dollars but the four o'clock train is four hundred dollars you're thinking boy should i buy that you know should i try
and save money by buying that ticket for the three o'clock train before it sells out now there's a 40 percent chance you're not going to make the train so you're going to think oh my gosh should i buy the ticket or not given there's a 40 percent chance i'm not going to make it and if i don't make it then i'm gonna have to buy two tickets i'm going to basically throw away the 200 well how do we make that sort of choice well it's not very hard what we can do is we can draw a decision
tree now the way we draw these trees is we're going to put a square box to represent a decision node and i've got a choice do i buy or do i not buy well it's not quite that simple right because if i buy there's some possibility a 60 percent chance let's put a point six here that i make the train and a 40 percent chance that i'm late so now i've got to decide okay what do i do because that's a little bit more complicated well to finish this off to use this tree so i can make a good
decision all i've got to do is put in the values of each choice so if i don't buy the ticket then i'm going to be out 400 because i've got to buy the expensive ticket if i do buy the ticket and i make my train that's great because what happens is i'm only out 200 but if i buy my ticket and i'm late then i'm out 600 and so now i've got all the information that i need i've got all my payoffs at the end of each branch i've got the probabilities of each branch right 60
and 40 percent and so i just have to figure out what's the best choice for me so let's make this all nice and clean so here's all my data and i've just got to say what's the better choice it's clear if i don't buy the ticket i'm out 400 so what if i buy it well there's a 60 percent chance that i'm out 200 plus a 40 percent chance that i'm out 600 dollars well if i add this up that's a hundred and twenty which is point six times two hundred plus two hundred and forty which is point four times six hundred
and 120 plus 240 is 360 so what i get is if i buy the ticket right i'm out 360 and if i don't buy the ticket i'm out 400 so it's a fairly easy choice right buy for 360 right or don't buy for 400 and here since i want to have the lower cost what i'm going to do is buy so that was a fairly simple example let's do one that's more complicated suppose you think about applying for a scholarship and it's worth five thousand dollars that seems like a pretty good deal and they
limit the scholarship to 200 applicants and you go into the you know the office and you realize you know you can be one of the first 200 now for the scholarship right you have to write a two-page essay and after you write a two-page essay explaining why you deserve the scholarship they're going to pick 10 finalists and those finalists are going to have to write 10 page essays so you've got this choice you could basically get 5000 dollars that's a lot of money but you've got to write these two essays
a two-pager and then if you make it as a finalist a 10-pager and there's some probability of making it as a finalist and some probability of winning so you look at this and you think how do i make this choice well again what do we need to know we need to know the probability of events happening so the probability of making it to be a finalist and the probability of winning and that's pretty straightforward and we need to know the payoffs so we know the payoff from the scholarship but we need to know the cost of these essays
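Probabilities and payoffs on each branch are the same two ingredients the train ticket example needed. As a minimal sketch, here is that earlier example in code, with costs written as negative payoffs in dollars. The helper name `expected_value` and the dictionary layout are assumptions made for illustration.

```python
def expected_value(branches):
    """Expected value of a chance node given (probability, payoff) pairs."""
    total_p = sum(p for p, _ in branches)
    if abs(total_p - 1.0) > 1e-9:
        raise ValueError("branch probabilities must sum to one")
    return sum(p * payoff for p, payoff in branches)


# Buy the 3 o'clock ticket: 60% make the train (out 200), 40% miss it (out 600).
buy_early = expected_value([(0.6, -200), (0.4, -600)])
# Don't buy: no uncertainty, just pay for the 4 o'clock ticket.
dont_buy = -400

options = {"buy": buy_early, "don't buy": dont_buy}
best = max(options, key=options.get)  # highest payoff means lowest cost
print({name: round(value, 2) for name, value in options.items()}, "->", best)
```

The same function can be reused for any chance node once its branch probabilities and end payoffs are written down.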
so to use the decision tree the first step you're going to make is to figure out the cost so let's suppose you figure well what's the cost to me of writing a two-page essay and figure well maybe 20 bucks maybe it's 20 dollars worth of my time to write a two-page essay now what about the ten pager well the ten page essay i could say well maybe that's only forty dollars and it's only forty because i've already written the two pager i've outlined my ideas and i'd be sort of excited about having to be a finalist
and it's not that hard to expand on ideas so let's just assume forty dollars so now i've got everything i need i've got my benefits and my costs and all my probabilities so i just have to draw the tree right well that's right first step is draw the tree but then once i draw the tree i've got to write down all those payoffs and probabilities so i've got to make sure i've got everything in there right and once everything's right i can solve it backwards just like i did before i can figure out the
value of each branch right and then figure out what choices i should make so let's draw the tree the question is do i write the essay or not if i don't write the essay i get nothing and if i write the essay well now it's more complicated because what can happen well there's going to be some random node here where i could be selected or not and then if i'm selected i can decide whether i want to write essay 2 or not but i probably will and then there's going to be some random thing whether i
win yay that'll be great or whether i lose and that won't be so great but what i want to do is not have these happy faces and sad faces i actually want to like put in the numbers so let's do it this isn't too hard again if i don't write that's zero what's the probability of being selected well 200 applicants 10 make it to be finalists so we can assume this is 5 percent right 0.05 and there's a 95 percent chance i lose now essay 2 i can either do it or don't
do it and then if i do it here's another chance node right here what are the odds of me winning the odds of me winning here are ten percent one out of ten and there's a 90 percent chance i lose so those are all my probabilities now i got to figure out my payoffs well i mean if i don't write the essay my payoff is zero if i write the first essay and lose i'm out twenty dollars so that's minus twenty if i'm selected but then don't write the second essay which is sort of a crazy
thing to do i'm also out twenty dollars if i do write the second essay and lose i'm out sixty dollars because i wrote two essays one for twenty one for forty but if i win right then i get 4940 i get the 5000 minus the 60 dollars for the cost of me writing the essays so let's clean this up a little bit so here's the total analysis right here's the beautiful game tree with all my probabilities what i've got to do is i've got to figure out what's my payoff right what's the payoff from doing
this thing so let's just work our way backwards so let's start right here if i win there's a 10 percent chance i win that's 4940 so i can take 0.1 times 4940 plus 0.9 times minus 60 well what is that 0.1 times 4940 is 494 right and 0.9 times minus 60 is minus 54 so what i get is 440 so what i can do is i can put a 440 right here i can basically wipe out all this stuff over here and put a 440 there now so if i look at this question then
do i write essay 2 it seems really obvious right if i write the essay my expected winnings are 440 if i don't write the essay my expected winnings are minus 20 so again let's clean this up so if i write the essay 440 if i don't write it it's minus 20 it seems pretty clear right that i should write the essay so now it just comes down to this if i write essay 1 there's some chance i'm going to get selected if i'm selected my expected winning is 440 if i'm not selected i'm going to
end up losing 20 so what's this worth well 440 i'm going to get that 5 percent of the time but 95 percent of the time right i'm going to lose 20 so i've got to add these two things up well 440 times 0.05 is 22 and minus 20 times 0.95 is minus 19 so if i add those two things up i get three so what that means is i could replace this whole branch in working backwards with a three so now if i look at my decision should i write the essay or not if i
don't write the essay i get nothing and if i write the essay my expected value is three dollars so what should i do well i should probably write the essays it's got a positive expected value now the interesting thing here is if there had been 300 applicants or 400 applicants right maybe i don't want to write the essay so what the tree does what this decision tree analysis does is it helps us figure out is it really a good thing to do so that's how you use decision trees to make decisions let's do something a
little trickier with them let's do something where we try and infer what other people think about probabilities so suppose you have a friend and they say look i know about this investment and it sounds a little risky to you and they say it's going to pay 50 000 you know i'm almost sure and you've only got to put two thousand dollars in she says look i'm in i'm investing my 2000 bucks this is a great deal you should invest so you've got to decide you know do you want to invest well the first thing i
think is what does she think the likelihood of this thing really is what we can do is we can draw a tree and say you know i can invest or i cannot invest and there's some probability p that this will succeed and there's some probability that it's going to fail and if it succeeds she's going to make 50 000 and so would i and if it fails i'll lose 2 000 so let's try to figure out what our friend is thinking so what our friend is thinking working in thousands of dollars is that 50p right minus 2 times 1 minus p is
bigger than 0. so she's figuring the end of this branch right here before chance takes its move is higher than 0. so if i work this through it says 50p minus 2 plus 2p is bigger than 0. so if i bring the p's all to this side right i get 52 p has got to be bigger than 2. so what she's assuming is p is bigger than 2 over 52 or about you know right around four percent so now i can look at this investment think do i really think there's a four percent chance it's
going to pay off clearly my friend does because she's in and i can decide whether or not to make the investment i can also infer this is the key point i can infer from her decision that she thinks that even though this is risky there's more than a four percent chance that it's going to happen because otherwise she wouldn't put her money in it okay so decision trees even if we don't know the probabilities if we look at someone else's actions we can infer what they think the probabilities
are now one last thing that's sort of fun we can use these trees to infer payoffs and sometimes we can use them even to infer payoffs about ourselves like how we think about things so here's the scenario it's kind of a fun one you've got a standby ticket right you've got some standby ticket to go visit your parents you call the airline on the morning of the flight and they say there's a one-third chance that you're going to make the flight and a two-thirds chance you're not going to make it so you've got to decide do you go
to the airport right or do you just stay on campus and not go home for the weekend well suppose you decide not to go you just decide to stay on campus you can use a decision tree to find out how much you really wanted to see your parents what do i mean by that well let's see so here's the decision you stay on campus and let's just make that a baseline payoff of zero or you can go to the airport and there's a one-third chance you're going to make the flight and let's
call this v the value of seeing your parents and we'll put in a little cost here minus c for you know a couple hours of your time to take the taxi to the airport and back or take the train to the airport and back alternatively there's a two-thirds chance right that you don't make the flight and there the payoff is just going to be the straight minus c well since you chose to stay at home what that means is this that means one-third times the value of seeing your parents minus the cost of going to
the airport right plus two-thirds times minus c the cost of going to the airport has got to be less than zero what that means is one-third v and if i add up the c's minus c is less than zero so if i work all the way through this what this means is that v is less than three c so it means your value of going to see your parents is less than three times the cost of going to the airport what that's telling you is well maybe i didn't want to see my parents very much
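Both inference examples come down to finding a break-even point. The sketch below checks the standby threshold v = 3c and the friend's investment threshold p = 2/52 using exact fractions. The function names and the cost c = 30 dollars are hypothetical choices for illustration; the lecture leaves v and c as symbols.

```python
from fractions import Fraction


def airport_expected_value(v, c, p_make=Fraction(1, 3)):
    """Expected payoff of going to the airport: v - c if you fly, -c if you don't."""
    return p_make * (v - c) + (1 - p_make) * (-c)


def break_even_value(c, p_make=Fraction(1, 3)):
    """Value of the trip at which going and staying (payoff 0) are equally good."""
    return c / p_make


def break_even_probability(gain, loss):
    """Success probability p at which p * gain - (1 - p) * loss equals zero."""
    return Fraction(loss, gain + loss)


c = 30  # hypothetical round-trip cost to the airport, in dollars
print("break-even value of the trip:", break_even_value(c))
for v in (60, 120):
    ev = airport_expected_value(v, c)
    print(f"v={v}: expected value {ev}, {'go' if ev > 0 else 'stay'}")

# Friend's investment, in thousands of dollars: gain 50, loss 2.
print("friend must believe p exceeds", break_even_probability(50, 2))
```

Staying on campus is consistent with any v below the break-even value, so the choice reveals a bound on v rather than a single number.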
now if you did go to the airport and try and fly standby that's saying the opposite that's saying that v is bigger than 3c and it's saying that you really did want to see your parents which is a great thing so i'm sure your parents would love to see you okay so we've done decision trees here lots of fun what we've shown is when we've got these decisions to make where there's lots of probabilities and contingencies these trees are really helpful they're really useful in helping us make at least reasoned decisions now again you don't
have to do what the model tells you to do but the model is again a crutch an aid a guide to help you make better decisions we also saw you can use these trees to infer what other people are thinking about probabilities right so my friend made that investment we could infer that she thought there was at least a four percent chance that thing was going to pay off and the last thing you can do is after the fact you can think i made this choice what is this choice saying about how i think
about the world or how i think about my parents depending on what you thought the probabilities were okay thanks a lot hi in our last lecture in this unit what we're going to do is we're going to look at something called the value of information now in the previous lecture we looked at how to make decisions under uncertainty and by definition right you don't know what's going to happen that's what uncertainty means we can ask a question of those models what if i did know the answer how much would the information be worth to you
and here's one of the real benefits of having this formal model this formal structure so we've got this formal structure we can actually figure it out we can actually solve for what the value of that information is so let me give an example think of the game roulette so roulette has a wheel and the united states roulette wheel has 38 different numbers you can bet on right the numbers one through 36 plus two other spots so there's 38 things you can bet on and if you know if this is a classical probability
thing right if i guess a particular number the odds of me winning would be 1 out of 38 so what i could do is i could say what's the value of information what if somebody knew what was going to happen there's two different questions we could ask the first question we could ask is suppose i said look i always bet on number 17 17 is my lucky number and someone says well you know i can tell you whether 17 wins before the wheel goes
what's it worth to you what's the value of that information well first off you expect to lose at roulette right and so if you're going to lose you wouldn't want to bet anything so if you didn't have information you'd bet nothing so your payoff is zero but what's the information worth to you well suppose if they tell you this you could win a hundred dollars well then you think well boy that information is great that's worth a hundred dollars to me that's not quite right because it's only worth a hundred
dollars if he tells you that your number is going to come up so if he says it's not going to be 17 well then you don't bet but if he tells you it is going to be 17 you do bet and you win and you're only going to win 1 out of 38 times so the value of that information would be 100 divided by 38 now alternatively suppose the person said i can tell you the winning number and suppose the most you're allowed to win is a hundred dollars per round but this person's like i
can tell you the winning number well if they can tell you the winning number you're going to win 100 bucks so if you're gonna win 100 bucks then the value of that information is a hundred bucks so the value of the winning number would be a hundred dollars the value of knowing whether your number wins would be 100 divided by 38 okay so that's the idea but now we want to apply this in a context that's a little more complicated where we do our decision trees and that sort of thing so let's
do an example let's suppose you're thinking about buying a car and you can buy the car now but you're worried that there's going to be a cash back program so you've heard some rumors that in a month there's going to be a cash back program and this cash back program is going to give you a thousand dollars cash back and let's suppose you've done a frequency analysis of the number of years in the past they've had a cash back program and you figure there's a 40 percent chance they're gonna have a cash back program so what you can
do is you can rent a car for 500 right now and then wait and hope there's a cash back option so what you want to figure out is what would it be worth to you if someone could tell you suppose you knew someone at the auto company and they said well you know i can tell you whether we're having a cash back program or not but it's going to cost you right so what would that information be worth how much would you pay to know if there's going to be a cash back program that's what
we want to figure out okay first off i want to say that's not an easy question right so if i said what's it worth to you i don't know 50 100 200 300 who knows right that's what we want to try and figure out so to figure out the value of information we just have to do three things first we're going to calculate the value without the information so we're going to say suppose i just have to make my choice what's my optimal choice how do i come out second i calculate the value if i had
the information so if i knew what was going to happen what would i do what would my net value be and then third i just take the difference i take the value with the information minus the value without the information and that tells me sort of what the information was worth totally straightforward easy to do provided again we use this decision tree model without the decision tree model it's going to be pretty difficult so here's my choice do i rent or do i buy right now if i buy then you know i'm
just out nothing right that's just my baseline value if i rent then there can be the cash back program right or there cannot be the cash back program okay so let's make this more formal right so if i buy i get nothing if i rent and there's no cash back program then i basically wasted 500 renting for the month but if there is a cash back program i net 500 why 500 because remember there was a thousand dollars right cash back minus the 500 to rent so that's 500
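With the payoffs written down, the three steps for the value of information can be sketched in a few lines. The lecture works through the same numbers by hand next. The names `PAYOFFS`, `ACTIONS`, and `expected_payoff` are illustrative assumptions, and payoffs are in dollars relative to the buy-now baseline of zero.

```python
P_CASH_BACK = 0.4  # from the frequency analysis in the example

# (action, cash back happens?) -> net payoff in dollars relative to buying now
PAYOFFS = {
    ("buy", True): 0,
    ("buy", False): 0,
    ("rent", True): 500,    # 1000 cash back minus 500 rent
    ("rent", False): -500,  # a month of rent with nothing to show for it
}
ACTIONS = ("buy", "rent")


def expected_payoff(action):
    """Expected payoff of committing to an action before the outcome is known."""
    return (P_CASH_BACK * PAYOFFS[(action, True)]
            + (1 - P_CASH_BACK) * PAYOFFS[(action, False)])


# Step 1: value without the information (pick the best action up front).
value_without = max(expected_payoff(a) for a in ACTIONS)

# Step 2: value with the information (learn the outcome, then pick the best action).
value_with = (P_CASH_BACK * max(PAYOFFS[(a, True)] for a in ACTIONS)
              + (1 - P_CASH_BACK) * max(PAYOFFS[(a, False)] for a in ACTIONS))

# Step 3: the difference is what the information is worth.
value_of_information = value_with - value_without
print(round(value_without, 2), round(value_with, 2), round(value_of_information, 2))
```

The only structural change between steps 1 and 2 is the order of the max and the expectation: choosing before the chance node versus choosing after it.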
now what was the probability of the cash back program it was 40 percent and what's the probability of there not being a cash back program that's 60 percent so now i got to figure out okay should i rent or not well look i've got 40 percent times five hundred and sixty percent times minus five hundred so if i multiply that out point four times five hundred which is two hundred plus point six times minus 500 which is minus 300 i get minus 100 so what i get is if i were to rent i'm out a hundred
dollars so if i can basically draw my tree it's pretty straightforward if i buy i get zero and if i rent i'm at a hundred dollars so what i should do is i should buy so without the information right think about what i do without the information i should just buy and i'm sort of at this net zero case so that's what i've done i've calculated the value without the information now what i want to do is calculate the value with the information suppose i knew what was going to happen so now i can just
draw another tree and this tree is going to look different than the first tree because here's what's going to happen first this chance note is going to be revealed to me somebody's going to tell me is there a cash back program or not and 40 percent of the time they're going to tell me yes there is 60 percent of time they're going to tell me no there's not let's think about what's going to happen if they tell me there's a cash back program then i'm going to rent and get 500 if they tell me there's
not going to be a cash back program then i'll buy and i'm just in the same situation i was before so if i have the information first then i've got easy choices once it's my choice because look once information is revealed there's no longer any uncertainty so with the information i'm no longer making a choice under uncertainty so clearly if there's a cash back program i rent and i'm up 500 and if there's not a cash buy program i buy and i'm based back to this baseline case of zero so what's the value if i
knew this information first well 40 percent of the time i get 500 and 60 time i get nothing so it's worth 200 so with the information my expected value is 200 so let's go back calculate the value without the information zero calculate the value with the information two hundred dollars calculate the difference two hundred dollars right to make this more formal with the information 200 bucks without the information nothing so therefore the value of that information is 200. in this case right this was a fairly simple one but we could go back to both of
the two examples we did in the last lecture one about buying the train ticket and the one about writing the essay and we could say suppose we knew what was going to happen suppose we knew if we were going to make the train and suppose we knew if we're going to win that scholarship we could ask what's the value of that information how much would it be worth to us to know what was going to happen and again you just do the same thing right we call that same technique solve it under uncertainty without information
solve it as though you knew the answer and then just take the difference between those two values so what have we learned here we've learned that we could take a model developed for one purpose right this decision tree model we developed that to figure out how to make decisions under uncertainty and we can repurpose it we can use it to figure out the value of information we can figure out how much would it be worth for us to know what the future is going to unfold how the futures can unfold and it's really straightforward to
do right it's a three-step process first we figure out what we do without information then we figure out what we do with the information and then we can see how much better off we are with information and that tells us the value of the information and that would be in many cases be a very hard thing to do in our head but with this simple tree model it's really straightforward and easy to do so again we see the power of models to help us make better choices thanks hi in this lecture we're going to step
back a little bit and we're going to think about how do we model people because a lot of the models that are going to be concerning us are models of you know people and groups of people like firms and governments and organizations so if we want to write good models of those things then we've got to have good models of the parts good models of the people okay modeling people's tricky the physicist murray gell-mann once famously said imagine how difficult physics would be if electrons could think so what did he mean by that what he meant was
that you know if you take an electron or a carbon atom or even a water molecule it doesn't think it doesn't try and make sense of the world it doesn't have any goals or objectives or anything like that no beliefs and so it's pretty straightforward to you know model how those things function when you look at people people are much more complicated right we're purposeful we've got goals we've got objectives we've got things we want to do we've got ways of seeing the world we've got belief structures we're messy and because of that you just
don't quite know how we're going to behave now on top of that we're diverse right we want different things we have different goals and objectives so this combination of sort of purposeful thinking actors who are different means that it can be really hard to understand what they do and how they act so how do we do it well we're going to talk about three basic frameworks the first framework is called the rational actor model now in this framework what you do is you assume that people optimize now this is unrealistic and i'll talk about this
in the next lecture in some detail but it's a good benchmark so one way to think about it is just to assume let's just assume people have some sort of goal and they optimize their goal okay second thing we can assume is what we call a behavioral model now here what you do is you sort of gather up all sorts of data about how real people actually do make decisions and choices and act and then what we try and do is model people as close as possible to how real people behave now of course you can't make
it too complicated we sort of try and include you know the one or two differences from perfect rationality like the biases that people might have right and the third thing we'll look at is even simpler which are sort of rule-based models so here what we're going to do is instead of really digging deep into psychology we're just going to assume that people follow rules and then see how those things add up now what we're going to see is in some cases which of these three things we assume matters a lot and in other cases which
of these three things we assume doesn't matter very much at all and so how we model people and whether it's really important that we get it exactly right it's going to be a function of the particular model we're playing with and we'll look at that a little bit later on in this unit okay but first let me just give a little bit more background so how does a rational actor model work in a rational actor model you assume that there's an objective function there's literally a mathematical function that you know someone's trying to maximize right
so that could be if you're a person maximizing happiness or utility if you're a firm it could be maximizing profits or market share or if you're a government if you're running for office it could be maximizing the number of votes then what you assume is that people optimize given your objective what you do is you do the optimal thing so you take the choice that makes you as happy as possible you make the political decision to get you the most votes and you produce the product that makes you the most money let me be
more specific so suppose you're an economist and you think how many hours should someone choose to work well what you do is you write down a function a utility function say the utility depends on consumption and on leisure and you might assume a function like this one it's the square root of consumption times the square root of leisure now why would you assume that well the square root function right starts out steep and then sort of slowly falls off so that means the first bit of consumption is really good but then it becomes worth less
and the first bit of leisure is really good right the first hour of vacation is great but by a week you're sort of ready to get back to work so that also falls off they've got what they call diminishing returns so this function sort of says consumption is good but becomes less good the more you have leisure is good but it becomes less good the more you have well this is your function and then what you do is you just choose consumption and leisure depending how much they cost to maximize that now that may not be
exactly what you do you might not be sitting at home writing down functions and you know solving these equations for what to do but economists sort of assume that you do that it's as if you do that right you come close enough to doing that that this is a worthwhile model now the rational actor models come under a lot of criticism and particularly of late according to you know just basic data right and so there's a movement within economics called the behavioral revolution but this has also been going on in psychology for about a hundred
years maybe that is you should actually instead of assuming people are rational you just go out and look and see what people do and if you observe what people do you'll find out that they're not rational and they're not rational in systematic ways recently this whole research paradigm has been really propped up through evidence from neuroscience right so you can actually look at the structure of the brain look at how people think in particular situations and you can see in fact why they're thinking in ways that the rational actor model would consider to be irrational
so those are sort of two benchmarks on the one hand you can assume people are rational on the other hand you can assume that well people sort of do what people do now this other thing this people doing what people do is going to be a lot messier so therefore there's a third way and the third approach comes from you know again social scientists but also from some computer scientists and even some psychologists and that is to assume that people follow rules think about our schelling model right we didn't have a very elaborate model of how people
behaved there all we did was we just assumed that people moved out of the neighborhood if you know the neighborhood became too much unlike them so this is just a simple rule and if that simple rule is close to what people do that may be sufficient to work in the model okay so what do we got we got these three basic frameworks people optimize people are sort of behavioral they do what people do and people follow rules each of those in a given situation will give us sort of slightly different predictions about how people behave
right and when we start aggregating them and having them interact we could get very different conclusions but there are other cases where we really don't see that much of a difference depending on what we assume but what we want to do now in these next couple lectures is we just want to sort of think through the logic of what a rational actor model means visit some of the biases that we've seen that you know psychologists have seen when they look at how people actually behave and then think through sort of what a rule-based model
might look like and we'll conclude by sort of comparing all these in a couple of settings and see when it doesn't matter and when it really does matter okay thank you in this lecture i'm going to talk about our first approach to modeling people and this is known as the rational actor model now the rational actor model assumes that people are well rational we make optimal choices and it comes under a lot of criticism increasingly so as people are dissatisfied with some of the results it produces nevertheless i'm going to argue this is a really useful way
to think about people especially when you're constructing models so what i'm going to do in this lecture is describe how it works i'm going to describe the rational actor model first in the context of decisions and then i'll do it in the context of games which are strategic interactions where my choice depends on your choice after i do that i'll talk about you know give an example where it sort of breaks down but i'll talk about sort of why i think the rational actor model is really a useful thing to have in your
pocket why it's a useful way to analyze situations okay so let's get started so how does the rational actor model work well what you do is you assume that people have some sort of objective that objective could be you know any one of a variety of things but there's some goal or purpose that somebody has or a group has or a firm has given that objective we assume that you make optimal choices that you optimize again strong assumption but that's what the approach assumes there's an objective and that people optimize so how does that work well let's suppose
it's a firm if you're a firm what you might want to do is you might want to maximize profits or you might want to maximize profit share market share i'm sorry right or you might want to maximize total revenue those are all things that a firm might want to do if that's your objective what we assume in the rational actor model is that you do that you make the choice that maximizes that goal now if you're a person you might care about maximizing your own utility making yourself as happy as possible or if you're
altruistic you might care about not only yourself but other people again the presumption though is that whatever your objective is that you make optimal choices to satisfy that objective to do as well as you can possibly do if you're a political candidate right what you might do is you might care about getting as many votes as possible so that's your goal that's your objective function get votes the rational actor model assumes that you take the action make the choice that gets you as many votes as possible okay so where can you apply
this where can you apply the rational actor model well let's take this simple case of a firm and what they want to do is they want to you know maximize let's say revenue instead of profits and suppose that the revenue we can write it as this way right it's just price times the quantity that's how much revenue we're going to get and let's suppose if we let the quantity be q that the price will be 50 minus q now why would this make sense well if q is 10 if i only produce 10 of
these things then there's not many to go around and maybe i can charge 40 a piece for them so the price would equal 40. but if i produce more of these if i produce 20 then there's more to go around and that's going to cause the price to fall and the price to fall to 30. so in this first case i'd get a revenue of 400 in the second case i'd get a revenue of 600. so the question is what q do i choose to maximize my total revenue and if you think of
this my total revenue is just q times 50 minus q so the optimal thing to do is going to be to have those two numbers be equal so to choose q equals 25 so i get 25 times 25 which is 625. so my optimal choice is q equals 25. so what a rational actor model would assume is the firm wants to maximize revenue that's the objective function and then given that it wants to choose a quantity that will do so so it chooses q equals 25. so where can you apply this you can apply this
just about anywhere so you can think about if i'm making investment decisions i've got some objective and that may be to maximize the value of my portfolio or to give me some sort of nest egg to retire on and so i'm going to make choices that maximize that or think of my purchases right or your purchases like when you go to the grocery store or if you think about buying furniture for your house you could assume you've got some objective function and what you do is you make choices right that are optimal given that objective
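Going back to the firm's revenue example, here is a minimal sketch of "write down an objective, then optimize it", using the lecture's price rule of 50 minus q. Restricting the search to whole-number quantities from 0 to 50 is a simplifying assumption, and the function names are illustrative.

```python
def price(q):
    """Price the firm can charge when it produces q units (lecture's rule)."""
    return 50 - q

def revenue(q):
    """Objective function: price times quantity."""
    return price(q) * q

# The two cases worked through in the lecture.
print(revenue(10), revenue(20))

# Rational actor assumption: pick the quantity that maximizes the objective.
# Searching whole-number quantities 0..50 is an assumption made for this sketch.
best_q = max(range(0, 51), key=revenue)
print(best_q, revenue(best_q))
```

The same answer follows from calculus: revenue is 50q minus q squared, its derivative 50 minus 2q is zero at q equals 25, which matches the lecture's rule of making the two factors q and 50 minus q equal.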
even things like education level you say how many years of school should i take right should i just get a bachelor's degree should i get a master's should i get a phd well you could assume that like you've got some objective function which could be you could maybe care about income you may care about what sort of life you lead is the life of the mind is it physical labor and you choose how much education to get given your objective now you can even apply this to things like how do i vote right cause my
objective could be for you know some policies to be implemented so i look at the candidates and figure out which candidate is likely to you know implement policies that i want now i also probably want to figure out is that candidate likely to win right i don't want to vote for someone who you know espouses my preferences but has no chance of winning so what i do is i sort of choose the candidate who's likely to win or most likely to win who also takes positions like the positions that i you
know prefer here's an issue though that i want to bring up when we assume rationality people often think that that means selfish that's not true so let me give an example suppose i'm walking on the street and i find 100 bucks well and i'm walking with a friend so there's me and there's my friend so one possibility is i could just say you know what this is so cool i just found a hundred dollars i could put it in my pocket and give my friend nothing that would be rational if my objective function was just me
if all i cared about was myself but it could be this is a really good friend and i care a lot about my friend and so when i find the hundred bucks i say hey wait whoa this is great i've got 100 bucks so i walk into the nearest store and say can you give me two 50s and i give one of the 50s to my friend because i care a lot about my friend so there's nothing intrinsic about the rationality assumption that assumes selfishness so again selfishness would just say that my objective function is me
this is how i put it in the framework all i care about is me my happiness my income my wealth right altruistic preferences would be that my objective is that i care about other people as well so i care not only about the happiness of myself but i care about the happiness of others and i can do all the same mathematics right so here's an example just like the price quantity example involving an altruistic person so suppose i've got somebody's got an income of 40 000 and they've got to decide how much do they consume
and how much do they donate and their objective function is just the square root of their consumption times the square root of their donations and they want to think about how much do i donate and how much do i consume if this is my goal well this is just a mathematical problem right so my donations are just 40 000 minus whatever i consume so this is just the square root of c times the square root of 40 minus c i can bring everything under the square root sign and get the square root of c times
40 minus c well now looking at it this way realize i want to make c times 40 minus c as big as possible and the way to do that is going to choose c equals 20 so that d equals 20. so the optimal thing to do here is to consume 20 and donate 20 to split my income halfway between consumption and donation that's rational it's also incredibly altruistic i could be irrational and altruistic and possibly consume less or more than this right um and i could also be irrational and selfish but the point is that rationality
in no way assumes selfishness you could be rational and altruistic you could be irrational and altruistic now i want to move on to something sort of complicated that is i want to make a distinction between a decision and a game so the previous example that was a decision i had to decide how much to consume how much to donate in a decision my payoff what i get only depends on what i do in a game my payoff depends on what other people do this is where it gets tricky because for me to decide what
i'm going to do in a game depends on what i think the other person is going to do so therefore i need a model of what i think the other person is going to do and oftentimes a really good model to have is to assume the other person is rational and that's a lot of how game theory works a lot of game theory assumes the other person is rational and that allows you to figure out what you're going to do so here's an example let's suppose there's two people let's call these people person one and
person two and this is what's called a normal form game these are the game payoffs i'll explain this in a second so person one can decide whether to stay home or go into the city on a saturday as can person two if person one stays home and person two stays home person one gets a payoff of one if person one stays home and person two goes to the city person one still gets a payoff of one so person one if they stay home their payoff is just one also if they go to the city
person one's payoff is just two so one if they stay home two if they go to the city person two is a more complicated person person two if they stay home their payoff is also one but if they go to the city their payoff depends on what person one does so if person two goes to the city and person one stays home person two gets a payoff of zero because person two is lonely it's no fun to go to the city alone at least for person two but if person two goes to the city and person one
goes to the city person two gets a payoff of four well look at person two's choice here this is hard person two is trying to say do i stay home or go to the city well if person one is gonna stay home then i should stay home because one is bigger than zero but if person one goes to the city then i should go to the city because four is bigger than one because it'd be really fun to go to the city person two thinks it would be great to go to the city
with my friend so for person two to figure out what to do person two has to know what person one's gonna do here's where an assumption of rationality is really useful because if person two says i have no idea what person one is going to do if person two is that clueless then person two can't figure out what to do but if person two says i think person one's rational then person two would say well look hmm if person one's rational if they go to the city they get a payoff
of two if they stay home they get a payoff of one so i bet they're going to go to the city so therefore if person two thinks person one's rational person two thinks person one is going to go to the city so therefore person two goes to the city and they get this great payoff okay so that's where when you think about decisions you have to make in the real world you often have to have some model of what other people do and oftentimes a decent model is to assume the other person is rational right that
they're going to do the rational thing let's do another example that was an example of what we call a normal form game this is an extensive form game now extensive form games these are sometimes called game trees and we sort of draw the actions sequentially so here there's a green person right and a blue person so the green person is going to go first and they've got to decide do i go this way and if i do we both get payoffs of zero or do i move down here if we move down here
the blue person gets to move so the green person's got to decide hmm what do i think the blue person is going to do well if the green person passes it down to the blue person he looks and says well the blue person could move over here and the blue person will get two and i'll get two or the blue person could go straight down and the blue person will get three and i'll get minus three so if the green person assumes the blue person is rational the green person's gonna say well look three is bigger than
two right and so the blue person's going to move down here well then the green person's going to say boy even though i could get 2 i'm not going to get it i'm going to get minus 3 so i'm going to move over here so again here by the green person making a rationality assumption on the part of the blue person the green person can figure out what to do okay so when would we see rationality rationality seems like a strong assumption first cases when the stakes are really large so if you think about
it like if you're just in you know buying lunch somewhere maybe you just follow some rule of thumb right or if you're trying to decide you know exactly how many bagels to buy at the bagel store you know maybe you just sort of pick a dozen or something but think about buying a house or buying a car or deciding where to go to college or deciding whether to go to college those are large stake decisions and in those situations it's probably the case that you come fairly close to being rational okay when else when it's
repeated so there's been a lot of experiments on whether or not people are rational what we often find is the first time somebody does something especially like remember that monty hall problem with the three doors we did the first time people do stuff they typically don't get it right but the more and more you do it people tend to learn and we get closer and closer right to being optimal third case when you have groups of people making decisions now groups can get led astray and you can get group think and terrible choices and
escalation of biases and that sort of thing but typically if you bring in more people you're less likely to make an irrational decision that's why when we're making large stake choices right we often go and ask friends and family and other people we respect so that we're not making these decisions alone so there's some sort of group of us making the decisions and then the last case is one reason we make optimal choices is often optimal choices are easy to make if somebody says would you rather have 20 or 10 dollars we choose 20. if someone
says would you rather do less work or more work we typically say i prefer to do less work so why then if rationality is often too complicated and why if we just think about it people don't act rationally why make the assumption so my advisor one of my advisors was roger myerson who won a nobel prize in this area called mechanism design which is a sort of a branch of game theory in a way and roger makes this following compelling argument that rational behavior is an incredibly important benchmark it's probably the
most important benchmark when you think about modeling people why well first off it's unique most of the time not always but most of the time it's going to be unique so think about the case of the firm deciding how much quantity to produce to maximize revenue or think of the person trying to decide how much to donate to charity there's a unique answer so it gives you this definitely testable amount right you can say this is what rational behavior is okay second thing it's often really easy to solve for so even though we think rationality is hard in
practice we're writing down these mathematical equations right so i've got some function that looks like this it's often very easy to use mathematics to find the optimal point to find you know within the context of our model what someone should do so let's contrast this with irrational behavior suppose i write down some model and say people are irrational well i've got two problems one is it's not unique right there could be a thousand ways to be irrational so i have no real prediction coming from the model the second thing is it may be really hard
to figure out what exactly is it that this person's going to do in this context if i start taking in all these sort of psychological influences and contextual influences and that sort of stuff it's often just easier to say here's their objective function let's just assume they optimize another point another reason it's an incredibly important benchmark people learn and we talked about these experiments that over time people get things right well if over time you're moving towards the rationality assumption then maybe the rationality assumption's sort of not a bad place to sort of start and
then you can sort of say this is where we expect the system to go over time right and then last it can be the case that even if people make mistakes if there's no bias one way or the other in terms of the mistakes those mistakes may darn well cancel out and what you're left with then is something that looks pretty close to rational behavior so some people could spend too much some people spend too little and therefore on average you get something that looks close to rational okay so what have we seen what
we've seen is this is that rational behavior works from the following set of assumptions you assume there's some sort of objective function and then you assume people optimize given that objective this could be firms this could be people whatever you want it to be now strong assumption yes but a really powerful benchmark now one of the things we found from doing a lot of experiments when we say we i mean psychologists economists all sorts of social scientists what we found is there are places where people sort of systematically deviate from rationality that's what we're
going to look at next we're going to look at some specific biases where the rationality assumption sort of seems consistently not to hold there's going to be cases where it does hold and cases where it consistently doesn't hold nevertheless it still can be useful even if you think it's not consistently going to hold to think through your model assuming rational behavior to get that sort of benchmark to see what would rational people do that way when you actually look at the evidence you can see exactly how far from rational people really are behaving okay
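The two game examples from this lecture can be sketched in code. This is a minimal illustration under stated assumptions: the payoffs are the ones given in the lecture, while the action labels ("home", "city", "across", "down") and the data layout are hypothetical names chosen for this sketch.

```python
# --- Normal form game: stay home or go to the city ---
HOME, CITY = "home", "city"
ACTIONS = [HOME, CITY]

# payoffs[(person_one_action, person_two_action)] = (payoff_one, payoff_two)
payoffs = {
    (HOME, HOME): (1, 1),
    (HOME, CITY): (1, 0),   # person two is lonely in the city alone
    (CITY, HOME): (2, 1),
    (CITY, CITY): (2, 4),
}

def best_response_one(a2):
    """Person one's payoff-maximizing action given person two's action."""
    return max(ACTIONS, key=lambda a1: payoffs[(a1, a2)][0])

def best_response_two(a1):
    """Person two's payoff-maximizing action given person one's action."""
    return max(ACTIONS, key=lambda a2: payoffs[(a1, a2)][1])

# Person two assumes person one is rational. Person one's best action is
# the same whatever person two does, so person two can predict it.
predicted = {best_response_one(a2) for a2 in ACTIONS}
assert len(predicted) == 1
p1_action = predicted.pop()
p2_action = best_response_two(p1_action)
print(p1_action, p2_action, payoffs[(p1_action, p2_action)])

# --- Extensive form game: green moves first, then blue ---
tree = {
    "player": "green",
    "moves": {
        "across": {"payoffs": {"green": 0, "blue": 0}},
        "down": {
            "player": "blue",
            "moves": {
                "across": {"payoffs": {"green": 2, "blue": 2}},
                "down": {"payoffs": {"green": -3, "blue": 3}},
            },
        },
    },
}

def solve(node):
    """Backward induction: return (list of actions, resulting payoffs)."""
    if "payoffs" in node:
        return [], node["payoffs"]
    mover = node["player"]
    best = None
    for action, child in node["moves"].items():
        path, pay = solve(child)
        # The mover keeps whichever branch gives them the higher payoff.
        if best is None or pay[mover] > best[1][mover]:
            best = ([action] + path, pay)
    return best

path, outcome = solve(tree)
print(path, outcome)
```

In both cases the rationality assumption does the same job: it lets one player replace an unknown choice by the other player with a predicted one, after which the remaining problem is an ordinary decision.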
thank you hi in the previous lecture i talked about the rational actor model in the rational actor model individuals had objective functions and then they made optimal choices made optimal decisions given those objectives in this lecture i want to talk about something called behavioral models now behavioral models are critical of the rational actor assumption and they're critical for two reasons one is there's a lot of observations there's a lot of data from experiments in the laboratory and from just looking at the real world that people seem to systematically deviate from optimal choices it's also the case
as we sort of understand how the brain works more deeply that there's evidence from neurology that just the way our brain is structured the way we encode and represent information the way we think causes us to systematically differ from what the rational actor model would assume so what i'm going to do in this lecture is i obviously can't give a full accounting of the behavioral revolution within economics or of all of psychology so what i'm going to do is i'm going to just hit some high points i'm going to talk about four well-documented biases
and talk about their implications for when we think about modeling people before i get there though i just want to give a little bit of background so daniel kahneman is a psychologist and he won a nobel prize actually in economics for his work on how people systematically differ from the rationality assumption so he's got a recent book out called thinking fast and slow and in this book he makes the following point you can think of the brain as having two sorts of processes a slower process that's a little bit closer to rational that processes information
and then fast processes that are more likely to work based on emotion or just based on quick clues so as a result these fast processes may make us biased right in ways that the rational actor model um assumes that we're not right the rational economic model assumes that we sort of think slowly and carefully about everything but kahneman argues that we think both fast and slow and as a result we make some mistakes now there's another book by cass sunstein and richard thaler who then argued that these biases have real implications for policy and that's
one thing we're going to talk about so it's one thing to say people make mistakes remember in the last lecture i said well some people make mistakes high and some people make mistakes low those are just going to cancel out and there's nothing we can really do about it what they argue is these mistakes because they're systematic have implications for how we construct policy and we'll talk about that as well okay so what are we going to do well we're going to do four examples i'm going to take four particular types of biases that are
well documented first one is something called prospect theory and this is the idea that we look at gains and losses a little bit differently second one something called hyperbolic discounting this deals with how much we discount the future and how that changes depending on how far in the future it is third look at something called the status quo bias there's a tendency to sort of just stick to what we're currently doing and not make changes and this is one that has big implications for policy and then the last one is something called the base rate
bias and that is that we can just be influenced by what we're currently thinking so what i'm going to do is show you some examples of each of these maybe talk a little bit about some evidence from experiments in the real world that you know suggest that these things really are biases and then i'll be a little bit critical of this whole approach at the end and sort of push back a little bit okay all right let's get started so prospect theory suppose i say to you okay you've got two options option a is i can give
you 400 bucks for sure here's 400 right now option b 50 percent of the time i'll give you a thousand dollars i'll flip a coin it comes up heads you get a thousand bucks the other 50 percent of the time when i flip the coin it's tails i'll give you nothing so you've got to decide 400 bucks for sure or this 50 50 proposition so if you give this to people a lot of people choose a a lot of people say look i'll take the 400 bucks now suppose i ramped this up so i said 400 million
dollars for sure or a 50 50 chance at a billion dollars or nothing then almost everyone would choose a right so as amounts get larger people tend to be what we call risk averse in gains but here's what kahneman showed this is what prospect theory is suppose it's a loss suppose i say okay option a is i'm just going to take 400 from you or we can flip a coin and half the time i'll take a thousand dollars from you which is even more and half the time i'll take nothing well it turns out people are actually
more likely to choose b in this setting because we're risk loving over losses so we're risk averse over gains and risk loving over losses and that's different than what you'd get from a rational actor assumption so this is just a systematic deviation and this explains why people take gambles maybe that they shouldn't take okay that's one let's go to the next one hyperbolic discounting suppose i say to you here's option a i'll give you a thousand bucks right now or wait until tomorrow and i'll give you a thousand and five dollars what do you take well
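A rough sketch of the prospect theory choice just described, before going on with the discounting example. It assumes an S-shaped value function (concave over gains, convex over losses). The exponent 0.5 and the loss weight 2.0 are illustrative values chosen for this sketch, not numbers from the lecture, and probability weighting is left out.

```python
def value(x, alpha=0.5, loss_weight=2.0):
    """Assumed S-shaped value function: concave for gains, convex for losses."""
    if x >= 0:
        return x ** alpha
    return -loss_weight * ((-x) ** alpha)


def gamble_value(outcomes):
    """Probability-weighted sum of values; outcomes is a list of (probability, amount)."""
    return sum(p * value(x) for p, x in outcomes)


def prefers_sure_thing(sure_amount, gamble):
    """True if the sure amount is valued above the gamble."""
    return value(sure_amount) > gamble_value(gamble)


# Gains: 400 for sure versus a coin flip between 1000 and 0.
gain_choice = prefers_sure_thing(400, [(0.5, 1000), (0.5, 0)])
# Losses: lose 400 for sure versus a coin flip between losing 1000 and losing 0.
loss_choice = prefers_sure_thing(-400, [(0.5, -1000), (0.5, 0)])

print(gain_choice, loss_choice)  # True False
```

With these assumed parameters the sketch takes the sure 400 as a gain but prefers the coin flip as a loss. An exponent much closer to 1 would flip the gains choice, so the pattern depends on how much curvature is assumed.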
a lot of people are gonna say you know what just give me the thousand bucks today most people say i'd rather have a thousand dollars right now than wait a day just to get five dollars more now suppose i say to you okay here's option a i'll give you a thousand dollars a year from today or i'll give you a thousand and five dollars a year and a day from today well here most people say well look i'm waiting a year anyway what's one more day i'll take option b well if you write down a
rational actor model where i'm trying to maximize wealth and i've got some discount rate or something like that if i choose a in the first case then i should also choose a in the second case but most people don't do that and the reason why is we discount the near future right a lot more than we discount that same amount of time in the far future so this is called hyperbolic discounting where immediate gratification matters a lot to us and so therefore we discount short periods of time from right now a lot more than we do
short periods of time down the road so this has implications and one i like to call sort of the chocolate cake implication so suppose i say you know i want to be fit i want to stay in great shape i want to be healthy and so if someone said to me okay you know a week from now would you want to have chocolate cake for dessert or not would you forego the chocolate cake i'll say absolutely i'm going to forego the chocolate cake because the thing is i really
want to get in shape the thing is if i'm sitting at dinner and someone puts the chocolate cake in front of me even though i want to lose weight even though like i really want to be fit the chocolate cake's right there i can't put it off because it's just you know it's just right there in front of me and so in kahneman's language i'm thinking fast right i don't have the long drawn-out thinking that allows me to make the rational choice and so what's going to happen is i'm going
to choose the cake now and make what would be a sub-optimal choice third something called the status quo bias so suppose you've got the following you go to work and it says check this box you can either check to contribute to the pension fund or not and so i'm sitting there thinking well you know if i check here then what that means is it's going to take money out of my you know salary out of my paycheck so i'm not going to check it now alternatively what my firm could do is it could say well
check box to not contribute to the pension fund so this is called a negative check-off so if i check here then they won't contribute otherwise they will well what happens here is again i think from a status quo standpoint oh i'm already contributing no i don't want to pull that money out so it sort of seems a little bit like the prospect theory thing right so what happens here is most people in this case won't check the box right most people won't check up here and most people won't check down there now how do we know
this we know this by looking at organ donations so in england if you want to donate an organ it says basically this check this box to donate the organ and you know how many people check the box 25 percent if you look to the rest of europe there's a little thing that says check box to not donate your organ and you know how many people check the box 10 percent so in the countries in europe in which you have to check the box to not donate your organ 90 percent of people donate their organs in england where you have to check the
box to donate your organs only 25 percent of people check the box so extremely large status quo bias and that again is a deviation from rationality okay last one what i want you to do is i want you to think of the year that you think this box was made this is a box made sometime during this century the last century i'm sorry 1719 i want you to guess the year that this box was made i want you to write it down okay got it written down now what i want you to do i want you
to guess how much this box cost so i want you to guess how much you think this box is worth if you were to you know you know buy it on the website prices how much do you think the box is worth so this is a bias called the base rate bias look at the two numbers you wrote down the first one was the year you thought it was made so suppose i thought this thing was made in 1960 then i want you to think what price did you write down maybe you wrote down 63
right those two numbers tend to be fairly close so the base rate bias is if you get people thinking about one number and then you ask them something else that second number will tend to be close to the first number so what you get right is this bias that makes no sense at all so you can ask people like think of the last two digits of someone's phone number and then price the box and what you'll find is their prices are actually fairly close to the last two digits
of the phone number so this base rate bias is just a clear deviation from rational behavior optimal behavior okay so we've looked at these four things prospect theory hyperbolic discounting status quo bias base rate bias they're all well documented deviations from rational behavior so there's a ton of these right so if you go and look on the web you'll see lists of hundreds and hundreds of biases right that have been found in the laboratory now there's people who are critical of these right and so one of the acronyms that i
sometimes hear in psychology departments is weird and they say that the results are weird and what do they mean weird stands for western educated industrialized rich democratic countries so most of these biases even though there's lots of biases most of them have been found in experiments with people like me people from western rich industrialized countries so one of the things that we're trying to figure out is how many of these biases hold true across different populations and different cultures some of them do and some of them probably won't now there's another issue
with these right even though there's lots of these biases people tend to learn and again remember we've talked about if the stakes are large right maybe they'll learn their way out of the bias so in the monty hall problem where they're picking the door which the prize is behind people suffer from a bias there right a status quo bias they stick to their door but after they play it enough it goes away so one of the questions is right how strong are these biases through repeated interaction and do they go away last point about this behavioral approach when
we think about modeling you may say boy you know it's right i think people suffer from hyperbolic discounting i think that there's prospect theory there's a base rate bias i want to include all these biases in my model computationally that can be a really difficult thing to do so one reason why when you look at a lot of the models that are going to follow we're going to be using simple rules or maybe assuming people optimize even though we know people suffer from biases is that it can just be computationally hard it can be difficult right to write
down a model where you include these biases nevertheless the biases are there and when we think about a model of people we don't necessarily want to assume that people optimize nor as we'll look at next do we want to assume that people necessarily just follow simple rules instead we may want to assume that people are these complicated messy things with these biases it's just that that can be sometimes hard to do and less elegant okay so how do we think about this how do we frame it here's i think one way is that if i'm writing down a
model it may not be a bad way to start by sort of assuming people make optimal choices given some simple objective function then i want to ask myself okay what biases are out there what is the list of documented biases given this list of documented biases do i think they're going to come into play how large do i think they'll be how relevant are they for this particular case i might also say boy maybe i should do some experiments or maybe we should look out there in the world and see are people
behaving the way that my rational actor model assumes they're behaving and if not right if there's reason to believe that one of these biases is kicking in then i want to include the bias but because there's so many of them i can't include them all so i want to think about which ones of them are most relevant so what you see a lot of times in models that try and explain behavior is they'll sort of be what i call sort of rational minus or rational plus right so they're sort of a little bit less than
rational behavior because what they've done is take rationality and added in a bias and that's one way to write down useful models of individuals okay thanks hi we now turn to our third way to think about modeling individuals and that's rule-based behavior so in rule-based behavior what you do is you just assume that people follow some rule and when you're constructing your model you say here's the rule that people follow so the schelling model for instance right we just assumed that people moved if the percentage of the people like them in their neighborhood fell
below some threshold that's a rule population below threshold you move standing ovation model right the percentage of people standing is above a threshold you stand up that's another rule so we can contrast rule based behavior with rational behavior remember in rational behavior we assume people have an objective function and then they optimize with respect to that objective function we can also contrast it with behavioral models where in behavioral models we assume that people have these sort of biases that they suffer from right now of course a bias could be a rule and rational behavior can
even be a rule but when we think of rule-based behavior what we really think of is you know writing down a rule and assuming people follow that we'll look at four different types of rule-based behavior in a way right so we'll look at sort of fixed rule-based behavior where we just assume you follow the same rule every day right and we can think about adaptive rule-based behavior where you change your rule depending on sort of what's happening so we can apply these two types of rules fixed and adaptive in two different contexts so one context is
a decision context right remember in a decision context my payoff only depends on what i do so the only person materially affected by my choice is me and so i could have a fixed rule that i apply in a decision context or i could constantly be learning i could be adapting my rule over time second thing we'll consider is games right so a game that's a strategic context now my payoff depends on what other people do if my payoff depends on what other people do it could be that i don't want to use a fixed
rule then i want to be a little bit more adaptive but if i found one rule works well almost all the time like in a bargaining situation i may decide just to say i'm going to bid 10 less than my value it could be that you just follow that fixed rule because you just think look it's worked for me in the past i'm just going to stick with it so what we want to do in this lecture is just walk through all four of these cases that's the goal we're going to
walk through all four and just sort of get some understanding of what modeling rule-based behavior looks like so let's get started let's start with fixed decision rules what would be an example of a fixed decision rule well one fixed decision rule that's actually used quite often is random choice i write down a model and i just assume that people choose randomly now why would i do that well the reason i do that is the same reason that myerson gave for you know choosing rational behavior remember rational behavior is a really good benchmark so it
says what would super smart people do in the situation well we could also take the opposite benchmark and assume that people make completely random choices and then we can compare the two we can say well how does optimal choice differ from random choice in terms of how the system behaves and so by comparing those two sort of extreme points we get a little bit more understanding of what might happen or what could happen now that's sort of not what we normally think of as a fixed rule normally if we think of a fixed rule it'd be
something like take the most direct route so for instance suppose i'm in a city and i'm right here and i'm going to head towards i want to head to this museum up here so here's this museum little m next to it and i come to this intersection and there's a road going this way and there's a road heading off at an angle well if i chose the most direct route rule fixed what i do is i would choose this one right i would head in this direction now it could have been though that if i'd
have gone this way i'd have found a diagonal street that headed directly to the museum and it could be that this street goes up here and dead ends and i get stuck so following the most direct route locally can be a good rule to follow but it might be that there's something better to do okay now the reason i give this example is because when we think of fixed rules they may not be optimal right they may be good but they may not be optimal so let's think about fixed strategies let's compare strategies to
decision rules remember in a strategy my payoff depends on what other people do so now a strategy could be something like divide evenly so let's suppose i'm in a bargaining situation right so we're thinking about you know splitting an asset with someone so for example suppose that there's some land that you know my brother and i have inherited and we've got to decide okay how much do we each get there's 80 acres well what we could say is well let's just divide evenly let's each take 40 acres that would be a strategy that i
could use in a game and it could be a fixed strategy now we can get a more sophisticated strategy so a strategy that's described a lot is a strategy called tit for tat so tit for tat you can imagine let's suppose i can be nice or i can be mean and suppose i start out in tit-for-tat by being nice and i'll continue to be nice as long as you're nice but if you're ever mean then i'll be mean but then as soon as you're nice again i'll be nice i'm just gonna sort of do unto you
what you do to me but i'll start off by being nice now the reason i want to explain tit for tat is because i'll show you how you can sort of encode this in the context of a model so you can use something called a moore machine and so this moore machine works as follows i'm going to put a little star here to say i'm going to start out by being nice now what this green arrow says this sort of describes what the other person's doing so i'm going to stay nice unless
the other person's mean if the other person's mean then i'm going to come over here and be mean and i'm going to stay mean until the other person is nice if the other person's nice i'm going to go back over here so these arrows tell me if i change my state from the nice state to the mean state so what i can construct is by writing down this little sort of computer program i can take a behavior like tit-for-tat and embed it in a model and i can also embed another behavior this would be
a behavior called grim trigger right so here i'm going to start off by being nice and here i'm going to stay nice until you do something mean represented by the green arrow and then if you do something mean i go over here but notice there's no arrows coming out of the mean box right this little mean circle i just stay mean so this is a strategy where i start out nice but if you're ever mean to me that's it and this is formally called grim trigger and the reason it's called grim trigger is that it's
pretty grim right if you're ever mean to me that's it i'm mean forever and the reason it's called trigger is because all it takes is one mean action on your part and that triggers my perpetual meanness again so what i'm showing you here is how you can write like just a simple diagram write a computer model to represent rule-based behavior so those are fixed rules right now let's go to adaptive rules what would an adaptive rule be well let's suppose that you're you know making like i like to make chocolate chip cookies or say you make
oatmeal chocolate chip cookies every week and you're trying to figure out okay how do i make really good oatmeal chocolate chip cookies well one approach that people sometimes use is called a gradient based method and so what a gradient based method is is you keep sort of trying things in directions that are working so suppose i keep adding honey suppose i start by adding one quarter cup of honey right and it turns out that the cookies are pretty good so then i add one tablespoon more so one more tablespoon so this
is plus one tablespoon so i add one more tablespoon and it's even better so then i might add another tablespoon so then i might add plus two tablespoons right and so on so gradient-based would say if i find a hill i keep trying to climb it so that would be adaptive in the sense that instead of just finding a fixed rule i keep experimenting trying to find things that are better another type of experimental rule is let's go back to when we talked about random behavior it could be that i also try something completely
random like i might throw wheat germ in i might throw raisins in i might throw walnuts in i might do something just completely crazy an approach like this random search again is adaptive instead of following a fixed rule right i keep changing what i'm doing trying to find something better trying to make a better decision now where we see adaptation a lot more is in the context of strategies the reason adaptive rules make more sense in the context of strategies is often the other person is changing their behavior to take advantage of
me so i want to change my behavior to respond to them so in the context of a game it's often important to use an adaptive rule so what will an adaptive rule be so one would be best response what does that mean so let's suppose that we have some sort of strategic situation and the other person is taking some action well what i could do is i could then think like a rational choice person i could say what's the best possible response i could do given what the other person is doing and i'm going to
do that okay so that's a rule i could follow just best respond to whatever the other person's doing all right what's another thing i could do well mimicry so one thing is in a game i might not be able to figure out at all what to do so i could just copy other people that are doing well so i might look around and think i'm not sure you know how to dress or i'm not sure how to behave in the situation i'm not sure how to bid in this auction what i
could do is i could watch what other people are doing i could just copy what they do so those are two really standard ways one is just to best respond to think about the game think about the situation and figure out what's the best possible thing i could do right but that's going to be adaptive because if the other people change what they're doing the best possible thing to do next period could change another thing you could do is you could just do mimicry you could look around and say who's doing well and
i could copy people who are doing well a couple observations about rule-based behavior first sometimes optimal rules are simple so sometimes you might have a situation like this where i'm trying to maximize my happiness my happiness depends on two things chocolate and movies and here's my happiness function right it's the square root of chocolate times the square root of movies and there's some price for chocolate and some price for movies and i could sit down and think okay what's the optimal thing to do here if i actually solve for the optimal thing to do it turns
out i should spend equal amounts of money on chocolate and on movies and so in some cases the optimal thing to do is to follow a rule and so assuming rule-based behavior isn't that crazy of a thing to do you could just say let's assume people spend half their money on one thing and half their money on another thing if i'm writing down a model of the macro economy i might say let's suppose that people spend 20 percent of their money on food 30 on transportation 40 on housing and 10 on entertainment well
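The equal-spending claim for the chocolate and movies example can be checked numerically. The sketch below is only an illustration: the function names, the budget, the prices, and the grid search over spending shares are all assumptions made here, not something from the lecture.

```python
import math


def happiness(chocolate, movies):
    """The happiness function from the example: sqrt(chocolate) * sqrt(movies)."""
    return math.sqrt(chocolate) * math.sqrt(movies)


def best_chocolate_share(budget, price_chocolate, price_movies, steps=100):
    """Try spending shares 0%, 1%, ..., 100% on chocolate and keep the best one."""
    best_share, best_utility = 0.0, -1.0
    for i in range(steps + 1):
        share = i / steps
        chocolate = share * budget / price_chocolate       # units of chocolate bought
        movies = (1 - share) * budget / price_movies        # units of movies bought
        utility = happiness(chocolate, movies)
        if utility > best_utility:
            best_share, best_utility = share, utility
    return best_share


# Hypothetical budgets and prices; the best share does not depend on them.
print(best_chocolate_share(100, 2, 10))  # 0.5
print(best_chocolate_share(60, 7, 3))    # 0.5
```

The share comes out at one half whatever the prices are, because happiness here is proportional to the square root of share times (1 - share), which peaks at an even split.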
you may say well that's sort of a crazy fixed rule well it's not because if i actually wrote down some elaborate utility function and had people optimize they'd do something that's fairly similar to that but another thing though simple rules often can be exploited so let me give an example suppose my rule is in bargaining that instead of demanding half or asking for half i say look initially i want sixty percent and then what i do is i say okay well okay next one is 59 then 58 i keep going down by one percent
each period think about what somebody else could do if they're bargaining against me when i say 60 they say well you know how about 58 and then i say okay well then i'll follow my rule and i'm going to say 59 and then they could say well okay 57 and i'd say 58 and they could walk me all the way down until i was getting nothing because they could just walk me down from 60 to 59 to 58 to 57 to 56 to 55 and so on so a simple rule can often
be exploited so let's give a quick summary why model people using rules well the first one is it's often really easy to model right you just sort of say here's my rule let's just assume like in the schelling model or in the standing ovation model let's just assume people follow rules it's computationally really easy to do and it's easy to think through what's going to happen second reason you can capture the main effects often if you sit down and think how would someone behave in this situation or how do i
behave in this situation you think it's just a rule and you write down that rule and you're going to capture the main things that are going on third thing though often it's going to be kind of ad hoc right so one of the problems in writing these things down is not everybody's the same as you so when we talked about the difficulty of modeling people one is that people differ and because people differ so much any one rule you write down may be somewhat ad hoc and it may not represent what's really going
on and then the last thing which we just talked about is sometimes these simple rules can be exploited and so if you think about a strategic situation writing down a fixed rule can be problematic because if everybody followed that rule other people would really take advantage of it so where are we we've talked about rational models of behavior where people optimize we've talked about more behavioral models based on observation and even some you know recent neuroscience and now we've talked about rule-based models where what we do is we sort of write down what people do but
as a rule and then just ask sort of how does that rule aggregate depending on the situation any one of these three things might make perfect sense or maybe you want to do all three to try and get a deeper understanding of what's going on what we're going to do in the next lecture is we're going to talk about how in some cases it's not going to matter very much what behavioral rule we write down and in other cases it could matter a lot all right let's get started hi in this lecture we're going to
compare and contrast the three ways we thought about modeling people so remember we talked about first people being rational people having objective functions and optimizing with respect to their objectives then we talked about how that's somewhat unrealistic we talked about behavioral models and i said you can think of these as either sort of rationality plus where we sort of assume rational behavior and then add in a bias or rationality minus sort of being just a little bit below rational behavior and then the third thing we did is we said well you know maybe we could
even go simpler and we could just assume that people follow rules these could be fixed rules they could be adaptive rules and when you write down a model we could just say here's the rule that people follow now in some cases the rule that people follow will be rational in some cases the rule that people follow will be behavioral in other cases the rule that people follow would just be some rule that we made up what we want to do in this lecture is we want to ask does it matter does it matter which rules
we write down does it matter whether you have rule-based behavior optimizing behavior or behavioral models well the answer is it depends and one of the reasons why we model is to figure out how much does it matter how effectively we model or how accurately we model i should say so let me do two examples first i'm going to talk about a market just a pure exchange market and then i'm going to talk about a game called race to the bottom and we're gonna see in the market it really doesn't matter that much how we model
behavior but in the race to the bottom we're gonna see it matters a lot so let's get started so two-sided market what is a two-sided market well two-sided market has a group of buyers these are people who want to buy some good and let's suppose just for the sake of argument that these buyers in this setting have values between zero and a hundred dollars that means they're willing to pay somewhere between zero and a hundred dollars for this good in addition to the buyers there's some sellers and let's suppose the sellers are willing to sell for
between 50 and 150 dollars so now let's think about what would rational people do so suppose people are completely rational in this setting how would they behave well if you're a buyer you basically bid a little bit less right than your true value because you try and make a little bit of money and if you're a seller you'd probably ask for a little bit more than your true value and how much more or less you ask for is going to depend on you know what the distributions of these buyers and sellers are so if you're
fully informed about what these distributions are you're going to shade in particular ways now what's going to happen is let's suppose that people keep calling out prices we're just listening to people call out prices until the market clears until the number of goods people want to sell at that price is exactly equal to the number of goods that are bought so the way it's going to work is each person calls out each buyer calls out a price they're willing to buy at each seller calls out a
price they're willing to sell at and then we pick some sort of price in the middle so we get an equal number of buyers and sellers well what's going to happen in this situation is that since people are rational the only buyers that are going to actually be able to buy anything are those that have values above 50 so the relevant buyers are going to have values in 50 to 100 so we only need to worry about these people so we call
these b star and the relevant sellers are going to be the only ones that have values between 50 and 100 as well because the ones that have values above 100 right from 100 to 150 they want too much nobody's ever going to pay that much for their goods so what happens is people are going to call out these prices and what we're probably going to get is we're probably going to get a price of around 75 right where some of the buyers buy some of the sellers sell but not all of them do let's suppose instead we assume
people are biased that they're not super strategic they didn't figure out exactly how much to shade their bids well then what people would probably do is they might rely on sort of focal bids so people will be more likely to bid 50 or 60 or 70 or 75 even increments so let me be more specific so it might be that if you're a rational buyer and your value was 56 that what you should do is you should say i'm willing to pay 53 dollars and 72 cents right you solve some really fancy mathematical equations and that's exactly your optimal bid
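The price of around 75 mentioned above can be checked with a small sketch. It makes two simplifying assumptions that are not in the lecture: everyone calls out their true value with no shading at all, and values are evenly spaced (buyers at 0, 1, ..., 100 and sellers at 50, 51, ..., 150). The function name and the midpoint pricing rule are also choices made for this illustration.

```python
def clearing_price(buyer_values, seller_costs):
    """Match the highest-value buyers with the lowest-cost sellers.

    Returns (price, number_of_trades). The price is the midpoint between the
    last buyer and the last seller who still trade, or None if nobody trades.
    """
    buyers = sorted(buyer_values, reverse=True)   # most eager buyers first
    sellers = sorted(seller_costs)                # cheapest sellers first
    trades = 0
    for bid, ask in zip(buyers, sellers):
        if bid < ask:          # this pair cannot agree, so neither can later pairs
            break
        trades += 1
    if trades == 0:
        return None, 0
    price = (buyers[trades - 1] + sellers[trades - 1]) / 2
    return price, trades


buyers = list(range(0, 101))     # values from 0 to 100 dollars
sellers = list(range(50, 151))   # costs from 50 to 150 dollars
price, trades = clearing_price(buyers, sellers)
print(price, trades)  # 75.0 26
```

Only buyers with values of 75 or more and sellers with costs of 75 or less end up trading here, which matches the point that the relevant traders on both sides sit between 50 and 100.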
well a behavioral person might not do all that math and that might say well my body is 56 so i'll bid 52 or i'll get 50 55 right just some focal number they're not going to do all the math well again if there's just those slight deviations you're still probably going to see a price of about 75 dollars so optimizing behavior and sort of slight behavioral bias is not going to be a big difference what about rule-based behavior well this has actually been studied using a simple rule-based paper called zero intelligent agents this is zi
we'll abbreviate it so zero intelligence agents work as follows if you're a buyer what you do is you just basically say okay i'm just gonna pick some random amount less than my value and if you're a seller you just pick some random amount more than your value so if i'm a buyer whose value is 40 i might say oh i'm willing to buy it for 20 and if i'm a seller whose value is 60 i might
say i'm willing to sell it for 63 so what happens is everybody just bids some random amount well it turns out if you analyze this sort of market with these zero intelligence traders what you end up getting is something with a price pretty close to 75 and not that different from what you get with rational actors so in a two-sided market right and things you know we've got models in economics we've got sort of supply and demand curves and things like that right and we get some price it turns out the market itself has so much
influence the institutions have so much influence the behavior really doesn't matter a great deal within some fairly wide range so in markets we don't care as much about modeling behavior but now in games we do so let's do a specific game it's called the race to the bottom game and here's how it works there's a group of people in a room each person is going to pick a number between zero and 100 whoever's closest to two-thirds of the mean wins right so
what do you do here's a little quiz what do you do in this situation well let's let's look so what does a rational person do so what a rational person is going to do in this situation it turns out is bid zero but why is that well it's a completely symmetric game right so everybody's got the exact same incentive so if everybody's rational everybody should be doing the same thing so suppose everybody was picking six if everybody was picking six then the mean would be six but if the mean is six two thirds of six
is four so if you're rational and you know everybody's picking six you should pick two thirds of six so you should pick four but then everybody should pick four but if everybody's picking four then you should guess two-thirds of four right which is eight-thirds but so should everybody else if everybody's picking eight-thirds then you should guess two-thirds of eight-thirds and so on and so on and so on until eventually you get down to everybody should be bidding zero that's what rational behavior would be what would biased behavior be well if you
were sort of not strategic at all and super biased in the situation you'd say okay picking a number between zero and a hundred i don't know this two-thirds of the mean thing i'm confused about what i'm doing i'm just going to guess 50 and in fact if you watch people play this game there's a certain percentage of people who do guess 50 in fact i've done this in my classroom a whole bunch of times and you'll get a significant percentage of people who just say i'm going to guess 50 now what would a
behavioral rule be in this situation what would a rule-based behavior be well again this has been studied a lot and a rule that people tend to follow is this they say well you know people should guess 50 but if everybody guesses 50 i should guess two-thirds of 50 so therefore i should guess 33 so you see a lot of people guess 33 but then there's a lot of other people who say well everybody's gonna guess 33 so if everybody's going to guess 33 i should guess two-thirds of 33 which is
22 and so if you look at real experiments you see some 50s some 33s some 22s and some 14s right two-thirds of 22 is about 14 so you see some people actually say this like people should guess 50 so since they should guess 50 i should guess two-thirds of 50 which is 33 but you know everybody should guess 33 so then i should guess two-thirds of 33 which is 22 but then everybody's going to guess 22 so i should guess two-thirds of 22 which is about 14 now of course if we kept going with this we get to
the rational behavior eventually which is zero right and in fact if you play this long enough these numbers do creep down towards zero but they typically don't get there you have to run it a lot of times so what's interesting here is the behavior we see this rule is sort of a mix of rational and biased right a biased thing is to guess 50 so people sort of start out with this base rate bias of 50 because it's in the middle after that they start sort of best responding remember one of our adaptive
rules for learning was to best respond so the best response if everybody else chooses 50 is to choose 33 and the best response if everybody else is choosing 33 is to choose 22 and so on so what we see is the rules that people use sort of start out with a flavor of bias and then they become somewhat rational let's do something really fun here so let's suppose we have two rational people in this game and one irrational person so let's suppose you're sitting in a room with three people and you're a rational person let's say
you've seen this game before you know how it works and you're playing with me and you know i've seen this game before i know how it works but there's this third person sitting in the room and this third person we don't know anything about we know that they haven't seen the game before and they you know they hear the instructions and we're both looking at them trying to figure out what is this person going to do well let's try and analyze this so suppose i'm a rational person and you're a rational person so we're going
to pick some amount r right now the other person this irrational person we've got to make some decision like what do we think they're going to pick well suppose we both think they're going to pick x well if they're going to pick x we have to decide what do we pick well here's what has to be true the amount we pick r has to be two-thirds of the mean right and the mean is the sum of everybody's picks r plus r plus x divided by three so it's gonna be
two-thirds of the mean so it's got to be two-thirds of r plus r plus x divided by three so if i multiply this out i'm going to get nine r has got to equal two times r plus r plus x right and two times r plus r ends up being 4r so nine r equals 4r plus 2x so i'm going to get 5r equals 2x so r equals 2x over 5 so what that means is if we both think oh this other person is what i don't know totally irrational suppose we think the other person is going to choose
50 if we think the other person is going to choose 50 then r equals 2 times 50 over 5 so r is going to equal 20 right so if the other person chooses 50 and we both choose 20 the sum will be 90 right so the average is 30 and so two thirds of the mean will be 20 so why did i do this example let's think back if everybody is rational the mean bid is zero and two-thirds of the mean is zero so you split the money but here if we throw in
one irrational person then we no longer get zero right we get something a lot bigger than zero because the rational people have to take into account what the other person's bid is going to be and so they want to make their bid as a function of that and that drives up their bid which drives up the mean which in turn drives up their bid so what's the lesson we take away i think the simple lesson is this is that rational behavior is a really good benchmark but it's also important to include biases in our model
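Here is a minimal sketch of the two calculations from this game: the chain of best responses that starts from a guess of 50, and the pick r that two rational players settle on when they expect a third player to pick x. The function names and default arguments are made up for illustration and are not from the lecture; the second function just solves r = (2/3)(r + r + x)/3 for r, which is the same algebra that gives r = 2x/5.

```python
def iterated_best_responses(start=50.0, factor=2 / 3, steps=4):
    """Guesses from repeatedly best responding to the previous guess.

    start=50 is the 'biased' first guess described in the lecture;
    each later guess is two-thirds of the one before it.
    """
    guesses = [start]
    for _ in range(steps - 1):
        guesses.append(factor * guesses[-1])
    return guesses


def rational_pick(x, n_rational=2, factor=2 / 3):
    """Common pick r of the rational players when one other player picks x.

    Solves r = factor * (n_rational * r + x) / (n_rational + 1) for r.
    With two rational players and factor 2/3 this reduces to r = 2x / 5.
    """
    n_players = n_rational + 1
    return factor * x / (n_players - factor * n_rational)


def two_thirds_of_mean(picks, factor=2 / 3):
    """Winning target: two-thirds of the average of all picks."""
    return factor * sum(picks) / len(picks)


if __name__ == "__main__":
    # Roughly 50, 33, 22, 15: the pattern seen in classroom experiments.
    print([round(g, 1) for g in iterated_best_responses()])

    # Two rational players expecting the third to pick 50.
    r = rational_pick(50)
    print(round(r, 6), round(two_thirds_of_mean([r, r, 50]), 6))
```

The last line checks the fixed point: if both rational players pick r and the third picks 50, two-thirds of the mean comes back to r, so neither rational player wants to change their pick.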
think about are there biases that would be relevant and it's also really important to ask what if we just write down a simple rule and then we compare these three things rational behavior bias right and then simple rule and we see sort of well how much difference do we see in the outcome if the difference is small then we can say okay look our result seems to be sort of invariant to behavior if the difference is big then what we've got to do is we've got to sit back and think okay which one of these
three makes the most sense and regardless of why we're modeling whether it's to just be a more intelligent citizen of the world whether it's to use and understand data right let's just get the logic right or whether it's to design something or strategize in some way it's probably really useful to think about all three classes of models in the context that you're considering thank you hi in this lecture we're going to talk about bringing models to data now when i say data what i mean is you know you pull all sorts of numbers off the
web or maybe you've got you know for your business or your home you've got all this data out there on how much money you've spent each month or something like that we're going to talk about how we can use models to understand that data i'm going to start simple we're going to build up so we're going to start with something that i call categorical models now in a categorical model what you do is you just sort of place all the different data in different boxes so for example suppose you've got a bunch of data on how long
people live and you want to try and understand what allows them to live a long and healthy life we might create one box of people who exercise and one box of people who don't exercise and then by looking at how much variation there is in each of those boxes and what the means are in those two boxes you can figure out does that categorization between exercising and not exercising actually help explain the variation in the data after you do categorical models we're going to move on to
linear models now i want to make a distinction between a linear model and a line so a line is just something that we plot right so we just have like you know here's the y variable and here's the x variable and we draw this line in a linear model what we assume is that this y variable depends on x right and so what this is is some sort of relationship so on this axis x could literally be the amount of exercise you do and y could be how long you
live and so what you have is this line is sort of saying that how long you live is a function of how much you exercise so you're literally thinking of it y is some function of x okay so after we do linear models and i explain what they mean i'm going to have a short lecture sort of describing how we fit data to linear models so if you've got a bunch of data and then you want to construct a linear model how do we do it here's a simple example so if i've got a bunch
of data here like this and i want to ask you know what's the best line to go through that data well clearly this would be a terrible line because it doesn't it's not near the data this line here that they've done this black line looks like a pretty good line it gets pretty close to the data so we're going to see is how exactly do you draw the line and what criteria do you use right to make sure you've got the best line after we've done linear models we're going to move to non-linear models and
we'll show that the techniques are actually fairly similar now what do i mean by non-linear well one simple way something can be non-linear is it can start out straight and then kind of flatten out or something could start out slow and then get big or it could sort of do both it could start out slow and then get big and then flatten out right so there's all different shapes a function could take um you know john von neumann one time said the set of nonlinear functions is like the set of non-elephants what he meant is that the
number of nonlinear functions is enormous so we'll talk a little bit about how we can use some of the same techniques for linear models to create nonlinear models now after we do that we're going to conclude this this unit by talking about something i call the big coefficient so what do i mean when you have a linear model right you have like y equals a1 x1 plus a2 x2 so x1 and x2 are the variables these are the things that determine the outcome so we'll talk about for example school quality so x1 might be what
the class size is and x2 might be how good the teacher is and so this a1 and a2 are the coefficients and these coefficients tell how important is the variable so the bigger the coefficient the more important the variable so when i say the big coefficient what i mean is making policy or making a decision based on which one of these coefficients is biggest now that makes a lot of sense and so what i'm going to do is i'm first going to argue that boy you know better to use the big coefficient than to just
sort of do seat of the pants thinking we're going to see why that's the case that linear models are just better right we've seen this a little before but we're going to go into more detail linear models are better than just sort of thinking off the cuff but then i'm also going to criticize it a little bit right i'm going to say that one problem with big coefficient thinking is it only works in the area in which we've got the data and oftentimes if we want to make the world a lot better place we have to shift to
an entirely new reality we've got to shift to a place where there is no data so i'm going to draw a distinction between what i call the big coefficient and what i'm going to call the new reality right or situations that are just maybe a lot better than what we currently have all right so that's a summary of where we're going to go we're going to start with categorical models look at linear models show how to fit lines to data right then we go to some non-linear models and then we'll wrap it all up by
talking about this idea of the big coefficient all right thank you hi in this lecture we're going to talk about a very simple class of models that helps us make sense of data and these are known as categorical models in a categorical model what you do is you basically bin reality into different categories and then you hope that these categories help you make better sense of the data that they explain some of the variation in the data i want to start out by just describing what a categorical model is like and then we'll talk about
how they can help us make sense of data so let me give an example a long time ago over a decade ago i was at a conference and amazon which is a company that you know sells all sorts of stuff over the web right was just going public and there was a discussion whether amazon was a good investment or not so one person who was a wall street investor said no i think it's a horrible investment if you think of amazon all it is is it's just a delivery company right they've just got a big warehouse
you order stuff they deliver it the margins in that industry are really small right there's sort of ups and fedex and dhl and all those sort of places i just don't think there's any money in it now another person said you know i'm going to put amazon in a very different box i'm going to put amazon in this box that says information because i think it's part of the new information economy they're going to gather all this information about what consumers want it's all going to be centrally held they're going to
be worth a ton of money now turns out if you put amazon in this information box you probably would have invested in it and you'd have made a lot of money if you put amazon in this delivery box you wouldn't have invested in it and you wouldn't have made a lot of money so which box you use right how you categorize things affects how you think about things and again what sort of decisions you make so this leads to a phrase that one of my friends who's a psychologist once said lump to live so what my
friend meant is this is that we create these lumps these boxes these categories in order to make sense of the world so i look out there on the street and i see a vehicle i don't say oh it looks like a 1997 ford f-150 pickup truck right instead i just say truck or i just say car or if i look at a piece of furniture i just say it's a dresser i don't say it's an 1874 chippendale dresser i don't completely break it down i just put things in categories and these categories are shortcuts right
they help us make sense of the world now let's think about why we model again right remember one of the reasons we model was to help us decide strategize and design right so one reason we lump is it just helps us make quicker faster decisions we just put things in categories and say this is something i like this is something i don't like this is something that's risky this is something that's not risky let me give some examples just some fun examples so the first one is let's suppose you're a kid and you've got
to decide what am i going to eat and what am i not going to eat well one sort of categorization you might use is the green categorization so you might say anything that's green broccoli is green grasshoppers are green asparagus is green all these things are green right everything else bananas those are yellow candy bars are brown oranges they're orange pears can be green but this one's yellow and strawberries they're red these other things aren't green and so your rule could be i'm going to eat anything that's not green and i won't eat anything that's
green and so that rule will keep you safe from things like grasshoppers and asparagus right so that's a rule you might follow now it's not an optimal rule because you might run into a green pear and that green pear might be something you'd really like but if you've been avoiding green things you may decide well not going to risk it so that's an example of a simple rule let's now show how you can use a rule like that to make sense of data so now let's suppose i've got a bunch of data here
and these are different food items and what you've got in this column right here are calories so this is how many calories there are in each of these food items so what i want to do is i want to make sense of why do some things have a lot of calories and some things not have a lot of calories so i've got this list of items well the first thing that i need to try to make sense of is how much variation is there in this data well to understand how much variation there
is first i've got to find out sort of what's the average value and then variation tells me how far are things on average from that value so if i add all this up i've got 100 plus 250 that's 350 440 550 900 right so we've got 900 divided by 5 so that means the mean here is 180 so on average everything in this group has about 180 calories and i want to ask some things are higher right this says 350 and some things are lower this is 90 i want some understanding how much variation there
is in that data so one way is we just subtract the mean from everything so if i take 100 minus 180 that's going to be minus 80 250 minus 180 that's going to be 70 90 minus 180 that's minus 90 right 110 minus 180 is minus 70 and 350 minus 180 is 170 well if i add all these things up i'm going to get minus 80 plus 70 minus 90 minus 70 plus 170 it's going to be zero because deviations from the mean always cancel out so what i need is i need all
these differences to be positive so one thing i could use i could just take the absolute value of all these things right and then i could add up the absolute values and i could get 80 plus 70 is 150 plus 90 is 240 plus 70 is 310 plus 170 is 480 so i could say the total difference from the mean is 480 but what we do in statistics is we tend to do something different we actually tend to take the difference and square it and the reason we square it is really twofold one is that again
it makes everything positive which is what i did before and the other is that it amplifies larger deviations because what we'd really like to do is prevent those huge deviations so this is going to amplify large deviations so if i look at the pear i would have 100 minus 180 which is minus 80 squared which is 6400 so that's how much variation there would be that's the difference from the pear to the mean squared and if i did it for the cake i'm going to get 250 minus 180 which is 70 and if
i square that right i'm going to get 4 900 now i could do this for all of them right so for the pear i get 6 400 for the cake i get 4 900 for the apple 8 100 for the banana 4 900 for the pie 28 900 so this is again if you get a long way from the mean and you square it you get a huge effect so squaring amplifies larger mistakes now if i add up all these numbers i'm going to get 53 200 that's what we call the total variation so i
plotted that data this tells me sort of how much variation is there in that data and what i'd like to do let's keep track of the plot here i'd like to put the data in categories that reduce that variation that somehow explain why some things are high and some things are low so what's the obvious categorization the obvious categorization here is that pears and apples and bananas are fruit and cakes and pies right are desserts so let's create a fruit category and a dessert category so in the fruit i've got one thing that's 90
one thing that's a hundred and one thing that's 110 and in the dessert category i've got one thing that's 250 and one thing that's 350 so let's look at them in more detail if i've got 90 100 110 the mean is going to be 100 here right the average of those three is 100 what's the total variation well 90 minus 100 is minus 10 so if i square that i get 100 a hundred minus a hundred right is zero so if i square that i get zero and one ten minus one hundred is ten so if
i square that i get a hundred so the total variation here is just going to be 100 plus 100 or 200 so now what i've done is i've got a mean of 100 and a total variation of 200 and now if i go to this case the mean is going to be 300 right for the desserts and what's the total variation well for the cake it's 250 minus 300 which is minus 50 squared which is 2500 and for the pie it's 350 minus 300 which is 50 squared which is also 2500 so when i add those
up i get 5 000. all right so let's clean this up a little bit so what i did is by creating two categories a fruit category and a dessert category i now have a mean in the fruit category of 100 and a variation of 200 and a mean in the dessert category of 300 and a variation of 5 000. now let's think about what i started out with right when i had all the stuff together i had a mean of 180 and i had a variation of 53 200. now look at how much my variation
has gone down it went from 53 200 to 5 200 so this is the idea these categories substantially reduce the amount of variation i have left over so think of the variation as what's unexplained so initially i say look i can just say things have on average 180 calories and we've got about 53 000 units of variation that's unexplained now i say i'm going to create a categorical model that says there's fruit and desserts and fruits have fewer calories than desserts and you can say well look it appears to be the case fruits have a mean of
a hundred desserts have a mean of 300 and the variation in the fruits is only 200 and the variation in desserts is 5 000 so i've reduced variation a ton what we want is a formal measure of how much we've reduced variation that's actually fairly simple right so i've got a total variation of 53 200 fruit variation is 200 dessert variation is 5 000 so that gives me 5 200 so 53 200 to start and i get 5 200 left so what we want to ask is how much did i explain that's the question how much
of that variation did i explain well i started out with 53 200 right and i now only have 5 200 left over and so the amount i explained is 53 200 minus 5 200 right which is 48 000 so the percentage of the variation i explained was 48 000 divided by 53 200 which is a huge amount now i can write this more simply as just one minus the fraction that's left over one minus 5 200 over 53 200 right because that's just a simple way to do
it and so i'm going to get that the amount of variation i explained was about 90 percent 90.2 percent so that's how much of that variation i explained this is equal to again that 48 000 right divided by 53 200 it's the amount of variation that i explained now formally this is called the r squared so this is the percentage of variation that i explained just by that simple categorization so if the r squared is near 1 that means i explain almost all the variation and so the model explains a lot right if the
r squared is near zero that means i didn't explain any of the variation really and the model doesn't explain very much at all now the better the model the larger the r squared it'll have but depending on the data there could be so much variation that even a great model only has an r squared of 5 or 10 percent there also could be situations where the thing you're trying to explain is pretty understandable and a good model has to have an r squared of 90 percent so there's no fixed rule as to whether you know
what a good r squared is it depends on what the data looks like but within a class you know sort of a class of models or you know a particular data class you can sort of figure out okay this is a good model this is a bad model based on experience let's push this a little bit further we had you know fruit and desserts right those are our two categories but if i had you know a whole kitchen worth of food it may be the case that like i'd want to have more categories so i
might create a vegetable category and a grains category and then i could put everything in one of these four boxes so one of the differences between sort of experts and non-experts is experts tend to have more boxes they also tend to put the right things in the right boxes so they tend to have useful boxes so if you want to be good at sort of predicting things or understanding how the world works what you have to have is a lot of categories and you have to have those categories be the right categories they've got to
explain a lot of the variation and we can measure how much of the variation your model explains by using that r squared one last point even if you explain a lot of variation it doesn't mean that you've got a good model let's go back to the schools case so suppose i'm trying to figure out what makes a good school what really leads to good school performance so i try all sorts of different boxes i look at schools that spend a lot of money versus schools that don't spend a lot of money schools that
have small class sizes and big class sizes schools that are big schools that are small right and nothing really seems to explain too much of the variation and then i create a box that i call the equestrian box and i put all the schools in here that have equestrian teams and i find oh my goodness every school with an equestrian team is great well the thing is that doesn't mean that the equestrian team made the school good right so statisticians make a distinction between correlation which is is there a statistical relationship between having an equestrian team
and being a good school and causation did the equestrian team cause the school to be good so remember when you think about putting things in boxes like this box right here has a bunch of good outcomes and this box here has mostly bad outcomes that doesn't necessarily mean that the thing that created this box if it's the equestrian box is the reason that the schools were good it could be that there's some other reason so why would you have an equestrian team well you'd only have an equestrian team if you had a
lot of money and you'd probably also only have an equestrian team if you have a lot of parental involvement things like that with a lot of support from the community and you don't have an equestrian team without a lot of open space so it could be that having an equestrian team is a proxy for things like money parental involvement open space right those sorts of things that actually do make a school good so even if your boxes work that's no guarantee that they're actually the cause of why it works okay so what have we learned what we've learned
is this we've learned that the simplest kind of model you could have is just a category based model right where you sort of lump the world in different different categories and you place your data in different boxes depending on what type of data it is so that could be information companies versus delivery companies that could be fruits and desserts right and in doing that what you can do is you can reduce the amount of variation you see in the data so there's a total variation which is sort of mistakes how much unexplained variation there was
out there in the world by putting it in boxes you organize it in such a way that you reduce the variation the amount by which you reduce the variation is what we call the r squared that's the percent of variation explained and the more variation you explain the better your categorization is of course if you create more boxes you can explain more of the variation so where we're going next is linear models which in effect can create a different box for each value of x our independent variable okay thanks hi in this lecture we're
going to talk about linear models now in linear models what you do is you assume that there's some independent variable x and this is some variable that could be anything it could be hours spent exercising it could be money spent on advertising could be anything you want then you assume that there's some other variable y that is a function of x so if x is hours spent exercising y could be life expectancy if x is amount of money you spend on advertising then y could be sales now what we do is we assume that
there's a particular relationship between y and x that it's not any old function in fact what we do is we assume it's a simple line so here's x here's y we assume that we can write y equals mx plus b that y is just a linear function of x so formally right if you see this here's a typical graph right here's y on this axis here's x b is the intercept so this is if i'm writing y equals mx plus b if x equals zero b is the value that y will take and then m
right is just the slope so it tells you sort of how fast it's going up how fast this thing is going up if you move over if you increase x by one how much does y increase now remember i talked about the difference between a linear model and a line so in a line we just have that equation y equals mx plus b in a linear model what we're assuming is that there's some independent variable x there's some dependent variable y and y depends on x so we're assuming that x somehow causes y to occur
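As a small sketch of what the intercept and slope mean in y equals mx plus b: the values of m and b below are made up purely for illustration and are not from the lecture. Setting x to zero returns the intercept, and increasing x by one changes y by exactly the slope.

```python
def linear_model(x, slope, intercept):
    """Predicted y for a given x under y = slope * x + intercept."""
    return slope * x + intercept


# Hypothetical coefficients, chosen only to illustrate the two roles.
m, b = 2.0, 10.0

# At x = 0 the model returns the intercept b.
value_at_zero = linear_model(0, m, b)

# Moving x up by one unit changes y by the slope m.
change_per_unit = linear_model(4, m, b) - linear_model(3, m, b)

print(value_at_zero, change_per_unit)
```

The same function works for any independent variable, such as hours of exercise or advertising spending, as long as you are willing to assume the relationship really is a straight line.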
all right let's do an example simple example suppose you think about buying a tv and you want to know how much is that tv going to cost i want to construct a linear model of the cost of a tv so let's let x be the length of the diagonal right so you get your square screen right and then tvs are measured by the diagonal that way they can pretend the tv is bigger than it really is okay so x is the length of the diagonal y is the cost of the tv and suppose you say
here's my linear model i think the cost is 15 times the length in inches plus a hundred dollars right so i can put a dollar sign here to show this is 100 so that could be my model of what it cost for a tv now there's two things you want to think about when you have these linear models first is the sign of that coefficient so if i have y equals 5x right that five is positive that says that y is increasing in x so in the case of the tv we expect the price of
the tv to go up as the tv gets bigger so we'd expect the sign to be positive second thing we care about is the magnitude how big is that coefficient so how much does the price go up every time you make x bigger by one so when you think about using these in policy settings right so suppose you're looking at school lunches and you want to know do school lunches improve performance well the sign on that would tell you is performance higher if people get school lunches right so is there a positive coefficient so why do we construct
models right a bunch of reasons one though is to predict right another is to understand data so let's talk about just using this simple model to try and predict so suppose you're thinking about i wonder what it costs to buy a 30 inch tv what we can do is we can plug it in and so the cost would be 15 times the diagonal plus a hundred so what that would be is 450 plus 100 which should be 550 so that would say if i want to go buy a 30 inch tv it's going to cost
me 550 dollars and so if that's if you had a good model this would be what a tv cost if you had a bad model this would be way off base but let's suppose this is your model another thing you can do is you can use it to predict things that maybe don't even exist so suppose you think what if they made a hundred inch tv how much would that cost well if you assume the same relationship holds you could just say well i'm gonna the cost should be 15 times 100 plus 100 which is going
to be 1600 so you can think 100 inch tv would be 1600 okay so maybe you want to sit around and wait for that 100 inch tv to come around or maybe 1600 is too rich for your blood and you're going to say well you know what i'm happy with the 30 incher okay but again what you can use this model for is to predict now the model might not be accurate right because the world may not be linear but as a benchmark this isn't really a bad thing to do now also understanding data right so
there's all this data here right these are all these dots one thing you can do with the linear model right is you can think okay how well can i fit a line through there now let's go back to our last lecture remember in our last lecture we talked about r squared how much of the variation you can explain well there's a lot of variation in this data well you can do the same thing with a line you can ask how far does this data lie from my line right and so how much of that total
variation did i explain by just drawing that line through there so you can do the same exact thing that we did for the categories for the lines and that's what we're actually going to do in the next lecture but for now i just want to get across this idea that you can use a linear model to try and make sense of data like this and also to make predictions here's the thing remember i said in the first lecture that models are better than we are well let me support that even with these simple models so robyn dawes who's
at carnegie mellon in 1979 wrote a paper comparing very very primitive linear models so all he worried about was sort of getting the coefficient close to right so for example here's one case you have 43 bank loan officers and they're trying to predict whether these firms were going to go belly up or not whether they'd repay their loans and so they were given 60 loans 30 of which were already known to have failed and 30 of which were already known to have succeeded and they asked these people to predict what was going to happen now these
bankers they're pretty smart they were 75 percent accurate so that's actually pretty good but if you took a simple linear model just based on the ratio of assets to liabilities of the people taking out the loans that was right 80 percent of the time so the models beat the people you can say fine that's one example well this has been studied in detail right so in 1954 meehl did a study of 20 studies of clinicians these are doctors right making predictions versus just doing a simple linear model and sawyer in 1966 did 45 studies of predictions out there in
the social world if you think across all 65 of these studies it is never never the case that the experts did significantly better than the linear models so there's cases where they were close right where the experts may be a little bit better but it wasn't significantly better and there were a ton of cases where the linear models did better so when you do a horse race right linear models tend to be better than experts remember we saw this this is that tetlock this is the graph from tetlock again here's formal models way up here
right and this is sort of again this is measuring what the r squared is how good those models are at explaining variation what you get is that formal models are better than people are now again you don't want to only rely on the formal model what you'd like to do is do the linear model and compare that linear model to your own judgment okay so what have we done what we've done is we've shown that you can draw a line through data and use that line to explain some of the variation in the data now
typically the world isn't going to be perfectly linear there's going to be lots of extra variation left over but there's a question of how much of that variation did the line explain in addition to explaining the variation the line tells us something about the relationship between our independent variable x and our dependent variable y in particular we learn the sign on x like does y increase in x or decrease in x and we also learned something about the magnitude so how much does each one unit increase of x increase the value of y so what
this linear model can do is help us understand something about data we see in the real world now what we've done so far though right is just consider a single variable linear model right so y was just a function of x where we're going to go next is to think of y depending on a whole bunch of different x's so you could think of your outcome having a whole bunch of different variables that contribute to it and we'll start off by assuming each of those variables contributes in a linear way okay thank you hi
in this lecture we're going to talk about fitting lines to data remember in the last lecture we talked about simple linear models well when i was drawing those lines through the data the question is how do you do it how do you draw the best possible line through the data that's the focus of this lecture so let's step back for a second remember when we did the categorical models we had that notion of r squared which was the percentage of variation that you explained so there's a lot of variation in your data you construct a model
and you ask what percent did you explain so for example if i have a bunch of data like i have here these are all the dots right the data if i just took the mean of this right here and asked how much variation there is i'd have to take the distance from all these points to that line and there'd be a lot of variation when i draw the line through here i explain a lot of it in fact here it says i explained 87.2 percent so what we want to talk about is how do you draw a line through
this data to explain as much variation as possible so remember in our box example again just to give us a reminder how this worked there was 53 000 units of total variation and we only had 5200 left so fifty two hundred over fifty three thousand which is like nine point eight percent that's how much we had left over so that means that we explained ninety point two percent of the variation we wanna show how you can do that same sort of calculation with lines and then show how you draw the best possible line so let's
suppose i got a bunch of data here to figure out how much variation there is i draw the mean and then i can figure out the distance if the mean is let's say 6 and this has a value of 4 i would take 4 minus 6 and square that which is 4 and if this point has value 8 i would take 8 minus 6 and square that and that would also be 4 and i add up all those variations that gives me the total variation now what i'm going to do is i'm
going to draw a line through the data and figure out now just what's the distance from the line and ask how much of the variation did i explain by drawing the line through so let's do a very simple example let's suppose i've got three kids and they're in different grades of school and they wear different size shoes so what i'm going to do is predict shoe size as a function of the grade they are in school so i'm going to say that your shoe size is a function of what grade you're in so i've got a first grader who
wears a size 1 shoe a second grader who wears a size 5 shoe and a fourth grader who wears a size nine shoe and i want to try and fit a linear model of this so the first thing i can ask is what's the variation so this is the grade and this is the shoe size now i don't care about the variation in the grades i care about the variation in the thing i'm trying to explain which is shoe size so it's just this so it's just this one five and nine so if i take one five and nine
if i add those up i get 15 and divide by three i get five so the mean is five so to get the variation i take 1 minus 5 and square it which is 16 5 minus 5 and square it that's 0 and 9 minus 5 and square it that's also 16. so the total variation is 32. so what i want to do is i want to write down a linear model that can explain as much of that variation as possible well let's start off with just a really simple linear model where we assume y
equals 2x so if i take the line y equals 2x what i'm saying is all three of these points should lie on the line right and so the variation is just sort of how far off the line they lie well so how do we do it well i've got x and y and 2x so when x equals 1 2x would be 2 when x equals 2 2x would be 4 and when x equals 4 2x would be 8 so these are my predictions in a way right 2 4 8 and what i can
ask is how far does the data lie from those predictions so here i predicted two and the actual value is one so i get two minus one squared which is one here i predicted four and the actual value is five so i'm going to get 4 minus 5 squared which is 1. and here i predicted 8 right and the actual value is 9 so 8 minus 9 is also 1 so i get 1 squared so the total amount is 3. so i think wow that's great i started out with a total variation of 32 right and
now i've only got 3. so if i want to figure out my r squared i just say that's 1 minus 3 over 32 right and once again i'm going to be over 90 percent right it's like 90 plus percent right so that's really good i've explained a lot of variation but the thing is this was just like i just made this up as y equals 2x i just drew this line so how do i know the best line well what i can do is i can say well let's suppose i drew the line y equals
mx plus b so this is just an arbitrary line and then i want to ask how far off would that line be from the data well when x equals one my prediction would be m plus b and the actual value is 1. so my error is going to be m plus b minus 1 squared when x equals 2 my model is going to say that the value is 2m plus b and the actual value is 5 so that's going to be my error and when x equals 4 this is going to be my prediction this
is the actual value so this is going to be my squared error so if i want to know what the total error is i just have to multiply all these things out so m plus b minus 1 squared is going to be m squared plus 2 mb plus b squared minus 2m minus 2b plus 1. right so that's a really complicated thing and i can do that for each of the other two as well right so i get these long equations now if i do that i'm going to get here's my total error right
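The total error being built up here can be sketched in Python. This is only an illustration: the data are the three (grade, shoe size) pairs from the lecture, and the function name `total_squared_error` is my own.

```python
# (grade, shoe size) pairs from the lecture's example
data = [(1, 1), (2, 5), (4, 9)]

def total_squared_error(m, b, points=data):
    """Sum of squared distances between the line y = m*x + b and the data."""
    return sum((m * x + b - y) ** 2 for x, y in points)

# Total variation: squared distance of each shoe size from the mean shoe size.
mean_y = sum(y for _, y in data) / len(data)
total_variation = sum((y - mean_y) ** 2 for _, y in data)

error_2x = total_squared_error(2, 0)  # the made-up line y = 2x
r_squared_2x = 1 - error_2x / total_variation

print(total_variation)  # 32.0
print(error_2x)         # 3
print(r_squared_2x)     # 0.90625
```

Every choice of m and b gives a different total, and the question the lecture turns to next is which choice makes that total as small as possible.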
so this is the if i choose the line y equals mx plus b this is my error what you can do and this is what's great about calculus you can do math and just solve for this find the b and the m right that make this the smallest possible number and if you do that you choose b equals minus one and you choose m equals eight thirds so this is how you draw those lines you basically go back right and just say well let's take any line y equals mx plus b figure out its distance
to the data right add up your total distance right here so this is just the total distance or the total variation right and then you want to choose an m and a b to make that as small as possible it turns out the way to do that is to choose b equals minus one and this should be a lowercase m right m equals eight thirds now we do that what we're going to predict is that when x equals one our model now says y equals eight thirds x minus one right so when x equals one
we're going to get eight thirds times one minus one so that's gonna equal five thirds right so the actual value is one so if we look at the difference right between our prediction and the actual value it's just going to be two-thirds right so we're gonna get the contribution to r squared is gonna be two-thirds squared when we look at five when we take x equals two the real value is 5. our model if we plug it in here is going to give us 13 over 3 that's also off by two-thirds and if we look at
when x equals 4 our model gives us 29 over 3 the actual value is 9 which is 27 over 3 that's also off by two-thirds so what we're going to get is our total variation left over is two-thirds squared plus two-thirds squared plus two-thirds squared so that's going to be well it's going to be four-ninths times 3 which is twelve-ninths which is four-thirds so now if i want to know what's my r squared right well if i erase all this stuff for a second right
what i get is that how much of the data did i explain right i have 1 minus four-thirds over 32 and so now i've explained you know over 95 percent of the data so by sort of figuring out the optimal b and m i can do even better right than i could by just trying to draw that line of y equals 2x and so if i draw that actual line it goes like this and you see it comes incredibly close to the data so let's move on and think about
how do we do this with multiple variables so suppose instead of having one variable i've got a bunch of variables so now i can write y equals a x1 plus b x2 plus c so now instead of just one independent variable i've got two so when you look at these things the sign tells you does y increase or decrease in x the other thing that regressions will tell you is the magnitude how much does y change as a function of x so let me talk about why this then is so important again we often
just reason by the seat of our pants and so let's suppose you care about and again i'm gonna talk about this a lot because it's just an easy and important thing to talk about school quality so i've got a bunch of test scores from kids and i also know this is like an achievement test score i've also got iq test scores which basically tell their innate ability at some level then i've got measures of teacher quality and class size well what i can do is i can run a regression that says well the
performance on this test is going to be some intercept plus some coefficient on iq plus some coefficient on teacher quality plus some coefficient on class size and what you'd expect is the coefficient on class size to be negative right you'd expect the coefficient on teacher quality to be positive and the coefficient on iq to be positive now without running a model we don't know which ones of these things are big and we don't even know if our intuition is right well let's look at class size so recently there's
been like 78 studies of class size four of these show a positive coefficient right 13 show a negative coefficient and 61 show no effect right so this is the result of somebody did a summary investigation of 78 you know regression studies data studies on does class size matter and what you find is that you know only 13 times does it have that expected negative effect and 61 times it has no effect and four times it actually goes in the wrong direction so even though we think class size matters right and it should matter smaller classes
should lead to better performance it doesn't always work out that way what about teacher quality well there's a recent study by a bunch of economists right and they basically show that a good kindergarten teacher is worth 320 000 dollars because if you have 20 students it turns out that those students can each expect to make 16 000 more in lifetime earnings by having a really good kindergarten teacher as opposed to a bad kindergarten teacher so again by plugging in all this data i mean we all expect right class size should matter right lower class
sizes should be good and teacher quality should matter better teachers should be good but what you find when you run the data is that class size doesn't seem to matter that much at least in the ranges in which we're playing but teacher quality matters a lot so what do we learn from all this what we learn is there's a lot of data out there one thing you can do is you can fit that data to linear models what linear models will do is they'll explain some percentage of the variation maybe a lot maybe a little
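As a rough check on the shoe-size example from earlier in this lecture, the best-fitting line can be computed with the standard closed-form least squares formulas. This is a sketch using exact fractions, and the helper names `fit_line` and `r_squared` are made up. One caveat worth flagging: on these three points the formulas give a slope of 18/7 (about 2.57) rather than the eight thirds quoted in the lecture, with an intercept of minus one either way. The lecture's line leaves four thirds of the variation unexplained and the least squares line leaves eight sevenths, so both explain over 95 percent.

```python
from fractions import Fraction

data = [(1, 1), (2, 5), (4, 9)]  # (grade, shoe size) from the lecture

def fit_line(points):
    """Ordinary least squares for y = m*x + b, using exact fractions."""
    n = len(points)
    sx = sum(Fraction(x) for x, _ in points)
    sy = sum(Fraction(y) for _, y in points)
    sxx = sum(Fraction(x) ** 2 for x, _ in points)
    sxy = sum(Fraction(x) * y for x, y in points)
    m = (n * sxy - sx * sy) / (n * sxx - sx ** 2)
    b = (sy - m * sx) / n
    return m, b

def r_squared(m, b, points):
    """Share of the variation in y explained by the line y = m*x + b."""
    mean_y = sum(Fraction(y) for _, y in points) / len(points)
    total = sum((y - mean_y) ** 2 for _, y in points)
    left_over = sum((m * x + b - y) ** 2 for x, y in points)
    return 1 - left_over / total

m, b = fit_line(data)
print(m, b)                                        # 18/7 -1
print(float(r_squared(m, b, data)))                # about 0.964
print(float(r_squared(Fraction(8, 3), -1, data)))  # about 0.958, the lecture's line
```

With more than one independent variable the same idea applies, but the arithmetic is usually handed to a statistics package rather than done by hand.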
these linear models will also tell us the sign and the magnitude of coefficients so it'll tell us whether a variable has got a positive effect or a negative effect and also tell us sort of how big that effect is and that allows us to make policy choices investing in things like teacher quality as opposed to class size because they have a larger effect this is what i call big coefficient thinking thank you in this lecture we're going to talk about a particular skill which is reading the output from regressions now remember early on
in this course i talked about one reason to model is to be an intelligent citizen of the world that there's just a certain amount of modeling understanding you have to have in order to get by in order to contribute well you might be doing some project and what's going to happen is somebody's going to give you some sort of regression output and it looks like this and what you've got to be able to do is read it understand it know what this is saying so we see some words on here we already know right like r
squared and we see things here like say coefficient coefficient of intercept what we'd like to do is be able to fully understand what output like this is saying okay so the first thing to notice we've got to think about what's really going on when we see regression output what it really is is it's a linear model but it's a linear model based on multiple variables so remember before we had y equals mx plus b right that was our model a linear model where y depended on x well what we're going to do now typically when
you see regression output is that y which is your dependent variable depends on more than one x so it's like m1 x1 plus m2 x2 plus b so for example suppose again let's go back to the health example suppose y is health outcomes well x1 might be how many hours of exercise you get x2 might be how many hours of sleep you get and so on right so what you've got is your dependent variable y depends on a lot of stuff so when you look at regression output like when you look at stuff that looks like
this you see there's more than one x there's an x1 and x2 let's do an example to you know flesh this out a bit suppose again we're looking at student test scores right so that's y that's our dependent variable and it could depend on a couple of things it could depend on t which is teacher quality and it could depend on z which is class size so we just write down a simple linear model that says y equals c times t plus d times z plus b b is again our intercept right that's just the
familiar intercept from before just like y equals mx plus b now when we think of this model what should we expect what we should expect is that as teacher quality gets better test scores get better so we should expect c to be bigger than zero but we should expect as class size gets bigger that the score performance or class performance should fall so therefore we should expect d to be less than zero so when you see a model like this one of the first things you want to do is you want to sort
of come at it with some expectations some preconceived ideas about what you think is going to be true that way when you look at your output you can decide is it surprising right or is it not surprising so let's go back and take just a generic model where we have y equals a x1 plus b x2 plus c so c here is going to be our intercept right that's our intercept and a and b these are the coefficients of our independent variables and when we look at the output it's going to
tell us something about those coefficients so here we go here's some regression output looks a little scary but let's just relax a second so let's look first and see what we see first we see this thing says r squared is 0.72 what is that telling us well we already know that's saying there's a whole bunch of variation in the data and 72 percent of it was explained by our model that means a linear model in this case is explaining 72 percent of the variation that's totally great standard error 24.21 is telling us on average what
was the standard deviation in the model so how far from the mean were things what this is saying is on average about 24 and then this observations thing is 50 that's saying there were 50 data points so we had 50 data points on average it was 24 away from the mean and we could explain 72 percent of that variation so you know not a bad model all right now when you look down at this part down here this whole part of it this is telling us something about what our linear regression model is saying about the coefficients
and the intercept so the first thing we notice this 25 here is the intercept and so that's saying our final regression equation is going to look like y equals something times x1 plus something times x2 plus 25 right now this next term here this 20 is telling us that the coefficient of x1 is 20. so it's going to be 20x1 and then this 10 corresponds to x 2 right plus 10 x 2. so this is just basically telling us our equation is y equals 20 x 1 plus 10 x 2 plus 25. now let's suppose
let's go back to the previous example we were just talking about let's suppose that x1 was teacher quality so this would say y which was student test scores is increasing in teacher quality so that's what we'd expect but suppose x2 is class size and here we have a 10 and we get that test score is actually increasing in class size well then we say hmm this is sort of surprising to me because i expected class size to have a negative coefficient and it's actually got a positive coefficient so we might think maybe our data is wrong
maybe my intuition's wrong so let's look a little bit deeper so some things to look at first is we should note we've only got 50 observations so 50 observations isn't very many so it could be that maybe these coefficients aren't right well how do we know that well this is where we want to look at this column right here with se se stands for standard error and what it means is it's sort of what's the error in our coefficients so for example here we've got a coefficient of 25 and it says the standard error
is 2. so what that's meaning is let's go back remember think of our bell curve it's saying our model is guessing that the coefficient of this thing is 25 right but we've got a standard error of two so that means if we went between 23 and 27 we'd be right 68 percent of the time so what it says is this 25 is a guess based on the data and what this standard error of 2 is telling us is that well 68 percent of the time the coefficient would actually be between
23 and 27 so sort of saying you know maybe it's 25 but you know it's probably almost for sure between 21 and 29. well let's look at our x1 we've got a coefficient of 20 but the standard error is only one so what that's saying is we can be really sure that the coefficient on x1 is between like let's say 17 and 23. we can be incredibly sure it's between 16 and 24. but now let's look at this last one x2 the coefficient was 10 and the standard error was 4. so if i draw my
bell curve and let's go over here and draw a big bell curve here make sense of this for a second right what it's saying is from the data i'm estimating that coefficient of 10 but i've got a standard error right of four so that means there's a 68 percent chance it lies between 6 and 14 and there's a 95 percent chance it lies between 2 and 18 and there's actually at least a 2 percent chance it is you know below 2 right so there's some chance that this coefficient actually instead of being 10 could be
negative so you think well why don't they tell us that and the answer is they do so this column right here this thing that says p-value that tells you the probability that the sign is correct i'm sorry that the sign is wrong so what this is saying is there's no way those first two signs are wrong if you think about drawing this bell graph we're getting an estimate of 25 there's no way the real coefficient is negative but when we get down to x2 it's saying look there's about a one and a half percent chance that this
coefficient of 10 instead of being positive it actually should be negative so the regression output is saying because our standard error is so large we're not sure about that 10 there's actually a decent chance that instead of being positive it's negative all right so what have we got so we see this regression output it tells us a bunch of stuff it tells us first what the r squared is how much of the data do we explain second it tells us how many observations do we have right a lot of observations or not many third it tells
us how much variation was there in the data to begin with and the answer is on average 24.21 so you know quite a bit of variation then it tells us the values of the intercept and the coefficients right these are estimates 25 20 and 10. and so it tells us you know this is probably positive and this is probably positive and it gives us a sense of magnitude so it tells us sign and magnitude but it also tells us in this p-value thing how sure we are that those coefficients are correct now it can't
tell us for sure it's 25 but it can tell us how sure we are the coefficient is actually positive and so here it's saying we're really sure it's positive right there's almost no chance we're making a mistake but for x2 well there's a one and a half percent chance that maybe there is a mistake there so if this were again a regression of test scores on teacher quality and class size what we could say is teacher quality definitely improves performance and there's a lot of evidence that that's true and we can say
with class size well even though this study goes the wrong way it's possible since we only have 50 data points right that maybe if we did another study it could go in the opposite direction and that's true as well there's a lot of studies on class size that do in fact show that as the class size gets bigger students do better even though that's sort of counterintuitive but there's more studies that show the opposite right that as the class size gets smaller students do better so big things to take away when you look at that
regression output the first thing is look at the sign look at every coefficient and ask does y increase or decrease in x now before you look at it though you know when somebody says oh i've run a regression model you should say well what are your variables and then what you should do is you should form expectations about what you think the signs of those variables are the coefficients are then when you look at it you can say hmm does the variable have the effect that i thought it would have so if you're looking
at sales for your firm you might want to say well geez you know the coefficient on advertising is negative the more we advertise the less we sell that would be totally counterintuitive right but if the coefficient of advertising were positive if the more you advertise the more you sell then it would make sense then what you want to do is look at magnitude you want to say okay how big of an effect on y does a one unit increase of x have and if it's got a big coefficient that means wow this is something i
should really pay attention to and if it's got a small coefficient it's something that maybe you shouldn't pay attention to all right so pretty straightforward right regression output all it's telling you is right how much of the variation did we explain right so that's a measure of how good the model is and that's captured by r squared then it tells us what the sign and the magnitude of those coefficients are right so is the sign positive and what's the magnitude of that coefficient and then we also get that p-value thing which tells us what's the
probability that the coefficient's actually wrong that maybe you know the data's so noisy we can't say for sure so what's great is you know if you've got data out there you can throw that data into a linear regression model right if you get an idea of what variables you want to include and you can get some output and then from that output we can get an understanding of you know how good is the model what's the r squared what is the sign and the magnitude of the coefficients and how confident can we be
that those coefficients are right so it's actually a really useful way for making sense of the world and as we've shown in the previous lecture right it's usually better than we are at figuring out how the world's going to work thank you okay we've just done a whole bunch of stuff on linear regression right we saw we could fit one line to data and then we saw we could add more variables and interpret regression output there's a problem with this linear regression that is that the world is often non-linear right so remember i talked about
john von neumann saying the study of nonlinear function is akin to the study of non-elephants because there's just so many more nonlinear functions than there are linear functions so if we look out there at the world we might see data right that looks like this or we might see data that looks like that right again or we might see data that sort of has both those features that looks like that so the question is how can we use those techniques we've done where we're sort of trying to fit lines the best we can if the world
is kind of messy so in this very short lecture i just want to talk about three ways you can get around this problem that the world may be non-linear and we've got techniques that help us sort of understand linear functions so here's the first thing we can do first thing we can do is we can just approximate our nonlinear function with a linear function so we've got this nonlinear function here right but we're just going to draw three linear functions to approximate it so that's the best possible approximation and so what we could do is we
could say i have a model so in this case i have a model that says this is my functional form this is what should happen so i want to test and say well does that model work well that could be a fairly difficult thing to do so a shortcut would be to say well instead of testing whether that model works i'm going to test whether these three linear models work this linear approximation comes close here's what i think of it if you've ever been to greenfield village in detroit which is near ann arbor they've got
a brick wall that's curved right that goes just like this it's a curvy wall and it's made out of bricks and the way it's made of bricks is there's all these short little straight bricks right but the bricks are laid in such a way that they make a curve well the same thing can go on here you can say my model says the thing should be really curved but what i can do is i can approximate that curve through some short lines now let's push that further to see the second way you can do it
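The brick wall idea can be sketched in Python: take a curve and replace it with a few straight segments joined end to end. The curve used here (the square root of x between 0 and 9) and the three breakpoints are assumptions picked purely for illustration, not anything from the lecture, and joining points that sit on the curve is just one simple way to choose the segments, not necessarily the best possible approximation.

```python
import math

def curve(x):
    """Hypothetical nonlinear function standing in for 'the model'."""
    return math.sqrt(x)

# Breakpoints for three straight segments (an arbitrary choice).
knots = [0.0, 1.0, 4.0, 9.0]

def piecewise_linear(x):
    """Approximate curve(x) by the straight line joining the two nearest knots."""
    for left, right in zip(knots, knots[1:]):
        if left <= x <= right:
            slope = (curve(right) - curve(left)) / (right - left)
            return curve(left) + slope * (x - left)
    raise ValueError("x is outside the range covered by the knots")

# Largest gap between the curve and its three-segment approximation,
# checked on a grid of points between 0 and 9.
grid = [i / 100 for i in range(901)]
worst_gap = max(abs(curve(x) - piecewise_linear(x)) for x in grid)
print(round(worst_gap, 3))  # 0.25
```

Using more, shorter segments shrinks the gap, the same way shorter bricks make a smoother curved wall.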
suppose you've got a bunch of data and the data you know you notice when you look at this thing that there seems to be sort of different patterns and different parts it seems to be sort of sloping up in this region right but sloping down in this region what you can do is you can break your data into different quadrants so you can say okay here's the first quarter second quarter third quarter we're just going to assume that like we break it into four equal parts then we can use you can say okay what i
want to do is i want to create a function that's linear in each segment right but explains as much of the data as possible so here i might start out and say well look you know here there's not much to explain it's kind of flat but now when i get here the data is down here so i want to come down i want to head down in this direction now we might think hey let's just come way down here but the problem is if i come way down here how am i going to deal with
all this data up in here so what i'm going to do is i'm going to say nope that's not going to work so maybe i'll come down just a little bit right to try and explain some of this and then when i get up here i can start heading up in this direction now to get here right over here i don't want to head all the way up into this region because i've got to come back down to deal with this data so what i'm going to do is i'm going to come up here part
way and then i'm going to head down this way now formally this would be called a spline method but what's going on here is you're sort of fitting different linear functions to different quadrants of the data and you're sort of coming close to explaining some of the nonlinearity so that's two things we can do the first one is if i have a model that gives me a nonlinear function i can replace it with a sequence of linear functions the best possible sequence and then second if i've just got a bunch of data and it
looks highly non-linear i can break the data into separate parts right and then what i can do is i can fit linear models to each of the parts now there's a third thing i can do include non-linear terms let me explain what i mean suppose our data looks like this so it looks like it's kind of coming off like the square root of x right instead of writing y equals mx plus b i could write y equals m times the square root of x plus b so what i'm doing is replacing x with the
square root of x one way to think of this is i could write this as y equals mz plus b right so that's a linear model but i'm just redefining z to be the square root of x so again i'm back to having a linear model in fact i could let z be anything i want i could let z equal x squared if z equals x squared right and i write y equals mz plus b what that would do is that would fit something that looks like this where here's y and here's x
where it sort of goes up like a squared term so another way i can deal with non-linearities is just to include nonlinear terms and treat them like they're linear terms so we've learned three things right the first is if my model gives me a function that's non-linear i could say well let's just replace it with a sequence of linear functions that approximate it second thing i can do is if my data looks like it's non-linear what i can do is i can break the data into segments and then fit linear models within each segment and
that'll capture some of the non-linearities and then a third thing i can do is i can just introduce nonlinear terms so instead of having x1 and x2 i could have the square root of x or i could have x squared i could even have something like the log of x or the sine or the cosine of x right i can do anything i want i could put any function of x in there in place of x x is just like a placeholder for some variable so even though we've thought about drawing lines through data we
can actually draw sort of any function through data now the statistics on this and knowing how accurate your coefficients are and that sort of stuff gets more complicated when you start making these changes but you can still do it and if you take an advanced course in statistics or econometrics you learn some of these techniques what we're trying to do here though is just get an understanding of how the models work and what linear models do and even sort of quote-unquote linear models with nonlinear terms do is they help us understand patterns in data help
us understand how much of the variation we can explain and understand what's the sign what's the magnitude right of the coefficient of each of the variables that we think are important okay thanks hi in this last lecture in this unit on linear models what i want to talk about is something i like to call the big coefficient now here's the idea if we have a simple linear regression model we have some equation like y equals a1 x1 plus a2 x2 plus b right and x1 and x2 are the what we call the independent variables and
y's the dependent variable so for example y might be sales of a product and x1 might be advertising in magazines and x2 might be advertising on television now we can look at these two coefficients a1 and a2 and figure out which one's bigger what that's telling us is whether we get sort of more bang for the buck from advertising in magazines or from advertising on television if it's television if a2 is bigger than a1 then that's where we spend our money so the idea is you put your assets you put your
resources on the variables that have the bigger coefficients so this big coefficient thinking has led to something that people like to call evidence-based blank so there's evidence-based medicine what you do is you look at all sorts of different treatments that have been tried on patients and you gather all that evidence and then you figure out which ones have the biggest coefficient so does diet have a bigger coefficient than exercise does the medication have a bigger coefficient which medication has the biggest coefficient and that's where you put your resources there's also evidence-based philanthropy if you want
to go and you want to improve a community or improve a country you look and you say which coefficient has the you know biggest bang for the buck is it you know spending on children is it spending on health care is it spending on women is it spending on education what is it and based on that you can make better decisions now remember we've talked about this earlier in this unit about how linear models are better than just you know sort of thinking up stuff without any evidence that's absolutely true so i'm
totally in favor of evidence-based thinking let me explain sort of how it works what you do is you construct some sort of model and by that i mean you have some understanding of what variables you think matter and possibly even the structural form of those variables right of that equation so it could be non-linear then what you do is you go gather data and after you gather that data you identify the important variables and then change those variables okay now there's a movement towards what people call big data and the big data movement says that
well maybe we don't even need the models anymore right this is what some people say they say look here's what we do we first just go gather the data that's the first thing then you find the pattern then identify the important variables so no need for the model now i want to make the point that i think that that's not true and that's an overstatement let me explain why right big data does not obviate the need for models first let's just think of the broad reasons why we have models one is just to understand how
the world works so even if you see the pattern right the identification of the pattern is completely different than understanding where it came from right so you could recognize wow we've done a ton of experiments and force seems to equal mass times acceleration right that's very different than having a model that explains why that's the case right or you might run experiments and realize that like heavy objects and light objects seem to fall at basically the same rate it still would be nice to have a model that explains why that's true
so identification of a pattern in no way gives us any sense of explanation all right but there's bigger reasons why even if you're just trying to sort of affect policies to do evidence-based whatever you can't do it without a model just based on pure data first is correlation is not causation remember the example of the equestrian team right if you run all sorts of data you could find you know this variable seems to matter but the thing is it could be that that variable doesn't matter at all that it's just correlated with something else that matters like the equestrian team second and this
is maybe the most important point linear models tell the sign and magnitude of these these variables but only within the data range right so if i've got a bunch of data here like this right and i fit this model to it that doesn't necessarily tell me anything about what's going on up here and so what i'd like to do is have a model perhaps that gives me some indication of whether i think that linear relationship is going to continue to hold okay so let me uh give an example two examples of what it means the
first is feedbacks so let me give two examples here so the first one is let's take anti-lock brakes in cars so you could think that you know if you looked at data on accidents you could say one of the things that seems to be causing accidents is cars bumping into the car in front of them if we could just get cars to stop sooner we'd reduce the number of accidents and so you put money and resources into developing anti-lock brakes and in fact initially that seems to save a lot of lives but what might happen
over time because people are like thinking electrons is this if before people kept maybe like a 40 foot or 30 foot gap between them and the car in front of them now that they have anti-lock brakes they may creep up a little bit and they may start driving closer to the car in front of them and a lot of the benefit of the anti-lock brakes will fall off and so if i had a little graph that had speed of braking right instead of being a nice linear graph if i take into account the
feedback right the benefit may fall off right instead of being linear it may be sort of sublinear okay another example let's go back to education class size you may fit some data and say boy when class sizes fall from 25 to 20 if i put student performance here that you know there's a lot of data here and this seems to show that like performance is going up so you can say well let's move it all the way to 15 and you could think if i extrapolate from this if i just make class size 15 performance
should go way way up but it could be that if you move class size to 15 the performance sort of just plateaus and the reason why is there's all these other causes for why performance doesn't increase like family support general health right um resources in the community so even if you reduce class size to 15 there may be a diminishing effect for that because of feedback right one big reason why there'd be a feedback in this case is that you need to hire a whole bunch more teachers and you might not be able to get
the same sort of teacher quality that you had when you had 25 students per teacher when you suddenly double the number of teachers let's say the teachers aren't going to be the same quality and the students might not do as well so again these feedbacks mean that you have to be careful about extrapolating a line outside the data range there's a bigger problem with the fact that your data exists only within a small region and this is what i call the problem of multiple peaks so suppose we've
got a bunch of data and it's all around here and then we run some regressions and we see like whoa it seems to be increasing in slope and so we start moving in this direction and then we find this peak right so we've got data in this range here and using this data we figured out this is the optimal thing to do but in doing so we completely miss this entire peak over to the right we completely missed this other opportunity because we were sort of blinded by the data you know where we had data
in this small range so this leads to a distinction i want to make between what i call the big coefficient which is climbing our current hill and something i'm going to call the new reality which is thinking of a completely different hill to ask is there something entirely new and different so let me be really clear i'm not saying big coefficient thinking is wrong i think it's really useful you want to have models you want to identify the important coefficients you want to think will those coefficients likely hold outside their range and if they do
right then you want to change those variables and hopefully effect change in a meaningful way in a good way however you also want to take into account the fact that you might want non-marginal changes you may want to do something big and new and to think about the effects of something big and new it's often useful right to construct models of those entire systems to see what do you think is going to happen so let me give some examples of this so suppose you're interested in healthcare big coefficient thinking might be tax cigarettes right
because lung cancer is a leading cause of death this reduces the number of people who get lung cancer you also raise money that you can spend on health care win-win new reality thinking might be something like universal health care let's give everyone health care and let's try and you know improve the health of americans or the health of people in any other country through a universal healthcare system let's look at traffic big coefficient thinking might be increase the number of high occupancy vehicle lanes right the number of lanes where you have to have two or three
people in a car right again makes total sense new reality thinking though would be why not a rail system the united states doesn't have much of a rail system why not create a huge rail system you know within a few cities on the east coast or the west coast and try and you know move some of that traffic off the highways again new reality versus big coefficient last one this is kind of a fun one when i was growing up there was a study showing that oat bran significantly reduced cancer now it turned out this
was sort of a small n study when they did subsequent studies the effect wasn't as big as they thought but because the coefficient looked big at first they started putting oat bran in everything including in pretzels right and this is again big coefficient thinking this will reduce cancer let's give everybody oat bran new reality thinking would be let's try and get everybody in a fitness regimen right so let's try to create some fitness regimen where people are out there exercising an hour a day that's a completely different thing than sprinkling a little oat bran
into pretzels right it's fundamentally changing how we live our lives it's a new reality now this plays out in policy circles all the time so if you look at something like the american jobs act right this was you know a 447 billion dollar program it's a lot of money and it did things like you know create tax credits for new employees right and subsidies to hire veterans right and payroll tax holidays these are all big coefficient logic programs the idea is that you know we've looked at a lot of data and we see that these sort
of you know policies get people to sort of spend that money initially or hopefully get people to hire employees initially right so these are programs that we think give us the most bang for the buck but again most of this program most of this 447 billion was big coefficient thinking that's good right but we can contrast it with new reality thinking so what's new reality thinking a new reality policy is something like the interstate highway system so in 1956 the united states government allocated 25 billion dollars for 41 000 miles of roads now what that
would cost now if you just use basically sort of an inflation index like the cpi that would be like 200 billion but if you actually figure out the cost per mile of road these days again not within cities but between cities it's about 10 million dollars so it'd be about 410 billion about the same as the american jobs act but this was new reality this wasn't big coefficient this was creating an entirely new system i'm not saying either one is better than the other because you can do a new reality program that cannot work
at all but the point i'm making is this is that evidence-based methods are really useful and if you're going to work within if you're going to do some minor change or tweak a whole bunch of variables you should use evidence you should figure out those coefficients and should put your money in the big coefficients at the same time you have to keep in mind the fact that big coefficient thinking right can ignore the new reality it can blind you to completely new and different ways of thinking about the world and making improvements in it so
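To make the earlier class size caution concrete, here is a small Python sketch of what can go wrong when a fitted line is pushed outside its data range. Everything in it is a labeled assumption: the function `true_performance`, the scores, and the plateau at a class size of 20 are made up for illustration and are not data from the lecture. The helper `fit_line` is ordinary least squares for one variable.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = m * x + b with a single variable."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    m = sxy / sxx
    return m, mean_y - m * mean_x


def true_performance(class_size):
    # Hypothetical world: smaller classes help, but only down to 20 students.
    # Below that, other factors dominate and performance plateaus.
    return 70.0 + 2.0 * (25 - max(class_size, 20))


# We only observe class sizes between 20 and 25, where the relationship is linear.
observed_sizes = [20, 21, 22, 23, 24, 25]
observed_scores = [true_performance(s) for s in observed_sizes]
m, b = fit_line(observed_sizes, observed_scores)

# Extrapolate the fitted line to a class size of 15, outside the data range.
predicted_at_15 = m * 15 + b
actual_at_15 = true_performance(15)

print(f"fitted slope: {m:.2f} points per extra student")
print(f"line predicts {predicted_at_15:.1f} at class size 15")
print(f"assumed true value is {actual_at_15:.1f}")
```

Inside the observed range the line fits perfectly, so the data alone give no warning. Only a model of why performance changes tells you whether the line should keep going.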
this is where models become so important right one of the things we do in model thinking right as model thinkers is we can construct models to think about what happens if we move outside that data range right so here we'll talk about thinking outside the box how do we think clearly outside the box well one way we do so is by thinking with models all right thank you hi in this next set of lectures we're going to look at some models of tipping points now these are going to be highly non-linear models in the sense
that what happens is going to be that a small change is going to lead to a big effect so instead of being sort of a nice linear thing like we've seen before we're going to see abrupt changes right where the system tips at some point now our goal here is to really try and understand in a deep way how tipping points occur and what is a tipping point and what isn't a tipping point because one thing we're going to learn is a lot of things that look like tipping points really aren't tipping points so
let's consider what is a tipping point well imagine the proverbial case of the straw that broke the camel's back right so suppose this is a graph of the weight of a camel and here i'm adding pieces of straw right so i add pieces of straw the weight of the camel goes up in a nice linear way right the weight of the camel i should put weight of camel plus straw right that goes up in a nice linear way but if i were to graph the height of the camel what would
happen is the height of the camel would be pretty much constant as i add more and more straw until eventually the camel falls down its back breaks and then the camel becomes much much shorter right so this point right here is where the tip occurs so that's what i mean by tipping point one more piece of straw has a huge effect on some variable of interest namely the height of the camel also the speed of the camel as well that would also tip it would go from a positive number to zero so people talk a lot about
tipping points malcolm gladwell a few years back wrote a book called the tipping point which sold millions of copies and when people talk about tipping points they often mean kinks in curves so if you look at housing starts in the united states right they're going up really nicely from the beginning of the new century up until about 2006 and then there was this big kink right here and so people tend to say this point is a tipping point and i'll show you that you know that probably was a tipping point but then people look at
things like users on facebook this is a graph of facebook users and you see this thing goes up here and there'll be people that say oh right here 2008 that's the tipping point i'm going to argue that's not true okay same as if you look at the world population you might say oh my gosh look at that in the 1940s the world population took this deep kink it did kink but i'm going to argue that's also not a tipping point same's true with wikipedia articles people say oh boy wikipedia in 2004 this was really
the tipping point these are all cases of the graph having a big kink in it but they're not necessarily tipping points tipping point is when a small change leads to a large effect these processes that we've just looked at facebook world population wikipedia entries these are all cases of just exponential growth when you have exponential growth you're going to get a curve that takes off just like if i draw like you know x squared or x cubed or x to the fourth right i'm going to get something that sort of you know goes like that and
you can convince yourself ooh right there the slope seems to be changing really fast nothing tipped that's just the inevitable process of growth so one reason we want these models is to make sense of what's a tip and what's not a tip and we also want to ask what kind of models produce tips right how can we get tips and and when are we likely to see them so here's what we're going to do we're going to construct two really famous models one from physics one from epidemiology the physics model is known as a percolation
model and maybe many of you own houses and you had to dig wells and there's a question of did the ground perk that's what they'd ask did it percolate so could the water sort of filter down through the ground or was it too clay-like and would hold the water in so percolation literally refers to you know can water sort of make its way through a system or like coffee percolating through coffee grounds then we're going to look at something called the sis model s stands for susceptible i
stands for infected and s stands for susceptible again this is going to be a simple model of disease where you don't have the disease you get infected and then you don't have the disease again okay so we're going to construct that both of these models will have tipping points then we're going to make a distinction between types of tips i'm going to talk about direct tips direct tip is a situation where a particular action tips that same dimension that same variable so for example in the context of a war a
battle might tip a war you know from one side winning to another side winning right so it tips the same exact entity alternatively there's a contextual tip contextual tip means something changes in the environment that makes it possible for something to happen okay so a direct tip is where the variable itself changes it causes itself to tip so it could be you know people buying a particular product it could be battles in a war the contextual tip is something in the environment changing in such a way that it then causes the system
to move from one state to another i'm going to make a second distinction between types of tips remember we talked about these four classes of things that the world can fall into right it can be stable right like an equilibrium it can be periodic right it can be random right or it can be complex right these are sort of the four types of states the system could be in well a system could tip from one to the other so something could be stable and tip to periodic something could be periodic and tip to random so
we can talk about not only direct tips and contextual tips we can talk about tips between class right so it goes from one class to another and tips within class so for example if a system tips from one equilibrium to another equilibrium that would be a within class tip right because the overall structure of the system hasn't changed in terms of how we categorize the types of outcomes we see all right so that's the plan we're going to construct some basic fundamental models that produce tipping points we'll show that a lot of things that
people call tipping points really aren't tipping points they're just exponential growth and then we'll classify types of tips right we'll talk about between class tips and within class tips we'll talk about direct tips and we'll talk about contextual tips then we'll conclude with just a very short lecture on how you might measure tippiness and this is some work i've actually done with a colleague of mine pj lamberson coming up with some measures of how tippy a system is okay let's get started thank you hi in this lecture we're going to talk about a model that produces
a tipping point it's a very simple model and it comes from physics and it's known as the percolation model now the idea is this you've got you know ground up here and you've got water that comes down as rain and you want to ask does it percolate through the soil or not right very simple question so how do you model something like that again the essence of modeling is to simplify things right so what you're going to do is construct the following sort of model just a checkerboard sort of like the game of life or
cellular automata models but the idea here is that you've got a bunch of squares and they can either be sort of um filled in like this or they can be left open now the idea is this you can only jump from a filled in square to a filled in square so think of a frog trying to cross a river so it can jump along here along here along here but it gets stuck doesn't make it and it can go here and it gets stuck so this is a case where
it wouldn't percolate because you can't get from here all the way down to the bottom so here's the model really simple model ask the following question let p equal the probability that i fill in a square so for each square i flip a coin and if p is a half then half the time i fill in the square half the time i don't if it's a third a third of the time i fill in a square two thirds of the time i don't and then we ask does it percolate it's a really simple question well here's what
the graph looks like as long as p is less than 59.27 percent it doesn't percolate this is for a big graph right but once you get above that the system tips right right here there's a tip and then it becomes likely that it does percolate so you get this really abrupt change now notice this is this is a non-linear function right a linear function looks sort of like this and this thing goes sort of flat and then makes this tip right at that point so what's causing the tip what's causing the tip is that if
we look back at the picture right for p less than 59 there just aren't enough things filled in so if you were at 30 percent went to 31 percent like i filled in like one more square it's still not likely that i'd percolate but once i get to about 59 percent then so many squares are filled in that it becomes suddenly really likely that i'm able to make it to the bottom so it should be pretty clear that going from 20 to 21 isn't going to have much of an effect and going from 21 to
22 doesn't have much of an effect but going from 58 to 59 to 60 suddenly has a huge effect now what's great about this model is that it can be applied to all sorts of stuff so we're going to first apply it to forest fires and then we're going to apply it to banks so we're going to use a netlogo model here to try and make sense of this and what we have is a model here a simple model of forest fires and we have one button that shows the density and that's right up
here and then we just set this thing up and that's going to fill in trees with this density so currently it's at 57 percent and then across this left edge you'll see if you look really closely you can see red i'm going to start a fire along that edge and we're just going to see what happens if i let it go so if i let that go what you see is that oh it comes close and it almost makes it let's try it again let's set it up again oh it comes close and
almost makes it doesn't quite do it one more time just about you know pretty good but let's move this thing up to then 61 which is above that 59 threshold and look what happens here at 61 percent it makes it and let's do it again 61 again look what happens it makes it right 61 again it makes it so what we see in this very simple model is that if we go to 57 percent right and set it up we're not likely to make it but if we just increase it a little bit let's just
make it even 60 right which is barely above the threshold then what happens is the fire spreads throughout the whole space so we see this phase transition right at 59 percent from there not being a fire over the whole space to there being a fire over the whole space okay so we can think about that forest fire model the following way think of the yield so suppose we had a forest and we wanted to get as much wood as we could from the forest but we knew there was
a chance of fires what would our yield curve look like well there's that critical value right at 59 percent right and what would happen is as we planted more and more trees we'd get sort of a nice linear yield we'd get more and more wood but then once we got above the critical threshold suddenly our yield would fall off really fast and there'd also be a tip in terms of the yield so not only is there a tip in terms of the likelihood of a fire there's also a tip in terms of the yield so
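The forest fire version of the percolation model can be sketched in a few lines of Python. This is a rough stand-in for the NetLogo demo, not a copy of it: the 40 by 40 grid, the 30 runs per density, the four-neighbor spread rule, and the names `burn` and `summarize` are all assumptions made for illustration. It also assumes a fire always starts along the left edge, whereas the yield discussion only says there is a chance of fire. On a grid this small the tip is smeared out around the threshold instead of sitting exactly at 59.27 percent.

```python
import random
from collections import deque


def burn(density, size, rng):
    """Plant trees at random, ignite the left column, and let the fire spread.

    Returns (fire_reached_right_edge, number_of_surviving_trees).
    """
    trees = [[rng.random() < density for _ in range(size)] for _ in range(size)]
    burned = [[False] * size for _ in range(size)]
    queue = deque()
    # Every tree in the left column catches fire at the start.
    for r in range(size):
        if trees[r][0]:
            burned[r][0] = True
            queue.append((r, 0))
    # Fire jumps only between trees that touch up, down, left, or right.
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < size and 0 <= nc < size and trees[nr][nc] and not burned[nr][nc]:
                burned[nr][nc] = True
                queue.append((nr, nc))
    reached = any(burned[r][size - 1] for r in range(size))
    planted = sum(map(sum, trees))
    lost = sum(map(sum, burned))
    return reached, planted - lost


def summarize(density, size=40, trials=30, seed=0):
    """Share of runs where the fire crosses, and the average surviving trees."""
    rng = random.Random(seed)
    results = [burn(density, size, rng) for _ in range(trials)]
    spread_rate = sum(reached for reached, _ in results) / trials
    mean_yield = sum(left for _, left in results) / trials
    return spread_rate, mean_yield


for density in (0.40, 0.55, 0.60, 0.65, 0.80):
    rate, kept = summarize(density)
    print(f"density {density:.2f}: fire crossed in {rate:.0%} of runs, "
          f"{kept:.0f} trees left on average")
```

Under these assumptions the sparse forest should almost never burn across while the dense one almost always does, and the surviving wood should drop sharply once the density passes the threshold, even though more trees were planted.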
what's nice is here we've seen again remember we talked about the fertility of models so we have a model that was used to explain percolation like why does water you know percolate or not percolate and we can use that for forest fires and what we see is that you know again ignoring things like wind speed and terrain and things like that that there seems to be some density at which a fire is really likely to spread and a density of trees at which it's not likely to spread but let's push it
even further let's imagine we have a model of percolating banks what would that look like what do i mean well let's again have this checkerboard thing let's suppose that you know here's a bank here's bank one and here's bank two here's bank three here's bank four and here's bank five now what we can imagine is suppose this bank fails it makes a bunch of bad loans well suppose that this bank then has borrowed money from these banks right so these banks all have given money right to bank one but when bank one fails it then
can't pay the money back to these other banks and then these banks may fail if they've loaned enough money to bank one they may fail because they don't get their money back and so the failure can spread so remember we talked about that before about how the imf has constructed these models of banks where here's a bank failing and it spreads to other banks failing and that leads to other failures and so on right now these models are more sophisticated than just bank fails and moves
to the next bank what you do is you write down you know sophisticated accounting equations where banks have assets capital and liabilities and they've got loans of different durations and those loans fail and you can sort of ask if we put stress on the system having a bank fail how far you know how fast does that spread so the basic premise is the same this notion of percolation but what you do is you add more detail rich you know accurate detail about exactly what those loans look like and then you could ask the question is
there a tipping point in the case of these banks so is there sort of a state at which the entire system is poised to suddenly have all sorts of bank failures right and again that's a question you can ask in the context of that richer model so maybe the insight from the simple percolation model will hold in the bank case and maybe it won't that's something we're only going to understand by writing down that richer model now you can do the same thing in the context of country failures right so i put this graph up
i think one of the first lectures for this course so this is a case where we have england fail first right and then what happens is i think um you know that spreads to iran and a couple of countries and then here it spreads to france or to germany and then it spreads down to france so what happens is you can ask if one country fails will that percolate through all the other countries or not and you can get a sense of is this whole system sort of poised for some giant failure in other words
you can ask is there a tipping point in the system of sort of you know country finance country level financial systems all right let's go even one further we can also take the same model think about information percolating what do i mean by that well imagine there's some network of people right so here's a network of people and you could ask me if there's some probability that if i hear some rumor get some piece of information or know something that i'm going to tell it to my friends right so now instead of there's probabilities things
are going to move across links i can ask as a function of that probability what's the likelihood that the information percolates that everybody hears about it right and i can use the same model and so what would the model tell me the model would say well if a rumor is juicy enough or if a piece of information is important enough then it's really likely to spread everybody's likely to hear it if it's not important enough then it may not spread so here's what's really interesting like let's go back and think about information percolation you might
think here's the value of a piece of information or here's how juicy right a piece of gossip is and here's sort of how many people hear it so here's the number of people who hear now you might think well that should be linear the more valuable the information the more juicy the gossip the more people should hear but if you actually construct a network model right that looks something like this and assume that there's some probability of people telling people across links what you're likely to get is something that maybe has a bit of a tip right
that is nothing happens if it's not very juicy or the value of the information is pretty low it doesn't spread but then once it gets above some critical threshold like right here it takes off and almost everybody's likely to hear and so this says that we should expect the distribution of information the distribution of rumors not to vary in a linear way with how interesting the information is or how valuable it is but instead to possibly have this kink to possibly have this tipping point and the reason why is because information spreads through networks right
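The threshold in rumor spreading can be checked with a small simulation. This is only a sketch under stated assumptions: the network is a hypothetical square grid of people (the lecture does not specify a network), each person who hears the rumor tells each neighbor with the same probability `p`, and the function names `rumor_reach` and `average_reach` are invented for illustration.

```python
import random
from collections import deque


def rumor_reach(side, p, rng):
    """Fraction of a side x side grid of people who end up hearing a rumor.

    Assumption: each person who hears the rumor tells each of their
    (up to four) grid neighbors with probability p.
    """
    start = (rng.randrange(side), rng.randrange(side))
    heard = {start}
    queue = deque([start])
    while queue:
        x, y = queue.popleft()
        for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            inside = 0 <= nx < side and 0 <= ny < side
            if inside and (nx, ny) not in heard and rng.random() < p:
                heard.add((nx, ny))
                queue.append((nx, ny))
    return len(heard) / (side * side)


def average_reach(side, p, trials=50, seed=0):
    """Average reach over several independent rumors started at random people."""
    rng = random.Random(seed)
    return sum(rumor_reach(side, p, rng) for _ in range(trials)) / trials


if __name__ == "__main__":
    for p in (0.1, 0.3, 0.5, 0.7, 0.9):
        print(f"p={p:.1f}  average reach={average_reach(30, p):.3f}")
```

On this grid the reach should stay tiny for low `p` and then climb steeply rather than linearly, which is the kink described above. A different network would move the threshold, so the specific numbers here should not be read as general.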
and because it spreads through networks you get this same sort of percolation phenomenon okay so it's interesting that we took this percolation model from physics we saw how it gives us insight into forest fires and it even gives us insights into possibly how information you know spreads through a system you know it spreads through a graph okay let's have just a little bit of fun here let's really take the shackles off we've got this model and we've got this model about sort of percolation moving from one thing to another we can apply this to the
following idea that sometimes problems in mathematics or you know problems in engineering or scientific problems or innovations people have been working on them for years and then suddenly a whole bunch of people figure it out at approximately the same time and this can be a bit of a puzzle like nobody you know nobody knows how to make a steam engine then suddenly everybody's making a steam engine or everybody's trying to you know figure out some way to um you know identify the structure of dna and you know crick and watson found it first but other people are on
the heels of finding it right why is it that we see these bursts of scientific activity in a particular area why do we see many people come up with the same innovations the same scientific breakthrough at the same time well you could use the percolation model to basically say well this could be the logic if you think about constructing let's say a mathematical proof oftentimes it is a matter of just getting from a to b by putting together bits of logic or think about an innovation oftentimes it's about getting all the parts to work so
to have a car with an engine that worked you need a braking system you need steering mechanisms you need all these parts as information and knowledge accumulates we fill in more squares so initially right we can't get from a to b but what happens is technology and information right start filling in squares and then eventually right somebody can find a path right but what's interesting is not only can they find that path somebody else can find a path because there's multiple paths because once we get above the threshold that doesn't mean that there's one path
it could be that there's many paths and so it's at least plausible that a percolation type model might explain why we suddenly see bursts of activity in particular areas as the knowledge base increases again speculative but that's what's fun about models right once you've got a model you can think about i wonder if this applies in this other setting and gives us some insight as to why we suddenly see bursts of things like scientific activity so that's the percolation model right very simple checkerboard and you basically just ask can the frog jump from the top
to the bottom you can use it to understand percolation of water you can use it to understand forest fires you can use it to construct a richer model of bank failure you can use it to understand how information percolates through a system right through a social system and you can even speculate on whether this might explain why we see bursts of scientific activity in particular areas and bursts of innovations all right thanks we're now going to look at our second model of tipping points and this is a model from epidemiology and it's known as the
sis model for susceptible infected and then susceptible so that is you're susceptible to some disease then you get infected and then after you get infected you're cured but then you can become susceptible again if the disease has mutated in some way like a flu virus there's also something called the sir model where after you become infected then you're recovered right then there's no chance of getting the disease again all right what we want to do with this model is show that it produces a tipping point now there's going to be a variable that comes out of
our model called the basic reproduction number and the basic reproduction number if it's bigger than one means that everybody's going to get this disease if it's less than one it means that no one will so it's going to be a lot like right our percolation model where we get this you know tipping point r zero less than one no spread of the disease r zero bigger than one everybody gets the disease now this model is pretty intuitive but it's got a lot of notation so to sort of build up to
it i'm going to start with something simple known as the pure diffusion model so in the diffusion model everybody just gets it there's no you know sort of getting cured so think of this as diffusion of information through a system or a disease that everybody's just going to get right so the diffusion model sort of works as follows let's suppose that there's some new disease called the wobblies and we're going to let w sub t be the number of people who've got the wobblies at time t now if there's n people total in our population
this could be a community or this could be an entire society then n minus w sub t is going to be the number of people who don't have the wobblies and so we can imagine that like tau right this variable tau is just the transmission rate so it's the likelihood that someone with the wobblies gives the wobblies to someone who doesn't have the wobblies right so it could be that two people meet where one has it and one doesn't and they don't get it and it could be that they do get it tau is just the
rate at which that occurs so if two people meet what's the likelihood that one person would give it to the other well remember w is the number that have the wobblies and n minus w is the number that don't so what you need is you need one person to have the wobblies and one person not to so what's the probability that someone has the wobblies well that's just w over n that's the probability of someone having the wobblies what's the probability of the other person not having the wobblies well that's just n minus w over
n so if you want to think about what's the probability of two people meeting where it could get transmitted this is it and then you just have to multiply that by tau right because that's the probability that if those two people do meet in fact the disease moves from one person to the other now instead of a disease if you think of this as a new technology or a piece of information it's the same model tau is just the probability that i tell you the piece of information right or that i tell you about the technology
and you adopt it okay so now we've got this much fancier formula that this is the probability that it's going to move from one person to someone else if two people happen to meet and again tau is just the transmission rate and the only thing that's different here is now i've got these fancy t's here to represent that this is the rate at which it's going to go at time period t right because the number of people that have the wobblies is going to change from period to period well now i'm going to add one more
thing which is the contact rate because it depends on how often do people actually meet so you can imagine a situation where people don't meet very often or you can imagine a situation where people meet a lot now oftentimes what counts as a meeting right could differ right so if it's um a disease then meeting would have to be a physical meeting if we're talking about a piece of information that meeting could be over the internet or over the phone so there's a contact rate c what you can imagine is that i've got that formula
right tau which is the transmission rate and i've got the people who have it which is w over n and the people who don't have it which is n minus w over n so this is the probability someone has it right and someone doesn't have it and this is sort of the rate if they meet well then i've got to multiply this by the probability that two people would meet right so c is the contact rate and if there's n people i've got to multiply this whole thing times n times c because c is the rate
so n times c is going to be sort of the number of meetings right so this is going to be the number of meetings this is going to be the rate at which it transfers and this is going to be the probability that meeting is between one person who has it and one person who doesn't now i get this incredibly complicated formula that looks like this the number that have it at time t plus one is the number that have it at time t plus the number of new people who get it right that's what i mean by a lot of notation so there's
a lot of notation here but nothing's complicated so if i said to you how many people have the wobblies at time t plus one you'd say well it's the number that have it at time t plus the number of people who get it in the next period that's it it's just it's a little bit complicated to write down so what epidemiologists do is then they use this equation to try and say okay what does this tell us about the spread of a disease or the diffusion in this case of a disease right well it's going to look
like this it's going to start out really low and then it's going to go really fast and then it's going to get really slow again well why is that let's look at this equation a little bit and i want you to focus on this part right here this w over n times n minus w over n when w is small then what you get is something let's suppose it's just one person you get one over n times n minus one over n right so that's not very big right because that's just going to be basically like
1 over n right that's going to be approximately equal to 1 over n but when i get in the middle when w equals say n over 2 like half the population have it then i get n over 2 divided by n times n over 2 divided by n and all those n's cancel and i get 1/4 so what's going to happen is that early on since not many people have it there's not many people who can spread it in the middle half the people have it half the people don't so it's going to spread
really fast later on when w is almost equal to n a lot of people have it but there's very few people who don't have it so there's not many people to spread it to so that's why early on there's few people who have it and so it can't spread very fast later on there's few people who don't have it and so it can't spread very fast so you get an s-shaped curve so again this is non-linear here's the point though no tipping point here right this isn't a tipping point this is just diffusion now you
might look at this graph and say oh boy here's a tip right here where it suddenly speeds up and here's another tip nothing tips all this is just the natural diffusion of a process diffusion starts out slow it then goes fast it accelerates so this is an acceleration but it's not a tip and then it decelerates because there's very few people who don't have it so just because you see a kink doesn't mean there's a tip so kink right does not equal tip it could just be an acceleration so if we look at something like
that facebook graph right the number of facebook users ooh we see this kink here we see boy it really accelerated that acceleration does not mean there's a tip it just means that facebook was diffusing and the thing is you could say well yeah but facebook is still going up up up up well at some point there's going to be no more people to get facebook right it's going to diffuse to the whole society and it's going to flatten out right so it's a pure diffusion process all
right so now we've got the diffusion model down we want to move on to something called the sis model now the sis model looks pretty much exactly the same as the diffusion model except for now we allow for the possibility that after someone's become infected they can move back into the susceptible pool so they can become recovered in a sense but then they're also susceptible again so this would be for something like the flu where after you're cured of
the flu the flu mutates and you can become infected again it wouldn't work for something like measles where once you've had the measles you're no longer going to get the measles so here's how the model looks looks exactly like our model before there's some number of people that have it at time t plus one and that's the number that have it at time t right plus the new people who get it right this depends on somebody who has it meeting somebody who doesn't have it right and this is the transmission rate and this is
the number of contacts but all we do to this is we subtract off this term a times w t well what is this these are the people who become cured right who no longer have the disease and they're going to go back into this pile and be people who could get the disease anew so we're sort of throwing in a rate of people getting better well now here's where this model's sort of more interesting because in the diffusion model something just spreads right it starts out slow it goes fast then it tails off
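The slow, fast, slow shape of the pure diffusion model can be reproduced by iterating the recurrence described above, W(t+1) = W(t) + N·c·τ·(W(t)/N)·((N − W(t))/N). This is a minimal sketch: the function name `diffusion_path` and the parameter values in the demo are assumptions chosen for illustration, not values from the lecture.

```python
def diffusion_path(n, c, tau, w0, steps):
    """Iterate the pure diffusion recurrence.

    n   : population size
    c   : contact rate
    tau : transmission rate
    w0  : number of people who have it at time 0
    Returns the list [W_0, W_1, ..., W_steps].
    """
    path = [float(w0)]
    for _ in range(steps):
        w = path[-1]
        new_cases = n * c * tau * (w / n) * ((n - w) / n)
        # Cap at n as a guard; with c * tau <= 1 the cap is never reached.
        path.append(min(float(n), w + new_cases))
    return path


if __name__ == "__main__":
    # Hypothetical values: 1000 people, c * tau = 0.5, one initial case.
    path = diffusion_path(n=1000, c=5, tau=0.1, w0=1, steps=40)
    for t in range(0, 41, 5):
        print(f"t={t:2d}  W={path[t]:8.1f}")
```

The per-period increase is largest when roughly half the population has it, because that is where (W/N)·((N − W)/N) peaks at 1/4. Nothing tips here: changing `c` or `tau` only changes how quickly the curve saturates.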
but here what happens is while people are getting sick at some other rate some rate a they're getting better and if people get better faster than people get sick then what's going to happen the disease isn't going to spread so this model is going to produce a tipping point so let's simplify things a little bit if we rearrange terms what we get is an equation that looks like this now how did i get this equation all i did was i pulled this w t term out right and i crossed out some n's let me
show you how that works right so up here i've got let's just focus on this part i've got n c tau right times w over n times n minus w over n right minus a w okay and so what i do is i say well you know what let's cross out this n with this n that's fine because those n's go away and then i've got a w here and a w here so i'm just going to pull out that w and then i've got c tau n minus w over n minus a right that's what i've
got and if i look over here that's what i've got i've got c tau n minus w over n minus a so this is just the simplification so when you write models it's useful to be good at algebra because if you can do lots of algebra you can simplify things why do we want to simplify things here's why look suppose we're early on in the disease so that means w t is really small so that means that n minus w t over n is going to be really close to one because basically a very small percentage of people
in the population have the disease so now if i look at this thing i can say well you know w t plus 1 is really equal to w t plus w t times c tau minus a so this thing is going to spread if c tau minus a is positive and it's not going to spread if c tau minus a is negative so this is going to come down to is c tau minus a bigger than zero right or is c tau bigger than a or another way to write this
is is c tau over a bigger than one all right and this leads to what's called the basic reproduction number we let r zero equal c tau divided by a if c tau divided by a is bigger than one the disease spreads right because that means that w t plus one equals w t plus something positive right if r zero is less than one right in other words if c tau over a is less than one then w t plus one equals w t plus something negative and the disease dies off so this r zero is called the
basic reproduction number and what it basically tells you is does the disease spread but notice here we've got a tip right r zero less than one no disease spread r zero bigger than one disease spreads so let's take some real diseases diseases like measles mumps the flu the r zeros are fifteen five and three that's why these are real diseases there might be a ton of diseases out there that have r zeros less than one and we never even hear about them why because they don't spread now again for things like measles and the mumps once you
get them you don't fall back into the population so there's a model that you use there called the sir model which is very similar to the sis model but for the flu right you fall back into the pool so you can think of it like this sis model now what's interesting here is right from these basic reproduction numbers we can say you know they're past the tipping point the disease is going to spread but let's think about this figure why do we construct models well a bunch of reasons right but one reason is to design policies
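The tip at r zero equal to one can be seen by iterating the SIS recurrence from above, W(t+1) = W(t) + N·c·τ·(W(t)/N)·((N − W(t))/N) − a·W(t), with R0 = c·τ/a. This is a sketch under assumptions: the function names and the parameter values are hypothetical, picked only so that one run has R0 = 0.5 and the other has R0 = 2.

```python
def basic_reproduction_number(c, tau, a):
    """R0 = c * tau / a for the SIS recurrence."""
    return c * tau / a


def sis_path(n, c, tau, a, w0, steps):
    """Iterate the SIS recurrence and return [W_0, ..., W_steps].

    a is the rate at which infected people recover and become
    susceptible again.
    """
    path = [float(w0)]
    for _ in range(steps):
        w = path[-1]
        new_cases = n * c * tau * (w / n) * ((n - w) / n)
        nxt = w + new_cases - a * w
        # Keep the count inside [0, n] as a numerical guard.
        path.append(max(0.0, min(float(n), nxt)))
    return path


if __name__ == "__main__":
    n, w0 = 1000, 10
    # Hypothetical parameter sets on either side of R0 = 1.
    for c, tau, a in ((2, 0.1, 0.4), (4, 0.1, 0.2)):
        r0 = basic_reproduction_number(c, tau, a)
        final = sis_path(n, c, tau, a, w0, steps=200)[-1]
        print(f"R0={r0:.2f}  infected after 200 periods={final:.1f}")
```

With R0 below one the infected count shrinks toward zero from any small start, and with R0 above one it grows. In this particular recurrence the count settles near N·(1 − 1/R0) rather than reaching literally everyone at once, since people keep recovering and being reinfected.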
you could say how do we stop these things from spreading well one obvious answer is vaccines then the question is how many people do you have to vaccinate well turns out the model will tell us so let's think about it this way right let v be the percentage of people that you vaccinate so there's this basic reproduction number r zero that's like the rate at which the thing spreads through the population well if i vaccinate some percentage of the population that's just going to reduce right the basic reproduction number by that fraction so if half
the people were vaccinated r zero is effectively divided by two right and if 75 percent of people are vaccinated r zero is going to be divided in effect by four right so you've only got one fourth of the real population so the question is how many people do you have to vaccinate as a function of r zero well in some sense if we vaccinate right a fraction v of the people it's like we've got a new r zero right little r zero which is just the big r zero times the percentage of people in the population that aren't vaccinated so what we can
do is we can say we want r zero big r zero times one minus v to equal one actually we want it to be less than one right well we can multiply this out we get r zero minus r zero v right has got to be less than one right so we can bring this over the other side we get r zero minus one we need to be less than r zero times v so that means we need v to be bigger than one minus 1 over r0 right so what we get is we get
this equation that says this is how many people we need to vaccinate well let's go back remember the measles were 15 the mumps were five how many people do we need to vaccinate to prevent the measles from spreading well that's just 1 minus 1 over 15 which equals 14/15 so we need to vaccinate 14/15 of the population against measles if we want the measles not to spread for the mumps right we need one minus one fifth right we only need to vaccinate eighty percent of the population well here's the model so usually again
this model isn't exactly right we're leaving out all sorts of things like networks and changes in contact structures and different locations and stuff like that but still what this tells us which is really important is depending on how virulent the disease is you have to change how many people you vaccinate in a pretty you know understandable way now the other interesting thing is there's a tipping point in terms of vaccines right if we vaccinate 75 percent of the people and we needed to vaccinate 80 percent of the people well guess what it's not going to
work the 25 percent who don't get vaccinated are all going to get the disease but if we have vaccinated 81 percent of the people then the 19 percent of the people who don't get vaccinated they're still going to be protected because the disease isn't going to spread okay so the cool thing about this model is it's given us a tipping point r zero it's also given us a policy and this policy is interesting in the sense that like vaccination has no effect really other than on the people you vaccinate it has no population
level effect until you get to the threshold once you pass the threshold then that next person to get vaccinated in a sense right makes the whole rest of the society immune because the disease can't spread so step back a second in the diffusion model right where we got that nice sort of s-shaped curve there's no tip in the sis model there's this r zero which is the tipping point right so there's this value one so there's no disease if r zero is less than one and there's the disease if r zero is bigger than one
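The vaccination rule derived above, v greater than 1 − 1/R0, is easy to tabulate. This is a small sketch: the function names are invented for illustration, and the R0 values are the ones quoted in the lecture (measles 15, mumps 5, flu 3), not independently sourced estimates.

```python
from fractions import Fraction


def effective_r0(r0, v):
    """R0 after a fraction v of the population is vaccinated: R0 * (1 - v)."""
    return Fraction(r0) * (1 - Fraction(v))


def vaccination_threshold(r0):
    """Smallest fraction v with R0 * (1 - v) <= 1, that is 1 - 1/R0.

    Returns 0 when R0 is already at or below one, since the disease
    does not spread even without vaccination.
    """
    r0 = Fraction(r0)
    if r0 <= 1:
        return Fraction(0)
    return 1 - 1 / r0


if __name__ == "__main__":
    # R0 values as quoted in the lecture.
    for disease, r0 in (("measles", 15), ("mumps", 5), ("flu", 3)):
        v = vaccination_threshold(r0)
        print(f"{disease}: R0={r0}, vaccinate more than {v} ({float(v):.1%})")
```

Exact fractions are used so that 14/15 and 4/5 come out as stated rather than as rounded decimals. At exactly the threshold the effective R0 equals one, so in practice coverage has to exceed it.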
what you can do by vaccinating people is in effect reduce r zero so that's the sis model we've talked about it in the context of disease you can even think of it though in terms of the spread of information as we talked about and it's a very interesting model in the sense that it does also generate a tipping point it's another non-linear model and it's a model where we get this you know threshold phenomenon r zero less than one no spread r zero bigger than one everybody gets the disease okay not a
happy thought but let's move on thank you hi in the previous three lectures we talked about tipping points and we saw two models right we saw a percolation model and we saw an sis model both of those created tips what i want to do in this lecture is give some formal definitions of tipping points i'm going to make two distinctions i'm going to distinguish between what i call direct tipping points where when i change that variable it causes that variable to tip versus contextual tipping points where the change in some action or the change in some parameter
causes the system to change to result in the tip so in the percolation model in the forest fire model we had a change in the density of trees or in the density of the soil and that caused the system to tip so that's a contextual tip i'm also going to talk about tips between and within class right remember we talked about sort of equilibrium systems and complex systems and periodic systems systems can tip from one of those states to another now to understand all this i've got to start with something else i'm going
to start with just some very basic graphs describing some dynamical systems so we can better understand how these different tipping points operate so we're not going to do the mathematics of the dynamical systems we're just going to draw some simple graphs and they'll help make sense of a lot of this okay so here's the basic idea of a dynamical system what you've got on this axis is here's this variable x and what this is telling you this little x dot is telling you how x is going to change that's what this line tells us
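To make the role of x dot concrete, here is a minimal numerical sketch. The rate function is a hypothetical cubic chosen for illustration, not the curve drawn in the lecture: it is zero at 0, at `a`, and at 1, negative between 0 and `a`, and positive between `a` and 1. The names `x_dot` and `settle` are likewise assumptions.

```python
def x_dot(x, a=0.4):
    """Hypothetical rate of change of x, with equilibria at 0, a and 1.

    Negative for 0 < x < a (x moves left), positive for a < x < 1
    (x moves right).
    """
    return -x * (x - a) * (x - 1.0)


def settle(x0, a=0.4, dt=0.01, steps=20000):
    """Follow x from x0 using small Euler steps and return where it ends up."""
    x = x0
    for _ in range(steps):
        x += dt * x_dot(x, a)
    return x


if __name__ == "__main__":
    for x0 in (0.10, 0.39, 0.41, 0.90):
        print(f"start at {x0:.2f}  ->  ends near {settle(x0):.3f}")
```

Wherever x dot is positive x moves right, and wherever it is negative x moves left, so the sign of the curve alone determines where the system ends up. In this example, starts just below and just above 0.4 end at opposite ends of the interval.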
so if the line is positive if it's positive it means that x is going to move in this direction and if the value is negative like it is over here it means x is going to move in this direction so if you look at this particular graph of a dynamical system it shows a stable equilibrium if i start here i'm going to move in this direction to y right there to this point y over two and if i start here i'm going to move in this direction and go here so if you perturb the system
in any way we're going to go right back to that point so it's stable here's a more complicated dynamical system in this dynamical system if i'm in this region the value of the function is negative right so i'm going to move in this direction if i'm in this region the value of the function is positive so i move in that direction and in this region the value of the function's negative so i'm going to move in that direction so what that means is if i start here i'll go to this equilibrium right and if i start here i'll go
here and if i start here i'll go there so what you get is you get a stable equilibrium here and a stable equilibrium here now notice this point right here where the value is zero is an equilibrium but it's a tipping point because it's unstable any slight change in the variable if i just move it a tiny bit to the right i'm going to head over to this equilibrium and if i put it a tiny bit to the left i'll head over to zero so you go from a stable equilibrium to an unstable
equilibrium to a stable equilibrium this unstable equilibrium we can think of as a tipping point because any slight movement any change in that variable will lead to a large change in the variable that's what i'm going to call a direct tip so you're sitting at some unstable point right here and if i just moved it in either direction right i'm going to roll down the hill and either go to this equilibrium or that equilibrium so that's a direct tip tiny change in the variable itself leads to a big change eventually right in the value of
that variable so formal definition direct tip small action or event has a large effect on that end state right so you change the variable a little bit and it has a huge effect on what happens in the long run so for example world war i right the killing of archduke ferdinand right um you could argue that this resulted in this tip where the whole world goes to war and tens of millions of people die right but this sort of just tipped the whole system right and led to you know alliances forming and you know nations
waging war against one another all because of this one act right it's a direct tip now in the vietnam war right one of the things that escalated the war was the bombing by the north vietnamese of pleiku now mcgeorge bundy who was the national security adviser at the time he famously said pleikus are like streetcars what he meant by that is that streetcars come down the road all the time that pleikus these events that can escalate war they happen all the time so what he was saying basically is that even though pleiku was what
i'm going to call a direct tip right that yes true it was a tipping point in the war but they come along all the time because what he was saying is the context right the environment was such that it was bound to happen so if we think back to our percolation model what mcgeorge bundy was saying is look it was going to percolate the system was going to tip the probability was above 59.27 percent so all we needed was one drop of water and it was going to race its way through all we needed was one
match and the whole forest was going to catch on fire and those matches pleikus are like streetcars so the point here and it's an interesting one is even though we often focus on the direct tip we focus on the killing of the archduke we focus on pleiku what's really going on in a lot of these cases is that what's causing those direct tips is a change in the context right so let me explain how that can work so here's our function right and we've got you know this system where here's a stable equilibrium here's an
unstable equilibrium here's a stable equilibrium well what happens if the context changes and moves this line down a little bit right well if it moves this line down a little bit then all the arrows point in this direction so this equilibrium that was stable right is no longer stable so what we've got is we've got a contextual tip a change in the environment just by a tiny bit has an effect such that it wipes out right in this case by the environment causing this curve to shift down it wipes out this equilibrium the dynamical system
always points to the left and what you get is you get a completely new equilibrium at zero right so a contextual tip is a change in the environment so in the percolation model it's sort of filling in more squares in the forest fire model it's growing more trees in a social network model it's making more connections and in doing that it makes the possibility of a direct tip go way up and it means that the end state of the system is likely to change so remember we saw that in a percolation model this is
a change in context right it's changing sort of the number of squares filled in remember also our sis model right r zero was the basic reproduction number when we change the virulence of the disease or we change the rate at which people contact or we change the rate at which people sort of recover from the disease that's changing the context in which that disease spreads and it's changing the likelihood that you know once the disease gets kick-started that it's going to spread throughout the population so changing the context is what makes these direct tips
possible okay so we've talked about direct tips where the variable moves a little bit and causes itself to move even further right sort of bootstraps itself and then contextual tips where the environment changes i want to make one more simple distinction remember we talked about four different types of systems right equilibrium systems periodic systems random systems and complex systems a system can tip within a class so it can go from one equilibrium to another or it can tip between classes it can go from equilibrium to periodic and from periodic to complex so for example if
you look at pictures of chaos this is a logistic map you can have a system that starts out in an equilibrium and then moves to a simple periodic system where it goes back and forth and then you can get in here where it gets these sort of really you know sort of interesting patterns that might be classified as complex or even random right so the system can move from equilibrium to periodic to complex as you change your parameters these would be contextual tips that are between class right so tipping points what types are there
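The move from equilibrium to periodic to complex behavior in the logistic map can be checked by counting how many distinct values the system keeps visiting once it has settled. This sketch assumes the standard form of the map, x(t+1) = r·x(t)·(1 − x(t)); the function names, the starting value, and the particular values of `r` are choices made for illustration.

```python
def logistic_orbit(r, x0=0.2, burn_in=1000, keep=64):
    """Iterate x -> r * x * (1 - x), discard burn_in steps, return the next keep values."""
    x = x0
    for _ in range(burn_in):
        x = r * x * (1 - x)
    orbit = []
    for _ in range(keep):
        x = r * x * (1 - x)
        orbit.append(x)
    return orbit


def distinct_values(r, digits=6):
    """Number of distinct long-run values, after rounding to absorb float noise."""
    return len({round(x, digits) for x in logistic_orbit(r)})


if __name__ == "__main__":
    for r in (2.8, 3.2, 3.5, 3.9):
        print(f"r={r}: {distinct_values(r)} distinct long-run value(s)")
```

A count of one corresponds to an equilibrium, two or four to a periodic cycle, and a large count to the irregular regime. Only the parameter `r` changes between runs, which is what makes these contextual tips between classes.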
right so there's direct tips where you change the variable that causes the variable to tip right there are contextual tips where the environment changes once the environment changes then the system is likely to move from one state to another once somebody lights the match there's tips within class where you move from one equilibrium to a new equilibrium and there's tips between class where a system tips from an equilibrium to you know a much more complex state or a periodic state so that's a simple taxonomy of tipping points and we've seen models that generate you
know pretty much each one of those types where we want to go next we're going to talk about how do we measure tips so is there a way to measure how tippy a system is thank you hi in this final lecture on tipping points what i want to do is i want to talk about how we can measure tips now remember previously we've talked about active tips or direct tips where the variable itself causes the system to tip and then contextual tips where something in the environment remember in the case of the percolation model causes
the environment to suddenly be willing to tip we've also talked about tips within class so from an equilibrium to an equilibrium and tips across classes from an equilibrium to complex what we'd like to do is now have some sort of way of measuring how big a tip is like is it a tip that was completely unexpected was it a tip that maybe was you know sort of likely to happen or just how rare the event that the system tipped into was so to do that we need some way of measuring now that's a key thing in this
model's class right because we want to be able to take these models and take them to data some way to use data so if we're going to use data we have to have some measures so what i want to do in this very simple lecture is just introduce two different measures on what tipping points are to measure the extent of the tip one is going to use something called the diversity index the other is going to use a formula called entropy which is used a lot in physics not that much in social science okay so
let's get started let's first think about an active tip so with an active tip what we have is we've got you know this ball up here and it's sitting on top of this peak and it could either go this way or this way and if it heads this way it's going to go down here or it can head down here each of these things has a 50 percent chance of occurring once the system is tipped then it's either going to be to the right or to the left so if it's down here now there's 100
percent chance that it's over here and there's no chance it's over here on the left when we think about a tipping point what we can think of is that previously what was going to happen was uncertain it could either go left or it could go right but once the tip has occurred we know what's happened we know the system is now completely going to be on the right so one way to think about tipping points and the measure we're going to introduce is going to depend on that idea that the uncertainty goes away initially there
was some uncertainty it could go left or right but after the tip we know where it's going to go so we're going to measure tippiness by reductions in uncertainty so to get there we first need a measure of uncertainty a way to think about that is nothing but changes in outcomes so initially there's a whole bunch of different outcomes that could occur but then after the system tips either there's only one outcome that could occur maybe if it goes to some equilibrium or it could be the case that there's a whole bunch of other
things that could occur so it could be that you thought for sure a was going to happen but after the tip it could be b c or d so what you want to think of is changes in the likelihoods of different outcomes so how do we measure that how do we measure likelihoods of different outcomes so we're going to do as i mentioned two measures one is going to be this thing called the diversity index which is used in social science a lot in economics political science sociology and the other is going to be a
formula called entropy which comes from physics and information theory so here's the idea suppose we've got four outcomes a b c and d and suppose each one has a probability of one fourth so each one has a one-fourth probability of occurring the question is how do we measure how uncertain that is here's what we're going to do first we're going to introduce the diversity index and here's how it works you've got four possibilities a b c and d so here they are p a p b p c and p d so remember from probability
that if these four things are the only things that can happen these are going to have to sum up to one well we want some measure of sort of how diverse this distribution is over these four outcomes because if it's a fourth a fourth a fourth and a fourth that's more diverse than if it's a half a half zero zero so here's the idea of the diversity index we basically first compute the probability that if two people meet they're of the same type so let's suppose this is a distribution across
people there's types a b c and d well what are the odds that they're both type a's well that would be p a times p a and what would be the probability they're both type b's well that would be just p b times p b and what's the probability they're both type c's well again that's p c times p c and then finally they could both be d's i can write this in a fancy way i can write it as a summation over i equals a b c or d of p i squared right so i
can just take each of these p a p b p c p d square them and just sum those up so this little funny sign here is a summation sign so let's suppose i have p a equals p b equals p c equals p d so they're all 1/4 well then what i'm going to get is i'm going to get 1/4 squared plus 1/4 squared plus 1/4 squared plus 1/4 squared which is 1/16 plus 1/16 plus 1/16 plus 1/16 which is 4/16 which is
1/4 so that's telling us it's sort of like one-fourth so to get the diversity index what we do is we just take one over the summation of those p i squareds so that's going to be one over one-fourth which is equal to four so what that's telling us is that look it's like there's four types here okay so diversity index remember was just one over the summation of these p i squared so before we had p a equals p b equals p c equals p d and we got a diversity index of four well let's suppose we
have um p a is a half p b is a third and p c equals a sixth so we've got three types a's b's and c's but they're not evenly distributed so what we'd like is our diversity index to be maybe a little bit less than three because this isn't quite three types so let's see what happens we're gonna get p a squared which is 1/4 plus p b squared which is 1/9 plus p c squared which is 1/36 so we can put all these over 36 and we get 9 plus 4
plus 1 over 36 that's 14 over 36 so if we take the diversity index that's going to be the reciprocal of that which is 36 over 14 so that's 2 and 8 over 14 or 2 and 4/7 so what the diversity index tells us is well you know this is not quite three because you know you're more likely to get a than you are to get c and so we'll call this 2 and 4/7 so what the diversity index sort of tells you is approximately
how many different types of things there are so why does this work to measure tips well we could think of it like this if initially we thought there were five places it could go and then after the tip there's only one place it could go the diversity index would fall from five to one if initially it could go to a with probability one half b with probability one third or c with probability one sixth then we could say well initially there were two and four sevenths places it could go but then if it tips and goes to c
we can say now there's only one place it can go and so we get a reduction in the diversity index from two and four sevenths down to one and that helps us measure the extent of the tip okay now let's turn to entropy entropy is a slightly more complicated formula but it also measures the degree of uncertainty so entropy again comes from physics and information theory and it looks like this it's equal to minus the summation of those p i's that looks similar to what we had before but now times the log base
2 of p i now what is the log base 2 log base 2 of 2 raised to the power x just equals x so it tells you what's the exponent of the number with respect to 2 so it tells you the power of 2 so if i take the log base 2 of 1/4 that's going to equal the log base 2 of 2 to the minus 2 right so i can write 1/4 as 2 to the minus 2 so that's just equal to minus 2 so if we take our example we had before and
computed the entropy we'd get minus the quantity 1/4 log base 2 of 1/4 plus 1/4 log base 2 of 1/4 plus 1/4 log base 2 of 1/4 and then we've got one more plus 1/4 log base 2 of 1/4 and if you haven't seen logs before don't worry about it i just want you to get a sense of the idea behind this what the entropy measures so i'm going to explain conceptually what this is after we finish this calculation so we're going to get minus the quantity 1/4 times minus 2 right plus 1/4
times minus 2 plus 1/4 times minus 2 plus 1/4 times minus 2 right and we get this remember because the log base 2 of 1/4 is just the log base 2 of 2 to the minus 2 and so it's just minus 2 so that gives us minus and i've got four times one fourth times minus two so i get minus minus two which equals two okay so what does entropy tell us what entropy tells us is the number of pieces of information the number of bits
of information you'd have to know in order to identify the outcome so let's go back to our example where there's four outcomes a b c and d and they're equally likely well how many pieces of information would i have to tell you in order to identify what the outcome is well suppose the first thing i do is i tell you whether it's a or b or whether it's c or d right so i could divide this set in half then in the second thing i can tell you
whether it's a or it's b right so i could always identify which one it is by asking two questions so question one could be is it a or b or is it c or d so you have to name which of these two sets it's in and then question two would depend on question one if you said it was in a or b then i would just ask okay is it a and if you said yes then it's a if you said no then it's b so entropy tells you just how many pieces of information you would
have to know to identify the outcome and so if there's four outcomes you'd only need two to summarize the diversity index tells you sort of the number of types and entropy tells you the amount of information you would need to identify the type right so in our example the diversity index was four because there were four equally likely types right a b c d were all equally likely and the entropy was two because with two questions you could identify which type it was right you could ask is it a or b or c or
d and if they said c or d you could ask is it d and if they said yes it's d and if they said no it's c so entropy is the amount of information and the diversity index is the number of types either one of these things will work so let's see this in action from our first example remember initially it was sitting up here there was a 50 percent chance it goes here and a 50 percent chance it goes here so the diversity index in this case would be two right because there's two outcomes and each one is equally likely
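editor's sketch, not part of the lecture: the two measures described so far can be checked numerically. this is a minimal illustration that assumes the diversity index is one over the sum of squared probabilities and entropy is minus the sum of p times log base 2 of p, as stated in the lecture. the function names diversity_index and entropy_bits are made up for this example.

```python
import math


def diversity_index(probs):
    """One over the sum of squared probabilities: roughly the number of types."""
    return 1.0 / sum(p * p for p in probs)


def entropy_bits(probs):
    """Minus the sum of p * log2(p); outcomes with zero probability add nothing."""
    return sum(-p * math.log2(p) for p in probs if p > 0)


# Distributions taken from the lecture's own examples.
four_equal = [1 / 4, 1 / 4, 1 / 4, 1 / 4]   # a, b, c, d equally likely
uneven = [1 / 2, 1 / 3, 1 / 6]              # a, b, c not evenly distributed
before_tip = [1 / 2, 1 / 2]                 # ball on the peak: left or right
after_tip = [1.0, 0.0]                      # ball has rolled to one side

for name, probs in [
    ("four equal outcomes", four_equal),
    ("half / third / sixth", uneven),
    ("before the tip", before_tip),
    ("after the tip", after_tip),
]:
    print(f"{name}: diversity index = {diversity_index(probs):.3f}, "
          f"entropy = {entropy_bits(probs):.3f} bits")
```

the size of a tip can then be read off as the change in either measure between the before and after distributions.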
if we did our formula we just get one-half squared plus one-half squared which is a fourth plus a fourth which is equal to one-half and then we take one over one-half and that gives us the diversity index of two now if you compute the entropy of this we won't do it you're going to get the answer one and why because all you need is one piece of information let's say is it to the left or is it to the right and that tells you where the ball is so the diversity index is two and the entropy is
one well what happens after this tips after this tips it goes over to let's say the left with probability 100 percent what's going to happen to the diversity index well when this thing goes to a hundred percent and zero then the diversity index is now one right because you know where it is and the entropy is zero because you don't need to be told anything you don't need any information about where the ball is because it's already on the left so there's no uncertainty left in the system at all so what you can see is the diversity index
went from two to one and entropy went from one to zero that's how we can measure how much a system tipped right because it's telling us it went from we don't know what's going to happen it can either go left or right to now we know where it is so the number of types has decreased from two to one or if we think in terms of information we can say previously i needed one piece of information now i need zero so if you think of tipping points when a system suddenly tips to some new
thing we go from a situation where we didn't know what was going to happen to a situation where we know what's going to happen so that would be true if we had a between class tip from a complex situation to an equilibrium situation alternatively right we could have a system tip from a nice equilibrium where the diversity index was low and the entropy was zero to one that became incredibly complex where uncertainty was very high then the diversity index would increase and the entropy would also increase so what we've got with these measures of
tippiness is we could have an increase or a decrease in both the diversity index and entropy but the tip in a system is basically telling us that what's going to happen is likely to change now the way to think of tips is changes in the likelihood of outcomes whether we talk about these direct tips or these contextual tips or these between class or within class tips what's really happening with the tipping point is what we thought was going to happen is no longer going to happen or maybe we didn't know what's going to happen and now we know
what's going to happen so the system has tipped and to measure that we need some measure of uncertainty and we've introduced two right we've introduced the diversity index and we introduced entropy so tipping points has been sort of a big topic but we've learned some stuff right we've learned that just because we see a kink in the graph doesn't mean it's a tipping point that could just be a growth process or that could be a diffusion process we've also learned there's a difference between direct tips and contextual tips and between tips across classes and
within class tips and we've learned a little bit about how to measure those tips okay thank you hi in this next set of models we're going to talk about a very specific thing growth economic growth to be more specific we're going to ask why is it that some countries are rich and other countries are poor to put a framework around that a model around that we're going to start out by looking at an even simpler model a model of exponential growth so here's a model where you just put money in
the bank and we talk about sort of the rate at which it grows and see how that accumulates over time from that base model we're then going to construct a model of a very primitive economy and show how economies grow now one of the surprising results of that model of economic growth is going to be that there's limits that without innovation growth stops so we're going to move from that simple model to something called the solow growth model and the solow growth model allows for there to be innovation and shows how innovation has this
sort of multiplier effect on our collective well-being and why innovation is so important and then we'll talk a little bit about some extensions in particular once we've got this model how do we use it to think about why some countries are successful and other countries aren't and really what enables growth to continue over time okay so to get us started first we just need some basic definitions so first what do i mean by growth growth of what well you can think of growth of overall human happiness right that'd
be a nice thing to think about but that's harder to measure in some respects right so we're going to focus on the same thing economists do which is gdp so this is gross domestic product so this is just the total market value of all the goods and services produced within an economy okay and there's data kept on it so for example if you look here's a whole bunch of countries in the world and their gdp so you see luxembourg at the very top if you look down a little ways you'll see the united states right at 15th
and then you know we have a per capita gdp of about 47 000 and further down and so on you see spain and there's very poor countries where gdp is you know measured in the single digit thousands of dollars per person okay we want to know though like you know what causes growth so here's sort of an interesting graph right here's botswana and zimbabwe now these countries aren't the same right they're both in africa but botswana only has about two million people zimbabwe has about 13 million people so zimbabwe's a much larger country but look at
gdp right you can see that zimbabwe has pretty much stayed flat right it hasn't gone up from 1966 to 2005 whereas in contrast right botswana has been very successful so you want to understand what is it that enables one country to succeed and another country to not succeed and we'll talk about a book by daron acemoglu and james robinson called why nations fail which really focuses on this line here right like why is it the case that zimbabwe hasn't been successful now if you look at gdp when you talk about growth what we mean is changes in gdp
here's the annual change in real gdp since 1930 in the united states notice here they say real when economists say real what they mean is taking into account inflation so if inflation goes up by 10 percent but the economy went up by 15 percent you've got to subtract off that 10 percent just due to prices rising if you look at gdp in the united states what you see is during the war right there's huge increases in gdp and then we had this sort of nice post-war period where growth was fairly high
you notice it stays fairly high so you know it averages around you know three to four percent throughout this whole range right you see the occasional dips but basically you see fairly you know steady constant rates of growth but if you look from 2006 to 2011 right when we had this contraction you see this period here where growth fell we had massive decreases in growth and those are you know people really feel that right because they're much worse off now that we've looked at growth you can ask sort of can you sustain super high levels
of growth so look at china you see these sort of unbelievable levels of growth right these things are all in the eight nine percent range over the last few decades and what you can think is oh my god is china going to become 50 times the size of the united states when i was a kid people said the same thing about japan if you look at japan's growth rates what we see is again from 1950 to you know the early 1980s you see these huge growth rates but then if you
notice since then right their growth rates have fallen so one of the things this very simple model will do is explain why that's the case why can you sustain really high growth rates for a long time and then have them fall off once you sort of catch up to the other countries in the world and this is why a lot of people when you look at china right even though you see these high growth rates now a lot of people expect those to fall in the future we're going to start out by looking at exponential
growth and this is where you just put money in the bank and it just grows and grows and grows up up up up up like this right but look at economic growth it tends not to look that way right over time growth tends to fall off and that's because we put money into things like machinery and you know technologies but those depreciate over time and that prevents us from getting that upward sloping curve and so economic growth is often going to sort of go the other way it's going to start out fast and then tail
off sort of like it did in japan unless we can get some innovation and that will shift the whole curve up i'll show you what i mean in a minute one last thing now before we get started when you think about growth right that's focusing on material things material wants and you can ask yourself okay does this really matter shouldn't we be more interested in whether people are happy and we probably should be right it's probably more important that we're happy than that we have a lot of stuff you'd all trade stuff for happiness there's
a question like does gdp does higher growth make us happier well that's actually a fairly complicated question so here's a graph that shows gdp per capita and then here's a measure of life satisfaction so this is survey data on life satisfaction now notice the problem here gdp is what we can think of as really hard because we can measure it life satisfaction is sort of soft because you're surveying people and so you ask me one day i can say my life is great you could ask me another day and i might say oh my life
isn't that great so you've got to survey a lot of people you got to make sure you frame the question the right way still with all those caveats here's what you see does money make you happier and i want to focus on two different parts of this graph if you look in this region the answer is no as you go from 20 000 to 60 000 people just aren't that much happier but if you look in this region the answer is decidedly yes money does make you happier now the reason for this may be just
some basic services like health care food shelter those sorts of things so getting from zero to ten thousand that's huge getting from thirty thousand to sixty thousand maybe that doesn't matter that much so lifting people out of poverty clearly makes them happier that's one reason to focus on gdp now once you're making a hundred thousand dollars a year it may not make you any happier to make a hundred and twenty thousand dollars a year what we get is does growth make you happy at least according to survey data yes if you're lifting poor people up
no if you're making rich people richer okay so that's the framework we're going to do we're going to start out by talking about exponential growth then we're going to move on to economic growth models and we'll do three we'll do a simple economic growth model then something called solow's growth model and i'll talk just a tiny bit about something called endogenous growth models all right let's get started thank you hi in this set of lectures we're talking about economic growth and what we want to do is we want to understand why is it that some
countries are rich and some countries are poor so to get our bearings on how growth works we've got to start with a much simpler model because our economic growth models are going to have a lot going on they're going to have labor they're going to have physical capital they're gonna have depreciation rates and saving rates and all sorts of stuff so to just sort of get us to understand the basics of growth we're gonna start out with a much simpler case we're gonna start by talking about just compounding so you put money in the bank
we talk about the rate at which that grows and from then we're going to talk about then countries growing like gdp growing and we're going to see why different growth rates are so important because we're going to see that again growth is sort of exponential so we're going to talk about just a very simple sort of exponential growth rate we just keep putting money in the bank from that we're going to learn a really cool trick called the rule of 72 the rule of 72 will tell us how quickly our money will double or how
quickly gdp will double so if you think about the difference between an eight percent growth rate and a four percent growth rate and think it's twice as much we'll actually see according to the rule of 72 that it's even more than that doubling the growth rate has really significant effects okay so let's get started so let's start with just sort of like you know basic accounting 101 you know you put some money in the bank so let's suppose you've got x dollars and you put it in the bank at r percent interest how much
do you get well let's suppose you got a hundred dollars and you put in the bank and let's suppose you get five percent interest well what you're going to get at the end of the next year is 105 right because the general formula for this thing is you just take x times one plus the interest rate right so that's the general formula so if i put 100 in the bank at five percent i'll get 105 back now if i put 105 back in the bank i'll get 105 times 1 plus 0.05 and that's going to
be 105 plus 5.25 which is 110.25 so in two years i'll have 110.25 now if i kept this money in the bank for 10 years i'd just have 100 times 1 plus 0.05 raised to the 10th power right because i just keep multiplying this by 1.05 times 1.05 times 1.05 and that's what i get so that's how you want to think about it if i buy a certificate of deposit and tell the bank okay i'm gonna put this thousand dollars in for six years at five percent they'll tell you okay well then you're going
to get a thousand dollars times 1.05 raised to the sixth power that's how much you get back at the end of the six years let's move on to that same very very simple thing with gdp now with gdp what we're going to do is instead of letting x be the amount of money we put in the bank it's going to be per capita gdp if there's a gdp right now of g and we have r percent growth then next year we'll have g times one plus r and in ten years we'll have g times one plus r raised to
the tenth power now why does that matter so much why do we care so much about this r why are politicians always talking about it why are bankers always talking about it why do we care so much about growth rates well to see why let's look at two cases let's look at a sort of low growth case a country that has a two percent growth rate and a high growth case a country that has a six percent growth rate let's start them out both in year zero with everybody making a thousand dollars so per capita income's a thousand
dollars well what happens after the first year after the first year the first country goes up by two percent so it's at a thousand twenty the other country's at a thousand sixty you could say big deal forty dollars per person that's not a huge difference well let's go ahead ten years in ten years if i use that formula the people in the first country are making about twelve hundred dollars apiece people in the second country are making about eighteen hundred dollars apiece so now they're 50 percent better off and if i go ahead 35 years right
so you know really maybe one generation maybe a generation and a half the first country's now doubled right so they're at 2 000 the second country's at 7 600 they've gone up seven point six times so they're almost four times better off let's suppose we go ahead a hundred years we've had a century one country plugs along at two percent growth the other country plugs along at six percent growth right the first country's now making seven thousand dollars per person the second country right people are making 339 thousand
dollars per person right so that's like 45 times as much so in a hundred year period this two percent versus six percent difference just becomes enormous and that's because this growth is exponential right so it's one plus r raised to the power t and so if r is bigger you get a huge increase so here's the rule of 72 and this explains sort of why what was going on was going on in that graph the rule 72 says divide the growth rate into 72 and the answer you get will give you the number of years
it takes to double right so let's suppose our growth rate is two percent right if our growth rate is two percent i take 72 divide it by two and i get 36 that means it'll take about 36 years to double let's go back to our graph you see it took 35 years to double so pretty close right what if i had six percent well 72 divided by 6 is 12 so it's going to be 12 years to double so what that means is in this first period
at two percent it's going to take me 36 years to double at six percent it only takes me 12 years to double which means that this country will double three times which is two times two times two which is eight so its gdp will be eight times its original gdp in the time it took the first country to double and if we go back and look at our data sure enough after 35 years it's effectively eight times as big so you see the rule of 72 isn't exactly right like it took only 35 years to double
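editor's sketch, not part of the lecture: the two country comparison and the rule of 72 can be reproduced in a few lines. this assumes annual compounding, so income after t years is the starting income times one plus r raised to the t, and it takes the exact doubling time to be log 2 divided by log of one plus r. the function names compound, exact_doubling_years and rule_of_72_years are made up for this example.

```python
import math


def compound(start, rate, years):
    """Value after `years` of annual growth at `rate` (0.02 means two percent)."""
    return start * (1 + rate) ** years


def exact_doubling_years(percent):
    """Years needed to double under annual compounding at `percent` per year."""
    return math.log(2) / math.log(1 + percent / 100)


def rule_of_72_years(percent):
    """Rule of 72 approximation: divide the growth rate in percent into 72."""
    return 72 / percent


# Two countries starting at 1,000 dollars per person, as in the lecture.
for years in (1, 10, 35, 100):
    low = compound(1000, 0.02, years)
    high = compound(1000, 0.06, years)
    print(f"year {years:3d}: 2% country = {low:10.0f}, 6% country = {high:10.0f}")

# Compare the approximation with the exact doubling time.
for percent in (2, 6, 8, 10):
    print(f"{percent:2d}%: rule of 72 = {rule_of_72_years(percent):5.1f} years, "
          f"exact = {exact_doubling_years(percent):5.2f} years")
```

running it shows where the rule gives slightly too many years and where it gives slightly too few.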
and this isn't quite at 8 000 but it's really accurate so for low growth rates it tends to you know overestimate the number of years and for high growth rates it tends to underestimate the number of years but at eight percent or nine percent it works just about perfectly so the rule of 72 right again which is really cool you just take your growth rate divide it into 72 and that tells you how long it's going to take to double so the move from two percent to six percent isn't just
a four percentage point increase in the growth rate right it's a dividing by three of the time to double so it means that every 12 years your country's going to double its well-being its gdp whereas in the first case at two percent it's going to take 36 years that's why people focus so much on growth rates and that's why we want to look at models that explain where growth comes from let's go back and look at the united states remember we're hanging out at about three or four percent well you think what's the difference between three and four
percent well at four percent it's 72 divided by four which means every 18 years we'll double and at three percent it's 72 divided by 3 which means every 24 years we'll double well what would you rather do double every eighteen years or double every twenty four years clearly you'd rather double every eighteen years so that's why we care a lot about boosting that growth rate even from something like three percent to four percent because it means we're going to increase our well-being much much faster okay excellent a lot going on here right we've talked about this sort
of you know simple interest rate thing where we've got x right times 1 plus r raised to the t power well this is sort of like a cheat here because what i've done is i've just assumed in the interest of simplification that the growth is happening just once a year so it's like once a year we do growth rates when i was a kid there was a commercial on television for a bank and they talked about how some banks only gave interest once a year and it said at this new bank we give interest every second of
the day so what we do is instead of saying okay we're going to give you at the end of one year x times 1 plus r we're going to give you interest let's just say suppose first we're going to give you interest every day so we're going to give you 1 plus r over 365. so we're going to reduce the rate divided by 365. we're going to give that to you 365 times so we're going to compute your interest daily and then they said we'll even do better than that we're going to do it we
could even do it hourly so we're going to divide the interest rate by the number of hours in the year and then we'll compute this interest every hour and then we can even do it every second and so on and so on and so on and they showed this guy with the calculator he's plugging away right computing these interest rates and i thought how are they doing that well the way they did it is they just used math it turns out if you have this formula and this should be an infinity here
if you have the number of periods go to infinity right so if you're doing it infinitely fast then what happens is this formula this one plus the interest rate over n raised to the power n t just becomes e to the r t remember e is that number euler's number which is about 2.71828 so why do we do this why do we do all this math the reason for doing all this math is basically you can think of the growth rate instead of thinking of this formula you know x times 1 plus r to the t you can do
something simpler you can just use e to the r t where e is this euler's number this 2.71828 what you can do is think about growth as sort of occurring continuously and this nice simple formula will give you sort of the rate at which things are going to grow and that's why it's called exponential growth because the rate at which you grow is exponential in this function in this number e so it's e raised to the exponent rt so why is that so important the reason it's important let's go back remember we
talked about linear functions in the previous set of lectures right remember linear function looks like this so that would mean that growth would sort of go 10 right 11 12 13 14 and so on right exponential growth goes like this it zooms up even faster than something like x squared so what that means is if you grow at a ten percent rate right you're not just going to go 10 11 12 13 14 15 16 right in fact if you go to 10 rate in seven years right remember the rule of 72 72 divided by
10 is roughly seven in seven years what you're going to do is you're going to double so you'd be twice as well off as you were before what have we learned we've learned that if we have a growth rate of say let's say three percent four percent five percent right that that can lead to significantly better you know higher gdp down the road than if the growth rate's just a little bit smaller and the reason why is because you've got this exponential growth the outcome is not going to be
sort of linear in these growth rates right it's going to be exponential over time so what you'd like to do is sustain a higher growth rate so what we want to do next is construct some models of growth where we can see how does economic growth depend on things like savings depreciation and technology okay thank you hi we're now ready for our first real growth model this is actually going to be a simple one it's a simple growth model but it's still going to be far more complicated than any model we've seen so far it's
got all sorts of moving parts that said still by economist standards by the models that economists use today it's still relatively simple right so let's hang in there so here's how it's going to work the model's going to have a group of workers and there's gonna be these coconut trees and the workers can pick the coconuts now when they pick the coconuts they can do two things with them they can eat them because they're good right and drink their milk so they live off these coconuts and they can also instead of eating the coconuts
use the coconuts to build machines right and these machines can help them pick coconuts even faster now there's one other assumption we're going to add in right these machines will wear out over time because you know they're made out of coconuts right so they're going to sort of degrade over time and have to be replaced with new machines so that's it very simple economy coconuts workers machines machines depreciate what we want to do is construct a model of that and we're going to use that model to explain how investment in capital in this
case the machines produces growth and the limits of that and then from that right we're going to see why innovation is so important but for now we just want to focus on this very simple model right so let's get started lots of moving parts tons of them okay so hang in there so there's going to be workers at time t so we're going to call those l of t l stands for labor right then there's going to be machines at time t we'll call that m of t right and we put those t down there those
t's in the subscripts because over time these are going to change right there'll be the workers at time one workers at time two machines at time one machines at time two and so on the workers and machines are going to combine to form output right and we're gonna call that o of t and then when you think of those coconuts remember they could be eaten that's e of t or they can be invested that's i of t and when i say invested that means they can be turned into more machines now to
figure out how many are eaten and how many are invested there's going to be a savings rate the savings rate will determine like you know the percentage that we put in savings which goes into investment and then the percentage that we don't save which we just eat and that's e sub t and then the last thing in the model i realize this is a lot is this depreciation rate that's the rate at which these machines wear out we're just going to assume that some fixed percentage of machines wear out each period that's a simplification but we're going
to use it all right so lots of stuff workers machines coconuts the coconuts get eaten and turned into more machines and there's depreciation now got to make some assumptions first assumption we're going to make is that the production of these coconuts is increasing but concave remember we had concave functions going up and sort of falling off in both workers and in machines that means the more machines the more coconuts the more workers the more coconuts but those things sort of fall off right we're going to use a specific functional form that says it's just the square
root of the labor times the square root of the number of machines second assumption output is either consumed or turned into machines right so the coconuts are eaten or turned into coconut picking machines there's no waste so basically then that means output o is just equal to e plus i right and e is just the amount we eat i is the amount we invest in more machines another way to write this is just to say that i is equal to s times o right because s is our savings rate and o is our
total output so the amount we invest is just going to be equal to our savings rate times the output and then the last thing is these machines depreciate right so the machines we have at time t plus one is going to be the machines we had at time t plus our investment minus however many depreciate okay so again lots going on right we got all these variables and then these are the equations that help us make sense of those variables okay let's step back to that first assumption for a second which was concave so concave means that like
the first worker you know gives you more coconuts the second gives you more coconuts but it gives you fewer coconuts than the first one did so it sort of falls off so economists call this diminishing returns here's a picture so if this is the number of workers and this is the number of coconuts what we can see is the first worker gives you quite a bit the second gives you fewer and the third gives you sort of even less right so what you get is as you add more workers you get yes
you get more coconuts but the workers become less and less valuable and the same is going to be true of machines okay so that's all concave means okay we have a lot going on here right so let's simplify things a little bit let's assume that we've just got 100 workers so remember before our output was equal to the square root of l t times the square root of m t we're going to have 100 workers so that's just going to mean it's 10 times the square root of m t that'll make things simpler in a more realistic
model right we'd have workers deciding to go to work depending on what the wage is so we'd have to create a market for wages as a function of output and function of how much people like the coconuts and now we get really complicated so we're just going to skip all that stuff we're going to skip the entire labor market and just assume everybody goes to work every day and then we'll see what happens okay so let's do an example so now we're going to do some math i'll try and do it slowly so what we're
going to do is assume that the depreciation rate is 1 4 and the savings rate is 30 and let's suppose we start out with four machines so we started with four machines the output is just 10 times the square root of 4 which is 20. so how much do they invest well remember they invest 20 times 0.3 because they invest 30 so that means 6 machines so investment is going to be 6. how much depreciation is there well depreciation is on the old machine so there were four machines in the past twenty-five percent of fourth
of those get worn out so that's one so depreciation is one so we subtract those two and that means we're going to get a net of plus five machines right we had four machines before we're going to invest in six new ones we lose one to depreciation so that gives us five we started out with four machines so that means in year two we've got nine machines here's how our economy worked we had four machines we produced 20 coconuts we ate 14 we invested six we lost one machine to depreciation and so now
we have five new machines right the six minus the one and that gives us a total of nine so we started out with four machines now we've got nine so we had a nice gdp of 20 and now we've added more machines so we should do better so let's look at the next year next year we've got nine machines so output is 10 times the square root of 9 so that's going to be 30. and let's think about how many new machines do we get well we're going to take 30 times 0.3 that's how
much we save so that gives us nine new machines we're gonna buy but the question is how many do we lose to depreciation well we had nine machines and we're gonna multiply that by 0.25 so that's like nine over four so that's two and a quarter let's just simplify this and let's suppose it's two okay so depreciation we lose two so nine minus two is seven so we had nine machines to start we get seven new ones now we've got a total of 16 machines with that little fudge for the one-fourth as a way
to keep the math simpler let's take a look at our gdp right previous period our gdp was 20 now our gdp is 30 so we've got this nice sustained growth okay let's look at year three now year three again with our little fudge factor we've got 16 machines so if we have 16 machines that means our total output is going to be 10 times the square root of 16 which is 40. so what's happened to our growth we start out with 20 and then 30 and then 40. actually a little less you know remember we
have that little fudge factor that we included where we should have subtracted two and a quarter but we get this nice sustained growth now we could ask is this going to continue are we going to go from 20 to 30 to 40 to 50 to 60 or is it going to fall off right well we see a hint that it's falling off a little bit because this number again remember shouldn't be quite 40 it should be a little bit less than 40. so we go from 20 to 30
to less than 40. you can start asking are there limits to this growth well to try to understand whether growth would stop or not let's do the following let's assume we have a big number of machines a really big number and see if it just continues to grow or to see if something different happens so let's suppose we have 400 machines so if we have 400 machines our output is going to be 10 times the square root of 400 right so that's 10 times 20 which is 200 right so that's great that's huge gdp right
huge output what's our investment going to be well we had 200 output our savings rate is 0.3 so that means we're going to invest in 60 machines but depreciation is going to be 400 times one fourth right so four hundred machines we lose a fourth to depreciation which is a hundred so we're investing in sixty we're losing a hundred so that's minus forty so if we started out with 400 machines we'd fall off to 360. so wait so now we see that somehow this economy is going to grow but it can't grow this big
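The year-by-year bookkeeping in this example can be sketched as a single update step. This is a minimal sketch under the lecture's assumptions (100 workers, 30 percent savings, one-fourth depreciation); the function name `step` and its parameter names are illustrative, not from the lecture.

```python
import math

def step(machines, workers=100, savings=0.3, depreciation=0.25):
    """Advance the coconut economy one period; returns (output, next_machines)."""
    output = math.sqrt(workers) * math.sqrt(machines)  # square-root production
    investment = savings * output                      # coconuts turned into machines
    worn_out = depreciation * machines                 # machines lost this period
    return output, machines + investment - worn_out

# Start from four machines and follow the first three years
m = 4.0
for year in range(1, 4):
    output, m_next = step(m)
    print(f"year {year}: machines={m:.2f} output={output:.2f} next={m_next:.2f}")
    m = m_next

# Start from 400 machines instead: the stock shrinks rather than grows
print(step(400.0))
```

Because the sketch does not round depreciation, year three starts with 15.75 machines rather than the 16 used in the lecture, so output that year comes out a little under 40.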
if it grows to 400 machines we would shrink back down to 360 machines so now i'm going to think about whether there is some number it would reach if we started with four we go from four to nine to 16 and so on and so on and it looks like it's never going to stop but then we said what if we had 400 would it continue to grow and we find out well no it's not going to continue to grow it's going to stop in fact it's going to shrink down there must be some
place it's going to stop there must be some natural limit to the growth and in fact that's what economists like to call the equilibrium right so we look at this growth it's going to go up up up up up but then it's going to flatten out and this flat line here is going to be the equilibrium level so what we want to do is we want to understand what is that equilibrium level let's think about it what's going to happen in equilibrium in equilibrium the number of machines stays fixed we stop growing but
what affects the number of machines two things right one investment you buy new machines right based on our savings rate what else depreciation you lose machines to depreciation so the equilibrium is going to occur when investment equals depreciation okay well guess what that's going to be easy to solve this is why models are so great let's do this formally so think about what's our output it's just 10 times the square root of m what's our investment well that's easy right that's just 0.3 times 10 times the square root of m so that's just 3 times the
square root of m right what's our depreciation well that's just m over 4. in equilibrium depreciation has to equal investment so again easy we just get 3 times the square root of m equals m over 4. so that means 12 times the square root of m equals m and if i divide both sides by the square root of m i get 12 equals the square root of m so that means m is going to equal 144. so if my total number of machines is 144 the depreciation is gonna be exactly the same
as the savings so again here's the math the investment is three times the square root of m depreciation is 0.25 times m i set those things equal and i go ahead and solve and i get 144 machines okay nothing complicated let's just check okay so we've got 144 machines what's my output that's just 10 times the square root of 144 so that's 120. well so what's my savings how much do i invest in new machines well i take 0.3 times 120 which is 36 new machines what's my depreciation well that's one-fourth
of 144 which is also 36 machines so i invest in 36 machines 36 machines wear out so the total number of machines is still at 144. so 144 is an equilibrium the number of new machines and the number of machines that wear out exactly cancel out and that's exactly where that curve finishes right output is going to be at 120 right so what we get is we get a long run equilibrium in this model of exactly 120 units of output total okay you know wait i said something ironic here i call this a growth model well what's ironic about
this what's ironic about this is eventually there's no growth right we're going to start with four machines and then nine and then 16 and then 24 and so on and so on and so on eventually we get to 144 machines and then we stop growth stops what's going on well let's think about it depreciation is linear right so depreciation is just a nice linear function but our output as a function of number of machines is concave that's falling off so at some point right the amount of extra output we're getting is falling to match the
slope of the rate of depreciation and those things exactly balance out and that's why growth stops well if growth stops then the question is how do we get more growth well the answer is innovation innovation will allow us to continue to grow and that's why people focus so much on innovation so to get a real model of economic growth you've got to move beyond this quote-unquote simple growth model and actually include a parameter and complicate things even more right we've got to include a parameter that takes into account technological growth so let's step back what
have we learned we've learned that if we write down a simple model of growth economic growth that involves you know investing money in new machines right those machines depreciate that there are limits to growth right that the model is just going to max out at this point where the number of machines lost to depreciation is exactly offset by the number of machines that we invested in the previous period so growth is going to you know if you start with few machines growth will go really really fast initially but then it's going to fall off
and reach this equilibrium level so to get sustained growth that's going to require new technologies new innovations and that's where we're going to go next we're going to construct solow's growth model which includes this innovation parameter yeah even more complicated i know but by including this innovation parameter we'll see how growth can continue to be sustained okay thanks hi we just looked at a very simple growth model and in that growth model we saw that well growth stopped right once we got to 144 machines and an output of 120 we no longer got any growth
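The equilibrium found above can also be written in general form. This is a sketch in assumed notation: the lecture names s for the savings rate, L for workers, M for machines, and O for output, while d for the depreciation rate and the star for equilibrium values are labels added here. Setting investment equal to depreciation:

```latex
\begin{align*}
s\,\sqrt{L}\,\sqrt{M^*} &= d\,M^* \\
\sqrt{M^*} &= \frac{s\,\sqrt{L}}{d} \\
M^* &= \left(\frac{s\,\sqrt{L}}{d}\right)^{2},
\qquad
O^* = \sqrt{L}\,\sqrt{M^*} = \frac{s\,L}{d}
\end{align*}
```

With s = 0.3, L = 100, and d = 1/4 this gives a square root of M* equal to 12, so M* = 144 and O* = 120, matching the numbers worked out in the lecture.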
so we use that very simple model to get at a really important fact that without innovation if technology stays fixed growth will stop now i'm sure the labor supply could get bigger we could have more workers or something like that but holding the amount of labor fixed and holding the technology fixed if we've got a fixed savings rate and a fixed rate of depreciation there's no more growth at some point we're going to go up up up up and then stop well that hasn't been human experience right economic well-being has continued to go way up
right and gdp continues to go up and so what's driving that well to get at that we're going to look at a deeper model richer model known as the solow growth model now what's nice about this model what i love about this model we're just going to add one variable we're just going to add one more variable to our other model and that's going to suddenly give us a way to include innovation now just to make this you know more interesting maybe more real these models are developed by real people right and so this particular
model's developed by someone by the name of bob solow and bob solow was an economist at mit and here's bob right here and this is bob actually testifying before the house science and technology committee on the need to have multiple models to understand the economy right so this is a group of economists here we're standing up and we're swearing to tell the truth the whole truth and nothing but the truth about why models are important to understand where growth comes from and in this particular case to prevent things like you know the home mortgage crisis which cost us
a lot of money all right so how does solow's model work what does bob's model do well what bob does is this wonderful thing he includes one more variable so everything's the same as before we have labor capital depreciation savings but we're going to include this thing a of t which stands for technology so when a is low technology is low when a is high technology's high it's better so a is just going to be this parameter we can tune to affect sort of how much or how good is the technology in the economy so for
making coconut picking machines right a is really small and if we're making incredibly cool you know laser pointers and iphones and stuff like that right technology is great okay so this is it very simple formula output is just equal to the technology at that time times capital to some beta times labor to some one minus beta now wait a minute this also got a little more complicated now i get these betas here before i had square roots well if beta equals one half right so if beta's a half then this is just the square root
of labor times the square root of capital right easy if beta gets bigger than a half that means that capital matters a little bit more and if beta is less than a half that means capital matters a little bit less so depending on the technology it could be that it's a capital intensive technology so the beta would be big or it could be that it doesn't use much capital and beta tends to be relatively small so you can estimate different betas for different manufacturing processes or even for different countries right and a half was just
a convenience we assumed so that's actually something when you take models to data that you go in and estimate you figure out what is beta but for us since we're just trying to get the ideas here right we're going to take beta equals a half so let's go back just to remind ourselves of where we were before right remember our total output was 10 times the square root of m because we assumed 100 workers we had a savings rate of 30 percent and we had a depreciation rate of a quarter and we went through we
did all that stuff we solved for the equilibrium where investment was exactly equal to depreciation and we got that that happened at an output of 120 which required 144 machines right so that meant that we were going to invest in 36 new machines but we'd lose 36 machines to depreciation so that was our equilibrium now we want to say what would innovation do what innovation would do is put an a in front of this right so now we'd have an a in front of this 10 times the square root of m so let's
do that and let's see what happens so now we're going to say that output is 2 times 10 times the square root of m so what we're going to do is we're going to assume that somebody had a technological innovation and our coconut machines are now somehow like everything's twice as good we're twice as productive okay well now let's walk through the math so what's our investment going to be investment is going to be 0.3 times 20 times the square root of m so that's going to be 6 times the square root of m and what's
our depreciation well that's going to be one fourth of m right so that's just m over 4 and so we just have to set these things equal again right so 6 times the square root of m equals m over 4 so that means 24 times the square root of m equals m so that means 24 equals the square root of m so that means m equals 24 squared right so our equilibrium is going to be m equal to 576 and output right is going to be 2 times 10 times the square root of 576 which is 24 right so that's 24 times 10 which is 240 times 2
which is 480 right so our output was 120 before and now it's 480. so think about it productivity doubled right because our technology got twice as good but long run gdp went up by a factor of four but why is that well let's look back at our numbers here okay we became twice as productive so that means if we'd have kept the number of machines at 144 we now would have an output of 240 right so we'd have doubled where we're at but we didn't keep the number of machines at 144. when the technology got better
we actually increased the number of machines to 576. so when we have a technological change two things happen first you just get more productive you get more stuff second because you're getting more stuff it makes sense to invest in more machines so there's this multiplier effect so that means productivity goes up by two right output eventually in the long run long run equilibrium goes up by four right and this is what you can think of as an innovation multiplier because what's happening right is these two effects right labor and capital become more productive so that boom
you just get more stuff but second of all because they're more productive it makes sense to invest in more machines so then you get even more stuff so there's this multiplier well let's think about it productivity goes up by two right total output eventually in the long run not immediately because we've got to build all those machines goes up by 4. well that leads to a puzzle and here's where models are really useful is it additive or is it multiplicative here's the issue with 2 it could be that productivity went up by two and so
we get two plus two and so output went up by four or it could be we get two times two two squared as the reason output went up by four so we want to figure out is this an additive effect right the machine effect plus the productivity effect or is it multiplicative is it two times two well to make sense of that what we can do is we can increase the multiplier to three because if it's additive then we get six and if it's multiplicative we'd get nine right so if we make us three times as
productive we're going to ask in the long run do we end up with six times as much stuff or do we end up with nine times as much stuff and again this is another reason why we model we model to get the logic right without the model it'd be very hard to figure out is this going to go up by six or is this gonna go up by nine heck we might not have even got the second effect of more machines right so the model was really useful just giving us that now it's gonna tell us the
magnitude of the effect okay just to get our bearings let's remember where we started from we start from an existing technology where it's just the square root of labor times the square root of machines we assume 100 units of labor so it's 10 times the square root of m we save 30 percent and invest that in new machines right so that's going to be 3 times the square root of m we lose a quarter of our machines to depreciation so that's just m over 4. we set those things equal we get m equals 144 we get
output of 120 that's our equilibrium now we want to say let's triple it okay so let's suddenly assume that there's some a that comes in that's got a value of 3. so now we're going to get 3 times 10 times the square root of m so our output is 30 times the square root of m and let's see what happens okay so what's our total investment going to be well that's going to be 0.3 times 30 times the square root of m so that's going to be 9 times the square root of m what's our
depreciation well that's still just m over 4. so let's set these equal 9 times the square root of m equals m over 4 so that means 36 times the square root of m equals m so that means 36 equals the square root of m so m is equal to 36 squared right we'll just keep it as 36 squared what's total output going to be so if we've got 36 squared machines which is a big number what's output going to be well output is 3 times 10 times the square root of
36 squared so that's 3 times 10 times 36 well 3 times 36 is 108 so that's 1080. so what happened when we made ourselves three times as productive well total output went up nine times so remember what our question was is it additive meaning three plus three or six or multiplicative three times three or nine the answer is it's going to be multiplicative we get 3 times 3 is 9. two effects multiply on top of one another right the first one is we just get more stuff the second is
we invest in more machines and those two effects get multiplied together so becoming three times more productive means we get nine times as much output in the long run equilibrium so let me summarize for a second in the simple growth model growth stopped right at some point we got to 144 machines and then we no longer had any growth when we go to the solow growth model what happens is if we can continue to increase that a right if we can continue to increase our productivity then growth can continue but that sort of raises the question
where do increases in a come from and this has led to what people call endogenous growth models so in an endogenous growth model labor can go to things like picking coconuts labor can also go to things like investing in new technologies research and design and those sorts of things to try and increase that a parameter so what we could think of is you know before all of our labor went to increasing capital and picking coconuts right now that labor could also go to doing research on new coconut-picking machines and what you get in an endogenous growth model is how
much labor goes into actually making stuff and how much goes into research and design and thinking right is a choice variable in the model right and you solve for how much of that you get quick summary right growth ceases without innovation if that's true everybody should be pro-innovation and in fact most people are right so here's two quotes here's a fun little quiz one of these quotes is from president obama who's a democrat the other quote is from president reagan i want you to try and guess which quote came from obama and which quote came
from reagan all right so both reagan and obama are pro-innovation right they are because they're pro-gdp growth because we're going to lift people out of poverty right we want to make everyone better off because we can lift people out of poverty we make people happier and what we've learned is the way to do that right is first by investing in capital right because that makes us you know all do better we get to that 144 machines but at some point then growth stops and then you need investment in technology investment in technology leads to
innovation which raises the whole thing up and we get this multiplier effect right we get the increased productivity and then we also get the incentives to produce more machines right which raises our standard of living even more so that's in essence what growth theory is thanks hi earlier in the set of lectures i talked about how in the past japan had had extremely high levels of growth remember in the 1950s to 1970s early 1970s you see that japan has growth rates that hang out in the seven to eight percent range i also talked about
how china is currently experiencing levels of growth that are equally high in the same sort of range eight to nine percent and there's a question can china continue to sustain those levels of growth so what we want to do is use our very simple growth model to see why countries like japan in the post-war period can have incredibly high growth why china during this era of industrialization can have incredibly high growth but also why we should be somewhat dubious about the prospects of china being able to maintain these levels of growth
unless they're able to somehow have massive increases in technological improvement so we'll see that if you're catching up to other countries then it's highly likely you can experience super high growth levels just like the ones we see here but once you've caught up to other countries it's going to be hard to maintain those growth levels unless you somehow can support massive amounts of innovation so let's see how this works and we're going to do it just by doing some math so we'll use slightly different numbers than we used before but the exact same argument
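Before switching numbers, the catch-up argument can be previewed with the coconut economy from the previous lectures. This is a minimal sketch, not the lecture's own calculation: the function names and the `tech` parameter (standing in for a) are illustrative, and the defaults are the earlier assumptions of 100 workers, 30 percent savings, and one-fourth depreciation.

```python
import math

def simulate(machines, periods, tech=1.0, workers=100, savings=0.3, depreciation=0.25):
    """Outputs along the path, with output = tech * sqrt(workers) * sqrt(machines)."""
    outputs = []
    for _ in range(periods):
        output = tech * math.sqrt(workers) * math.sqrt(machines)
        outputs.append(output)
        # New machines from savings, minus machines that wear out
        machines += savings * output - depreciation * machines
    return outputs

def growth_rates(outputs):
    """Period-over-period growth rates of a series."""
    return [(b - a) / a for a, b in zip(outputs, outputs[1:])]

# A capital-poor economy catching up: fast growth early, almost none later
path = simulate(machines=4.0, periods=100)
rates = growth_rates(path)
print("early growth:", [f"{g:.1%}" for g in rates[:4]])
print("late growth: ", [f"{g:.1%}" for g in rates[-3:]])

# Doubling the technology parameter at the old equilibrium restarts growth
better = simulate(machines=144.0, periods=200, tech=2.0)
print(f"long-run output, tech=1: {path[-1]:.1f}")
print(f"long-run output, tech=2: {better[-1]:.1f}")
```

Growth starts at 50 percent (output goes from 20 to 30) and fades as the machine stock nears its equilibrium, while doubling `tech` moves long-run output from about 120 toward about 480, the same fourfold multiplier worked out by hand earlier.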
so here what we're going to do is we're going to assume that the savings rate is 20 percent the depreciation rate is 10 percent and we're also going to assume a much larger economy so we get bigger numbers so instead of assuming that there's 100 people so the square root of that is 10 we're going to assume that there's 10,000 people so the square root is a hundred we're also going to start with a lot more machines we're going to start with 3,600 machines as opposed to just four so this is a bigger economy and that'll
allow us to get slightly more reasonable numbers when we work through the math so remember how this works right what we do is we compute total output we see how many new machines then get invested in we see how many machines get depreciated we solve for the new number of machines and after we solve for the new number of machines then we can solve for the output next period so let's get started here we go 3 600 machines so output is 100 times the square root of 3600 so that's 100 times 60 so that means
output is 6,000. so let's think of that as per capita income of 6,000 so how much are they going to invest well investment is going to be 0.2 times 6,000 so that's 1,200 they're going to invest in 1,200 new machines but depreciation is 10 percent of the existing 3,600 machines which is 360. so if we net those out we get 840 new machines and with those 840 new machines that means the number of machines in the next period is going to be 4,440. so we've got the first period
output is 6,000 and now they've got 4,440 machines so let's go to the next period the next period we're gonna start out with 4,440 machines and what we're gonna do is then we're going to compute the output in this setting here we go the square root of 4,440 is approximately 67. so we're going to get 100 times the square root of 4,440 which is 100 times 67 which is 6,700 so investment in new machines is
going to be twenty percent of that which is going to be 1,340 and our depreciation is going to be 10 percent of 4,440 which is roughly 440 so we're going to add about 900 new machines so the number of machines next time is going to be 5,340. what we can also do is we can look at what is our growth rate so currently we've got 6,700 dollars per person before we had a gdp of 6,000 so that's roughly 11 percent growth so this is a growth rate not unlike a little bit
higher but not unlike what we currently see in china we went from 6,000 to 6,700 and now we've got 5,340 machines let's go ahead one more period we now have 5,340 machines we want to ask what's output going to be and what's growth going to be this is just some very basic math so if we take the output it's 100 times the square root of 5,340 i'll write that out again that's going to be 100 times turns out again if you do this on the internet it's about 73 so that's 7,300
that's going to be gdp per capita investment in new machines is 20 percent of that which is 1460. that's twenty percent of seventy three hundred and depreciation is ten percent of the machines which is five hundred and thirty so net we're going to add 930 new machines which is going to give us 6 270 machines what's our growth rate well growth rate we had it was 6 700 before now it's 7 300 so it went up 600 which is about 8 to 9 percent so what we see is in the previous period we had eleven percent growth now we've
got nine percent growth so what we're seeing is consistently high levels of growth because of the fact that relative to how much labor you've got the amount of capital is pretty low so just by investing and saving you can maintain fairly high levels of growth now let's jump ahead let's suppose that we've been accumulating capital we've gone from 3600 machines to 4400 machines to 5300 machines to 6200 machines and now we're all the way up to 10 000 machines we can ask what's going to happen to output and we can ask even what's going to happen
to growth so let's again do the math the output is going to be 100 times the square root of 10 000 which is 100 that means now per capita gdp let's just say is 10 000. what is investment going to be investment is going to be 20 percent of that which is 2 000 but what is depreciation going to be depreciation is 10 percent of 10 000 machines which is a thousand so net we add a thousand machines and now we're going to have eleven thousand machines next period well we have eleven thousand machines next period we can ask what's growth gonna be
well that's gonna be a hundred times the square root of 11 000 is what gdp is going to be so that's going to be 100 times approximately 105 which is 10 500. so gdp is going to go from 10 000 to 10 500 but what sort of growth rate is that well that's only five percent growth so what you can see is when we went from three thousand to four thousand to five thousand to six thousand machines during that period the economy is sustaining you know ten eleven eight nine percent growth really high growth rates just like
those growth rates we're seeing in china once the number of machines gets to ten thousand now suddenly the growth rate falls to five percent but we're still adding notice we're still adding net a thousand machines well let's keep going let's go ahead you know another 10 20 years and let's suppose now the economy's got 22 500 machines and let's ask what happens here what's total output going to be well that's going to be 100 times the square root of 22 500 which is 100 times 150 so now we've got 15 000 per capita gdp that's
really good how much are you going to invest investment is going to be 20 percent of that which is 3 000 which is a lot of new machines but what's depreciation going to be it's going to be 10 percent of the 22 500 machines we have which is 2 250 machines so that means net we add 750 machines so that gives us 23 250 machines well now we can ask what's our output going to be next period well next period output is going to be 100 times the square root of 23 250.
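The period-by-period arithmetic worked through above can be sketched in a few lines of Python. This is only an illustration under the lecture's stated numbers (technology 100, a 20 percent savings rate, 10 percent depreciation); the function names `output`, `next_machines`, and `growth_rate` are hypothetical helpers, not anything from the course materials.

```python
import math

# Parameters taken from the worked example in the lecture.
A = 100.0  # technology parameter
S = 0.2    # share of output invested in new machines
D = 0.1    # share of existing machines that depreciates each period


def output(machines: float) -> float:
    """Output (per capita GDP) for a given number of machines."""
    return A * math.sqrt(machines)


def next_machines(machines: float) -> float:
    """Machines next period: current stock plus investment minus depreciation."""
    investment = S * output(machines)
    depreciation = D * machines
    return machines + investment - depreciation


def growth_rate(machines: float) -> float:
    """Fractional growth in output from this period to the next."""
    now = output(machines)
    later = output(next_machines(machines))
    return (later - now) / now


# Print machines, output, next-period machines, and growth for each starting point.
for k in (3600, 4440, 10000, 22500):
    print(k, round(output(k)), round(next_machines(k)), f"{growth_rate(k):.1%}")
```

Run as written, the growth rate comes out at roughly 11 percent when starting from 3 600 machines, roughly 5 percent from 10 000 machines, and between 1 and 2 percent from 22 500 machines, in line with the figures quoted in the lecture.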
well how much is that well that's 15 250 ballpark so now gdp has only gone up 250 and that's really like just a one to two percent increase so growth is now in the one to two percent range so we've seen this economy let's step all the way back a second when we started out with 3 600 machines output was 6 000 then we had 11 percent growth then we had eight to nine percent growth so this economy is growing at a fairly fast clip once it gets to 10 000 machines though
now gdp only grows at five percent and once it gets to twenty two thousand machines now gdp only grows at one to two percent what's going on well remember our curve output if we have a fixed technology is going to be concave so there's this region here where maybe you see ten percent growth early on you don't have enough capital given your amount of labor you can get massive amounts of growth but once you get in this region here growth is going to fall to one two percent it's going to fall off unless right remember
from our model unless you get lots of investment in new technology so if you look at a country like china with really high growth rates you can ask what's causing that and you can think well maybe what's causing it is they don't have enough capital given how much labor they've got in the case of china that's true so they're at this part of the curve so they're experiencing incredibly high growth when they get to this part of the curve then most likely you're not going to see this same sort of high growth rate it's going
to look a lot more like the picture in japan so for them to sustain high growth they can't just continue to pour money into capital what they're going to need to do is they're going to need to change technology they're going to need a to increase now the increases in a necessary to keep up eight to nine percent growth just have never been seen before so it'd be remarkable if technological advances occur at such a rate that they can maintain this level of growth once their capital levels get to the same level as other countries
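To see why pouring money into capital eventually stops working, it helps to compute where the machine count settles when technology is fixed. The sketch below assumes the same production function as the worked example (output equals a times the square root of machines) and the same savings and depreciation rates; `steady_state` is a hypothetical helper name. Capital stops growing when investment equals depreciation, that is when s times a times the square root of machines equals d times machines, which gives a square root of machines equal to s times a divided by d.

```python
def steady_state(a: float, s: float = 0.2, d: float = 0.1):
    """Return (machines, output) at the point where investment equals depreciation.

    Assumes output = a * sqrt(machines), investment = s * output,
    and depreciation = d * machines.
    """
    root_machines = s * a / d          # sqrt(machines) at the steady state
    machines = root_machines ** 2
    output = a * root_machines
    return machines, output


# Steady state for the lecture's technology level, and for a tripled one.
k100, y100 = steady_state(100.0)
k300, y300 = steady_state(300.0)
print(k100, y100)
print(y300 / y100)
```

With a equal to 100 the economy levels off at about 40 000 machines and output of about 20 000, so growth from accumulation alone runs out. Tripling a raises the long-run output by a factor of nine in this setup, which is why sustained growth has to come from the technology parameter.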
what the model tells us is we're much more likely to see china follow a path similar to japan where we see high growth rates for a period but then as capital accumulates those growth rates fall over time so what have we done what we've done has been kind of fun we've taken a very simple growth model solow's growth model we've looked at data from china and japan and seen how china has these incredibly high growth rates now japan used to have incredibly high growth rates and we've seen from our model that if a country is
under capitalized if their levels of capital are low relative to the amount of labor they've got then you can sustain super high growth rates for quite a while as you accumulate capital but once your capital levels become sort of appropriate for the technology your growth rate's going to fall off and the only way you can sustain that growth rate is through technological innovation by driving up in solow's model that parameter a this doesn't mean that china's not going to continue to have growth or that in japan all growth will end or that in the united states or
europe all growth is going to end what it says is there's two different ways to think about growth one type of growth what we're seeing in china now and what we saw in japan post-war and what we saw in europe and the united states post-war is growth that occurs through capital accumulation another type of growth which is what we see in the united states japan and europe now but not in china occurs from technological advances not from build-up of capital and as you advance technology and increase that a term then it makes sense to
buy more capital but different types of capital and that's what drives growth okay a lot going on here a lot of fun though right what we're going to talk about next in the final lecture is why then are some countries poor this this seems to make a lot of sense just invest in capital as you invest in capital you get lots of growth you catch up with everybody else why is it still the case that some countries are poor if we understand where growth comes from thanks hi in this final lecture on growth models what
i want to do is i want to return to that question that really motivated the whole section which is why are some countries really successful and why aren't others now to get some understanding of that with the help of our model we've got to go back and just remind ourselves of what's in our model right what it can explain and what it can't so it's got labor it's got capital it's got technology and those combine to produce output right and we've got this nice sort of just pretty straightforward equation that describes how output depends
on those other factors and then we included things like savings and depreciation and that gave us you know these growth rates and we saw how innovation was incredibly important in terms of continuing to drive growth now notice what's not in here there's no inequality in this model there's no culture in this model all sorts of things are left out so it's very very simple nevertheless it's still really useful right it's just sort of a benchmark to help us sort of think about what's going to cause a country to be successful or a country not to
be successful so there's a recent book out written by daron acemoglu and james robinson called why nations fail and in this book they look over hundreds of years at why some countries have been successful and why other countries haven't been successful what i want to do in this lecture is just take a couple of the main points that they make in that book and show how we can see them through the lens of this simple growth model right the simple growth model of the solow model and the endogenous growth model right so we're
going to use that as a lens to interpret some of the things that robinson and acemoglu get at right through their rich study of history okay so again let's think back to the case of botswana and zimbabwe right you see botswana has done very very well but zimbabwe hasn't and one thing to understand is why is that the case what causes one country to be successful and another country not to be okay well here's what acemoglu and robinson said the first one is growth requires a strong central government and they find over history that this
is true why is that true well what's true in our model in our model we need people to make investments in machines we also need people to make investments in innovation if there's not a strong central government that's there to maintain property rights to make sure that if i invest in these machines that nobody takes them or if i invest in this technology that nobody copies it that i get at least for a short period of time maybe some patent rights on it then there's less incentive for people to innovate and less incentive for people
to invest in machines and as a result what's going to happen is you're going to get low growth right because what drives growth is you know more physical capital and more innovations but there's a catch you can't have too strong of a central government you can't have it be controlled by just a few people now why is that and this actually goes a long way to explaining the situation in zimbabwe if the government's controlled just by one or two people then they often can't help themselves right so what they'll do is they'll extract resources they'll say
to you know people in different industries you have to pay me money and as they extract resources from the economy that does two things that money means money's not going into investment into new machines it also means that there's less incentive to invest in new machines so it's almost like the opposite effect of improving our technology a right so remember when we increased a by a factor of three we got an increase of a factor of nine well if we decrease it by a factor of three we're going to get a decrease of
a factor of nine so if i'm siphoning money out of the economy to build myself palaces or to store money in you know gold or oil or something like that i'm storing money then what's going to happen is i'm going to hurt the economy doubly so right because of the fact that i get this multiplier effect so if we look at our formula right and we go through all that math we see that it explains why that extraction would be so you know would be such a bad thing and it also explains why we need
a strong central government because what drives growth is investments in machines and investments in innovation there's another thing our model gives us and this is sort of a subtlety and it's you know not as pretty and rosy as everything i've described so far that is when a increases labor becomes more productive right so we need less labor in order to produce the same output so we think what happens in a real economy suddenly we're getting a lot more stuff but we may need less labor to produce that stuff and so that labor
eventually goes to producing more machines and more you know physical capital and more innovation so we get more growth but in the short run right when suddenly the machines are more productive those workers don't have any place to go and we get unemployment and that unemployment can be costly so one of the things then that we learn sort of implicitly from the model is growth is going to require some creative destruction because what happens when we increase a i made it all sound very nice like we just increase a and the coconut machines become twice
as good but actually when we increase a we may wipe out an entire industry so when we develop the tractor right all these horses you know blacksmiths farriers all these people suddenly are no longer needed for jobs because they've been replaced by the tractor right so whole industries get wiped out and new industries come in now joseph schumpeter was the economist who coined this term creative destruction he's talking exactly about one technology replacing an old one so let's see that for example so here's just the music industry what you see in this graph is different
types of ways of listening to music so what you see here in this blue line is vinyl records and you see that that goes up and then goes down then you see cassette tapes go up and then go down then you see cds go up and then go down and then you see here digital which has just been going up up so each one of these things vinyl cassette cds all sort of had its day in the sun and then fell off as a new technology replaced it each time those new technologies replace the old
technologies all sorts of jobs get lost let me take a very specific example so the american newspaper industry has been hurt horribly in the last you know decade or so because of people placing classified ads online in particular there's one company called craigslist where you can post you know things for sale that has really hurt the newspaper industry so if you look at you know revenue for craigslist versus newspaper ads you see that craigslist is going up right like this and you see that newspaper ad sales are going down right as people move things from
newspapers to craigslist now you can think of this as an innovation and it is an innovation right because it's much more efficient to post things online you can pull the ad down as soon as it sells you can put a lot more information on there you're not restricted to small amounts of space it's just a much better way to advertise things you can put pictures it's cheaper everybody sort of wins well everybody wins except for the newspaper industry so if you look at the number of workers employed in the newspaper industry you see that from
1988 to 2009 it's fallen by almost a quarter of a million workers it's a quarter of a million workers now you could say well yeah but those workers all went to work at craigslist well not so so here's a list of the 10 biggest websites and the number of employees that they've gotten if you look you know yahoo has 10 000 employees time warner has ninety thousand employees microsoft has seventy thousand employees well here's craigslist which is seven it's 23 employees 23. so with only 23 employees they're able to wipe out in a sense 250
000 jobs it wasn't them alone right but this technology took out all those jobs the reason craigslist only needs 23 employees is when you place the ad you type it in yourself you don't need anyone from craigslist to you know typeset it and put it in the paper and lay it out and all that sort of stuff right they're also not selling any advertising so they don't have all these advertising systems so 23 jobs in effect wiped out 250 000 jobs that's what's called creative destruction something gets created a great new website a
great new way to advertise your stuff for sale but something also gets destroyed which are a lot of jobs let's step back a second think of this government that's being controlled by a few suppose you have a country where the newspaper industry had a lot of power in the government well the newspaper industry might have said hey look we shouldn't allow classified ads on the web because they could have looked down you know look down the game tree right and said hey look if they come in we're gonna lose a quarter million jobs now you
could say well why would the government you know stop craigslist from coming into play just to help these newspaper people well two reasons one it saves 250 000 jobs that's a compelling case and two they might be captured by the newspaper industry so if the newspaper industry had captured the government if the government was controlled by a select few and those select few were in the newspaper industry they might prevent this from happening and you could say well that's a good thing because it saved all those jobs but it's a bad thing because it lowered
productivity and so growth requires this creative destruction with new technologies replacing old right it requires going from vinyl to cassette to cd to digital music it's that growth right these new technologies that make us all better off but unfortunately you've got to break a few eggs and in this case a lot of eggs within the newspaper industry okay let's talk about the fertility of this model right because we've talked about this only in the context of economic growth can we apply in other cases well we probably can't apply to things like the growth of a
tree or growth of a person even though you know trees get to a maximal size and people get to a maximal size but what we can do is we can apply to things like our own production our own sort of personal gdp because what does it tell us it tells us that you know we can sort of work hard but if we don't invest in new technologies and new innovation if we don't become sort of better at what we do by possibly learning new models learning new techniques right developing new skills we're probably going to
level out if you actually look at the data on what makes for really successful people people who are very successful in their careers one thing you find is they continue to learn so it's almost like their own personal a their own personal technology parameter they keep upping and upping and upping so what this model tells us that we can apply in another context is that continued growth depends on innovation and getting better you can't just sort of do more at some point the rate at which things fall off and the rate at which things increase are just going to even out
and that sustained growth requires innovation becoming better at what you do so just like countries have to invest in innovation so should a person and that's where personal growth and sort of personal success can come from okay let's sum the whole thing up so we've looked at these growth models and what have we seen well there's some really cool stuff like the rule of 72 that growth rates are very important because if you go from a growth rate of two percent to six percent it means you double every 12 years as opposed to every 36 years
which is a huge difference we also learned that absent innovation growth tails off right it just stops so we need a constant driver of innovation and then we also saw sort of in this last lecture that that's not so easy it may be easy to say well all we need is innovation well to maintain that innovation you need secure property rights you need people to have incentives to invest in things like machines and also invest in new technologies and so to get that you need a strong central government but the central government can't be so
strong that it starts extracting stuff because if it extracts stuff that has essentially the same effect as lowering the technology and third that government can't necessarily protect industries now sometimes it can there's cases where it's going to make sense to protect industries but one of the things that's going to drive growth through innovation is this process of creative destruction so the model tells us that sometimes we may have to you know throw out our vinyl records and move to cassettes and then throw out those cassettes and move to cds and then
throw out those cds and just listen to digital music there's going to be these processes of creative destruction that drive the growth and they're representative of innovation that makes us all better off okay thank you hi in this set of lectures we're going to talk about problem solving we're going to talk about how individuals and teams go about solving problems we're going to focus on a couple things we're talking about the role that diversity plays in problem solving and we're also going to talk about how ideas can get recombined and how a lot of innovation actually comes
from somebody having one idea in one place and it being applied someplace else so those are going to be the two main themes the role of diversity and the power of recombination so to get there to think about how we model problem solving we've got to start out by making it formal like constructing a somewhat formal model so here's how we're going to do it we're going to assume that you take some sort of action a or you have some sort of solution we'll represent by a and then there's a payoff function f that gives
you the value of that particular action so that action could be you know a particular string of code if you're writing computer code and f might be how fast that code runs right or alternatively a could be a healthcare policy and f would be how efficient that healthcare policy is so a is the solution that you propose and f of a is how good that solution is what we want to do is we want some understanding of how people come up with better solutions where innovation comes from so to do that we're going to invoke a metaphor
and we're going to use this metaphor of a landscape as a lens through which to interpret our models okay so think of it the following way you are trying to come up with some sort of solution to a problem each solution has a value so the altitude here is the value of it so b is the best possible solution now along here along the x-axis these are all the different solutions so i might start out by having some idea let's just put it right here and here's my idea and it's an
okay idea but we'd like to think about how do we find better ideas well one thing we might do is we might sort of try things to the left and the right and realize and sort of climb up hill here and we get to some point c and c might be where i'd get stuck because if i go to the left i'm lower and if i go to the right i'm lower so i can say boy c is the best thing i can come up with what we want to see is how people come up
with these ideas how teams of people come up with better ideas and how we can avoid getting stuck on c and possibly get ourselves up to b how are we going to do it well here's what the model is going to look like we're going to start out by talking about something i'm going to call perspectives what is a perspective a perspective is how you represent a problem so if someone poses some problem to you again whether it's writing code healthcare policy designing a bicycle or designing an addition to your house you have some way of
representing that problem in your head that's going to be a perspective it's literally how you just how you encode the problem once you've encoded the problem what you do is you create again this is sort of metaphorically a landscape which i can think of your encoding is like that horizontal axis and then there's a value you know for each possible solution and that creates a landscape so we're going to talk about how different perspectives give different landscapes that's the first part second part is something i'm going to call heuristics heuristics are how you move on
the landscape so remember when i drew that landscape a minute ago i talked about climbing up the hill well hill climbing is one heuristic random search would be another heuristic right you could just sort of randomly pick some points and then find which one has the highest value that's another heuristic so we'll talk about how different perspectives and different heuristics allow people to find you know better or improving solutions to problems so that's going to be the focus of our model of problem solving people have perspectives and people have heuristics once we finish talking about
individuals then we'll talk about teams and one of the interesting things here is if you have groups of people or team of people trying to solve a problem you actually can show that they'll be better than the individuals in it and the reason why is because they have more tools and those tools tend to be diverse so they have different perspectives and different heuristics and all that diversity makes them better at coming coming up with new solutions and better solutions to problems so teams are going to be important after we've talked about teams and after
we've talked about the role of how you know one person can improve upon a solution of another we're going to sort of extend our model a little bit and talk about recombination so here's sort of the big idea the big idea is this i have some solution from one problem you have a solution for a different problem sometimes i can take your solution and combine it with my solution and come up with something even better so if you think about sophisticated products like a house an automobile or you know even a computer that consists of
all sorts of solutions to sub problems and we're going to see how by recombining solutions to sub problems we get ever better solutions and that that's really a big driver of innovation so let's think back for a second remember in our previous lecture we talked about how without sustained innovation we no longer get growth right the growth depends on sustained innovation what we're going to talk about here is how diversity leads to innovations and how recombinations of innovations can lead to even more innovations so that's the big theme so that's where we're headed we're going
to start by talking about perspectives then we'll talk about heuristics then we'll talk about how teams of people can leverage their diverse perspectives and heuristics and then we'll talk about how recombining ideas can really drive a lot of growth all right let's get started thank you hi in this lecture we're talking about problem solving and we're talking about the role that diverse perspectives play in finding solutions to problems so when you think about a problem perspective is how you represent it so remember from the previous lecture we talked about landscapes we talked about a landscape
being a way to represent the solutions along this axis and the value of the solutions as the as the height and so this is metaphorically a way to represent how someone might think about solving a problem finding high points on their landscape what we want to do is we want to take this metaphor and formalize it remember part of the reason for this course is to get better logic or to be able to think through things in a clear way so we want to take this landscape metaphor and turn it into a formal model so
how do we do it the first thing we do is we formally define what a perspective is so we speak math to metaphor so what a perspective is going to be is it's going to be a representation of all possible solutions so it's some encoding of the set of possible solutions to the problem once we have that encoding of the set of possible solutions then we can create our landscape by just assigning a value to each one of those solutions and that will give us that landscape picture like we saw before now most of us
are familiar with perspectives even though we don't know it let me give some examples remember when we took seventh grade math we learned about how to represent a point how to plot points and we typically learned two ways to do it the first way was cartesian coordinates so given a point we would represent it by an x and a y value in space so it might be five units this would be the point let's say five two right it's five units over the x direction two units in the y direction but we also learned another
way to represent points and that was polar coordinates we could take the same point and say well there's a radius which is its distance from the origin and then there's some angle theta which says how far we have to sweep out in order you know sweep that radius out in order to get to the point so two totally reasonable ways to represent a point x and y r and theta cartesian polar which is better well the answer is it depends let me show you why suppose i want to describe this line if i want to
describe this line i should use cartesian coordinates because i can just say y equals 3 and x moves from 2 to 5. it's really easy but suppose i want to describe this arc if i want to describe this arc now cartesian coordinates are going to be fairly complicated and i'd be better off using polar coordinates because the radius is fixed and i just talked about how the radius is you know there's this distance r and theta just moves from you know a to b let's say so depending on what i want to do if i
want to look at straight lines i should use cartesian and if i want to look at arcs i should probably use polar so perspectives depend on the problem now let's think about where we want to go we want to talk about how perspectives help us find solutions to new to problems and how perspectives help us be innovative well if you look at the history of science a lot of great breakthroughs you know we think about newton you know his theory of gravity you can think about people actually having new perspectives on problems let's take an
example so mendeleev came up with the periodic table and in the periodic table he represents the elements by atomic weight right he's got them in these different columns in doing so by by organizing the elements by atomic weight he found all sorts of structure right so all the metals lie in one column stuff like that right remember from high school chemistry class that's a perspective it's a representation of the set of all possible elements he could have organized them alphabetically but that wouldn't have made much sense so alphabetical representation wouldn't give us any structure atomic
weight representation gives us a lot of structure in fact when mendeleev you know wrote down all the elements that were known at the time according to atomic weight there were gaps in his representation there were holes for elements that were missing those elements became scandium gallium and germanium and were eventually found 10 to 15 years later after he'd written down the periodic table people were now able to find the missing elements that perspective atomic weight ended up being a very useful way to organize our thinking about the elements we do it all the time though when
you have any sort of task you'll find that you're actually using some sort of perspective suppose that you're hiring someone and you've got a bunch of recent college graduates who apply for a job and you've got to think okay how do i organize all these applicants by let's say 500 applicants well one thing you could do is you could organize them by gpa take the highest gpa down to the lowest gpa that'd be one representation and you might do that right if you valued competence or achievement but you might also value work ethic and if
that were the case you might instead organize those same you know cvs or application files by how thick they are figuring if you have really thick ones or people who work really really hard they've accomplished a lot well third thing you might do is you might value creativity and you might say well let's put the ones that are sort of most colorful most interesting over here and ones that are least colorful and least interesting over here that's the third way to do it now depending on what you're hiring for depending who the applicants are any
one of these might be fine the only point i'm trying to make is that there's different ways to organize these applicants and each one of those ways you organize them whether it's in your head or whether it's formally laying them out in some way is a perspective and those perspectives will determine how hard the problem will be for you let me explain why i want to take i want to go back to the landscape metaphor and i want to think of that landscape as being rugged and by rugged i mean that it's not it doesn't
look like a single peak that there's lots of peaks on it and i want to formalize this notion of peaks and to do so i'm going to define what i call a local optimum a local optimum is a point such that if you look at the points on either side of it they're lower in value so it's sort of a point that locally is the highest possible value so if i look at this particular rugged landscape again there's three local optima one two three for any one of these three points i'd be stuck
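
A minimal sketch of the local optimum definition above, assuming a one-dimensional landscape stored as a list of values. The function name `local_optima` and the sample values are illustrative assumptions, not taken from the lecture.

```python
def local_optima(values):
    """Return the indices whose value is strictly higher than every existing neighbor."""
    peaks = []
    for i, v in enumerate(values):
        # A missing neighbor (at either end of the list) counts as lower.
        left_lower = i == 0 or values[i - 1] < v
        right_lower = i == len(values) - 1 or values[i + 1] < v
        if left_lower and right_lower:
            peaks.append(i)
    return peaks


# Hypothetical rugged landscape (made-up values) with three peaks.
rugged = [1, 3, 2, 5, 4, 6, 2]
# Hypothetical single-peaked landscape.
single_peak = [1, 2, 3, 4, 3, 2, 1]

print(local_optima(rugged))       # [1, 3, 5]
print(local_optima(single_peak))  # [3]
```

A hill climber that only compares a point with its immediate neighbors can stop at any index this function returns, so fewer returned indices means fewer places to get stuck.
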
i couldn't if i looked to the left or to the right i wouldn't find a solution that's better so when we think about what makes a good perspective a good perspective is going to be a perspective that doesn't have many local optima a bad perspective is going to be one that has a lot of local optima let me give an example okay so suppose i'm coming up with a candy bar suppose i'm tasked with coming up with a new candy bar so i have my team of chefs make a whole bunch
of different you know confections for me to try and i want to find the very best one but there's so many of them there's so many possibilities not even sure how to think about it but one way to represent those candy bars might be by the number of calories that they had so i could organize all the different things they make by number of calories and if i did that maybe i'd have three local optima so that's a reasonable way to represent these possible candy bars alternatively i might represent those candy bars by masticity which
is chew time how long it takes to chew them so these would be the ones that maybe only take two minutes to chew and these may take 20 minutes to chew well chew time is probably not the best way to look at a candy bar and so as a result i'm gonna have a landscape with many many more peaks and so because it's got many more peaks that's more places i can get stuck so it's not as good of a way to represent the possible solutions it's not as good a perspective the best perspective would
be what we call a mount fuji landscape that'd be a landscape that just has one peak and these are called mount fuji landscapes because if you've ever been to japan and you look at mount fuji it looks pretty much like this actually not quite like this there's like snow on the top but for the most part it looks like one giant cone if you're on a mount fuji landscape if you're sitting at some point you can always just climb your way right up to the top so these single peak landscapes are really good because you've
basically taken a problem and made it very very simple what would be an example of a mount fuji landscape i'm going to take a famous example so a famous example comes from scientific management and it's frederick taylor taylor famously solved for the optimal size of a shovel so let's think about the shovel size landscape so on this axis i've got the size of the shovel and on this axis i've got the value now what do i mean by the value i don't mean how much i can sell the shovel for i mean it's like how useful the
shovel is at the task so let's suppose that we're shoveling coal and i want to think about how many pounds of coal someone can shovel in a day as a function of the size so let's start out here where the size is zero so this is the size of the pan well if i have a shovel that has a pan of size 0 that's formally known as a stick and we can't get anything we're not going to shovel anything with a stick as i make it bigger you know to make it the size of maybe
like a little spoon or something then we can shovel a little bit and as i make the shovel bigger and bigger and bigger whoever my workers are can shovel more and more coal but at some point the shovel's going to get a little bit too big and it's going to be too heavy to lift and the worker's going to get tired and he'll shovel less and less and less and less and then eventually get to some point where the shovel's so big that he can't even lift it and it's as useless as
the stick so if i look at value in terms of how much coal the person can shovel in a day as a function of the size of the shovel i'm going to get a single peaked landscape that's going to be an easy problem to solve and this idea that we could represent scientific problems in this way or in particular engineering problems in this way and then climb our way to peaks was the basis of something called scientific management the idea was that you could then by finding these high points on these landscapes find
optimal solutions well you're only going to find the optimal solution for sure if you hill climb like this if it's single peaked if it looks like a mount fuji landscape you're fine but if it's rugged if it looks like this masticity landscape if you have a bad perspective well then if you climb hills you could get stuck just about anywhere so what you'd like is you'd like a mount fuji landscape and in the case of simple things like the shovel that's easy to get let me give another example this one's a
lot of fun this is a favorite game of mine called sum to 15 and was developed by herb simon who's a nobel prize winner in economics and sum to 15 was developed to show people why diverse perspectives are so useful like different ways of representing a problem can make them easy can make them like mount fuji or can make them really difficult okay so here's how sum to 15 works there's cards numbered from one to nine face up on a table that's nine cards in front of you there's two players each person takes turns
taking a card until all the cards are gone possibly right you could end sooner if anybody ever holds three cards that sum up to exactly 15 they win that's the game so really simple nine cards alternate taking cards if you ever get exactly three that sum to 15 you win so let me show you a game here's a game between two people call them paul and doug paul goes first i think when you play this game the thing to do would be to choose the five paul chooses the four which is sort of an odd
choice doug goes next so he takes the five paul then takes the 6. now the 6 is a strange choice because 4 plus 6 plus 5 equals 15 so it looks like there's no way that he can win well this would be confusing to doug so doug's going to take the 8. now notice 8 plus 5 equals 13. so that means paul has to take the 2. so he takes the 2. well think about what happens next 4 plus 2 is six so if doug doesn't take the nine he's going to lose but six
plus two is eight so if doug doesn't take the seven he's gonna lose so what you've got here is that paul has won no matter what doug does paul's gonna win the game now this is a pretty tricky game right it was developed by a nobel prize winner you can imagine there's lots of strategy involved i want to show you this game in a different perspective remember the magic square from 7th grade math every row adds up to 15 8 plus 3 plus 4 1 plus 5 plus 9 and 6 plus 7 plus 2. so
does every column 8 plus 1 plus 6 sums up to 15 3 plus 5 plus 7 sums up to 15 and even the diagonals eight five two is fifteen six five four is fifteen every row every column and diagonal sums up to fifteen let me show you this game again on the magic square this is just a different perspective on sum to fifteen paul goes first and takes the 4. doug goes next and takes the 5. paul takes the 6 which is an odd choice because now he can't win doug then takes the 8 paul blocks
him with the 2 but now it turns out either the nine or seven will let paul win what game is this well you're right it's tic-tac-toe sum to 15 is just tic-tac-toe but on a different perspective using a different perspective so if you turn sum to 15 if you move the cards one to nine and put them in the magic square what you do is you create a mount fuji landscape in a sense right you make the problem really simple so a lot of great breakthroughs like the periodic table newton's theory of gravity those are
perspectives on problems that turn something that was really difficult to figure out into something that suddenly makes a lot of sense very easy to see the solution this leads to something that in one of my books the difference i call the savant existence theorem for any problem that's out there there exists some way to represent it so that you turn it into a mount fuji problem now why is that well it's actually fairly straightforward all you have to do is you know if you've got all the solutions here represented on
this thing put the very best one in the middle and then put the worst ones at the end and then just sort of line up the solutions in such a way so that you turn it into a mount fuji so it's very straightforward the thing is in order to make it a mount fuji you'd have to know the solution already this isn't a good way to solve problems but the point is it exists so there's always the possibility that someone could look at a particular problem and say hey what if we think of it this
way and in doing so turn something that was really rugged into something that looks like mount fuji here's the flip side though there's a ton of bad perspectives so just like there's these mount fuji perspectives there's also lots and lots of horrible ways to look at problems think about it suppose i have just you know ten alternatives and i want to think about what are all the different ways i can just put them in a line well there's 10 things i could put first nine things i could put second eight things i could put third
and so on so there's ten times nine times eight times seven times six times five times four times three times two times one perspectives most of those are going to not be very good they're not going to organize the set of solutions in any useful way and in particular only a few of them are going to create mount fujis so when we think about the value of perspectives what we get is this there's really good ones out there that insightful smart people can come up with really good representations of a problem to make the
landscapes less rugged if we just think about things in random ways we're likely to get a landscape that's so rugged that we're going to get stuck just about everywhere we're not going to find good solutions to the problem and we're going to get things that look like the masticity landscape so what have we learned first thing we've learned is that when you go about trying to solve a problem we encode it in some way and that's a perspective and a perspective creates peaks right it creates these local optima so better perspectives have fewer local optima worse
perspectives have lots of local optima and if you think about how many perspectives are out there we just saw there's millions of them even for ten alternatives because there's so many perspectives most of those probably aren't very useful some of them though turn problems into mount fujis and sometimes it takes a genius it takes a newton it takes a mendeleev to come up with a way of representing reality so that something that was incredibly rugged becomes mount fuji like other times if you think about the size of a shovel we're not going to get things with lots and lots of
peaks that problem most of us could probably figure out a way to represent that problem just by shovel size so that it becomes a mount fuji but the big point is this when we go about solving problems the first thing we do is we encode them we have some representation of the problem that representation determines how hard the problem will be if we represent it in such a way that it's a mount fuji it's easy if we represent it in such a way that it looks like that masticity landscape it's probably going to be fairly hard where
we want to go next is we want to talk about once we've got this representation of the possible solutions once we have that landscape so to speak how do we search on that landscape so one thing we talked about was climbing hills but there's lots of different ways you can climb hills that's what we'll talk about next the heuristics we use on a landscape thanks hi in the previous lecture we talked about perspectives how we represent problems in this lecture we're going to talk about heuristics how you go about finding solutions to problems once you've represented them
in your perspective so what a heuristic is is it's just a technique it's a tool it's a way in which you look for new solutions so in a sense we've already talked about this right we talked about those landscapes what we're really talking about is hill climbing on those landscapes i sort of assumed you're at some point and what you do is climb hills so i defined things like a local optimum right so here's a picture of these local optima what i was implicitly assuming is that if i'm at some point here that i can
climb a hill and get to here so the reason these are local optima is because if i tried to climb hills i'd be stuck at any of those points because any direction would be down but hill climbing the idea of just sort of climbing up a hill is just one of many possible heuristics you could use now heuristics are going to be defined relative to the problem you're trying to solve so for example one famous heuristic that's in a lot of books on how to innovate is called do the opposite what is do the
opposite it means think of what the existing solution is and do the exact opposite so for example think about how it is that when you go buy something when you go buy something somebody else tells you the price what would the opposite be do the opposite would be that you actually tell them the price well a lot of companies have been started that do exactly this so a company like priceline you go to the hotels and you say to the airlines here's how much i'd like to pay to stay at your hotel or to use
your airlines it's the exact opposite or alternatively you think about firms producing products we normally think that they want to be lower costs they want their costs below that of the other firm we could do the opposite and say i want my price to be higher because i want to signal quality so doing the opposite is a strategy that sometimes leads to really interesting innovations that's a heuristic now you can think of this in the context of problem solving generally when there's there's some sort of solution that exists i'm going to do the exact opposite
so if everybody else makes grilled cheese sandwiches by putting the cheese between the bread i'm going to do the opposite and i'm going to actually put the cheese outside the bread if you haven't tried that it's actually pretty good here's another one big rocks first now stephen covey has written a bunch of books on what makes people successful what rules do you follow to be successful when you think of these books like the seven habits of highly effective people or you know almost any one of these self-help books they're filled with heuristics and one of
those heuristics often is big rocks first what is big rocks first it says suppose that you have the following task you've got a bucket here and you've got a bunch of rocks of various sizes that you've got to put in that bucket if you put the little rocks in first what happens is that the bucket fills up with the little rocks and then when you put the big rocks in they don't all fit they spill out the side but if you put the big rocks in first right so let's erase all these rocks and
let's start over with the empty bucket i put the big rocks in first then when i put the little ones in they'll fill in the gaps here and everything will work fine so big rocks first little rocks second if you're filling a bucket and covey argues that this is something that successful people know how to do they put the big rocks in first now rocks here right it's not like most successful people spend a lot of time filling buckets with rocks the idea here is big rocks represent the important things so covey's saying if
you want to be successful deal with important things first that's a sign that's a rule that's a heuristic that successful people use so what he's saying is there's a lot of problems out there if you follow this heuristic you'll find better solutions if you deal with big rocks first here's the rub though there's a famous theorem in computer science called the no free lunch theorem proved by wolpert and macready and in this theorem what they show is the following if you take two heuristics that each tell you to search the same number
of solutions so that i mean if if you do the opposite versus random search or do the opposite versus check the thing that's one bigger in your representation so if they each tell you to search the same number of points then if you look across all possible problems now again all possible problems is going to mean that some of these problems are incredibly hard and some are really easy that no heuristic is any better than any other so if you take a heuristic like big rocks first that means that it's no better than any other
heuristic across all problems so does that mean that covey's wrong no it doesn't mean that covey's wrong because the no free lunch theorem says again if you look in here they use the word algorithm instead of heuristic but if you look across all problems no heuristic is better than any other what covey's saying is he spent a lot of time in management and what he thinks is the types of problems you see in management lend themselves to the big rocks first heuristic the types of problems you face as a person are big rocks first kind of
problems and so therefore big rocks first is a good algorithm to use a good heuristic to use to find solutions to problems here's another way to think about the no free lunch theorem if you don't know anything about the problem if you don't know something about the type of the problem then no heuristic is really that much better than any other one or if you don't know if your perspective on the problem is very useful you might as well just hill climb once though you've learned something about the problem you might realize that you know
this is a big rocks first kind of problem but it could be the case that it's not a big rocks first kind of problem so for example there are some things that are little rocks first problems let me give an example from my own life i put in a fence around my yard and i had to dig a whole bunch of holes so if you're digging a hole in the ground like this right so here's the hole you're going to dig and there's big rocks in here and little rocks in here you actually want to
take the little rocks out first because if you don't take the little rocks out first you can't get the big rocks out so if you're filling a bucket it's big rocks first if you're digging a hole it's little rocks first so if you don't know anything about the problem what the no free lunch theorem says is no heuristic is better than any other if you know a lot about the problem you can figure out should i do big rocks or should i do little rocks now we've been talking a lot in terms of metaphor let's actually
take this to real problems and let's see how these heuristics in particular having diverse heuristics is really useful in terms of finding solutions to problems so let's suppose i've got a representation of a problem that consists of two dimensions so i've laid out all my possible solutions in this big grid and these could be you know anything you want so these could be let's say these are pints of ice cream and on this side um on this axis i've got the number of chunks right chocolate chips in it and this is let's say the size
of those chocolate chips this is my representation of that problem now what i could do is i could think what's my heuristic well my heuristic might be that i look to the north south east and west so that's one heuristic and this is actually formally known as the von neumann neighborhood and that'd be one way that i could look for possible solutions but that's not the only way i could take these same pints of ice cream and you could have somebody else who says well you know that's sort of inside the box thinking i'm going
to look to the northeast northwest southeast and southwest and this is a different heuristic a different way to look for solutions if i have one person who looks like this and another person who looks like that and i combine them right what do i get i get that i look at more points so diverse heuristics are really useful if we have different ways of searching the space of possibilities because of the fact that we're actually going to search more points so let's combine all this for a second what were perspectives perspectives are ways
of representing the problem right so one person might have a perspective that looks like this another person might have a perspective that makes those same you know problems or the same set of solutions look like that what is a heuristic a heuristic is how we search so one person might hill climb and so that person would get stuck at these points right in these two landscapes another person might not hill climb they might sort of have some do the opposite rule which means that they sort of jump all the way to the opposite side of the space
well that may mean that they don't get stuck at this point because they jump all the way to here and it might mean that they don't get stuck at this point because they jump all the way to there so what we're going to see in the next lecture is how diverse perspectives plus diverse heuristics enable teams of people groups of people to find better solutions to problems so let's wrap this up previous lecture we talked about perspectives and now perspectives are representations of problems in this lecture we talk about heuristics which is how we search
within our perspectives and we learned this important theorem called the no free lunch theorem that unless we know something about the problem no heuristic is better than any other but if we do know something about the problem then we might be able to come up with a better heuristic we've also seen how diverse heuristics help given a problem if i look in different directions than you look that means we search more points what we're going to see next is how if we've got lots of people working on a problem and we have lots of different
perspectives and lots of different heuristics then collectively we're going to be able to do better than any one of us could do individually okay thanks hi remember we're talking about problem solving we talked about when you first solve a problem what you do is come up with a perspective a way of representing the set of solutions then we talked about how to use heuristics to search among those possible solutions given your representation and we've seen how having lots of heuristics and diverse heuristics can help you find look at more points and possibly find better solutions
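
A minimal sketch of the diverse-heuristics point from the ice cream grid, assuming solutions sit on integer grid coordinates. The function names `von_neumann` and `diagonal` and the starting point are illustrative assumptions.

```python
def von_neumann(point):
    """Neighbors to the north, south, east and west of a grid point."""
    x, y = point
    return {(x, y + 1), (x, y - 1), (x + 1, y), (x - 1, y)}


def diagonal(point):
    """Neighbors to the northeast, northwest, southeast and southwest."""
    x, y = point
    return {(x + 1, y + 1), (x - 1, y + 1), (x + 1, y - 1), (x - 1, y - 1)}


start = (0, 0)  # hypothetical current solution
combined = von_neumann(start) | diagonal(start)

# Each searcher checks 4 points; together they check 8 distinct points.
print(len(von_neumann(start)), len(diagonal(start)), len(combined))  # 4 4 8
```

The two heuristics share no points, so combining them doubles the number of candidate solutions examined from the same starting position.
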
what we want to do in this lecture is combine those two ideas perspectives plus heuristics to show why teams of people can often find solutions to a problem that individuals can't why teams are better now when i say teams i'm going to use this in a very loose sense i don't necessarily mean a team of people sitting in a room and brainstorming that sort of stuff what i mean is a collection of people possibly you know working even over time so if you look at something even like the toaster on your counter you can think
of that as being something that's been really consistently and constantly improved upon by a team of people so it's the person who first invented the toaster and then somebody improved it and somebody came up with the crumb tray and then somebody came up with the automatic shut off button and all sorts of things right so a toaster consists of a whole bunch of improvements and you can think of that current solution we have as being something that a team has come up with so again by team i don't necessarily mean
a group all working together and you know in some unit i just mean a collection of people so how does it work why are teams or groups of people better than individuals well let's go back and let's think about the candy bar example and remember i had one landscape one perspective based on calories and that had three local peaks right and let's represent those by a b and c and then i had another landscape that was represented by masticity and that had five peaks and we can call these let's call these a b d
e and f so these are different than the peaks for the caloric landscape with one exception notice for sure that a which is the best possible point that has to be a peak in the caloric landscape and it's also got to be a peak in the masticity landscape and that's because it's the best possible point so if it's the best possible point it's got to be a peak in every landscape now we can characterize these problem solvers by their local peaks by their local optima so the local optima of the caloric landscape are a b and
c the local optima for the masticity landscape are a b d e and f and remember we said the caloric landscape is a better landscape than the masticity landscape because of the fact that it had fewer local optima so one way to think about how good you are at solving a problem is how many local optima you have given your perspective and your heuristic now here we're assuming the heuristic right is just hill climbing let's go deeper because that's a fairly crude way of thinking about how good a problem solver is we can
actually take into account the average value of those peaks so the peaks where people get stuck are a b c d e and f and we can assign a value to each of those so suppose the value of a is 10 b is 8 and so on so a is the global optimum and some of these other peaks aren't so good what we can do is we can ask what's the average value of a peak for the caloric problem solver so the problem solver who thinks in terms of
the caloric perspective they get stuck at a b and c what's the average value well a has a value of 10 b has a value of 8 c has a value of 6. and so the ability is just the average of those three peaks which is eight but if i look at the masticity problem solver they get stuck at a b d e and f and those have values ten eight six two and four and the average of those is six so we can think about the ability of the masticity problem solver
as being six so not only did the caloric problem solver have fewer local optima they had higher average values this is another reason why that person's a better problem solver let's think of them now though as working as a team if i think of them working as a team the caloric problem solver gets stuck at a b and c the masticity problem solver gets stuck at a b d e and f let's suppose first that the caloric problem solver works on the problem first and she gets stuck at b she then passes the problem to the masticity
problem solver and the masticity person says you know what i can't help you because b looks pretty good to me because b is also a peak for him suppose instead though that the caloric problem solver gets stuck at c and she passes c on to the masticity problem solver now for the masticity person c if you notice isn't anywhere on this list c isn't a local optimum that means that the masticity person can get from c to some other local optimum and it's got to be one that's better why does it have to be better because
this person's hill climbing if he's hill climbing then he's got to be able to find something that's better than c and that's going to be either a or b so the intersection of these local optima a and b are the only places where they can get stuck if for example the masticity person went first and got stuck at e then the caloric person could take e and get to someplace else either a b or c if she gets to a or b the masticity person's also stuck if she gets to c then the masticity person
can then in turn take it up to a or b so the only places the team can get stuck is a or b you can make this formal it's called the intersection property that the local optima for the team is the intersection of the local optima for the individuals so if we look at the team there's only two places the team can get stuck 10 and 8 and the average value there is 9. so the ability of the team is higher than the ability of either person and the reason why is because the team's
local optima is the intersection of the local optima for the individuals so the reason why then we see over time products get better the reason why we see teams being really innovative the reason why we see a lot of science being done by teams of people is because the only place a team can get stuck is where everybody on the team can get stuck so this very simple model having perspectives and heuristics can explain why is it the case that teams are so much better than individuals and why over time we keep finding better and
better solutions to problems it's not necessarily that we're getting smarter now it's true we are coming up with new ways to represent problems and we also are coming up with all sorts of new heuristics all the time we're developing new ways to solve problems all the time but another thing that's going on is just because of the accumulation of so many different ways of looking at problems and different ways of trying to solve them so we get the intersection of all those peaks and that gives us better solutions so here's the big claim the team
can only get stuck at a solution that's a local optimum for everyone on the team that means the team has to be better than the people in it so what we want is we want people with different local optima we want people to get stuck in different places well how do we get it we know we've already looked at this twice right we looked at it first with respect to perspectives if we have different perspectives so if you code it this way and i code it this way then we're going to get stuck in different places
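
A minimal sketch of the intersection property, using the peak values given in the candy bar example (a is 10, b is 8, c is 6, and d, e, f are 6, 2 and 4). The variable names and the `ability` helper are illustrative assumptions.

```python
# Value of each peak, as stated in the candy bar example.
values = {"a": 10, "b": 8, "c": 6, "d": 6, "e": 2, "f": 4}

# Local optima of each problem solver under hill climbing.
caloric = {"a", "b", "c"}
masticity = {"a", "b", "d", "e", "f"}


def ability(peaks):
    """Average value of the points where a problem solver can get stuck."""
    return sum(values[p] for p in peaks) / len(peaks)


# The team can only get stuck where every member is stuck.
team = caloric & masticity

print(sorted(team))                                         # ['a', 'b']
print(ability(caloric), ability(masticity), ability(team))  # 8.0 6.0 9.0
```

The team's average of 9 is higher than either individual's because the set intersection discards every peak that at least one member can climb away from.
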
we also want people with different heuristics if i look in this direction and this direction and you look in this direction and this direction then when you add us together we look in all four of those directions so what we want is we want diverse perspectives and we want diverse heuristics and that diversity will give us different local optima and those different local optima will mean that when we take the intersections we end up with better points that's sort of the big idea so if we take again let's play this out in more detail again imagine
we've got here's the set of solutions if one of us looks like this and one of us looks maybe two to the left and one looks two down and one looks to the north south east and west if we have all these different you know maybe one person looks two over this way all these different heuristics looking at a problem that means we're less likely to get stuck at the same point which means the team is going to do better or over time society is going to do better finding solutions to problems this
all seems really smooth and nice and great and we've seen teams are better and we see the value of diverse perspectives we see the value of diverse heuristics but what's missing because this seems highly stylized there's two things that i've left out first one is right we can write this down as communication i've assumed that when you've got a team solving a problem they can communicate their solutions to one another right away now that's not always the case there's a lot of misunderstanding that's going on and we might not listen i might just say no
i'm not listening i'm not listening right and then no matter what you say we don't find the better solution if you think of something like the toaster though we can communicate through the toaster if i come up with a better toaster and i make it then you can look at my solution you know what i've done and then you can add the crumb tray so think about making an artifact the artifact itself is the solution and it gets communicated right away but generally speaking communication can be a problem the other thing i've assumed
away is the possibility of sort of error in interpreting the value of a solution so i'm assuming if somebody proposes something that's better we instantly know it it's as if there's some sort of oracle we can go to and say oh yep that's a better solution that may not always be the case so it could be that i propose something really interesting and people just think no that's a bad idea they make an error in terms of whether or not it's interesting or it could be that i propose something that's worse and
people think oh that's a great idea and then we actually do it and it's not a good idea so i've assumed there's no errors in determining the value of a solution that when somebody proposes a solution we know exactly what it's worth that's not always going to be the case so it won't always be true that there's perfect communication and there's perfect evaluation so in a richer model we could include communication error and that's going to hurt teams and we can also include just error in evaluation that's also going to hurt teams even so right this
model's shown us something fairly powerful which is that diverse representations of problems and diverse ways of coming up with solutions can make teams of people better at coming up with solutions than individuals and it also sort of told us where innovation is coming from right innovation is coming from different ways of seeing problems and different ways of finding solutions there's a lot going on right now we've got this model of problem solving and we can think about people finding solutions to particular problems now we want to step back a bit in the next
lecture and think about bigger things like designing a house designing a car designing a railway system designing a city you know bigger problems well oftentimes with those bigger problems think about making a computer a computer may consist of the solutions to a whole bunch of sub problems so where we want to go next is we want to talk about how we combine solutions to come up with new solutions and we'll see how that can even be used as an argument for where economic growth comes from it actually
comes from individual solutions being recombined okay thanks hi in previous lectures we've talked about how when solving problems people have perspectives representations of the problem and then they have heuristics which are techniques they use to find solutions given the representation we've been focusing on individual problems individual solutions in this lecture what i want to do is i want to talk about recombination so once i've got a solution or even once i've got a heuristic how i can recombine those to come up with even more solutions or more heuristics and we're going to see the awesome
power of recombination now remember stepping way back for a second we've been trying to think about innovation in the previous lecture where does innovation come from what we're going to see is that recombination is incredibly powerful and if we have a few solutions or a few heuristics we can combine those to create ever more and that may be the real driving force behind innovation in the economy which is that when we come up with a solution we can then recombine it with all sorts of other solutions and that leads to ever more innovation let me
give an example to show how this works think about those you know math tests or iq tests you might take online and they might give you a question like this 1 2 3 5 blank 13 and you've got to ask what number goes in there right and the answer here is just 8 you get this either one of two ways you can do 1 plus 2 is 3 2 plus 3 is 5 you know 5 plus 8 is 13 and so on or you can subtract 13 minus 8 is 5 8 minus 5 is 3 and so
on here's another one 1 4 blank 16 25 36 right the answer here is 9 and this is just squares they can also come up with very hard ones 1 2 6 blank 1806 i don't put this on here to make us not feel intelligent i just want to show you that these can be hard and i want to show the power of recombination the first one which was very easy required subtraction the second which was harder involved squares this one which is almost impossible actually just requires combining those two techniques let's think about it what's
2 minus 1 that's 1 but that's also 1 squared what's 6 minus 2 that's 4 but that's also 2 squared what's 42 minus 6 that's 36 which is 6 squared and what's 1806 minus 42 that's 1764 which is 42 squared so the answer to this one 42 could be gotten by realizing just combine the first two tricks squaring and subtracting and that gives us the answer well this idea that you can recombine is really a driver of economic growth generally and also a driver of science because when you come up with a new solution we
can combine that solution with other solutions and we get this geometric explosion in the number of possibilities to show you how the geometric explosion works we want to just do a little bit of math so let's start with something simple and then do something more complicated suppose i've got 10 possible you know solutions or techniques i can use and i want to just pick three of them i can think of this as the following mathematical problem i've got a box with 10 objects and i just want to pick three objects from those 10 how
many ways to do it well there's ten things i could pick first nine things i could pick second and eight things i could pick third this actually though overstates the total number because if i pick object a and then object b and then object c that's the same thing as picking c and then b and then a or c and then a and then b so if i think about those three objects there's three things i could pick first two things i could have picked second and one thing i could pick third so these are
the different ways of arranging those three objects so i get 10 times 9 times 8 divided by 3 times 2 times 1 which is 120 so if i have 10 solutions or you know 10 technologies or 10 heuristics or 10 ideas that gives me 120 combinations of three 120 doesn't seem very big and it's not but the point is we've got way more than 10 solutions and way more than 10 heuristics and way more than 10 scientific theories we've got a ton of them so let's blow up the numbers a little bit let's suppose you
have a deck of cards suppose i have 52 cards in a deck and i want to just combine 20 of them how big of a number do i get then well there's 52 cards i could pick first 51 second and so on all the way down to 33 cards i could pick for the 20th but now i've got those 20 cards and i could have picked those same 20 cards in lots of different orders so any one of 20 could have been picked first any one of 19 could have been picked second and so on so my
answer is going to be 52 times 51 times 50 all the way down to times 33 divided by 20 times 19 times 18 and so on down to 1 that's going to be the number of ways to pick 20 cards from 52 well how big is that huge it's over 125 trillion so when you think about combining technologies combining heuristics combining ideas we get this huge explosion and every time anybody has an idea it can combine with every other idea and every other combination of ideas and this may be a big reason why we see so much growth why we've been
able to sustain growth so think back to our economic growth model right remember we had that like a times capital to the beta times labor to the one minus beta thing and a was the technology parameter well and for sustained growth we needed that a to get bigger and bigger and bigger well one thing that makes that a bigger and bigger and bigger is when people have ideas they can be recombined with every other idea which leads to more and more growth this idea of ideas building on ideas is the foundation of a theory of
economic growth due to martin weitzman at harvard called recombinant growth and the idea is ideas get generated all the time you know the steam engine gets developed the gasoline engine gets developed the microprocessor gets developed and all these things get recombined into interesting combinations and those combinations in turn get recombined right to create ever more growth so that's the basic idea behind recombinant growth so if you take something like the steam engine right here's a picture of the early newcomen atmospheric engine which is really just a steam engine right it's got all these things it's
got pumps right it's got a steam piston it's got a boiler right it's got a water reservoir it's got this little like lever thing like a teeter-totter these are all solutions to previous problems if you think of the gasoline engine right it's also got pistons and fuel injectors and all that sort of stuff it consists of recombinations of solutions to all sorts of different problems so what we get is these big machines or even the computer on your desk right it consists of solutions to all sorts of other problems so a lot of our inventions are recombinations
of old solutions take a car your car consists of an engine wheels steering mechanisms and now it consists of all sorts of electronic stuff so a car even though it's a solution to a problem is comprised of a whole bunch of solutions to other problems combined in interesting ways so it's just recombinations that can drive a lot of growth when you think about all those parts that went into the car and the steam engine they weren't developed with the car and the steam engine in mind they were developed for other purposes and this is an idea from
biology called exaptation now the classic example of exaptation is the feather birds developed feathers primarily to keep themselves warm but eventually those same feathers allowed them to fly so exaptation simply means this you come up with some innovation some solution for one reason but then it gets exapted it gets used in another context so emily dickinson famously said hope is the thing with feathers it's a good thing to keep in mind right because feathers are this classic example of exaptation and hope innovation change is the thing with feathers as well right it's our ability to
take a solution for one problem and apply it to something new what do i mean take the laser the laser was not developed when they came up with the laser they didn't think wow we can now have laser printers we can have laser pointers no that wasn't what they were thinking at all they just came up with a laser so once something's developed it gets used for all sorts of things that were never expected through this power of recombination even perspectives can do this so remember my sort of silly perspective on the candy bar the masticity perspective you might think
you know that doesn't really make a lot of sense masticity doesn't make a lot of sense as a perspective it's sort of a useless perspective but in fact masticity might be a really useful perspective for other problems for example if i'm coming up with pasta or breakfast cereal or something like that there may be sort of a sweet spot in terms of masticity so that could be a really good way of looking at those problems so even failed solutions for one problem may work really well with solutions to other problems
so the famous example here right is the post-it note the glue that's used in post-it notes was originally sort of a failure it was a glue that didn't stick very well but it turned out to be useful for other sorts of problems namely making sticky notes now there's more to it than this though so it's not just the recombination of ideas because for hundreds even thousands of years people had ideas and here's where we've got to sort of reach just one level deeper if you think about why we've had such sustained growth how is it these
ideas have been able to be recombined we have to recognize there had to be some way to communicate those ideas so joel mokyr wrote a wonderful book called the gifts of athena in the gifts of athena he talks about how the rise of things like modern universities the printing press and scientific communication allowed ideas to be transferred from one location and one person to another and so what really led to this you know huge burst of activity you know sometimes called the technological revolution was the fact that we can now share those ideas and then recombine
them because you know like a tree if an idea falls in a forest nobody hears it and nothing happens to it so where are we here's how we think about innovation when you think about problem solving there are several things going on first is you have to represent that problem in some way second thing is you've got to have some way of looking for solutions to that problem different people represent problems in different ways different people look for solutions to problems in different ways that means different people can help one another out through that diversity second thing once somebody finds a
solution to a problem or when somebody comes up with some sort of product or even comes up with a representation like a perspective or a heuristic that can be recombined with all the other ways of thinking and all the other solutions we have to lead to ever more growth so if you want to ask where does innovation come from comes from two things comes from diversity of perspectives and heuristics and it comes from recombination of those new ideas and that's what allows for you know ever improving solutions to problems and ever improving new ideas new
products new technologies and new policies thank you hi in this set of lectures we're going to talk about markov models now markov models are really simple they consist of just two parts first thing is there's a set of states so those could be states that a person's psyche could be in they could be the states of a particular government or an economy and then there's going to be transition probabilities and the transition probabilities are going to tell us the probability of moving from one state to another so remember all the stuff we learned about probabilities how
they sum to 1 and that sort of thing all those rules are going to apply here but the probabilities are going to tell us how likely it is to move from say one state to another let me give a couple examples so first let's suppose we have some students and those students could be in either one of two states they could be alert or they could be bored and there's going to be some probability p that they move from alert to bored and maybe some probability q that they move from bored to alert and over time
students are moving back and forth from the alert state to the bored state and the markov process will give us a framework with which to understand how those dynamics take place now alert and bored students don't seem maybe that important so let's do something more relevant let's talk about countries being free or not free so those are the two states a country can be free a country can be not free and what we can do is we can look at data and ask what's the probability that a state moves from free to not free and what's the probability
that a state moves from not free to free that'll also be a markov process so if we look at historically the number of free states and not free states and then create a third category called partly free which is the red line we can see that there's these different trends right the free states seem to be increasing the not free tend to be decreasing what we can do is we can use a markov process to figure out where this process is going to end up do we end up with all free states
or are we going to end up with maybe some moderate number of free states and the process still churning that's where the model is going to help us now remember we talked about the different sorts of things that processes can do they can go to equilibria they can be cycles they can be completely random or they can be complex what we're going to find is as long as just a few assumptions hold that markov processes are going to be here they're going to go to equilibria and so there's a theorem called the markov convergence theorem this
markov convergence theorem tells us as long as a couple really mild assumptions hold namely there's a finite number of states and those probabilities stay fixed and then one other thing you can get from any state to any other state then what we'll get is the system goes to an equilibrium so this is a really powerful thing it has all sorts of implications that we're going to flesh out as we look more deeply into the model now to do this to understand markov processes we're going to introduce a little bit more notation
and another technique another tool for understanding models these are called matrices so a matrix is really just a little grid like a two by two or three by three where you put numbers in here like 0.4 0.5 0.6 0.5 and those will be the transition probabilities so what we're going to learn how to do is we're going to learn how to multiply matrices in order to understand these markov processes in particular to understand how the markov convergence theorem works we'll use these matrices to explain why these systems go to equilibria now
the reason we study markov processes is twofold one is they're really sort of a useful way to think about how the world works and we get this really powerful result the markov convergence theorem that says these systems are going to go to these equilibria any markov process that meets those assumptions goes to an equilibrium second reason we're going to do them is what we talked about in the previous lectures this idea of exaptation the markov model is incredibly fertile once we have the markov idea in our head once we understand what a markov process is we can
apply it in a whole bunch of different settings in fact one of my colleagues when you give him almost anything he'll say oh that's a markov process and there's a sense in which a lot of things are markov processes and it's often really really useful to think of things in the context of markov processes it's also true once we have this idea of transition probabilities and matrices we'll see that we can use those in a lot of settings as well okay so let's get started we're going to start out by just doing a very very simple
markov process then what we're going to do is we're going to look at a slightly more complicated one and then see how the markov convergence theorem works once we've got all that in play then we'll go back and talk about exaptation and where we can apply this in other settings okay thanks hi welcome back in this set of lectures we're talking about markov models and in the previous lecture i introduced what they were there's a finite set of states and there's transition probabilities between those states what i want to do in this lecture is just show you how a
simple markov process works so we're going to take a very very simple markov process and work through it see exactly how the dynamics unfold and in doing so we're going to learn how to use matrices actually specifically how to multiply matrices okay so let's take our simple example let's do the case of the alert and bored students so i'm teaching a class like many people do this is an online class there's some percentage of people that are alert and there's some percentage of people that are bored now what's going to happen at any given
moment in time someone who's alert could switch and become bored and someone who's bored could switch and become alert and what we want to do is understand the dynamics of that process we want a model that helps us understand these dynamic processes where there's these states alert and bored and people move between them so we've got to make some assumptions we've got the two states alert and bored now we need to understand the transition probabilities so we need to assume something about transition probabilities so let's assume the following is true that 20 percent of the alert students
become bored but that 25 percent of the bored students suddenly say hey that sounds interesting these markov processes sound really cool and they become alert so that's what we want to model so we can think of that as we've got the set of alert students 20 percent of them are going to become bored and of the bored students 25 percent are going to become alert so we can draw this picture but this doesn't help us much we want to sort of figure out what's gonna happen and this is where the matrices will be useful now before we
do the matrices let's just try and do it by hand so let's start with the scenario where we've got a hundred alert students so if i have 100 alert let's say alert is 100 and bored is zero and i know that 20 percent of the alert become bored and the rest are going to stay alert so that means what's going to happen is i'm going to then have 80 alert and i'm going to have 20 bored but now i've got to think okay what happens next what happens next i know that of the 80
alert i know that 20 percent which is 16 are going to become bored and the rest 64 will be alert now the 20 that are bored i know that 25 percent of that is five so five of those become alert and i know that 75 percent of it which is 15 stay bored so what i'm going to get is 69 alert and 31 bored and i can say okay now i've got alert 69 and bored i've got 31 and now i've got to do this again i've got to
think okay well 20 percent of these which is going to be 13.8 become bored and so on and you think okay this gets really complicated i've got numbers all over the place maybe there's a better way maybe instead of writing all these numbers with these arrows there must be a simpler way to keep track of all this well there is and the idea is something called a markov transition matrix and here's the idea we basically write down the probabilities of moving from state to state so these columns tell us what's true at time t and the
rows tell us what's true at time t plus one so if you're alert at time t there's an 80 percent chance you stay alert and a 20 percent chance you become bored if you're bored at time t there's a 25 percent chance you become alert and a 75 percent chance that you stay bored so this gives us just a matrix representation a simple representation of all these transition probabilities now the reason this is useful is then we can just multiply these matrices to see how the transitions unfold so here's an example suppose i start with
a hundred percent or one this is just a probability so with probability one someone's alert and i want to figure out how many alert people there are next time well this number this 0.8 tells me the percentage of alert people that stay alert so i take 0.8 and i multiply it by the one and that gives me 0.8 now i want to ask how many bored people become alert well 25 percent do and how many bored people were there there was zero so i'm going to get 0.25 times 0 so i end up
with 0.8 so the way i multiply matrices is i basically take this row here and i multiply it by this column now let's make this more formal so what i do to multiply these things out is i've got here's where people go at time t plus one this row is how you get to be alert at time t plus one 80 percent of the alert people and 25 percent of the bored people and here in the column is the percentage of alert people and the percentage of bored people so i want to know how many are alert next time it's 80
percent of the one and 25 percent of the zero and that's going to be 0.8 now i want to ask how many are going to be bored well 20 percent of the alert people and 75 percent of the bored people so i take this row and also multiply by the column i take 0.2 times 1 plus 0.75 times 0 and i get 0.2 so what you get is when i multiply this matrix the transition matrix here by this initial vector 1 0 i get 80 20 just like i did before remember the way
i did that is i take this row multiply it by the column and then i take this row multiply by the column well watch now i can do it again now i'm at 80 20 and i want to ask how many am i going to get next period so i take this row 0.8 0.25 and multiply it and say 80 percent of the people are alert and 80 percent of them stay alert so that's right there 20 percent are bored and 25 percent of them become alert so that's right there and when i add those up i'm gonna
get 69 and here i'll get 31 we'll see what happens in the next period again just take this row times this column so of the 69 percent that are alert 80 percent of them stay alert and of the 31 percent that are bored 25 percent of them become alert and i get that 63 percent are alert and then i can do it again and take 0.8 times the 0.63 plus 0.25 times the 0.37 and get the new percentage that's alert which is going to be 60 and i can keep going and going and going and if i do it one
more time i end up with 58 so what does that tell us that tells us that if we started with all alert students after six periods i'm going to end up with 58 percent of students being alert and we want to know where does this process stop is it going to end up with nobody being alert well let's think that through let's suppose we started out with nobody being alert and we can ask what happens so i start with all bored students what's going to happen well now all i do is put a 0
for the alert and a one for the bored and ask what happens next well 80 percent of the alert students will stay there but that's zero so it's 0.8 times zero and 25 percent of the bored students will become alert so that means i'm gonna get 0.25 and since this sums to 1 i'm going to get 0.75 over here so next period i've got 25 percent of the students alert and 75 percent bored well now i can just put that here as my population at time two and i can think okay of these
25 percent that are alert 80 percent will stay alert and of the 75 percent that are bored 25 percent become alert and if i multiply all that out i'll get about 39 percent alert and 61 percent bored if i keep going i get about 46 percent alert and 54 percent bored if i do it again i get about an even number of alert and bored and if i do it one more time i'll get that 53 percent are alert and 47 percent are bored so it sort of looks like this is going to an equilibrium when i started with everybody alert i got down to 58 percent alert and i started with everybody
being bored i ended up with 52 and a half percent being alert so it looks like it's converging somewhere between 53 and 58 well how do i find what that equilibrium is this is where the matrices become really powerful so let's think of it this way there's some percentage of people that are alert that's p there's some percentage of people that are bored that's one minus p what i'd like is after this process takes place for the same percentage to be alert so how many are going to be alert well that's going to be 0.8
p plus 0.25 times 1 minus p so what would it mean for there to be an equilibrium equilibrium would mean that after i multiply this out i have the same percentage of people being alert so the equilibrium i put a little star here is going to be the p star such that 0.8 p star plus 0.25 times 1 minus p star just gives me p star back that i end up with the same percentage of people alert that i started this just becomes algebra i can now write this out i've got an equation where this
is my markov transition matrix and i want some probabilities p of people being alert such that after the transition i've got p back right so that's just going to be 0.8 p plus 0.25 times 1 minus p should equal p i want to find the p that solves this well let's multiply through by 20 just to make this simpler so i'm going to get 16 p plus 5 times 1 minus p equals 20 p so that's going to give me 16 p plus 5 minus 5 p equals 20 p so i bring all the p's
over to the one side i'm going to get 5 equals 9p so p equals 5/9 so what that says is if i start with 5/9 of the people being alert i'm going to end up with 5/9 of the people being alert so let's think about how that works precisely so five ninths of the people are alert what do we know 20 percent of them are going to become bored and 20 percent of five ninths is one ninth so that means that each period one ninth of the population will move from alert to bored i also know that 25 percent of the bored
people become alert what's 25 percent of four ninths that's also one ninth right so what i get is each period one ninth of the people are moving from alert to bored and one ninth of the people are moving from bored to alert which means that exactly five ninths are alert and exactly four ninths are bored after every period now notice what this equilibrium is like it's a statistical equilibrium so we can think of an equilibrium as a point where nothing changes here the thing that's not changing is the probability so the population is still churning people are moving from
alert to bored but if i think in terms of probabilities that probability is staying fixed that probability is staying fixed at five ninths people are moving around but the probabilities stay fixed that's why this is sometimes called a statistical equilibrium because the statistic p the probability of someone being alert is the thing that doesn't change okay it's pretty involved right what we did is we wrote down the markov transition matrix and we showed how using that matrix we could solve for an equilibrium and we saw at least in the simple example of alert and bored
students that the process went to an equilibrium and it was fairly straightforward to solve for it what we want to do next is a slightly more sophisticated model that instead of just two states involves three states and we'll see how that process also converges to an equilibrium thanks hi in the previous lecture we talked about kind of a fun model where students could be alert or bored remember that was a markov model in these markov models there's a set of states in that case alert and bored and then there's these transition
probabilities that give you the probability of moving from alert to bored i want to move in this lecture to a slightly more complicated model and it's going to be a model that involves countries and these countries can either be free partly free or not at all free you know being run by dictators and i want to look at the dynamics of that situation it'll help us sort of maybe learn a little bit more about how these markov processes work see how we can extend them to more dimensions and also come up with a somewhat counterintuitive
finding okay so let's start simply with even just a two-state democracy model so i'm gonna imagine there's two types of countries there's democracies and there's non-democracies i'm assuming that 5 percent of democracies switch into dictatorships every decade and that 20 percent of dictatorships become democracies so that'll be my assumption and then let's just walk through the logic so how do we do this well we're going to write down a markov transition matrix so let's start by assuming we have 30 percent of countries that are democracies and 70 percent that are dictatorships so of this 30 percent
that are democracies we know that 95 percent will stay democracies of the 70 percent that are dictatorships we know that only 20 percent will become democracies so to figure out how many will be democracies next time we just take this row and multiply it by this column so we get 0.95 times 0.3 plus 0.2 times 0.7 so let me put parentheses around here so this is going to be 0.285 and this is going to be 0.14 and that gives us 0.425 so we're going to get that about 43 percent of countries in the next decade are
going to be democracies and we could do this one more time and say if we have 43 percent or 42.5 percent democracies we multiply this row by the column we're going to get 52 percent democracies next time now if you look at this trend you say we start out with 30 percent democracies then we go to 42 percent then we go from 42 to 52 well you might just sort of extrapolate look that looks like a linear trend and eventually we're going to end up with everybody being a democracy but yet we know that
that's not true right we know we can solve for the equilibrium and that it's probably going to involve some churn so how do we solve for the equilibrium remember from last time we just want to take this row times this column where now instead of putting down a specific probability we put p and 1 minus p we want after we multiply that through that we get the p the same p back so that means it's going to be 0.95 times p plus .2 times 1 minus p should equal p all right we've got a bunch
of stuff let's just multiply this by 100 to get rid of the decimals so we get 95 p plus 20 minus 20 p equals 100 p right i just multiplied both sides by 100 now if i bring everything over there i'm going to get 20 equals 25 p so that means p equals four fifths so here's sort of the surprising thing we only end up with 80 percent democracies even though right 95 percent of democracies stay democracies and 20 percent of dictatorships become democracies in each decade we still end up with only 80 percent democracies that's what's counterintuitive
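The two-state arithmetic above can be checked with a few lines of Python. This is only a sketch under the lecture's stated assumptions (95 percent of democracies stay democracies and 20 percent of dictatorships become democracies each decade, starting from 30 percent democracies); the names `step` and `equilibrium` are made up for illustration.

```python
def step(p_democracy, stay=0.95, switch_in=0.20):
    """Share of democracies one decade later.

    stay      = probability a democracy stays a democracy
    switch_in = probability a dictatorship becomes a democracy
    """
    return stay * p_democracy + switch_in * (1.0 - p_democracy)


def equilibrium(stay=0.95, switch_in=0.20):
    """Solve p = stay * p + switch_in * (1 - p) for p."""
    return switch_in / (1.0 - stay + switch_in)


# Start with 30 percent democracies and apply the transition twice.
p = 0.30
history = [p]
for _ in range(2):
    p = step(p)
    history.append(p)

print("shares by decade:", history)               # roughly 0.30, 0.425, 0.519
print("long-run share of democracies:", equilibrium())  # roughly 0.8
```

The first two decades look like a steady climb, but the equilibrium formula shows the share levelling off at about four fifths rather than heading to one.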
and that's why having this markov model can be really useful because it helps us you know really figure out what's going to happen as opposed to just maybe extrapolating and thinking boy there's this big trend towards democratization if things continue as they are everything's going to be a democracy well now let's move to a more sophisticated model now let's suppose that we classify countries not as just democracies or dictatorships we have three categories free partly free and not free and this is some data actually from freedom house that i you know just plugged into excel
and plotted out what we see is an increase in free countries and a decrease in not free countries and a slight decrease in partly free countries we could ask what's likely to happen well what you can do if i plug in sort of in five year increments the transition probabilities and do some crude estimates i sort of get the following i get that each decade five percent of free and about 15 percent of not free become partly free so those are those transition probabilities and five percent of not free and 10 percent of partly free become free and
ten percent of partly free become not free so all these transition probabilities are kind of complicated the matrix is more useful so i can put free partly free and not free right here and then i can put free partly free and not free here and now i've just got three states so it's just like i had before except instead of a two by two matrix i've got a three by three matrix same thing goes now it used to be before we had computers when you went to three by three and four by
four matrices you'd just go oh no it's gonna be a lot of math a lot of algebra it was but now that you've got you know computers it's very easy to just you know make a huge matrix and solve for the equilibrium there's nothing complicated about it so what does that equilibrium look like well all we do is take each one of these rows and multiply by the columns but now the column has a p a q and a 1 minus p minus q instead of just a p and a 1 minus p and i
want it to be the case that when i multiply this row times this column i get p and when i multiply this row by the column i get q and when i multiply this row by this column i get 1 minus p minus q so a lot of algebra here you can do it and if you do it you get the following answer you get that 62 and a half percent of countries will be free 25 percent will be partly free and 12 and a half percent will be not free now if you look at that
initial graph you might have thought oh look there's this trend towards freedom we're all going to be free but in fact if transition probabilities stay fixed well fewer than two-thirds of countries will be free so again that's sort of surprising because if you look at this trend you think my gosh you've gone from 25 up to 45 percent it looks like very soon we'll all be free but in fact assuming those transition probabilities stay the same you end up with about 62 and a half percent being free here's what our model shows our model is
not quite as good but the model sort of shows these general trends like this if i plugged in that same initial condition and ran my model i get this sort of picture it doesn't look exactly the same as that picture but it doesn't look bad another way to look at it is to say if i feed those probabilities in and start it in the initial case how close does it get and what you can see is the model comes up at the end of the 40-year period with values that are really close to what we saw in the
real world now the reason they're so close is because i estimated my transition probabilities from the actual data so it's likely to be really close like this but what's more interesting is that the patterns look fairly similar as well now does that mean we can buy into the sixty two and a half percent for sure probably not you know it doesn't mean it's going to be exactly right but it does mean unless those transition probabilities change in a very serious way we're not going to get to 100 percent free countries we may be more likely to
see something like two-thirds so what have we done all we've done in this lecture is show that we started with that simple alert bored model that was a toy model we often do that when constructing models we take something very simple it's kind of fun we can all understand it and see how the process works once we understand the model then we can take it to real problems with more dimensions and even tie it into real data and get a deeper understanding of how the model works and in some cases get fairly surprising solutions
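The three-state equilibrium quoted in this lecture can be reproduced by repeatedly applying the transition matrix until the shares stop changing. This is a sketch, not the spreadsheet used in the lecture. The off-diagonal entries are the crude estimates stated above; two things are assumptions: that free countries never move directly to not free, and that whatever share does not move stays put. The starting shares for partly free and not free are hypothetical (only the 25 percent free figure is mentioned in the lecture).

```python
STATES = ["free", "partly free", "not free"]

# T[i][j] = probability of moving from state i to state j in one decade.
T = [
    [0.95, 0.05, 0.00],  # free: 5% become partly free (assumed: none go not free)
    [0.10, 0.80, 0.10],  # partly free: 10% become free, 10% become not free
    [0.05, 0.15, 0.80],  # not free: 5% become free, 15% become partly free
]


def step(dist):
    """Apply one decade of transitions to a list of state shares."""
    return [sum(dist[i] * T[i][j] for i in range(3)) for j in range(3)]


def long_run(dist, tol=1e-12, max_steps=10_000):
    """Iterate until the shares change by less than tol in every state."""
    for _ in range(max_steps):
        nxt = step(dist)
        if max(abs(a - b) for a, b in zip(dist, nxt)) < tol:
            return nxt
        dist = nxt
    return dist


start = [0.25, 0.35, 0.40]  # hypothetical starting shares
eq = long_run(start)
for name, share in zip(STATES, eq):
    print(f"{name}: {share:.3f}")
```

Under these assumptions the iteration settles near 62.5 percent free, 25 percent partly free and 12.5 percent not free, the same figures the algebra gives.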
the surprising result was that even though there's a trend towards more free countries if transition probabilities stay in the range we're in we shouldn't expect to see everyone be free we should expect to see only two-thirds of countries be free okay let's move on now though and talk about why that converges at sixty two and a half percent what's causing these markov models to go to equilibrium and we're gonna learn something called the markov convergence theorem okay thanks hi we've been studying markov models and we looked at that first model where there's students that were
alert and bored and then we looked at the more realistic model the more interesting model countries being free partly free or not free and in each of those cases what we saw is that the process converged right that it went to a nice equilibrium what we want to do in this lecture is study something called the markov convergence theorem sounds scary is a little bit scary what it's going to tell us is that provided a few assumptions are met and they're fairly mild assumptions markov processes converge to an equilibrium so this is a powerful
result because it tells us what's going to happen to a markov process now remember this is a statistical equilibrium right it's going to keep churning but the probability that you're in each state will stay fixed and what we want to do is understand what are the conditions that must hold for that to be true so let's go back right and think of our first example we had alert students and bored students and we had some p that was the probability that you were alert and what we could do is we could say well 0.8 p
plus 0.25 1 minus p has to equal p that's an equilibrium and we found that if p was equal to five ninths then that probability stayed the same if five ninths of students were alert four ninths were bored then we stayed in those proportions that's what we mean by an equilibrium so what we want to do is we ask what has to be true of our markov process for an equilibrium to exist there's just four assumptions first one is you gotta have a finite number of states well that's the definition of a markov process at
least the ones we're considering so that's always going to be satisfied second is the transition probabilities have to be fixed so by that i mean that from period to period the probability you move from one state to another doesn't change now we'll talk in a minute about why that might not always be true but for the moment let's just assume that's the case third and this is sort of the big one you can eventually get from any state to any other state so it may not be that you can get from state a to state
c right away maybe they have to go through b but as long as there's some way from getting from state a to state c that's fine that'll satisfy assumption three and then the last assumption the fourth one this is sort of a technicality it's not a simple cycle so if i wrote down a process where you automatically go from a to b and automatically went from b to a then the thing would you know churn the thing is it wouldn't really go to this nice stochastic equilibrium necessarily it could be the all a's and all
b's and all a's and all b's then all a's and all b's so if we rule out simple cycles and just assume finite states fixed probabilities can get from any state to any other then you get an equilibrium so this is the markov convergence theorem given a1 through a4 a markov process converges to an equilibrium and it's unique so you're going to go where you're going to go no matter whether you start with all bored or all alert all free all not free you're going to end up at an equilibrium and it's going to be
the same one that's determined entirely by those transition probabilities so if i write some markov process like this and i go ahead and solve for it there's only one answer there's going to be a unique answer for what that equilibrium is going to be so let's think about what this means because this is incredibly powerful first thing it sort of says the initial state doesn't matter it doesn't matter where i start if i start with all free all not free all alert all bored right anything that's a markov process any markov process the initial
state will not matter history doesn't matter either it doesn't matter what happens along the way if it's a markov process history doesn't matter what's going to happen is going to happen we're going to go to that equilibrium now history could depend on you know which students move from alert to bored but the long run percentages of alert and bored the long run percentages of free and not free states are going to be the same regardless of how history plays out intervening to change the state doesn't matter so i go in and change a state like if
i go and say let's just make a country move it from free to not free well guess what in the long run that's going to have no effect now i've posed all of these as puzzles the initial state doesn't matter history doesn't matter intervening to change the state doesn't matter and they're puzzles because that doesn't seem to make any sense because if you think about it history matters a lot initial conditions matter a lot interventions can matter a lot when you think about you know whether you're running a small organization a big business or
a government you think about let's come in and intervene here because we're going to make the world better this markov process seems very deterministic it's sort of saying none of these things make a difference it doesn't matter where you start out what happens along the way doesn't matter and if you intervene it's not going to have any effect but let's see what we mean by that so let's just think in the context of like a relationship so in a relationship you could be happy or the relationship could be tense and
suppose we're modeling hundreds of relationships like a whole community of people and we're just keeping track of how many relationships are happy how many are tense let's suppose that these relationships follow a markov process so there's fixed probabilities of moving between happy and tense we might say well you know there's a lot of tension in the community right now so let's just you know buy a whole bunch of people dinners and let's just move 50 couples from the tense state to the happy state by like giving them free dinners on the town well if you
do that what's going to happen well for a very short period of time you'll make more people happy but there's going to be that movement back towards tense and the transition probabilities if they stay fixed are going to take you right back to the same equilibrium as before so there's going to be no effect in the long run on the system so does this mean in general so i mean we want to take these models seriously but not too seriously does this mean that interventions have no effect that interventions are meaningless and does it mean
that we shouldn't even redistribute stuff if we redistribute happiness or if we give these people meals you know to make them happy that that is absolutely no good either does this mean we shouldn't do these things well let's be careful there's a number of reasons why even though the markov model tells us that history doesn't matter interventions don't matter initial conditions don't matter it really could matter and the first one is this it could take a long time to get to the equilibrium so let's go back to those happy and tense couples it could
be that if you make those couples happy some of those tense couples yeah eventually we're gonna go back to the old equilibrium but it could take 20 years and if it takes 20 years well those intervening 20 years there's a lot more happy couples or if we think about only maybe only 60 percent of countries will be free but if we artificially make too many free we could have 30 40 50 years of a whole bunch of countries remaining free that wouldn't have been free otherwise so even if the long run we end up at
the same place it could be in the intervening years we still get some sort of benefit but that idea that it just takes a long time to get there maybe we can get a little boost in between it's still sort of you know taking this somewhat negative view that any intervention we do can't matter but yet we've got this darn theorem right we've got this theorem that says finite number of states fixed transition probabilities can get from any state to any other then none of these things do matter
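The free-dinner intervention can be sketched in a few lines to show both halves of the argument: the boost fades, and how quickly it fades depends on the transition probabilities. The lecture gives no numbers for the happy and tense couples, so every probability below is hypothetical, and the helper names are made up for illustration. Both parameter sets are chosen to share the same long-run share of happy couples (75 percent) so that only the speed differs.

```python
def step(happy, stay_happy, tense_to_happy):
    """Share of happy couples one period later."""
    return stay_happy * happy + tense_to_happy * (1.0 - happy)


def equilibrium(stay_happy, tense_to_happy):
    """Long-run share of happy couples for fixed transition probabilities."""
    return tense_to_happy / (1.0 - stay_happy + tense_to_happy)


def periods_to_return(start, stay_happy, tense_to_happy, tol=0.01):
    """Count periods until the share is within tol of the equilibrium."""
    target = equilibrium(stay_happy, tense_to_happy)
    share, periods = start, 0
    while abs(share - target) > tol:
        share = step(share, stay_happy, tense_to_happy)
        periods += 1
    return periods


boosted = 0.95  # hypothetical share of happy couples right after the dinners

# Hypothetical fast-churning community: couples change state often.
fast = periods_to_return(boosted, stay_happy=0.90, tense_to_happy=0.30)

# Hypothetical slow-churning community: same equilibrium, far less movement.
slow = periods_to_return(boosted, stay_happy=0.98, tense_to_happy=0.06)

print("periods to fall back (fast churn):", fast)
print("periods to fall back (slow churn):", slow)
```

In both cases the share of happy couples drifts back to the same equilibrium, but with slow churn the extra happy couples stick around for several times as many periods, which is the sense in which a temporary intervention can still be worth something.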
let's look at these let's look at them seriously and ask which of these things maybe doesn't hold well the finite state thing that's kind of hard to argue with because we could just sort of bin reality into different states remember earlier we talked about categories well these states are categories so we can think about which you know categories do we create to make sense of the world and having a finite number doesn't seem like a big deal the can eventually get from any one state to any other well that one you know maybe there's
cases where that's not true maybe there's cases where you can't go from one state to another so that's one we want to look at but the one we really want to focus on is this fixed transition probabilities because it could be that when we move from one state to another when we move from tense to happy or we move from not free to free or as more countries move from not free to free that suddenly the transition probabilities in the system change that there's some large effect and these transition probabilities change so the thing we
want to focus on when we think about why history may matter why interventions may matter is that those transition probabilities may change over time as a function of the state we're in now this doesn't mean that the markov model is wrong the markov model is right it's a theorem it's always true but if we want history to matter if we want interventions to matter then we've got to focus on this we've got to focus on interventions or policies or histories that can change those transition probabilities let me phrase this in a slightly different way if
we think about changing the state of the process like moving from tense to happy it's just going to be a temporary effect but if you think about changing the transition probabilities then we can have a permanent effect so when you think about what are useful interventions they're going to be interventions that change the transition probability when you think about moments in history that could be things like tipping points right which we talked about before those are going to be moments in history that change the transition probabilities so if we have a tipping point if we
move from one likely history to another what must be going on is those transition probabilities have to be changing so what have we learned we've learned something really powerful if we have a finite set of states fixed transition probabilities and we can get from any state to any other then history doesn't matter interventions don't matter initial conditions don't matter now that's not to say that those things don't matter in the real world they probably do but if they do then one of those assumptions has to be violated either states aren't finite that's sort
of a hard one to disagree with so it must be that either we can't get from any place to every place else or that those transition probabilities can change now the most likely one is the transition probabilities can change and interventions that really matter interventions that tip histories that matter are events that change those transition probabilities so what we see is not that everything's a markov process but what we see is that this markov model helps us understand why are some results inevitable because they satisfy those assumptions and why are some results not okay thank
you hi in this last lecture about markov processes what i want to talk about is exaptation i want to talk about how we can use the markov model in contexts and for problems we never would have thought of and i'll do this in two ways first i'm going to talk about taking it whole hog taking the entire process and modeling other things just like we modeled the process of states becoming free or dictatorial second what i want to do is i want to just take part of the markov model the transition probability matrix and use
that to understand some things that are really kind of surprising and interesting so let's just remind ourselves of what a markov process is there's a fixed set of states and then there's fixed transition probabilities between those states now if it's possible to get from any state to any other through a sequence of transitions then remember we have this markov convergence theorem that says the process is going to go to a unique equilibrium so history doesn't matter initial conditions don't matter all that sort of stuff let's think about where else we might apply that so one place right
away is voter turnout so you can think about there's a set of voters at time t and there's a set of non-voters at time t you know what we can do is we can draw a little matrix and say well how many of those are going to vote at time t plus one and how many of these are not going to vote at time t plus one and it could be that eighty percent of voters at time t vote at t plus one and twenty percent don't it could be of non voters that forty percent vote
and sixty percent don't vote this would be our markov transition matrix and if you apply this what you'd get is the unique equilibrium which would tell you the number of people you'd expect to vote in the election now it's not going to be the same people right because the process is going to churn it's a statistical equilibrium not a fixed point but this model if it's right if these transition probabilities stay fixed this would tell us what turnout should be even though it won't tell us who votes where else could we use it well we could
use it for school enrollment same sort of thing right you can imagine here's kids who go to school and here's the ones that don't go to school and then we could ask okay at time t plus one how many go and how many don't go it could be that of those that go ninety percent go the next day and ten percent don't of those who don't go it could be that there's only a 50 50 chance that they come the next day well again if you work through the logic here you'll get some sort of
percentage of students who show up each day and some percentage that don't there'll still be churn but this model will give you an estimate of what total enrollment should be on a given day what percentage of students show up these two applications are very standard applications they're not unlike the one we looked at in terms of alert and bored students and they're not unlike the one with free countries versus dictatorial countries what i want to do next is sort of go way outside the box i want to just take a part of the
markov model the markov transition matrix and i want to think about what that tells us what it tells us is if this is the state at time t what are the likelihoods we go to these other states at time t plus one so think about this for a second there's all sorts of things you could use this framework for where something is happening at time t and then it transitions into something at time t plus one i want to talk about three uses of this matrix three very surprising uses first one to identify writers i mean
i could mention you can use this idea this transition matrix idea to figure out who wrote a book so suppose some anonymous person writes a book and you're trying to figure out did this person write it did bob write it or you're trying to figure out did you know carlos write it and you can't tell well what you can do is the following you can figure out transition probabilities what do i mean well take this book take the book written by the anonymous author and then say okay every time this book uses the word
for i'm loading the whole book in the computer what percentage of the time does it follow the word for with the record what percentage of the time does it follow the word for with example and what percentage of the time does it follow the word for with the sake of and what i'm doing is i'm creating just a giant matrix so like at time t i'm using for and then i'm saying what's the probability that i follow it with the record example or sake and i'm putting a 0.17.9.11 so i'm just writing a
big transition matrix what you can do is you can take some key words create these giant transition matrices and then figure out what does it look like does this transition matrix look like bob's transition matrix or does it look like carlos's transition matrix now how do i know bob's transition matrix is what carlos's transition matrix is well that's easy i just take one of their other books load it into the computer and figure out what their transition matrix is for the other book once i've got that in there i can figure out but does this
look more like carlos or does it look more like bob now this actually gets used let me tell an interesting story so this is arlene saxonhouse she's one of my colleagues at the university of michigan and when she was a young graduate student she found in the library at yale this book that included four essays by someone who she thought was a young thomas hobbes book was published in 1625 she thought 1620 and she thought oh my gosh these are essays by hobbes and the thing is she's a young grad student who's
going to believe her how does she prove it so she couldn't prove it she had a strong instinct that it was true well eventually she found someone who knew how to do this stuff right who knew how to do these transition probabilities and they took the essays and they put in some of hobbes's other writings and what they showed was that it seems fairly clear that three of the four essays were actually written by hobbes and now those three essays are actually considered part of hobbes's work now even though it you know just sort of
felt to her like they were written by hobbes that comes down to a matter of opinion by having a model by having these transition probability models right and by being able to take that and take other hobbes work and this work you can show statistically that it seems very very likely that hobbes wrote the work so this combination of things the model alone doesn't do it the combination of the model plus her intuition plus the intuition of others gives us a common understanding now at least we think that hobbes wrote those particular volumes it's really cool let
me give you another example medical diagnoses so if you think about giving someone a treatment for some disease typically there's a sequence of reactions to that treatment whether it's a drug protocol whether it's an exercise or diet regimen what you can do is you can write down transition probabilities now these can be multi-stage so it could be for example that if this is going to be successful if this treatment's going to be successful that you go through the following transition you go first feel some pain then you're slightly depressed then more pain but then you
get better alternatively if it's not successful it could be that initially you're depressed then there's mild pain then there's no pain and then the system fails right the treatment fails the drug fails the regimen fails so what does this mean this means that if i give someone the treatment and then i see this sequence of pain and depression i can say to them you know you're feeling pain you're feeling depression but guess what that's consistent with a regimen that's going to be successful whereas alternatively someone else could say well you know i
feel depressed but now i'm not feeling that much pain and you can say to them okay even though you're not in very much pain this probably isn't a good sign it doesn't look like the treatment is going to work so by gathering all sorts of data on past experiences you can use those transition probabilities to figure out early on in a treatment protocol whether it looks like it's working or not working another example lead up to war suppose you've got two countries a little bit of tension so what you get is the following you get let's say first
you get some political statements on each side then that leads to trade embargoes then that leads to military buildup so now you've got the sequence of three things you have these transitions between these three things you can ask historically when i've had those three transitions what's the likelihood that i've had war and what's the likelihood i haven't had one it could be that there's a twenty percent chance of war and an eighty percent chance of not having war so if you're just sort of on the ground watching what's going on you say oh boy look at
this first there's these political statements now there's a trade embargo now we're seeing military buildups looks like things are going to go to war well if you actually gather a lot of data and calculate these transition probability matrices you could figure out you know this actually happens a lot and only 20 percent of the time does it actually lead to war so again we're not using the full power of the markov model we're not saying these transition probabilities necessarily stay fixed we're not worried about solving for the equilibrium all we're trying to do is just use this
matrix to organize the data in such a way that we can think more clearly about what's likely to happen so that's markov processes a markov process has a fixed set of states fixed transition probabilities you can get from any one state to any other and then you get an equilibrium so that equilibrium doesn't depend on where you start doesn't depend on interventions and it doesn't depend on history in any way the model is really powerful and so if you want to argue history matters if you want to argue interventions matter you've got to somehow argue that this isn't a
markov process or you've got to argue that you're changing the transition probabilities now that is possible in fact policies that really make a difference interventions that really make a difference do change transition probabilities so what's really cool about this markov process is it's given us this model it's given us a new lens to look at the world when we think about i want to take this action it's going to make things better we have to be saying it's changing the transition probabilities not just that it's changing the state so if i
make my students a little more alert for three seconds by screaming or something that's not going to change the long run equilibrium of alert and bored students if i change my teaching style or if i add more interesting examples in class or something then it's possible i can change those transition probabilities and end up with more alert students we've also seen in this last lecture that we don't even need to use the full markov process model just the transition probabilities just that idea of a matrix of transition probabilities and we can find out all sorts of
interesting things like who wrote a book is there likely to be a war or is this medical treatment working so that framework the transition probability framework and that matrix of transition probabilities is a really powerful tool to keep in your pocket when you confront some sort of dynamic process and you're trying to figure out what do i think is likely to happen thank you hi in this set of lectures we're going to talk about something called lyapunov functions now what lyapunov functions are is they're functions that really can be thought of as mapping models into outcomes in
the following way so what we can do is we can take a model or take a system and we can ask ourselves can i come up with a lyapunov function that describes that model or describes that system and if i can then i know for sure that system goes to equilibrium so what lyapunov function is is it's this tool it's an incredibly powerful tool to help us understand at least for some systems whether they go to equilibrium or not let me explain what i mean a little bit more remember how we talked about there's four
things a system can do it can go to equilibrium it can cycle it can be random or it can be complex with lyapunov functions if we can construct one and that's going to be one of the challenges here if we can come up with one then we'll know for sure that the system's going to go to equilibrium if we can't construct one then maybe it goes to equilibrium maybe it's random maybe it's chaos maybe it's complex we don't know we can't really say anything so the challenge here the really hard and fun part is coming up with lyapunov
functions if you come up with a lyapunov function then you know for sure hey this system's going to an equilibrium which is a nice thing to know not only that we'll see in a minute that you can see how fast it's going to go to an equilibrium so how does it work here's the idea suppose you have a system and i've got something i care about here which might be velocity on this axis and suppose there's a minimal velocity which is zero which i'm representing by this big black region down here now suppose what i
say is the following property holds i start with some positive velocity and every period if the velocity changes it goes down so it's going to go down to there and then it goes down to there now it could be the velocity doesn't change well if the velocity doesn't change then you're fixed then you're in equilibrium but if the velocity does change it has to go down well if that's the case if it changes it has to go down at some point it's going to hit this barrier down at the bottom the zero velocity point and
when it hits zero it has to stop so that's the idea if it moves it has to fall that's property one it's got to go down if it moves it's got to fall and there's a minimum well those two conditions are going to mean that the system has to stop we've got to pick up one little peculiar detail besides that but that's basically the idea if the system is going to move it's got to fall and there's a min so therefore at some point it's either going to
stop before the min like it might fall fall fall and then stop right here or eventually it'll hit the thing at the bottom that's the idea now how do economists do it economists do the opposite they have something where maybe this is happiness on this axis and maybe people are making trades and you say people trade happiness goes up so we've got a happiness here people trade it goes up people trade it goes up so anytime people trade total happiness goes up otherwise they wouldn't trade so that means any time the system moves happiness is
increasing but you've got this caveat that there's a maximum happiness here it can't go above this black bar so what does that mean if any time people trade it goes up and at some point you're going to hit this bar that means the process has to stop and if it has to stop that means it's an equilibrium there's no more trade everybody's happy with what they've got so there's these two substantively identical ideas right one is from physics that if things fall every period and there's a min the process has to stop and then from
economics you have the things go up every period and there's a max it has to stop that's it that's the theorem i know it sounds sort of frightening like lyapunov it sounds really scary i'm sure when you looked at the syllabus you thought oh my gosh lyapunov functions this is going to be hard maybe i'll skip this lecture i thought about calling it dave functions or maria functions because then it wouldn't sound so frightening if i said we're going to study maria functions you'd say that's probably going to be pretty easy or dave functions
it's just that with these russian surnames you sort of think oh my goodness this is frightening it's not it's very very easy here's the formal part what we do is we say there's a lyapunov function if the following holds first i just have some function f and i'm going to call this a lyapunov function and there's just two conditions the first one is it has a maximum value i'm going to do the economist version in the physics version i'd say there's a minimum value so it has a maximum value second assumption there is a k bigger
than zero so there's some number k bigger than zero such that if x t plus one isn't equal to x t so what that means is the process is going to basically map the state now x t into x t plus one and if they're not equal right so if the state at time t plus one is not equal to the state at time t then f of x t plus one is bigger than f of x t plus k what does that mean in words not in math what it means
is if it's not fixed so if the point is not fixed then it increases by at least k just by some fixed amount it doesn't always have to increase by exactly k it can increase by more but it's got to increase by at least k if those things hold so it's got a maximum and you're always increasing by at least some amount k then at some point the process has to stop because if it didn't stop it would keep increasing by k and would go above our maximum that's the theorem now
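Written out, the two conditions and the argument for why the process must stop look roughly as follows. This is only a sketch: the symbols $M$ (the maximum value) and $x_0$ (the starting state) are notation assumed here for illustration, and the inequality is written with "at least $k$" to match the verbal description.

```latex
\begin{align*}
&\text{(A1)}\quad F(x) \le M \quad \text{for every state } x,\\
&\text{(A2)}\quad \text{there is a } k > 0 \text{ such that } x_{t+1} \ne x_t \implies F(x_{t+1}) \ge F(x_t) + k.
\end{align*}
If the state changes in each of the first $n$ periods, then
\[
M \;\ge\; F(x_n) \;\ge\; F(x_0) + n k
\quad\Longrightarrow\quad
n \;\le\; \frac{M - F(x_0)}{k},
\]
so the process can move only finitely many times before it reaches a fixed point.
```

The physics version is the mirror image: replace the maximum by a minimum and require $F$ to fall by at least $k$ whenever the state changes.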
what is this assumption two what is this thing about before i just said it has to be bigger now i've got it's got to be bigger by at least k what's going on well this goes back to something way back in philosophy called zeno's paradox and aristotle's treatment of this is probably the one most of you learned in in college and that is suppose i want to leave this room suppose i'm going to leave this room right here and the first day i'm standing right here here i am and the first day i go
halfway to the door and then the next day i go another halfway and then the next day i go another halfway the next day another halfway the next day another halfway i'd never actually leave the room because what's happening here is i'm going up a half and then a quarter and then an eighth and then a sixteenth so if i made my steps smaller and smaller and smaller and smaller and smaller and smaller and smaller it could be that i continue to increase but i never
actually get to the maximum but if instead i assume that each step has to be at least one sixteenth well then after 16 steps i'm going to be out of the room so what zeno's paradox says is that you can basically keep making steps halfway and you'll never actually exit and the paradox was that you could keep moving towards the door but never actually get to the door the way we get around that is we make this formal assumption that says there's some k such that if you move you go up by at least k and
so in this case i talked about it being one sixteenth if you go up by at least one sixteenth then in 16 steps you're out of the room and since you can't leave the room that's a max what's going to happen is the process has to stop so that's all there is to it all a lyapunov function consists of is a function f it's got a maximum value and then if it's the case that the process moves over time then in the next period you've gone up by at least some amount k and
since there's this max and you're going up at least k each time eventually you're going to hit that max and the process has to stop now there's a bonus that we just got as well right if each time i go up by one sixteenth then in 16 steps i'm going to have to stop so you can also say how fast the process is going to stop and that's obviously not a very complicated calculation at all here's the tricky part though i talked about this the hard part about this is constructing the function so the theory the
idea that there's a function there's a max we go up by k each time that's really straightforward the really tricky part is going to be coming up with the lyapunov function coming up with that function f so what we're going to do in this set of lectures is we're going to take some processes things like arms trading trading within markets people deciding where to shop and we're going to show how in some of these cases it's really easy to construct lyapunov functions in other cases it's really hard to construct lyapunov functions or
maybe we can't even construct lyapunov functions so we're just going to explore how this framework this lyapunov function framework can help us make sense of some systems and help us understand why some things become so structured and so ordered so fast and why other things still seem to be churning around a little bit so the outline of what we're going to do is we're just going to start out by first doing some simple examples seeing how lyapunov functions work then we're going to move on and see some sort of interesting applications of
lyapunov functions maybe when they don't work and then from there we'll go on and talk about processes that maybe we can't even decide whether lyapunov functions exist or not some open problems in mathematics that involve trying to figure out does this thing go to an equilibrium or does this thing continually churn and then we'll close it up by talking about how lyapunov functions differ from markov processes remember markov processes also went to equilibria we'll talk about how those equilibria are different from the equilibria we're talking about in these lyapunov functions and also how
just the entire logic about how the system goes to equilibrium is different in the two cases okay so let's get started thanks hi welcome back remember we're talking about lyapunov functions and lyapunov functions are really this simple thing we have two ways of explaining them first was the physics way where there's a minimum value and you've got a process that if it changes its value it moves down every period so if it's moving down down down down eventually it's going to stop it's going to hit the floor it could stop before the floor but
it's got to stop because of the floor in economics we often talk about systems where there's a max and so every period if the process moves its value is going up and since there's a ceiling here the process has to stop there so lyapunov functions give us a way to say for sure that a particular system is going to equilibrium now that's if we can construct one if we can't construct one then we don't know maybe it goes to equilibrium maybe it doesn't what we want to do in this lecture is just
remind ourselves of what lyapunov functions are and then take an example take a famous example of a puzzle that's out there and show how this very very simple framework helps us make sense of that puzzle before we get to the puzzle though i first want to just remind ourselves of what a lyapunov function is a lyapunov function is a function f that has a minimum value we'll use the minimum case here and there's another assumption that it satisfies if the process moves if it's not at an equilibrium then the value of f falls by some
amount k some amount at least k so what you've got is a process that's got a minimum value and if it's moving if it's not at equilibrium then it has to fall by at least k so what that's going to mean is eventually you're either going to stop or you're going to hit the floor and so what that means is you're going to get an equilibrium here's the puzzle go to any major city and this is a picture of stockholm that you see to my right and what you see is amazing order restaurants have the right
number of people in them so do the coffee shops there's not a huge line at the dry cleaners there's traffic but it's typically not incredibly backed up and the interesting thing is there's no central planner it's like the city self-organizes in some way so that there's the right number of people at the right places we're not all bunched up in particular places and there's not places that are completely vacant it's almost as if there was a central planner telling people where to go but we know there isn't so how is it that cities have this amazing structure that
when you go to the grocery store they've got the right groceries for you there's the right number of workers when you hop on the train the lines aren't incredibly long that when you go to the grocery store when you go to dry cleaners it's not incredibly crowded nor is it particularly empty what what enables the city to self-organize in the way that it does to be so darn efficient that's the puzzle and what we're going to see is that lyapunov functions can give us some inkling as to why even huge cities can self-organize in interesting
ways so here's the idea suppose that you've got five things you've got to do during the week you've got to go to the cleaners the grocery the deli the bookstore and the fish market so these are five things you have to do at some point during the week you always got to go get fish and books and get your clothes clean so this sort of stuff and you can choose which day to go so here's how to think of it there's five days during the week assume we take the weekends off and just read your book
and have some fish wearing a nice clean shirt there's five days monday through friday and each day you have to decide during your lunch hour where to go you assume maybe monday you go to the dry cleaners right tuesday the grocery store wednesday's the deli thursdays the bookstore friday the fish market this would be just a route that you would take during the week and somebody else might take a different route what we want to see is by people choosing these routes whether or not the system is going to organize in such a way that
you don't get huge crowds at particular locations we'll see how we can map a lyapunov function onto this process so here's the idea suppose you've got five people and each one of these people chooses some random order in which to visit these different locations so everybody else is just like you everybody else has to go to the cleaners the grocery store the deli the bookstore and the fish market and they also pick one day a week to go to these things so each person has chosen their route this may be your route this may be
my route this may be somebody else's route everybody's got their own route what we'd like to do is not go to some place that's really crowded because if it's really crowded then we've got to wait in line and it may take our whole lunch hour and we're out of time for lunch so the rule is you're just going to want to sort of avoid crowded places and what we're going to see is if people follow that rule then we can put a lyapunov function on the process and show that it's going to go to an equilibrium and go to
a pretty darn good equilibrium so here's the idea we're going to assume the following behavior that people want to avoid crowds so i pick a route and if it turns out that i notice boy when i go to the cleaners on monday it's incredibly crowded i switch that with another location so that monday i go to a place that's less crowded so i'm just going to switch the time i visit the dry cleaners maybe with the time i visit the fish market in order to bump into fewer people that's the rule and then
we're going to see if that's the rule that this process is actually going to self-organize into something that makes a lot of sense so again here's the idea everybody's choosing these routes and let's look let's look at this person here this first person the very first day they're going to the cleaners but notice there's three other people at the cleaners so that means there's four people in the cleaners what they'd like to do is not have four people at the cleaners and have to wait in line so what they might think of is like if i go to
the fish market here there's no one going to the fish market on the first day so if i switch the fish market with the cleaners then monday i won't see anybody at the fish market and friday i won't see anybody at the cleaners so this first person realized if i just switch these two then i'm going to run into fewer people that's the idea that's the behavior we're going to assume that people follow what we want to show is we can put a lyapunov function on this process and show that this system is going to
you know keep going down and eventually has to stop because there's a min so what's the lyapunov function remember i said this is the hard part and it's hard so the first thing you might think of is well maybe it's just the total number of people at each location well let's try that so how many people go to the cleaners well five people go to the cleaners how many people go to the deli five people go to the deli and what you realize is five people go to every location so that's that's not
gonna work right because even if i switch my route there's still five people going to the cleaners and five people going to the deli and five people going to the fish market so this first attempt of total number of people at each location is not going to work so let's try something else here's another attempt let's let it be the total number of people that each person meets so how many people do i meet in a given week and now let's look at our example so we start out here we look at this person and right
here and you think okay on the first day they meet three people on the second day they meet no one on the third day they meet two people on the fourth day on thursday two people and on friday one person so that's five seven eight so this person meets eight people well now suppose they switch and go to the fish market on monday and go to the cleaners on friday this person switches to be less crowded well now on monday they meet no one on tuesday they meet no one on
wednesday they meet two people on thursday they meet two people and on friday they meet no one for a total of four people now remember before they met eight so by switching those two they reduced the number of people they meet from eight to four we have to look carefully here because there's also four other people what about those four other people could their numbers have changed well they did right because these four people these people that we see here before were meeting this person and now they're not so in addition to this person running
into four fewer people the four people they were meeting each run into one fewer person so that's four fewer as well so the total reduction in the number of people that meet each other is eight it's going to be four times two which is eight because each person that person one doesn't meet also doesn't meet person one there's a total of eight fewer meetings so this is going to be a lyapunov function if people's rule is switch so that i meet fewer people then when somebody switches they meet fewer people and fewer people meet them so the total number of people who
meet each other falls now let's ask is this a lyapunov function well what are the conditions the first is does it have a minimum value sure zero if nobody meets anybody then that's the best you could do so yes there's a minimum value it's just zero second if it's the case that somebody changes their route does it mean that the total number of people that people meet falls and the answer again there is yes because if i move i'm moving so i meet fewer people it also means that fewer people meet me which means that
the number of people met has to fall so if anybody moves the number of people met has to fall but remember we also have to have that it has to fall by at least some amount k well in this situation it's easy that k is easy because i'm meeting at least one fewer person and if i'm running into one fewer person that person's also not meeting me so k is going to equal 2 if i'm meeting one fewer person then there's one fewer person meeting me so at least two people have lowered the number of people they meet
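The route-switching argument can be checked with a small simulation. This is a minimal sketch under assumptions not in the lecture: routes start as random permutations, people are considered in a fixed order, and a person swaps the first pair of days that lowers the number of people they meet. The names `total_meetings`, `meetings_for`, `improve_once` and `simulate` are hypothetical.

```python
import random
from itertools import combinations

LOCATIONS = ["cleaners", "grocery", "deli", "bookstore", "fish market"]
DAYS = range(len(LOCATIONS))  # monday through friday


def meetings_for(person, routes):
    """Number of other people this person runs into over the week."""
    return sum(
        1
        for day in DAYS
        for other in range(len(routes))
        if other != person and routes[other][day] == routes[person][day]
    )


def total_meetings(routes):
    """Candidate Lyapunov function F: meetings summed over all people."""
    return sum(meetings_for(p, routes) for p in range(len(routes)))


def improve_once(routes):
    """Let the first person who can meet fewer people by swapping two days do so."""
    for p in range(len(routes)):
        before = meetings_for(p, routes)
        for d1, d2 in combinations(DAYS, 2):
            routes[p][d1], routes[p][d2] = routes[p][d2], routes[p][d1]
            if meetings_for(p, routes) < before:
                return True  # keep the swap
            routes[p][d1], routes[p][d2] = routes[p][d2], routes[p][d1]  # undo
    return False  # nobody wants to switch: an equilibrium


def simulate(n_people=5, seed=0):
    rng = random.Random(seed)
    routes = [rng.sample(LOCATIONS, len(LOCATIONS)) for _ in range(n_people)]
    history = [total_meetings(routes)]  # value of F after each switch
    while improve_once(routes):
        history.append(total_meetings(routes))
    return routes, history


if __name__ == "__main__":
    routes, history = simulate()
    print(history)
```

Each entry in `history` is lower than the previous one by at least 2, because every meeting that the switching person avoids is also a meeting avoided by someone else. The sketch only shows that the process stops; it does not check whether it stops at zero.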
so i've got a function with a minimum of zero it goes down by at least two each period so therefore the process has to stop so if i take a route selection process like this and people are switching what you're actually going to get is you're going to get that everybody meets no one because you can keep switching so we're going to keep switching until you would get an ordering of people so that nobody's running into anyone else now to prove that this thing only stops at zero takes a little bit more work so we
won't do that but what's gonna happen is you're going down by at least two each period and it just keeps going down down down down down until eventually nobody's meeting anybody this gives us an understanding it's not a full explanation but some intuition as to why when we go to a city it's so organized because people are trying to avoid crowds if everybody's trying to avoid crowds then what happens is you get a relatively efficient distribution of people across activities and restaurants and shops and museums and things like that so the whole city seems to be
organized as if by a central planner when in fact it's self-organizing because of the fact that people are trying to avoid running into too many people and what you end up getting then is a reasonably smoothly running city without some massive central planning without us getting signals like it's okay you can now go to the cafe scott they don't have to tell me that because people are going to develop routines of when they go to particular locations in order to avoid those crowds this is pretty cool i think what have we got a very
simple model right simple model is there's a min if the process moves it goes down by some amount each time therefore the process has to stop we use that model to say let's talk about how a city organizes itself how is it the people in a city choose where to go and how does it seem to be so efficient and what we see is that people's maneuvering within the city is probably somewhat to avoid crowds to go to places you like but not wait in huge lines so in doing that you're always reducing the number
of people that you meet let me just be a little bit critical of this for a moment this was an extreme simplification because this model says the city is going to go to an equilibrium but everybody's choosing the exact same routes now in fact a city is more of an open system there's tourists coming in there's all sorts of you know people being born and people dying and new businesses starting and all sorts of things so that's going to keep a city churning and somewhat complex but within that process there's all sorts of people who develop
regular routines of places that they go and those regular routines move that lyapunov function down down down in terms of the number of people that each of us runs into and allows the system even though the influx maintains some complexity to be relatively efficient to sort of keep down the number of crowds that people run into so the model doesn't fully explain the city but what it does is gives us the insights into how the city is able to organize itself in such a way that there's never too many people at the barbershop
and never too many people at the cleaners and only some people at that cafe it's never completely empty all right thank you hi welcome back remember that we're talking about lyapunov functions and they're this intuitively very simple thing so lyapunov function would be a case where you've got a maximum value and if the process stops that's fine but if it doesn't stop it goes up by some fixed amount therefore eventually it's going to hit this thing and stop which means that if a system has a lyapunov function associated with it it goes to equilibrium now if
it doesn't have a lyapunov function or we can't think of one then maybe it goes to equilibrium maybe it doesn't the point is if we can construct the lyapunov function then it definitely goes to equilibrium so it's a useful thing to know what we want to do in this lecture is give an example of another process in this case it's going to be an exchange market that has a lyapunov function so therefore goes to equilibrium then i'm going to problematize a little bit and show that well why might a system not go to equilibrium
so i'll create another sort of market that doesn't go to equilibrium and we'll see why that's the case what has to get in the way of us creating a lyapunov function so what prevents us from constructing a really simple lyapunov function and showing yes the system goes to equilibrium interestingly enough we can relate this back to something we studied earlier in the course chris langton's lambda parameter and that was in the context remember of those very abstract one-dimensional cellular automata models we're actually going to bring that logic back into play here to understand
something about lyapunov functions so let's get started what's an exchange market an exchange market consists of a situation where people just bring stuff so this is in bologna italy people bring fish other people bring money people bring baskets and you trade things so everybody brings their stuff in a cart and they trade with other people who brought their stuff in the cart and at the end of the day people go home and we want to ask is that system going to go to equilibrium or are people just going to keep trading things constantly all
throughout the day is just going to be some muddled mess of behavior well let's think through it what are our assumptions for the model first is each person just brings a wagon full of stuff second assumption we're going to assume that you only trade with someone if you're happier with what you have now than what you had before otherwise why would you trade and the third assumption hidden in here a little bit is we're going to assume that you have to increase your happiness by some fixed amount x so there's some amount and i probably
should have called this k because i was using k before so let's cross out that x and call it a k there's some fixed amount k you have to be happier by in order to make the trade now why would i assume that i'm assuming that there's some transactions cost so if i'm trading a basket for some fish it's got to be the case that i want the fish by at least you know the cost of going through that whole trading thing in order to get the fish for the basket so this
k is just the cost of trade and that's going to be important because we need that in order for the lyapunov function to work because we need that happiness has got to go up by at least some amount k let's recall what is a lyapunov function some function f that has two assumptions assumption one it's got a maximum or minimum value here we're gonna assume a maximum assumption two that either the process is at an equilibrium or if it moves the lyapunov function the value of the function f goes up in value by at least some amount
k those are the two assumptions so let's think about what a lyapunov function here might be well here in this exchange market we can let it be the total happiness of the people that's it very straightforward total happiness of the people is our lyapunov function let's think about it does it satisfy these two assumptions does it have a maximum value sure because people brought just wagon loads worth of stuff in there's only so much happiness that can go around right people can only get so happy with a fixed amount of stuff so
if you give everybody exactly what they wanted and add that all up that would be the maximum happiness you could possibly get so there's some maximum we might not achieve it but there's definitely a maximum because we've just got some wagons full of stuff assumption two there's some k such that if the process doesn't stop happiness goes up by at least k well that also holds remember because we assume that you only trade if you're improving by some amount k because there's some cost of doing the transactions you're going to trade with someone if you're
happier so therefore every trade makes people happier and there's a maximum so therefore we're going to get an equilibrium so this function is going to work let me add a little more detail two pieces of detail first i'm going to say that nw here stands for nuclear weapons o stands for oil and k here stands for north korea and i stands for iraq so north korea is going to say to iraq i'll give you some nuclear weapons for oil and iraq says to north korea that's great here's some oil give us some nuclear weapons and
u who's not involved i'm going to assume is the united states so this is no longer people trading fish for baskets this is north korea getting some oil and iraq getting some nuclear weapons let's think about our happiness function north korea is happier they got rid of some nuclear weapons they got a bunch of oil to help their economy iraq's happier they got rid of oil and now they can better defend themselves they get nuclear weapons the united states who's actor u here even though they're not materially engaged in the transaction they're affected there's an
externality they're less happy and because they're less happy that means that total happiness didn't necessarily go up because germany france england brazil venezuela all sorts of people could be less happy and if all sorts of people are less happy this trade may mean total happiness went down and if total happiness went down that may mean that other people have to make other trades as they try and make total happiness go up so what can happen is we don't know for sure whether the system is ever going to stop it could continue to churn because
we can't put a lyapunov function on the process let's think of some other situations political coalitions when party a merges with party b party c may be upset total happiness isn't going up what about mergers within firms you think of firms you know you think of there being a lyapunov function it could be profitability it could be firm happiness could be firm security but when two firms merge that could make other firms less profitable less secure reduce their market share whatever and so what you could get is that that process may still churn same too
with political alliances if you think of one country forming an alliance with another country that could make other countries less secure and that could mean that there is no lyapunov function the process is going to continue to churn finally this is one i talk about with my undergraduates quite a bit what about dating you think when two people decide to date they're both happier or two people break up presumably they're both happier but that could affect other people who are friends of those people who maybe wanted to date one of those people and it's not clear maybe
dating has a lyapunov function maybe happiness is a lyapunov function for dating maybe it's not it depends on the size of the externalities what have we got we looked at exchange markets and we said if we have a total happiness function as a lyapunov function works great happiness goes up process has to stop we get an equilibrium then we said not all exchange markets if there's externalities like with north korea trading nuclear weapons for oil with iraq other people could be affected and that could mean that happiness doesn't necessarily go up
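A tiny numerical sketch of why the externality breaks the argument: total happiness is the candidate lyapunov function, and a trade only counts as a valid step if it raises that total by at least k. All of the numbers below, the value of `K`, and the function name `raises_total_by_at_least` are made up for illustration and are not from the lecture.

```python
def raises_total_by_at_least(happiness_changes, k):
    """Check the second condition for one trade: does total happiness rise by at least k?"""
    return sum(happiness_changes.values()) >= k


K = 1  # hypothetical transaction cost

# Fish for a basket: only the two traders are affected, and both gain.
fish_for_basket = {"fish seller": 2, "basket seller": 2}

# Weapons for oil: both traders gain, but a third party bears a negative externality.
weapons_for_oil = {"north korea": 3, "iraq": 3, "united states": -10}

print(raises_total_by_at_least(fish_for_basket, K))   # True
print(raises_total_by_at_least(weapons_for_oil, K))   # False
```

In the second case both traders are still happy to trade, so the process moves, but the total falls, so total happiness no longer works as a lyapunov function for that market.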
total happiness could go down and the process could keep churning and we related that to langton's lambda parameter from that simple cellular automata model this is really interesting right that cellular automata model which was very abstract told us that systems where behavior isn't influenced by others tend to go to equilibrium systems where my behavior and actions are influenced a lot by others tend to be more likely to be complex or random and that same logic applies here we can apply lyapunov functions to things like you know changing which location when i go
shopping for fish and when i go to the bookstore or trading fish for baskets because my actions don't materially affect other people or if they do they make them happier right so in each case when i move from a crowded place to a less crowded place i make everybody happier and when i trade fish for a basket i make the person i trade with happier so i'm not lowering the happiness of anyone else there's no externality going on but in the cases where there are these externalities now when i take an
action i materially affect other people in the opposite direction with these negative externalities and that means that they may then want to change what they're doing which could mean that the system keeps moving so what do we have we've got that without externalities or with only positive externalities in the case of finding a maximum what you're going to get is that it's easy to construct the lyapunov function and boom you get there the system's gonna stop but if there's these negative externalities when i'm making myself happy i'm gonna make other people less happy then the
system could continue to churn and we may not be able to say whether or not the system's gonna go to equilibrium or whether it's going to be complex but we do have some intuitions and those intuitions suggest that simple markets trading goods should go to equilibrium they should constantly improve people choosing routes should constantly improve but things like international alliances or coalitions within political parties or possibly even dating that these things may be more complex and certainly that's how it appears out there in the real world okay thanks hi in this set of lectures
we're talking about lyapunov functions and lyapunov functions give us a way to explain or understand why some processes go to equilibrium what i want to do in this very short lecture is just clean up two details two questions that might be out there lingering in the minds of people the first question is this can we say how long it's going to take the process to go to an equilibrium so we know it's going to an equilibrium can we say exactly how fast so we'll talk about that and then second does the process always stop at
the max or the min so when you talk about lyapunov processes they can either always be going up or always be going down until they reach an equilibrium the question is if they're going down do they automatically hit the floor do they necessarily hit the floor or could they stop above it and if they're going up do they necessarily hit the top or could they stop below it that's the second question both pretty straightforward both fairly easy to answer but it's worth cleaning up those details let's look again at the formal definition of the
lyapunov function just so we can answer these questions precisely so it's some function f that has a maximum value if the process stops then it's at an equilibrium if it doesn't stop then its value according to this f increases by some amount at least k so it goes up at least k so what that means is either the process has stopped or it's going up by at least k since there's a maximum that means at some point the process has to stop so let's take up this question how long until the process reaches
equilibrium that's really a fairly easy question to answer suppose that we start out with f of x one equals a hundred and k equals two so that means that the original value of the function's 100 and k equals two and let's suppose the maximum equals two hundred so that means starting out at 100 the highest it can go is 200 and it's got to go up at least two each period so what we can say is that the number of periods has to be at most 50 because it's gotta go up at least two and the most
it can go up is a hundred and so a hundred divided by two equals fifty so what we get is the maximum number of periods is fifty so when we write down a lyapunov function if we can make k as big as we can possibly make it and make the maximum as small as we can possibly make it then we can put a bound on the number of periods we can't say exactly when the process will stop it could stop in one period it could stop in two periods it could stop at 47 periods but what we can say for sure is that
the number of periods is going to be at most 50 now it could be that if i think really hard about the model i realize that you know what k is not two but actually k is four that i can show that it's got to go up by at least four each period well if that's the case then instead of the number of periods being at most 50 you could show it's at most 25 so if you want to put as tight a bound as you possibly can on the time it's going to take a lyapunov
process to converge what you want to do is you want to make k as big as you can make it and make that maximum value as low as you can make it and that will help you put a tighter bound on how many periods it's going to take to get to the equilibrium but putting that bound on once you know k and the maximum value or in the decreasing case the minimum value as well as the starting value is really straightforward it's just some really simple algebra the other question is a little bit harder that is
does the process necessarily reach a max or a minimum and the answer here is going to be no now in that first example we did where people were choosing routes in terms of where to go in the city it did always go to the efficient case we didn't prove it but you can show that that's true but generally that's not going to be the case generally it can be the case that a process can get stuck someplace less than the max so i'm going
to explain this in two ways let's go back and talk about a rugged landscape model remember in that rugged landscape model there were peaks so here's a peak here's a peak here's a peak you can think of a lyapunov function as saying i'm going to step up at least some distance each period it doesn't necessarily mean that you're going to get to the highest peak you could just go up a particular hill and it could be a sub-optimal peak it isn't necessarily the case that these processes would take you to the maximal point take you to the optimal
point again there's a difference between a metaphor and actually having a mathematical example so let's see if we can come up with an example to show where we get stuck at less than the optimal point to do that we're going to go back to our preference model so you might be noticing at this point uh-oh i better pay attention to the earlier lectures right because we've done langton's model now we're going to do the preference model so remember in the preference model individuals have preferences over different things so here's person one person one they like
apples more than bananas more than coconuts and here's person two and they like bananas then coconuts and then apples and then here's person three they like coconuts then apples and then bananas let's suppose the following is true person one has a banana person two has a coconut and person three has an apple and now we're gonna have an exchange market we're gonna ask do they wanna trade can they trade to make themselves happier well it's clear they could all be made happier but let's see if they can do it so person one is saying but i
would like to have the apple person one goes over to person three and says hey how about if i give you this banana for your apple and person three says the banana no way because i like apples more than bananas so they reject the trade so person one can't make any trade that makes him better off person two has the coconut but they'd rather have the banana so they go to person one who's got the banana and say hey person one would you like to have my coconut how about my coconut for your banana and person
one says the coconut no way i like my banana more than the coconut so forget it so person one is not going to trade with person two now person three's got the apple and they'd like person two's coconut so they go to person two and say hey person two how about if i give you my apple for your coconut and person two looks and says the apple forget it i like my coconut more than the apple i'm not gonna trade with you so what we've got here is a situation where person one
has the banana person two has the coconut person three has the apple none of them can make a pairwise trade and be better off one way to understand that metaphorically right is to think here's the landscape where given the things they've got they're stuck at this point there's some place they could get to where they'd be better off if person one had the apple person two had the banana and person three had the coconut they'd all be better off but they can't get there by pairwise trades they could do it if they had
a more sophisticated trade where they put all three things on the table and each person grabbed the thing they wanted but through pairwise trades they don't get there so what do we learn what we learn is that it's at least possible to put a lyapunov function on a process and have it stop somewhere less than the optimal point it doesn't have to stop at the optimal point it could stop below that's what we're seeing here so we've answered two important questions the first one is if we know it goes to equilibrium can we say how fast
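The fruit example above can be checked mechanically. This is a minimal Python sketch, not something from the lecture: the names `prefs`, `holdings`, `rank`, `improving_pairwise_trades` and `total_rank` are made up for illustration, and a trade is assumed to happen only when both people strictly gain.

```python
from itertools import combinations

# Preference lists from the lecture, best first.
prefs = {
    "person 1": ["apple", "banana", "coconut"],
    "person 2": ["banana", "coconut", "apple"],
    "person 3": ["coconut", "apple", "banana"],
}
# Who holds what at the start.
holdings = {"person 1": "banana", "person 2": "coconut", "person 3": "apple"}

def rank(person, fruit):
    """Position of fruit in the person's list (0 is the favourite)."""
    return prefs[person].index(fruit)

def improving_pairwise_trades(holdings):
    """Pairs who would both strictly gain by swapping what they hold."""
    return [
        (a, b)
        for a, b in combinations(holdings, 2)
        if rank(a, holdings[b]) < rank(a, holdings[a])
        and rank(b, holdings[a]) < rank(b, holdings[b])
    ]

def total_rank(holdings):
    """Sum of ranks; 0 means everyone holds their favourite."""
    return sum(rank(p, f) for p, f in holdings.items())

# No pair wants to swap, so the process has stopped.
print(improving_pairwise_trades(holdings))

# The three-way reallocation gives everyone their first choice.
best = {p: prefs[p][0] for p in prefs}
print(total_rank(holdings), total_rank(best))
```

The empty list of improving trades is the "stuck" point; the lower total rank of `best` shows a better allocation exists that pairwise swaps cannot reach.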
and the answer is yes and the better bound we get on k and the better bound we get on the max the more tightly we can restrict how long it's going to take so we can put a tighter bound on how long it's going to take if we can estimate k accurately and if we can estimate the maximum value accurately we also learned that it can stop a lot sooner than that because of the fact that the process may not get to that optimal value some processes get stuck in sub-optimal
points and at least metaphorically you can understand that as being stuck on a landscape at a sub-optimal peak instead of climbing the mountain you're getting stuck somewhere below the optimal point okay where we're going to go next is we're going to talk about another lingering question and that is are there processes where we don't know whether they go to equilibrium or not and the answer to that surprisingly is going to be yes there are some where it's just sort of hard to figure out all right thanks hi so we're talking about lyapunov processes and lyapunov processes give
us one technique one tool for possibly determining whether or not a system goes to equilibrium so it can tell us for sure the system's going to go to equilibrium but sort of lingering out there is this question of can we always tell maybe not using lyapunov functions but if we have a system can we say for sure hey does this go to equilibrium or does it not go to equilibrium well that's the question that we're going to take up in this lecture does the system go to equilibrium or does it not or can we even tell
well and that's a good question so what we're going to do is we'll do this in two ways we'll do this in sort of a fun way with some examples and then we'll do something a little bit deeper we'll see in fact why some processes are very hard to figure out so here's the fun example this is called chairs and offices this actually comes from an experience that one of my former students had so the student had learned about lyapunov functions in class and she's working for some firm and this firm's
moving to new offices and they've also got a whole bunch of new office furniture and they're trying to decide how do we allocate furniture and how do we allocate the offices so i'm going to just put furniture in this one category chairs and all the offices in the separate category which i call offices so to allocate the chairs my student said well here's a really great idea let's just give each person a chair just randomly assign each person a chair we can all hold our chairs because these chairs are different and then people can just trade
each person can trade their chair with someone else if they want to and this will be great and the process will stop and we'll be done and her boss said you know that seems kind of crazy because we could be trading chairs all day and she said no no we're not going to trade chairs all day because there's a lyapunov function let the happiness function be how happy people are with their chairs and this is just a simple exchange market i'm going to only trade with someone else if i'm happier and if
they're happier and so that means happiness is going to go up now there's a maximum happiness which would be if everybody had their preferred chair and of course not everybody may get their preferred chair but there's still definitely a maximum so if you assume even some small cost of trading you know if it takes time and everything else you got to push the chairs around that means happiness is going to go up for each trade there's a maximum happiness the process stops so the boss says that's a brilliant idea so we can do that
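The chair argument can be tried out as a small simulation. This is a hedged Python sketch under assumptions that are not in the lecture: happiness is scored by rank (favourite chair scores n - 1, least favourite 0), preferences are random, and two people swap only when both strictly gain, so total happiness rises by at least k = 2 per trade. The names `happiness`, `trade_until_stuck` and `max_periods` are hypothetical.

```python
import random
from itertools import combinations

def happiness(prefs, chairs):
    """Total happiness: a person's favourite chair scores n - 1, their worst 0."""
    n = len(chairs)
    return sum(n - 1 - prefs[p].index(chairs[p]) for p in range(n))

def trade_until_stuck(prefs, chairs):
    """Swap chairs between any pair who both strictly gain, until no pair does.

    Returns the final assignment and the happiness after each step,
    starting value first.
    """
    chairs = list(chairs)
    history = [happiness(prefs, chairs)]
    traded = True
    while traded:
        traded = False
        for a, b in combinations(range(len(chairs)), 2):
            a_gains = prefs[a].index(chairs[b]) < prefs[a].index(chairs[a])
            b_gains = prefs[b].index(chairs[a]) < prefs[b].index(chairs[b])
            if a_gains and b_gains:
                chairs[a], chairs[b] = chairs[b], chairs[a]
                history.append(happiness(prefs, chairs))
                traded = True
    return chairs, history

def max_periods(start, maximum, k):
    """Most steps a process can take if it rises by at least k per step."""
    return (maximum - start) // k

rng = random.Random(0)  # fixed seed, an arbitrary choice
n = 8
prefs = [rng.sample(range(n), n) for _ in range(n)]  # random rankings, best first
chairs = rng.sample(range(n), n)                     # random initial assignment

final, history = trade_until_stuck(prefs, chairs)
bound = max_periods(history[0], n * (n - 1), 2)
print(len(history) - 1, "trades; the bound was", bound)
```

The same helper reproduces the earlier arithmetic: `max_periods(100, 200, 2)` gives 50 and `max_periods(100, 200, 4)` gives 25. The bound only caps the number of trades; the process usually stops well before it.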
we can allocate the chairs that way well then the boss says hey that's such a great idea we can also do it for our offices we can just randomly assign people offices and then let people trade and my student said that's a terrible idea and the boss says wait a minute why is that a terrible idea it was your idea for the chairs but now it's a bad idea for the offices what's going on there the student says look offices are different because there's externalities it's like the north korea iraq
example so suppose there's a bunch of offices and here's you know me right here and now let's suppose there's someone who wears a hat and this person likes to play really loud music and maybe that person wants to be next to this person over here who's really happy and always whistling so there's the whistler over here and the loud music person well the music person moves into this center office this center cubicle because they want to be near the whistler so they're sitting near the whistler and they're
happy but now i'm sitting next to this person who plays loud music so i then decide i want to move so i'm going to move over here let's say but maybe when i move over here maybe i'm someone that frequently gets out of my cubicle and wanders around because i like to walk around when i think because i do do that well this person who's sitting here she may not like that at all she may not like people who get up and wander around because that disturbs her so when i move here
she may want to move someplace else so when you think about getting offices i may move someplace to be happier but that may make other people less happy in fact we saw that a little bit in the schelling model right one person moving could cause other people to move in the schelling model the process eventually stopped in the office case depending on the nature of the externalities maybe it'll stop maybe it won't stop so my student could say well we don't know with the offices this may be undecidable in order to know whether
or not the office relocation thing will stop we're going to need to know a lot about how people feel about other people and how much they care about who their office is next to so that leads to the deep question when can you decide and can you always decide and that's a hard question the answer is it depends on the problem so in some cases you can figure out some other way to show or prove that the process goes to equilibrium or in some cases you come up with a really sophisticated lyapunov function to show
the system goes to equilibrium but for other problems even what seem like incredibly simple problems it turns out they're very hard to solve so let me take just a simple instance of this can we know that a process stops so this is the question let me take just an incredibly simple instance of this to show you just how complicated even simple processes can be this is something called the collatz problem the problem works as follows i call it hotpo half or triple plus one so that is this you pick a number if it's an even
number divide it in half if it's an odd number multiply it by three and add one and the process stops if you ever reach one so pick a number if it's even divide by two if it's odd multiply by three and add one so it's going to be going down every time you divide it's going to be going up every time you multiply by three and add one and you stop if the process ever reaches the number one now you could ask does this process ever stop let's do some examples to see how this works
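The rule just described fits in a few lines of Python. This is a minimal sketch; the function name `hotpo_path` is made up for illustration, and the loop is only assumed to finish, since nobody has proved that it does for every starting number.

```python
def hotpo_path(n):
    """Numbers visited from n until the process reaches 1.

    Halve an even number; triple an odd number and add one.
    Termination for every n is unproven, so this loop is an
    assumption rather than a guarantee.
    """
    if n < 1:
        raise ValueError("start from a positive whole number")
    path = [n]
    while n != 1:
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        path.append(n)
    return path

print(hotpo_path(5))  # [5, 16, 8, 4, 2, 1]
print(hotpo_path(7))  # a longer path that eventually joins the one from 5
```

Because the value goes down on some steps and up on others, the number itself is not a lyapunov function for this process.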
so let's suppose i start with five five's odd so i'm going to take 3 times 5 plus 1 so that's 16 that's even so i divide by 2 that's 8 that's even i divide by 2 that's 4 that's even i divide by 2 that's 2 that's even i divide by 2 that's 1 process stops let's take seven that's odd so i multiply by three and add one three times seven plus one so it's 21 plus 1 that's 22 that's even so i divide by two that's eleven that's odd so i multiply by three and add one that's 34
then i divide by 2 that's 17 that's odd so i multiply by 3 and add 1 that's 52 even so i divide by 2 that's 26 even divide by 2 that's 13 odd i multiply by 3 and add 1 that's 40 that's even 20 even 10 even 5 and once i get to 5 i know look 16 8 4 2 1 process stops so when i start with five i get to one pretty quickly when i start at seven i get to one but it takes a long time what if i start with twenty seven
start with a bigger number we see whoa this process seems to take an incredibly long time it's no longer a really simple process turns out that this is a very hard problem in fact the mathematician paul erdős one of the great number theorists ever said mathematics is not yet ripe for such problems in other words mathematics as a subject hasn't matured enough to be able to solve this we can put a man on the moon but we can't solve the collatz problem sort of amazing just to give you a sense of
how complicated this is here's a graph from carnegie mellon of all the numbers up to 2000 on this axis and this is how many periods it takes them to stop in the collatz problem so what you see is you see wow there's a lot of structure in here there seems to be one set of numbers that sort of goes like this and another set that sort of has this pattern but it seems really interesting and hard to figure out well that's why we don't know because it's so interesting and hard to figure out
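The data behind a graph like that one can be regenerated directly. This is a hedged Python sketch, not the code used for the original figure; `stopping_time` and `stopping_times` are hypothetical names, and the limit of 2000 simply matches the range mentioned in the lecture.

```python
def stopping_time(n):
    """Number of steps the half-or-triple-plus-one process takes to reach 1."""
    steps = 0
    while n != 1:
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        steps += 1
    return steps

def stopping_times(limit):
    """Stopping time for every starting number from 1 to limit."""
    return {n: stopping_time(n) for n in range(1, limit + 1)}

times = stopping_times(2000)
slowest = max(times, key=times.get)

print(times[5], times[7], times[27])  # 5 16 111
print("slowest start up to 2000:", slowest, "taking", times[slowest], "steps")
```

Plotting `times` with the starting number on one axis and the step count on the other gives the banded structure described above; neighbouring numbers can have very different stopping times, which is part of why the pattern is hard to characterise.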
we can't tell whether the collatz problem stops and we probably can't tell whether the office problem stops but we can tell whether the chair problem stops that's what's so interesting so when you think about a model or you think about a process like a lyapunov process right with lyapunov functions what you've got is you can say hey there's some cases like in the case of chairs or pure exchange markets this thing works great and we can just say boom it's going to stop we're fine there's other things like the office process that
unless you know a lot about the nature of the externalities you can't tell it could be oh yeah this thing's going to go right to equilibrium or it could be that whoa it's going to churn a long time like the number 27 did right where it just goes and goes and goes and goes and goes when we put it in the collatz process or the hotpo process okay thanks hi welcome back i want to conclude our discussion and analysis of lyapunov functions and i want to do so in a particular way i want to contrast lyapunov functions
with markov processes which we previously studied because markov processes also went to equilibria but there's some fundamental differences between the equilibria we described with lyapunov functions and the equilibria we described with markov processes now remember the broader context here is when we look out there in the world we look at a system there's lots of things that can happen the system can go to equilibrium the system can be ordered it can be random or it can be complex those are the four things that we can get and both markov processes and lyapunov functions gave us
sort of conditions under which we can say for sure the system is going to go to an equilibrium for a lyapunov function it was really simple right we just had that there's a function f it has a maximum value or in some cases a minimum value if it has a maximum value and the process isn't at equilibrium it goes up by at least some fixed amount k that means since it's always going up by at least some fixed amount k it has to stop really simple idea so it's a way if we can construct a function to say
that the system's going to equilibrium now a markov process was a very different thing there was some finite number of states they could be high values low values whatever and the system moved between those states and our assumptions were that the probabilities of moving between those states stayed fixed over time and it was possible to get from any one state to any other that was assumption three and then the fourth assumption was there was no simple cycle so it didn't just go like a b c a b c a b c given those assumptions we
had the markov convergence theorem that said the system goes to an equilibrium remember it goes to a stochastic equilibrium so it goes to some equilibrium if the states are a b and c it may go to some equilibrium like a half a fourth a fourth so half the time it's in a a quarter of the time it's in b a quarter of the time it's in c so it's a stochastic equilibrium the system is churning there's equilibrium percentages of the time it stays in those three states and moreover that equilibrium is unique so if i have
a markov process history doesn't matter initial point doesn't matter none of those things matter it's going to go to this stochastic distribution over these three states regardless of where you start very different from the lyapunov function a lyapunov process could be highly path dependent it could depend a lot also on the initial condition so it could depend on where you start and where you go so there could be many many equilibria so there's no reason to assume that it's going to be unique if you put a lyapunov function on a process there are lots of
equilibria or there could be lots of equilibria there don't have to be but there could be second thing is it's not a stochastic equilibrium it's not half the time you're in a a quarter of the time you're in b a quarter of the time you're in c it's a fixed point it's a fixed you know allocation of places where people go shop you know times when people go shop it's a fixed set of trades among people so the system stops whereas in a markov process the system keeps churning so the structure we had for lyapunov functions right maximum
value keeps going up is fundamentally different than the structure we had for markov processes both go to equilibrium but they go to different types of equilibrium let's remind ourselves just some of the things we've learned though about these processes on which we can place lyapunov functions first is if you can construct a lyapunov function then it goes to equilibrium now if you can't that doesn't mean it doesn't but if you can then it does that's the first thing so when you look at a process one thing you might do is you sit there and
think huh can i place a lyapunov function on this so in the case of my student with the chairs yes she could in the case of the offices no she couldn't so she could say with great conviction hey let's do this trading thing with the chairs but she could be skeptical of whether the same procedure would work with the offices because she couldn't think of a lyapunov function second if you can write down the lyapunov function you can figure out how long it's going to take so you can say this
is going to go to equilibrium and it's going to go pretty fast or you can say it's going to go to equilibrium but it could take a long time so you can bound how long it's going to take to go to equilibrium so that's also good third thing we just talked about this the equilibrium need not be unique or efficient so there could be many equilibria it could be a bad equilibrium all you know is it's going to go to equilibrium to prove it's a good equilibrium requires other techniques it's going to require a deeper analysis so
all the lyapunov function will tell you is whether it's going to go to an equilibrium and how fast and then last the reason a system won't go to equilibrium is because there's externalities and even more than that i can be more specific here there's sort of externalities pointing in the other direction so pointing opposite so if you're trying to increase happiness if the externalities cause people to become less happy that's going to keep the system churning if you're trying to minimize waiting time and the externalities increase waiting time that's going to keep the system churning
so it's not just externalities it's externalities that point in the opposite direction that's what causes a system not to have a lyapunov function and that's what makes it possibly undecidable so let's think back to the hotpo case the collatz problem remember that was half or triple plus one so some of the time it was going down by half and other times it was going up by triple plus one so a system that's going up and down systems that go up and down you can't necessarily say they're going to go to equilibrium
it's those externalities that point in the opposite direction that cause a system that's trying to go down to go up or a system that's trying to go up to go down and it's externalities that prevent equilibrium that's the same lesson we learned remember from langton's lambda in that very simple cellular automata model when we have externalities when one person's action or one cell's action depends on the actions of others so if my happiness depends on other people's actions the system's likely to churn when what i do is unaffected by other people so my happiness
is unaffected by the actions of others then i'm likely to get a system that equilibrates that's it we're done now with lyapunov functions we've learned a lot of really cool stuff and it's a really nice framework for just thinking about systems so when you think about let's just set something loose you think oh boy have i just set something loose that's just going to go crazy or have i set something loose that's just going to very smoothly lead to a nice equilibrium one way to get some insight into that is through lyapunov functions now
what's nice though about this as well is by looking at lyapunov functions and comparing them to some of our other models like markov processes and the langton model we begin to see how having multiple models in our heads enables us to understand some of the richness we see out there in the world and actually have deeper understandings of the processes we see to understand like this process is going to equilibrium because it's a markov process and it's a stochastic equilibrium and this process the exchange market is going to equilibrium because of the fact
that there's a lyapunov function and happiness is going up so what you get is different processes going to equilibrium for very different reasons and having different tools for understanding why equilibrium exists is a very useful thing for making sense of the world which is one reason why we model okay thanks