oh it says 2019 but actually this is 2020. i used this slide previously so what i'm going to first can everybody hear me okay okay yeah i see heads nodding good okay um so we are going to look at the major molecules that exist in cells and so our focus today is going to be on what's the function and the structure of dna the importance of rna as an intermediate taking dna information and converting it into protein we'll look at what is a protein really and its function which is largely to serve as a catalyst of
enzymatic reactions and this is done by the major class of proteins which are called enzymes so let's start though i've got to click this here by talking about scale we're talking about the molecular world and it's important to get some sense of really how small is small as this slide shows so if we think of things we can see easily like a tennis ball that's 10 times bigger than the period at the end of the sentence and that's about 10 times bigger than a typical cell so what this means is that the period that
you see at the end of the sentence if you're looking at some tissues that's a space that's big enough to accommodate about a hundred cells all right now this is the type of cell that would be found in plants and animals and visible cells what we call eukaryotic cells cells with nuclei but the other major cells in the world of course are all the bacteria and they're about one tenth the size of a typical uh sort of human cell and then today we're dealing with viruses with the coronavirus and the size of that
is something like maybe one tenth the size of a typical bacterial cell all right so the scale of cells here is then that you could take a typical virus like the coronavirus and you could put a hundred of them lined up across the typical cell and a typical you could have take a hundred of those to put across the period in the sentence so these are big size differences but when you start to look at molecules you go down even smaller so a protein like an antibody protein is uh you know compared here is like
a thousandth the size of a bacterium and a sugar molecule yeah i said that wrong that's a hundredth the size of a typical bacterium the diameter but a sugar molecule is like a thousandth in other words across a typical bacterial cell you could line up a thousand sugar molecules to go from one side of the cell to the other so this gives you some sense of size and scale and we're going to be looking at molecules that are at the bottom end here really small ones things that are really difficult or in some cases impossible
to see with the best of all working microscopes okay so let's let's look at a cell to begin with to try and put this in perspective we have two general categories of cells first we have the bacteria and these two categories of cells i've put here in a scale of how different they actually look in terms of their sizes the typical bacteria in comparison to a typical cell in your skin for example they differ by about this much in size bacterial cells are much simpler in overall structure which you can see in this case looking
with an electron microscope you really just see an outer wall and a membrane and you see clear zones and dark zones and these are the proteins involved in making other proteins these are places in the cell where there's just soluble sugars and other small proteins etc when we look at a typical cell that we might find in our body we find all this intracellular complexity so we see this big nucleus here that's where our dna is kept and this nucleus is surrounded by a double membrane and then out here are all of these membranes
here they're involved in making other membranes there are organelles involved in producing energy there are organelles involved in recycling there are organelles involved in transport and at some point in the future we may talk more about cell biology but today we're going to really look at the molecules so if we had a powerful enough microscope they only exist in science fiction unfortunately and we just zoomed in on this little piece here where there's a pore in this double membrane that allows this part of the cell to communicate with the nucleus here it
would look like this now this is an attempt to show an accurate drawing of what the inside of a cell looks like putting all of the molecules at their correct size and so this here you see is the double membrane that surrounds the nucleus and there's an opening in this double membrane and it has this very complex set of proteins here that functions as a gate that allows some molecules to enter the nucleus other molecules to exit the nucleus and this flow from one compartment to the other is kind of organized regulated by this complex
here that sits in this opening now what are all of these molecules most everything you see here are proteins all these different colored molecules in here are different proteins in the molecular world proteins are very big in that a typical protein could easily contain ten thousand to a hundred thousand atoms now what we see here is the membranes and they are made up of lipid molecules which are kind of long linear molecules inside the nucleus here this is where the dna and rna are and in this drawing the dna is represented as blue whoops
let me go back here hit the wrong button uh so to give some sense of the scale of it then inside the nucleus if you could really see the molecules a dna molecule would be about this big a very thin thread and there is just so much of it here in actual size i'm not making an analogy here i'm saying the actual size of the dna in every nucleus in your cells is that this thread is about six feet long so you know to try to get that in your head of the scale the
cell itself is so tiny you know just a hundredth the dimension of the period of a sentence and then inside that cell is the nucleus and inside that nucleus is a thread that is six feet long all right we're going to come back a little bit more to its structure now let me say also that um you know when i've given this talk before it's always been to a live audience and it could be more interactive if somebody really has a burning question don't be afraid to just unmute yourself and ask your questions
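The six-foot figure for the DNA in one nucleus can be checked with rough arithmetic. The sketch below is only an estimate under stated assumptions: about 3 billion base pairs per set of chromosomes (a number that comes up later in the talk), two sets per cell, and roughly 0.34 nanometres of helix length per base pair, which is a textbook value for the common form of the double helix and is not given in the talk. The function and constant names are made up for this illustration.

```python
# Rough estimate of the total DNA length in one human nucleus.
# Assumptions (labelled here because not all of them come from the talk):
#   - about 3 billion base pairs per single set of chromosomes
#   - two sets per cell, one from each parent
#   - about 0.34 nm of helix length per base pair (textbook value)
BASE_PAIRS_PER_SET = 3_000_000_000
SETS_PER_CELL = 2
NM_PER_BASE_PAIR = 0.34
METRES_PER_FOOT = 0.3048


def dna_length_metres(base_pairs_per_set=BASE_PAIRS_PER_SET,
                      sets_per_cell=SETS_PER_CELL,
                      nm_per_base_pair=NM_PER_BASE_PAIR):
    """Return the end-to-end length of all the DNA in one nucleus, in metres."""
    total_nanometres = base_pairs_per_set * sets_per_cell * nm_per_base_pair
    return total_nanometres * 1e-9  # convert nanometres to metres


length_m = dna_length_metres()
length_ft = length_m / METRES_PER_FOOT
print(f"{length_m:.2f} m, about {length_ft:.1f} feet")
```

With these inputs the estimate comes out at roughly two metres, which is in line with the "about six feet" quoted in the talk.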
i'm very open to to interruptions in fact is there anything up to this point somebody has some burning question okay well we'll move on then all right so let's look at what's called the central dogma of molecular biology that is that the way life is made is to store the information for making a cell or an organism in a dna molecule this long thread-like molecule that we're going to talk about we can think of this as the book of instructions in here if the cell needs to make some particular protein it goes into the book
of instructions and makes a copy of one instruction set that if this book is like a cookbook this would be a recipe for one particular thing in that book and then this recipe is read by a protein machine that synthesizes a protein and we're going to look at exactly what these different molecules are and how you get from dna to rna to protein the central dogma of molecular biology okay so here's a cartoon of what dna looks like and it's a very complex molecule i'm going to show you more details but it has a set
of atoms that form a long linear structure and in fact you take two of these long linear structures and you put them side by side and connect them and you get the helix the dna helix this information in here is transcribed which just means copied you make a copy of the order of the rungs on the ladder here into an rna molecule which is really a very close cousin of dna in terms of the chemistry and the structure and then this rna molecule goes out into the cell binds to an apparatus called the ribosome
and that's a manufacturing machine to make protein and we'll talk about what that looks like too a very different kind of molecule these two are very similar dna and rna protein is very different from them all right so what i'm going to do is i'm going to show you this cartoon which gives you here's a cell the basic keyboard uh silence the sound on it here okay and have to hit go this is going to show you an overview of the whole thing it's going to give you way too much information in a compact way
but it's an overview and then in the end we'll come back to it after we've described the details so we're going inside the nucleus and we see that the dna in the cell sometimes is all compacted into chromosomes other times it's unraveled so that it can be accessed and read to be used to make proteins so this is a way to represent the dna and a piece of the dna is what makes a gene now here a protein is coming in to make a copy of a part of this dna it pulls the two strands
apart and the basic building blocks that make this new molecule come in these are hooked together to make a strand of rna and as you'll see it looks very much like dna it has this long linear part with these side branches but there are four different flavors of kind of rungs on this ladder if you will and that constitutes an alphabet a code that gives instructions to make protein so that rna then after it's synthesized goes out of the cell out of the nucleus rather out into the soupy part in the cell and a big
machine made of protein binds to it and basically reads the code and what it does is for every three letters that specifies bring in an amino acid to this synthesizing machine the ribosome all right so if you had the letters u c and a down at this end that might mean bring in serine and so it goes along this code three letters at a time each three letters driving the transport into this machine of an amino acid the amino acids are then hooked up one by one to make this long chain they go out
a pore in the top of the ribosome and you end up with a linear string of amino acids but then they collapse and bind to each other and in doing so they make a protein all right so that was the whole whiz-bang i have a question yes okay it always uh i never had this question answered so maybe you can what starts the process what says to the dna uh it's time i want this or i want that um so do we know yeah i mean it's somewhat a chicken and the egg uh kind of
problem but let's say that uh take a simple case of a bacterium in your gut if suddenly you've just eaten a meal and there's a lot of sucrose sugar in that meal that sugar will bind to the outside of the bacterial cell and that'll trigger a reaction inside the cell saying sugar is out here bring it in and that will bind to a protein inside the cell that will bind to the dna in front of the gene that makes the protein to degrade the sugar there's a whole series of signaling pathways which are pretty
well understood for many for many metabolic reactions okay okay so let's go back to the to the really the core of it so this is one way to represent a dna molecule in which you can see every atom in the dna molecule so its basic structure is what we have here is this long linear piece on the outside and then attached to this linear piece are these sort of side branches all right these side branches here exist in four different structural forms a t c and g you see they look somewhat similar the major big
difference is that in a and g you have two rings there's a ring here and a ring here two fused rings right whereas t and c only have one ring the structure is such that t will always stick to a's because there's a slight electrical positive and negative charge difference between these two molecules and these weak electrical charges attract each other and will allow a and t to bind to each other in this way when c's and g's come together they line up such that there are three places where a minus and a
plus will attract each other and hold these together these connections are very weak and a really good analogy is to think of this as almost velcro-like in that if you have velcro for just a few tiny connections it takes very little effort to pull them apart but if you have hundreds or thousands of connections then these two strands in the dna helix are actually held together quite tightly all right so uh to kind of back away from the atomic complexity of it what we have is what's called the backbone and the backbone is is basically
made up of sugar molecules and phosphates and then we have the nucleotides the a c t and g that are attached to the backbone so that a single strand of dna right this one would be the backbone and it'll read t g c and so on however because a always binds with t and c always binds with g this can be copied to put its complementary strand kind of on the other side here so it has its backbone and that will read in
the complementary way where you have t you have a where you have g you get c or c you get g now i'm not going to go into it in this talk but the core the beauty of this is that if you pull these apart and you want to replicate it exactly it can be done because the whole code here uh is set up such that if you want to make a double helix from the single strand you have to bring in an a molecule for this one you have to bring
in another a molecule for this one you have to bring in a g molecule in other words if you pull them apart you can read it and replicate the whole strand uh and one thing i'm planning to do with the course this year is for my second lecture i really want to return to many aspects of dna that i'm just superficially going over today and go into that in a little bit more depth now this is what a dna molecule would look like if you flattened it now it's not really a planar structure it's
a very bumpy lumpy structure and so what it really does is it really forms this helical structure and if in fact we looked at a model that represents every atom in the dna it really looks like this so what you have here is the yellow is the phosphate that's in the backbone reds are oxygen atoms blues are nitrogens uh okay whites are hydrogens grays in the middle are carbons etc another thing to really emphasize because the cartoon or this model which is an accurate model really a kind of hard concept to grasp just looking
at it is that although this looks very uniform it is not exactly the same here as it is here and why because uh the sequence is different so in this part here you have a pair of t's for example t a base pairs down here you don't have a pair of t a base pairs that means the actual detailed structure of the molecule up here the lumps and bumps on the outside are different from the lumps and bumps down here and that is the core to how dna gets recognized the difference in c t a and g gives you a
different surface structure in the dna molecule and this can be effectively read by protein molecules to find sequences in the dna it's really the heart of how the whole thing works okay so let's go back then to look at dna in the cell and this uh picture here is really an attempt to show you how you take this six feet of dna this long complicated molecule and put it into a cell in a way that it isn't damaged and yet it's accessible so when cells are dividing the dna is highly compacted it's wrapped up
super coiled we call this into a big lumpy structure that we recognize as a chromosome but when the cells are not dividing it unravels so that this structure is exposed to the kind of the water out here in the nucleus and these sequence differences can be sensed by proteins that are bouncing around here inside the nucleus okay now in in humans when our cells are replicating then we can see all of our dna as chromosomes and every human cell has 23 different kinds of chromosomes and we get a set of 23 from our mother and
a set of 23 from our father giving us 46 all together um for quite a few years we've been able to use dyes to specifically label each of the of the different chromosomes there are no you know i i don't actually know how this is done it's really quite quite a trick but it it can be technically done and so if you want to for example in in newborns sometimes they will do what's called a karyotype analysis karyotype simply means what do all your chromosomes look like and what we're looking for here is some some
abnormality in a chromosome all right so the way this is actually done is you take some cells you kind of squish them on a microscope slide you add these stains so that the chromosomes become visible and then you in the past they would actually kind of cut out the photograph now of course you can do this all on a computer so you line up the two longest ones the second longest ones the third longest ones and this is called chromosome one chromosome two chromosome three all right through the whole set now one thing
you can just see is that this is this male or female well it's a female you can tell that because there are two x's in here if this were a male it would only have one of these big chromosomes and then it would have a stubby little y chromosome difference between males and females and if you were doing an analysis here you would get this odd finding that the smallest chromosome the two are not exactly the same and if you were missing this much of a chromosome this cell would die so how can you get
that well in this particular example it turns out this chromosome was broken and the other piece of it stuck to chromosome number nine up here so that this is an anomaly would that be good or bad most of the time it would be bad but if in fact the break didn't actually break in the middle of a gene uh this this person could actually be rather phenotypically normal all right so anyway you can see the dna in your cells but keep in mind that even this single chromosome the dna here if you unraveled it is
probably about five inches long it's exceedingly thin but five inches long okay now just a few other points here so this is called our genome genome is just the word that means the collection of all of your genes and all of your dna in humans we have three billion base pairs that's nine zeros and within these three billion base pairs there are regions that code for 25 000 genes roughly and over the years we're kind of settling on that number and there's this interesting fact here that over 90 percent of our dna
is not in genes per se it's what was that a question yeah when you're defining genes are you talking about just ones that translate into proteins uh not necessarily no no so for example there are genes that encode the rnas that go into ribosomes and those are considered genes uh so um a gene is really uh defined as something that makes a product that kind of persists and is used in some kind of structure but you know 99 percent of them are making proteins and about one percent of them make
rna molecules all right so what you have is spacer dna in between the genes but the other thing you have it's estimated about 15 percent of all the dna is just old viruses old kind of pieces of self-replicating dna that infected our ancestors hundreds of thousands of years ago and it just persists in our genome all right so looking at this non-gene dna is actually a very active area of modern molecular biology okay so actually that's a good point any other uh questions at this point because we're going to slightly switch
gears here on the kind of basic dna structure okay hearing none so what is a gene so i'm showing you here the coding sequence for a real gene so what you see is a representation of what the base pairs the rungs on the ladder of dna what they are for this particular piece of dna and it's reading just one strand of the double helix so there's a convention to know what strand is being represented so if this is a gene this would be the coding strand the sequence of the coding strand
of that dna and just looking at the basic alphabet code you have here there's not obvious information there's no uh punctuation marks uh you know it's just continuous uh some of your chromosomes are uh let's see probably your biggest chromosomes are anyway more than 10 million uninterrupted letters in a row so how do we know where a gene starts and where one stops and the answer is the information is in the sequence itself yes uh who came up with those four letters like why did they decide on a c t and g okay
it's because the names of these molecules are adenine cytosine thymine and guanine so it's kind of straightforward from the chemistry of what makes them up and uh we'll see that uh the t thymine comes in a slightly different chemical form which is called uracil and one key difference between dna and rna is that uh in dna you use t and in rna you use u we'll come to that in a bit okay so we now have worked out what these sequences mean to some
extent and we know that certain sequences specify the kind of punctuation stop and start signals within a gene so you have small sequences often made up of only as few as six to maybe 12 nucleotides and what these are are sites that can be recognized by proteins as the starting place for a gene so this particular piece here uh and on your screen you know my screen right now i see this this the all the zoom menu is pulled down do you see that is it blocking the top of the it's not okay good all
right so that's tgtc tc agt that is the signal for one particular type of protein to recognize this as the start of a gene and so what will happen is in the long dna strand it'll just be bouncing along here kind of bumping against this piece of dna until the lumps and bumps in the protein exactly complement the lumps and bumps of this particular piece of dna and that protein will bind right to this point it'll stick right here and that protein has the function of marking the start of a gene and it will
essentially provide instructions for other proteins to come in and say i want to make a copy of this gene and so it will start right here and i'm going to show you how this copy is made in a cartoon form and it will go down to the end down here and down at the end here there will be some signal that's saying end of the gene these are rather more weakly defined now within the the piece that gets made going from here down to here there are two other important signals so if we look in
here we see that in this first place where we see the sequence atg that means start it means the start of a protein and so the protein gets read all the way through until we get to the sequence taa and that means stop all right and then this actually gets made into rna but it doesn't get made into protein it's kind of a tail on the end of the sequence we think one of its functions for example is these molecules can get chewed off at the end and we don't
want to lose any of our gene that's in here so it's kind of a protective little tail on the end of the sequence so let's go next and look at this question of the genetic code the a t c and g you know working this out was just kind of one of the thrills in molecular biology when they got the secret the structure of dna and they saw that it was a simple code with only four letters in it and the idea was how in the world are we going to figure out and break the
code and it was kind of worked out sort of one word at a time really nifty story that maybe i'll be able to talk about a little bit when i get to my expanded dna lecture but what we know is that in fact three consecutive letters make a word and what a word is in the central dogma of molecular biology is it specifies an amino acid and this is what proteins are made out of all right so atg means start it also specifies a particular amino acid methionine and that means that almost all
proteins always start with a methionine there turned out to be three different letter combinations that mean stop t-a-a t-a-g or t-g-a all right if you work out the math here if you've got uh four different letters in your alphabet and all your words are three letters long how many possible words are there and the answer turns out to be 64 all right there are not 64 amino acids there are 20 and so you have situations where like gga might mean glycine but ggg will also mean glycine so there's some we say redundancy in the
code all right but your basic situation is here's the gene as specified in its nucleotide sequence atg means start and it also means methionine cgt is the second amino acid and that turns out to be an arginine and tgc it turns out to be a cysteine so you take the gene you've made a copy of it in rna which is identical except instead of using t's you use u's and so aug means methionine cgu means arginine ugc means cysteine you keep going until you get a stop codon and that brings
you to the end so this is kind of the core of what the genetic code is any questions on this all right in a way it's amazingly simple amazingly simple alphabet and it is the same in all organisms from plants to animals to bacteria virtually no exceptions to this use of the genetic code okay so we've looked at what a gene is and the stop and start and i mentioned that what you've got in the sequence is some sequence that says to a protein bind to this site and make an rna copy
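The reading of the code described above (copy the gene into RNA with U in place of T, then read three letters at a time from the start signal until a stop signal) can be sketched in a few lines. This is a minimal illustration, not a full translator: the codon table is deliberately partial and holds only the codons named in the talk, with GGA and GGG both assigned to glycine as in the standard genetic code. The example gene and the function names are invented for this sketch.

```python
from itertools import product

STOP = "stop"

# Partial codon table (RNA letters). Only the codons mentioned in the talk
# are included; a real table has all 64 entries.
CODON_TABLE = {
    "AUG": "Met",                            # start signal, also methionine
    "CGU": "Arg",
    "UGC": "Cys",
    "GGA": "Gly", "GGG": "Gly",              # two words, same amino acid
    "UAA": STOP, "UAG": STOP, "UGA": STOP,   # the three stop signals
}


def transcribe(coding_strand):
    """Copy the coding strand of DNA into RNA: same letters, U instead of T."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna):
    """Read codons from the first AUG until a stop codon is reached."""
    start = mrna.find("AUG")
    if start == -1:
        return []                            # no start signal found
    amino_acids = []
    for i in range(start, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        meaning = CODON_TABLE.get(codon, "???")  # "???" = not in partial table
        if meaning == STOP:
            break
        amino_acids.append(meaning)
    return amino_acids


# Four letters, three positions: 4 * 4 * 4 possible words.
all_codons = ["".join(letters) for letters in product("ACGU", repeat=3)]
print(len(all_codons))      # 64

gene = "ATGCGTTGCGGAGGGTAA"  # hypothetical gene, made up for this example
mrna = transcribe(gene)
print(mrna)                 # AUGCGUUGCGGAGGGUAA
print(translate(mrna))      # ['Met', 'Arg', 'Cys', 'Gly', 'Gly']
```

A lookup table fits here because the code is a fixed mapping from 64 three-letter words to 20 amino acids plus stop, which is also why several words can share one meaning.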
say here is the gene here's the start site the big enzyme has come in the function of this enzyme is to thus pull the two strands of the dna apart this is this is going to be the strand that's going to get copied down here so as it's pulled apart a t is exposed so it brings in the a base to match up with this t this one here is an a so a u base is brought in then these are connected and then you bring in another t and that one t will bind an
a and that will get connected a g will bind the c and that will get connected and so you'll end up with this copy of your dna of one of the strands of the dna and it'll be copied into this molecule rna so i said at the beginning that dna and rna are nearly identical in their chemical structure really the only difference is that uh u's are used instead of t's in rna and there is another important but subtle chemical difference and that is that in the backbone uh there is one atom
difference dna is deoxyribonucleic acid the ribo refers to the fact that it's made out of in part ribose sugar but it's deoxyribose meaning deoxy missing an oxygen rna on the other hand is just ribonucleic acid it's not missing the oxygen this simple plus or minus of this oxygen atom turns out to make a really important difference in the chemistry and the reactivity of the molecule fundamentally the rna form is more easily degraded and is less stable but the dna form is more difficult to degrade and is very stable and of course we now know
that dna molecules can last for tens of thousands of years whereas rna is just very easily degraded in the environment and really has a very short lifetime in cells also okay so we've seen what our genes are we make a copy of a gene into an rna and then we go to the next step the next step is that that rna alphabet code is somehow used to string together a whole sequence of amino acids so let me first look at what are amino acids you know they're arguably you know maybe the most important molecules in our
body because this is what does the chemistry this is what metabolizes our food it's what drives our muscles it's what signals our neurons it's the chemistry that can take place in proteins which are made out of these building blocks amino acids in virtually all kinds of organisms we have the same basic set of 20 amino acids and so i'm showing you all 20 here the core of it is that you have an amino group nh2 is an amino group nitrogen and two hydrogens tends to be positively charged connected to a carbon and then on the other
side of that carbon you have a carboxyl group so the carboxyl group is carbon connected to two oxygens tends to be negative so this oxygen here o minus will readily and often bind h plus which it gets out of water water is h2o and water can dissociate to make h plus so often this will have a hydrogen on it if it loses that hydrogen that's what an acid is if you say something is very acidic that simply means that it's got a lot of h plus in it so what we've got here is
an amino end and an acidic end and that's why this is called an amino acid an amino acid now that's the core in all of these molecules amino carbon acid group amino carbon acid group all right what they have on this central carbon though then is what we call the side chain it can be as simple as just a hydrogen atom it can be slightly more complex than having a carbon combined to two hydrogen atoms or it can have three of those complexes like this or it can have four of them etc in some cases
it can bind to a more complex ring-like molecule in some cases we said that this is an acidic group here it can have another acidic group down at the end these are examples of where we have an amino group here we have another amino group down here at the end what all of these differences specify really are two important chemical properties one is how well do they bind water are they soluble or are they insoluble basically if they don't bind water like these up here all the ones in green what they'll end up binding
is really they'll bind each other they'll be excluded from the water they'll be kind of fatty we would say on the other hand these molecules down here all bind water rather readily and they'll tend to stick out into the water these down here have the important additional feature that they're charged so these have negative charges on them and these have positive charges on them negative is going to bind positive very strongly so what you've got is many ways in which these side chains can interact can be attracted or repelled and that's what drives the
structure of a protein in the end okay so these are the same 20 amino acids it's just shown in a different way it shows you that the simplest side chains have kind of no bump on the molecule if you add a few atoms you get a little bump if you add more atoms you get a bigger bump big bumps like this and so if we look at the kind of overall surface of these 20 different amino acids they've all got the same core but they all have different sized lumps and bumps some of these will
bind water some will not some are positively charged these are the basic modules that we use these 20 amino acids that we use to make all kinds of proteins one last thing to say about this is that in our diet you know this being this central kind of molecule that we need to make everything else our bodies can only synthesize about half of these we have to get the others from our food so plants bacteria they can make all 20. they have to make all 20. plants don't eat other plants or animals they've got to make
everything for themselves so the reason you know that we need a balanced diet is to make sure that we get a diet that's going to provide us with all of the amino acids especially the ones that we cannot synthesize ourselves okay so let's look a little bit then at how we go through the last step in the central dogma which is from rna to protein so we saw that fly by in the movie that i showed at the beginning here but now let's look at this in a little more casual way here the machine
in the cell that drives the synthesis of protein is called the ribosome it is basically a very large complex of proteins and rna it turns out uh in our cells i think our ribosomes have 80 different kinds of small proteins and four different kinds of small rnas and it makes this machine that has boy i don't remember it's like a million different atoms and uh one of the prides of our campus ucsc is that probably arguably the world leader in understanding the structure and function of this ribosome is
harry noller up at ucsc and he is one of the people in the world who has figured out the structure of the exact position of all million atoms in this ribosome it was a great feat and he is now working on showing exactly how it does its job but here's the core of it so the ribosome can bind to an rna molecule this is one that had copied a particular gene all right and it's not shown in my cartoon but it will bind it at one end where that atg was located it can recognize that
as the start and it will position that atg at the right place in the ribosome then we have these special carrier molecules that are going to pair up the right amino acid with the correct three-letter code these are called trna molecules they're a specialized form of rna the reason they need to be made of rna bases is because the a's and the g's can recognize wherever there's ttc any place in this code down here where there was ttc in rna however t is always u so it would be uuc and here's a uuc
here this will kind of effectively pair up bind weakly to this three-letter code position it on the ribosome bring in this amino acid and position it right next to this one and the ribosome will then link the two together all right hook them up and so then the ribosome will move the rna along by three letters the next trna will come in bringing the correct amino acid the amino acid will get hooked up and then the whole process will repeat and so in this way this this every three letter whoops set here is going to
bring in the appropriate amino acid it's going to get hooked up and you're going to be making this chain of amino acids out at the other end here okay any questions on this part here what i'm going to show you now is a movie uh and these animations are getting better and better in terms of being not just cartoons but really trying to draw all the structures to their actual size and it's going to show you protein synthesis at the actual rate actual speed that it occurs in the cell and it's
and it's pretty darn fast okay so here we go let me see get my cursor here all right so central dogma the dna double helix contains two linear sequences all right so here's the dna protein binds to the start site that brings in the enzyme that copies the dna and all the factors that are needed once everything is assembled you say go and so it zips along this is the real speed it really just flies along and it's making the rna copy of that gene and a typical gene is often about 3,000 nucleotides long
so that will often take about a minute it does quite a few nucleotides per second so what we've got here now is the rna copy of that particular strand of dna that particular gene now there's a real odd complexity that comes involved in here and it turns out that some of this rna can't actually be included in the final protein it's called intervening sequence and it has to be edited and cut out and so these proteins are recognizing the green part as a part that has to be edited out before we move on in my
digging deeper into dna i'll i'll come back to what this is all about this is called gene splicing it doesn't actually splice the gene it splices the rna that was copied from that gene so the parts that need to be cut out are cut out by the splicing machinery splicing machinery is made of protein and so you finally end up with the actual coding part and that goes out one of these pores in the nucleus and there it encounters a ribosome the protein manufacturing machine and once the ribosome comes in then it can position
the rna in the right place and these carrier molecules the trnas will bring in the correct amino acid that corresponds to the three-letter code and it flies along and it occurs at the rate of about a hundred per second something like this and so a typical protein will have you know maybe at least uh 500 amino acids in it so it still takes a while still takes a minute or two to make a protein and so the protein comes out at the top of the ribosome and it's released and it folds up into its final
form so that is the process of translation we're translating the dna code into a different sequence into an amino acid sequence so let's look then at the this final step which is the production of the protein and the function of the protein so in this cartoon we just want to illustrate the fact that this strand of amino acids will have some that for example in some cases one of them might be positively charged and the other is negatively charged so they will stick to each other in other cases they might not be charged or compatible
at all and they might actually repel one another and it may be that an amino acid at the end of the chain ends up binding most tightly to something that's way back in the middle of the chain and so it starts out as just a long chain and it kind of flops around until you've got the sort of maximum binding sites that can be achieved this is called protein folding it's one of the big challenges of molecular biology to figure out how it gets the right amino acids stuck to its corresponding amino
acid somewhere else in the chain this is as you can see an immensely difficult calculation of how it achieves its final structure but it does and it does it quickly and in most cases it does it completely on its own in other words there's no apparatus that's required to wrap up this chain into the correct final structure all right sort of 95 plus percent of all proteins function as enzymes and so this is just two ways to portray an enzyme so at its core it's a strand of amino acids and so uh it's often
shown in this form which we call a ribbon diagram and this comes out of the fact that there are some sequences of amino acids that form a fairly regular structure they can be kind of like helical in other places they're just sort of random in other places they're sort of like flattened sheets that lie close to each other and so we can represent these sort of regular folding patterns with this sort of regular diagram they can be held together by many different kinds of interactions i mentioned plus and minus etc or lipid-like things to other
lipid-like things one that is a really strong bond is there's one amino acid cysteine contains the sulfur in it it actually forms a strong molecular bond so if there's a cysteine here and a cysteine here they will form a strong molecular bond and in this particular example there are four of these what we call disulfides that help to hold this whole thing together now if we wanted to look at where are all the atoms in this chain of amino acids then we can show an atomic structure like here you can see there are pluses and
minuses to both ways of representing this this actually allows us to follow the chain here we can't really see where the chain is but this allows us to get a good look at what the actual surface of the protein is like and if we looked at this one for example there's a big cavity in here and this may well be where the chemistry occurs let's say that this this particular enzyme in fact is involved in cutting up rna molecules and it could be that an rna molecule lies across this that that side chain sticks
into this cavity and that may allow the chemical reaction to occur to slice it in two all right so uh just just to elaborate on this point a bit more so here's a protein it's folded up into its final form and let me emphasize this is not random in any way this is a stable particular structure that it obtains and in this particular protein there's this big cavity here and it is just the right size and shape kind of lock and key like to bind to some substrate molecule uh this could be uh oh
this could be some complex sugar for example and once it's locked into here the fact that you have all of these chemical interactions on the surface and it can interact with other things here it can either split this into pieces or it could add another molecule to make it bigger so it can degrade it it can build it but the point is that it promotes chemistry and it promotes it for very specific molecules so we have 25,000 genes 95 percent of those are making 25,000 different kinds of proteins which can bind to 25,000
different kinds of small molecules all the amino acids vitamins nucleotides sugars lipids etc that are required to make a cell and some of these are degrading some of these are synthesizing kind of all the set of what we need to make all the chemical reactions to make a cell grow okay now just a couple of final points one is what makes this all go how does this substrate the substrate is the chemical that's going to be changed in this reaction how does it find this pocket here what actually
makes the whole thing go and the answer is that it is just random collisions so this is in water and if this were a sugar molecule for example you will have millions of them banging around in the water and they're going to bounce off of this enzyme but every once in a while bouncing randomly one is going to bang into this pocket and it's going to have a size and shape that makes it fit precisely in there and it will stick it will allow the chemistry to occur so things that are at moderate levels in
our cells sort of millimolar concentration in the cell this might be like glucose in your in your cells they will encounter an enzyme uh just by this random collision the chance that a sugar molecule at this concentration will hit any particular enzyme to which it has a binding site that chance is that it will occur a million times every second so enzymes and substrates don't have to find each other they're just banging around kind of doing this testing do i fit here do i fit here do i fit here at super high rates kind of
higher rates than we can conceive of and at our body temperature if it's at a typical concentration this is going to happen a million times per second which means that in principle this enzyme could do a million chemical reactions in a second and there are enzymes that work almost that fast it's called catalytic perfection one example is that the water molecules in your cell every once in a while make peroxide hydrogen peroxide that's a really toxic chemical you know bleaches your hair it reacts you don't want a lot of peroxide so you have enzymes just
sitting there waiting for a hydrogen peroxide molecule to bump into it then and as soon as that does they combine it with water and basically destroy the peroxide make it go back to water again eventually and those enzymes work near to catalytic perfection they can catalyze reactions up to a million per second at these at this concentration okay so to give a kind of sense of this this is a cartoon that was done on a on a super computer to look at a molecular reaction with real size uh to see what really happens so in
this case this is the protein back here and they've kind of split it into so we're looking at the inside of a protein this protein normally sits in a membrane so it's in a barrier here and its function is to allow water molecules see there's h2o two hydrogens and an oxygen that's water molecule of water it allows water molecules to move in from one side of a membrane to another by going through there's a cavity in the protein here and then there's a couple of amino acids that form a gate right in the middle and
they take a water molecule and allow it to pass through here and go inside the cell you know you want to keep the good things in and the bad things out so you need a hole that will allow water to go through but you don't want salt to go through you don't want sugars to go through you just want water and so it has this gate that is just the right size and structure for only water molecules so how long does it take a water molecule to go through well this is the reactions that occur
in i think this is like a picosecond of time and they've colored one of the water molecules yellow here so you can see it so this is just a just like a millionth of a second watching a water molecule go through and it makes this point that it's just all random collisions it's all bouncing around in here you can see our our cavity and here is our yellow one it's kind of kind of pushing its way through it gets to the key gate part here and it binds to this amino acid and then that amino
acid allows it to kind of bind here and then by kind of random collision it sort of works its way into the out part but whoops it went back but then finally it goes down again and goes back but eventually it gets water molecules in between it and it escapes to the other side so you know the beauty of this is that as complex and as sophisticated as all of this chemistry is it's ultimately just driven by chaos just driven by this random collision it's something that's rather difficult to get your head around okay so
now that you've seen all of the details let's take another here is the cell the basic unit it's a big picture here so we saw that uh okay we're gonna zoom into our cell so we've got the outer membrane of the cell we've got all of the various organelles and then we zip into the nucleus of the cell through one of these pores in the membrane in there we have the dna which is sometimes compressed into chromosomes but more frequently is unraveled so that the proteins can be bouncing along there looking for binding sites that
indicate the start of genes that if the proteins are available to mark the start of genes and that might be signaled by some something you ate outside your cell it will bring in the enzyme that will make a copy of a gene transcribe it into an rna molecule so the four letter code will be assembled in the new rna and that rna will then leave the nucleus okay this is the messenger rna and encodes for a protein a sequence of amino acids okay it get i told you about the splicing reaction that has to occur
for most of our genes anyway goes out of the nucleus encounters the ribosome which is the machine that can read this code it assembles on the ribosome then these trna molecules the carriers of the amino acids come in and they do the translation process they say this amino acid should be inserted wherever i see the code aat for example and so it will bring the right amino acid to the right place and then the ribosome will attach those amino acids that are brought in to make this chain this molecular chain that's actually fed through a
channel that exists in the ribosome at the top of the ribosome and the chain of amino acids 20 different kinds of amino acids but in a specific sequence they will interact and bind to each other until they fold up into a protein with a very specific shape and on the surface of this protein are binding sites for other chemicals in the cell and that's what drives all the chemistry of life so we go back here to our where we started with looking at all of this molecular complexity here's inside the nucleus the dna getting transcribed
into rna the rna going out here the rna binding the ribosomes these are the ribosomes making new proteins and you kind of fill the cell with all of these different kinds of proteins okay so uh let me just mention this and then i'll open this up to questions so the course they're going to be three more lectures at the end i'm going to kind of digging deeper into dna i mean how how is it replicated what's the stuff that's not genes uh what what how big are genes and what's this splicing all about maybe tell
you a little bit about you know how do all these new dna sequencing technologies work that you know allow us to you know go out and collect bird poop in the woods and be able to identify individual birds not just species of birds but any individual birds how how in the world can we do this with dna et cetera lots of interesting dna things to cover um our next speaker is going to be josh arribere he's a rather new young assistant professor i think he's in his second or third year here uh came from
stanford in fact he was in a nobel prize winners lab at stanford and he is looking at the question of how do cells maintain quality control because you know if i'm saying so much of this is done by random collisions aren't there a lot of mistakes made every once in a while and the answer is yes there are mistakes made but cells have figured out ways to detect good proteins from bad proteins good message from bad message and how do they do this josh in his experimental methods is also uh heavily relying on the new
crispr technology and he may be able to tell us a little bit about crispr and gene editing and then uh in this era of covid i thought ah i've got to get martha to talk to us martha has talked to this group a couple years back she she is an immunologist and so one of the most knowledgeable people on campus to understand exactly how our immune system deals with virus infections and then the fact that many of the most serious cases of covid infection the the serious diseases and the deaths are often caused not
by damage from the virus itself but by our immune system overreacting and she's going to talk about the pathology that comes about from overreaction of the immune system so those should all be interesting so i i hope i will see all of you uh for the final three lectures all right so with that i open this to uh to questions now just unmute yourself if you you have a question i have one sure um i'm just wondering about the location of the spliceosome uh is that in the nucleus and then before the rna comes
out it's all done in the nucleus it is it's all within the nucleus and it's uh it's a really you know for all i've shown these very complex pathways transcription translation etc removing those introns is even more complex and involves more proteins than any of the others it's a bizarre thing and we don't fully understand to this day why it happens i'll talk a little bit about that in my digging deeper into dna raise your hand uh okay uh linda and gordon all right i was wondering where the uh trna comes from that the ribosomes
are busy working with so in the dna some of the genes encode dna that specifies the sequence of nucleotides in the trna so in fact there are a couple hundred genes in our bodies that encode different trnas so you know you've got 64 codons that have to be read it turns out that you don't make the full set of 64 different trnas but we make about 40 different kinds but they're genes just like protein yeah uh okay other questions uh you have a question uh this is kitty i don't know you can't see
me i think yeah so um so did you say that 90 percent of the um genetic material of the dna is not genes and a lot of it's old viruses that invaded our bodies that is correct it's okay i just want to make sure you got that number right that's yeah yeah i mean it you know originally a lot of this was called junk and a lot of it we think still is junk dna uh however one of the big things that's happened in the last uh probably going on 10
years now is that we know that a lot of the dna is converted to rna even though it doesn't code for a gene some of that just gets degraded it's as though the machinery the copying machinery is just making a lot of mistakes but other parts of it function to regulate um which genes are turned on and which genes are turned off you know we we really kind of understand a lot of the the core processes pretty well but the really challenging thing is what's on and what's off and is it on a lot or on
a little you know you think of this challenge you've got 25 000 different kinds of genes you know the cells the cells the genes to make red blood cells and muscles those genes are present in the cells in your brain but you don't want them making muscles and red blood cells when they're supposed to be making neurons and so they've got to turn on just the right set of genes at the right time and the right amount and this some of this junk dna turns out to be involved in regulating what gets turned on when
and how much okay so we don't really know like the percentage of that that's not old viruses that attacked us yeah but but kind of kind of you know about five percent is genes about 15 percent is viruses and 80 percent of it is something else okay great and then just this i could just look this up but just for clarity on the amino acids um so how many are uh not reproducible by humans uh 10 basically 10 of the 20 yeah okay yeah okay thanks i'd like to ask another one um there's a um remnant
of a gene that used to be able to make uh in earlier animals could make um vitamin c and i think it's in all the primates they have the same defect in the in the vitamin c synthesis i don't know how many there are maybe you do just genes that are no longer functional that they're still there but don't serve a purpose anymore there are a lot they call them pseudogenes because they have start and stop signals uh and sometimes uh they have uh some of them are made into rna and then it just gets
degraded uh often because it it makes a a nonsense protein a typical thing is a pseudogene if you compare the sequence it will look like you know 98 percent identical to a functional gene in another organism but in humans there have been stop codons taas tags inserted in there so it won't actually make a functional protein but there there are a lot of those lois you had a question i have a comment i um what i think you know this is all so fascinating but i'm sure what is fascinating is how people men and
women have figured this all out i mentioned that i would love a course on the history of molecular biology i think it would be so interesting we all read that early book on you know finding the the helix or whatever the double helix probably yeah jim watson yeah so much more and i think it would be really fun and interesting you know what's happened at ucsc but what happened you know many years ago um well i i i like that idea and it's something i've been interested in a long time and i i have i
have read a lot of that in fact i i thought at one point that would be angle on that is uh going through uh science biographies i mean just some really fascinating things in here yeah we should think about that that's that's a good idea okay doesn't some other people here what would you like to hear the history of how it all happened yeah louise you had your hand up yeah uh i want to agree with that and uh supplement what she just said i think what would be interesting as a part of the history
is how scientists um model the what they cannot see how do they work with these things that cannot even be seen and yet they find evidence that teaches them how to draw models that would try to explain relationships between all of these unseen things tied into the history uh it's part of the scientific method and i'd love to see that okay i know i like that idea too okay uh louise you had a question yeah i think the idea of your teaching more is a real fine one i'll sign up um but my question when
you talk about things that the parts that don't have a purpose um i think about the body parts we used to think didn't have a purpose but it just we didn't know it and i'm wondering um what you think might be the um the percentage of things that actually have a purpose we just don't know you mean at this this 80 percent etc well we slowly are figuring out that some of that stuff we thought was junk does have a role but i would say most of it is still continuing to be mysterious uh
and and it is you know a dramatic result of of uh evolution is that is we carry so much of our old history with us in not just the successful parts but a lot of the garbage we carry with us too at a molecular level uh interesting interesting problem yeah uh let's see constantine uh you're muted contrast right um so to add more complexity to all this i've read some research that um scientists have been dabbling into adding more letters to the chain are you familiar with some of that research that's actually being done
a little bit i yeah in fact i think you sent me some links ah okay yeah yeah so what it it's interesting so in principle they're saying there's no reason why we we couldn't add some new base pairs in a kind of an artificial system and it does work and it might it might allow kind of the synthesis of products that are not made naturally so you could have because as as good as as organic chemists are there are still lots of molecules that are just impossible or impractical to make in the lab but organisms
can make them uh jack i have a question you've presented the these with the connection sorry i see a gene but how but we're each of us different and our genes are different in some way could you indicate where the different why one is different from another well i am different from you where does that uh show up okay all right well it's um you know it it it's fairly complicated at you know if we went into the details but but the simple overview is that we we all have the kind of same core set
in the same order and the actual sequence of nucleotides is is just like 99.9 percent identical all right but there are variants so you know this this gets you back to this sort of basic you know gregor mendel and genetics in that you have tall plants and short plants and if you look at the details this is often due to just a single change one nucleotide in 3,000 that can exist as an a in one organism and a g in another and that will affect one amino acid in the protein and that may make
the protein work a little better or a little worse and so you'll make a little more of a product or a little less of a product we call those alleles they're variants of a given gene and so so there are these mutations that uh occur out in there that in fact is the basis of you know all of this genetic ancestry uh mapping of your dna what they're looking for there of course are the differences but they're really subtle they're kind of like 1 in 1,000 base sequences and if you make a change in the
dna most of the time it will be neutral and some of the time it'll make a change but not necessarily a bad one and just make a little more or less of something and that's that's where you get all this difference now then on top of that you have the the thing that like a simple difference of hair color is not determined by one gene so you might have a gene decision make red pigment don't make red pigment make black pigment don't make black pigment make high levels make low levels so you could have five
different genes that affect it and so you could really see it as like having a palette of different colors of paint and with just five variations you can make 50 different tones and and so that's the other way we vary so fundamentally it's that for any given gene there are many different minor variants and then for any given character there are multiple genes that are involved and so it's a it's a kind of exponential opportunities for complexity to occur okay uh oh marjorie uh you're muted marjorie yeah okay unmuted can yes i was
given uh a supplement of uh um oh dear um l-lysine when i had shingles to stop its movement what is l what is lysine what kind of amino acid and what does it do so it's one of the 20. lysine is an important one they're all important of course but it has a charge it's a plus charge on it so it's an example of if you you change the code for a lysine you're more likely to really affect the shape of a protein than if you change the code for another one but as
far as what it does there are so many possibilities that you know you're looking at something that is in every protein in your body and so it's uh it it's not going to be at this genetic level it's somehow going to affect the you know the the metabolism of something that uses lysine i don't know i wonder i wonder if they know in fact that this is just experience then it has the effect yeah oh that's a really complicated one uh okay uh pat and you're you're muted also and okay oh okay we didn't have
a question all right any all right i think we finished out all right time for dancing in the street folks get out there [Laughter] all right so we'll see you all uh next week and it'll be josh arribere and he's a sharpie so it'll it'll be good thank you gary okay
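
[editor's sketch, not part of the spoken lecture] The last step of the central dogma described in the lecture (the rna copy where t becomes u, the ribosome starting at the atg/aug start signal, reading three letters at a time, and stopping at a stop codon such as taa or tag) can be illustrated with a short Python sketch. This is only a toy model under stated assumptions: it uses the standard genetic code, it ignores splicing and all of the real machinery, and the function names `transcribe` and `translate` and the example dna string are made up for illustration.

```python
from itertools import product

# Standard genetic code written compactly: codons are enumerated in
# U, C, A, G order (first letter varies slowest) and each position in
# AMINO_ACIDS is the one-letter amino acid for that codon; "*" marks a stop.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(dna_coding_strand: str) -> str:
    """Toy transcription: the rna copy has the same letters, with U in place of T."""
    return dna_coding_strand.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Toy translation: start at the first AUG, read triplets, stop at a stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start signal found, so nothing is made
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":  # stop codon (UAA, UAG or UGA)
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    dna = "CCATGTTCAAATAAGG"  # hypothetical example sequence
    mrna = transcribe(dna)
    print(mrna)
    print(translate(mrna))
```

In this sketch the ttc from the lecture becomes uuc in the rna and is read as F (phenylalanine). The table has 64 entries, matching the 64 codons mentioned in the question period.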