Sanger DNA Sequencing, From Then to Now.


ClevaLab

This video explores the basics of Sanger sequencing and the fascinating history behind this groundbr...

Video Transcript:

ClevaLab. In 1977 Frederick Sanger described a method of DNA sequencing using chain-terminating inhibitors. The aim was to find out the sequence of nucleotides in a piece of DNA.

This method became known as Sanger sequencing. These chain-terminating inhibitors are also called ddNTPs. DNA is made up of a chain of four different nucleotides called dNTPs.

To copy DNA and grow the DNA double strand, DNA polymerase adds the complementary nucleotide. dNTP stands for deoxyribonucleoside triphosphate.

A closer look at its structure shows that a dNTP is one deoxyribose, a base and a triphosphate. A nucleoside is a ribose sugar and base together. The base is one of four bases, Guanine (G), Cytosine (C), Thymine (T) or Adenine (A).

The sugar is deoxyribose because it has one less oxygen than ribose. ddNTP is short for dideoxyribonucleoside triphosphate. A ddNTP has two oxygens less than ribose, as di- means two.

The role of DNA polymerase is to add new bases to a growing DNA strand. It does this by catalyzing a chemical reaction. The incoming dNTP's phosphate group reacts with the bound dNTP's ribose oxygen.

This results in the release of two phosphate groups and the addition of dNTP to the strand. But, if a ddNTP gets added to the strand, there is no ribose oxygen to add another dNTP. This lack of oxygen terminates the DNA chain.
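The termination rule can be sketched in a few lines of Python. This is only a toy model, not real chemistry: the `Nucleotide` class, its `has_3prime_oxygen` flag and the `extend` function are names made up for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Nucleotide:
    base: str
    has_3prime_oxygen: bool  # True for a dNTP, False for a ddNTP (toy assumption)


def extend(strand: list[Nucleotide], incoming: Nucleotide) -> bool:
    """Try to add a nucleotide; return False if the chain is already terminated."""
    if strand and not strand[-1].has_3prime_oxygen:
        # The last nucleotide is a ddNTP, so there is no oxygen to react with.
        return False
    strand.append(incoming)
    return True


strand: list[Nucleotide] = []
extend(strand, Nucleotide("T", True))    # a normal dTTP
extend(strand, Nucleotide("G", False))   # a ddGTP, which terminates the chain
added = extend(strand, Nucleotide("C", True))
print(added, "".join(n.base for n in strand))
```

Once the ddGTP is in place, the attempt to add another dNTP fails and the fragment stays at two bases.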

It also makes sense to mention here the naming conventions 5' and 3'. 5' and 3' refer to the positions of the carbon atoms in the deoxyribose of dNTP. They're numbered from the carbon linked to the base to the phosphate.

The oxygen needed to add new dNTPs to the DNA strand is bound to the 3' carbon. So it's common to say that the DNA extends from the 3' end. The other sticky part of the dNTP is the triphosphate.

The triphosphate is bound to the 5' carbon. This end of the dNTP is the start, and the 3' end is the finish. When you write down a sequence of DNA, the order of nucleotides is always in the 5' to 3' direction.

Also, note that the DNA polymerase only adds a complementary base to the template DNA. So, C always pairs with G and A always with T. So how does Sanger sequencing work?
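As a small illustration of the pairing rule and the 5' to 3' writing convention, here is a Python sketch. The template sequence and the function names are assumptions chosen for the example.

```python
# Watson-Crick pairing: A with T, C with G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(seq: str) -> str:
    """Return the base-by-base complement, in the same left-to-right order."""
    return "".join(PAIRS[b] for b in seq.upper())


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, written in its own 5' to 3' direction."""
    return complement(seq)[::-1]


template = "TGGCATGCA"  # hypothetical template, written 5' to 3'
new_strand = reverse_complement(template)
print(new_strand)  # TGCATGCCA
```

The reversal is needed because the two strands run in opposite directions, so the new strand's 5' end pairs with the template's 3' end.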

The original Sanger sequencing method is different from the one used today. The original method was completely manual and used radioactive dyes. Let's take a look at the original Sanger sequencing method.

We need a primer, DNA polymerase, dNTPs, DNA template and ddNTPs. One of the dNTPs, dATP, is labelled with a radioactive tag. A total of four tubes, one for each ddNTP, are used.

To start, the DNA, primer and buffer are heated to 100 degrees Celsius. This separates the DNA into single strands. Remember, this was before PCR existed.

Heating up regular DNA polymerase inactivates it. So, it gets added later. Next, the mixture cools to 67 degrees Celsius to allow the sequencing primers to bind.

Now we add DNA polymerase, all four dNTPs and one of the four ddNTPs to each tube. DNA polymerase extends the DNA template. A ddNTP incorporates into the strand, terminating the fragment.

The ddNTP is at a lower concentration than the dNTPs, so this incorporation is random. The result is a termination at each base, creating different-length fragments. All fragments in each tube start with the same primer sequence and end in the same nucleotide.
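The four-tube idea can be sketched in Python. This is a simplified model: the example strand, the function names and the per-site termination probability `p_dd` are assumptions for illustration, not values from the original method.

```python
def terminated_fragments(new_strand: str, dd_base: str) -> list[str]:
    """All fragments that can form in the tube containing the given ddNTP.

    A fragment can stop at any position where the new strand has dd_base,
    because a ddNTP may be incorporated there in place of the dNTP.
    """
    return [new_strand[: i + 1] for i, b in enumerate(new_strand) if b == dd_base]


def fraction_stopping_at(site_index: int, p_dd: float) -> float:
    """Fraction of strands stopping at the n-th matching site (1-based).

    Assumes each matching site terminates independently with probability p_dd.
    """
    return (1 - p_dd) ** (site_index - 1) * p_dd


new_strand = "TGCATGCCA"  # hypothetical bases added after the primer
tubes = {base: terminated_fragments(new_strand, base) for base in "ACGT"}
for base, frags in tubes.items():
    print(f"dd{base}TP tube: fragment lengths {[len(f) for f in frags]}")
```

Under this model a small `p_dd` means most strands pass the early sites, so enough fragments survive to end at sites further along the template.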

Low incorporation of the ddNTP allows the sequencing of longer stretches of DNA. In the original Sanger method, sequencing of up to 200 nucleotides was possible. Next, each of the four sequencing reactions gets mixed with a loading dye.

Each reaction is loaded in a separate lane of a polyacrylamide gel. The fragments move through the gel at different speeds depending on their size. The smallest moves the fastest.

This type of gel can differentiate a single nucleotide difference in length. At this stage, the fragments can't be seen. The loading dye tells you when the fragments have reached the end of the gel.

To visualize the fragments, the sequencing gel gets dried onto a paper support. Then, the radiation from the dATPs in the fragments gets detected with X-ray film. This results in bands showing for each fragment.

The term used for reading a DNA sequence is "base calling". The DNA is read from 5' to 3' to call the bases. So we start with the shortest fragment first.

In this case, it's in the lane with a ddTTP, so the first nucleotide is a "T". The next shortest is in the ddGTP lane and thus is a "G". You continue up the gel based on size to read the whole sequence.

So, on this gel, it would read TGCATGCCA. The original Sanger sequencing method was very labour-intensive. It also took four days to sequence 200 nucleotides from only a few samples.
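Reading the gel amounts to sorting the bands by fragment length and noting which lane each band is in. A minimal Python sketch, where the band lengths are assumed values chosen to match the sequence read above:

```python
def read_gel(lanes: dict[str, list[int]]) -> str:
    """Call bases from shortest to longest fragment across all four lanes."""
    bands = sorted(
        (length, base) for base, lengths in lanes.items() for length in lengths
    )
    return "".join(base for _, base in bands)


# Hypothetical band positions: fragment lengths (after the primer) in each lane.
lanes = {
    "A": [4, 9],
    "C": [3, 7, 8],
    "G": [2, 6],
    "T": [1, 5],
}
print(read_gel(lanes))  # TGCATGCCA
```

The shortest fragment gives the base nearest the primer, which is why the read runs 5' to 3'.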

There was a great need to streamline and automate this process. Applied Biosystems created the first commercial sequencing instrument in 1987, the AB370A. Applied Biosystems had already shown that fluorescent dyes could replace radioactive dyes.

These are safer and cut out the time needed for X-ray film detection, which took several days. In this instrument, the sequencing reaction had fluorescent sequencing primers. A different coloured fluorescent dye labelled each of the four ddNTP reactions.

After sequencing, the four reactions could be mixed together and loaded in the same lane of the gel. The AB370A also had a laser that scanned the bottom of the gel. This laser detected the fragments as they passed by.

The instrument fed the data into a computer to call the bases automatically. Sixteen samples could be run on one gel with a read length of 450 nucleotides. The AB370A showed that sequencing could be faster and more automated.

Scientists started to think sequencing the whole human genome could be within reach. In 1990 the U.S. government announced the Human Genome Project.

This project aimed to map and sequence all the genes in the human genome. By 1990, less than 2% of the human genome had been sequenced.

Sequencing the human genome would have important implications for science and medicine, particularly in identifying disease-causing and disease-associated genes to treat genetic disease. Kary Mullis invented PCR in 1983.

It wasn't until 1989 that Vincent Murray used Taq polymerase for Sanger sequencing. In Sanger sequencing, the primer binds to the DNA, and the DNA polymerase extends the fragment.

But, as the primer is in excess, most of the labelled sequencing primers are not extended by DNA polymerase. With Taq polymerase, the DNA can be melted apart after the first extension. Taq polymerase will survive this high heat.

It can then be cooled again to anneal another sequencing primer. These cycles of melting, annealing and extension repeat the same as in PCR. Many more primers get incorporated into the fragments, increasing the fluorescent signal.

But, as there is only one primer, only extra forward strands are made, and no reverse strands. So the number of fragments increases by the same amount each cycle.

This increase is linear over the cycles and is called linear PCR. The method was later termed cycle sequencing. The higher fluorescent signal also meant that less DNA was needed for each reaction.
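The difference between linear and exponential growth is easy to see with a little arithmetic. This Python sketch assumes ideal efficiency in every cycle; the starting template count and cycle numbers are arbitrary example values.

```python
def cycle_sequencing_fragments(templates: int, cycles: int) -> int:
    """One primer: each template yields one new fragment per cycle (linear)."""
    return templates * cycles


def pcr_copies(templates: int, cycles: int) -> int:
    """Two primers, ideal efficiency: the copy number doubles each cycle."""
    return templates * 2 ** cycles


for n in (1, 5, 25):
    print(n, cycle_sequencing_fragments(1000, n), pcr_copies(1000, n))
```

With one primer the new fragments cannot act as templates themselves, so the count only grows by a fixed amount per cycle.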

Another important advance was in capillary electrophoresis. This is where a small amount of gel is in a fine tube. The DNA is taken in one end, runs through the gel under an electric current, and gets detected by a laser at the other end.

The fine tube used in capillary electrophoresis allows heat to escape. A higher current can be used without the gel overheating. Higher currents mean a faster run time and better resolution.

Beckman Coulter launched the first commercial capillary electrophoresis instrument in 1989. This paved the way for the development of a capillary-based Sanger sequencing system, the ABI PRISM 310. Applied Biosystems launched this system in 1995, and modern Sanger sequencing was born.

The ABI PRISM 310 had one capillary for electrophoresis in place of a PAGE gel. One sample could be run in under three hours compared to 14 hours. The sequencing length was also improved and could now sequence up to 600 base pairs.

The capillary also allowed automation of the sample loading. Up to 96 samples could be loaded in a plate on the system and left to run on its own. Thanks to electrokinetic injection, only low sample volumes and small amounts of DNA are needed.

This is because DNA is pulled into the capillary by an electrical current. The current concentrates it at the end of the capillary. The capillary then moves into a running buffer.

Fragments pass through the gel and separate based on size. Then the fragments pass by a laser at the end of the capillary. The size and colour of the fragments get sent to a computer.
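As a rough illustration of what a computer can do with that data, here is a very simplified base caller in Python. It assumes the peaks have already been found and that each peak carries one intensity per dye channel; the intensity values and the function name are made up for the example. Real base-calling software also handles peak detection, spacing and quality scoring.

```python
def call_bases(peaks: list[dict[str, float]]) -> str:
    """Call one base per peak by picking the channel with the strongest signal."""
    return "".join(max(peak, key=peak.get) for peak in peaks)


# Hypothetical peaks in order of arrival at the laser (shortest fragment first).
peaks = [
    {"A": 12.0, "C": 8.0, "G": 15.0, "T": 410.0},
    {"A": 9.0, "C": 14.0, "G": 380.0, "T": 20.0},
    {"A": 11.0, "C": 395.0, "G": 7.0, "T": 16.0},
    {"A": 402.0, "C": 10.0, "G": 13.0, "T": 9.0},
]
print(call_bases(peaks))  # TGCA
```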

The software then detects and calls the bases. While fluorescent ddNTPs were available, sequencing was still performed with fluorescent primers.

This was because the peak heights were very even with fluorescent primers. Labelled ddNTPs couldn't achieve this even peak height until the introduction of BigDye Terminators in 1997.

With fluorescent primers, four reactions are needed. But, with fluorescent ddNTPs, the sequencing reactions can all be in the same tube. Applied Biosystems continued to improve its system.

Demand continued to grow for the automation of Sanger sequencing. The Human Genome Project was making slow progress. By 1998, only 6% of the human genome was sequenced.

It was in this year that Applied Biosystems launched the ABI PRISM 3700, which had 96 capillaries. At the same time, they announced a partnership with The Institute for Genomic Research, also known as TIGR. TIGR was a not-for-profit institute headed by Craig Venter.

Together they formed a new company called Celera and purchased 230 ABI PRISM 3700s. Celera aimed to sequence the human genome faster than the Human Genome Project. It planned to make money selling access to its sequence data.

It also planned to patent genes that could be useful for disease treatment. Profiting from sequencing the human genome was controversial and upset many scientists. The race between public and private sequencing of the human genome had begun.

The ABI PRISM 3700 played a huge role in sequencing the human genome. Each run of 96 samples took less than 2.5 hours and generated 800 base pairs of sequence for each sample.

With only 15 minutes of hands-on time by a technician, 1,536 samples could be sequenced daily. With this instrument, the cost per base of sequencing was also reduced. With this new technology, Celera produced a draft sequence of the human genome in three years, publishing its results in 2001.

The Human Genome Project, also aided by the ABI PRISM 3700, published its draft genome at the same time in 2001. This modified Sanger sequencing method is still used today.

But why, when there are newer technologies like Next Generation Sequencing (NGS)? Let's look at how they compare. Sanger sequencing remains the gold standard for sequencing.

It is the method that all other sequencing methods are compared against. This is because it's 99.9% accurate in calling bases.

NGS is 99 to 99.9% accurate but depends on the sequencing depth. Sanger sequencing is more cost-effective for sample numbers under 20.

It's also faster for this number of samples. For large sample numbers, NGS is more cost-effective and quicker to run. But the sensitivity of Sanger sequencing to detect a base within a background of other DNA is only 15 to 20%, compared to 1% for NGS.

Sanger sequencing also has a low sample coverage of one read per sample of only 300 to 850 base pairs. In comparison, NGS can generate billions of reads and up to 16 Tb of data, so much that 128 human genomes can be sequenced in one run.

So if you have fewer than 20 samples or genes you'd like to sequence, Sanger sequencing is still the method of choice.