A person's genome is their instruction book. It contains some 3,096 million base pairs, or chemical letters. Every detail, from hair color to susceptibility to disease, is written in a long DNA sequence spread across 23 pairs of chromosomes in every cell except the sex cells. The first draft of the human genome, published 21 years ago, covered 92% of it. Today, the Telomere-to-Telomere (T2T) Consortium, an international group of scientists led by the United States National Institutes of Health, presents the remaining 8% in the journal Science. It is a breakthrough for understanding the origin of disease.

Celera Genomics and the Human Genome Project published the first drafts of our instruction book in February 2001. They reported that we have only about 30,000 genes, far fewer than the 70,000 to 140,000 previously thought possible. Over time, the number of confirmed genes was revised down to 20,465. Since then, the first draft has been refined and has been used to genetically diagnose patients whose conditions had previously gone undiagnosed. It also helped to determine that there are at most 4,400 genetic diseases, according to Lluís Montoliu, a researcher at CNB-CSIC who was not part of the new study.

Genes are the units of biological inheritance. They are distributed among the chromosomes, which are tightly packed strands of DNA shaped like twisted ladders. Each rung of the ladder is a pair formed from two of the four types of nucleotide bases, known as A, C, G and T. These chemical letters are the genetic alphabet in which genes are written.
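As a rough illustration of the ladder image, the sketch below builds the rungs for a short strand of letters. It relies on the standard pairing rule (A with T, C with G); the function name `rungs` and the sample strand are made up for this example.

```python
# Standard DNA base pairing: A pairs with T, C pairs with G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def rungs(strand: str) -> list[tuple[str, str]]:
    """Return the ladder rungs (base, partner base) for one strand of letters."""
    return [(base, PAIRS[base]) for base in strand]


# A made-up three-letter strand, purely for illustration.
for base, partner in rungs("GAT"):
    print(f"{base}-{partner}")
```

Because each letter fixes its partner, reading one side of the ladder is enough to know the other.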

Reading a genome is difficult. Scientists first cut it into fragments hundreds to thousands of letters long. Sequencing machines then read every letter of each fragment, and scientists try to put the fragments back together in the correct order. The 2001 draft left out 8% of the DNA: hard-to-read regions in the middle (the centromere) and at the ends (the telomeres) of chromosomes, made up of long repeated sequences. Scientists had been unable to resolve them.
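A toy sketch can show why repeats are so troublesome. It assumes error-free fragments and an invented 36-letter "chromosome" with a repetitive block in the middle; the function name `ambiguous_fraction` is hypothetical, and real assembly software is far more elaborate. The idea is simply to count how many fragments of a given length could fit in more than one place.

```python
from collections import Counter


def ambiguous_fraction(genome: str, read_len: int) -> float:
    """Fraction of error-free fragments of this length that occur at more than one position."""
    reads = [genome[i:i + read_len] for i in range(len(genome) - read_len + 1)]
    counts = Counter(reads)
    ambiguous = sum(1 for read in reads if counts[read] > 1)
    return ambiguous / len(reads)


# Invented example: unique flanks around a centromere-like block of 10 "AT" repeats.
toy_genome = "ACGTTGCA" + "AT" * 10 + "GGCTAACG"

for read_len in (4, 8, 24):
    share = ambiguous_fraction(toy_genome, read_len)
    print(f"fragments of {read_len} letters: {share:.0%} could fit in more than one place")
```

In this toy case, fragments shorter than the repetitive block often match several positions, while fragments longer than the block always include some unique flanking letters and can be placed unambiguously.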

Megan Dennis, a biochemist at the University of California, Davis, and a co-author of the version published today in Science, acknowledges that these are “important regions” but difficult to sequence. The new study adds some 200 million base pairs in these areas and, with them, 99 genes to the human genome. About 90% of what has been added to the 2001 draft lies in the centromeres. These areas are full of repeated letters. Charles Langley, a biologist at the University of California, Davis, jokes that geneticists used to warn young colleagues not to go into the centromere because they wouldn't get out. The new version maps this narrow region of each chromosome, which separates it into a shorter and a longer arm.

Two techniques have made it possible to sequence large chunks of DNA, which makes the puzzle easier to put together. They are a significant improvement in how we read our instruction book. One, Oxford Nanopore sequencing, can read up to a million letters in a single pass, but is not very precise. The other, PacBio HiFi sequencing, reads about 20,000 letters at a time with almost perfect accuracy. “We are seeing chapters we have never seen,” says Evan Eichler, a researcher at the University of Washington.

“We have gained a great understanding of human biology and disease from knowing about 90% of the genome. But many important parts remained hidden because we didn't have the technology to see them,” says David Haussler, director of the Genomics Institute at the University of California, Santa Cruz. Now, he says, we can see the whole landscape from the top of the mountain and have a complete view of our genetic heritage.

“You might think that, with 92% of the genome already completed, the remaining 8% would not add much. But we are now gaining a new understanding of how cells divide, which allows us to study many diseases that we could not study before,” says Erich D. Jarvis of Rockefeller University, a co-author of the study who was involved in developing the sequencing techniques.

The resulting genome does not come from a person. Montoliu explains that the DNA comes from cells of “a failed embryo” affected by a rare pregnancy disorder, in which the embryo loses the genome of one parent and duplicates that of the other. The advantage is that it is easier to read, because there are two identical copies of each chromosome rather than two different ones. However, this sequence does not include the Y chromosome, the male one. Megan Dennis says that, despite its unusual origin, the sequence shows nothing out of the ordinary.