A New Gold Standard For Solving Genomes


A research collaboration between the University of Adelaide, the US Department of Agriculture, and the National Institutes of Health has led to a new method for creating complete DNA sequences of the chromosomes inherited from the mother and the father.

Disentangling the DNA sequences inherited from the mother and the father in diploid organisms such as humans or cattle has been a difficult problem. The Human Genome Project was initiated in 1990 and, after a few billion dollars had been spent on it, was considered done, but the genome still contained many gaps. Subsequent efforts to finish the genome (i.e. close the gaps) by over 1000 researchers from more than 10 institutions greatly improved the genome, but it still contained hundreds of gaps. This result was published in a Nature paper in 2004. Besides gaps, there is the problem that the human reference genome is not a true diploid representation. A human genome with a normal karyotype should have 46 chromosomes, but you will find that each pair of autosomes is represented by just one copy.

A haploid set of chromosomes. Source: ensembl.org

If you have ever seen the human genome represented as above, did you know that this is because we did not have a true diploid assembly? Why is that? It turns out that genome assemblers usually treat the task as if it were the assembly of a haploid genome. Accounting for the DNA polymorphisms that exist within an individual, because the two copies of each chromosome inherited from the father and the mother are not identical, is a challenging problem.

In an article published in Nature Communications, the researchers showed that diploid genomes can be completely decoded using an innovative assembly strategy. As a proof of principle, the work was done using cattle, but the assembly method is applicable to other diploid organisms including human.

The researchers showed that the genomes of two important modern-day cattle subspecies, Bos taurus taurus (Angus) and Bos taurus indicus (Brahman), can be deciphered from a single hybrid offspring of an Angus x Brahman cross. Previously, in a paper in Nature Biotechnology, the same group of researchers demonstrated that the assembly strategy worked at the contig level. The current work extends the contig-level assembly to complete diploid genomes using some of the latest scaffolding technologies.

Trio Binning Technique

Dr Lloyd Low from the University of Adelaide’s Davies Research Centre says the technique, called trio binning, produces a true representation of diploid genomes.

“It has been difficult to obtain a true diploid genome without using painstaking cloning procedures. The trio binning technique requires sequencing the father, the mother and their offspring. The DNA sequences from the offspring will then be categorised as originating from the paternal or maternal chromosomes. About 99% of the sequencing reads from the offspring can be successfully categorised to the right parent in this study,” Low said.
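The categorisation step can be illustrated with a toy sketch. It assumes that reads are sorted using short subsequences (k-mers) found in only one parent, which is how trio binning is commonly described; the article itself does not spell out the mechanism. The function names, the tiny sequences and the value of k below are all hypothetical, and the sketch ignores reverse complements, sequencing errors and score normalisation that a real tool would need to handle.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def parent_specific_kmers(paternal_reads, maternal_reads, k):
    """Return (father-only k-mers, mother-only k-mers)."""
    pat = set().union(*(kmers(r, k) for r in paternal_reads))
    mat = set().union(*(kmers(r, k) for r in maternal_reads))
    return pat - mat, mat - pat


def bin_read(read, pat_only, mat_only, k):
    """Assign one offspring read to the parent whose private k-mers it shares more of."""
    read_kmers = kmers(read, k)
    pat_hits = len(read_kmers & pat_only)
    mat_hits = len(read_kmers & mat_only)
    if pat_hits > mat_hits:
        return "paternal"
    if mat_hits > pat_hits:
        return "maternal"
    return "unassigned"  # no informative k-mers, or a tie


# Toy data (assumed): the parents differ at a single base.
K = 4
paternal_reads = ["ACGTACGTGA"]
maternal_reads = ["ACGTTCGTGA"]
pat_only, mat_only = parent_specific_kmers(paternal_reads, maternal_reads, K)

offspring_reads = ["GTACGTG", "GTTCGTG", "CGTGA"]
labels = [bin_read(r, pat_only, mat_only, K) for r in offspring_reads]
for read, label in zip(offspring_reads, labels):
    print(read, label)
```

Reads that land in the "unassigned" bin carry no parent-specific k-mers, so the more the two parents differ, the larger the share of offspring reads that can be sorted.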

The trio binning method has now been tested on a number of species including human, cattle and yak. Besides trio binning, other sequencing and assembly techniques that tackle this problem are on the horizon, such as FALCON-Phase and HiFi PacBio/Strand-seq. FALCON-Phase appears to depend more on heterozygosity level than trio binning does for correct phasing, whereas HiFi PacBio/Strand-seq requires the use of cell culture.

Angus and Brahman Cattle Genomes Decoded Using New Assembly Strategy

A Black Angus bull.

The cattle assemblies produced with the method represent the best livestock genomes to date. The genomes of two different cattle breeds, Angus and Brahman, were decoded at once using this new assembly strategy. Angus cattle are highly valued for their meat quality and reach puberty early, both of which are important production traits. This breed belongs to Bos taurus taurus, the same subspecies as the animal sequenced in the original cattle genome project, which was of Hereford origin. The Hereford genome has also recently been upgraded using long-read sequencing technology, but it remains a composite of two haplotypes rather than a true diploid genome.

Bos taurus indicus. Source: José Reynaldo da Fonseca CC BY 2.5

The work presents the first high-quality genome of Bos taurus indicus, or Indian breed cattle. It is of great importance to those working on improving cattle traits, since indicine cattle are well known for their heat tolerance and their disease and drought resistance, and they cope well with parasites such as ticks.

The cattle breed representing Bos taurus indicus in this project is the Brahman. Given the contrasting traits of Angus and Brahman cattle, a comparison of the two genomes should reveal some of the important genetic regions responsible for their phenotypic differences. The researchers found many differences between the two genomes, including SNPs, small INDELs and larger structural variants. One interesting genetic difference is an indicus-specific extra copy of fatty acid desaturase, which may be important for regulating metabolism related to heat tolerance.

The Pranava Mantra (Om/Aum).

Origin of Brahman Cattle

Brahman cattle have an interesting history because the breed was created in the US, from 1854 onwards, by mixing the genetics of four Indian cattle breeds: Gir, Guzera, Indu-Brasil and Ongole. This is one account; another source describes the breed's origin differently.

According to PASUTHAI, three principal breeds (Guzerat, Nellore and Gir) were brought into the US to develop the Brahman cattle. A Krishna Valley strain was also used, but to a lesser extent. The breed was developed around the time of World War I, and while records of its breeding program are available, some assumptions need to be made about the exact link between Indian cattle and Brahman because mixing with European cattle genetics occurred.

To digress a little and to finish, the word Brahman itself also has an interesting meaning because it means the highest Universal Principle, the Ultimate Reality in the universe. In Hinduism, which is the largest religion in India, Brahman is the material, efficient, formal and final cause of all that exists.

Cows are considered sacred in India because Hindus respect them and do not eat them. Hindus in India did not name any cattle Brahman, even though the concept of Brahman originates with them and they revere and worship cows.
