What are the genetic testing methods

Abstract: What are the methods of genetic testing? This article introduces several common DNA-level genetic testing methods, compares their advantages and disadvantages, and describes their applications in clinical diagnosis and scientific research. It is intended as a guide for postgraduate students and clinicians in their self-directed study, and as an aid to advancing clinical research and improving the level of research and teaching.

1. First-generation sequencing

1.1 Sanger sequencing (direct sequencing)

In 1977, Frederick Sanger and colleagues invented the dideoxy chain-termination method; in the same year, Allan Maxam and Walter Gilbert published a chemical degradation method. The Sanger method subsequently became the most commonly used gene sequencing technology and remained the gold standard for genetic testing for years afterward. The basic principle is that a dideoxyribonucleoside triphosphate (ddNTP) lacks the 3'-OH group required for chain extension, so whenever a ddNTP is incorporated into the growing DNA strand, extension terminates. Each sequencing run consists of four independent reactions. Each reaction contains the template, a primer, DNA polymerase, the four normal dNTPs, and one of the four ddNTPs, with radioisotope labeling so that the products can be detected. Each reaction therefore produces a large number of DNA fragments that share the same start point but end at different termination points. Fragments that differ in length by a single base are separated by denaturing polyacrylamide gel electrophoresis, and the bands are visualized by autoradiography. The base sequence of the DNA is then read from the electrophoretic bands.
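
The chain-termination principle described above can be illustrated with a small sketch. It is a simplified model, not a simulation of real chemistry: it assumes the newly synthesized strand is given 5'->3' as a string of A, C, G, and T, and that the population of molecules is large enough for termination to occur at every possible position. The function names `run_four_reactions` and `read_gel` are hypothetical.

```python
def run_four_reactions(new_strand):
    """Return, for each ddNTP reaction, the lengths of the terminated fragments.

    `new_strand` is the strand being synthesized, written 5'->3' from the
    primer. Assumption: termination happens at every position where the
    reaction's ddNTP can be incorporated, so each lane holds one fragment
    length per occurrence of that base.
    """
    lanes = {base: [] for base in "ACGT"}
    for length, base in enumerate(new_strand, start=1):
        lanes[base].append(length)
    return lanes


def read_gel(lanes):
    """Read the sequence from the shortest to the longest fragment across the lanes."""
    length_to_base = {
        length: base for base, lengths in lanes.items() for length in lengths
    }
    return "".join(length_to_base[length] for length in sorted(length_to_base))


lanes = run_four_reactions("ATGCCGTA")
print(lanes["C"])       # fragment lengths seen in the ddCTP lane
print(read_gel(lanes))  # sequence reconstructed from the band order
```

Because every fragment length appears in exactly one lane, reading the bands from shortest to longest recovers the synthesized strand one base at a time.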

The sequencing of the human genome was based on this technology. As a direct sequencing method, Sanger sequencing is highly accurate, simple, and fast. It still has high practical value for identifying disease genes in small clinical samples. For example, direct Sanger sequencing of the FGFR2 gene has been used to confirm the monogenic disorder Apert syndrome, and direct sequencing of the TCOF1 gene can detect up to 90% of the mutations associated with Treacher Collins syndrome. Note that Sanger sequencing involves designing primers for mutation sites in known causative genes and directly sequencing the PCR-amplified product. To test a single mutation site, it is sufficient to amplify the exon fragment that contains the site; there is no need to amplify every exon of the gene in which the site is located.

Therefore, it is necessary to identify the exon in which the site lies and the exact position of the site, and to design primers for an exon fragment that contains the site and extends 150-200 bp upstream and downstream of it. In addition, even with the advent of NGS, Sanger sequencing remains very economical and efficient for single-gene genetic disorders with a small, well-defined set of causative loci. To date, Sanger sequencing remains the gold standard for genetic testing and is the primary means of validating NGS findings within families and against normal controls.
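
The flanking rule above is simple coordinate arithmetic, sketched below. The function name `amplicon_window`, the 1-based inclusive coordinates, and the optional clipping to the reference length are assumptions made for illustration; real primer design also has to consider melting temperature, specificity, and repeats, which this sketch ignores.

```python
def amplicon_window(site, flank=200, seq_length=None):
    """Return 1-based inclusive (start, end) of the region to amplify.

    `site` is the position of the mutation site on the reference sequence,
    and `flank` is the number of bases to include on each side (the text
    suggests 150-200 bp). The window is clipped to the sequence bounds.
    """
    start = max(1, site - flank)
    end = site + flank
    if seq_length is not None:
        end = min(end, seq_length)
    return start, end


print(amplicon_window(1000, flank=150))
print(amplicon_window(950, flank=200, seq_length=1000))
```

Primers would then be chosen at or just inside the two ends of the returned window, so that the amplified fragment contains the site itself.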

The purpose of Sanger sequencing is to look for specific mutations associated with a disease. It is poorly suited to screening large samples when there is no clear candidate gene or when there are many candidate genes; such studies rely on the high throughput of NGS. Although Sanger sequencing is highly accurate, its accuracy depends on the sequencing instrument and on how the sequencing conditions are set up. In addition, Sanger sequencing cannot detect mutation types such as large deletions or copy number variations, so it cannot provide a genetic diagnosis for the genetic diseases caused by them.

1.2 Linkage analysis (indirect method)

Before the emergence of NGS, the common international strategy for cloning disease genes was positional candidate gene cloning based on large-scale genome-wide scanning and linkage analysis. Human chromosomes occur in pairs, one from the father and one from the mother. Each pair carries the same genes at the same locations, but the sequences are not identical; the two copies are known as the paternal and maternal alleles.

A genetic marker is a DNA sequence that is polymorphic in a population and can be used to trace the inheritance of a chromosome, a chromosome segment, or a locus through a pedigree. It is present in every individual but varies in size or sequence, and it is heritable and identifiable. The markers currently in use are the second generation of genetic markers, known as repeat sequence polymorphisms, especially short tandem repeats (also called microsatellite markers).

Linkage analysis studies the relationship between a disease-causing gene and genetic markers, based on the genetic phenomenon of linkage. If a genetic marker lies near a gene that controls a phenotypic trait, then whether the marker and the gene are linked, and how tightly, can be used to localize the gene to a chromosomal position. The log odds score (LOD score) method, proposed by Morton in 1955, is mainly used to test whether two loci are linked at a given recombination rate. A positive LOD value supports linkage, while a negative LOD value argues against it. By calculating the LOD value between a microsatellite marker and the causative locus in a pedigree, the genetic distance and the degree of linkage between the two can be estimated, which gives the rough location of the gene on the chromosome. The gene map of that chromosomal region is then used to analyze the function and expression of all genes in the localized interval, select suitable candidate genes for mutation detection, and ultimately localize or clone the pathogenic gene.
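
As a worked illustration of the LOD calculation, the sketch below handles the simplest textbook case. It assumes phase-known, fully informative meioses in which recombinants can be counted directly; real pedigree analysis uses dedicated software and more general likelihoods. The LOD score at recombination fraction theta is the base-10 logarithm of the likelihood at theta divided by the likelihood under free recombination (theta = 0.5). The function names are hypothetical.

```python
import math


def lod_score(recombinants, meioses, theta):
    """LOD score at recombination fraction `theta` (0 < theta <= 0.5).

    Assumes `meioses` phase-known informative meioses, of which
    `recombinants` show recombination between marker and disease locus.
    """
    if not 0 < theta <= 0.5:
        raise ValueError("theta must be in (0, 0.5]")
    non_recombinants = meioses - recombinants
    log_l_theta = (recombinants * math.log10(theta)
                   + non_recombinants * math.log10(1 - theta))
    log_l_null = meioses * math.log10(0.5)  # no linkage: theta = 0.5
    return log_l_theta - log_l_null


def max_lod(recombinants, meioses):
    """Scan theta = 0.01 ... 0.50 and return (best_theta, best_lod)."""
    thetas = [i / 100 for i in range(1, 51)]
    best = max(thetas, key=lambda t: lod_score(recombinants, meioses, t))
    return best, lod_score(recombinants, meioses, best)


best_theta, best_lod = max_lod(recombinants=1, meioses=10)
print(best_theta, round(best_lod, 2))
```

By common convention, a maximum LOD score of 3 or more is taken as significant evidence of linkage and a score of -2 or less as evidence against it, so the ten meioses in this example would support linkage without establishing it.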

However, linkage analysis has major limitations as a genetic testing method. It requires a large number of samples, generally blood samples from patients in a pedigree spanning three or more generations. It also involves large volumes of data, complex processing, slow output, and imprecise localization (generally only to an interval of a chromosome), which makes the work laborious and the time needed to locate a gene particularly long. Single nucleotide polymorphisms and short tandem repeats are still used as markers for linkage analysis, but classical indirect methods such as single-strand conformation polymorphism analysis, denaturing gradient gel electrophoresis, and heteroduplex analysis have been phased out in the United States, although they are still used on a limited basis as research tools in developing countries.

2. Next-generation sequencing (NGS)

NGS mainly includes whole-genome resequencing (WGS), whole-exome sequencing (WES), and targeted regions sequencing (TRS), which together make up the new generation of sequencing technologies. Overall, NGS technologies offer high throughput, short turnaround, high accuracy, and rich information, which allow geneticists to precisely localize genes of interest in a short period of time. However, these technologies differ greatly in sequencing range, the amount of data to be analyzed, and sequencing cost and time. Choosing the appropriate method yields better results with less effort in both clinical diagnosis and scientific research.

2.1 Target region sequencing (commonly based on gene chip technology)

The method is based on DNA hybridization: customized probes for the target genomic region are hybridized with genomic DNA on a chip or in solution, the DNA of the target region is enriched, and the enriched DNA is then sequenced by NGS. In the chip-based process, tens of thousands of cDNAs or oligonucleotides are placed on a chip to form an array. The probes of known sequence fixed on the chip pair with complementary, fluorescently labeled nucleic acid sequences in the solution. The position and intensity of the fluorescence detected by the instrument give the information for each spot of the array, and bioinformatics algorithms are then used to determine the sequence composition of the target nucleic acid. The target region selected for sequencing can be a continuous DNA sequence or fragments distributed across different regions of the same chromosome or across different chromosomes. Target region sequencing is a very good follow-up method when linkage analysis has localized a disease gene to a chromosomal segment but the mutation itself has not been identified. In 2010, Nicholas et al. successfully identified a new gene for microcephaly, WDR62, by using genotyping microarrays in conjunction with linkage analysis; the article was published in Nat Genet. Similar studies have identified eight candidate loci in familial pancreatic cancer and a susceptibility gene, TSPAN12, in familial exudative vitreoretinopathy.

Gene microarray sequencing allows deeper study of specific genes or regions that have been targeted by linkage analysis or genome-wide screening, and it is an effective way to find disease-causing genes that linkage analysis alone failed to identify. Gene microarray technology has obvious advantages for screening known gene mutations and can detect mutations in the target genes quickly and comprehensively. At the same time, because the target region is restricted, the sequencing range is substantially reduced, and sequencing time and cost fall accordingly. However, gene chip testing requires a large amount of DNA. Because extracted DNA is at risk of degradation, blood specimens for gene chip studies should preferably be frozen whole blood, so that enough DNA is available for testing.

2.2 Whole Exome Sequencing (WES)

The exome is the sum of all protein-coding sequences in the genomic DNA of an individual. The human exome accounts for about 1% of the total human genome sequence but contains about 85% of disease-causing mutations. WES is a genetic analysis method that uses sequence capture technology to capture and enrich the DNA of the whole exome and then performs high-throughput sequencing. The main technology platforms are the SeqCap EZ whole-exome capture system from Roche, the Solexa technology from Illumina, and the SureSelect exon-targeted sequence enrichment system from Agilent. The captured target regions range from 34 to 62 Mb and include not only coding regions but also some non-coding regions. The NGS sequencing process mainly comprises DNA library preparation, anchoring and bridge amplification by PCR, single-base extension sequencing, and data analysis. During sequencing, the sequencer captures the signal of each differently fluorescently labeled base as it is incorporated, and the computer converts these fluorescence signals into sequencing peak maps of different colors and into base sequences. The sequencing results are then compared with authoritative international databases, such as NCBI's SNP database (dbSNP) and the 1000 Genomes Project database, to determine whether the gene carries a mutation.
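
The database comparison step can be sketched as a simple lookup. This is a minimal illustration under stated assumptions: the `known_frequencies` dictionary is a hypothetical in-memory stand-in for a population database such as dbSNP or the 1000 Genomes Project data, the variants are made-up toy records, and the 1% allele-frequency cutoff is an assumed, commonly used threshold rather than a value from this article.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    chrom: str
    pos: int
    ref: str
    alt: str


def split_by_population_frequency(called, known_frequencies, max_af=0.01):
    """Split called variants into (rare_or_novel, common).

    A variant absent from `known_frequencies` is treated as novel
    (frequency 0.0). Variants above `max_af` are treated as common
    polymorphisms and set aside.
    """
    rare_or_novel, common = [], []
    for variant in called:
        allele_frequency = known_frequencies.get(variant, 0.0)
        if allele_frequency <= max_af:
            rare_or_novel.append(variant)
        else:
            common.append(variant)
    return rare_or_novel, common


# Toy data for illustration only; not real variants.
known = {Variant("1", 100, "A", "G"): 0.25}
calls = [Variant("1", 100, "A", "G"), Variant("2", 200, "C", "T")]
rare, common = split_by_population_frequency(calls, known)
print(rare)
print(common)
```

Variants that are common in the general population are unlikely to cause a rare monogenic disease, which is why they are usually set aside before candidate mutations are examined.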

Since the introduction of NGS technology, WES has achieved unprecedented results in identifying clinical disease-causing genes. These results are not limited to single-gene genetic diseases; many relevant genes have also been discovered in complex diseases with polygenic effects. New genes, or new mutations in known genes, have been identified in monogenic diseases such as retinitis pigmentosa and terminal osseous dysplasia. New causative genes have been identified in rare diseases such as Kabuki syndrome, familial combined hypolipidemia, and spinocerebellar ataxia. Fruitful results have also been achieved in research on tumors such as small-cell lung cancer and chronic lymphocytic leukemia, as well as on complex diseases such as obesity and cerebral cortical dysplasia.

WES has obvious advantages over other sequencing technologies in screening scope and detection rate. For example, WES can be used for further gene screening and identification in samples for which Sanger sequencing and gene chip sequencing found nothing. For sequencing coding regions, WES obtains deeper coverage and more accurate data than traditional methods such as Sanger sequencing. Because of the substantial increase in the amount of information, WES can obtain coding-region information for a larger number of individuals, making it an effective means of detecting disease-causing and susceptibility loci. Compared with localization by linkage analysis, WES places much looser requirements on the pedigree: in a monogenic genetic disease, two to three patients and one unaffected member of the same family are enough to identify the disease-causing gene, with no need for a pedigree spanning three consecutive generations. This has made it possible to study families that previously could not be studied. WES is a good research tool not only for single-gene genetic diseases but also for large-scale comparative studies of many common diseases, such as tumors and diabetes.
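
The family-based strategy described above, with two to three patients and one unaffected relative, can be sketched as set operations. This is an illustrative simplification that assumes a fully penetrant dominant model, in which the causal variant is present in every affected member and absent from the unaffected one; recessive or de novo models need different logic. The variant identifiers are hypothetical placeholders.

```python
def candidate_variants(affected, unaffected):
    """Return variants shared by all affected members and absent from controls.

    `affected` and `unaffected` are lists of sets, one set of variant
    identifiers per family member.
    """
    if not affected:
        raise ValueError("at least one affected individual is required")
    shared = set.intersection(*affected)   # present in every patient
    for control in unaffected:
        shared -= control                  # drop anything a healthy relative carries
    return shared


# Toy family: two patients and one unaffected relative (placeholder IDs).
patient_1 = {"var1", "var2", "var3"}
patient_2 = {"var1", "var2", "var4"}
healthy_relative = {"var2"}
print(candidate_variants([patient_1, patient_2], [healthy_relative]))
```

Each additional family member narrows the candidate list further, which is how a handful of exomes can stand in for the multi-generation pedigree that linkage analysis requires.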

2.3 Whole genome resequencing (WGS)

WGS is a research method that sequences the whole genomes of different individuals of a species whose genome sequence is already known, then assembles the reads and analyzes the data to obtain a genome map; it can also be used to sequence different tissues and analyze somatic mutations. Although WES can quickly and comprehensively identify the mutations in an individual's exome and thus find differences between individuals, it cannot effectively detect variants in regions outside the exons. In such cases, genome-wide testing is currently performed with WGS. However, because the human genome is so large, a single run of single-end whole-genome sequencing rarely achieves the required sequencing depth. Repeated sequencing or paired-end sequencing is therefore needed, which raises sequencing costs significantly; when sufficient depth is not reached, the accuracy of the results suffers. For clinical diagnosis and general research, this cost is often unaffordable. Nonetheless, WGS is needed for more comprehensive genetic testing in clinical studies and research topics that WES cannot resolve.
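
The depth problem can be made concrete with the usual back-of-the-envelope estimate: mean depth is approximately the number of reads times the read length, divided by the size of the region sequenced. The sketch below applies it under stated assumptions: a human genome of roughly 3 x 10^9 bp, an exome of about 1% of that (the figure given in section 2.2), 150 bp reads, a 30x target depth, and perfectly uniform, on-target coverage. All of these are illustrative values, not figures from a specific platform.

```python
import math


def mean_depth(n_reads, read_length, target_size):
    """Approximate mean sequencing depth over a target of `target_size` bases."""
    return n_reads * read_length / target_size


def reads_needed(depth, read_length, target_size):
    """Number of reads needed to reach `depth` on average (rounded up)."""
    return math.ceil(depth * target_size / read_length)


GENOME_SIZE = 3_000_000_000      # assumed approximate human genome size (bp)
EXOME_SIZE = GENOME_SIZE // 100  # about 1% of the genome, as stated in 2.2

wgs_reads = reads_needed(30, 150, GENOME_SIZE)
wes_reads = reads_needed(30, 150, EXOME_SIZE)
print(wgs_reads, wes_reads, wgs_reads // wes_reads)
```

Under these assumptions WGS needs about a hundred times as many reads as WES for the same mean depth, which is the main reason for the cost difference discussed above. In practice capture is not perfectly efficient, so the real ratio is smaller.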

3. Outlook

The emergence of NGS has brought great vitality and scope for imagination to genomic technology. Particularly encouraging are the introduction of gene chips, their existing clinical use in large-sample disease screening and genetic diagnosis, and their model of commercial development. In ophthalmology, the discipline in which monogenic diseases are most common, screening for Leber disease with microarray technology has led to a definitive diagnosis in many cases of optic atrophy of unclear etiology. Primary open-angle glaucoma is the most insidious and dangerous blinding disease in ophthalmology, and identifying its causative genes or mutations would have great clinical and commercial value for disease screening. Gene chip screening for neonatal diabetes can be faster, more comprehensive, and more economical, avoiding the cumbersome workflow and missed detections of first-generation sequencing.

Gene microarray technology is even more promising in prenatal diagnosis. Genetic diseases can be screened for with only a DNA test on the pregnant woman's blood, avoiding the limitations and dangers of the earlier invasive approach of drawing amniotic fluid by amniocentesis. As genetic testing technology improves and testing costs continue to fall, large-scale individualized genetic testing will become possible in the near future. At the same time, as testing for drug-sensitivity genes and disease-susceptibility genes develops further, individualized medical treatment based on genetic testing will be realized. It is reasonable to believe that, as living standards and health awareness continue to rise, genetic testing has a very promising future in medical applications.