Title
Abstract
In this paper we report the complete mitochondrial genome of Chitala chitala, which belongs to the order Osteoglossiformes. The complete mitochondrial genome (mtDNA) sequence was determined on the Illumina HiSeq 2500 next-generation sequencing platform. The genome is 16,381 bp in length and contains the standard set of 22 transfer RNAs (tRNAs), two ribosomal RNAs (rRNAs), 13 protein-coding genes and a non-coding region. The ratio of non-synonymous to synonymous substitutions (Ka/Ks) indicates that 10 genes evolved under purifying selection.
Keywords-
Introduction
Chitala chitala (Hamilton, 1822), commonly known as the Indian featherback, is a freshwater fish that belongs to one of the oldest extant groups of teleost freshwater fishes (subdivision Osteoglossomorpha, order Osteoglossiformes, family Notopteridae) (Mandal et al., 2009). Teleosts represent the largest vertebrate group, with over 24,000 species, accounting for more than half of all vertebrates. The fish is found in freshwater bodies of the Indian subcontinent, including Nepal, Bangladesh and Pakistan, as well as in Myanmar, Thailand and Cambodia (Roberts, 1992; Jayaram, 1999; Froese and Pauly, 2003).
In animals, mitochondrial DNA is the only genetic material outside the nucleus. Several characteristics make mitochondrial DNA well suited to analyzing genetic relationships: simple structure, small size, maternal inheritance, conserved gene content and an accelerated rate of nucleotide substitution (Lin et al., 2004; Cameron, 2014). Mitochondrial DNA has therefore been widely used as a marker for phylogenetic and evolutionary analysis of many groups (Simon et al., 1994; Dowton et al., 2002; Simon et al., 2006; Behura et al., 2011; Cameron, 2014).
Because of the high economic importance of C. chitala, there has been increased interest in its evolutionary history. Freshwater fishes are also significant for biogeographical studies: because they do not disperse easily through saltwater, their evolution may be tightly linked to the geological history of the landmasses on which it took place (Banarescu, 1990, pp. 11-55; Lundberg, 1993).
Current progress in DNA sequencing technology allows cost-effective and rapid sequencing of complete mitochondrial genomes. Mitogenomic data have therefore become popular in studies of molecular evolution, phylogenetic relationships, phylogeography and ecology (Wilson et al., 1985; Boore et al., 1995; Avise, 2000; Boore et al., 2005; Cui et al., 2011), and the complete mitochondrial genome sequence is important for the study of genome evolution and species phylogeny. In this study we present the first complete nucleotide sequence of the mitochondrial genome of Chitala chitala. We also report the organization, gene arrangement and codon usage of C. chitala mitochondrial DNA and compare them with those of other freshwater fishes. Finally, we conduct a phylogenetic analysis based on the protein-coding genes, with the main aim of investigating the phylogenetic position of C. chitala.
Materials and Methods
Sample preparation and DNA extraction
Blood samples were collected from ______ through caudal puncture and immediately preserved in 95% ethanol. Total genomic DNA was extracted using the phenol-chloroform protocol (Singh et al., 2012). DNA concentration was measured with a Picodrop spectrophotometer (Picodrop Ltd, Cambridge, UK), and DNA quality was assessed on a 0.8% agarose gel stained with ethidium bromide.
PCR amplification, library preparation and sequencing
The mitogenome was amplified in its entirety by long PCR (Cheng et al., 1994). It was divided into two overlapping segments that were amplified with two primers: ______ and _______. The 50 µl reaction mixture contained 0.5 µl Takara LA-Taq™ DNA polymerase (Takara, Japan), 5 µl 10X buffer, 8 µl dNTPs (2.5 mM each), 0.5 µl of each primer (20 pM), 1 µl template DNA (500 ng/µl) and 34.5 µl nuclease-free water. PCR conditions were an initial denaturation of 5 min at 94 °C, followed by 30 cycles of denaturation for 30 s at 94 °C and annealing for 15 min at 68 °C, a final extension of 10 min at 72 °C and a hold at 4 °C. Amplification was performed on a Bio-Rad C1000 thermal cycler.
The quality and quantity of the PCR products were checked on a NanoDrop™ 2000 spectrophotometer, and dsDNA was quantified on a Qubit® 2.0 Fluorometer. The library was prepared with the Ultra™ DNA library prep kit, and its quality and fragment length were validated on a TapeStation (Agilent, USA). Libraries were sequenced on the Illumina HiSeq 2500 next-generation sequencing platform using a 500-cycle Illumina HiSeq kit.
Annotation and sequence analysis
Nucleotide composition was calculated with MEGA 6. The AT and GC asymmetries, termed AT skew and GC skew, were calculated using the formulas of Hassanin et al.: AT skew = (A - T)/(A + T) and GC skew = (G - C)/(G + C), where A, T, G and C are the counts of the respective nucleotides. The AT content, AT skew and GC skew were used to investigate the nucleotide composition of the mitogenome.
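The composition statistics defined above can be computed directly from a sequence string. The following is a minimal Python sketch, not the MEGA 6 procedure; the function name composition_stats is illustrative, and the sketch assumes that ambiguous bases (for example N) are simply ignored.

```python
from collections import Counter


def composition_stats(seq):
    """Return AT content, AT skew and GC skew for a nucleotide sequence.

    Assumption: only unambiguous A, T, G and C are counted; any other
    symbol (N, gaps, IUPAC codes) is ignored.
    """
    counts = Counter(seq.upper())
    a, t, g, c = (counts[base] for base in "ATGC")
    if a + t == 0 or g + c == 0:
        raise ValueError("skew is undefined when A+T or G+C is zero")
    return {
        "at_content": (a + t) / (a + t + g + c),
        "at_skew": (a - t) / (a + t),   # positive: more A than T
        "gc_skew": (g - c) / (g + c),   # negative: more C than G
    }
```

Applied to the strand as deposited, a positive AT skew indicates an excess of A over T and a negative GC skew an excess of C over G on that strand.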
Most transfer RNA (tRNA) genes were identified with tRNAscan-SE 1.21 (Schattner et al., 2007), with the sequence source and genetic code set to vertebrate mitochondrial, the default search mode and a cut-off score of 5.
Codon usage of the 13 protein-coding genes was summarized with MEGA 6. MEGA 6 was also used to calculate the non-synonymous (Ka) and synonymous (Ks) substitution rates and their ratio (Ka/Ks) for the protein-coding genes, and to calculate the p-distance among the 15 species of the order Osteoglossiformes used as the ingroup in this study.
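For readers unfamiliar with these measures, the sketch below illustrates the p-distance (the proportion of compared sites at which two aligned sequences differ) and the usual reading of a Ka/Ks ratio. It is an illustration in Python rather than the MEGA 6 implementation: the function names are hypothetical, sites with gaps or ambiguous bases are skipped pairwise (an assumption), and Ka and Ks are taken as already estimated.

```python
import math


def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences.

    Assumption: sites where either sequence has a gap or an ambiguous
    base are excluded from the comparison (pairwise deletion).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    valid = set("ACGT")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in valid and b in valid:
            compared += 1
            if a != b:
                differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def selection_regime(ka, ks):
    """Interpret a Ka/Ks ratio computed from pre-estimated Ka and Ks."""
    if ks <= 0:
        raise ValueError("Ks must be positive for the ratio to be defined")
    ratio = ka / ks
    if math.isclose(ratio, 1.0):
        return "neutral"
    return "purifying" if ratio < 1.0 else "positive"
```

A ratio below 1 means that non-synonymous changes accumulate more slowly than synonymous ones, which is the pattern interpreted as purifying selection.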
Codon usage bias patterns in C. chitala
The relative synonymous codon usage (RSCU) for all protein-coding genes of the 30 species was calculated with MEGA 6. A heat map of the mitochondrial RSCU values was drawn with CIMminer (https://discover.nci.nih.gov/cimminer) (Weinstein et al., 1997) using the quantile binning method, and the values were clustered using Euclidean distance and the average linkage algorithm.
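RSCU compares the observed count of a codon with the count expected if all synonymous codons of the same amino acid were used equally. The Python sketch below illustrates the calculation under the vertebrate mitochondrial genetic code (NCBI translation table 2); the function name rscu is illustrative, codons containing ambiguous bases are skipped (an assumption), and the sketch is not the MEGA 6 implementation.

```python
from collections import Counter, defaultdict
from itertools import product

# Vertebrate mitochondrial code (NCBI translation table 2), codons in
# TCAG order with the third position varying fastest.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSS**VVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def rscu(cds):
    """Return {codon: RSCU} for an in-frame coding sequence.

    RSCU = observed count / (family total / number of synonymous codons).
    Amino acids that never occur in the sequence are omitted.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3          # drop an incomplete last codon
    codons = (cds[i:i + 3] for i in range(0, usable, 3))
    counts = Counter(c for c in codons if c in CODON_TABLE)

    # Group sense codons into synonymous families by amino acid.
    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            families[aa].append(codon)

    values = {}
    for family in families.values():
        total = sum(counts[c] for c in family)
        if total == 0:
            continue
        expected = total / len(family)
        for c in family:
            values[c] = counts[c] / expected
    return values
```

Values above 1 mark codons used more often than expected under equal usage, and the values within one synonymous family sum to the number of codons in that family.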
Phylogenetic relationships
For the phylogenetic analysis of C. chitala, 12 concatenated protein-coding sequences from 30 species were used, 15 of which belong to Osteoglossiformes. The remaining species, used as outgroups, belong to Hiodontiformes (1), Gonorynchiformes (1), Cypriniformes (2), Salmoniformes (3), Clupeiformes (2), Anguilliformes (3) and Polypteriformes (3). The mitogenomes of these 30 species were retrieved from NCBI (Table __), and the 12 protein-coding genes (all except ND6) were aligned with BioEdit version 7.2.5 (Hall, 1999), which runs the ClustalW program.
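The concatenation step can be pictured as joining the per-gene alignments end to end for each species, in a fixed gene order. The Python sketch below shows one way this might be done; the function name, the dictionary layout and the gene and species labels in the usage note are hypothetical and are not taken from the BioEdit workflow used here.

```python
def concatenate_alignments(gene_alignments, gene_order):
    """Join per-gene alignments into one supermatrix per taxon.

    gene_alignments: {gene: {taxon: aligned_sequence}} (hypothetical layout)
    gene_order: genes in the order in which they should be joined
    """
    taxa = set(gene_alignments[gene_order[0]])
    for gene in gene_order:
        alignment = gene_alignments[gene]
        # Every gene must be present for exactly the same set of taxa.
        if set(alignment) != taxa:
            raise ValueError(f"taxon set differs for gene {gene}")
        # Within one gene, all aligned sequences must have equal length.
        if len({len(seq) for seq in alignment.values()}) != 1:
            raise ValueError(f"unequal alignment lengths in gene {gene}")
    return {
        taxon: "".join(gene_alignments[gene][taxon] for gene in gene_order)
        for taxon in sorted(taxa)
    }
```

For example, calling the function with two toy genes, {"ND1": {"sp1": "ATG", "sp2": "ATA"}, "COI": {"sp1": "GTGA", "sp2": "GTGC"}}, and the order ["ND1", "COI"] yields one joined sequence per species.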
Phylogenetic trees were constructed with the Maximum Likelihood (ML) and Neighbour-Joining (NJ) methods in MEGA 6 (Tamura et al., 2013). For the ML analysis, the GTR+G+I model was selected because it showed the lowest BIC (Bayesian Information Criterion) and AICc (Akaike Information Criterion, corrected) values. Both the NJ and ML analyses used 1000 bootstrap replicates.
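Model selection by information criteria follows from the standard definitions BIC = k ln(n) - 2 lnL and AICc = 2k - 2 lnL + 2k(k + 1)/(n - k - 1), where lnL is the maximized log-likelihood, k the number of free parameters and n the sample size. The Python sketch below illustrates how the lowest-scoring model would be picked; the function names are illustrative, n is assumed to be the number of alignment sites, and the log-likelihoods must come from an external fit (they are not computed here).

```python
import math


def bic(log_likelihood, n_params, n_sites):
    """Bayesian Information Criterion: k * ln(n) - 2 * lnL."""
    return n_params * math.log(n_sites) - 2.0 * log_likelihood


def aicc(log_likelihood, n_params, n_sites):
    """Akaike Information Criterion with small-sample correction."""
    if n_sites - n_params - 1 <= 0:
        raise ValueError("AICc requires n_sites > n_params + 1")
    aic = 2.0 * n_params - 2.0 * log_likelihood
    correction = (2.0 * n_params * (n_params + 1)) / (n_sites - n_params - 1)
    return aic + correction


def best_model(fits, n_sites, criterion=bic):
    """Return the model name with the lowest criterion value.

    fits: {model_name: (log_likelihood, n_params)}, supplied by the caller.
    """
    return min(
        fits,
        key=lambda name: criterion(fits[name][0], fits[name][1], n_sites),
    )
```

Both criteria reward a higher likelihood and penalize additional parameters, so a parameter-rich model such as GTR+G+I is preferred only when its gain in likelihood outweighs the penalty.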