The chemosensory receptors of codling moth Cydia pomonella … – Nature.com

Transcriptome assembly and overview

A de novo C. pomonella transcriptome was generated by an Illumina HiSeq 90PE approach. A total of 159,222,221 filtered sequencing read pairs plus 6,471,828 unpaired reads from male antenna, female antenna and neonate larval head RNA libraries were assembled with the Trinity assembler into 152,890 contigs (>201 nt). After removing redundant sequences, our transcriptome contained 108,218 contig clusters (Trinity gene level) and 145,710 total sequences, with a mean length of 675.44 nt, an N50 length of 1019 nt and 26,135 sequences greater than 1,000 nt (Supplementary Data S1).
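For readers unfamiliar with the assembly statistics quoted above, the N50 is the sequence length at which sequences of that length or longer account for at least half of the total assembled bases. A minimal sketch in Python follows; the function name and the list of contig lengths are hypothetical and are not data from this study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths (in nt)."""
    if not lengths:
        raise ValueError("no contig lengths given")
    ordered = sorted(lengths, reverse=True)   # longest contigs first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:             # half of all bases reached
            return length


# Hypothetical contig lengths, for illustration only.
example_lengths = [2000, 1500, 1000, 500, 300, 200]
print(n50(example_lengths))
```

Because short contigs contribute few bases, the N50 (1019 nt here) can sit well above the mean length (675.44 nt).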

To assess the quality of the transcriptome, we ran a blast query against it with the D. melanogaster core eukaryotic genes dataset23. A total of 454 out of 457 conserved core gene homologues were present (blast e-value less than 7e-06, similarity greater than 47.72%). The nature of the 50 most abundant gene clusters was quantitatively examined (Supplementary Data S2) to provide a tissue-specific qualitative characterization (Supplementary Fig. S3). In the male antenna, 27 of the top 50 most abundant gene clusters were homologous to genes with putative chemosensory function (16 odorant binding proteins (OBP), 10 chemosensory proteins (CSP) and one sensory neuron membrane protein (SNMP)), with an expression share of 41.7% of the summed fragments per kilobase of transcript per million reads (FPKM) values. In female antennae, 30 of the top 50 most abundant gene clusters were homologous to putative chemosensory proteins (20 OBP, 10 CSP), with a 27.3% share of the summed FPKM values. In larval heads, no genes with putative chemosensory function were observed in the 50 most abundant clusters; 39 of the top 50 gene clusters were homologous to either ribosomal proteins (n = 20), muscle-related proteins (n = 8) or mitochondrial-related proteins (n = 11), for a collective 63.0% share of the summed FPKM values. The enrichment of OBP and CSP expression observed in the male and female antennal tissues is consistent with a recent report utilizing similar methods24.

Odorant Receptors

Through qualitative transcriptome analysis, 58 putative ORs were identified (Supplementary Data S4), of which 47 were determined to contain full-length ORFs, based upon the presence of predicted start and stop codons and 5′ and 3′ untranslated regions (UTR).

Further assembly of our transcriptome, combining incomplete transcripts with published data19, yielded 53 full-length OR sequences, including complete ORFs for 20 of 21 previously incomplete OR transcripts19 and 21 novel ORs (Supplementary Table S5). Six previously annotated ORs comprise 3 unique gene models, where N- and C-terminal encoding sequence fragments had been reported as unique transcripts (previously OR7/OR41, OR27/OR29 and OR36/OR39). Additionally, through blast homology analysis, the previously annotated OR43 and OR44 fragments were determined to represent 3′ UTR sequence and do not encode OR proteins.

Based on our findings and recent transcriptome reports of OR repertoires in other tortricids21,22, we present a revised nomenclature of C. pomonella ORs (Table 1). Our annotations did not include an additional 13 partial transcript cluster sequences that contained protein-coding sequence fragments homologous to ORs and may represent novel, functional gene transcripts (Supplementary Data S6). The top insect OR blast hits for the translated protein sequences for each of these fragments had e-value scores less than or equal to 7e-08. The C. pomonella ORs are presented phylogenetically within the context of other tortricid moth species from which large OR repertoires have been published, along with B. mori ORs serving as a lepidopteran outgroup (Fig. 1)15,21,22.

Table 1 Cydia pomonella Odorant Receptors: Revised Nomenclature and ORF Status in comparison to Bengtsson et al.19.

Figure 1

Maximum likelihood phylogenetic tree of candidate CpomOR sequences with other lepidopteran OR sequences.

Unrooted. Includes sequences from Cydia pomonella (Cpom), Epiphyas postvittana (Epos), Ctenopseustis obliquana (Cobl), C. herana (Cher) and Bombyx mori (Bmor). Branches of the Orco clade are colored light blue; branches of the moth “pheromone receptor” clade are colored orange; branches of the secondary clade with sex-biased receptors are colored green; C. pomonella ORs are indicated with a larger bold font and novel C. pomonella ORs are marked with a “•”. Node support was assessed with 600 bootstrap replicates and values greater than 70% are shown.

For most ORs, only one unique ORF was identified. However, two protein encoding transcripts for OR6 were identical, except for their C-terminal ends (variant sequence after amino acid residue 398T). These putative variant transcripts have been confirmed through molecular cloning and sequencing and are annotated as OR6a/OR6b. To provide a greater degree of confidence in the transcriptomic sequences, we cloned and sequenced complete ORFs of several other ORs, namely, CpomOR2a/b/c (see below), CpomOR5, CpomOR30, CpomOR39, CpomOR41 and CpomOR71. For each of these receptors, all full length ORF nucleotide sequences identified in our primary transcriptome displayed between 98.52 and 99.52% identity to a sequenced clone (Supplementary Data S7). Comparison of the 16 ORs that contained complete ORFs in both this transcriptome and the previously reported assembly19 revealed that nucleotide sequences were at least 98.2% identical within the protein coding region.

The OR2 locus was present as a complex cluster of sequences. Four variant transcripts encoded similar N-terminal fragments. Three additional transcripts encoded C-terminal fragments; these transcripts contained identical C-terminal ORFs that overlapped with the N-terminal fragments, but varied in their 3′ UTRs. The C-terminal ends of these fragments were nearly identical in amino acid sequence to three receptors (named CpOR1a, CpOR11 and CpOR11a) previously identified by 3′ RACE18. We obtained full-length ORFs by 5′ RACE using primers designed in the 3′ UTR, each of which contained unique nucleic acid sequence. These receptors have been annotated as CpomOR2a, CpomOR2b and CpomOR2c. Analysis of the full-length amino acid sequences encoded by the transcripts for CpomOR2a/2b/2c indicates three unique, but highly similar transcripts (Supplementary Fig. S8). Comparison of the nucleic acid sequences shows that CpomOR2a is 92% identical to OR2b and 89% identical to OR2c, while OR2b is 89% identical to OR2c (Supplementary Fig. S9). The deduced proteins for the CpomOR2 group were also highly similar: OR2a is 88% identical and 91% similar to OR2b, OR2a is 84% identical and 89% similar to OR2c, and OR2b is 84% identical and 90% similar to OR2c. Analysis of seven individual clones obtained from CpomOR2a suggested that there might be two forms of this receptor. Comparison of the OR2a clones revealed 18 distinct single nucleotide polymorphisms (SNPs) resulting in 6 amino acid changes (Supplementary Figs S10 and S11), with OR2a1 and OR2a2 being 98% identical at the nucleic acid level and sharing 96% identity and 98% similarity at the amino acid level.
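The identity percentages above depend on how aligned positions are counted. As an illustration only, the sketch below computes percent identity over the ungapped columns of a pairwise alignment; the function name, the choice of denominator and the example sequences are assumptions and do not represent the procedure or data used in this study.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns where both sequences carry a residue.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if a != gap and b != gap]
    if not columns:
        raise ValueError("no ungapped columns to compare")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


# Hypothetical aligned nucleotide fragments, for illustration only.
seq_a = "ATGCCGTA-A"
seq_b = "ATGCTGTAGA"
print(round(percent_identity(seq_a, seq_b), 1))
```

Other conventions (for example dividing by the full alignment length or by the shorter sequence) give slightly different values, so identities from different tools are not always directly comparable.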

For quantitative analysis of OR expression levels, we compared FPKM values for transcript clusters. For male antennae, estimated OR transcript expression levels ranged from 0.03 to 1129.88 FPKM (Fig. 2, Supplementary Data S12). OR1 (1129.88 FPKM), OR6 (83.68 FPKM) and OR3 (67.28 FPKM) were the most abundantly expressed tuning ORs in the male antennae and all three of these receptors cluster phylogenetically within the lepidopteran pheromone receptor subfamily clade20. For female antennae, FPKM values of ORs ranged from 0.22 to 785.91. OR3 (105.54 FPKM), OR13 (96.59 FPKM) and OR40 (78.65 FPKM) were the most abundantly expressed tuning ORs. In the larval head, FPKM values for ORs ranged from 0 to 4.8. OR64 (4.67 FPKM), OR18 (4.39 FPKM) and OR71 (3.9 FPKM) were the most abundantly expressed tuning ORs in neonate heads. In all three tissue samples, the OR co-receptor, Orco25, displayed higher expression levels than most tuning ORs (male antennae, FPKM = 987.83; female antennae, FPKM = 785.91; larval heads, FPKM = 4.8).

Figure 2

Heat-plot of relative expression values for CpomORs.

Abundance values were estimated by read mapping. Black indicates low/no expression, dark colors indicate low/moderate expression, bright colors indicate moderate/high expression. Color plots represent binary log of FPKM plus one for each gene (see Supplementary Data S12 for raw data). Color scales for each tissue type are independent of other tissue types. Range of values for Male Antennae: 0.04–10.14; Female Antennae: 0.29–9.62; Larval Heads: 0.00–2.54. Letters are indicative of lepidopteran OR subfamily clade nomenclature as inferred from de Fouchier et al. (unpublished data, manuscript submitted).
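The transformation described in this legend, the binary log of FPKM plus one, can be reproduced directly. A minimal sketch in Python follows; the function name heat_value is hypothetical, and the inputs are the lowest and highest male antennal OR FPKM values quoted in the text.

```python
import math


def heat_value(fpkm):
    """Binary log of (FPKM + 1), the quantity plotted in the heat-plots."""
    return math.log2(fpkm + 1)


# Extremes reported in the text for ORs in male antennae (FPKM).
low, high = 0.03, 1129.88
print(round(heat_value(low), 2), round(heat_value(high), 2))  # 0.04 10.14
```

Adding one before taking the logarithm keeps transcripts with zero FPKM at a plotted value of exactly 0 and avoids negative values for FPKM below 1.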

Comparison of transcript abundance levels between male and female antennae revealed several receptors with sex-enriched or biased expression patterns (FPKM > 10-fold difference between sexes; FPKM < 1 in non-enriched sex). OR1, OR5, OR6, OR7 and OR31 were male-enriched, while OR21, OR22, OR30 and OR41 showed female-enriched expression. Sex-biased enrichment of ORs in male or female antennae was examined qualitatively through end-point PCR analysis (Fig. 3, Supplementary Fig. S13). Consistent with transcriptome FPKM values, male sex-biased amplification was observed for OR5, OR6 and OR31 and female sex-biased amplification for OR21, OR22, OR30 and OR41. We were unable to consistently amplify OR7 in either tissue and amplification of OR1 was observed in both male and female antennal tissues, which is also consistent with FPKM values.
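The two thresholds stated above can be checked per receptor as follows. This is a sketch under assumptions: the text does not say whether both criteria had to hold at once, so the hypothetical function below reports them separately. The OR3 values are taken from the text; the second receptor is invented for illustration.

```python
def sex_bias_flags(male_fpkm, female_fpkm, fold=10.0, floor=1.0):
    """Evaluate the two stated criteria for one receptor.

    Returns (enriched_sex, fold_ok, floor_ok). The criteria are reported
    separately because how they were combined is not stated in the text.
    """
    if male_fpkm == female_fpkm:
        return (None, False, False)
    if male_fpkm > female_fpkm:
        sex, high, low = "male", male_fpkm, female_fpkm
    else:
        sex, high, low = "female", female_fpkm, male_fpkm
    fold_ok = high > fold * low   # more than 10-fold difference between sexes
    floor_ok = low < floor        # FPKM below 1 in the non-enriched sex
    return (sex, fold_ok, floor_ok)


# OR3 values from the text: 67.28 (male) and 105.54 (female) FPKM.
print(sex_bias_flags(67.28, 105.54))
# Hypothetical receptor, for illustration only.
print(sex_bias_flags(25.0, 0.4))
```

Comparing `high > fold * low` instead of dividing avoids a division by zero when a receptor has no mapped reads in one sex.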

Figure 3

Sex-biased expression of CpomOR genes in C. pomonella antennae.

Gel electrophoresis of end-point PCR products using antennal cDNA from adult male and female C. pomonella. Primers were designed to amplify the complete ORF of putative CpomOR genes. Expected sizes are indicated in Supplementary Data S7; uncropped gel images are shown in Supplementary Fig. S13. NTC: No Template Control.

Gustatory Receptors

The primary transcriptome contained 20 GRs (Supplementary Data S14), including 19 novel gene models (Supplementary Table S5). Seven GR transcripts contained complete ORFs based upon the predicted presence of start and stop codons and 5′ and 3′ UTRs. The conserved C-terminal motif, “TYhhhhhQF” (h = hydrophobic amino acid R group), characteristic of GRs, was identified in 13 transcripts (Supplementary Table S5). CpomGR1 was identified as an incomplete fragment, but the full-length ORF was obtained by RACE to verify the sequence. As the most complete moth GR repertoire was first reported for B. mori16, we adapted C. pomonella GR nomenclature to the nearest neighbor B. mori homologues where strong bootstrap support was available, including a revision of the previously annotated CpomGR4 to CpomGR8. Additionally, based upon blast homology, 13 transcript clusters contained putative GR fragments deemed too short to annotate with confidence (Supplementary Data S15). A phylogeny of C. pomonella GRs is shown together with the complete GR repertoires of B. mori, Danaus plexippus and Heliconius melpomene (Fig. 4)16,26,27. All candidate codling moth GRs cluster with CO2, sugar and putative bitter receptor families.
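A search for the “TYhhhhhQF” motif can be expressed as a regular expression. The sketch below is illustrative only: the set of residues treated as hydrophobic, the function name and the example peptides are assumptions and do not come from this study.

```python
import re

# Assumption: residues accepted at the five "h" (hydrophobic) positions.
HYDROPHOBIC = "AVILMFWC"
GR_MOTIF = re.compile(rf"TY[{HYDROPHOBIC}]{{5}}QF")


def find_gr_motif(protein):
    """Return the 0-based start of the last TYhhhhhQF match, or None."""
    matches = list(GR_MOTIF.finditer(protein.upper()))
    # The motif is C-terminal, so the match nearest the end is reported.
    return matches[-1].start() if matches else None


# Hypothetical peptide fragments, not sequences from this study.
print(find_gr_motif("MKSTYLVAIFQFRN"))  # motif present
print(find_gr_motif("MKSTYLVDIFQFRN"))  # D breaks the hydrophobic run
```

Widening or narrowing the hydrophobic set changes how many transcripts match, so a count such as the 13 reported above depends on the definition used.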

Figure 4

Maximum likelihood phylogenetic tree of candidate CpomGR sequences with other lepidopteran GR sequences.

Unrooted. Includes sequences from Cydia pomonella (Cpom), Heliconius melpomene (Hmel), Danaus plexippus (Dple) and Bombyx mori (Bmor). Branches containing putative carbon dioxide receptors are colored dark blue; branches containing putative sugar receptors are colored light blue; branches containing putative bitter receptors are colored black; C. pomonella GRs are indicated with a larger bold font and all C. pomonella GRs are novel, except CpomGR8. Node support was assessed with 600 bootstrap replicates and values greater than 70% are shown.

For a quantitative analysis of GR expression levels, we compared FPKM transcript values. In general, CpomGRs were expressed at lower levels than ORs (Fig. 5, Supplementary Data S16). In male antennae, GR transcript abundance levels ranged from 0.06 to 30.86 FPKM; in female antennae, GR FPKM values ranged from 0.16 to 29.91. In both sexes, the putative fructose receptor, GR8 (30.86 and 29.91 FPKM, respectively), and a close homologue, GR9 (12.56 and 10.7 FPKM, respectively), were most abundant. In the larval head, GR FPKM values ranged from 0 to 2.73. The most abundantly expressed larval GR was GR2 (2.73 FPKM), which clusters within the carbon dioxide sensing GR clade16.

Figure 5

Heat-plot of relative expression values for CpomGRs.

Abundance values were estimated by read mapping. Black indicates low/no expression, dark colors indicate low/moderate expression, bright colors indicate moderate/high expression. Color plots represent binary log of FPKM plus one for each gene (see Supplementary Data S16 for raw data). Color scales for each tissue type are independent of other tissue types. Range of values for Male Antennae: 0.08–4.99; Female Antennae: 0.21–4.95; Larval Heads: 0.00–1.90.

Ionotropic receptors

Six novel IR encoding transcripts were found (Supplementary Table S5), in addition to the 15 previously reported C. pomonella IRs19. Complete ORFs, based on predicted start and stop codons and 5′ and 3′ UTRs, were identified for 15 of the 21 CpomIRs, including 8 of the 12 previously reported incomplete IR gene transcripts (Supplementary Data S17).

We compared our predicted IR protein products, as well as 8 novel CpomiGluRs, with IRs and ionotropic glutamate receptors (iGluRs) identified from B. mori and D. melanogaster (Fig. 6)28. IR status was largely inferred by phylogenetic analysis of IRs versus iGluRs, which cluster separately from the IRs. Based on phylogenetic relationships between IRs across these three species, IRs previously annotated as CpomIR41a, CpomIR75, CpomIR75p and CpomIR75q2 have been re-annotated as CpomIR41a.1, CpomIR75q.1, CpomIR75p.2 and CpomIR75q.2, respectively.

Figure 6

Maximum likelihood phylogenetic tree of candidate CpomIR/iGluR sequences with Bmor/Dmel IR and iGluR sequences.

Unrooted. Includes sequences from Cydia pomonella (Cpom), Drosophila melanogaster (Dmel) and Bombyx mori (Bmor). Branches containing putative ionotropic glutamate receptors (iGluRs) are colored light blue; branches containing putative IR co-receptors are colored purple; branches containing divergent IRs are colored orange; branches containing putative antennal IRs are colored black. C. pomonella IRs are indicated with a larger bold font and novel C. pomonella IRs are marked with a “•”. Node support was assessed with 600 bootstrap replicates and values greater than 70% are shown.

For an initial estimate of IR abundance levels, FPKM values were compared (Fig. 7; Supplementary Data S18). For all tissues, three putative IR co-receptors, IR8a, IR25a and IR76b29, were the most abundantly expressed. FPKM values in male antennae, female antennae and larval heads, respectively, were 255.83, 278.75 and 1.18 for IR8a; 291.98, 243.76 and 14.17 for IR25a; and 383.43, 390.48 and 7.74 for IR76b. Excluding the putative co-receptors, IR transcript FPKM values ranged from 0.6 to 159.69 in males and 0.99 to 192.58 in females. In male and female antennae, IR75q.2 was the most abundantly expressed candidate tuning IR with FPKM values of 159.69 and 192.58, respectively. In larval heads, excluding the co-receptors, IR FPKM values ranged from 0 to 3.87, with IR64a being the most abundantly expressed IR transcript.

Figure 7

Heat-plot of relative expression values for CpomIRs.

Abundance values were estimated by read mapping. Black indicates low/no expression, dark colors indicate low/moderate expression, bright colors indicate moderate/high expression. Color plots represent binary log of FPKM plus one for each gene (see Supplementary Data S18 for raw data). Color scales for each tissue type are independent of other tissue types. Range of values for Male Antennae: 0.67–8.59; Female Antennae: 0.99–8.61; Larval Heads: 0.00–3.92.
