
Canola SNPro Panel
We offer the Canola SNPro Panel as a full-service PlexSeq genotyping solution for Brassica napus, supporting Genomic Selection and other molecular breeding methods.
The panel, developed by NRGene, includes 500 SNPs selected by filtering a Canola pan-genome of 12 de novo assemblies (https://nrgene.com/).
We also offer the Canola SNPro Panel to the public at a discounted price and with expedited turnaround through the AgriPlex Connect program.
* The following is derived from the White Paper
Introduction
Rapeseed (Brassica napus subsp. napus), also known as rape or oilseed rape, and commonly referred to as Canola when grown in Canada or Australia, is presently the third-largest oil crop, accounting for the production of over 72.3 million tons in 2020.
Rapeseed has an allotetraploid genome (2n = 4× = 38) with an estimated haploid size of 1.1 Gbp. The complex polyploid structure has resulted in abundant structural and copy number variations, which have posed a challenge for accurate genotyping of genome-wide markers for association studies and genomic predictions. Many breeding programs utilize an existing microarray for B. napus consisting of ~60K SNPs.
NRGene recently launched Canola SNPro™, which features a generic version of the SNPer™ solution. The minimal panel is customized for Spring Canola and, like SNPer™, combines low-density genotyping with high-density imputation to an industry-standard panel such as the 60K set or derivatives thereof.
Reliable SNP selection
The 60K Canola SNPs were mapped to 12 B. napus reference genomes, previously assembled de novo by NRGene. The 12 genomes were contributed by commercial and academic participants of the B. napus pan-genome consortium and consisted of 4 Winter and 8 Spring lines (Table 1).
Of the 60K SNP sequences, 39.5K aligned to a single conserved genomic position in all 8 Spring Canola assemblies, while 33K passed the single-locus criterion in the combined 12 Spring and Winter lines. This means that the selected SNPs are present in all the reference sequences and are each specific to one of the A and C sub-genomes that constitute the B. napus tetraploid genome. Removing the multi-locus and absent SNPs from further analysis is important for reliable genotyping, accurate imputation, and ultimately genomic prediction.
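The single-locus criterion described above can be illustrated with a minimal Python sketch. It is not NRGene's pipeline: the input format (one tuple per alignment hit), the function name, and the toy SNP and assembly names are all assumptions made for illustration, and "single conserved position" is simplified here to "exactly one alignment hit per assembly".

```python
from collections import defaultdict


def single_locus_snps(alignments, assemblies):
    """Return the SNP IDs that align exactly once in every listed assembly.

    alignments: iterable of (snp_id, assembly, chromosome, position) tuples,
    one tuple per alignment hit (hypothetical input format).
    assemblies: names of the assemblies that every kept SNP must hit once.
    """
    hits = defaultdict(lambda: defaultdict(int))
    for snp_id, assembly, _chrom, _pos in alignments:
        hits[snp_id][assembly] += 1
    return {
        snp_id
        for snp_id, per_assembly in hits.items()
        # 0 hits means the SNP is absent; 2 or more means it is multi-locus.
        if all(per_assembly.get(name, 0) == 1 for name in assemblies)
    }


# Toy data (made up): snp2 hits two loci in S1, snp3 is absent from S2.
toy_alignments = [
    ("snp1", "S1", "A01", 100),
    ("snp1", "S2", "A01", 105),
    ("snp2", "S1", "A01", 200),
    ("snp2", "S1", "C01", 300),
    ("snp2", "S2", "A01", 210),
    ("snp3", "S1", "A02", 50),
]
kept = single_locus_snps(toy_alignments, ["S1", "S2"])
print(sorted(kept))
```

Running the same function against only the Spring assemblies, and then against all 12, would correspond to the two counts (39.5K and 33K) reported above.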

Target Set definition
The allele frequencies of the selected SNPs were determined in two datasets based on the 60K array: a Chinese semi-winter rapeseed dataset of 203 lines genotyped with a 24K subset, and an Australian/Canadian dataset of 61 lines genotyped with 36K SNPs. A set of 19.7K SNPs met a call rate threshold of >0.1 and a Minor Allele Frequency (MAF) cutoff of 0.05. This Target SNP Set was used as the target panel for the imputation and benchmarking processes described below. In the Canola SNPro™ process, the parental lines of a typical breeding program are genotyped for these markers by means of an array or by whole genome sequencing. The high-density 19.7K panel can also be used in any other project that requires high-resolution genotyping.
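As a rough illustration of the call rate and MAF filters used to define the Target Set, the following Python sketch applies both thresholds to the calls of a single SNP. The genotype encoding ("AA", "AB", "BB", "NC"), the function names, and the choice to treat the MAF cutoff as inclusive are assumptions, not details taken from the white paper.

```python
def call_rate(genotypes):
    """Fraction of samples with a genotype call (anything other than 'NC')."""
    if not genotypes:
        return 0.0
    return sum(g != "NC" for g in genotypes) / len(genotypes)


def minor_allele_frequency(genotypes):
    """MAF over called diploid genotypes encoded as 'AA', 'AB' or 'BB'."""
    called = [g for g in genotypes if g != "NC"]
    if not called:
        return 0.0
    freq_b = sum(g.count("B") for g in called) / (2 * len(called))
    return min(freq_b, 1.0 - freq_b)


def passes_filters(genotypes, min_call_rate=0.1, min_maf=0.05):
    """Apply the thresholds quoted in the text: call rate > 0.1, MAF >= 0.05.

    Treating the MAF cutoff as inclusive is an assumption.
    """
    return (
        call_rate(genotypes) > min_call_rate
        and minor_allele_frequency(genotypes) >= min_maf
    )


# Toy SNPs (made up): one informative, one nearly monomorphic, one uncalled.
informative = ["AA", "AB", "BB", "NC"]
rare = ["AA"] * 19 + ["AB"]
uncalled = ["NC"] * 10
print(passes_filters(informative), passes_filters(rare), passes_filters(uncalled))
```

In practice the filter would be applied per SNP within each of the two populations, keeping only the SNPs that pass.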
Minimal panel development
The 500 selected SNPs were used to develop an amplicon sequencing plex (PlexSeq™). Figure 3 displays the distribution of the selected SNPs along the B. napus reference sequence. If needed, a larger set of ~1,000 SNPs can be used to increase the robustness and accuracy of genotype calls.

Figure 3: SNP distribution along the B. napus physical reference
Call rates and agreement of datasets
A potential pool of 9,234,576 data points represents the complete set of genotypes that can be captured for the 19.7K SNPs of the Target Set. The call rates of the array data and the imputed set were 97% and 98%, respectively. This result is typical of a conservative imputation process, which can salvage No Calls that are technical artifacts of the array. Table 2 shows the agreement between the two sets of genotype calls: in total, 93% of data points were consistent. The imputation pipeline also rescued 51% of the data points that were not called by the array, whereas only 32% of the No Call data points obtained by SNPro™ had a genotype call in the 60K array. Imputation accuracy is higher at homozygous loci than at heterozygous loci. The overall imputation accuracy for the entire set was 94.2% (excluding the SNPs that had a No Call in the array data).

Table 2. 60K and SNPro™ agreement. Rows represent the calls made by Canola SNPro™ and columns represent the array calls. NC = No Call; AA, BB, and AB represent homozygous reference, homozygous alternate, and heterozygous calls, respectively. The data points are distributed in the two-dimensional matrix such that those for which both sources made the same call lie on the diagonal.
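An agreement matrix of the kind summarized in Table 2 can be built from paired calls, as in the Python sketch below. The genotype encoding, the function names, and the toy calls are assumptions. How the white paper handled imputed No Calls when computing accuracy is not stated, so counting them as disagreements here is also an assumption.

```python
from collections import Counter

CALLS = ("AA", "AB", "BB", "NC")


def agreement_matrix(imputed, array):
    """Nested dict: rows are imputed (SNPro) calls, columns are array calls."""
    counts = Counter(zip(imputed, array))
    return {row: {col: counts[(row, col)] for col in CALLS} for row in CALLS}


def concordance(matrix, exclude_array_no_calls=True):
    """Share of data points on the diagonal of the agreement matrix.

    With exclude_array_no_calls=True, data points without an array call are
    left out of both numerator and denominator; an imputed No Call against an
    array call counts as a disagreement (an assumption).
    """
    cols = [c for c in CALLS if not (exclude_array_no_calls and c == "NC")]
    total = sum(matrix[row][col] for row in CALLS for col in cols)
    agree = sum(matrix[col][col] for col in cols)
    return agree / total if total else 0.0


# Toy paired calls for six data points (made up for illustration).
imputed_calls = ["AA", "AB", "BB", "AA", "NC", "AA"]
array_calls = ["AA", "AB", "AB", "NC", "NC", "AA"]
matrix = agreement_matrix(imputed_calls, array_calls)
print(concordance(matrix), concordance(matrix, exclude_array_no_calls=False))
```

The off-diagonal cell in row AA, column NC of such a matrix holds the data points that imputation rescued from array No Calls.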
Conclusions
This white paper details the process of designing and validating a SNPro™ genotyping solution for B. napus based on filtering and optimization of the Canola 60K panel. The SNP set was filtered by aligning to 12 reference genome assemblies and adjusted to the call rates and allele frequencies of two different B. napus populations to generate a robust target set of SNPs that can be utilized for genomic predictions. A minimal set of 500 SNPs was validated through simulations to produce an imputation accuracy of 95-98% for different population types. Empirical validation using 468 NAM samples yielded a 1% increase in call rate and 95.6% per-sample consistency (accuracy) compared with the full 60K array data of the same samples. The Canola SNPro™ Minimal Set and imputation can be adapted to fit any B. napus population (including winter types) by adjusting the Target Set, as shown for the two populations in this study.